Automated Crisis Content Categorization for COVID-19 Tweet Streams

From the Abstract: "Social media platforms, like Twitter, are increasingly used by billions of people internationally to share information. As such, these platforms contain vast volumes of real-time multimedia content about the world, which could be invaluable for a range of tasks such as incident tracking, damage estimation during disasters, insurance risk estimation, and more. By mining this real-time data, there are substantial economic benefits, as well as opportunities to save lives. Currently, the COVID-19 [coronavirus disease 2019] pandemic is attacking societies at an unprecedented speed and scale, forming an important use-case for social media analysis. However, the amount of information during such crisis events is vast and information normally exists in unstructured and multiple formats, making manual analysis very time consuming. Hence, in this paper, we examine how to extract valuable information from tweets related to COVID-19 automatically. For 12 geographical locations, we experiment with supervised approaches for labelling tweets into 7 crisis categories, as well as investigated automatic priority estimation, using both classical and deep learned approaches. Through evaluation using the TREC-IS [Text Retrieval Conference-Incident Streams] 2020 COVID-19 datasets, we demonstrated that effective automatic labelling for this task is possible with an average of 61% F1 performance across crisis categories, while also analysing key factors that affect model performance and model generalizability across locations."

Information Systems for Crisis Response and Management (ISCRAM). Posted here with permission. Documents are for personal use only and not for commercial profit.
Retrieved From:
Information Systems for Crisis Response and Management (ISCRAM): http://idl.iscram.org/
Media Type:
Proceedings of the 18th ISCRAM Conference. Blacksburg, VA. May 2021