Detecting early warning indicators to the rise of COVID-19 infection cases in the context of U.S.: An exploratory data analysis

This work investigates whether social media data, Twitter in particular, can be used to detect early warning indicators of the COVID-19 pandemic in the United States (US). To demonstrate the viability of the approach, English tweets carrying hashtags on COVID-19-related topics were collected from 12 March to the end of April 2020. Using an N-gram language model and Term Frequency-Inverse Document Frequency (TF-IDF), significant N-grams (N=2) such as (“new, york”), (“social, distancing”), (“stay, safe”), (“toilet, paper”), (“wash, hand”), (“tested, positive”), (“look, like”), (“front, line”) and (“grocery, store”) were extracted. The analysis shows that the appearance of these N-grams on Twitter directly reflects the characteristics of the infection cases and that they are distributed almost similarly across different clusters. The study also reveals that the number of tweets containing (“new, york”) increases with that of (“stay, home”), (“social, distancing”), (“stay, safe”), (“look, like”) and (“tested, positive”), and decreases with that of (“toilet, paper”). N-grams with such relationships are recognized as indicators and are validated by mapping them against the number of infection cases. The results show that social media data can project the actual course of the infection curve and can detect early warning indicators once the pandemic is moderately recognized.
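The bigram extraction described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the stopword list, the tokenizer, the smoothed IDF formula and the sample tweets are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; the thesis's list is not given).
STOPWORDS = {"a", "an", "and", "at", "i", "in", "is", "of", "the", "to", "we"}


def bigrams(text):
    """Lowercase the text, drop stopwords, and return adjacent word pairs (N=2)."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    return list(zip(tokens, tokens[1:]))


def tfidf_bigrams(docs):
    """Sum a smoothed TF-IDF score for every bigram over all documents."""
    doc_bigrams = [bigrams(d) for d in docs]
    n_docs = len(docs)

    # Document frequency: in how many tweets does each bigram occur?
    df = Counter()
    for grams in doc_bigrams:
        df.update(set(grams))

    scores = Counter()
    for grams in doc_bigrams:
        if not grams:
            continue
        tf = Counter(grams)
        for gram, count in tf.items():
            # Smoothed IDF (one common variant, assumed here).
            idf = math.log((1 + n_docs) / (1 + df[gram])) + 1
            scores[gram] += (count / len(grams)) * idf
    return scores


# Made-up sample tweets, for illustration only.
tweets = [
    "Stay home and stay safe",
    "We tested positive in New York",
    "No toilet paper at the grocery store",
    "Stay home, New York",
]

for gram, score in tfidf_bigrams(tweets).most_common(5):
    print(gram, round(score, 3))
```

Summing scores over the corpus ranks bigrams that are both frequent and distinctive within individual tweets; a per-day version of the same counts would be the natural input for comparing N-gram frequency against reported case numbers.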

Adnan Morshed, Jaman, 2022

Type of thesis: Master Thesis
Supervisors: Laurenzi, Emanuele; Hinkelmann, Knut
Degree programme: Business Information Systems (Master)
Confidentiality: public