Detecting early warning indicators to the rise
of COVID-19 infection cases in the context of
U.S.: An exploratory data analysis
This work investigates whether social media data, Twitter in particular, can be used to
detect early warning indicators of the COVID-19 pandemic in the United States (US). To
demonstrate the viability of this approach, English tweets carrying hashtags on
COVID-19-related topics were collected from 12 March to the end of April 2020. With the help of
an N-gram language model and Term Frequency-Inverse Document Frequency
(TF-IDF), significant N-grams (N=2) such as (“new, york”), (“social, distancing”),
(“stay, safe”), (“toilet, paper”), (“wash, hand”), (“tested, positive”), (“look, like”), (“front,
line”), (“grocery, store”) etc. are extracted. The analysis shows that the occurrences of
the N-grams in Twitter directly reflect the characteristics of the infection cases and are
similarly distributed across different clusters. This study also reveals that the
number of tweets containing (“new, york”) increases with (“stay, home”), (“social, distancing”), (“stay,
safe”), (“look, like”) and (“tested, positive”), and decreases with (“toilet, paper”). N-grams
with such relationships are recognized as indicators and are validated by
mapping them against the number of infection cases. Results show that social media data can reflect
the actual course of the infection curve and can be used to detect early warning indicators once
the pandemic is moderately recognized.
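
The two steps described above, scoring bigrams with TF-IDF and checking whether the counts of two bigrams rise and fall together, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's actual pipeline: the function names (bigrams, tfidf_bigrams, pearson), the unsmoothed TF-IDF variant, the use of Pearson correlation, the toy tweets and the daily counts are all hypothetical choices made for this example.

```python
import math
import re
from collections import Counter


def bigrams(text):
    """Lowercase the text, keep alphabetic tokens, return adjacent word pairs."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return list(zip(tokens, tokens[1:]))


def tfidf_bigrams(docs):
    """Return one {bigram: TF-IDF score} dict per document.

    TF is the bigram's share of all bigrams in the document; IDF is
    log(N / document frequency), one common unsmoothed variant (an assumption).
    """
    per_doc = [Counter(bigrams(d)) for d in docs]
    doc_freq = Counter()
    for counts in per_doc:
        doc_freq.update(counts.keys())  # each document counts once per bigram
    n_docs = len(docs)
    scores = []
    for counts in per_doc:
        total = sum(counts.values())
        scores.append({
            bg: (c / total) * math.log(n_docs / doc_freq[bg])
            for bg, c in counts.items()
        })
    return scores


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sd_x * sd_y)


# Toy tweets, invented for illustration (not data from the thesis).
tweets = [
    "stay home stay safe",
    "stay home wash hand",
    "toilet paper sold out",
]
scores = tfidf_bigrams(tweets)
for bigram, score in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(bigram, round(score, 3))

# Hypothetical daily tweet counts per bigram (invented numbers).
new_york = [10, 20, 35, 50]
stay_home = [5, 12, 20, 31]
toilet_paper = [40, 30, 18, 9]
print(pearson(new_york, stay_home))     # positive: the series rise together
print(pearson(new_york, toilet_paper))  # negative: one rises as the other falls
```

In this sketch a bigram that appears in fewer tweets receives a higher IDF weight, and the sign of the correlation plays the role of the "increases with" and "decreases with" relationships named in the abstract.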
Adnan Morshed, Jaman, 2022
Type of thesis: Master Thesis
Client
Supervisors: Laurenzi, Emanuele; Hinkelmann, Knut
Keywords
Degree programme: Business Information Systems (Master)
Confidentiality: public