Multilingual Topic Identification and Sentiment Analysis in the Gig Economy

Frieder, Manuel, 2019

Type of thesis: Master Thesis
Supervisor: Pustulka, Elzbieta
We work in the context of business innovation with a company that provides human resource services on a web platform. We analyze written feedback given by workers and employers of this gig platform company (GPC). The project has two main goals: to reveal topics and to measure the opinion polarity towards those topics. We applied machine learning methods to 66,376 sentences originating from 39,614 comments. We used the biterm topic model (BTM) for topic identification. For sentiment analysis, we evaluated several methods, trained and tested on a subset of 3,583 hand-annotated sentences. We included emoticons and star ratings as additional features to determine the polarities. Our approach revealed new topics, such as work breaks and workload, and confirmed topics found by interviewing stakeholders. Thus, our method can find topics in gig economy feedback. However, the topics overlap considerably, and we found it hard to assign topics to sentences reliably. We believe more data is required to improve the outcome. Sentiment analysis at the sentence level achieved an accuracy of 0.86 and a Matthews correlation coefficient (MCC) of 0.66. Although processing entire comments produces a slightly higher accuracy, we argue that this result is biased, because in our training data comments with a mix of opinions were usually not labeled as negative. Breaking the comments up into sentences increases the number of negative labels and makes the analysis more accurate. The results of the topic and sentiment analysis will provide a basis for extending the GPC’s web platform in the future.
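The abstract reports both accuracy and MCC. As a minimal sketch of why the two can diverge when negative labels are rare, the following computes both metrics from a binary confusion matrix. The function names and all counts are hypothetical and chosen only for illustration; they are not the thesis data or the thesis implementation.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 90 positive and 10 negative items.
# A classifier that labels everything as positive never finds a negative.
majority_acc = accuracy(tp=90, tn=0, fp=10, fn=0)
majority_mcc = mcc(tp=90, tn=0, fp=10, fn=0)

# A classifier that recovers most of the negatives (also hypothetical).
better_acc = accuracy(tp=85, tn=7, fp=3, fn=5)
better_mcc = mcc(tp=85, tn=7, fp=3, fn=5)

print(f"always positive: accuracy={majority_acc:.2f}, MCC={majority_mcc:.2f}")
print(f"finds negatives: accuracy={better_acc:.2f}, MCC={better_mcc:.2f}")
```

In this toy setting the two classifiers have similar accuracy, while MCC separates them clearly, which is one reason to report MCC alongside accuracy for imbalanced sentiment labels.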
Degree programme: Business Information Systems (Master)
Confidentiality: public
Language of the thesis
Business Information Systems (Master)
Location of degree programme