FHNW RoboLab - Improving a Robot's Swiss German Speech Recognition

Enable Swiss German speech recognition for the humanoid robot Nao.

Hauenstein, Florian, 2020

Type of thesis: Bachelor Thesis
Client: Institut für Wirtschaftsinformatik, HSW FHNW
Supervisors: Zhong, Vivienne Jia; Schmiedel, Theresa
Keywords: Speech Recognition, Nao, Social Robot, Swiss German
Advances in speech recognition have contributed to the prevalence of virtual assistants (e.g. Siri) and smart speakers (e.g. Alexa) in daily life. In robotics, speech recognition is an important capability for smooth conversation between humans and robots. This bachelor thesis aims to develop a speech recognition system for Swiss German. It also evaluates a state-of-the-art tool for enabling speech recognition in Swiss German.
The project was divided into three phases: literature research, configuration setup, and implementation on the robot. The literature research provided the necessary background on speech recognition, and the current open-source toolkits were surveyed and evaluated. In the configuration setup phase, the whole speech recognition system was set up on a remote server, and a collection of Swiss German audio data was acquired to train the system. In the third phase, implementation, the Nao robot, together with Choregraphe, was connected to the remote server.
The project produced several deliverables and outcomes that benefit the client in their further work. A literature research on speech recognition was conducted, and the currently available open-source speech recognition tools were collected. The best-fitting tool was evaluated, and an architecture was created for the communication between the robot and the speech recognition tool on a remote server. The speech recognition system, which is not yet fully developed, was described, and the data collection created in the project was handed over to the client. The setup of the whole system was documented so that it can be recreated. Finally, the project shared its learnings and made suggestions for further development of the system.
Degree programme: Business Information Technology (Bachelor)
Confidentiality: confidential
Type of thesis
Bachelor Thesis
Client
Institut für Wirtschaftsinformatik, HSW FHNW, Basel
Author
Hauenstein, Florian
Supervisors
Zhong, Vivienne Jia; Schmiedel, Theresa
Year of publication
2020
Language of thesis
English
Confidentiality
confidential
Degree programme
Business Information Technology (Bachelor)
Programme location
Basel
Keywords
Speech Recognition, Nao, Social Robot, Swiss German