FHNW RoboLab - Improving a Robot's Swiss German Speech Recognition
Enable Swiss German speech recognition for the humanoid robot Nao.
Hauenstein, Florian, 2020
Type of work: Bachelor Thesis
Client: Institut für Wirtschaftsinformatik, HSW FHNW
Supervising lecturers: Zhong, Vivienne Jia; Schmiedel, Theresa
Advances in speech recognition have contributed to the prevalence of virtual
assistants (e.g. Siri) and smart speakers (e.g. Alexa) in daily life. In robotics,
speech recognition is an important capability for enabling smooth conversation
between humans and robots.
This bachelor thesis aims to develop a speech recognition system for Swiss
German. In addition, it evaluates a state-of-the-art tool for enabling speech recognition in Swiss German.
The project was divided into three phases: literature research, configuration setup, and implementation on the robot. In the literature research phase, knowledge of speech recognition was acquired, and the currently available open-source toolkits were surveyed and evaluated. In the configuration setup phase, the complete speech recognition system was set up on a remote server. A collection of Swiss German audio data was also acquired for training the system. In the third phase, implementation, the robot Nao, together with Choregraphe, was connected to the remote server.
The project produced several deliverables and outcomes that benefit the client in their further work. A literature review on speech recognition was conducted, and the currently available open-source speech recognition tools were compiled. The best-fitting speech recognition tool was evaluated, and an architecture for the communication between the robot and the speech recognition tool on a remote server was designed. Furthermore, the partially developed speech recognition system was described, and the data collection created in the project was handed over to the client. The setup of the whole system was documented and can be reproduced. Finally, the project shares lessons learned and makes suggestions for further development of the system.
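To illustrate the kind of robot-to-server communication described above, the following is a minimal sketch in Python. It is not the architecture from the thesis, whose details are not given here. The message format (a length-prefixed JSON header followed by raw audio bytes), the function names, and the stub recognizer are all hypothetical, and the network transport between robot and server is omitted.

```python
import json
import struct
from typing import Callable, Tuple

# Hypothetical wire format (an assumption, not taken from the thesis):
# 4-byte big-endian header length, JSON header, then raw audio bytes.


def encode_request(audio: bytes, sample_rate: int, language: str = "gsw") -> bytes:
    """Robot side: pack recorded audio and its metadata into one message."""
    header = json.dumps({"sample_rate": sample_rate, "language": language}).encode("utf-8")
    return struct.pack(">I", len(header)) + header + audio


def decode_request(message: bytes) -> Tuple[dict, bytes]:
    """Server side: split a message into its metadata header and audio payload."""
    (header_len,) = struct.unpack(">I", message[:4])
    header = json.loads(message[4:4 + header_len].decode("utf-8"))
    audio = message[4 + header_len:]
    return header, audio


def handle_request(message: bytes, recognize: Callable[[bytes, int], str]) -> bytes:
    """Server side: decode the message, run the recognizer, reply with JSON."""
    header, audio = decode_request(message)
    transcript = recognize(audio, header["sample_rate"])
    return json.dumps({"transcript": transcript}).encode("utf-8")


def stub_recognizer(audio: bytes, sample_rate: int) -> str:
    """Placeholder standing in for the real speech recognition toolkit."""
    return "gruezi" if audio else ""


# Robot side: package one second of dummy 16 kHz audio and read the reply.
request = encode_request(b"\x00\x01" * 16000, sample_rate=16000)
reply = json.loads(handle_request(request, stub_recognizer).decode("utf-8"))
print(reply["transcript"])
```

Passing the recognizer in as a function keeps the message handling independent of whichever toolkit runs on the server, so the stub can be swapped for a real model without changing the robot-side code.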
Degree programme: Business Information Technology (Bachelor)
Keywords: Speech Recognition, Nao, Social Robot, Swiss German
Confidentiality: confidential