FHNW RoboLab - Mapping Robot’s Body Language to Its Speech

NAO is an example of a social robot. Its speech capabilities were enhanced in a previous project. This project enables NAO to use gestures while the enhanced speech system is in use.

Rosario Brancato, 2020

Bachelor Thesis, Institute for Information Systems, School of Business FHNW
Supervisors: Vivienne Jia Zhong, Theresa Schmiedel
Keywords: Gesture, speech, mapper, NAO, robot, animation, FHNW, body language, part-of-speech, vector model
Humanoid robots are becoming more common in the business world. NAO is an example of a robot that uses voice and body language to interact with humans, as a person would. In a previous project, NAO’s speech capabilities were enhanced with an external speech system. So far, however, the robot has not performed any gestures while using the external system. The objective of this project is to map NAO’s native animations to its speech, including when the enhanced speech system is used.
First, the existing system was analysed and its main technologies were identified. This was followed by the development of a gesture-to-speech mapping system. The implementation and integration of the new system, named AnimationMapper, were documented with flowcharts and class diagrams. An empirical evaluation was carried out to test the new system with users. The aim was to see whether the gesture-to-speech mapping system had an impact on the perception of NAO as a communication partner and whether the quality of NAO’s presentation had improved.
The Python module AnimationMapper was developed to reach the project goal of mapping gestures to speech. It is integrated into the intermediate server and extends the response sent to NAO with an animation command. AnimationMapper takes the response text generated by Dialogflow, a conversational user interface service, as input to decide which gesture is the most appropriate. The NLP library NLTK was used to identify the most relevant words in the input text. These words were then linked to the tags of NAO’s animations through string and similarity comparison. word2vec was used to create the vector model for the similarity comparison. NAO’s functionalities were enhanced to let it execute the chosen gesture while talking. Twenty-four people took part in the empirical evaluation. The participants rated two conversations with NAO, one with gestures and one without. The results showed that the conversation with gestures was considered more appropriate. Furthermore, the perception of the conversation was also rated higher. The traits sympathetic, lively, active, engaged, communicative and fun-loving were all rated more positively in the conversation with gestures.
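The mapping step described above can be illustrated with a minimal sketch. It is not the thesis code: to stay self-contained it replaces NLTK's part-of-speech filtering with a small stop-word list and replaces the trained word2vec model with a hand-written table of toy vectors. The tag names, animation names, the `threshold` parameter, the response fields and the order of the two comparisons (exact string match first, similarity second) are all assumptions made for illustration.

```python
import math
import re

# Hypothetical animation tags and placeholder animation names
# (not NAO's real animation catalogue).
ANIMATION_TAGS = {
    "hello": "animations/hello_example",
    "yes": "animations/yes_example",
    "explain": "animations/explain_example",
}

# Toy 3-dimensional vectors standing in for a trained word2vec model.
TOY_VECTORS = {
    "hello": (1.0, 0.0, 0.0),
    "greetings": (0.9, 0.1, 0.0),
    "yes": (0.0, 1.0, 0.0),
    "sure": (0.1, 0.9, 0.1),
    "explain": (0.0, 0.0, 1.0),
    "describe": (0.1, 0.1, 0.9),
}

# Small stop-word list standing in for part-of-speech filtering with NLTK.
STOP_WORDS = {"a", "an", "the", "i", "you", "it", "is", "am", "to",
              "of", "and", "can", "that", "me", "let"}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relevant_words(text):
    """Lower-case the text, tokenise it and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def choose_animation(text, threshold=0.7):
    """Return an animation name for the text, or None if nothing fits."""
    words = relevant_words(text)
    # Step 1: string comparison, a word that equals a tag wins outright.
    for word in words:
        if word in ANIMATION_TAGS:
            return ANIMATION_TAGS[word]
    # Step 2: similarity comparison against every tag vector.
    best_tag, best_score = None, threshold
    for word in words:
        vector = TOY_VECTORS.get(word)
        if vector is None:
            continue  # word is unknown to the toy model
        for tag in ANIMATION_TAGS:
            score = cosine(vector, TOY_VECTORS[tag])
            if score > best_score:
                best_tag, best_score = tag, score
    return ANIMATION_TAGS[best_tag] if best_tag else None


def extend_response(response_text):
    """Attach the chosen animation to the text response (hypothetical format)."""
    return {"text": response_text, "animation": choose_animation(response_text)}
```

In this sketch, a text with no matching or similar word yields `None` for the animation, so the robot would simply speak without a gesture. How the real AnimationMapper handles that case is not stated in the abstract.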
Degree programme: Business Information Technology (Bachelor)
Subject area of the thesis: Business Information System & IT-Management
Confidentiality: confidential
Type of thesis: Bachelor Thesis
Institute for Information Systems, School of Business FHNW, Basel