FHNW RoboLab - Mapping Robot’s Body Language to Its Speech

NAO is an example of a social robot. Its speech capabilities were enhanced in a previous project. This project enables NAO to perform gestures while the enhanced speech system is in use.

Brancato, Rosario, 2020

Type of work: Bachelor Thesis

Client: Institute for Information Systems, HSW FHNW

Supervisors: Zhong, Vivienne Jia; Schmiedel, Theresa

Humanoid robots are becoming more common in the business world. NAO is an example of a robot that uses voice and body language to interact with humans, as a person would. In a previous project, NAO’s speech capabilities were enhanced with an external speech system. So far, however, the robot has not performed any gestures while using the external system. The objective of this project is to map NAO’s native animations to its speech, including when the enhanced speech system is used.

First, the existing system was analysed and its main technologies were identified. This was followed by the development of a gesture-to-speech mapping system. The implementation and integration of the new system, named AnimationMapper, were documented with flowcharts and class diagrams. An empirical evaluation was carried out to test the new system with users. The aim was to see whether the gesture-to-speech mapping system had an impact on the perception of NAO as a communication partner and whether the quality of NAO’s presentation had improved.

The Python module AnimationMapper was developed to reach the project goal of mapping gestures to speech. It is integrated into the intermediate server and extends the response sent to NAO with an animation command. AnimationMapper takes the response text generated by Dialogflow, a conversational user interface service, as input and decides which gesture is most appropriate. The NLP library NLTK is used to identify the most relevant words in the input text. Through string and similarity comparison, these words are linked to the tags of NAO’s animations; word2vec was used to create the vector model for the similarity comparison. NAO’s functionality was extended so that it executes the chosen gesture while talking. Twenty-four people took part in the empirical evaluation. The participants rated two conversations with NAO, one with gestures and one without. The conversation with gestures was considered more appropriate, and the perception of the conversation was also rated higher. The traits sympathetic, lively, active, engaged, communicative and fun-loving were all rated more positively in the conversation with gestures.
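The mapping step described above can be illustrated with a minimal sketch. It is not the thesis implementation: a small hand-written stop-word list stands in for the NLTK-based word selection, a toy dictionary of three-dimensional vectors stands in for the trained word2vec model, and the tag names, animation names, the 0.8 threshold and the functions `relevant_words`, `choose_animation` and `extend_response` are all assumptions made for illustration.

```python
import math
import re

# Hypothetical tag -> animation table; the real tags belong to NAO's animations.
ANIMATION_TAGS = {
    "hello": "wave_animation",
    "yes": "nod_animation",
    "explain": "explain_animation",
}

# Toy vectors standing in for a trained word2vec model (assumption).
TOY_VECTORS = {
    "hello": (1.0, 0.0, 0.0),
    "welcome": (0.9, 0.1, 0.0),
    "yes": (0.0, 1.0, 0.0),
    "sure": (0.1, 0.9, 0.1),
    "explain": (0.0, 0.0, 1.0),
    "describe": (0.0, 0.2, 0.9),
}

# Tiny stop-word list standing in for NLTK-based filtering (assumption).
STOP_WORDS = {"a", "an", "the", "to", "i", "you", "it", "is", "of",
              "our", "how", "are", "will", "and"}


def relevant_words(text):
    """Lower-case the text, split it into words and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def choose_animation(text, threshold=0.8):
    """Return an animation name for the text, or None if nothing fits."""
    words = relevant_words(text)
    # Step 1: string comparison, a word that equals a tag wins immediately.
    for word in words:
        if word in ANIMATION_TAGS:
            return ANIMATION_TAGS[word]
    # Step 2: similarity comparison against every tag in the vector model.
    best_tag, best_score = None, threshold
    for word in words:
        vec = TOY_VECTORS.get(word)
        if vec is None:
            continue  # word is unknown to the vector model
        for tag in ANIMATION_TAGS:
            score = cosine(vec, TOY_VECTORS[tag])
            if score >= best_score:
                best_tag, best_score = tag, score
    return ANIMATION_TAGS[best_tag] if best_tag else None


def extend_response(response_text):
    """Attach the chosen animation to the response sent to the robot."""
    return {"text": response_text, "animation": choose_animation(response_text)}
```

In this sketch, a text with no matching or sufficiently similar word yields `None`, so the response is sent without a gesture. Trying the string comparison before the similarity comparison keeps exact tag matches cheap and predictable.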

Degree programme: Business Information Technology (Bachelor)

Keywords: Gesture, speech, mapper, NAO, robot, animation, FHNW, body language, part-of-speech, vector model

Confidentiality: confidential

Client

Institute for Information Systems, HSW FHNW, Basel

Year of publication

2020

Language of the thesis

English

Programme location

Basel
