RBC

The Restaurant Booking Corpus

Related Paper: Siegert, Ingo; Nietzold, Jannik; Heinemann, Ralph; Wendemuth, Andreas: The Restaurant Booking Corpus - content-identical comparative human-human and human-computer simulated telephone conversations. In: Elektronische Sprachsignalverarbeitung 2019 - Dresden: TUDpress, p. 126-133 - (Studientexte zur Sprachkommunikation; 93)

Brief Information

The Restaurant Booking Corpus (RBC) is a dataset in which participants perform the same task with either a human being or a technical system as conversational partner. RBC explicitly assigns the same role to the human conversational partner as to the technical system. To support this role model, certain influencing factors are eliminated: the effect of a visible counterpart, the speech content, and the dialog domain. Additionally, the statements given by the human conversational partner and by the technical system are identical.

The requirements for an identical speaker role are:

Similar speech commands that feel “natural”
Similar speech answers given by human and system
No visible counterpart, neither as a GUI, an avatar, nor a “body”

This is implemented through:

Limited task of restaurant booking with some additional constraints
Scripted answers by human agent and technical system
Conversations over simulated telephone

Collected Data

Audio recordings
- from the participant
- from the human agent / technical system
Questionnaires
- Socio-demographic information (before the experiment)
- Experiences with technical systems in general (before the experiment)
- Perception of ALEXA and the interlocutor (after the experiment)

Recording Setup

The recordings took place at the Institute of Information Technology and Communications. They were conducted in a living-room-like environment. The aim of this setting was to let the participant settle into a natural communication atmosphere, in contrast to the distraction of laboratory surroundings.

The recordings were made with two high-quality neckband microphones (Sennheiser HSP 2-EW-3) to capture the voices of the participant and the interlocutor, and one high-quality shotgun microphone (Sennheiser ME 66) to capture the overall acoustic scene and especially the output of the voice assistant. The recordings were stored uncompressed in WAV format with a 44.1 kHz sample rate and 16-bit resolution.
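When working with the audio files, it can help to confirm that a file matches the stated format before further processing. The following is a minimal sketch using only Python's standard `wave` module. The function names `check_wav_format` and `make_silent_wav` are hypothetical helpers, the silent test signal is stand-in data, and the corpus' actual file names and channel layout are not specified here, so none are assumed.

```python
import io
import wave

EXPECTED_RATE_HZ = 44_100   # sample rate stated in the recording setup
EXPECTED_SAMPLE_WIDTH = 2   # 16-bit resolution = 2 bytes per sample


def check_wav_format(source):
    """Return (ok, info) for a WAV file path or binary file-like object."""
    with wave.open(source, "rb") as wav:
        info = {
            "rate_hz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }
    ok = (
        info["rate_hz"] == EXPECTED_RATE_HZ
        and info["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH
    )
    return ok, info


def make_silent_wav(seconds=1, rate_hz=EXPECTED_RATE_HZ, width=EXPECTED_SAMPLE_WIDTH):
    """Build an in-memory mono WAV of zero samples (stand-in data, not corpus audio)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(width)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00" * (seconds * rate_hz * width))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    ok, info = check_wav_format(make_silent_wav())
    print(ok, info)
```

To check a real recording, pass its path instead of the in-memory buffer, for example `check_wav_format("some_recording.wav")`, where the path is a placeholder.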

Content of the Corpus


Participants	30 German-speaking students
Distribution of Sex	10 men, 20 women
Distribution of Age	Mean: 24 years, SD: 3.45 years, Min: 18 years, Max: 31 years
Total amount of data	5 hours, 37 minutes
Mean duration per dialog	196 seconds
Annotation	utterances, type of utterances, transcripts, context
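The summary figures allow a rough back-of-the-envelope estimate of how many dialogs the corpus contains. The sketch below is only an estimate under a stated assumption: that the total amount of data is the sum of all dialog durations. The corpus description does not confirm this, and the actual dialog count may differ. The function name `estimate_dialog_count` is hypothetical.

```python
# Figures quoted in the corpus summary
TOTAL_SECONDS = 5 * 3600 + 37 * 60   # 5 hours, 37 minutes
MEAN_DIALOG_SECONDS = 196
PARTICIPANTS = 30


def estimate_dialog_count(total_s, mean_s):
    """Estimate the number of dialogs, ASSUMING total_s is the summed dialog time."""
    return total_s / mean_s


dialogs = estimate_dialog_count(TOTAL_SECONDS, MEAN_DIALOG_SECONDS)
per_participant = dialogs / PARTICIPANTS

print(f"total duration: {TOTAL_SECONDS} s")
print(f"estimated dialogs: {dialogs:.1f}")
print(f"estimated dialogs per participant: {per_participant:.2f}")
```

Under this assumption the estimate comes out at roughly 103 dialogs, or between three and four per participant. Treat this as a plausibility check on the table, not as a documented property of the corpus.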

Related Publications

Siegert, Ingo; Weißkirchen, Norman; Krüger, Julia; Akhtiamov, Oleg; Wendemuth, Andreas: Admitting the addressee detection faultiness of voice assistants to improve the activation performance using a continuous learning framework. In: Cognitive Systems Research, Elsevier BV, 2021

Akhtiamov, Oleg; Siegert, Ingo; Karpov, Alexey; Minker, Wolfgang: Using complexity-identical human- and machine-directed utterances to investigate addressee detection for spoken dialogue systems. In: Sensors - Basel: MDPI, Volume 20 (2020), issue 9, article 2740

Baumann, Timo; Siegert, Ingo: Prosodic addressee-detection - ensuring privacy in always-on spoken dialog systems. In: Mensch und Computer 2020 - Tagungsband - New York, New York: The Association for Computing Machinery, Inc., 2020, p. 195-198

Akhtiamov, Oleg; Siegert, Ingo; Karpov, Alexey; Minker, Wolfgang: Cross-corpus data augmentation for acoustic addressee detection. In: 20th Annual Meeting of the Special Interest Group on Discourse and Dialogue - Stroudsburg, PA, USA: Association for Computational Linguistics (ACL), p. 274-283, 2019; [Conference: 20th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2019, Stockholm, Sweden, 11-13 September 2019]
