RBC


The Restaurant Booking Corpus

Related Paper: Siegert, Ingo; Nietzold, Jannik; Heinemann, Ralph; Wendemuth, Andreas: The Restaurant Booking Corpus - content-identical comparative human-human and human-computer simulated telephone conversations. In: Elektronische Sprachsignalverarbeitung 2019. Dresden: TUDpress, pp. 126-133 (Studientexte zur Sprachkommunikation; 93)

Brief Information

Collected Data

Recording Setup

Content of the Corpus

Participant's characterization


Brief Information

The Restaurant Booking Corpus is a dataset in which participants perform the same task with either a human being or a technical system as the conversational partner. RBC explicitly assigns the same role to the human conversational partner as to the technical system. To support this role model, certain influencing factors are eliminated: the effect of a visible counterpart, the speech content, and the dialog domain. Additionally, the statements given by the human conversational partner and the technical system are identical.

The requirements for an identical speaker role are:

This is implemented by:

Collected Data

  1. Audio recordings
    • from the participant
    • from the human agent / technical system
  2. Questionnaires
    • Socio-demographic information (before the experiment)
    • Experiences with technical systems in general (before the experiment)
    • Perception of ALEXA and the interlocutor (after the experiment)

Recording Setup

The recordings took place at the Institute of Information Technology and Communications. They were conducted in a living-room-like environment. The aim of this setting was to give the participants a natural communication atmosphere, in contrast to the distraction of laboratory surroundings.

The recordings were made with two high-quality neckband microphones (Sennheiser HSP 2-EW-3) to capture the voices of the participant and the interlocutor, and one high-quality shotgun microphone (Sennheiser ME 66) to capture the overall acoustic scene and especially the output of the voice assistant. The recordings were stored uncompressed in WAV format with a 44.1 kHz sample rate and 16-bit resolution.
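The stated storage format can be checked programmatically. The following is a minimal sketch using Python's standard `wave` module. The function names and the idea of checking one file at a time are assumptions made for illustration. The corpus description does not specify file names, channel layout, or directory structure, so none are assumed here.

```python
import wave

# Format stated in the recording setup: 44.1 kHz sample rate, 16-bit resolution.
EXPECTED_SAMPLE_RATE_HZ = 44_100
EXPECTED_BITS_PER_SAMPLE = 16


def describe_wav(path):
    """Read the header of an uncompressed PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        n_frames = wav.getnframes()
        sample_rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": sample_rate,
            "bits_per_sample": wav.getsampwidth() * 8,  # sample width is in bytes
            "duration_s": n_frames / sample_rate,
        }


def matches_stated_format(info):
    """True if the header agrees with the sample rate and resolution given above."""
    return (
        info["sample_rate_hz"] == EXPECTED_SAMPLE_RATE_HZ
        and info["bits_per_sample"] == EXPECTED_BITS_PER_SAMPLE
    )
```

Calling `describe_wav` on a recording and passing the result to `matches_stated_format` would flag files that were resampled or re-encoded after export. The channel count is reported but deliberately not checked, since the description does not state it.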

Content of the Corpus

Participants: 30 German-speaking students
Distribution of sex: 10 men, 20 women
Distribution of age: mean 24 years, standard deviation 3.45 years (min 18, max 31 years)
Total amount of data: 5 hours, 37 minutes
Mean duration per dialog: 196 seconds
Annotation: utterances, type of utterances, transcripts, context

Siegert, Ingo; Weißkirchen, Norman; Krüger, Julia; Akhtiamov, Oleg; Wendemuth, Andreas: Admitting the addressee detection faultiness of voice assistants to improve the activation performance using a continuous learning framework. In: Cognitive Systems Research, Elsevier BV, 2021

Akhtiamov, Oleg; Siegert, Ingo; Karpov, Alexey; Minker, Wolfgang: Using complexity-identical human- and machine-directed utterances to investigate addressee detection for spoken dialogue systems. In: Sensors. Basel: MDPI, volume 20 (2020), issue 9, article 2740

Baumann, Timo; Siegert, Ingo: Prosodic addressee-detection - ensuring privacy in always-on spoken dialog systems. In: Mensch und Computer 2020 - Tagungsband. New York, New York: The Association for Computing Machinery, Inc., 2020, pp. 195-198

Akhtiamov, Oleg; Siegert, Ingo; Karpov, Alexey; Minker, Wolfgang: Cross-corpus data augmentation for acoustic addressee detection. In: 20th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Stroudsburg, PA, USA: Association for Computational Linguistics (ACL), 2019, pp. 274-283 [Conference: 20th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2019, Stockholm, Sweden, 11-13 September 2019]