VACW


Voice Assistant Conversations in the wild

Related Paper: Siegert, Ingo: Alexa in the wild - collecting unconstrained conversations with a modern voice assistant in a public environment. In: Proceedings of the LREC 2020, Marseille, France, ELRA, 20, pp. 608-612

Brief Information

Collected Data

Recording Setup

Content of the Dataset


Brief Information

The Voice Assistant Conversations in the wild (VACW) is a collection of unconstrained interactions with a voice assistant during a science fair and comprises around 40 hours of user utterances.

The main goal of the data collection was to make unconstrained voice assistant interactions available for academic research. Therefore, interactions at a public science fair were collected without a dedicated interaction protocol.

Procedure

Collected Data

To preserve the unconstrained nature of the data, only limited recordings were made:

  1. Audio recordings and transcripts of the participants while they were actively interacting with an Amazon Echo Dot.
  2. No further recordings were made and no questionnaires were administered.

Recording Setup

The recording of these public and unconstrained voice assistant interactions took place during the science exhibition of the “MS Wissenschaft” from May until October 2019. The “MS Wissenschaft” is a yearly touring exhibition with a distinct topic; the topic of 2019 was artificial intelligence (AI). The exhibition is housed on a ship that travels through different cities in Germany and Austria. On board, 27 different exhibits, all of them related to artificial intelligence, could be visited. In total, 31 cities were visited, and at each stop the ship stayed for 3 to 5 days. The exhibition is primarily aimed at school classes but also at interested adults, see Fig. 1. It is supported by various event formats (lectures, meet the scientists, film screenings and panel discussions) to attract visitors. More than 85 000 people, among them more than 500 school classes, visited the MS Wissenschaft.

As the voice assistant system, we used the Amazon Alexa Echo Dot (2nd generation). We opted for a commercial system to allow fully free interaction with a currently available system.

Picture of the exhibit used to record interactions with modern voice assistants.
Picture of the 2019 exhibition at the MS Wissenschaft featuring AI topics (Copyright: Ilja Hendel/WiD).

Content of the Dataset

Duration: 39.9 hours
Utterances: approx. 37k
Language: German (incl. dialects)
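
As a rough orientation, the two figures above imply an average utterance length of just under four seconds. The following is a minimal sketch of that arithmetic, assuming the rounded utterance count of 37 000; the function and constant names are illustrative and not part of the dataset.

```python
# Figures taken from the dataset summary; the utterance count is approximate.
TOTAL_HOURS = 39.9
APPROX_UTTERANCES = 37_000


def mean_utterance_seconds(total_hours: float, n_utterances: int) -> float:
    """Return the average duration per utterance in seconds."""
    if n_utterances <= 0:
        raise ValueError("n_utterances must be positive")
    return total_hours * 3600 / n_utterances


mean_seconds = mean_utterance_seconds(TOTAL_HOURS, APPROX_UTTERANCES)
print(f"Mean utterance duration: about {mean_seconds:.1f} s")
```

Because the utterance count is only given approximately, the result should be read as an estimate rather than a measured statistic.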

Lea Kisser, Ingo Siegert: Erroneous Reactions of Voice Assistants "In the Wild" - First Analyses. In: Elektronische Sprachsignalverarbeitung 2022. Tagungsband der 33. Konferenz, pp. 113-120, Sonderborg, Denmark, 2022