VACW
Voice Assistant Conversations in the wild
Related Paper: Siegert, Ingo: Alexa in the wild - collecting unconstrained conversations with a modern voice assistant in a public environment. In: Proceedings of the LREC 2020, Marseille, France, ELRA, 20, pp. 608-612
Brief Information
Collected Data
Recording Setup
Content of the Dataset
Related Publications
Brief Information
The Voice Assistant Conversations in the wild (VACW) is a collection of unconstrained interactions with a voice assistant during a science fair and comprises around 40 hours of user utterances.
The main goal of the data collection was to make unconstrained voice assistant interactions available for academic research. Therefore, interactions at a public science fair were collected without a dedicated interaction protocol.
Collected Data
To preserve the unconstrained nature of the data, only limited recordings were made:
- Audio recordings and transcripts of the participants while they were actively interacting with an Amazon Echo Dot.
- No further recordings were made and no questionnaires were administered.
Recording Setup
The recording of these public and unconstrained voice assistant interactions took place during the science exhibition of the “MS Wissenschaft” from May until October 2019. The “MS Wissenschaft” is a yearly touring exhibition with a distinct topic; the topic of 2019 was artificial intelligence (AI). The exhibition is housed on a ship that travels through different cities in Germany and Austria, and 27 different exhibits, all of them related to artificial intelligence, could be visited on board. In total, 31 cities were visited; at each station the ship stayed for 3 to 5 days. The exhibition is primarily aimed at school classes but also at interested adults, see Fig. 1. It is supported by various event formats (lectures, meet the scientists, film screenings and panel discussions) to attract visitors. More than 85,000 people, including more than 500 school classes, visited MS Wissenschaft.
As the voice assistant system, we used the Amazon Alexa Echo Dot (2nd generation). We opted for a commercial system to allow fully free interaction with a currently available system.
Content of the Dataset
Duration | 39.9 hours |
Utterances | approx. 37k |
Language | German (incl. dialects) |
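The figures above give a rough sense of how short a typical utterance is. The following minimal sketch works through the arithmetic in Python. It assumes that the 39.9 hours cover user utterances only (as stated in the brief information) and treats the approximate count of 37k as exactly 37,000; the function and constant names are purely illustrative and not part of the dataset.

```python
# Figures taken from the table above; the utterance count is only approximate.
TOTAL_DURATION_HOURS = 39.9
APPROX_UTTERANCES = 37_000  # "approx. 37k", treated as exact for this estimate


def mean_utterance_seconds(total_hours: float, n_utterances: int) -> float:
    """Return the average utterance length in seconds."""
    if n_utterances <= 0:
        raise ValueError("n_utterances must be positive")
    return total_hours * 3600 / n_utterances


mean_s = mean_utterance_seconds(TOTAL_DURATION_HOURS, APPROX_UTTERANCES)
print(f"Mean utterance length: about {mean_s:.1f} s")
```

Under these assumptions the average utterance lasts just under four seconds, which is in line with short, command-like requests to a voice assistant. The true value depends on the exact utterance count.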
Related Publications
Lea Kisser, Ingo Siegert: Erroneous Reactions of Voice Assistants "In the Wild" – First Analyses. In: Elektronische Sprachsignalverarbeitung 2022. Tagungsband der 33. Konferenz, pp. 113–120, Sonderborg, Denmark, 2022