Crowdsourcing Multi-label Audio Annotation Tasks with Citizen Scientists

Mark Cartwright, Graham Dove, Ana Elisa Méndez Méndez, Juan Bello, Oded Nov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Annotating rich audio data is an essential aspect of training and evaluating machine listening systems. We approach this task in the context of temporally-complex urban soundscapes, which require multiple labels to identify overlapping sound sources. Typically this work is crowdsourced, and previous studies have shown that workers can quickly label audio with binary annotation for single classes. However, this approach can be difficult to scale when multiple passes with different focus classes are required to annotate data with multiple labels. In citizen science, where tasks are often image-based, annotation efforts typically label multiple classes simultaneously in a single pass. This paper describes our data collection on the Zooniverse citizen science platform, comparing the efficiencies of different audio annotation strategies. We compared multiple-pass binary annotation, single-pass multi-label annotation, and a hybrid approach: hierarchical multi-pass multi-label annotation. We discuss our findings, which support using multi-label annotation, with reference to volunteer citizen scientists’ motivations.
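
To make the contrast between the three annotation strategies concrete, here is a minimal Python sketch (not the authors' implementation; the two-level sound taxonomy and function names are hypothetical) that counts how many annotation passes each strategy requests for a single audio clip:

# Illustrative sketch, not the paper's code: counting annotation passes
# per clip under the three strategies described in the abstract.
# The two-level class taxonomy below is a hypothetical example.
TAXONOMY = {
    "machinery": ["jackhammer", "drill"],
    "vehicles": ["car horn", "siren", "engine"],
    "human": ["talking", "shouting"],
}

def multiple_pass_binary(taxonomy):
    # One yes/no pass per fine-grained class.
    return sum(len(fine) for fine in taxonomy.values())

def single_pass_multilabel(taxonomy):
    # A single pass in which the annotator marks every class present.
    return 1

def hierarchical_multipass_multilabel(taxonomy, coarse_present):
    # One multi-label pass over the coarse classes, followed by one
    # multi-label pass for each coarse class marked present.
    return 1 + len(coarse_present)

present = ["vehicles", "human"]  # hypothetical coarse labels heard in a clip
print(multiple_pass_binary(TAXONOMY))                        # 7 passes
print(single_pass_multilabel(TAXONOMY))                      # 1 pass
print(hierarchical_multipass_multilabel(TAXONOMY, present))  # 3 passes

Under these assumptions, the hybrid strategy trades the single pass's long checklist for a few short, focused passes; the relative efficiency of these options is the question the paper studies.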

Original language: English (US)
Title of host publication: CHI 2019 - Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450359702
DOIs: https://doi.org/10.1145/3290605.3300522
State: Published - May 2, 2019
Event: 2019 CHI Conference on Human Factors in Computing Systems, CHI 2019 - Glasgow, United Kingdom
Duration: May 4, 2019 – May 9, 2019

Publication series

Name: Conference on Human Factors in Computing Systems - Proceedings

Conference

Conference: 2019 CHI Conference on Human Factors in Computing Systems, CHI 2019
Country: United Kingdom
City: Glasgow
Period: 5/4/19 – 5/9/19

Fingerprint

  • Labels
  • Acoustic waves

Keywords

  • Audio annotation
  • Citizen science
  • Crowdsourcing

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Graphics and Computer-Aided Design
  • Software

Cite this

Cartwright, M., Dove, G., Méndez, A. E. M., Bello, J., & Nov, O. (2019). Crowdsourcing Multi-label Audio Annotation Tasks with Citizen Scientists. In CHI 2019 - Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Conference on Human Factors in Computing Systems - Proceedings). Association for Computing Machinery. https://doi.org/10.1145/3290605.3300522

Crowdsourcing Multi-label Audio Annotation Tasks with Citizen Scientists. / Cartwright, Mark; Dove, Graham; Méndez, Ana Elisa Méndez; Bello, Juan; Nov, Oded.

CHI 2019 - Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, 2019. (Conference on Human Factors in Computing Systems - Proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Cartwright, M, Dove, G, Méndez, AEM, Bello, J & Nov, O 2019, Crowdsourcing Multi-label Audio Annotation Tasks with Citizen Scientists. in CHI 2019 - Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Conference on Human Factors in Computing Systems - Proceedings, Association for Computing Machinery, 2019 CHI Conference on Human Factors in Computing Systems, CHI 2019, Glasgow, United Kingdom, 5/4/19. https://doi.org/10.1145/3290605.3300522
Cartwright M, Dove G, Méndez AEM, Bello J, Nov O. Crowdsourcing Multi-label Audio Annotation Tasks with Citizen Scientists. In CHI 2019 - Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery. 2019. (Conference on Human Factors in Computing Systems - Proceedings). https://doi.org/10.1145/3290605.3300522
Cartwright, Mark ; Dove, Graham ; Méndez, Ana Elisa Méndez ; Bello, Juan ; Nov, Oded. / Crowdsourcing Multi-label Audio Annotation Tasks with Citizen Scientists. CHI 2019 - Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, 2019. (Conference on Human Factors in Computing Systems - Proceedings).
@inproceedings{402273cf0ff844db83d36ba7bb968233,
title = "Crowdsourcing Multi-label Audio Annotation Tasks with Citizen Scientists",
abstract = "Annotating rich audio data is an essential aspect of training and evaluating machine listening systems. We approach this task in the context of temporally-complex urban soundscapes, which require multiple labels to identify overlapping sound sources. Typically this work is crowdsourced, and previous studies have shown that workers can quickly label audio with binary annotation for single classes. However, this approach can be difficult to scale when multiple passes with different focus classes are required to annotate data with multiple labels. In citizen science, where tasks are often image-based, annotation efforts typically label multiple classes simultaneously in a single pass. This paper describes our data collection on the Zooniverse citizen science platform, comparing the efficiencies of different audio annotation strategies. We compared multiple-pass binary annotation, single-pass multi-label annotation, and a hybrid approach: hierarchical multi-pass multi-label annotation. We discuss our findings, which support using multi-label annotation, with reference to volunteer citizen scientists’ motivations.",
keywords = "Audio annotation, Citizen science, Crowdsourcing",
author = "Mark Cartwright and Graham Dove and M{\'e}ndez, {Ana Elisa M{\'e}ndez} and Juan Bello and Oded Nov",
year = "2019",
month = "5",
day = "2",
doi = "10.1145/3290605.3300522",
language = "English (US)",
series = "Conference on Human Factors in Computing Systems - Proceedings",
publisher = "Association for Computing Machinery",
booktitle = "CHI 2019 - Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems",

}

TY - GEN

T1 - Crowdsourcing Multi-label Audio Annotation Tasks with Citizen Scientists

AU - Cartwright, Mark

AU - Dove, Graham

AU - Méndez, Ana Elisa Méndez

AU - Bello, Juan

AU - Nov, Oded

PY - 2019/5/2

Y1 - 2019/5/2

N2 - Annotating rich audio data is an essential aspect of training and evaluating machine listening systems. We approach this task in the context of temporally-complex urban soundscapes, which require multiple labels to identify overlapping sound sources. Typically this work is crowdsourced, and previous studies have shown that workers can quickly label audio with binary annotation for single classes. However, this approach can be difficult to scale when multiple passes with different focus classes are required to annotate data with multiple labels. In citizen science, where tasks are often image-based, annotation efforts typically label multiple classes simultaneously in a single pass. This paper describes our data collection on the Zooniverse citizen science platform, comparing the efficiencies of different audio annotation strategies. We compared multiple-pass binary annotation, single-pass multi-label annotation, and a hybrid approach: hierarchical multi-pass multi-label annotation. We discuss our findings, which support using multi-label annotation, with reference to volunteer citizen scientists’ motivations.

AB - Annotating rich audio data is an essential aspect of training and evaluating machine listening systems. We approach this task in the context of temporally-complex urban soundscapes, which require multiple labels to identify overlapping sound sources. Typically this work is crowdsourced, and previous studies have shown that workers can quickly label audio with binary annotation for single classes. However, this approach can be difficult to scale when multiple passes with different focus classes are required to annotate data with multiple labels. In citizen science, where tasks are often image-based, annotation efforts typically label multiple classes simultaneously in a single pass. This paper describes our data collection on the Zooniverse citizen science platform, comparing the efficiencies of different audio annotation strategies. We compared multiple-pass binary annotation, single-pass multi-label annotation, and a hybrid approach: hierarchical multi-pass multi-label annotation. We discuss our findings, which support using multi-label annotation, with reference to volunteer citizen scientists’ motivations.

KW - Audio annotation

KW - Citizen science

KW - Crowdsourcing

UR - http://www.scopus.com/inward/record.url?scp=85064262497&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064262497&partnerID=8YFLogxK

U2 - 10.1145/3290605.3300522

DO - 10.1145/3290605.3300522

M3 - Conference contribution

T3 - Conference on Human Factors in Computing Systems - Proceedings

BT - CHI 2019 - Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems

PB - Association for Computing Machinery

ER -