Finding the experts in the crowd

Validity and reliability of crowdsourced measures of children’s gradient speech contrasts

Daphna Harel, Elaine Russo Hitchcock, Daniel Szeredi, José Ortiz, Tara McAllister Byun

Research output: Contribution to journal, Article

Abstract

Perceptual ratings aggregated across multiple nonexpert listeners can be used to measure covert contrast in child speech. Online crowdsourcing provides access to a large pool of raters, but for practical purposes, researchers may wish to use smaller samples. The ratings obtained from these smaller samples may not maintain the high levels of validity seen in larger samples. This study aims to measure the validity and reliability of crowdsourced continuous ratings of child speech, obtained through Visual Analog Scaling, and to identify ways to improve these measurements. We first assess overall validity and interrater reliability for measurements obtained from a large set of raters. Second, we investigate two rater-level measures of quality, individual validity and intrarater reliability, and examine the relationship between them. Third, we show that these estimates may be used to establish guidelines for the inclusion of raters, thus impacting the quality of results obtained when smaller samples are used.
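The rater-level screening outlined in the abstract can be illustrated with a minimal sketch. The abstract does not state which statistics or cutoffs the authors used, so everything below is an assumption made for illustration: Pearson correlation stands in for both intrarater reliability (agreement between a rater's two passes over the same tokens) and individual validity (agreement with a reference measure), the 0.7 threshold is arbitrary, and all function and variable names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rater_quality(first_pass, second_pass, reference):
    """Two rater-level quality estimates (assumed definitions, not the authors')."""
    return {
        # Consistency of one rater across repeated presentations of the same tokens.
        "intrarater": pearson(first_pass, second_pass),
        # Agreement of the rater's first-pass ratings with a reference measure.
        "validity": pearson(first_pass, reference),
    }


def select_raters(ratings, reference, threshold=0.7):
    """Keep raters whose two quality estimates both reach the (arbitrary) threshold.

    ratings maps rater id -> (first_pass, second_pass), each a list of
    Visual Analog Scale values in [0, 1], one per speech token.
    """
    kept = []
    for rater_id, (first_pass, second_pass) in ratings.items():
        quality = rater_quality(first_pass, second_pass, reference)
        if quality["intrarater"] >= threshold and quality["validity"] >= threshold:
            kept.append(rater_id)
    return sorted(kept)


def mean_rating_per_token(ratings, rater_ids):
    """Aggregate first-pass ratings across the selected raters, token by token."""
    n_tokens = len(ratings[rater_ids[0]][0])
    return [
        sum(ratings[r][0][i] for r in rater_ids) / len(rater_ids)
        for i in range(n_tokens)
    ]
```

With such a filter, a small sample would be drawn only from raters who pass the screen, and their ratings averaged per token. Whether a correlation-based screen of this kind matches the published procedure cannot be determined from the abstract alone.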

Original language: English (US)
Pages (from-to): 1-14
Number of pages: 14
Journal: Clinical Linguistics and Phonetics
DOI: 10.3109/02699206.2016.1174306
State: Accepted/In press - Jun 6 2016

Keywords

  • Child speech ratings
  • covert contrasts
  • reliability
  • validity

ASJC Scopus subject areas

  • Speech and Hearing
  • Linguistics and Language
  • Language and Linguistics

Cite this

Finding the experts in the crowd: Validity and reliability of crowdsourced measures of children’s gradient speech contrasts. / Harel, Daphna; Hitchcock, Elaine Russo; Szeredi, Daniel; Ortiz, José; McAllister Byun, Tara.

In: Clinical Linguistics and Phonetics, 06.06.2016, p. 1-14.

@article{db3b16993dea4ee78a1cba212c73f8ab,
title = "Finding the experts in the crowd: Validity and reliability of crowdsourced measures of children’s gradient speech contrasts",
abstract = "Perceptual ratings aggregated across multiple nonexpert listeners can be used to measure covert contrast in child speech. Online crowdsourcing provides access to a large pool of raters, but for practical purposes, researchers may wish to use smaller samples. The ratings obtained from these smaller samples may not maintain the high levels of validity seen in larger samples. This study aims to measure the validity and reliability of crowdsourced continuous ratings of child speech, obtained through Visual Analog Scaling, and to identify ways to improve these measurements. We first assess overall validity and interrater reliability for measurements obtained from a large set of raters. Second, we investigate two rater-level measures of quality, individual validity and intrarater reliability, and examine the relationship between them. Third, we show that these estimates may be used to establish guidelines for the inclusion of raters, thus impacting the quality of results obtained when smaller samples are used.",
keywords = "Child speech ratings, covert contrasts, reliability, validity",
author = "Daphna Harel and Hitchcock, {Elaine Russo} and Daniel Szeredi and Jos{\'e} Ortiz and {McAllister Byun}, Tara",
year = "2016",
month = "6",
day = "6",
doi = "10.3109/02699206.2016.1174306",
language = "English (US)",
pages = "1--14",
journal = "Clinical Linguistics and Phonetics",
issn = "0269-9206",
publisher = "Informa Healthcare",

}

TY - JOUR

T1 - Finding the experts in the crowd

T2 - Validity and reliability of crowdsourced measures of children’s gradient speech contrasts

AU - Harel, Daphna

AU - Hitchcock, Elaine Russo

AU - Szeredi, Daniel

AU - Ortiz, José

AU - McAllister Byun, Tara

PY - 2016/6/6

Y1 - 2016/6/6

N2 - Perceptual ratings aggregated across multiple nonexpert listeners can be used to measure covert contrast in child speech. Online crowdsourcing provides access to a large pool of raters, but for practical purposes, researchers may wish to use smaller samples. The ratings obtained from these smaller samples may not maintain the high levels of validity seen in larger samples. This study aims to measure the validity and reliability of crowdsourced continuous ratings of child speech, obtained through Visual Analog Scaling, and to identify ways to improve these measurements. We first assess overall validity and interrater reliability for measurements obtained from a large set of raters. Second, we investigate two rater-level measures of quality, individual validity and intrarater reliability, and examine the relationship between them. Third, we show that these estimates may be used to establish guidelines for the inclusion of raters, thus impacting the quality of results obtained when smaller samples are used.

KW - Child speech ratings

KW - covert contrasts

KW - reliability

KW - validity

UR - http://www.scopus.com/inward/record.url?scp=84973626421&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84973626421&partnerID=8YFLogxK

U2 - 10.3109/02699206.2016.1174306

DO - 10.3109/02699206.2016.1174306

M3 - Article

SP - 1

EP - 14

JO - Clinical Linguistics and Phonetics

JF - Clinical Linguistics and Phonetics

SN - 0269-9206

ER -