Assessing the reliability of textbook data in syntax: Adger's Core Syntax

Jon Sprouse, Diogo Almeida

Research output: Contribution to journal › Article

Abstract

There has been a consistent pattern of criticism of the reliability of acceptability judgment data in syntax for at least 50 years (e.g., Hill 1961), culminating in several high-profile criticisms within the past ten years (Edelman & Christiansen 2003, Ferreira 2005, Wasow & Arnold 2005, Gibson & Fedorenko 2010, in press). The fundamental claim of these critics is that traditional acceptability judgment collection methods, which tend to be relatively informal compared to methods from experimental psychology, lead to an intolerably high number of false positive results. In this paper we empirically assess this claim by formally testing all 469 (unique, US-English) data points from a popular syntax textbook (Adger 2003) using 440 naïve participants, two judgment tasks (magnitude estimation and yes-no), and three different types of statistical analyses (standard frequentist tests, linear mixed effects models, and Bayes factor analyses). The results suggest that the maximum discrepancy between traditional methods and formal experimental methods is 2%. This suggests that even under the (likely unwarranted) assumption that the discrepant results are all false positives that have found their way into the syntactic literature due to the shortcomings of traditional methods, the minimum replication rate of these 469 data points is 98%. We discuss the implications of these results for questions about the reliability of syntactic data, as well as the practical consequences of these results for the methodological options available to syntacticians.
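A minimal sketch (not from the paper itself) making the abstract's arithmetic concrete, under the assumption that the 2% maximum discrepancy is taken over the full set of 469 data points:

    # Sketch: if at most 2% of the 469 tested data points diverge between
    # traditional and formal experimental methods, then the replication
    # rate of the textbook data is bounded below at roughly 98%.
    import math

    total_points = 469                  # unique US-English data points tested
    max_discrepancy = 0.02              # maximum discrepancy reported (2%)

    # At most ceil(0.02 * 469) = 10 points could be discrepant
    discrepant = math.ceil(total_points * max_discrepancy)
    replication_rate = (total_points - discrepant) / total_points

    print(f"At most {discrepant} discrepant points -> "
          f"minimum replication rate {replication_rate:.1%}")  # ~97.9%, i.e. ~98%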

Original language: English (US)
Pages (from-to): 609-652
Number of pages: 44
Journal: Journal of Linguistics
Volume: 48
Issue number: 3
DOIs: 10.1017/S0022226712000011
State: Published - Oct 1 2012


ASJC Scopus subject areas

  • Language and Linguistics
  • Philosophy
  • Linguistics and Language

Cite this

Assessing the reliability of textbook data in syntax: Adger's Core Syntax. / Sprouse, Jon; Almeida, Diogo.

In: Journal of Linguistics, Vol. 48, No. 3, 01.10.2012, p. 609-652.

Research output: Contribution to journal › Article

@article{ab4103cf62574d568d7857d5d8519d3e,
title = "Assessing the reliability of textbook data in syntax: Adger's Core Syntax",
abstract = "There has been a consistent pattern of criticism of the reliability of acceptability judgment data in syntax for at least 50 years (e.g., Hill 1961), culminating in several high-profile criticisms within the past ten years (Edelman & Christiansen 2003, Ferreira 2005, Wasow & Arnold 2005, Gibson & Fedorenko 2010, in press). The fundamental claim of these critics is that traditional acceptability judgment collection methods, which tend to be relatively informal compared to methods from experimental psychology, lead to an intolerably high number of false positive results. In this paper we empirically assess this claim by formally testing all 469 (unique, US-English) data points from a popular syntax textbook (Adger 2003) using 440 na{\"i}ve participants, two judgment tasks (magnitude estimation and yes-no), and three different types of statistical analyses (standard frequentist tests, linear mixed effects models, and Bayes factor analyses). The results suggest that the maximum discrepancy between traditional methods and formal experimental methods is 2{\%}. This suggests that even under the (likely unwarranted) assumption that the discrepant results are all false positives that have found their way into the syntactic literature due to the shortcomings of traditional methods, the minimum replication rate of these 469 data points is 98{\%}. We discuss the implications of these results for questions about the reliability of syntactic data, as well as the practical consequences of these results for the methodological options available to syntacticians.",
author = "Jon Sprouse and Diogo Almeida",
year = "2012",
month = "10",
day = "1",
doi = "10.1017/S0022226712000011",
language = "English (US)",
volume = "48",
pages = "609--652",
journal = "Journal of Linguistics",
issn = "0022-2267",
publisher = "Cambridge University Press",
number = "3",

}

TY - JOUR

T1 - Assessing the reliability of textbook data in syntax

T2 - Adger's Core Syntax

AU - Sprouse, Jon

AU - Almeida, Diogo

PY - 2012/10/1

Y1 - 2012/10/1

N2 - There has been a consistent pattern of criticism of the reliability of acceptability judgment data in syntax for at least 50 years (e.g., Hill 1961), culminating in several high-profile criticisms within the past ten years (Edelman & Christiansen 2003, Ferreira 2005, Wasow & Arnold 2005, Gibson & Fedorenko 2010, in press). The fundamental claim of these critics is that traditional acceptability judgment collection methods, which tend to be relatively informal compared to methods from experimental psychology, lead to an intolerably high number of false positive results. In this paper we empirically assess this claim by formally testing all 469 (unique, US-English) data points from a popular syntax textbook (Adger 2003) using 440 naïve participants, two judgment tasks (magnitude estimation and yes-no), and three different types of statistical analyses (standard frequentist tests, linear mixed effects models, and Bayes factor analyses). The results suggest that the maximum discrepancy between traditional methods and formal experimental methods is 2%. This suggests that even under the (likely unwarranted) assumption that the discrepant results are all false positives that have found their way into the syntactic literature due to the shortcomings of traditional methods, the minimum replication rate of these 469 data points is 98%. We discuss the implications of these results for questions about the reliability of syntactic data, as well as the practical consequences of these results for the methodological options available to syntacticians.

AB - There has been a consistent pattern of criticism of the reliability of acceptability judgment data in syntax for at least 50 years (e.g., Hill 1961), culminating in several high-profile criticisms within the past ten years (Edelman & Christiansen 2003, Ferreira 2005, Wasow & Arnold 2005, Gibson & Fedorenko 2010, in press). The fundamental claim of these critics is that traditional acceptability judgment collection methods, which tend to be relatively informal compared to methods from experimental psychology, lead to an intolerably high number of false positive results. In this paper we empirically assess this claim by formally testing all 469 (unique, US-English) data points from a popular syntax textbook (Adger 2003) using 440 naïve participants, two judgment tasks (magnitude estimation and yes-no), and three different types of statistical analyses (standard frequentist tests, linear mixed effects models, and Bayes factor analyses). The results suggest that the maximum discrepancy between traditional methods and formal experimental methods is 2%. This suggests that even under the (likely unwarranted) assumption that the discrepant results are all false positives that have found their way into the syntactic literature due to the shortcomings of traditional methods, the minimum replication rate of these 469 data points is 98%. We discuss the implications of these results for questions about the reliability of syntactic data, as well as the practical consequences of these results for the methodological options available to syntacticians.

UR - http://www.scopus.com/inward/record.url?scp=84868273866&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84868273866&partnerID=8YFLogxK

U2 - 10.1017/S0022226712000011

DO - 10.1017/S0022226712000011

M3 - Article

VL - 48

SP - 609

EP - 652

JO - Journal of Linguistics

JF - Journal of Linguistics

SN - 0022-2267

IS - 3

ER -