Feedforward and feedback in speech perception

Revisiting analysis by synthesis

David Poeppel, Philip J. Monahan

Research output: Contribution to journal › Article

Abstract

We revisit the analysis by synthesis (A × S) approach to speech recognition. In the late 1950s and 1960s, Stevens and Halle proposed a model of spoken word recognition in which candidate word representations were synthesised from brief cues in the auditory signal and analysed against the input signal in tightly linked bottom-up/top-down fashion. While this approach failed to garner much support at the time, recent years have brought a surge of interest in Bayesian approaches to perception, and the idea of A × S has consequently gained attention, particularly in the domain of visual perception. We review the model and illustrate some data from speech perception that are well-accounted for in the context of such an architecture. We focus on prediction in speech perception, an operation at the centre of the A × S algorithm. The data reviewed here and the current possibilities to study online measures of speech processing using cognitive neuroscience methods, in our view, add to a provocative series of arguments why A × S should be reconsidered as a contender in speech recognition research, complementing currently more dominant models.

Original language: English (US)
Pages (from-to): 935-951
Number of pages: 17
Journal: Language and Cognitive Processes
Volume: 26
Issue number: 7
DOI: 10.1080/01690965.2010.493301
State: Published - Aug 2011

Fingerprint

Speech Perception
Visual Perception
Bayes Theorem
Cues
visual perception
neurosciences
Research
candidacy
Recognition (Psychology)
Speech Recognition

Keywords

  • Bayesian
  • Cognitive neuroscience
  • Perception

ASJC Scopus subject areas

  • Linguistics and Language
  • Experimental and Cognitive Psychology

Cite this

Feedforward and feedback in speech perception: Revisiting analysis by synthesis. / Poeppel, David; Monahan, Philip J.

In: Language and Cognitive Processes, Vol. 26, No. 7, 08.2011, p. 935-951.

@article{fd0f43ab791c4863a392f3d8d6450879,
title = "Feedforward and feedback in speech perception: Revisiting analysis by synthesis",
abstract = "We revisit the analysis by synthesis (A × S) approach to speech recognition. In the late 1950s and 1960s, Stevens and Halle proposed a model of spoken word recognition in which candidate word representations were synthesised from brief cues in the auditory signal and analysed against the input signal in tightly linked bottom-up/top-down fashion. While this approach failed to garner much support at the time, recent years have brought a surge of interest in Bayesian approaches to perception, and the idea of A × S has consequently gained attention, particularly in the domain of visual perception. We review the model and illustrate some data from speech perception that are well-accounted for in the context of such an architecture. We focus on prediction in speech perception, an operation at the centre of the A × S algorithm. The data reviewed here and the current possibilities to study online measures of speech processing using cognitive neuroscience methods, in our view, add to a provocative series of arguments why A × S should be reconsidered as a contender in speech recognition research, complementing currently more dominant models.",
keywords = "Bayesian, Cognitive neuroscience, Perception",
author = "Poeppel, David and Monahan, {Philip J.}",
year = "2011",
month = "8",
doi = "10.1080/01690965.2010.493301",
language = "English (US)",
volume = "26",
pages = "935--951",
journal = "Language, Cognition and Neuroscience",
issn = "2327-3798",
publisher = "Taylor and Francis",
number = "7",

}

TY - JOUR

T1 - Feedforward and feedback in speech perception

T2 - Revisiting analysis by synthesis

AU - Poeppel, David

AU - Monahan, Philip J.

PY - 2011/8

Y1 - 2011/8

AB - We revisit the analysis by synthesis (A × S) approach to speech recognition. In the late 1950s and 1960s, Stevens and Halle proposed a model of spoken word recognition in which candidate word representations were synthesised from brief cues in the auditory signal and analysed against the input signal in tightly linked bottom-up/top-down fashion. While this approach failed to garner much support at the time, recent years have brought a surge of interest in Bayesian approaches to perception, and the idea of A × S has consequently gained attention, particularly in the domain of visual perception. We review the model and illustrate some data from speech perception that are well-accounted for in the context of such an architecture. We focus on prediction in speech perception, an operation at the centre of the A × S algorithm. The data reviewed here and the current possibilities to study online measures of speech processing using cognitive neuroscience methods, in our view, add to a provocative series of arguments why A × S should be reconsidered as a contender in speech recognition research, complementing currently more dominant models.

KW - Bayesian

KW - Cognitive neuroscience

KW - Perception

UR - http://www.scopus.com/inward/record.url?scp=79960524722&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79960524722&partnerID=8YFLogxK

U2 - 10.1080/01690965.2010.493301

DO - 10.1080/01690965.2010.493301

M3 - Article

VL - 26

SP - 935

EP - 951

JO - Language, Cognition and Neuroscience

JF - Language, Cognition and Neuroscience

SN - 2327-3798

IS - 7

ER -