Detection of auditory (cross-spectral) and auditory-visual (cross-modal) synchrony

Ken W. Grant, Virginie Van Wassenhove, David Poeppel

Research output: Contribution to journal › Article

Abstract

Detection thresholds for temporal synchrony in auditory and auditory-visual sentence materials were obtained from normal-hearing subjects. For auditory conditions, thresholds were determined using an adaptive-tracking procedure that controlled the temporal asynchrony of one narrow audio band of speech, positive and negative in separate tracks, relative to three other narrow audio bands of speech. For auditory-visual conditions, thresholds were determined in a similar manner for each of four narrow audio bands of speech, as well as for a broadband speech condition, relative to a video image of a female speaker. The four filtered conditions and the broadband condition were compared to determine whether detection thresholds depended on the spectral content of the acoustic speech signal. Previous studies of auditory-visual speech recognition showed a broad, asymmetrical range of temporal asynchrony over which intelligibility was essentially unaffected (audio delays roughly between -40 ms and +240 ms). Consistent with those studies, auditory-visual synchrony detection thresholds also showed a broad, asymmetrical pattern of similar magnitude (audio delays roughly between -45 ms and +200 ms). No differences in synchrony thresholds were observed among the filtered bands of speech or between the filtered bands and broadband speech. In contrast, detection thresholds for audio-alone conditions were much smaller (between -17 ms and +23 ms) and symmetrical. These results suggest a fairly tight coupling between a subject's ability to detect cross-spectral (auditory) and cross-modal (auditory-visual) asynchrony and the intelligibility of auditory and auditory-visual speech materials. Published by Elsevier B.V.
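
The abstract names an adaptive-tracking procedure but does not state its rule. As a rough illustration only, the Python sketch below runs a generic two-down/one-up staircase over audio delay, with audio lag and audio lead tracked separately as the abstract describes. The tracking rule, the starting delay, the step sizes, the number of reversals, and the deterministic `simulated_observer` are all assumptions made for this example and are not taken from the paper; the observer simply uses the approximate auditory-visual window quoted in the abstract (-45 ms to +200 ms).

```python
from statistics import mean


def simulated_observer(delay_ms):
    """Hypothetical, noise-free observer (an assumption for illustration).

    Reports asynchrony once the audio delay leaves the approximate
    auditory-visual window quoted in the abstract (-45 ms to +200 ms).
    """
    return delay_ms <= -45.0 or delay_ms >= 200.0


def run_staircase(detects, sign=+1, start_ms=400.0, step_ms=40.0,
                  min_step_ms=2.5, n_reversals=10, max_trials=500):
    """Two-down/one-up adaptive track over the magnitude of audio delay.

    sign=+1 tracks audio lag, sign=-1 tracks audio lead (separate tracks).
    Returns the signed mean delay over the last six reversals.
    """
    magnitude = start_ms
    step = step_ms
    hits_in_a_row = 0
    last_direction = 0  # -1: asynchrony shrinking, +1: asynchrony growing
    reversals = []

    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if detects(sign * magnitude):
            hits_in_a_row += 1
            if hits_in_a_row < 2:
                continue  # two detections are needed before shrinking
            hits_in_a_row = 0
            direction = -1
        else:
            hits_in_a_row = 0
            direction = +1  # one miss is enough to grow the asynchrony

        if last_direction != 0 and direction != last_direction:
            reversals.append(magnitude)
            step = max(min_step_ms, step / 2)  # halve the step per reversal
        last_direction = direction
        magnitude = max(0.0, magnitude + direction * step)

    if not reversals:
        raise ValueError("track ended without any reversals")
    return sign * mean(reversals[-6:])


if __name__ == "__main__":
    print("audio-lag estimate (ms):", run_staircase(simulated_observer, sign=+1))
    print("audio-lead estimate (ms):", run_staircase(simulated_observer, sign=-1))
```

With a real listener the responses would be noisy, so the estimate would converge on a percent-correct point set by the tracking rule (about 70.7% for two-down/one-up) rather than on a sharp boundary.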

Original language: English (US)
Pages (from-to): 43-53
Number of pages: 11
Journal: Speech Communication
Volume: 44
Issue number: 1-4 SPEC. ISS.
DOI: 10.1016/j.specom.2004.06.004
State: Published - Oct 2004

Keywords

  • Auditory-visual speech processing
  • Cross-modal asynchrony
  • Spectro-temporal asynchrony

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Experimental and Cognitive Psychology
  • Linguistics and Language

Cite this

Detection of auditory (cross-spectral) and auditory-visual (cross-modal) synchrony. / Grant, Ken W.; Van Wassenhove, Virginie; Poeppel, David.

In: Speech Communication, Vol. 44, No. 1-4 SPEC. ISS., 10.2004, p. 43-53.

@article{0b704b0f7f9c40a7bc353f079bb9a9e6,
title = "Detection of auditory (cross-spectral) and auditory-visual (cross-modal) synchrony",
keywords = "Auditory-visual speech processing, Cross-modal asynchrony, Spectro-temporal asynchrony",
author = "Grant, {Ken W.} and {Van Wassenhove}, Virginie and Poeppel, David",
year = "2004",
month = "10",
doi = "10.1016/j.specom.2004.06.004",
language = "English (US)",
volume = "44",
pages = "43--53",
journal = "Speech Communication",
issn = "0167-6393",
publisher = "Elsevier",
number = "1-4 SPEC. ISS."
}
