Sound texture perception via statistics of the auditory periphery: Evidence from sound synthesis

Josh H. McDermott, Eero Simoncelli

Research output: Contribution to journal › Article

Abstract

Rainstorms, insect swarms, and galloping horses produce "sound textures," the collective result of many similar acoustic events. Sound textures are distinguished by temporal homogeneity, suggesting they could be recognized with time-averaged statistics. To test this hypothesis, we processed real-world textures with an auditory model containing filters tuned for sound frequencies and their modulations, and measured statistics of the resulting decomposition. We then assessed the realism and recognizability of novel sounds synthesized to have matching statistics. Statistics of individual frequency channels, capturing spectral power and sparsity, generally failed to produce compelling synthetic textures; however, combining them with correlations between channels produced identifiable and natural-sounding textures. Synthesis quality declined if statistics were computed from biologically implausible auditory models. The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations. The synthesis methodology offers a powerful tool for their further investigation.
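As a rough illustration of the "time-averaged statistics" idea in the abstract, the sketch below computes per-channel moments and between-channel correlations from a set of channel envelopes. It is a minimal sketch, not the authors' model: the function name `texture_statistics`, the choice of mean, variance, and skewness as the per-channel statistics, and the random toy envelopes are all assumptions made for this example, and the auditory filtering stages described in the abstract are skipped entirely.

```python
import numpy as np


def texture_statistics(envelopes):
    """Time-averaged statistics of channel envelopes (hypothetical helper).

    envelopes: array of shape (n_channels, n_samples), one row per
    frequency channel. Returns per-channel moments and the matrix of
    correlations between channels.
    """
    env = np.asarray(envelopes, dtype=float)
    mean = env.mean(axis=1)                      # average level per channel
    centered = env - mean[:, None]
    variance = (centered ** 2).mean(axis=1)
    std = np.sqrt(variance)
    # Skewness is used here as a simple stand-in for a sparsity measure.
    skewness = (centered ** 3).mean(axis=1) / std ** 3
    correlation = np.corrcoef(env)               # between-channel correlations
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "correlation": correlation,
    }


# Toy input (assumed, not real audio): channels 0 and 1 share a common
# component, channel 2 is independent of both.
rng = np.random.default_rng(0)
shared = rng.random(4000)
envelopes = np.stack([
    shared + 0.1 * rng.random(4000),
    shared + 0.1 * rng.random(4000),
    rng.random(4000),
])
stats = texture_statistics(envelopes)
```

In this toy case the per-channel moments of channels 0 and 1 are nearly identical to each other, while the correlation matrix additionally records that the two channels vary together, which is the kind of information the abstract attributes to correlations between channels.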

Original language: English (US)
Pages (from-to): 926-940
Number of pages: 15
Journal: Neuron
Volume: 71
Issue number: 5
DOI: 10.1016/j.neuron.2011.06.032
State: Published - Sep 8 2011

Fingerprint

Acoustics
Horses
Insects
Population
Power (Psychology)

ASJC Scopus subject areas

  • Neuroscience(all)

Cite this

Sound texture perception via statistics of the auditory periphery: Evidence from sound synthesis. / McDermott, Josh H.; Simoncelli, Eero.

In: Neuron, Vol. 71, No. 5, 08.09.2011, p. 926-940.

@article{2cc3133e02b44e35b8890b8ab98021af,
title = "Sound texture perception via statistics of the auditory periphery: Evidence from sound synthesis",
abstract = "Rainstorms, insect swarms, and galloping horses produce {"}sound textures,{"} the collective result of many similar acoustic events. Sound textures are distinguished by temporal homogeneity, suggesting they could be recognized with time-averaged statistics. To test this hypothesis, we processed real-world textures with an auditory model containing filters tuned for sound frequencies and their modulations, and measured statistics of the resulting decomposition. We then assessed the realism and recognizability of novel sounds synthesized to have matching statistics. Statistics of individual frequency channels, capturing spectral power and sparsity, generally failed to produce compelling synthetic textures; however, combining them with correlations between channels produced identifiable and natural-sounding textures. Synthesis quality declined if statistics were computed from biologically implausible auditory models. The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations. The synthesis methodology offers a powerful tool for their further investigation.",
author = "McDermott, {Josh H.} and Simoncelli, Eero",
year = "2011",
month = "9",
day = "8",
doi = "10.1016/j.neuron.2011.06.032",
language = "English (US)",
volume = "71",
pages = "926--940",
journal = "Neuron",
issn = "0896-6273",
publisher = "Cell Press",
number = "5",

}

TY - JOUR

T1 - Sound texture perception via statistics of the auditory periphery

T2 - Evidence from sound synthesis

AU - McDermott, Josh H.

AU - Simoncelli, Eero

PY - 2011/9/8

Y1 - 2011/9/8

N2 - Rainstorms, insect swarms, and galloping horses produce "sound textures," the collective result of many similar acoustic events. Sound textures are distinguished by temporal homogeneity, suggesting they could be recognized with time-averaged statistics. To test this hypothesis, we processed real-world textures with an auditory model containing filters tuned for sound frequencies and their modulations, and measured statistics of the resulting decomposition. We then assessed the realism and recognizability of novel sounds synthesized to have matching statistics. Statistics of individual frequency channels, capturing spectral power and sparsity, generally failed to produce compelling synthetic textures; however, combining them with correlations between channels produced identifiable and natural-sounding textures. Synthesis quality declined if statistics were computed from biologically implausible auditory models. The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations. The synthesis methodology offers a powerful tool for their further investigation.

AB - Rainstorms, insect swarms, and galloping horses produce "sound textures," the collective result of many similar acoustic events. Sound textures are distinguished by temporal homogeneity, suggesting they could be recognized with time-averaged statistics. To test this hypothesis, we processed real-world textures with an auditory model containing filters tuned for sound frequencies and their modulations, and measured statistics of the resulting decomposition. We then assessed the realism and recognizability of novel sounds synthesized to have matching statistics. Statistics of individual frequency channels, capturing spectral power and sparsity, generally failed to produce compelling synthetic textures; however, combining them with correlations between channels produced identifiable and natural-sounding textures. Synthesis quality declined if statistics were computed from biologically implausible auditory models. The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations. The synthesis methodology offers a powerful tool for their further investigation.

UR - http://www.scopus.com/inward/record.url?scp=80052406394&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80052406394&partnerID=8YFLogxK

U2 - 10.1016/j.neuron.2011.06.032

DO - 10.1016/j.neuron.2011.06.032

M3 - Article

C2 - 21903084

AN - SCOPUS:80052406394

VL - 71

SP - 926

EP - 940

JO - Neuron

JF - Neuron

SN - 0896-6273

IS - 5

ER -