A robust mid-level representation for harmonic content in music signals

Juan P. Bello, Jeremy Pickens

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

When considering the problem of audio-to-audio matching, determining musical similarity using low-level features such as Fourier transforms and MFCCs is an extremely difficult task, as there is little semantic information available. Full semantic transcription of audio is an unreliable and imperfect task in the best case, an unsolved problem in the worst. To this end we propose a robust mid-level representation that incorporates both harmonic and rhythmic information, without attempting full transcription. We describe a process for creating this representation automatically, directly from multi-timbral and polyphonic music signals, with an emphasis on popular music. We also offer various evaluations of our techniques. More so than most approaches working from raw audio, we incorporate musical knowledge into our assumptions, our models, and our processes. Our hope is that by utilizing this notion of a musically-motivated mid-level representation we may help bridge the gap between symbolic and audio research.

Original language: English (US)
Title of host publication: ISMIR 2005 - 6th International Conference on Music Information Retrieval
Pages: 304-311
Number of pages: 8
State: Published - 2005
Event: 6th International Conference on Music Information Retrieval, ISMIR 2005 - London, United Kingdom
Duration: Sep 11 2005 to Sep 15 2005

Other

Other: 6th International Conference on Music Information Retrieval, ISMIR 2005
Country: United Kingdom
City: London
Period: 9/11/05 to 9/15/05

Keywords

  • Harmonic description
  • Music similarity
  • Segmentation

ASJC Scopus subject areas

  • Music
  • Information Systems

Cite this

Bello, J. P., & Pickens, J. (2005). A robust mid-level representation for harmonic content in music signals. In ISMIR 2005 - 6th International Conference on Music Information Retrieval (pp. 304-311).

@inproceedings{b5175d68b3c24154b921887a97e33985,
title = "A robust mid-level representation for harmonic content in music signals",
abstract = "When considering the problem of audio-to-audio matching, determining musical similarity using low-level features such as Fourier transforms and MFCCs is an extremely difficult task, as there is little semantic information available. Full semantic transcription of audio is an unreliable and imperfect task in the best case, an unsolved problem in the worst. To this end we propose a robust mid-level representation that incorporates both harmonic and rhythmic information, without attempting full transcription. We describe a process for creating this representation automatically, directly from multi-timbral and polyphonic music signals, with an emphasis on popular music. We also offer various evaluations of our techniques. More so than most approaches working from raw audio, we incorporate musical knowledge into our assumptions, our models, and our processes. Our hope is that by utilizing this notion of a musically-motivated mid-level representation we may help bridge the gap between symbolic and audio research.",
keywords = "Harmonic description, Music similarity, Segmentation",
author = "Bello, {Juan P.} and Jeremy Pickens",
year = "2005",
language = "English (US)",
isbn = "9780955117909",
pages = "304--311",
booktitle = "ISMIR 2005 - 6th International Conference on Music Information Retrieval",

}

TY - GEN

T1 - A robust mid-level representation for harmonic content in music signals

AU - Bello, Juan P.

AU - Pickens, Jeremy

PY - 2005

Y1 - 2005

N2 - When considering the problem of audio-to-audio matching, determining musical similarity using low-level features such as Fourier transforms and MFCCs is an extremely difficult task, as there is little semantic information available. Full semantic transcription of audio is an unreliable and imperfect task in the best case, an unsolved problem in the worst. To this end we propose a robust mid-level representation that incorporates both harmonic and rhythmic information, without attempting full transcription. We describe a process for creating this representation automatically, directly from multi-timbral and polyphonic music signals, with an emphasis on popular music. We also offer various evaluations of our techniques. More so than most approaches working from raw audio, we incorporate musical knowledge into our assumptions, our models, and our processes. Our hope is that by utilizing this notion of a musically-motivated mid-level representation we may help bridge the gap between symbolic and audio research.

KW - Harmonic description

KW - Music similarity

KW - Segmentation

UR - http://www.scopus.com/inward/record.url?scp=84873553947&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84873553947&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9780955117909

SP - 304

EP - 311

BT - ISMIR 2005 - 6th International Conference on Music Information Retrieval

ER -