Downbeat tracking with multiple features and deep neural networks

Simon Durand, Juan P. Bello, Bertrand David, Gael Richard

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we introduce a novel method for the automatic estimation of downbeat positions from music signals. Our system relies on the computation of musically inspired features capturing important aspects of music such as timbre, harmony, rhythmic patterns, or local similarities in both timbre and harmony. It then uses several independent deep neural networks to learn higher-level representations. The downbeat sequences are finally obtained through a temporal decoding step based on the Viterbi algorithm. A comparative evaluation conducted on varied datasets demonstrates the efficiency of our approach and its robustness across different music styles.
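The abstract outlines a three-stage pipeline: per-feature likelihood estimation with independent deep networks, fusion of their outputs, and Viterbi-based temporal decoding. The sketch below is a minimal illustration of the fusion-and-decoding idea only, not the authors' implementation: random placeholder scores stand in for the feature extraction and the deep networks, and the beat-in-bar state layout, transition weights, and function names (fuse_likelihoods, viterbi_downbeats) are assumptions made for the example.

# Minimal sketch (assumptions, not the published system): several independent
# per-feature models each emit a downbeat likelihood per beat, the likelihoods
# are fused by averaging, and a Viterbi pass over beat-position-in-bar states
# picks the most plausible downbeat sequence.
import numpy as np

def fuse_likelihoods(per_feature_scores):
    """Average downbeat likelihoods from independent feature-specific models.

    per_feature_scores: array of shape (n_features, n_beats), values in [0, 1].
    Returns a single (n_beats,) downbeat likelihood curve.
    """
    return np.mean(per_feature_scores, axis=0)

def viterbi_downbeats(downbeat_prob, bar_length=4, switch_penalty=1e-3):
    """Decode downbeat positions with a Viterbi pass over beat-in-bar states.

    States 0..bar_length-1 are the position of a beat inside the bar; state 0
    is the downbeat. Transitions advance cyclically to the next position; any
    other jump is discouraged by switch_penalty (an assumed value).
    """
    n_beats = len(downbeat_prob)
    n_states = bar_length
    eps = 1e-12
    # Emission log-probabilities: state 0 emits the downbeat likelihood,
    # the remaining states share its complement equally.
    emis = np.empty((n_beats, n_states))
    emis[:, 0] = np.log(downbeat_prob + eps)
    emis[:, 1:] = np.log((1.0 - downbeat_prob + eps) / (n_states - 1))[:, None]
    # Transition matrix: strong preference for advancing one position (mod bar).
    trans = np.full((n_states, n_states), np.log(switch_penalty))
    for s in range(n_states):
        trans[s, (s + 1) % n_states] = np.log(1.0 - (n_states - 1) * switch_penalty)
    # Standard Viterbi recursion with back-pointers, uniform initial state prior.
    delta = np.full(n_states, -np.log(n_states)) + emis[0]
    back = np.zeros((n_beats, n_states), dtype=int)
    for t in range(1, n_beats):
        scores = delta[:, None] + trans          # scores[from_state, to_state]
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(n_states)] + emis[t]
    # Backtrack the best state path; beats decoded as state 0 are downbeats.
    path = np.empty(n_beats, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(n_beats - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return np.flatnonzero(path == 0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder for per-feature network outputs on 16 beats: a noisy 4/4
    # pattern where every fourth beat looks like a downbeat.
    pattern = np.tile([0.9, 0.1, 0.2, 0.1], 4)
    scores = np.clip(pattern + rng.normal(0, 0.05, (3, 16)), 0, 1)
    downbeat_prob = fuse_likelihoods(scores)
    print("decoded downbeat beat indices:", viterbi_downbeats(downbeat_prob))

On the placeholder 4/4 pattern this decodes every fourth beat as a downbeat, showing how the cyclic transition model locks the decoder onto a consistent bar phase even when individual per-beat likelihoods are noisy.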

Original language: English (US)
Title of host publication: 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 409-413
Number of pages: 5
Volume: 2015-August
ISBN (Electronic): 9781467369978
DOIs: https://doi.org/10.1109/ICASSP.2015.7178001
State: Published - Aug 4 2015
Event: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia
Duration: Apr 19 2015 - Apr 24 2015

Other

Other: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Country: Australia
City: Brisbane
Period: 4/19/15 - 4/24/15

Fingerprint

  • Viterbi algorithm
  • Decoding
  • Deep neural networks

Keywords

  • Deep Networks
  • Downbeat Tracking
  • Music Information Retrieval
  • Music Signal Processing

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Durand, S., Bello, J. P., David, B., & Richard, G. (2015). Downbeat tracking with multiple features and deep neural networks. In 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings (Vol. 2015-August, pp. 409-413). [7178001] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2015.7178001

Downbeat tracking with multiple features and deep neural networks. / Durand, Simon; Bello, Juan P.; David, Bertrand; Richard, Gael.

2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings. Vol. 2015-August Institute of Electrical and Electronics Engineers Inc., 2015. p. 409-413 7178001.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Durand, S, Bello, JP, David, B & Richard, G 2015, Downbeat tracking with multiple features and deep neural networks. in 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings. vol. 2015-August, 7178001, Institute of Electrical and Electronics Engineers Inc., pp. 409-413, 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015, Brisbane, Australia, 4/19/15. https://doi.org/10.1109/ICASSP.2015.7178001
Durand S, Bello JP, David B, Richard G. Downbeat tracking with multiple features and deep neural networks. In 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings. Vol. 2015-August. Institute of Electrical and Electronics Engineers Inc. 2015. p. 409-413. 7178001 https://doi.org/10.1109/ICASSP.2015.7178001
Durand, Simon ; Bello, Juan P. ; David, Bertrand ; Richard, Gael. / Downbeat tracking with multiple features and deep neural networks. 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings. Vol. 2015-August Institute of Electrical and Electronics Engineers Inc., 2015. pp. 409-413
@inproceedings{08300b49aa4743f0ac4f7e3a91e300c7,
title = "Downbeat tracking with multiple features and deep neural networks",
abstract = "In this paper, we introduce a novel method for the automatic estimation of downbeat positions from music signals. Our system relies on the computation of musically inspired features capturing important aspects of music such as timbre, harmony, rhythmic patterns, or local similarities in both timbre and harmony. It then uses several independent deep neural networks to learn higher-level representations. The downbeat sequences are finally obtained thanks to a temporal decoding step based on the Viterbi algorithm. The comparative evaluation conducted on varied datasets demonstrates the efficiency and robustness across different music styles of our approach.",
keywords = "Deep Networks, Downbeat Tracking, Music Information Retrieval, Music Signal Processing",
author = "Simon Durand and Bello, {Juan P.} and Bertrand David and Gael Richard",
year = "2015",
month = "8",
day = "4",
doi = "10.1109/ICASSP.2015.7178001",
language = "English (US)",
volume = "2015-August",
pages = "409--413",
booktitle = "2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN

T1 - Downbeat tracking with multiple features and deep neural networks

AU - Durand, Simon

AU - Bello, Juan P.

AU - David, Bertrand

AU - Richard, Gael

PY - 2015/8/4

Y1 - 2015/8/4

N2 - In this paper, we introduce a novel method for the automatic estimation of downbeat positions from music signals. Our system relies on the computation of musically inspired features capturing important aspects of music such as timbre, harmony, rhythmic patterns, or local similarities in both timbre and harmony. It then uses several independent deep neural networks to learn higher-level representations. The downbeat sequences are finally obtained thanks to a temporal decoding step based on the Viterbi algorithm. The comparative evaluation conducted on varied datasets demonstrates the efficiency and robustness across different music styles of our approach.

AB - In this paper, we introduce a novel method for the automatic estimation of downbeat positions from music signals. Our system relies on the computation of musically inspired features capturing important aspects of music such as timbre, harmony, rhythmic patterns, or local similarities in both timbre and harmony. It then uses several independent deep neural networks to learn higher-level representations. The downbeat sequences are finally obtained thanks to a temporal decoding step based on the Viterbi algorithm. The comparative evaluation conducted on varied datasets demonstrates the efficiency and robustness across different music styles of our approach.

KW - Deep Networks

KW - Downbeat Tracking

KW - Music Information Retrieval

KW - Music Signal Processing

UR - http://www.scopus.com/inward/record.url?scp=84946023293&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84946023293&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2015.7178001

DO - 10.1109/ICASSP.2015.7178001

M3 - Conference contribution

AN - SCOPUS:84946023293

VL - 2015-August

SP - 409

EP - 413

BT - 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -