A Music Structure Informed Downbeat Tracking System Using Skip-chain Conditional Random Fields and Deep Learning

Magdalena Fuentes, Brian McFee, Helene C. Crayencour, Slim Essid, Juan Bello

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In recent years, the task of downbeat tracking has received increasing attention, and the state of the art has improved with the introduction of deep learning methods. Among proposed solutions, existing systems exploit short-term musical rules as part of their language modeling. In this work we show, in an oracle scenario, how including longer-term musical rules, in particular music structure, can enhance downbeat estimation. We introduce a skip-chain conditional random field language model for downbeat tracking designed to include section information in a unified and flexible framework. We combine this model with a state-of-the-art convolutional-recurrent network and contrast the system's performance with that of the commonly used Bar Pointer model. Our experiments on the popular Beatles dataset show that incorporating structure information in the language model leads to more consistent and more robust downbeat estimates.
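To give a rough intuition for the model family named in the abstract: a skip-chain CRF can be viewed as an ordinary linear chain over beat positions, augmented with long-range "skip" factors that link beats occupying the same position in repeated sections. The sketch below is a hypothetical illustration, not the paper's actual model — the factor forms, the function name `score_labels`, and all weights are illustrative assumptions, and real systems use a richer state space with proper inference rather than scoring candidate labelings.

```python
import numpy as np

def score_labels(labels, activations, section_pairs,
                 w_obs=1.0, w_chain=1.0, w_skip=1.0):
    """Score a candidate binary downbeat labeling under a toy skip-chain CRF.

    labels        : sequence of 0/1, 1 = downbeat at that beat position
    activations   : per-beat downbeat likelihoods in [0, 1] (e.g. from a
                    neural network, or an oracle)
    section_pairs : list of (i, j) beat indices that occupy the same
                    position in two repetitions of the same section
    """
    labels = np.asarray(labels)
    # Observation factors: reward labels that agree with the activations.
    obs = np.sum(np.where(labels == 1,
                          np.log(activations + 1e-9),
                          np.log(1.0 - activations + 1e-9)))
    # Chain factors: a crude short-term prior penalizing adjacent downbeats
    # (downbeats should be roughly one bar apart, never on consecutive beats).
    chain = -np.sum(labels[:-1] * labels[1:])
    # Skip factors: reward agreement between structurally linked beats,
    # encoding "repeated sections should share their metrical alignment".
    skip = sum(1.0 if labels[i] == labels[j] else -1.0
               for i, j in section_pairs)
    return w_obs * obs + w_chain * chain + w_skip * skip

# Eight beats (two bars of 4/4); the activation at beat 4 is ambiguous (0.5),
# but a skip link to the confident beat 0 pulls the estimate toward
# labeling beat 4 as a downbeat as well.
acts = np.array([0.9, 0.1, 0.2, 0.1, 0.5, 0.1, 0.2, 0.1])
pairs = [(0, 4), (1, 5), (2, 6), (3, 7)]     # two repetitions of one section
consistent   = [1, 0, 0, 0, 1, 0, 0, 0]
inconsistent = [1, 0, 0, 0, 0, 0, 0, 0]
```

With the skip factors active, the structurally consistent labeling scores strictly higher than the inconsistent one; setting `w_skip=0` makes the two labelings tie, since the observation at the ambiguous beat cannot break it alone. On short sequences, decoding in this toy setup could be done by brute force over all 2^N labelings; the paper's system instead performs inference in the CRF directly.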

Original language: English (US)
Title of host publication: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 481-485
Number of pages: 5
ISBN (Electronic): 9781479981311
DOI: 10.1109/ICASSP.2019.8682870
State: Published - May 1 2019
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: May 12 2019 - May 17 2019

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (Print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country: United Kingdom
City: Brighton
Period: 5/12/19 - 5/17/19


Keywords

  • Convolutional-Recurrent Neural Networks
  • Deep Learning
  • Downbeat Tracking
  • Music Structure
  • Skip-Chain Conditional Random Fields

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Fuentes, M., McFee, B., Crayencour, H. C., Essid, S., & Bello, J. (2019). A Music Structure Informed Downbeat Tracking System Using Skip-chain Conditional Random Fields and Deep Learning. In 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings (pp. 481-485). [8682870] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2019-May). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2019.8682870
