Gaussian-Bernoulli restricted Boltzmann machines and automatic feature extraction for noise robust missing data mask estimation

Sami Keronen, Kyunghyun Cho, Tapani Raiko, Alexander Ilin, Kalle Palomaki

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This study presents a missing data mask estimation method based on a Gaussian-Bernoulli restricted Boltzmann machine (GRBM) trained on a cross-correlation representation of the audio signal. The features learned automatically by the GRBM are used to classify the time-frequency units of the spectrographic mask as either noise dominant or speech dominant. The system is evaluated against two baseline mask estimation methods in a speech recognition task set in a reverberant multisource environment. The proposed system is shown to improve speech recognition accuracy over the previous multifeature approaches.
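For readers unfamiliar with the model, the following is a sketch of one commonly used GRBM parameterization. It is given for orientation only: this record does not state which form the paper uses, and the symbols (visible units v_i, binary hidden units h_j, weights W_ij, biases b_i and c_j, per-unit standard deviations sigma_i) are assumptions introduced here rather than notation taken from the paper.

```latex
% Energy of a joint configuration (real-valued v, binary h)
E(\mathbf{v}, \mathbf{h}) =
  \sum_i \frac{(v_i - b_i)^2}{2\sigma_i^2}
  - \sum_{i,j} \frac{v_i}{\sigma_i^2} W_{ij} h_j
  - \sum_j c_j h_j

% Conditional distributions implied by this energy
p(h_j = 1 \mid \mathbf{v}) =
  \operatorname{sigmoid}\Big( c_j + \sum_i W_{ij} \frac{v_i}{\sigma_i^2} \Big)

p(v_i \mid \mathbf{h}) =
  \mathcal{N}\Big( v_i \,\Big|\, b_i + \sum_j W_{ij} h_j,\; \sigma_i^2 \Big)
```

Under this form, the hidden activation probabilities p(h_j = 1 | v) are what would typically serve as the learned features of a real-valued input such as a cross-correlation vector.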

Original language: English (US)
Title of host publication: 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
Pages: 6729-6733
Number of pages: 5
DOIs: https://doi.org/10.1109/ICASSP.2013.6638964
State: Published - Oct 18 2013
Event: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Vancouver, BC, Canada
Duration: May 26 2013 to May 31 2013

Other

Other: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Country: Canada
City: Vancouver, BC
Period: 5/26/13 to 5/31/13

Fingerprint

Feature extraction
Masks
Speech recognition

Keywords

  • deep learning
  • GRBM
  • mask estimation
  • Noise robust
  • speech recognition

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Electrical and Electronic Engineering

Cite this

Keronen, S., Cho, K., Raiko, T., Ilin, A., & Palomaki, K. (2013). Gaussian-Bernoulli restricted Boltzmann machines and automatic feature extraction for noise robust missing data mask estimation. In 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings (pp. 6729-6733). [6638964] https://doi.org/10.1109/ICASSP.2013.6638964

@inproceedings{96531b3e903d4d41a8e456c40c0dde31,
title = "Gaussian-Bernoulli restricted Boltzmann machines and automatic feature extraction for noise robust missing data mask estimation",
abstract = "A missing data mask estimation method based on Gaussian-Bernoulli restricted Boltzmann machine (GRBM) trained on cross-correlation representation of the audio signal is presented in the study. The automatically learned features by the GRBM are utilized in dividing the time-frequency units of the spectrographic mask into noise and speech dominant. The system is evaluated against two baseline mask estimation methods in a reverberant multisource environment speech recognition task. The proposed system is shown to provide a performance improvement in the speech recognition accuracy over the previous multifeature approaches.",
keywords = "deep learning, GRBM, mask estimation, Noise robust, speech recognition",
author = "Sami Keronen and Kyunghyun Cho and Tapani Raiko and Alexander Ilin and Kalle Palomaki",
year = "2013",
month = "10",
day = "18",
doi = "10.1109/ICASSP.2013.6638964",
language = "English (US)",
isbn = "9781479903566",
pages = "6729--6733",
booktitle = "2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings",

}

TY - GEN

T1 - Gaussian-Bernoulli restricted Boltzmann machines and automatic feature extraction for noise robust missing data mask estimation

AU - Keronen, Sami

AU - Cho, Kyunghyun

AU - Raiko, Tapani

AU - Ilin, Alexander

AU - Palomaki, Kalle

PY - 2013/10/18

Y1 - 2013/10/18

AB - A missing data mask estimation method based on Gaussian-Bernoulli restricted Boltzmann machine (GRBM) trained on cross-correlation representation of the audio signal is presented in the study. The automatically learned features by the GRBM are utilized in dividing the time-frequency units of the spectrographic mask into noise and speech dominant. The system is evaluated against two baseline mask estimation methods in a reverberant multisource environment speech recognition task. The proposed system is shown to provide a performance improvement in the speech recognition accuracy over the previous multifeature approaches.

KW - deep learning

KW - GRBM

KW - mask estimation

KW - Noise robust

KW - speech recognition

UR - http://www.scopus.com/inward/record.url?scp=84890493403&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84890493403&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2013.6638964

DO - 10.1109/ICASSP.2013.6638964

M3 - Conference contribution

AN - SCOPUS:84890493403

SN - 9781479903566

SP - 6729

EP - 6733

BT - 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings

ER -