Generalized optimization algorithm for speech recognition transducers

Cyril Allauzen, Mehryar Mohri

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Weighted transducers provide a common representation for the components of a speech recognition system. In previous work, we showed that these components can be combined off-line into a single compact recognition transducer that maps HMM state sequences directly to word sequences. The construction of that recognition transducer and its efficiency of use depend critically on a general optimization algorithm, determinization. However, not all weighted automata and transducers used in large-vocabulary speech recognition are determinizable. We present a general algorithm that can make an arbitrary weighted transducer determinizable, and we generalize our previous optimization technique for building an integrated recognition transducer to handle arbitrary weighted transducers used in speech recognition. We report experimental results on a large-vocabulary speech recognition task, How May I Help You (HMIHY), showing that our generalized technique yields a recognition transducer that performs as well as our original solution for classical n-gram models while inserting fewer special symbols, and that it improves recognition speed substantially, by a factor of 2.6, on the same task when using a class-based language model.
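As a rough illustration of what determinization does to a weighted automaton, the sketch below applies the standard weighted subset construction over the tropical semiring (min, +) to a small acceptor. It is not the generalized algorithm of the paper, and it handles acceptors rather than transducers. The function names, the dictionary layout, and the `max_states` guard are assumptions made for this example only.

```python
from collections import defaultdict, deque


def determinize(initial, arcs, finals, max_states=1000):
    """Weighted subset construction over the tropical semiring (min, +).

    arcs:   {state: [(label, weight, next_state), ...]}
    finals: {state: final_weight}
    Returns (start, det_arcs, det_finals). Each new state is a frozenset
    of (original_state, residual_weight) pairs.
    """
    start = frozenset({(initial, 0)})
    det_arcs, det_finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        subset = queue.popleft()
        # Final weight: best (residual + final weight) among final members.
        ends = [v + finals[q] for q, v in subset if q in finals]
        if ends:
            det_finals[subset] = min(ends)
        # Group outgoing arcs by label, keeping the best weight per destination.
        by_label = defaultdict(dict)
        for q, v in subset:
            for label, w, nxt in arcs.get(q, []):
                best = by_label[label]
                best[nxt] = min(best.get(nxt, v + w), v + w)
        for label, dests in by_label.items():
            # The arc carries the smallest weight; the rest is kept as residuals.
            arc_weight = min(dests.values())
            target = frozenset((q, w - arc_weight) for q, w in dests.items())
            det_arcs[(subset, label)] = (arc_weight, target)
            if target not in seen:
                if len(seen) >= max_states:
                    # Hypothetical guard: the construction need not terminate.
                    raise RuntimeError("state limit reached; input may not be determinizable")
                seen.add(target)
                queue.append(target)
    return start, det_arcs, det_finals


def path_weight(start, det_arcs, det_finals, word):
    """Weight of `word` in the determinized automaton, or None if rejected."""
    state, total = start, 0
    for label in word:
        if (state, label) not in det_arcs:
            return None
        w, state = det_arcs[(state, label)]
        total += w
    return total + det_finals[state] if state in det_finals else None


# Two competing "a" arcs leave state 0, so the input is nondeterministic.
arcs = {
    0: [("a", 1, 1), ("a", 3, 2)],
    1: [("b", 4, 3)],
    2: [("b", 1, 3), ("c", 1, 3)],
}
finals = {3: 0}
result = determinize(0, arcs, finals)
print(path_weight(*result, "ab"))  # 4
```

On a cyclic input whose competing paths drift apart in weight, the residuals never repeat and the construction keeps creating new states; the `max_states` guard stops it with an error. That is the situation the abstract refers to when it says that not all weighted automata are determinizable.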

Original language: English (US)
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Pages: 352-355
Number of pages: 4
Volume: 1
State: Published - 2003
Event: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing - Hong Kong, Hong Kong
Duration: Apr 6 2003 to Apr 10 2003

Other

Other: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing
Country: Hong Kong
City: Hong Kong
Period: 4/6/03 to 4/10/03

Fingerprint

Speech recognition
Transducers
Optimization

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing
  • Acoustics and Ultrasonics

Cite this

Allauzen, C., & Mohri, M. (2003). Generalized optimization algorithm for speech recognition transducers. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 1, pp. 352-355)

Generalized optimization algorithm for speech recognition transducers. / Allauzen, Cyril; Mohri, Mehryar.

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Vol. 1 2003. p. 352-355.

Allauzen, C & Mohri, M 2003, Generalized optimization algorithm for speech recognition transducers. in ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. vol. 1, pp. 352-355, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, Hong Kong, Hong Kong, 4/6/03.
Allauzen C, Mohri M. Generalized optimization algorithm for speech recognition transducers. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Vol. 1. 2003. p. 352-355
Allauzen, Cyril ; Mohri, Mehryar. / Generalized optimization algorithm for speech recognition transducers. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Vol. 1 2003. pp. 352-355
@inproceedings{8680d61dbefb404eb914fe56f7918c27,
title = "Generalized optimization algorithm for speech recognition transducers",
abstract = "Weighted transducers provide a common representation for the components of a speech recognition system. In previous work, we showed that these components can be combined off-line into a single compact recognition transducer that maps HMM state sequences directly to word sequences. The construction of that recognition transducer and its efficiency of use depend critically on a general optimization algorithm, determinization. However, not all weighted automata and transducers used in large-vocabulary speech recognition are determinizable. We present a general algorithm that can make an arbitrary weighted transducer determinizable, and we generalize our previous optimization technique for building an integrated recognition transducer to handle arbitrary weighted transducers used in speech recognition. We report experimental results on a large-vocabulary speech recognition task, How May I Help You (HMIHY), showing that our generalized technique yields a recognition transducer that performs as well as our original solution for classical n-gram models while inserting fewer special symbols, and that it improves recognition speed substantially, by a factor of 2.6, on the same task when using a class-based language model.",
author = "Cyril Allauzen and Mehryar Mohri",
year = "2003",
language = "English (US)",
volume = "1",
pages = "352--355",
booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

}

TY - GEN

T1 - Generalized optimization algorithm for speech recognition transducers

AU - Allauzen, Cyril

AU - Mohri, Mehryar

PY - 2003

Y1 - 2003

N2 - Weighted transducers provide a common representation for the components of a speech recognition system. In previous work, we showed that these components can be combined off-line into a single compact recognition transducer that maps HMM state sequences directly to word sequences. The construction of that recognition transducer and its efficiency of use depend critically on a general optimization algorithm, determinization. However, not all weighted automata and transducers used in large-vocabulary speech recognition are determinizable. We present a general algorithm that can make an arbitrary weighted transducer determinizable, and we generalize our previous optimization technique for building an integrated recognition transducer to handle arbitrary weighted transducers used in speech recognition. We report experimental results on a large-vocabulary speech recognition task, How May I Help You (HMIHY), showing that our generalized technique yields a recognition transducer that performs as well as our original solution for classical n-gram models while inserting fewer special symbols, and that it improves recognition speed substantially, by a factor of 2.6, on the same task when using a class-based language model.

UR - http://www.scopus.com/inward/record.url?scp=0141591502&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0141591502&partnerID=8YFLogxK

M3 - Conference contribution

VL - 1

SP - 352

EP - 355

BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

ER -