Lattice kernels for spoken-dialog classification

Corinna Cortes, Patrick Haffner, Mehryar Mohri

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Classification is a key task in spoken-dialog systems. The response of a spoken-dialog system is often guided by the category assigned to the speaker's utterance. Unfortunately, classifiers based on the one-best transcription of the speech utterances are not satisfactory because of the high word error rate of conversational speech recognition systems. Since the correct transcription may not be the highest ranking one, but often will be represented in the word lattices output by the recognizer, the classification accuracy can be much higher if the full lattice is exploited both during training and classification. In this paper we present the first principled approach for classification based on full lattices. For this purpose, we use the Support Vector Machine framework with kernels for lattices. The lattice kernels we define belong to the general class of rational kernels. We give efficient algorithms for computing kernels for arbitrary lattices and report experiments using the algorithm in a difficult call-classification task with 38 categories. Our experiments with a trigram lattice kernel show a 15% reduction in error rate at a 30% rejection level.
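The abstract does not spell out the kernel itself, so the following is only an illustrative sketch of the idea behind an n-gram lattice kernel, under a strong simplifying assumption: each lattice is given as an explicit list of (word sequence, posterior probability) hypotheses rather than as a weighted automaton. The kernel value is taken to be the dot product of the expected n-gram counts of the two lattices. The function names `expected_ngram_counts` and `ngram_kernel`, and the toy lattices, are hypothetical and not from the paper. The efficient algorithms the paper refers to operate on the lattices directly and are not reproduced here.

```python
from collections import Counter


def expected_ngram_counts(hypotheses, n):
    """Expected count of every n-gram in a lattice.

    Assumption: the lattice is given as a list of
    (list_of_words, probability) pairs, one per path.
    """
    counts = Counter()
    for words, prob in hypotheses:
        # Each occurrence of an n-gram on a path contributes that path's probability.
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += prob
    return counts


def ngram_kernel(hyps_a, hyps_b, n):
    """Dot product of the expected n-gram count vectors of two lattices."""
    counts_a = expected_ngram_counts(hyps_a, n)
    counts_b = expected_ngram_counts(hyps_b, n)
    return sum(weight * counts_b[gram]
               for gram, weight in counts_a.items()
               if gram in counts_b)


# Toy lattices (made-up data): two competing hypotheses vs. a single one.
lattice_a = [
    ("i want to pay my bill".split(), 0.5),
    ("i want to play my bill".split(), 0.5),
]
lattice_b = [("pay my bill please".split(), 1.0)]

bigram_value = ngram_kernel(lattice_a, lattice_b, 2)
trigram_value = ngram_kernel(lattice_a, lattice_b, 3)
```

In this toy case the second hypothesis of `lattice_a` still contributes the shared bigram "my bill" even though it misrecognizes "pay", which is the kind of evidence a one-best classifier would discard. A kernel matrix built this way could in principle be passed to any SVM implementation that accepts precomputed kernels.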

Original language: English (US)
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Pages: 628-631
Number of pages: 4
Volume: 1
State: Published - 2003
Event: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing - Hong Kong, Hong Kong
Duration: Apr 6 2003 to Apr 10 2003

Other

Other: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing
Country: Hong Kong
City: Hong Kong
Period: 4/6/03 to 4/10/03

Fingerprint

Transcription
Speech recognition
Support vector machines
ranking
Classifiers
Experiments
rejection
education
output

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing
  • Acoustics and Ultrasonics

Cite this

Cortes, C., Haffner, P., & Mohri, M. (2003). Lattice kernels for spoken-dialog classification. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 1, pp. 628-631)

@inproceedings{c038f0c7bc3f4ef0aa49ade0eec603ba,
title = "Lattice kernels for spoken-dialog classification",
abstract = "Classification is a key task in spoken-dialog systems. The response of a spoken-dialog system is often guided by the category assigned to the speaker's utterance. Unfortunately, classifiers based on the one-best transcription of the speech utterances are not satisfactory because of the high word error rate of conversational speech recognition systems. Since the correct transcription may not be the highest ranking one, but often will be represented in the word lattices output by the recognizer, the classification accuracy can be much higher if the full lattice is exploited both during training and classification. In this paper we present the first principled approach for classification based on full lattices. For this purpose, we use the Support Vector Machine framework with kernels for lattices. The lattice kernels we define belong to the general class of rational kernels. We give efficient algorithms for computing kernels for arbitrary lattices and report experiments using the algorithm in a difficult call-classification task with 38 categories. Our experiments with a trigram lattice kernel show a 15{\%} reduction in error rate at a 30{\%} rejection level.",
author = "Corinna Cortes and Patrick Haffner and Mehryar Mohri",
year = "2003",
language = "English (US)",
volume = "1",
pages = "628--631",
booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

}

TY - GEN

T1 - Lattice kernels for spoken-dialog classification

AU - Cortes, Corinna

AU - Haffner, Patrick

AU - Mohri, Mehryar

PY - 2003

Y1 - 2003

AB - Classification is a key task in spoken-dialog systems. The response of a spoken-dialog system is often guided by the category assigned to the speaker's utterance. Unfortunately, classifiers based on the one-best transcription of the speech utterances are not satisfactory because of the high word error rate of conversational speech recognition systems. Since the correct transcription may not be the highest ranking one, but often will be represented in the word lattices output by the recognizer, the classification accuracy can be much higher if the full lattice is exploited both during training and classification. In this paper we present the first principled approach for classification based on full lattices. For this purpose, we use the Support Vector Machine framework with kernels for lattices. The lattice kernels we define belong to the general class of rational kernels. We give efficient algorithms for computing kernels for arbitrary lattices and report experiments using the algorithm in a difficult call-classification task with 38 categories. Our experiments with a trigram lattice kernel show a 15% reduction in error rate at a 30% rejection level.

UR - http://www.scopus.com/inward/record.url?scp=0141607893&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0141607893&partnerID=8YFLogxK

M3 - Conference contribution

VL - 1

SP - 628

EP - 631

BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

ER -