Distribution kernels based on moments of counts

Corinna Cortes, Mehryar Mohri

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

Many applications in text and speech processing require the analysis of distributions of variable-length sequences. We recently introduced a general kernel framework, rational kernels, to extend kernel methods to the analysis of such variable-length sequences or more generally weighted automata. These kernels are efficient to compute and have been successfully used in applications such as spoken-dialog classification using Support Vector Machines. However, the rational kernels previously introduced do not fully encompass distributions over alternate sequences. Prior similarity measures between two weighted automata are based only on the expected counts of co-occurring subsequences and ignore similarities (or dissimilarities) in higher-order moments of the distributions of these counts. In this paper, we introduce a new family of rational kernels, moment kernels, that precisely exploit this additional information. These kernels are distribution kernels based on moments of counts of strings. We describe efficient algorithms to compute moment kernels and apply them to several difficult spoken-dialog classification tasks. Our experiments show that using the second moment of the counts of n-gram sequences consistently improves the classification accuracy in these tasks.
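The difference between an expected-count kernel and a second-moment kernel can be illustrated with a small brute-force sketch. This is not the paper's algorithm, which operates on weighted automata. Here each distribution is assumed to be an explicit list of (token sequence, probability) pairs, and the moment kernel of order m is assumed to be the inner product of the vectors of m-th moments of n-gram counts. The names `ngram_counts`, `count_moments`, and `moment_kernel` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def ngram_counts(seq, n):
    """Count occurrences of each n-gram (as a tuple) in a token sequence."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def count_moments(dist, n, m):
    """Map each n-gram x to E[count(x)^m] under a finite distribution.

    `dist` is a list of (token_sequence, probability) pairs, an assumed
    stand-in for the weighted automata used in the paper.
    """
    moments = defaultdict(float)
    for seq, prob in dist:
        for gram, c in ngram_counts(seq, n).items():
            moments[gram] += prob * c ** m
    return moments


def moment_kernel(dist_a, dist_b, n, m):
    """Inner product of the m-th moment vectors of n-gram counts (assumed form)."""
    ma = count_moments(dist_a, n, m)
    mb = count_moments(dist_b, n, m)
    return sum(v * mb[g] for g, v in ma.items() if g in mb)


# Two distributions with the same expected count of the bigram ("a", "b")
# but different second moments of that count.
dist_a = [(["a", "b", "a", "b"], 0.5), (["c"], 0.5)]
dist_b = [(["a", "b"], 1.0)]
query = [(["a", "b"], 1.0)]

k1_a = moment_kernel(dist_a, query, n=2, m=1)  # 1.0
k1_b = moment_kernel(dist_b, query, n=2, m=1)  # 1.0
k2_a = moment_kernel(dist_a, query, n=2, m=2)  # 2.0
k2_b = moment_kernel(dist_b, query, n=2, m=2)  # 1.0
print(k1_a, k1_b, k2_a, k2_b)
```

In this toy case the first-moment kernel assigns `dist_a` and `dist_b` the same similarity to `query`, while the second-moment kernel separates them, which is the kind of additional information the abstract refers to. Enumerating sequences explicitly is only feasible for tiny distributions.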

Original language: English (US)
Title of host publication: Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004
Editors: R. Greiner, D. Schuurmans
Pages: 193-200
Number of pages: 8
State: Published - 2004
Event: Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004 - Banff, Alta, Canada
Duration: Jul 4 2004 to Jul 8 2004

Other

Other: Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004
Country: Canada
City: Banff, Alta
Period: 7/4/04 to 7/8/04

Fingerprint

Text processing
Speech processing
Support vector machines
Experiments

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Cortes, C., & Mohri, M. (2004). Distribution kernels based on moments of counts. In R. Greiner, & D. Schuurmans (Eds.), Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004 (pp. 193-200)

Distribution kernels based on moments of counts. / Cortes, Corinna; Mohri, Mehryar.

Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004. ed. / R. Greiner; D. Schuurmans. 2004. p. 193-200.

Cortes, C & Mohri, M 2004, Distribution kernels based on moments of counts. in R Greiner & D Schuurmans (eds), Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004. pp. 193-200, Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004, Banff, Alta, Canada, 7/4/04.
Cortes C, Mohri M. Distribution kernels based on moments of counts. In Greiner R, Schuurmans D, editors, Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004. 2004. p. 193-200
Cortes, Corinna ; Mohri, Mehryar. / Distribution kernels based on moments of counts. Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004. editor / R. Greiner ; D. Schuurmans. 2004. pp. 193-200
@inproceedings{ff4a709927cb4ef0bc22359808532d7d,
title = "Distribution kernels based on moments of counts",
abstract = "Many applications in text and speech processing require the analysis of distributions of variable-length sequences. We recently introduced a general kernel framework, rational kernels, to extend kernel methods to the analysis of such variable-length sequences or more generally weighted automata. These kernels are efficient to compute and have been successfully used in applications such as spoken-dialog classification using Support Vector Machines. However, the rational kernels previously introduced do not fully encompass distributions over alternate sequences. Prior similarity measures between two weighted automata are based only on the expected counts of cooccurring subsequences and ignore similarities (or dissimilarities) in higher order moments of the distributions of these counts. In this paper, we introduce a new family of rational kernels, moment kernels, that precisely exploit this additional information. These kernels are distribution kernels based on moments of counts of strings. We describe efficient algorithms to compute moment kernels and apply them to several difficult spoken-dialog classification tasks. Our experiments show that using the second moment of the counts of n-gram sequences consistently improves the classification accuracy in these tasks.",
author = "Corinna Cortes and Mehryar Mohri",
year = "2004",
language = "English (US)",
isbn = "1581138385",
pages = "193--200",
editor = "R. Greiner and D. Schuurmans",
booktitle = "Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004",

}

TY - GEN

T1 - Distribution kernels based on moments of counts

AU - Cortes, Corinna

AU - Mohri, Mehryar

PY - 2004

Y1 - 2004

N2 - Many applications in text and speech processing require the analysis of distributions of variable-length sequences. We recently introduced a general kernel framework, rational kernels, to extend kernel methods to the analysis of such variable-length sequences or more generally weighted automata. These kernels are efficient to compute and have been successfully used in applications such as spoken-dialog classification using Support Vector Machines. However, the rational kernels previously introduced do not fully encompass distributions over alternate sequences. Prior similarity measures between two weighted automata are based only on the expected counts of cooccurring subsequences and ignore similarities (or dissimilarities) in higher order moments of the distributions of these counts. In this paper, we introduce a new family of rational kernels, moment kernels, that precisely exploit this additional information. These kernels are distribution kernels based on moments of counts of strings. We describe efficient algorithms to compute moment kernels and apply them to several difficult spoken-dialog classification tasks. Our experiments show that using the second moment of the counts of n-gram sequences consistently improves the classification accuracy in these tasks.

UR - http://www.scopus.com/inward/record.url?scp=14344261324&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=14344261324&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:14344261324

SN - 1581138385

SN - 9781581138382

SP - 193

EP - 200

BT - Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004

A2 - Greiner, R.

A2 - Schuurmans, D.

ER -