Moment kernels for regular distributions

Corinna Cortes, Mehryar Mohri

Research output: Contribution to journal (Article)

Abstract

Many machine learning problems in natural language processing, transaction-log analysis, or computational biology require the analysis of variable-length sequences or, more generally, distributions of variable-length sequences. Kernel methods introduced for fixed-size vectors have proven very successful in a variety of machine learning tasks. We recently introduced a new and general kernel framework, rational kernels, to extend these methods to the analysis of variable-length sequences or, more generally, distributions given by weighted automata. These kernels are efficient to compute and have been successfully used in applications such as spoken-dialog classification with Support Vector Machines. However, the rational kernels previously introduced in these applications do not fully encompass distributions over alternate sequences. They are based only on the counts of co-occurring subsequences averaged over the alternate paths, without taking into account information about the higher-order moments of the distributions of these counts. In this paper, we introduce a new family of rational kernels, moment kernels, that exploit precisely this additional information. These kernels are distribution kernels based on moments of counts of strings. We describe efficient algorithms to compute moment kernels and apply them to several difficult spoken-dialog classification tasks. Our experiments show that using the second moment of the counts of n-gram sequences consistently improves the classification accuracy in these tasks.
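As an illustration only, the idea of a kernel built from moments of n-gram counts can be sketched on explicitly enumerated distributions. The sketch below is one plausible reading of the abstract, not the paper's algorithm: the paper computes these kernels on weighted automata, whereas here each distribution is assumed to be a short list of (sequence, probability) pairs. The function names `ngram_counts`, `count_moments`, and `moment_kernel` are hypothetical.

```python
from collections import Counter, defaultdict


def ngram_counts(seq, n):
    """Count occurrences of each n-gram (as a tuple) in one token sequence."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def count_moments(dist, n, m):
    """m-th moment of each n-gram count under a distribution over sequences.

    `dist` is a list of (sequence, probability) pairs. This explicit list is
    an assumption made for illustration; it stands in for a weighted automaton.
    """
    moments = defaultdict(float)
    for seq, prob in dist:
        for gram, count in ngram_counts(seq, n).items():
            moments[gram] += prob * count ** m
    return moments


def moment_kernel(dist_a, dist_b, n, m):
    """Sum, over n-grams seen in both distributions, of the product of moments."""
    moments_a = count_moments(dist_a, n, m)
    moments_b = count_moments(dist_b, n, m)
    return sum(v * moments_b[g] for g, v in moments_a.items() if g in moments_b)


# Two toy distributions over alternate word sequences.
A = [("a b a b".split(), 0.5), ("a b".split(), 0.5)]
B = [("a b".split(), 1.0)]

first = moment_kernel(A, B, n=2, m=1)   # uses expected bigram counts
second = moment_kernel(A, B, n=2, m=2)  # uses second moments of bigram counts
```

With m = 1 this reduces to a kernel on expected counts; with m = 2 the alternative "a b a b" contributes its squared count, so the spread of counts across alternatives affects the value. Enumerating sequences is only feasible for tiny examples.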

Original language: English (US)
Pages (from-to): 117-134
Number of pages: 18
Journal: Machine Learning
Volume: 60
Issue number: 1-3
DOIs: 10.1007/s10994-005-0919-8
State: Published - Sep 2005

Fingerprint

Learning systems
Support vector machines
Processing
Experiments

Keywords

  • Kernel methods
  • Rational kernels
  • Spoken-dialog classification
  • Statistical learning
  • String kernels
  • Weighted automata
  • Weighted finite-state transducers

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Artificial Intelligence

Cite this

Moment kernels for regular distributions. / Cortes, Corinna; Mohri, Mehryar.

In: Machine Learning, Vol. 60, No. 1-3, 09.2005, p. 117-134.

Cortes, Corinna ; Mohri, Mehryar. / Moment kernels for regular distributions. In: Machine Learning. 2005 ; Vol. 60, No. 1-3. pp. 117-134.
@article{f4654546c44742a1bd05e51dfce7df65,
title = "Moment kernels for regular distributions",
keywords = "Kernel methods, Rational kernels, Spoken-dialog classification, Statistical learning, String kernels, Weighted automata, Weighted finite-state transducers",
author = "Corinna Cortes and Mehryar Mohri",
year = "2005",
month = "9",
doi = "10.1007/s10994-005-0919-8",
language = "English (US)",
volume = "60",
pages = "117--134",
journal = "Machine Learning",
issn = "0885-6125",
publisher = "Springer Netherlands",
number = "1-3",

}

TY - JOUR

T1 - Moment kernels for regular distributions

AU - Cortes, Corinna

AU - Mohri, Mehryar

PY - 2005/9

Y1 - 2005/9

KW - Kernel methods

KW - Rational kernels

KW - Spoken-dialog classification

KW - Statistical learning

KW - String kernels

KW - Weighted automata

KW - Weighted finite-state transducers

UR - http://www.scopus.com/inward/record.url?scp=24044496973&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=24044496973&partnerID=8YFLogxK

U2 - 10.1007/s10994-005-0919-8

DO - 10.1007/s10994-005-0919-8

M3 - Article

VL - 60

SP - 117

EP - 134

JO - Machine Learning

JF - Machine Learning

SN - 0885-6125

IS - 1-3

ER -