Learning sequence kernels

Corinna Cortes, Mehryar Mohri, Afshin Rostamizadeh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Kernel methods are used to tackle a variety of learning tasks including classification, regression, ranking, clustering, and dimensionality reduction. The appropriate choice of a kernel is often left to the user, but a poor selection may lead to sub-optimal performance. Instead, sample points can be used to learn a kernel function appropriate for the task by selecting one out of a family of kernels determined by the user. This paper considers the problem of learning sequence kernel functions, an important problem for applications in computational biology, natural language processing, document classification, and other text processing areas. For most kernel-based learning techniques, the kernels selected must be positive definite symmetric; for sequence data, these are found among rational kernels. We give a general formulation of the problem of learning rational kernels and prove that a large family of rational kernels can be learned efficiently using a simple quadratic program, both in the context of support vector machines and kernel ridge regression. This improves upon previous work that generally results in a more costly semi-definite or quadratically constrained quadratic program. Furthermore, in the specific case of kernel ridge regression, we give an alternative solution based on a closed-form solution for the optimal kernel matrix. We also report results of experiments with our kernel learning techniques in classification and regression tasks.
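The setup the abstract describes, selecting one kernel out of a user-specified family using sample points, can be illustrated with a minimal sketch. The example below learns nonnegative simplex weights over a few base kernels for kernel ridge regression by grid search over the closed-form leave-one-out error. It is an illustration only, not the paper's quadratic program: Gaussian RBF kernels stand in for the rational kernels treated in the paper, and all names and parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: noisy samples of a smooth 1-D function.
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(60)

def rbf_kernel(A, B, gamma):
    """Gaussian RBF kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# A small user-chosen family of base kernels (stand-ins for rational kernels).
gammas = [0.1, 0.5, 2.0]
base = [rbf_kernel(X, X, g) for g in gammas]

def krr_loocv_error(K, lam):
    """Leave-one-out error of kernel ridge regression, in closed form."""
    n = len(K)
    H = K @ np.linalg.inv(K + lam * np.eye(n))   # hat matrix
    resid = (y - H @ y) / (1.0 - np.diag(H))     # standard LOO residual formula
    return float(np.mean(resid ** 2))

# Crude kernel learning: search nonnegative weights on the simplex by grid,
# keeping the combination with the lowest LOO error (not the paper's QP).
best = None
grid = np.linspace(0.0, 1.0, 11)
for w1 in grid:
    for w2 in grid:
        if w1 + w2 > 1.0:
            continue
        mu = np.array([w1, w2, 1.0 - w1 - w2])
        K = sum(m * Kk for m, Kk in zip(mu, base))
        err = krr_loocv_error(K, lam=0.1)
        if best is None or err < best[0]:
            best = (err, mu)

err, mu = best
```

Because the grid includes the corners of the simplex, the learned combination is never worse (in leave-one-out error) than the best single base kernel; the paper's contribution is to solve the analogous selection problem exactly and efficiently via a quadratic program, with a closed form in the kernel ridge regression case.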

Original language: English (US)
Title of host publication: Proceedings of the 2008 IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008
Pages: 2-8
Number of pages: 7
DOI: https://doi.org/10.1109/MLSP.2008.4685446
State: Published - 2008
Event: 2008 IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008 - Cancun, Mexico
Duration: Oct 16 2008 - Oct 19 2008

Other

Other: 2008 IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008
Country: Mexico
City: Cancun
Period: 10/16/08 - 10/19/08


ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Electrical and Electronic Engineering

Cite this

APA
Cortes, C., Mohri, M., & Rostamizadeh, A. (2008). Learning sequence kernels. In Proceedings of the 2008 IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008 (pp. 2-8). [4685446] https://doi.org/10.1109/MLSP.2008.4685446

Standard
Cortes, Corinna; Mohri, Mehryar; Rostamizadeh, Afshin. Learning sequence kernels. Proceedings of the 2008 IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008. 2008. p. 2-8. 4685446.

Harvard
Cortes, C, Mohri, M & Rostamizadeh, A 2008, Learning sequence kernels. in Proceedings of the 2008 IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008., 4685446, pp. 2-8, 2008 IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008, Cancun, Mexico, 10/16/08. https://doi.org/10.1109/MLSP.2008.4685446

Vancouver
Cortes C, Mohri M, Rostamizadeh A. Learning sequence kernels. In Proceedings of the 2008 IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008. 2008. p. 2-8. 4685446. https://doi.org/10.1109/MLSP.2008.4685446

Author
Cortes, Corinna; Mohri, Mehryar; Rostamizadeh, Afshin. / Learning sequence kernels. Proceedings of the 2008 IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008. 2008. pp. 2-8.
@inproceedings{238e10cbeed34e14a6a8a567ada66c1a,
title = "Learning sequence kernels",
abstract = "Kernel methods are used to tackle a variety of learning tasks including classification, regression, ranking, clustering, and dimensionality reduction. The appropriate choice of a kernel is often left to the user. But, poor selections may lead to a sub-optimal performance. Instead, sample points can be used to learn a kernel function appropriate for the task by selecting one out of a family of kernels determined by the user. This paper considers the problem of learning sequence kernel functions, an important problem for applications in computational biology, natural language processing, document classification and other text processing areas. For most kernel-based learning techniques, the kernels selected must be positive definite symmetric, which, for sequence data, are found to be rational kernels. We give a general formulation of the problem of learning rational kernels and prove that a large family of rational kernels can be learned efficiently using a simple quadratic program both in the context of support vector machines and kernel ridge regression. This improves upon previous work that generally results in a more costly semi-definite or quadratically constrained quadratic program. Furthermore, in the specific case of kernel ridge regression, we give an alternative solution based on a closed-form solution for the optimal kernel matrix. We also report results of experiments with our kernel learning techniques in classification and regression tasks.",
author = "Corinna Cortes and Mehryar Mohri and Afshin Rostamizadeh",
year = "2008",
doi = "10.1109/MLSP.2008.4685446",
language = "English (US)",
isbn = "9781424423767",
pages = "2--8",
booktitle = "Proceedings of the 2008 IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008",
}

TY - GEN

T1 - Learning sequence kernels

AU - Cortes, Corinna

AU - Mohri, Mehryar

AU - Rostamizadeh, Afshin

PY - 2008

Y1 - 2008

N2 - Kernel methods are used to tackle a variety of learning tasks including classification, regression, ranking, clustering, and dimensionality reduction. The appropriate choice of a kernel is often left to the user. But, poor selections may lead to a sub-optimal performance. Instead, sample points can be used to learn a kernel function appropriate for the task by selecting one out of a family of kernels determined by the user. This paper considers the problem of learning sequence kernel functions, an important problem for applications in computational biology, natural language processing, document classification and other text processing areas. For most kernel-based learning techniques, the kernels selected must be positive definite symmetric, which, for sequence data, are found to be rational kernels. We give a general formulation of the problem of learning rational kernels and prove that a large family of rational kernels can be learned efficiently using a simple quadratic program both in the context of support vector machines and kernel ridge regression. This improves upon previous work that generally results in a more costly semi-definite or quadratically constrained quadratic program. Furthermore, in the specific case of kernel ridge regression, we give an alternative solution based on a closed-form solution for the optimal kernel matrix. We also report results of experiments with our kernel learning techniques in classification and regression tasks.

AB - Kernel methods are used to tackle a variety of learning tasks including classification, regression, ranking, clustering, and dimensionality reduction. The appropriate choice of a kernel is often left to the user. But, poor selections may lead to a sub-optimal performance. Instead, sample points can be used to learn a kernel function appropriate for the task by selecting one out of a family of kernels determined by the user. This paper considers the problem of learning sequence kernel functions, an important problem for applications in computational biology, natural language processing, document classification and other text processing areas. For most kernel-based learning techniques, the kernels selected must be positive definite symmetric, which, for sequence data, are found to be rational kernels. We give a general formulation of the problem of learning rational kernels and prove that a large family of rational kernels can be learned efficiently using a simple quadratic program both in the context of support vector machines and kernel ridge regression. This improves upon previous work that generally results in a more costly semi-definite or quadratically constrained quadratic program. Furthermore, in the specific case of kernel ridge regression, we give an alternative solution based on a closed-form solution for the optimal kernel matrix. We also report results of experiments with our kernel learning techniques in classification and regression tasks.

UR - http://www.scopus.com/inward/record.url?scp=58049176470&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=58049176470&partnerID=8YFLogxK

U2 - 10.1109/MLSP.2008.4685446

DO - 10.1109/MLSP.2008.4685446

M3 - Conference contribution

SN - 9781424423767

SP - 2

EP - 8

BT - Proceedings of the 2008 IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008

ER -