Kernel methods for learning languages

Leonid (Aryeh) Kontorovich, Corinna Cortes, Mehryar Mohri

Research output: Contribution to journal › Article

Abstract

This paper studies a novel paradigm for learning formal languages from positive and negative examples: strings are mapped to an appropriate high-dimensional feature space, and a separating hyperplane is learned in that space. Such mappings can often be represented flexibly with string kernels, with the additional benefit of computational efficiency. The paradigm examined can thus be viewed as that of using kernel methods for learning languages. We initiate the study of the linear separability of automata and languages by examining the rich class of piecewise-testable languages. We introduce a subsequence feature mapping to a Hilbert space and prove that piecewise-testable languages are linearly separable in that space. The proof makes use of combinatorial results on words relating to subsequences. We also show that the positive definite symmetric kernel associated with this embedding is a rational kernel, and that it can be computed in quadratic time using general-purpose weighted-automata algorithms. Our examination of the linear separability of piecewise-testable languages leads us to study the general problem of separability with other finite regular covers. We show that all languages linearly separable under a regular finite cover embedding, a generalization of the subsequence embedding we use, are regular. We give a general analysis of the use of support vector machines in combination with kernels to determine a separating hyperplane for languages, and we study the corresponding learning guarantees. Our analysis includes several additional linear separability results in abstract settings and partial characterizations of the linear separability of the family of all regular languages.
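
To make the subsequence embedding concrete: each string x over an alphabet Sigma is mapped to the implicit binary feature vector phi(x) = (1[u is a subsequence of x]) indexed by all strings u in Sigma*, so the induced kernel K(x, y) = <phi(x), phi(y)> counts the distinct strings that are subsequences of both x and y. The sketch below is our own illustration under that reading, not the authors' implementation (the paper computes the kernel via general-purpose weighted-automata algorithms); the function name is hypothetical, and the dynamic program runs in O(|Sigma| * |x| * |y|) time, i.e., quadratic in the string lengths for a fixed alphabet.

    def common_subsequence_kernel(x: str, y: str) -> int:
        """Count the distinct strings (including the empty string) that are
        subsequences of both x and y."""
        n, m = len(x), len(y)
        # Only symbols occurring in both strings can end a common subsequence.
        alphabet = sorted(set(x) & set(y))

        # last[c][i] = last 1-based position of c in s[:i], or 0 if absent.
        def last_table(s, length):
            table = {c: [0] * (length + 1) for c in alphabet}
            for i in range(1, length + 1):
                for c in alphabet:
                    table[c][i] = i if s[i - 1] == c else table[c][i - 1]
            return table

        last_x, last_y = last_table(x, n), last_table(y, m)

        # dp[i][j] = number of distinct common subsequences of x[:i] and y[:j].
        # Each nonempty subsequence is counted exactly once by anchoring its
        # final symbol at that symbol's last occurrence in both prefixes.
        dp = [[1] * (m + 1) for _ in range(n + 1)]  # 1 for the empty string
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                total = 1
                for c in alphabet:
                    p, q = last_x[c][i], last_y[c][j]
                    if p and q:
                        total += dp[p - 1][q - 1]
                dp[i][j] = total
        return dp[n][m]

    # The subsequences shared by "aba" and "aa" are "", "a", and "aa".
    assert common_subsequence_kernel("aba", "aa") == 3

Kernel values computed this way can be assembled into a Gram matrix and passed to any off-the-shelf SVM solver that accepts precomputed kernels, yielding a separating hyperplane of the kind the abstract describes.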

Original language: English (US)
Pages (from-to): 223-236
Number of pages: 14
Journal: Theoretical Computer Science
Volume: 405
Issue number: 3
DOI: 10.1016/j.tcs.2008.06.037
State: Published - Oct 17 2008

Keywords

  • Finite automata
  • Kernels
  • Learning automata
  • Margin theory
  • Piecewise-testable languages
  • Support vector machines

ASJC Scopus subject areas

  • Computational Theory and Mathematics

Cite this

Kontorovich, Leonid (Aryeh); Cortes, Corinna; Mohri, Mehryar. Kernel methods for learning languages. In: Theoretical Computer Science, Vol. 405, No. 3, 17.10.2008, pp. 223-236.

@article{f745903874ab4517af2c07a38cff6153,
title = "Kernel methods for learning languages",
abstract = "This paper studies a novel paradigm for learning formal languages from positive and negative examples which consists of mapping strings to an appropriate high-dimensional feature space and learning a separating hyperplane in that space. Such mappings can often be represented flexibly with string kernels, with the additional benefit of computational efficiency. The paradigm inspected can thus be viewed as that of using kernel methods for learning languages. We initiate the study of the linear separability of automata and languages by examining the rich class of piecewise-testable languages. We introduce a subsequence feature mapping to a Hilbert space and prove that piecewise-testable languages are linearly separable in that space. The proof makes use of word combinatorial results relating to subsequences. We also show that the positive definite symmetric kernel associated to this embedding is a rational kernel and show that it can be computed in quadratic time using general-purpose weighted automata algorithms. Our examination of the linear separability of piecewise-testable languages leads us to study the general problem of separability with other finite regular covers. We show that all languages linearly separable under a regular finite cover embedding, a generalization of the subsequence embedding we use, are regular. We give a general analysis of the use of support vector machines in combination with kernels to determine a separating hyperplane for languages and study the corresponding learning guarantees. Our analysis includes several additional linear separability results in abstract settings and partial characterizations for the linear separability of the family of all regular languages.",
keywords = "Finite automata, Kernels, Learning automata, Margin theory, Piecewise-testable languages, Support vector machines",
author = "Kontorovich, {Leonid (Aryeh)} and Corinna Cortes and Mehryar Mohri",
year = "2008",
month = "10",
day = "17",
doi = "10.1016/j.tcs.2008.06.037",
language = "English (US)",
volume = "405",
pages = "223--236",
journal = "Theoretical Computer Science",
issn = "0304-3975",
publisher = "Elsevier",
number = "3",

}
