Learning linearly separable languages

Leonid Kontorovich, Corinna Cortes, Mehryar Mohri

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a novel paradigm for learning languages that consists of mapping strings to an appropriate high-dimensional feature space and learning a separating hyperplane in that space. It initiates the study of the linear separability of automata and languages by examining the rich class of piecewise-testable languages. It introduces a high-dimensional feature map and proves that piecewise-testable languages are linearly separable in that space; the proof draws on combinatorial results about words and their subsequences. It further shows that the positive definite kernel associated with this embedding can be computed in quadratic time, and it examines the use of support vector machines in combination with this kernel to determine a separating hyperplane, together with the corresponding learning guarantees. Finally, it proves that all languages linearly separable under a regular finite cover embedding, a generalization of the embedding used here, are regular.
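As an illustration of the approach described above, here is a minimal sketch, not taken from the paper. It assumes the subsequence embedding in which coordinate phi_u(x) equals 1 exactly when the string u is a (not necessarily contiguous) subsequence of x, so that the kernel value K(x, y) is the number of distinct strings that are subsequences of both x and y. The dynamic program below is one standard way to obtain that count in quadratic time; the function names and the brute-force cross-check are this sketch's own, not the paper's.

```python
from itertools import combinations


def common_subsequence_kernel(x: str, y: str) -> int:
    """Number of distinct common subsequences of x and y, counting the
    empty subsequence, in O(|x| * |y|) time and space.

    Under the assumed embedding phi_u(s) = 1 iff u is a subsequence
    of s, this is the inner product <phi(x), phi(y)>.
    """
    m, n = len(x), len(y)
    # C[i][j] = distinct common subsequences of x[:i] and y[:j];
    # the empty subsequence gives the base value 1 everywhere.
    C = [[1] * (n + 1) for _ in range(m + 1)]
    last_in_x = {}  # char -> last 1-based position seen in x[:i-1]
    for i in range(1, m + 1):
        last_in_y = {}  # char -> last 1-based position seen in y[:j-1]
        for j in range(1, n + 1):
            if x[i - 1] != y[j - 1]:
                # No common subsequence can use both x[i] and y[j],
                # so count by inclusion-exclusion.
                C[i][j] = C[i - 1][j] + C[i][j - 1] - C[i - 1][j - 1]
            else:
                # Every w counted in C[i-1][j-1] survives, and each can
                # be extended by the matched character c; extensions that
                # were already counted end at the previous occurrences
                # of c in x and in y.
                c = x[i - 1]
                px, py = last_in_x.get(c), last_in_y.get(c)
                C[i][j] = 2 * C[i - 1][j - 1]
                if px is not None and py is not None:
                    C[i][j] -= C[px - 1][py - 1]
            last_in_y[y[j - 1]] = j
        last_in_x[x[i - 1]] = i
    return C[m][n]


def _all_subsequences(s: str) -> set:
    """Exponential helper for cross-checking on short strings."""
    return {"".join(t) for r in range(len(s) + 1)
            for t in combinations(s, r)}


# Sanity checks against the naive set intersection.
assert common_subsequence_kernel("noon", "moon") == \
    len(_all_subsequences("noon") & _all_subsequences("moon"))
assert common_subsequence_kernel("abab", "baba") == \
    len(_all_subsequences("abab") & _all_subsequences("baba"))
```

A Gram matrix filled in with this kernel can then be handed to any kernelized SVM solver, for instance scikit-learn's `SVC(kernel="precomputed")`, to find the separating hyperplane the abstract refers to. Whether the empty subsequence is counted only shifts every kernel value by one (a constant feature), a detail that does not matter for this illustration.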

Original language: English (US)
Title of host publication: Algorithmic Learning Theory - 17th International Conference, ALT 2006, Proceedings
Pages: 288-303
Number of pages: 16
Volume: 4264 LNAI
State: Published - 2006
Event: 17th International Conference on Algorithmic Learning Theory, ALT 2006 - Barcelona, Spain
Duration: Oct 7 2006 - Oct 10 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4264 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 17th International Conference on Algorithmic Learning Theory, ALT 2006
Country: Spain
City: Barcelona
Period: 10/7/06 - 10/10/06

Fingerprint

Support vector machines
Language
Linear separability
Learning
Hyperplane
High-dimensional feature space
Positive definite kernels
Subsequence
Automata
Strings
Paradigm
Cover

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Kontorovich, L., Cortes, C., & Mohri, M. (2006). Learning linearly separable languages. In Algorithmic Learning Theory - 17th International Conference, ALT 2006, Proceedings (Vol. 4264 LNAI, pp. 288-303). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4264 LNAI).

@inproceedings{89d5ea01b2ba4f65acd36b6f06279dc4,
title = "Learning linearly separable languages",
abstract = "This paper presents a novel paradigm for learning languages that consists of mapping strings to an appropriate high-dimensional feature space and learning a separating hyperplane in that space. It initiates the study of the linear separability of automata and languages by examining the rich class of piecewise-testable languages. It introduces a high-dimensional feature map and proves piecewise-testable languages to be linearly separable in that space. The proof makes use of word combinatorial results relating to subsequences. It also shows that the positive definite kernel associated to this embedding can be computed in quadratic time. It examines the use of support vector machines in combination with this kernel to determine a separating hyperplane and the corresponding learning guarantees. It also proves that all languages linearly separable under a regular finite cover embedding, a generalization of the embedding we used, are regular.",
author = "Leonid Kontorovich and Corinna Cortes and Mehryar Mohri",
year = "2006",
language = "English (US)",
isbn = "3540466495",
volume = "4264 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "288--303",
booktitle = "Algorithmic Learning Theory - 17th International Conference, ALT 2006, Proceedings",

}
