Learning from uncertain data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The application of statistical methods to natural language processing has been remarkably successful over the past two decades. But, to deal with recent problems arising in this field, machine learning techniques must be generalized to deal with uncertain data, or datasets whose elements are distributions over sequences, such as weighted automata. This paper reviews a number of recent results related to this question. We discuss how to compute efficiently basic statistics from a weighted automaton such as the expected count of an arbitrary sequence and higher moments of that distribution, by using weighted transducers. Both the corresponding transducers and related algorithms are described. We show how general classification techniques such as Support Vector Machines can be extended to deal with distributions by using general kernels between weighted automata. We describe several examples of positive definite kernels between weighted automata such as kernels based on counts of common n-gram sequences, counts of common factors or suffixes, or other more complex kernels, and describe a general algorithm for computing them efficiently. We also demonstrate how machine learning techniques such as clustering based on the edit-distance can be extended to deal with unweighted and weighted automata representing distributions.
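To make the notions of "expected count" and "n-gram kernel" in the abstract concrete, here is a minimal sketch under a strong simplifying assumption: each distribution over sequences is given as a small, finite Python dictionary mapping strings to probabilities, and is enumerated explicitly. The paper itself works with weighted automata and weighted transducers, which this toy does not implement; the function names `expected_ngram_counts` and `ngram_kernel` are hypothetical and chosen only for illustration.

```python
def expected_ngram_counts(dist, n):
    """Expected number of occurrences of each n-gram under `dist`.

    `dist` is an assumed toy representation: a dict mapping each string
    to its probability. A weighted automaton would encode the same
    information compactly; here we simply enumerate the support.
    """
    counts = {}
    for seq, prob in dist.items():
        # Slide a window of length n over the sequence; each occurrence
        # contributes the probability of the sequence it appears in.
        for i in range(len(seq) - n + 1):
            gram = seq[i:i + n]
            counts[gram] = counts.get(gram, 0.0) + prob
    return counts


def ngram_kernel(dist_a, dist_b, n):
    """Inner product of the two expected n-gram count vectors.

    Being an inner product of explicit feature vectors, this function
    is symmetric and positive definite by construction.
    """
    counts_a = expected_ngram_counts(dist_a, n)
    counts_b = expected_ngram_counts(dist_b, n)
    return sum(w * counts_b.get(gram, 0.0) for gram, w in counts_a.items())


if __name__ == "__main__":
    # Two toy distributions over strings (illustrative data only).
    A = {"ab": 0.5, "abab": 0.5}
    B = {"ab": 1.0}
    print(expected_ngram_counts(A, 2))
    print(ngram_kernel(A, B, 2))
```

Explicit enumeration only scales to distributions with small finite support, which is precisely the limitation the automaton- and transducer-based algorithms described in the paper are meant to avoid.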

Original language: English (US)
Title of host publication: Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)
Editors: B. Scholkopf, M.K. Warmuth
Pages: 656-670
Number of pages: 15
Volume: 2777
State: Published - 2003
Event: 16th Annual Conference on Learning Theory and 7th Kernel Workshop, COLT/Kernel 2003 - Washington, DC, United States
Duration: Aug 24 2003 → Aug 27 2003

Other

Other: 16th Annual Conference on Learning Theory and 7th Kernel Workshop, COLT/Kernel 2003
Country: United States
City: Washington, DC
Period: 8/24/03 → 8/27/03

Fingerprint

Learning systems
Transducers
Support vector machines
Statistical methods
Statistics
Processing

ASJC Scopus subject areas

  • Hardware and Architecture

Cite this

Mohri, M. (2003). Learning from uncertain data. In B. Scholkopf, & M. K. Warmuth (Eds.), Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 2777, pp. 656-670)

Learning from uncertain data. / Mohri, Mehryar.

Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). ed. / B. Scholkopf; M.K. Warmuth. Vol. 2777 2003. p. 656-670.

Mohri, M 2003, Learning from uncertain data. in B Scholkopf & MK Warmuth (eds), Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). vol. 2777, pp. 656-670, 16th Annual Conference on Learning Theory and 7th Kernel Workshop, COLT/Kernel 2003, Washington, DC, United States, 8/24/03.
Mohri M. Learning from uncertain data. In Scholkopf B, Warmuth MK, editors, Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). Vol. 2777. 2003. p. 656-670
Mohri, Mehryar. / Learning from uncertain data. Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). editor / B. Scholkopf ; M.K. Warmuth. Vol. 2777 2003. pp. 656-670
@inproceedings{e5c88f2e23f243e393fcfec9f8bd5864,
title = "Learning from uncertain data",
abstract = "The application of statistical methods to natural language processing has been remarkably successful over the past two decades. But, to deal with recent problems arising in this field, machine learning techniques must be generalized to deal with uncertain data, or datasets whose elements are distributions over sequences, such as weighted automata. This paper reviews a number of recent results related to this question. We discuss how to compute efficiently basic statistics from a weighted automaton such as the expected count of an arbitrary sequence and higher moments of that distribution, by using weighted transducers. Both the corresponding transducers and related algorithms are described. We show how general classification techniques such as Support Vector Machines can be extended to deal with distributions by using general kernels between weighted automata. We describe several examples of positive definite kernels between weighted automata such as kernels based on counts of common n-gram sequences, counts of common factors or suffixes, or other more complex kernels, and describe a general algorithm for computing them efficiently. We also demonstrate how machine learning techniques such as clustering based on the edit-distance can be extended to deal with unweighted and weighted automata representing distributions.",
author = "Mehryar Mohri",
year = "2003",
language = "English (US)",
volume = "2777",
pages = "656--670",
editor = "B. Scholkopf and M.K. Warmuth",
booktitle = "Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)",

}

TY - GEN

T1 - Learning from uncertain data

AU - Mohri, Mehryar

PY - 2003

Y1 - 2003

N2 - The application of statistical methods to natural language processing has been remarkably successful over the past two decades. But, to deal with recent problems arising in this field, machine learning techniques must be generalized to deal with uncertain data, or datasets whose elements are distributions over sequences, such as weighted automata. This paper reviews a number of recent results related to this question. We discuss how to compute efficiently basic statistics from a weighted automaton such as the expected count of an arbitrary sequence and higher moments of that distribution, by using weighted transducers. Both the corresponding transducers and related algorithms are described. We show how general classification techniques such as Support Vector Machines can be extended to deal with distributions by using general kernels between weighted automata. We describe several examples of positive definite kernels between weighted automata such as kernels based on counts of common n-gram sequences, counts of common factors or suffixes, or other more complex kernels, and describe a general algorithm for computing them efficiently. We also demonstrate how machine learning techniques such as clustering based on the edit-distance can be extended to deal with unweighted and weighted automata representing distributions.

UR - http://www.scopus.com/inward/record.url?scp=9444293358&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=9444293358&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:9444293358

VL - 2777

SP - 656

EP - 670

BT - Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)

A2 - Scholkopf, B.

A2 - Warmuth, M.K.

ER -