On the computation of the relative entropy of probabilistic automata

Corinna Cortes, Mehryar Mohri, Ashish Rastogi, Michael Riley

Research output: Contribution to journal › Article

Abstract

We present an exhaustive analysis of the problem of computing the relative entropy of two probabilistic automata. We show that the problem of computing the relative entropy of unambiguous probabilistic automata can be formulated as a shortest-distance problem over an appropriate semiring, give efficient exact and approximate algorithms for its computation in that case, and report the results of experiments demonstrating the practicality of our algorithms for very large weighted automata. We also prove that the computation of the relative entropy of arbitrary probabilistic automata is PSPACE-complete. The relative entropy is used in a variety of machine learning algorithms and applications to measure the discrepancy of two distributions. We examine the use of the symmetrized relative entropy in machine learning algorithms and show that, contrary to what is suggested by a number of publications in that domain, the symmetrized relative entropy is neither positive definite symmetric nor negative definite symmetric, which limits its use and application in kernel methods. In particular, the convergence of training for learning algorithms is not guaranteed when the symmetrized relative entropy is used directly as a kernel, or as the operand of an exponential as in the case of Gaussian kernels. Finally, we show that our algorithm for the computation of the entropy of an unambiguous probabilistic automaton can be generalized to the computation of the norm of an unambiguous probabilistic automaton by using a monoid morphism. In particular, this yields efficient algorithms for the computation of the Lp-norm of a probabilistic automaton.
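The central quantity in the abstract is the relative entropy (Kullback-Leibler divergence) and its symmetrized variant, whose use as a kernel the paper cautions against. The sketch below computes both for finite discrete distributions, and illustrates the semiring flavor of the shortest-distance formulation with pairwise ⊕/⊗ operations in the style of the entropy semiring from the authors' related work. This is an illustration of the underlying quantities only, not a reproduction of the paper's automata algorithms; the function names `kl_divergence`, `symmetrized_kl`, `splus`, and `stimes` are our own.

```python
import math

def kl_divergence(p, q):
    # Relative entropy D(p || q) = sum_i p_i * log(p_i / q_i)
    # for finite discrete distributions; assumes q_i > 0 wherever p_i > 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetrized_kl(p, q):
    # D(p || q) + D(q || p): symmetric in p and q, but, as the paper
    # shows, neither positive definite symmetric nor negative definite
    # symmetric, so not a safe kernel.
    return kl_divergence(p, q) + kl_divergence(q, p)

# Semiring-style operations on pairs (w, x) in the style of the entropy
# semiring (an assumption based on the authors' related work):
# ⊕ adds componentwise; ⊗ multiplies weights and combines by a product rule.
def splus(a, b):
    return (a[0] + b[0], a[1] + b[1])

def stimes(a, b):
    return (a[0] * b[0], a[0] * b[1] + a[1] * b[0])
```

Note that `stimes` distributes over `splus`, the property a shortest-distance algorithm over the semiring relies on.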

Original language: English (US)
Pages (from-to): 219-242
Number of pages: 24
Journal: International Journal of Foundations of Computer Science
Volume: 19
Issue number: 1
DOIs: 10.1142/S0129054108005644
State: Published - Feb 2008

ASJC Scopus subject areas

  • Computer Science (miscellaneous)

Cite this

On the computation of the relative entropy of probabilistic automata. / Cortes, Corinna; Mohri, Mehryar; Rastogi, Ashish; Riley, Michael.

In: International Journal of Foundations of Computer Science, Vol. 19, No. 1, 02.2008, p. 219-242.

Cortes, Corinna ; Mohri, Mehryar ; Rastogi, Ashish ; Riley, Michael. / On the computation of the relative entropy of probabilistic automata. In: International Journal of Foundations of Computer Science. 2008 ; Vol. 19, No. 1. pp. 219-242.
@article{42edad52d3e94ec08ba6890c4db2491a,
title = "On the computation of the relative entropy of probabilistic automata",
abstract = "We present an exhaustive analysis of the problem of computing the relative entropy of two probabilistic automata. We show that the problem of computing the relative entropy of unambiguous probabilistic automata can be formulated as a shortest-distance problem over an appropriate semiring, give efficient exact and approximate algorithms for its computation in that case, and report the results of experiments demonstrating the practicality of our algorithms for very large weighted automata. We also prove that the computation of the relative entropy of arbitrary probabilistic automata is PSPACE-complete. The relative entropy is used in a variety of machine learning algorithms and applications to measure the discrepancy of two distributions. We examine the use of the symmetrized relative entropy in machine learning algorithms and show that, contrary to what is suggested by a number of publications in that domain, the symmetrized relative entropy is neither positive definite symmetric nor negative definite symmetric, which limits its use and application in kernel methods. In particular, the convergence of training for learning algorithms is not guaranteed when the symmetrized relative entropy is used directly as a kernel, or as the operand of an exponential as in the case of Gaussian kernels. Finally, we show that our algorithm for the computation of the entropy of an unambiguous probabilistic automaton can be generalized to the computation of the norm of an unambiguous probabilistic automaton by using a monoid morphism. In particular, this yields efficient algorithms for the computation of the Lp-norm of a probabilistic automaton.",
author = "Corinna Cortes and Mehryar Mohri and Ashish Rastogi and Michael Riley",
year = "2008",
month = feb,
doi = "10.1142/S0129054108005644",
language = "English (US)",
volume = "19",
pages = "219--242",
journal = "International Journal of Foundations of Computer Science",
issn = "0129-0541",
publisher = "World Scientific Publishing Co. Pte Ltd",
number = "1",

}

TY - JOUR

T1 - On the computation of the relative entropy of probabilistic automata

AU - Cortes, Corinna

AU - Mohri, Mehryar

AU - Rastogi, Ashish

AU - Riley, Michael

PY - 2008/2

Y1 - 2008/2

N2 - We present an exhaustive analysis of the problem of computing the relative entropy of two probabilistic automata. We show that the problem of computing the relative entropy of unambiguous probabilistic automata can be formulated as a shortest-distance problem over an appropriate semiring, give efficient exact and approximate algorithms for its computation in that case, and report the results of experiments demonstrating the practicality of our algorithms for very large weighted automata. We also prove that the computation of the relative entropy of arbitrary probabilistic automata is PSPACE-complete. The relative entropy is used in a variety of machine learning algorithms and applications to measure the discrepancy of two distributions. We examine the use of the symmetrized relative entropy in machine learning algorithms and show that, contrary to what is suggested by a number of publications in that domain, the symmetrized relative entropy is neither positive definite symmetric nor negative definite symmetric, which limits its use and application in kernel methods. In particular, the convergence of training for learning algorithms is not guaranteed when the symmetrized relative entropy is used directly as a kernel, or as the operand of an exponential as in the case of Gaussian kernels. Finally, we show that our algorithm for the computation of the entropy of an unambiguous probabilistic automaton can be generalized to the computation of the norm of an unambiguous probabilistic automaton by using a monoid morphism. In particular, this yields efficient algorithms for the computation of the Lp-norm of a probabilistic automaton.

AB - We present an exhaustive analysis of the problem of computing the relative entropy of two probabilistic automata. We show that the problem of computing the relative entropy of unambiguous probabilistic automata can be formulated as a shortest-distance problem over an appropriate semiring, give efficient exact and approximate algorithms for its computation in that case, and report the results of experiments demonstrating the practicality of our algorithms for very large weighted automata. We also prove that the computation of the relative entropy of arbitrary probabilistic automata is PSPACE-complete. The relative entropy is used in a variety of machine learning algorithms and applications to measure the discrepancy of two distributions. We examine the use of the symmetrized relative entropy in machine learning algorithms and show that, contrary to what is suggested by a number of publications in that domain, the symmetrized relative entropy is neither positive definite symmetric nor negative definite symmetric, which limits its use and application in kernel methods. In particular, the convergence of training for learning algorithms is not guaranteed when the symmetrized relative entropy is used directly as a kernel, or as the operand of an exponential as in the case of Gaussian kernels. Finally, we show that our algorithm for the computation of the entropy of an unambiguous probabilistic automaton can be generalized to the computation of the norm of an unambiguous probabilistic automaton by using a monoid morphism. In particular, this yields efficient algorithms for the computation of the Lp-norm of a probabilistic automaton.

UR - http://www.scopus.com/inward/record.url?scp=43949114194&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=43949114194&partnerID=8YFLogxK

U2 - 10.1142/S0129054108005644

DO - 10.1142/S0129054108005644

M3 - Article

VL - 19

SP - 219

EP - 242

JO - International Journal of Foundations of Computer Science

JF - International Journal of Foundations of Computer Science

SN - 0129-0541

IS - 1

ER -