A two-stage pretraining algorithm for deep Boltzmann machines

Kyunghyun Cho, Tapani Raiko, Alexander Ilin, Juha Karhunen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A deep Boltzmann machine (DBM) is a recently introduced Markov random field model that has multiple layers of hidden units. It has been shown empirically that a DBM, unlike its simpler special case, the restricted Boltzmann machine (RBM), is difficult to train with approximate maximum-likelihood learning using the stochastic gradient. In this paper, we propose a novel pretraining algorithm that consists of two stages: obtaining approximate posterior distributions over hidden units from a simpler model, and maximizing the variational lower bound given the fixed hidden posterior distributions. We show empirically that the proposed method overcomes the difficulty in training DBMs from randomly initialized parameters and results in a generative model that is better than, or comparable to, the one obtained with the conventional pretraining algorithm.
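
For orientation, the variational lower bound mentioned in the abstract can be written in its standard form for a Boltzmann machine. This is a generic sketch, not the paper's own derivation: the symbols (visible units v, hidden units h, parameters θ, energy E, partition function Z, approximate posterior Q) are assumed notation and do not appear in this record.

```latex
% Standard variational lower bound on the log-likelihood (assumed notation)
\log p(\mathbf{v} \mid \theta)
  \;\ge\;
  \mathbb{E}_{Q(\mathbf{h})}\!\left[ -E(\mathbf{v}, \mathbf{h} \mid \theta) \right]
  + \mathcal{H}(Q)
  - \log Z(\theta)

% With Q held fixed, the entropy term does not depend on \theta,
% so the gradient of the bound is a difference of two expectations:
\frac{\partial}{\partial \theta}
  \Big( \mathbb{E}_{Q(\mathbf{h})}\!\left[ -E(\mathbf{v}, \mathbf{h} \mid \theta) \right] - \log Z(\theta) \Big)
  =
  \mathbb{E}_{Q(\mathbf{h})}\!\left[ -\frac{\partial E(\mathbf{v}, \mathbf{h} \mid \theta)}{\partial \theta} \right]
  -
  \mathbb{E}_{p(\mathbf{v}', \mathbf{h} \mid \theta)}\!\left[ -\frac{\partial E(\mathbf{v}', \mathbf{h} \mid \theta)}{\partial \theta} \right]
```

Read this way, the first stage would supply Q from a simpler model, and the second stage would adjust θ to raise the bound while Q stays fixed; how the paper actually obtains Q and estimates the model expectation is not stated in this record.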

Original language: English (US)
Title of host publication: Artificial Neural Networks and Machine Learning, ICANN 2013 - 23rd International Conference on Artificial Neural Networks, Proceedings
Pages: 106-113
Number of pages: 8
Volume: 8131 LNCS
DOIs: https://doi.org/10.1007/978-3-642-40728-4_14
State: Published - 2013
Event: 23rd International Conference on Artificial Neural Networks, ICANN 2013 - Sofia, Bulgaria
Duration: Sep 10 2013 to Sep 13 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8131 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 23rd International Conference on Artificial Neural Networks, ICANN 2013
Country: Bulgaria
City: Sofia
Period: 9/10/13 to 9/13/13

Fingerprint

Boltzmann Machine
Posterior distribution
Stochastic Gradient
Unit
Generative Models
Random Field
Maximum likelihood
Lower bound
Model

Keywords

  • Deep Boltzmann Machine
  • Deep Learning
  • Pretraining

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Cho, K., Raiko, T., Ilin, A., & Karhunen, J. (2013). A two-stage pretraining algorithm for deep Boltzmann machines. In Artificial Neural Networks and Machine Learning, ICANN 2013 - 23rd International Conference on Artificial Neural Networks, Proceedings (pp. 106-113). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8131 LNCS). https://doi.org/10.1007/978-3-642-40728-4_14

@inproceedings{daa2df45862c4724871c38935b402ef2,
title = "A two-stage pretraining algorithm for deep {B}oltzmann machines",
abstract = "A deep Boltzmann machine (DBM) is a recently introduced Markov random field model that has multiple layers of hidden units. It has been shown empirically that a DBM, unlike its simpler special case, the restricted Boltzmann machine (RBM), is difficult to train with approximate maximum-likelihood learning using the stochastic gradient. In this paper, we propose a novel pretraining algorithm that consists of two stages: obtaining approximate posterior distributions over hidden units from a simpler model, and maximizing the variational lower bound given the fixed hidden posterior distributions. We show empirically that the proposed method overcomes the difficulty in training DBMs from randomly initialized parameters and results in a generative model that is better than, or comparable to, the one obtained with the conventional pretraining algorithm.",
keywords = "Deep Boltzmann Machine, Deep Learning, Pretraining",
author = "Kyunghyun Cho and Tapani Raiko and Alexander Ilin and Juha Karhunen",
year = "2013",
doi = "10.1007/978-3-642-40728-4_14",
language = "English (US)",
isbn = "9783642407277",
volume = "8131 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "106--113",
booktitle = "Artificial Neural Networks and Machine Learning, ICANN 2013 - 23rd International Conference on Artificial Neural Networks, Proceedings",

}

TY - GEN

T1 - A two-stage pretraining algorithm for deep Boltzmann machines

AU - Cho, Kyunghyun

AU - Raiko, Tapani

AU - Ilin, Alexander

AU - Karhunen, Juha

PY - 2013

Y1 - 2013

N2 - A deep Boltzmann machine (DBM) is a recently introduced Markov random field model that has multiple layers of hidden units. It has been shown empirically that a DBM, unlike its simpler special case, the restricted Boltzmann machine (RBM), is difficult to train with approximate maximum-likelihood learning using the stochastic gradient. In this paper, we propose a novel pretraining algorithm that consists of two stages: obtaining approximate posterior distributions over hidden units from a simpler model, and maximizing the variational lower bound given the fixed hidden posterior distributions. We show empirically that the proposed method overcomes the difficulty in training DBMs from randomly initialized parameters and results in a generative model that is better than, or comparable to, the one obtained with the conventional pretraining algorithm.

KW - Deep Boltzmann Machine

KW - Deep Learning

KW - Pretraining

UR - http://www.scopus.com/inward/record.url?scp=84884941662&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84884941662&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-40728-4_14

DO - 10.1007/978-3-642-40728-4_14

M3 - Conference contribution

SN - 9783642407277

VL - 8131 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 106

EP - 113

BT - Artificial Neural Networks and Machine Learning, ICANN 2013 - 23rd International Conference on Artificial Neural Networks, Proceedings

ER -