DialogWAE: Multimodal response generation with conditional Wasserstein auto-encoder

Xiaodong Gu, Kyunghyun Cho, Jung Woo Ha, Sunghun Kim

Research output: Contribution to conference › Paper

Abstract

Variational autoencoders (VAEs) have shown promise in data-driven conversation modeling. However, most VAE conversation models match the approximate posterior distribution over the latent variables to a simple prior such as the standard normal distribution, thereby restricting the generated responses to a relatively simple (e.g., unimodal) scope. In this paper, we propose DialogWAE, a conditional Wasserstein autoencoder (WAE) specially designed for dialogue modeling. Unlike VAEs that impose a simple distribution over the latent variables, DialogWAE models the distribution of data by training a GAN within the latent variable space. Specifically, our model samples from the prior and posterior distributions over the latent variables by transforming context-dependent random noise using neural networks, and minimizes the Wasserstein distance between the two distributions. We further develop a Gaussian mixture prior network to enrich the latent space. Experiments on two popular datasets show that DialogWAE outperforms state-of-the-art approaches in generating more coherent, informative, and diverse responses.
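As a concrete illustration of the mechanism the abstract describes, below is a minimal PyTorch-style sketch (an assumption-laden reading of the abstract, not the authors' released implementation): a Gaussian-mixture prior network and a posterior (recognition) network each transform context-dependent Gaussian noise into a latent sample, and a WGAN-style critic estimates the Wasserstein distance between the two latent distributions. All names (PriorNet, PosteriorNet, critic), layer choices, and dimensions are illustrative assumptions.

# Minimal PyTorch-style sketch of DialogWAE's latent-space GAN (illustrative,
# not the authors' code). A Gaussian-mixture prior network and a posterior
# (recognition) network each transform context-dependent Gaussian noise into
# a latent sample; a critic scores the two samples so that training it (under
# a Lipschitz constraint, omitted here) estimates the Wasserstein distance
# between the prior and posterior latent distributions.
import torch
import torch.nn as nn

CTX, RESP, NOISE, Z, K = 128, 128, 64, 64, 3   # assumed sizes; K = mixture components

class PriorNet(nn.Module):
    """Context -> mixture component -> reparameterized noise -> latent z."""
    def __init__(self):
        super().__init__()
        self.pi = nn.Linear(CTX, K)                  # mixture logits
        self.mu = nn.Linear(CTX, K * NOISE)
        self.logvar = nn.Linear(CTX, K * NOISE)
        self.g = nn.Sequential(nn.Linear(NOISE + CTX, Z), nn.Tanh())

    def forward(self, c):
        b = c.size(0)
        # Hard component sampling for brevity; the paper keeps the selection
        # differentiable with a Gumbel-Softmax re-parametrization.
        k = torch.distributions.Categorical(logits=self.pi(c)).sample()
        mu = self.mu(c).view(b, K, NOISE)[torch.arange(b), k]
        logvar = self.logvar(c).view(b, K, NOISE)[torch.arange(b), k]
        eps = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # context-dependent noise
        return self.g(torch.cat([eps, c], dim=-1))               # transform noise into z

class PosteriorNet(nn.Module):
    """(Context, response) -> reparameterized noise -> latent z."""
    def __init__(self):
        super().__init__()
        self.mu = nn.Linear(CTX + RESP, NOISE)
        self.logvar = nn.Linear(CTX + RESP, NOISE)
        self.g = nn.Sequential(nn.Linear(NOISE + CTX, Z), nn.Tanh())

    def forward(self, c, x):
        h = torch.cat([c, x], dim=-1)
        mu, logvar = self.mu(h), self.logvar(h)
        eps = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.g(torch.cat([eps, c], dim=-1))

critic = nn.Sequential(nn.Linear(Z + CTX, 64), nn.ReLU(), nn.Linear(64, 1))

c = torch.randn(8, CTX)    # stand-in for an utterance-context encoding
x = torch.randn(8, RESP)   # stand-in for a response encoding
z_prior, z_post = PriorNet()(c), PosteriorNet()(c, x)
# WGAN-style critic objective: separating the two samples approximates the
# Wasserstein-1 distance, which the generator side then minimizes. The RNN
# decoder that turns (z, c) into a response is omitted from this sketch.
w_est = critic(torch.cat([z_post, c], -1)).mean() - critic(torch.cat([z_prior, c], -1)).mean()

In a full training loop the critic would be kept approximately 1-Lipschitz (e.g. via a gradient penalty), while a decoder reconstructing the response from (z, c) supplies the autoencoding loss; conditioning both latent networks on the context c is what makes this a conditional WAE.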

Original language: English (US)
State: Published - Jan 1, 2019
Event: 7th International Conference on Learning Representations, ICLR 2019 - New Orleans, United States
Duration: May 6, 2019 - May 9, 2019

Conference

Conference: 7th International Conference on Learning Representations, ICLR 2019
Country: United States
City: New Orleans
Period: 5/6/19 - 5/9/19

Fingerprint

Normal distribution
Conversation
Dialogue
Neural networks
Experiments
Modeling
Data-driven

ASJC Scopus subject areas

  • Education
  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics

Cite this

Gu, X., Cho, K., Ha, J. W., & Kim, S. (2019). DialogWAE: Multimodal response generation with conditional Wasserstein auto-encoder. Paper presented at 7th International Conference on Learning Representations, ICLR 2019, New Orleans, United States.

DialogWAE: Multimodal response generation with conditional Wasserstein auto-encoder. / Gu, Xiaodong; Cho, Kyunghyun; Ha, Jung Woo; Kim, Sunghun.

2019. Paper presented at 7th International Conference on Learning Representations, ICLR 2019, New Orleans, United States.

Research output: Contribution to conference › Paper

Gu, X, Cho, K, Ha, JW & Kim, S 2019, 'DialogWAE: Multimodal response generation with conditional Wasserstein auto-encoder', Paper presented at 7th International Conference on Learning Representations, ICLR 2019, New Orleans, United States, 5/6/19 - 5/9/19.
Gu X, Cho K, Ha JW, Kim S. DialogWAE: Multimodal response generation with conditional Wasserstein auto-encoder. 2019. Paper presented at 7th International Conference on Learning Representations, ICLR 2019, New Orleans, United States.
Gu, Xiaodong; Cho, Kyunghyun; Ha, Jung Woo; Kim, Sunghun. / DialogWAE: Multimodal response generation with conditional Wasserstein auto-encoder. Paper presented at 7th International Conference on Learning Representations, ICLR 2019, New Orleans, United States.
@conference{a393c6b1785f401b934b8454c370a7dd,
title = "{DialogWAE}: Multimodal response generation with conditional Wasserstein auto-encoder",
abstract = "Variational autoencoders (VAEs) have shown promise in data-driven conversation modeling. However, most VAE conversation models match the approximate posterior distribution over the latent variables to a simple prior such as the standard normal distribution, thereby restricting the generated responses to a relatively simple (e.g., unimodal) scope. In this paper, we propose DialogWAE, a conditional Wasserstein autoencoder (WAE) specially designed for dialogue modeling. Unlike VAEs that impose a simple distribution over the latent variables, DialogWAE models the distribution of data by training a GAN within the latent variable space. Specifically, our model samples from the prior and posterior distributions over the latent variables by transforming context-dependent random noise using neural networks, and minimizes the Wasserstein distance between the two distributions. We further develop a Gaussian mixture prior network to enrich the latent space. Experiments on two popular datasets show that DialogWAE outperforms state-of-the-art approaches in generating more coherent, informative, and diverse responses.",
author = "Gu, Xiaodong and Cho, Kyunghyun and Ha, {Jung Woo} and Kim, Sunghun",
year = "2019",
month = "1",
day = "1",
language = "English (US)",
note = "7th International Conference on Learning Representations, ICLR 2019 ; Conference date: 06-05-2019 Through 09-05-2019",

}

TY - CONF

T1 - DialogWAE

T2 - Multimodal response generation with conditional Wasserstein auto-encoder

AU - Gu, Xiaodong

AU - Cho, Kyunghyun

AU - Ha, Jung Woo

AU - Kim, Sunghun

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Variational autoencoders (VAEs) have shown promise in data-driven conversation modeling. However, most VAE conversation models match the approximate posterior distribution over the latent variables to a simple prior such as the standard normal distribution, thereby restricting the generated responses to a relatively simple (e.g., unimodal) scope. In this paper, we propose DialogWAE, a conditional Wasserstein autoencoder (WAE) specially designed for dialogue modeling. Unlike VAEs that impose a simple distribution over the latent variables, DialogWAE models the distribution of data by training a GAN within the latent variable space. Specifically, our model samples from the prior and posterior distributions over the latent variables by transforming context-dependent random noise using neural networks, and minimizes the Wasserstein distance between the two distributions. We further develop a Gaussian mixture prior network to enrich the latent space. Experiments on two popular datasets show that DialogWAE outperforms state-of-the-art approaches in generating more coherent, informative, and diverse responses.

AB - Variational autoencoders (VAEs) have shown promise in data-driven conversation modeling. However, most VAE conversation models match the approximate posterior distribution over the latent variables to a simple prior such as the standard normal distribution, thereby restricting the generated responses to a relatively simple (e.g., unimodal) scope. In this paper, we propose DialogWAE, a conditional Wasserstein autoencoder (WAE) specially designed for dialogue modeling. Unlike VAEs that impose a simple distribution over the latent variables, DialogWAE models the distribution of data by training a GAN within the latent variable space. Specifically, our model samples from the prior and posterior distributions over the latent variables by transforming context-dependent random noise using neural networks, and minimizes the Wasserstein distance between the two distributions. We further develop a Gaussian mixture prior network to enrich the latent space. Experiments on two popular datasets show that DialogWAE outperforms state-of-the-art approaches in generating more coherent, informative, and diverse responses.

UR - http://www.scopus.com/inward/record.url?scp=85071157689&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85071157689&partnerID=8YFLogxK

M3 - Paper

ER -