Stable and effective trainable greedy decoding for sequence to sequence learning

Yun Chen, Kyunghyun Cho, Samuel Bowman, Victor O.K. Li

Research output: Contribution to conference › Paper

Abstract

We introduce a fast, general method to manipulate the behavior of the decoder in a sequence to sequence neural network model. We propose a small neural network actor that observes and manipulates the hidden state of a previously-trained decoder. We evaluate our model on the task of neural machine translation. In this task, we use beam search to decode sentences from the plain decoder for each training set input, rank them by BLEU score, and train the actor to encourage the decoder to generate the highest-BLEU output in a single greedy decoding operation without beam search. Experiments on several datasets and models show that our method yields substantial improvements in both translation quality and translation speed over its base system, with no additional data.
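The abstract's decoding procedure can be illustrated with a toy, stdlib-only sketch (all names, sizes, and weights here are hypothetical stand-ins, not the authors' implementation): a frozen decoder runs a greedy loop, and a small trainable "actor" observes the hidden state at each step and adds a correction before the next-token distribution is computed, so that a single greedy pass can be trained to imitate the best beam-search candidate.

```python
# Toy sketch of actor-adjusted greedy decoding (hypothetical names/sizes;
# random weights stand in for a previously-trained decoder).
import math
import random

random.seed(0)

HID, VOCAB = 4, 6  # toy dimensions

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

# Frozen, previously-trained decoder parameters (kept fixed).
W_h = rand_matrix(HID, HID)    # hidden -> hidden recurrence
W_e = rand_matrix(HID, VOCAB)  # token embeddings (column lookup)
W_o = rand_matrix(VOCAB, HID)  # hidden -> vocabulary logits

# Trainable actor: the only parameters that would be updated, trained to
# push greedy decoding toward the highest-BLEU beam-search candidate.
A = rand_matrix(HID, HID)

def decoder_step(h, token):
    """One recurrent step of the frozen decoder."""
    emb = [W_e[i][token] for i in range(HID)]
    pre = [a + b for a, b in zip(matvec(W_h, h), emb)]
    return [math.tanh(v) for v in pre]

def greedy_decode(h0, steps, use_actor=True):
    """Single greedy pass; the actor nudges the hidden state each step."""
    h, out, tok = h0, [], 0  # 0 = assumed start-of-sequence id
    for _ in range(steps):
        h = decoder_step(h, tok)
        if use_actor:
            # Actor observes h and adds a small correction to it.
            h = [a + b for a, b in zip(h, matvec(A, h))]
        logits = matvec(W_o, h)
        tok = max(range(VOCAB), key=lambda i: logits[i])
        out.append(tok)
    return out

print(greedy_decode([0.0] * HID, 5))
```

In the paper's setup the actor's training signal comes from offline beam search: candidates are decoded from the plain decoder, ranked by BLEU, and the actor is fit so that this greedy loop reproduces the top-ranked output.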

Original language: English (US)
State: Published - Jan 1 2018
Event: 6th International Conference on Learning Representations, ICLR 2018 - Vancouver, Canada
Duration: Apr 30 2018 - May 3 2018

Conference

Conference: 6th International Conference on Learning Representations, ICLR 2018
Country: Canada
City: Vancouver
Period: 4/30/18 - 5/3/18

ASJC Scopus subject areas

  • Education
  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics

Cite this

Chen, Y., Cho, K., Bowman, S., & Li, V. O. K. (2018). Stable and effective trainable greedy decoding for sequence to sequence learning. Paper presented at 6th International Conference on Learning Representations, ICLR 2018, Vancouver, Canada.

Stable and effective trainable greedy decoding for sequence to sequence learning. / Chen, Yun; Cho, Kyunghyun; Bowman, Samuel; Li, Victor O.K.

2018. Paper presented at 6th International Conference on Learning Representations, ICLR 2018, Vancouver, Canada.

Chen, Y, Cho, K, Bowman, S & Li, VOK 2018, 'Stable and effective trainable greedy decoding for sequence to sequence learning', Paper presented at 6th International Conference on Learning Representations, ICLR 2018, Vancouver, Canada, 4/30/18 - 5/3/18.
Chen Y, Cho K, Bowman S, Li VOK. Stable and effective trainable greedy decoding for sequence to sequence learning. 2018. Paper presented at 6th International Conference on Learning Representations, ICLR 2018, Vancouver, Canada.
Chen, Yun ; Cho, Kyunghyun ; Bowman, Samuel ; Li, Victor O.K. / Stable and effective trainable greedy decoding for sequence to sequence learning. Paper presented at 6th International Conference on Learning Representations, ICLR 2018, Vancouver, Canada.
@conference{9af3d379733d436eb6da708c1c8e9e4d,
title = "Stable and effective trainable greedy decoding for sequence to sequence learning",
abstract = "We introduce a fast, general method to manipulate the behavior of the decoder in a sequence to sequence neural network model. We propose a small neural network actor that observes and manipulates the hidden state of a previously-trained decoder. We evaluate our model on the task of neural machine translation. In this task, we use beam search to decode sentences from the plain decoder for each training set input, rank them by BLEU score, and train the actor to encourage the decoder to generate the highest-BLEU output in a single greedy decoding operation without beam search. Experiments on several datasets and models show that our method yields substantial improvements in both translation quality and translation speed over its base system, with no additional data.",
author = "Chen, Yun and Cho, Kyunghyun and Bowman, Samuel and Li, {Victor O.K.}",
year = "2018",
month = "1",
day = "1",
language = "English (US)",
note = "6th International Conference on Learning Representations, ICLR 2018 ; Conference date: 30-04-2018 Through 03-05-2018",

}

TY - CONF

T1 - Stable and effective trainable greedy decoding for sequence to sequence learning

AU - Chen, Yun

AU - Cho, Kyunghyun

AU - Bowman, Samuel

AU - Li, Victor O.K.

PY - 2018/1/1

Y1 - 2018/1/1

N2 - We introduce a fast, general method to manipulate the behavior of the decoder in a sequence to sequence neural network model. We propose a small neural network actor that observes and manipulates the hidden state of a previously-trained decoder. We evaluate our model on the task of neural machine translation. In this task, we use beam search to decode sentences from the plain decoder for each training set input, rank them by BLEU score, and train the actor to encourage the decoder to generate the highest-BLEU output in a single greedy decoding operation without beam search. Experiments on several datasets and models show that our method yields substantial improvements in both translation quality and translation speed over its base system, with no additional data.

AB - We introduce a fast, general method to manipulate the behavior of the decoder in a sequence to sequence neural network model. We propose a small neural network actor that observes and manipulates the hidden state of a previously-trained decoder. We evaluate our model on the task of neural machine translation. In this task, we use beam search to decode sentences from the plain decoder for each training set input, rank them by BLEU score, and train the actor to encourage the decoder to generate the highest-BLEU output in a single greedy decoding operation without beam search. Experiments on several datasets and models show that our method yields substantial improvements in both translation quality and translation speed over its base system, with no additional data.

UR - http://www.scopus.com/inward/record.url?scp=85070943062&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85070943062&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:85070943062

ER -