A correlational encoder decoder architecture for pivot based sequence generation

Amrita Saha, Mitesh M. Khapra, Sarath Chandar, Janarthanan Rajendran, Kyunghyun Cho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Interlingua-based Machine Translation (MT) aims to encode multiple languages into a common linguistic representation and then decode sentences in multiple target languages from this representation. In this work we explore this idea in the context of neural encoder-decoder architectures, albeit on a smaller scale and without MT as the end goal. Specifically, we consider the case of three languages or modalities X, Z and Y in which we are interested in generating sequences in Y starting from information available in X. However, no parallel training data is available between X and Y; training data is available only between X & Z and between Z & Y (as is often the case in many real-world applications). Z thus acts as a pivot/bridge. An obvious solution, which is perhaps less elegant but works very well in practice, is to train a two-stage model which first converts from X to Z and then from Z to Y. Instead, we explore an interlingua-inspired solution which jointly learns to (i) encode X and Z to a common representation and (ii) decode Y from this common representation. We evaluate our model on two tasks: (i) bridge transliteration and (ii) bridge captioning. We report promising results in both applications and believe that this is a step in the right direction towards truly interlingua-inspired encoder-decoder architectures.
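To make the joint objective concrete: two recurrent encoders map X and Z into a shared space, a correlation penalty ties the two encodings together, and a decoder learns to produce Y from the Z encoding, so that at test time Y can be decoded directly from the X encoding even though no X-Y pairs were ever seen. The PyTorch sketch below is a minimal illustration under assumptions of mine (GRU encoders, a CorrNet-style per-dimension correlation loss, teacher forcing); names such as CorrSeq2Seq, correlation_loss and corr_weight are hypothetical and do not come from the authors' implementation.

    import torch
    import torch.nn as nn

    def correlation_loss(hx, hz):
        """Negative per-dimension Pearson correlation between two batches of
        encodings; minimizing it pulls the X and Z representations together."""
        hx = hx - hx.mean(dim=0, keepdim=True)
        hz = hz - hz.mean(dim=0, keepdim=True)
        cov = (hx * hz).sum(dim=0)
        denom = hx.norm(dim=0) * hz.norm(dim=0) + 1e-8
        return -(cov / denom).mean()

    class CorrSeq2Seq(nn.Module):  # hypothetical name, not the authors' code
        def __init__(self, vx, vz, vy, emb=64, hid=128):
            super().__init__()
            self.emb_x = nn.Embedding(vx, emb)
            self.emb_z = nn.Embedding(vz, emb)
            self.emb_y = nn.Embedding(vy, emb)
            self.enc_x = nn.GRU(emb, hid, batch_first=True)  # encoder for X
            self.enc_z = nn.GRU(emb, hid, batch_first=True)  # encoder for Z
            self.dec_y = nn.GRU(emb, hid, batch_first=True)  # decoder for Y
            self.out = nn.Linear(hid, vy)

        def encode(self, rnn, emb, seq):
            _, h = rnn(emb(seq))  # h: (1, batch, hid), the sequence summary
            return h

        def forward(self, x, z, y_in):
            hx = self.encode(self.enc_x, self.emb_x, x)
            hz = self.encode(self.enc_z, self.emb_z, z)
            # Train the decoder from the Z encoding; the correlation term below
            # makes hx and hz interchangeable, so X-to-Y decoding works at test time.
            dec_out, _ = self.dec_y(self.emb_y(y_in), hz)
            return self.out(dec_out), hx.squeeze(0), hz.squeeze(0)

    # Toy batch. In practice the X-Z pairs and the Z-Y pairs come from two
    # separate parallel corpora; this sketch reuses one batch for brevity.
    model = CorrSeq2Seq(vx=50, vz=50, vy=50)
    x = torch.randint(50, (8, 12))
    z = torch.randint(50, (8, 12))
    y_in = torch.randint(50, (8, 10))   # decoder input (teacher forcing)
    y_out = torch.randint(50, (8, 10))  # decoder target, shifted by one

    logits, hx, hz = model(x, z, y_in)
    corr_weight = 1.0  # assumed trade-off hyperparameter
    loss = nn.functional.cross_entropy(logits.reshape(-1, 50), y_out.reshape(-1))
    loss = loss + corr_weight * correlation_loss(hx, hz)
    loss.backward()

By contrast, the two-stage baseline mentioned in the abstract would train two independent encoder-decoder models (X to Z, then Z to Y) and chain them at test time; the joint model avoids the intermediate decoding step entirely.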

Original language: English (US)
Title of host publication: COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016
Subtitle of host publication: Technical Papers
Publisher: Association for Computational Linguistics, ACL Anthology
Pages: 109-118
Number of pages: 10
ISBN (Print): 9784879747020
State: Published - Jan 1 2016
Event: 26th International Conference on Computational Linguistics, COLING 2016 - Osaka, Japan
Duration: Dec 11 2016 - Dec 16 2016

Other

Other: 26th International Conference on Computational Linguistics, COLING 2016
Country: Japan
City: Osaka
Period: 12/11/16 - 12/16/16

Fingerprint

Language
Linguistics
Available Information
Interlingua
Machine Translation
Modality
Real World
Linguistic Representation
Convert
Transliteration
Train

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Language and Linguistics
  • Linguistics and Language

Cite this

Saha, A., Khapra, M. M., Chandar, S., Rajendran, J., & Cho, K. (2016). A correlational encoder decoder architecture for pivot based sequence generation. In COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers (pp. 109-118). Association for Computational Linguistics, ACL Anthology.

@inproceedings{f25303148346485da80a89b8b59931b7,
title = "A correlational encoder decoder architecture for pivot based sequence generation",
author = "Amrita Saha and Khapra, {Mitesh M.} and Sarath Chandar and Janarthanan Rajendran and Kyunghyun Cho",
year = "2016",
month = "1",
day = "1",
language = "English (US)",
isbn = "9784879747020",
pages = "109--118",
booktitle = "COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016",
publisher = "Association for Computational Linguistics, ACL Anthology",

}

TY  - GEN
T1  - A correlational encoder decoder architecture for pivot based sequence generation
AU  - Saha, Amrita
AU  - Khapra, Mitesh M.
AU  - Chandar, Sarath
AU  - Rajendran, Janarthanan
AU  - Cho, Kyunghyun
PY  - 2016/1/1
Y1  - 2016/1/1
UR  - http://www.scopus.com/inward/record.url?scp=85024094852&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85024094852&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:85024094852
SN  - 9784879747020
SP  - 109
EP  - 118
BT  - COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016
PB  - Association for Computational Linguistics, ACL Anthology
ER  -