Language independent connectivity strength features for phrase pivot statistical machine translation

Ahmed El Kholy, Nizar Habash, Gregor Leusch, Evgeny Matusov, Hassan Sawaf

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

An important challenge to statistical machine translation (SMT) is the lack of parallel data for many language pairs. One common solution is to pivot through a third language for which there exist parallel corpora with the source and target languages. Although pivoting is a robust technique, it introduces some low-quality translations. In this paper, we present two language-independent features to improve the quality of phrase-pivot-based SMT. The features, source connectivity strength and target connectivity strength, reflect the quality of projected alignments between the source and target phrases in the pivot phrase table. We show positive results (0.6 BLEU points) on Persian-Arabic SMT as a case study.
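As a rough illustration of the idea in the abstract, the Python sketch below composes word alignments through a pivot phrase and scores how well the resulting source and target phrases are connected. The scoring rule used here (the fraction of words on each side that have at least one projected link) is an assumption made for illustration; the abstract does not state the paper's exact formulas, and the function names are hypothetical.

```python
from typing import Set, Tuple

# A word alignment is a set of (left_position, right_position) links.
Alignment = Set[Tuple[int, int]]


def project_alignment(src_pvt: Alignment, pvt_tgt: Alignment) -> Alignment:
    """Compose source-pivot and pivot-target links that share a pivot position."""
    return {(s, t) for (s, p1) in src_pvt for (p2, t) in pvt_tgt if p1 == p2}


def connectivity_strengths(
    src_len: int, tgt_len: int, projected: Alignment
) -> Tuple[float, float]:
    """Return illustrative (source, target) connectivity scores.

    Assumption: each score is the share of words on that side that
    carry at least one projected alignment link.
    """
    if src_len <= 0 or tgt_len <= 0:
        raise ValueError("phrase lengths must be positive")
    linked_src = {s for s, _ in projected}
    linked_tgt = {t for _, t in projected}
    return len(linked_src) / src_len, len(linked_tgt) / tgt_len


# Toy phrase pair: 2 source words, 2 pivot words, 3 target words.
src_pvt = {(0, 0), (1, 1)}   # source word -> pivot word
pvt_tgt = {(0, 0), (1, 2)}   # pivot word -> target word
projected = project_alignment(src_pvt, pvt_tgt)
scs, tcs = connectivity_strengths(2, 3, projected)
```

In this toy case every source word ends up linked, while one of the three target words has no projected link, so the target-side score is lower. A pivot phrase pair with weak scores on either side would be a candidate low-quality translation.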

Original language: English (US)
Title of host publication: Short Papers
Publisher: Association for Computational Linguistics (ACL)
Pages: 412-418
Number of pages: 7
Volume: 2
ISBN (Print): 9781937284510
State: Published - Jan 1 2013
Event: 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013 - Sofia, Bulgaria
Duration: Aug 4 2013 to Aug 9 2013

Fingerprint

Statistical Machine Translation
Connectivity
Language
Parallel Corpora
Alignment

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

El Kholy, A., Habash, N., Leusch, G., Matusov, E., & Sawaf, H. (2013). Language independent connectivity strength features for phrase pivot statistical machine translation. In Short Papers (Vol. 2, pp. 412-418). Association for Computational Linguistics (ACL).

@inproceedings{405fa51228ab453cbf1752a717f07ec3,
title = "Language independent connectivity strength features for phrase pivot statistical machine translation",
author = "{El Kholy}, Ahmed and Nizar Habash and Gregor Leusch and Evgeny Matusov and Hassan Sawaf",
year = "2013",
month = "1",
day = "1",
language = "English (US)",
isbn = "9781937284510",
volume = "2",
pages = "412--418",
booktitle = "Short Papers",
publisher = "Association for Computational Linguistics (ACL)",
}

TY - GEN

T1 - Language independent connectivity strength features for phrase pivot statistical machine translation

AU - El Kholy, Ahmed

AU - Habash, Nizar

AU - Leusch, Gregor

AU - Matusov, Evgeny

AU - Sawaf, Hassan

PY - 2013/1/1

Y1 - 2013/1/1

UR - http://www.scopus.com/inward/record.url?scp=84907372613&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84907372613&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84907372613

SN - 9781937284510

VL - 2

SP - 412

EP - 418

BT - Short Papers

PB - Association for Computational Linguistics (ACL)

ER -