Machine translation evaluation for Arabic using morphologically-enriched embeddings

Francisco Guzmán, Houda Bouamor, Ramy Baly, Nizar Habash

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Evaluation of machine translation (MT) into morphologically rich languages (MRLs) has not been well studied, despite posing many challenges. In this paper, we explore the use of embeddings obtained from different levels of lexical and morpho-syntactic linguistic analysis and show that they improve MT evaluation into an MRL. Specifically, we report on Arabic, a language with complex and rich morphology. Our results show that a neural-network model with different input representations clearly outperforms the state of the art for MT evaluation into Arabic, with an increase of roughly 75% in correlation with human judgments on a pairwise MT evaluation quality task. More importantly, we demonstrate the usefulness of morpho-syntactic representations for modeling sentence similarity in MT evaluation and for addressing complex linguistic phenomena of Arabic.

Original language: English (US)
Title of host publication: COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016
Subtitle of host publication: Technical Papers
Publisher: Association for Computational Linguistics, ACL Anthology
Pages: 1398-1408
Number of pages: 11
ISBN (Print): 9784879747020
State: Published - Jan 1 2016
Event: 26th International Conference on Computational Linguistics, COLING 2016 - Osaka, Japan
Duration: Dec 11 2016 to Dec 16 2016

Other

Other: 26th International Conference on Computational Linguistics, COLING 2016
Country: Japan
City: Osaka
Period: 12/11/16 to 12/16/16

Fingerprint

Machine Translation
Evaluation
Language
Linguistics
Syntax
Syntactics
Neural networks

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Language and Linguistics
  • Linguistics and Language

Cite this

Guzmán, F., Bouamor, H., Baly, R., & Habash, N. (2016). Machine translation evaluation for Arabic using morphologically-enriched embeddings. In COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers (pp. 1398-1408). Association for Computational Linguistics, ACL Anthology.

@inproceedings{83dcfe47870f456ba02ab10b9907d103,
title = "Machine translation evaluation for Arabic using morphologically-enriched embeddings",
abstract = "Evaluation of machine translation (MT) into morphologically rich languages (MRLs) has not been well studied, despite posing many challenges. In this paper, we explore the use of embeddings obtained from different levels of lexical and morpho-syntactic linguistic analysis and show that they improve MT evaluation into an MRL. Specifically, we report on Arabic, a language with complex and rich morphology. Our results show that a neural-network model with different input representations clearly outperforms the state of the art for MT evaluation into Arabic, with an increase of roughly 75{\%} in correlation with human judgments on a pairwise MT evaluation quality task. More importantly, we demonstrate the usefulness of morpho-syntactic representations for modeling sentence similarity in MT evaluation and for addressing complex linguistic phenomena of Arabic.",
author = "Francisco Guzm{\'a}n and Houda Bouamor and Ramy Baly and Nizar Habash",
year = "2016",
month = "1",
day = "1",
language = "English (US)",
isbn = "9784879747020",
pages = "1398--1408",
booktitle = "COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016",
publisher = "Association for Computational Linguistics, ACL Anthology",

}

TY  - GEN
T1  - Machine translation evaluation for Arabic using morphologically-enriched embeddings
AU  - Guzmán, Francisco
AU  - Bouamor, Houda
AU  - Baly, Ramy
AU  - Habash, Nizar
PY  - 2016/1/1
Y1  - 2016/1/1
AB  - Evaluation of machine translation (MT) into morphologically rich languages (MRLs) has not been well studied, despite posing many challenges. In this paper, we explore the use of embeddings obtained from different levels of lexical and morpho-syntactic linguistic analysis and show that they improve MT evaluation into an MRL. Specifically, we report on Arabic, a language with complex and rich morphology. Our results show that a neural-network model with different input representations clearly outperforms the state of the art for MT evaluation into Arabic, with an increase of roughly 75% in correlation with human judgments on a pairwise MT evaluation quality task. More importantly, we demonstrate the usefulness of morpho-syntactic representations for modeling sentence similarity in MT evaluation and for addressing complex linguistic phenomena of Arabic.
UR  - http://www.scopus.com/inward/record.url?scp=85048001534&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85048001534&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:85048001534
SN  - 9784879747020
SP  - 1398
EP  - 1408
BT  - COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016
PB  - Association for Computational Linguistics, ACL Anthology
ER  -