Dialectal Arabic to english machine translation: Pivoting through modern standard Arabic

Wael Salloum, Nizar Habash

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Modern Standard Arabic (MSA) has a wealth of natural language processing (NLP) tools and resources. In comparison, resources for dialectal Arabic (DA), the unstandardized spoken varieties of Arabic, are still lacking. We present ELISSA, a machine translation (MT) system for DA to MSA. ELISSA employs a rule-based approach that relies on morphological analysis, transfer rules and dictionaries in addition to language models to produce MSA paraphrases of DA sentences. ELISSA can be employed as a general preprocessor for DA when using MSA NLP tools. A manual error analysis of ELISSA's output shows that it produces correct MSA translations over 93% of the time. Using ELISSA to produce MSA versions of DA sentences as part of an MSA-pivoting DA-to-English MT solution, improves BLEU scores on multiple blind test sets between 0.6% and 1.4%.

Original languageEnglish (US)
Title of host publicationNAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publicationHuman Language Technologies, Proceedings of the Main Conference
PublisherAssociation for Computational Linguistics (ACL)
Pages348-358
Number of pages11
ISBN (Electronic)9781937284473
StatePublished - Jan 1 2013
Event2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013 - Atlanta, United States
Duration: Jun 9 2013Jun 14 2013

Publication series

NameNAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference

Other

Other2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013
CountryUnited States
CityAtlanta
Period6/9/136/14/13

    Fingerprint

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Linguistics and Language

Cite this

Salloum, W., & Habash, N. (2013). Dialectal Arabic to english machine translation: Pivoting through modern standard Arabic. In NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference (pp. 348-358). (NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference). Association for Computational Linguistics (ACL).