MADAMIRA: A fast, comprehensive tool for morphological analysis and disambiguation of Arabic

Arfath Pasha, Mohamed Al-Badrashiny, Mona Diab, Ahmed El Kholy, Ramy Eskander, Nizar Habash, Manoj Pooleery, Owen Rambow, Ryan M. Roth

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

Abstract

In this paper, we present MADAMIRA, a system for morphological analysis and disambiguation of Arabic that combines some of the best aspects of two previously commonly used systems for Arabic processing, MADA (Habash and Rambow, 2005; Habash et al., 2009; Habash et al., 2013) and AMIRA (Diab et al., 2007). MADAMIRA improves upon the two systems with a more streamlined Java implementation that is more robust, portable, and extensible, and that is faster than its ancestors by more than an order of magnitude. We also discuss an online demo (see http://nlp.ldeo.columbia.edu/madamira/) that highlights these aspects.

Original language: English (US)
Title of host publication: Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014
Publisher: European Language Resources Association (ELRA)
Pages: 1094-1101
Number of pages: 8
ISBN (Electronic): 9782951740884
State: Published - Jan 1 2014
Event: 9th International Conference on Language Resources and Evaluation, LREC 2014 - Reykjavik, Iceland
Duration: May 26 2014 to May 31 2014

Other

Other: 9th International Conference on Language Resources and Evaluation, LREC 2014
Country: Iceland
City: Reykjavik
Period: 5/26/14 to 5/31/14

Fingerprint

Morphological Analysis
Disambiguation
Java
Ancestors

Keywords

  • Base Phrase Chunking
  • Morphological Analysis
  • Morphological Tagging
  • Named Entity Recognition

ASJC Scopus subject areas

  • Linguistics and Language
  • Library and Information Sciences
  • Education
  • Language and Linguistics

Cite this

Pasha, A., Al-Badrashiny, M., Diab, M., El Kholy, A., Eskander, R., Habash, N., ... Roth, R. M. (2014). MADAMIRA: A fast, comprehensive tool for morphological analysis and disambiguation of Arabic. In Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014 (pp. 1094-1101). European Language Resources Association (ELRA).

@inproceedings{7a6d0b4a13fa4eb9b1bce0a5082be8a8,
title = "MADAMIRA: A fast, comprehensive tool for morphological analysis and disambiguation of Arabic",
abstract = "In this paper, we present MADAMIRA, a system for morphological analysis and disambiguation of Arabic that combines some of the best aspects of two previously commonly used systems for Arabic processing, MADA (Habash and Rambow, 2005; Habash et al., 2009; Habash et al., 2013) and AMIRA (Diab et al., 2007). MADAMIRA improves upon the two systems with a more streamlined Java implementation that is more robust, portable, extensible, and is faster than its ancestors by more than an order of magnitude. We also discuss an online demo (see http://nlp.ldeo.columbia.edu/madamira/) that highlights these aspects.",
keywords = "Base Phrase Chunking, Morphological Analysis, Morphological Tagging, Named Entity Recognition",
author = "Arfath Pasha and Mohamed Al-Badrashiny and Mona Diab and {El Kholy}, Ahmed and Ramy Eskander and Nizar Habash and Manoj Pooleery and Owen Rambow and Roth, {Ryan M.}",
year = "2014",
month = "1",
day = "1",
language = "English (US)",
pages = "1094--1101",
booktitle = "Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014",
publisher = "European Language Resources Association (ELRA)",

}

TY - GEN

T1 - MADAMIRA

T2 - A fast, comprehensive tool for morphological analysis and disambiguation of Arabic

AU - Pasha, Arfath

AU - Al-Badrashiny, Mohamed

AU - Diab, Mona

AU - El Kholy, Ahmed

AU - Eskander, Ramy

AU - Habash, Nizar

AU - Pooleery, Manoj

AU - Rambow, Owen

AU - Roth, Ryan M.

PY - 2014/1/1

Y1 - 2014/1/1

N2 - In this paper, we present MADAMIRA, a system for morphological analysis and disambiguation of Arabic that combines some of the best aspects of two previously commonly used systems for Arabic processing, MADA (Habash and Rambow, 2005; Habash et al., 2009; Habash et al., 2013) and AMIRA (Diab et al., 2007). MADAMIRA improves upon the two systems with a more streamlined Java implementation that is more robust, portable, extensible, and is faster than its ancestors by more than an order of magnitude. We also discuss an online demo (see http://nlp.ldeo.columbia.edu/madamira/) that highlights these aspects.

KW - Base Phrase Chunking

KW - Morphological Analysis

KW - Morphological Tagging

KW - Named Entity Recognition

UR - http://www.scopus.com/inward/record.url?scp=85034832585&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85034832585&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85034832585

SP - 1094

EP - 1101

BT - Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014

PB - European Language Resources Association (ELRA)

ER -