Don’t throw those morphological analyzers away just yet: Neural morphological disambiguation for Arabic

Nasser Zalmout, Nizar Habash

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a model for Arabic morphological disambiguation based on Recurrent Neural Networks (RNN). We train Long Short-Term Memory (LSTM) cells in several configurations and embedding levels to model the various morphological features. Our experiments show that these models outperform state-of-the-art systems without explicit use of feature engineering. However, adding learning features from a morphological analyzer to model the space of possible analyses provides additional improvement. We use the resulting morphological models to score and rank the analyses produced by the morphological analyzer for morphological disambiguation. The results show significant gains in accuracy across several evaluation metrics. Our system achieves a 4.4% absolute increase over the state of the art in full morphological analysis accuracy (30.6% relative error reduction), and a 10.6% absolute increase (31.5% relative error reduction) for out-of-vocabulary words.
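
The scoring-and-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (`pos`, `gen`, `num`), the toy probabilities, the log-probability sum used as the score, and the helper names `score_analysis` and `rank_analyses` are all assumptions made for illustration. The idea shown is only that per-feature model predictions are used to rank the candidate analyses that a morphological analyzer returns for a word.

```python
import math
from typing import Dict, List

# Hypothetical types: one probability distribution per morphological feature,
# and one candidate analysis as a feature -> value mapping.
FeatureProbs = Dict[str, Dict[str, float]]
Analysis = Dict[str, str]


def score_analysis(analysis: Analysis, probs: FeatureProbs, floor: float = 1e-6) -> float:
    """Sum the log-probabilities the feature models assign to this analysis.

    Values the models never predicted fall back to a small floor probability.
    This scoring rule is an illustrative assumption, not the paper's formula.
    """
    total = 0.0
    for feature, value in analysis.items():
        p = probs.get(feature, {}).get(value, floor)
        total += math.log(max(p, floor))
    return total


def rank_analyses(analyses: List[Analysis], probs: FeatureProbs) -> List[Analysis]:
    """Return the analyzer's candidates sorted from best to worst score."""
    return sorted(analyses, key=lambda a: score_analysis(a, probs), reverse=True)


# Toy model output for a single word (made-up numbers).
tagger_probs: FeatureProbs = {
    "pos": {"noun": 0.7, "verb": 0.3},
    "gen": {"m": 0.6, "f": 0.4},
    "num": {"s": 0.9, "p": 0.1},
}

# Toy candidate analyses, standing in for a morphological analyzer's output.
candidates: List[Analysis] = [
    {"pos": "verb", "gen": "m", "num": "s"},
    {"pos": "noun", "gen": "f", "num": "s"},
    {"pos": "noun", "gen": "m", "num": "p"},
]

best = rank_analyses(candidates, tagger_probs)[0]
print(best)
```

In this toy setup the combination that each feature model prefers on its own (noun, masculine, singular) is not among the candidates, so the ranking picks the best-scoring analysis that the analyzer actually licenses. That is one way to read the abstract's point that the analyzer models the space of possible analyses.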

Original language: English (US)
Title of host publication: EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 704-713
Number of pages: 10
ISBN (Electronic): 9781945626838
State: Published - Jan 1 2017
Event: 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017 - Copenhagen, Denmark
Duration: Sep 9 2017 to Sep 11 2017

Publication series

Name: EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference: 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017
Country: Denmark
City: Copenhagen
Period: 9/9/17 to 9/11/17

Fingerprint

Recurrent neural networks
Experiments
Long short-term memory

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Computational Theory and Mathematics

Cite this

Zalmout, N., & Habash, N. (2017). Don’t throw those morphological analyzers away just yet: Neural morphological disambiguation for Arabic. In EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 704-713). (EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings). Association for Computational Linguistics (ACL).

@inproceedings{df38eff1beae43a6bb4bda1e32833dc6,
title = "Don’t throw those morphological analyzers away just yet: Neural morphological disambiguation for Arabic",
author = "Nasser Zalmout and Nizar Habash",
year = "2017",
month = "1",
day = "1",
language = "English (US)",
series = "EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings",
publisher = "Association for Computational Linguistics (ACL)",
pages = "704--713",
booktitle = "EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings",

}

TY - GEN

T1 - Don’t throw those morphological analyzers away just yet

T2 - Neural morphological disambiguation for Arabic

AU - Zalmout, Nasser

AU - Habash, Nizar

PY - 2017/1/1

Y1 - 2017/1/1

UR - http://www.scopus.com/inward/record.url?scp=85055682298&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85055682298&partnerID=8YFLogxK

M3 - Conference contribution

T3 - EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings

SP - 704

EP - 713

BT - EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings

PB - Association for Computational Linguistics (ACL)

ER -