Hybrid Natural Language Generation from Lexical Conceptual Structures

Nizar Habash, Bonnie Dorr, David Traum

Research output: Contribution to journal (Review article)

Abstract

This paper describes Lexogen, a system for generating natural-language sentences from Lexical Conceptual Structure, an interlingual representation. The system has been developed as part of a Chinese-English Machine Translation (MT) system; however, it is designed to be used for many other MT language pairs and natural language applications. The contributions of this work include: (1) development of a large-scale Hybrid Natural Language Generation system with language-independent components; (2) enhancements to an interlingual representation and associated algorithm for generation from ambiguous input; (3) development of an efficient reusable language-independent linearization module with a grammar description language that can be used with other systems; (4) improvements to an earlier algorithm for hierarchically mapping thematic roles to surface positions; and (5) development of a diagnostic tool for lexicon coverage and correctness and use of the tool for verification of English, Spanish, and Chinese lexicons. An evaluation of Chinese-English translation quality shows comparable performance with a commercial translation system. The generation system can also be extended to other languages and this is demonstrated and evaluated for Spanish.

Original language: English (US)
Pages (from-to): 81-127
Number of pages: 47
Journal: Machine Translation
Volume: 18
Issue number: 2
DOI: 10.1023/B:COAT.0000020960.27186.18
State: Published - Dec 1 2003

Keywords

  • Hybrid Natural Language Generation
  • Interlingua
  • Lexical Conceptual Structure
  • Multilingual Natural Language Generation

ASJC Scopus subject areas

  • Software
  • Language and Linguistics
  • Linguistics and Language
  • Artificial Intelligence

Cite this

Hybrid Natural Language Generation from Lexical Conceptual Structures. / Habash, Nizar; Dorr, Bonnie; Traum, David.

In: Machine Translation, Vol. 18, No. 2, 01.12.2003, p. 81-127.

@article{c335029d5d764057bbd5bca6bddd38b1,
title = "Hybrid Natural Language Generation from Lexical Conceptual Structures",
abstract = "This paper describes Lexogen, a system for generating natural-language sentences from Lexical Conceptual Structure, an interlingual representation. The system has been developed as part of a Chinese-English Machine Translation (MT) system; however, it is designed to be used for many other MT language pairs and natural language applications. The contributions of this work include: (1) development of a large-scale Hybrid Natural Language Generation system with language-independent components; (2) enhancements to an interlingual representation and associated algorithm for generation from ambiguous input; (3) development of an efficient reusable language-independent linearization module with a grammar description language that can be used with other systems; (4) improvements to an earlier algorithm for hierarchically mapping thematic roles to surface positions; and (5) development of a diagnostic tool for lexicon coverage and correctness and use of the tool for verification of English, Spanish, and Chinese lexicons. An evaluation of Chinese-English translation quality shows comparable performance with a commercial translation system. The generation system can also be extended to other languages and this is demonstrated and evaluated for Spanish.",
keywords = "Hybrid Natural Language Generation, Interlingua, Lexical Conceptual Structure, Multilingual Natural Language Generation",
author = "Nizar Habash and Bonnie Dorr and David Traum",
year = "2003",
month = "12",
day = "1",
doi = "10.1023/B:COAT.0000020960.27186.18",
language = "English (US)",
volume = "18",
pages = "81--127",
journal = "Machine Translation",
issn = "0922-6567",
publisher = "Springer Netherlands",
number = "2"
}

TY - JOUR

T1 - Hybrid Natural Language Generation from Lexical Conceptual Structures

AU - Habash, Nizar

AU - Dorr, Bonnie

AU - Traum, David

PY - 2003/12/1

Y1 - 2003/12/1

N2 - This paper describes Lexogen, a system for generating natural-language sentences from Lexical Conceptual Structure, an interlingual representation. The system has been developed as part of a Chinese-English Machine Translation (MT) system; however, it is designed to be used for many other MT language pairs and natural language applications. The contributions of this work include: (1) development of a large-scale Hybrid Natural Language Generation system with language-independent components; (2) enhancements to an interlingual representation and associated algorithm for generation from ambiguous input; (3) development of an efficient reusable language-independent linearization module with a grammar description language that can be used with other systems; (4) improvements to an earlier algorithm for hierarchically mapping thematic roles to surface positions; and (5) development of a diagnostic tool for lexicon coverage and correctness and use of the tool for verification of English, Spanish, and Chinese lexicons. An evaluation of Chinese-English translation quality shows comparable performance with a commercial translation system. The generation system can also be extended to other languages and this is demonstrated and evaluated for Spanish.

KW - Hybrid Natural Language Generation

KW - Interlingua

KW - Lexical Conceptual Structure

KW - Multilingual Natural Language Generation

UR - http://www.scopus.com/inward/record.url?scp=2342636469&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=2342636469&partnerID=8YFLogxK

U2 - 10.1023/B:COAT.0000020960.27186.18

DO - 10.1023/B:COAT.0000020960.27186.18

M3 - Review article

VL - 18

SP - 81

EP - 127

JO - Machine Translation

JF - Machine Translation

SN - 0922-6567

IS - 2

ER -