Information Extraction

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Information extraction (IE) is the automatic identification of selected types of entities, relations, or events in free text. This article appraises two specific strands of IE: name identification and classification, and event extraction. Conventional treatments of language pay little attention to proper names, addresses, and similar items. Presentations of language analysis generally look up words in a dictionary and identify them as nouns or other parts of speech. Names are pervasive in text, and unless they are identified by type and treated as linguistic units, linguistic analysis of that text is difficult. Name tagging involves creating several finite-state patterns, each corresponding to some noun subset. Elements of the patterns match specific tokens or classes of tokens with particular features. Event extraction typically works by creating a series of regular expressions customized to capture the relevant events. Each enhancement of these expressions is matched by a corresponding enhancement in the event patterns.
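
The two strands described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the word lists, the pattern format, the `HIRE_EVENT` expression, and every function name below are hypothetical and chosen only for illustration. Name tagging is approximated by finite-state patterns whose elements test token features (membership in a title list, capitalization), and event extraction by a regular expression applied to text in which the tagged names already appear as single units.

```python
import re

# Hypothetical word lists; a real name tagger would rely on far larger resources.
TITLES = {"Mr.", "Ms.", "Dr."}
COMPANY_SUFFIXES = {"Inc.", "Corp.", "Ltd."}


def is_capitalized(tok):
    return tok[:1].isupper()


# Each pattern is a sequence of (token predicate, quantifier) elements.
# "1" means exactly one token, "+" means one or more consecutive tokens.
PATTERNS = {
    "PERSON": [(lambda t: t in TITLES, "1"), (is_capitalized, "+")],
    "ORG": [
        (lambda t: is_capitalized(t) and t not in COMPANY_SUFFIXES, "+"),
        (lambda t: t in COMPANY_SUFFIXES, "1"),
    ],
}


def match_at(tokens, start, pattern):
    """Return the end index if the pattern matches at start, else None (greedy)."""
    i = start
    for pred, quant in pattern:
        if i >= len(tokens) or not pred(tokens[i]):
            return None
        i += 1
        if quant == "+":
            while i < len(tokens) and pred(tokens[i]):
                i += 1
    return i


def tag_names(tokens):
    """Scan left to right and return (label, start, end) spans for matched names."""
    spans, i = [], 0
    while i < len(tokens):
        for label, pattern in PATTERNS.items():
            end = match_at(tokens, i, pattern)
            if end is not None:
                spans.append((label, i, end))
                i = end
                break
        else:
            i += 1
    return spans


def mark_up(tokens, spans):
    """Wrap each name in <LABEL>...</LABEL> so event patterns see it as one unit."""
    out, i = [], 0
    for label, start, end in spans:
        out.extend(tokens[i:start])
        out.append(f"<{label}>{' '.join(tokens[start:end])}</{label}>")
        i = end
    out.extend(tokens[i:])
    return " ".join(out)


# Hypothetical event pattern: a person joining an organization.
HIRE_EVENT = re.compile(
    r"<PERSON>(?P<person>.+?)</PERSON> (?:joined|was hired by) <ORG>(?P<org>.+?)</ORG>"
)


def extract_hires(text):
    tokens = text.split()
    marked = mark_up(tokens, tag_names(tokens))
    return [m.groupdict() for m in HIRE_EVENT.finditer(marked)]
```

Calling `extract_hires("Dr. Jane Smith joined Acme Widgets Inc. yesterday")` runs the name patterns first and the event expression second. The ordering reflects the abstract's point that names need to be identified by type and as units before further analysis, so the event pattern can refer to `PERSON` and `ORG` instead of raw token sequences.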

Original language: English (US)
Title of host publication: The Oxford Handbook of Computational Linguistics
Publisher: Oxford University Press
Volume: 9780199276349
ISBN (Electronic): 9780191743573
ISBN (Print): 9780199276349
DOIs: https://doi.org/10.1093/oxfordhb/9780199276349.013.0030
State: Published - Sep 18 2012

Keywords

  • Automatic
  • Event
  • Linguistic analysis
  • Name
  • Patterns
  • Tagging

ASJC Scopus subject areas

  • Arts and Humanities (all)
  • Social Sciences (all)

Cite this

Grishman, R. (2012). Information Extraction. In The Oxford Handbook of Computational Linguistics (Vol. 9780199276349). Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199276349.013.0030