Confidence estimation for knowledge base population

Xiang Li, Ralph Grishman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Information extraction systems automatically extract structured information from machine-readable documents, such as newswire, web, and multimedia. Despite significant improvement, the performance is far from perfect. Hence, it is useful to accurately estimate confidence in the correctness of the extracted information. Using the Knowledge Base Population Slot Filling task as a case study, we propose a confidence estimation model based on the Maximum Entropy framework, obtaining an average precision of 83.5%, Pearson coefficient of 54.2%, and 2.3% absolute improvement in F-measure score through a weighted voting strategy.

Original language: English (US)
Title of host publication: International Conference Recent Advances in Natural Language Processing, RANLP
Pages: 396-401
Number of pages: 6
State: Published - 2013
Event: 9th International Conference on Recent Advances in Natural Language Processing, RANLP 2013 - Hissar, Bulgaria
Duration: Sep 9 2013 to Sep 11 2013

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Software
  • Electrical and Electronic Engineering

Cite this

Li, X., & Grishman, R. (2013). Confidence estimation for knowledge base population. In International Conference Recent Advances in Natural Language Processing, RANLP (pp. 396-401).
