Domain adaptation with active learning for named entity recognition

Huiyu Sun, Ralph Grishman, Yingchao Wang

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

One of the dominant problems facing Named Entity Recognition is that when a system trained on one domain is applied to a different domain, a substantial drop in performance is frequently observed. In this paper, we apply active learning strategies to domain adaptation for named entity recognition systems and show that adaptive learning combining the source and target domains is more effective than nonadaptive learning directly from the target domain. Active learning aims to minimize labeling effort by selecting the most informative instances to label. We investigate several sample selection techniques such as Maximum Entropy and Smallest Margin and apply them to the ACE corpus. Our results show that the labeling cost can be reduced by over 92% without degrading the performance.
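
The abstract names Maximum Entropy and Smallest Margin as sample selection techniques but does not give their formulas. As a rough illustration only, the sketch below uses the textbook uncertainty-sampling definitions: entropy of the predicted label distribution, and the gap between the two most probable labels. The function names, the toy `pool`, and the use of one distribution per instance are assumptions made for this example; a real NER system would score whole sentences or token sequences, and the paper's actual scoring may differ.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def margin(probs):
    """Gap between the two most probable labels; a small gap means uncertainty."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def select_batch(pool, k, strategy="entropy"):
    """Return the indices of the k pool items the model is least sure about.

    pool: list of label distributions, one per unlabeled instance (hypothetical input).
    """
    indices = range(len(pool))
    if strategy == "entropy":
        # Maximum entropy: highest entropy first.
        ranked = sorted(indices, key=lambda i: entropy(pool[i]), reverse=True)
    elif strategy == "margin":
        # Smallest margin: smallest top-two gap first.
        ranked = sorted(indices, key=lambda i: margin(pool[i]))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return ranked[:k]


# Toy pool of predicted distributions over three labels (made-up numbers).
pool = [
    [0.98, 0.01, 0.01],
    [0.40, 0.35, 0.25],
    [0.50, 0.49, 0.01],
    [0.70, 0.20, 0.10],
]

if __name__ == "__main__":
    print(select_batch(pool, 2, "entropy"))
    print(select_batch(pool, 2, "margin"))
```

The two strategies can disagree: entropy favors distributions that are spread over many labels, while margin favors a near-tie between the top two labels even when the remaining labels are unlikely.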

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 611-622
Number of pages: 12
Volume: 10040
DOIs: https://doi.org/10.1007/978-3-319-48674-1_54
State: Published - 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10040
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Keywords

  • Active learning
  • Domain adaptation
  • Named entity recognition

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Sun, H., Grishman, R., & Wang, Y. (2016). Domain adaptation with active learning for named entity recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10040, pp. 611-622). Springer Verlag. https://doi.org/10.1007/978-3-319-48674-1_54

@inbook{607ea17e553b4d32b2bc24e0b467fc3b,
title = "Domain adaptation with active learning for named entity recognition",
abstract = "One of the dominant problems facing Named Entity Recognition is that when a system trained on one domain is applied to a different domain, a substantial drop in performance is frequently observed. In this paper, we apply active learning strategies to domain adaptation for named entity recognition systems and show that adaptive learning combining the source and target domains is more effective than nonadaptive learning directly from the target domain. Active learning aims to minimize labeling effort by selecting the most informative instances to label. We investigate several sample selection techniques such as Maximum Entropy and Smallest Margin and apply them to the ACE corpus. Our results show that the labeling cost can be reduced by over 92{\%} without degrading the performance.",
keywords = "Active learning, Domain adaptation, Named entity recognition",
author = "Huiyu Sun and Ralph Grishman and Yingchao Wang",
year = "2016",
doi = "10.1007/978-3-319-48674-1_54",
language = "English (US)",
volume = "10040",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "611--622",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
address = "Germany",

}

TY - CHAP
T1 - Domain adaptation with active learning for named entity recognition
AU - Sun, Huiyu
AU - Grishman, Ralph
AU - Wang, Yingchao
PY - 2016
Y1 - 2016
N2 - One of the dominant problems facing Named Entity Recognition is that when a system trained on one domain is applied to a different domain, a substantial drop in performance is frequently observed. In this paper, we apply active learning strategies to domain adaptation for named entity recognition systems and show that adaptive learning combining the source and target domains is more effective than nonadaptive learning directly from the target domain. Active learning aims to minimize labeling effort by selecting the most informative instances to label. We investigate several sample selection techniques such as Maximum Entropy and Smallest Margin and apply them to the ACE corpus. Our results show that the labeling cost can be reduced by over 92% without degrading the performance.

KW - Active learning
KW - Domain adaptation
KW - Named entity recognition
UR - http://www.scopus.com/inward/record.url?scp=85015318795&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85015318795&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-48674-1_54
DO - 10.1007/978-3-319-48674-1_54
M3 - Chapter
VL - 10040
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 611
EP - 622
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
PB - Springer Verlag
ER -