A two-level approach for subtitle alignment

Jia Huang, Hao Ding, Xiaohua Hu, Yong Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we propose a two-level Needleman-Wunsch algorithm to align two subtitle files. We treat each subtitle file as a sequence of sentences and each sentence as a sequence of characters. Our algorithm aligns the OCR and Web subtitles at both the sentence level and the character level. Experiments on ten datasets from two TV shows indicate that our algorithm outperforms state-of-the-art approaches, with an average precision of 0.96 and an average recall of 0.95.
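The two-level idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the match, mismatch, and gap scores, the length normalisation, and the function names `needleman_wunsch`, `char_similarity`, and `align_subtitles` are all assumptions made for illustration. A character-level Needleman-Wunsch pass scores how similar two sentences are, and a sentence-level pass uses that score to align the two files.

```python
from functools import lru_cache


def needleman_wunsch(a, b, score, gap=-1.0):
    """Global alignment of sequences a and b.

    Returns (total_score, pairs), where pairs is a list of (i, j) index
    pairs and None marks a gap on that side.
    """
    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # align both
                dp[i - 1][j] + gap,                            # gap in b
                dp[i][j - 1] + gap,                            # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal path.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and dp[i][j] == dp[i - 1][j - 1] + score(a[i - 1], b[j - 1])):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or dp[i][j] == dp[i - 1][j] + gap):
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs


@lru_cache(maxsize=None)
def char_similarity(s, t):
    """Character level: alignment score normalised by the longer length.

    Assumed scoring: +1 match, -1 mismatch, -1 gap, so the result is in [-1, 1].
    """
    if not s and not t:
        return 1.0
    raw, _ = needleman_wunsch(s, t, lambda x, y: 1.0 if x == y else -1.0)
    return raw / max(len(s), len(t))


def align_subtitles(ocr, web, gap=-0.5):
    """Sentence level: align two lists of sentences using char_similarity."""
    _, pairs = needleman_wunsch(ocr, web, char_similarity, gap=gap)
    return pairs


# Hypothetical data: OCR output with recognition errors and a missing line.
ocr_lines = ["He1lo there", "How are y0u", "Fine"]
web_lines = ["Hello there", "Some extra line", "How are you", "Fine"]
for i, j in align_subtitles(ocr_lines, web_lines):
    print(ocr_lines[i] if i is not None else "-", "|",
          web_lines[j] if j is not None else "-")
```

Caching `char_similarity` avoids recomputing the character-level alignment during the sentence-level traceback. The sentence-level gap penalty of -0.5 is an arbitrary choice here and would need tuning on real data.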

Original language: English (US)
Title of host publication: Advances in Information Retrieval - 36th European Conference on IR Research, ECIR 2014, Proceedings
Publisher: Springer Verlag
Pages: 468-473
Number of pages: 6
Volume: 8416 LNCS
ISBN (Print): 9783319060279
DOIs: https://doi.org/10.1007/978-3-319-06028-6_43
State: Published - 2014
Event: 36th European Conference on Information Retrieval, ECIR 2014 - Amsterdam, Netherlands
Duration: Apr 13 2014 → Apr 16 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8416 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 36th European Conference on Information Retrieval, ECIR 2014
Country: Netherlands
City: Amsterdam
Period: 4/13/14 → 4/16/14

Keywords

  • dynamic programming
  • sequence alignment
  • subtitle alignment

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Huang, J., Ding, H., Hu, X., & Liu, Y. (2014). A two-level approach for subtitle alignment. In Advances in Information Retrieval - 36th European Conference on IR Research, ECIR 2014, Proceedings (Vol. 8416 LNCS, pp. 468-473). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8416 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-06028-6_43

@inproceedings{91f2f57363d946ed853c06011391ccd5,
title = "A two-level approach for subtitle alignment",
abstract = "In this paper, we propose a two-level Needleman-Wunsch algorithm to align two subtitle files. We consider each subtitle file as a sequence of sentences, and each sentence as a sequence of characters. Our algorithm aligns the OCR and Web subtitles from both sentence level and character level. Experiments on ten datasets from two TV shows indicate that our algorithm outperforms the state-of-the-art approaches with an average precision and recall of 0.96 and 0.95.",
keywords = "dynamic programming, sequence alignment, subtitle alignment",
author = "Jia Huang and Hao Ding and Xiaohua Hu and Yong Liu",
year = "2014",
doi = "10.1007/978-3-319-06028-6_43",
language = "English (US)",
isbn = "9783319060279",
volume = "8416 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "468--473",
booktitle = "Advances in Information Retrieval - 36th European Conference on IR Research, ECIR 2014, Proceedings",

}

TY - GEN

T1 - A two-level approach for subtitle alignment

AU - Huang, Jia

AU - Ding, Hao

AU - Hu, Xiaohua

AU - Liu, Yong

PY - 2014

Y1 - 2014

N2 - In this paper, we propose a two-level Needleman-Wunsch algorithm to align two subtitle files. We consider each subtitle file as a sequence of sentences, and each sentence as a sequence of characters. Our algorithm aligns the OCR and Web subtitles from both sentence level and character level. Experiments on ten datasets from two TV shows indicate that our algorithm outperforms the state-of-the-art approaches with an average precision and recall of 0.96 and 0.95.

AB - In this paper, we propose a two-level Needleman-Wunsch algorithm to align two subtitle files. We consider each subtitle file as a sequence of sentences, and each sentence as a sequence of characters. Our algorithm aligns the OCR and Web subtitles from both sentence level and character level. Experiments on ten datasets from two TV shows indicate that our algorithm outperforms the state-of-the-art approaches with an average precision and recall of 0.96 and 0.95.

KW - dynamic programming

KW - sequence alignment

KW - subtitle alignment

UR - http://www.scopus.com/inward/record.url?scp=84899951243&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84899951243&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-06028-6_43

DO - 10.1007/978-3-319-06028-6_43

M3 - Conference contribution

AN - SCOPUS:84899951243

SN - 9783319060279

VL - 8416 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 468

EP - 473

BT - Advances in Information Retrieval - 36th European Conference on IR Research, ECIR 2014, Proceedings

PB - Springer Verlag

ER -