Simulating zero-resource spoken term discovery

Jerome White, Douglas W. Oard

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

If search engines are ever to index all of the spoken content in the world, they will need to handle hundreds of languages for which no automatic speech recognition systems exist. Zero-resource spoken term discovery, in which repeated content is detected in some acoustic representation, offers a potentially useful source of indexing features. This paper describes a text-based simulation of a zero-resource spoken term discovery system that allows any information retrieval test collection to be used as a basis for early development of information retrieval techniques. It is proposed that these techniques can be later applied to actual zero-resource spoken term discovery results.

Original language: English (US)
Title of host publication: CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 2371-2374
Number of pages: 4
Volume: Part F131841
ISBN (Electronic): 9781450349185
DOI: https://doi.org/10.1145/3132847.3133160
State: Published - Nov 6 2017
Event: 26th ACM International Conference on Information and Knowledge Management, CIKM 2017 - Singapore, Singapore
Duration: Nov 6 2017 to Nov 10 2017

Other

Other: 26th ACM International Conference on Information and Knowledge Management, CIKM 2017
Country: Singapore
City: Singapore
Period: 11/6/17 to 11/10/17

Keywords

  • N-gram retrieval
  • Simulation
  • Zero resource term discovery

ASJC Scopus subject areas

  • Business, Management and Accounting (all)
  • Decision Sciences (all)

Cite this

White, J., & Oard, D. W. (2017). Simulating zero-resource spoken term discovery. In CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management (Vol. Part F131841, pp. 2371-2374). Association for Computing Machinery. https://doi.org/10.1145/3132847.3133160