Simulating zero-resource spoken term discovery

Jerome White, Douglas W. Oard

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Abstract

    If search engines are ever to index all of the spoken content in the world, they will need to handle hundreds of languages for which no automatic speech recognition systems exist. Zero-resource spoken term discovery, in which repeated content is detected in some acoustic representation, offers a potentially useful source of indexing features. This paper describes a text-based simulation of a zero-resource spoken term discovery system that allows any information retrieval test collection to be used as a basis for early development of information retrieval techniques. It is proposed that these techniques can be later applied to actual zero-resource spoken term discovery results.
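To make the idea of a text-based simulation concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the authors' method. It assumes that characters stand in for acoustic units, that "repeated content" means a character n-gram seen at least `min_count` times across the collection, and that documents are then indexed only by those repeated n-grams. The function names, `n=4`, and `min_count=2` are hypothetical choices made for this example.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return overlapping character n-grams of lowercased, whitespace-normalized text."""
    s = " ".join(text.lower().split())
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def discover_pseudo_terms(docs, n=4, min_count=2):
    """Assumed stand-in for term discovery: keep n-grams repeated across the collection."""
    counts = Counter()
    for doc in docs:
        counts.update(char_ngrams(doc, n))
    return {gram for gram, c in counts.items() if c >= min_count}


def index_features(doc, pseudo_terms, n=4):
    """Represent a document by counts of the discovered pseudo-terms it contains."""
    return Counter(g for g in char_ngrams(doc, n) if g in pseudo_terms)


# Toy collection (hypothetical data, for illustration only).
docs = ["spoken term discovery", "term discovery in speech", "weather report"]
terms = discover_pseudo_terms(docs, n=4, min_count=2)
features = [index_features(d, terms) for d in docs]
```

In this toy collection, n-grams shared by the first two documents (such as `term`) survive as indexing features, while the third document, which repeats nothing, ends up with no features. That behavior is the property a retrieval technique built on discovered terms would need to tolerate.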

    Original language: English (US)
    Title of host publication: CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management
    Publisher: Association for Computing Machinery
    Pages: 2371-2374
    Number of pages: 4
    Volume: Part F131841
    ISBN (Electronic): 9781450349185
    DOIs: https://doi.org/10.1145/3132847.3133160
    State: Published - Nov 6 2017
    Event: 26th ACM International Conference on Information and Knowledge Management, CIKM 2017 - Singapore, Singapore
    Duration: Nov 6 2017 to Nov 10 2017

    Other

    Other: 26th ACM International Conference on Information and Knowledge Management, CIKM 2017
    Country: Singapore
    City: Singapore
    Period: 11/6/17 to 11/10/17

    Fingerprint

    Resources
    Information retrieval
    Speech recognition
    Simulation
    Search engine
    Test collections
    Indexing
    Language

    Keywords

    • N-gram retrieval
    • Simulation
    • Zero resource term discovery

    ASJC Scopus subject areas

    • Business, Management and Accounting (all)
    • Decision Sciences (all)

    Cite this

    White, J., & Oard, D. W. (2017). Simulating zero-resource spoken term discovery. In CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management (Vol. Part F131841, pp. 2371-2374). Association for Computing Machinery. https://doi.org/10.1145/3132847.3133160

    @inproceedings{22a8a7f74e3047119220470806470de9,
    title = "Simulating zero-resource spoken term discovery",
    abstract = "If search engines are ever to index all of the spoken content in the world, they will need to handle hundreds of languages for which no automatic speech recognition systems exist. Zero-resource spoken term discovery, in which repeated content is detected in some acoustic representation, offers a potentially useful source of indexing features. This paper describes a text-based simulation of a zero-resource spoken term discovery system that allows any information retrieval test collection to be used as a basis for early development of information retrieval techniques. It is proposed that these techniques can be later applied to actual zero-resource spoken term discovery results.",
    keywords = "N-gram retrieval, Simulation, Zero resource term discovery",
    author = "Jerome White and Oard, {Douglas W.}",
    year = "2017",
    month = "11",
    day = "6",
    doi = "10.1145/3132847.3133160",
    language = "English (US)",
    volume = "Part F131841",
    pages = "2371--2374",
    booktitle = "CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management",
    publisher = "Association for Computing Machinery",

    }

    TY - GEN

    T1 - Simulating zero-resource spoken term discovery

    AU - White, Jerome

    AU - Oard, Douglas W.

    PY - 2017/11/6

    Y1 - 2017/11/6

    N2 - If search engines are ever to index all of the spoken content in the world, they will need to handle hundreds of languages for which no automatic speech recognition systems exist. Zero-resource spoken term discovery, in which repeated content is detected in some acoustic representation, offers a potentially useful source of indexing features. This paper describes a text-based simulation of a zero-resource spoken term discovery system that allows any information retrieval test collection to be used as a basis for early development of information retrieval techniques. It is proposed that these techniques can be later applied to actual zero-resource spoken term discovery results.

    KW - N-gram retrieval

    KW - Simulation

    KW - Zero resource term discovery

    UR - http://www.scopus.com/inward/record.url?scp=85037328407&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85037328407&partnerID=8YFLogxK

    U2 - 10.1145/3132847.3133160

    DO - 10.1145/3132847.3133160

    M3 - Conference contribution

    VL - Part F131841

    SP - 2371

    EP - 2374

    BT - CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management

    PB - Association for Computing Machinery

    ER -