Modeling pronunciation variation with context-dependent articulatory feature decision trees

Samuel Bowman, Karen Livescu

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    We consider the problem of predicting the surface pronunciations of a word in conversational speech, using a model of pronunciation variation based on articulatory features. We build context-dependent decision trees for both phone-based and feature-based models, and compare their perplexities on conversational data from the Switchboard Transcription Project. We find that a fully-factored model, with separate decision trees for each articulatory feature, does not perform well, but a feature-based model using a smaller number of "feature bundles" outperforms both the fully-factored model and a phone-based model. The articulatory feature-based decision trees are also much more robust to reductions in training data. We also analyze the usefulness of various context variables.
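The abstract compares models by perplexity and contrasts a fully-factored model (one decision tree per articulatory feature) with models that predict a whole unit at once. As a minimal sketch of those two quantities, the Python below uses the standard textbook definition of perplexity and assumes that a fully-factored model scores an outcome as the product of its per-feature probabilities. The function names and all numeric values are hypothetical illustrations, not taken from the paper, and the paper's exact evaluation setup may differ.

```python
import math


def perplexity(probs):
    """Perplexity from the probabilities a model assigned to the observed outcomes.

    Standard definition: 2 ** (-(1/N) * sum(log2 p_i)). Lower is better.
    """
    if not probs:
        raise ValueError("need at least one probability")
    if any(p <= 0.0 or p > 1.0 for p in probs):
        raise ValueError("probabilities must lie in (0, 1]")
    avg_log2 = sum(math.log2(p) for p in probs) / len(probs)
    return 2.0 ** (-avg_log2)


def factored_prob(per_feature_probs):
    """Joint probability of one outcome under an independence assumption.

    Assumption for illustration: each feature is predicted by its own
    model and the joint probability is the product of those predictions.
    """
    return math.prod(per_feature_probs)


# Hypothetical probabilities assigned to three observed surface units.
joint_model = [0.5, 0.25, 0.5]

# Hypothetical per-feature probabilities for the same three units.
factored_model = [
    factored_prob([0.5, 0.5]),
    factored_prob([0.5, 0.5]),
    factored_prob([0.5, 0.5]),
]

print("joint model perplexity:   ", perplexity(joint_model))
print("factored model perplexity:", perplexity(factored_model))
```

A model that spreads probability evenly over k outcomes has perplexity k, so the number can be read as an effective branching factor when comparing models on the same test data.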

    Original language: English (US)
    Title of host publication: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
    Pages: 326-329
    Number of pages: 4
    State: Published - 2010
    Event: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010 - Makuhari, Chiba, Japan
    Duration: Sep 26 2010 to Sep 30 2010

    Fingerprint

    Decision Trees
    Modeling

    Keywords

    • Articulatory features
    • Pronunciation modeling

    ASJC Scopus subject areas

    • Language and Linguistics
    • Speech and Hearing

    Cite this

    Bowman, S., & Livescu, K. (2010). Modeling pronunciation variation with context-dependent articulatory feature decision trees. In Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 (pp. 326-329).

    @inproceedings{47f72f064dce4a47bae0ef5046396372,
    title = "Modeling pronunciation variation with context-dependent articulatory feature decision trees",
    keywords = "Articulatory features, Pronunciation modeling",
    author = "Samuel Bowman and Karen Livescu",
    year = "2010",
    language = "English (US)",
    pages = "326--329",
    booktitle = "Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010",

    }

    TY - GEN

    T1 - Modeling pronunciation variation with context-dependent articulatory feature decision trees

    AU - Bowman, Samuel

    AU - Livescu, Karen

    PY - 2010

    Y1 - 2010

    KW - Articulatory features

    KW - Pronunciation modeling

    UR - http://www.scopus.com/inward/record.url?scp=79959828525&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=79959828525&partnerID=8YFLogxK

    M3 - Conference contribution

    SP - 326

    EP - 329

    BT - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

    ER -