Generalized substring selectivity estimation

Zhiyuan Chen, Flip Korn, Nick Koudas, Shanmugavelayutham Muthukrishnan

    Research output: Contribution to journal › Article

    Abstract

    An approach to selectivity estimation of generalized Boolean substring queries was presented, with a focus on conjunctive multidimensional and Boolean queries. Set hashing, a Monte Carlo technique, was used to succinctly represent the set of tuples containing a given substring as a signature vector of hash values. The analysis showed that a large number of cross-counts, including those for complex co-occurrences of substrings, could be generated using only linear storage.
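The signature idea in the abstract can be illustrated with a minimal sketch. It assumes min-wise hashing as the set-hashing scheme and simple linear hash functions; the function names, the number of hash functions `k`, and the choice of prime are illustrative assumptions, not details taken from the article. Each substring keeps only a fixed-length signature, and the count for a conjunction of two substrings is estimated from the fraction of signature components that agree.

```python
import random

def tuples_containing(rows, substring):
    """Return the set of row ids whose string contains `substring`."""
    return {i for i, s in enumerate(rows) if substring in s}

def make_hashes(k, seed=0, prime=2_147_483_647):
    """Draw k random linear hash functions h(x) = (a*x + b) mod prime.
    (Hypothetical helper; row ids are assumed to be smaller than `prime`.)"""
    rng = random.Random(seed)
    return [(rng.randrange(1, prime), rng.randrange(prime), prime)
            for _ in range(k)]

def signature(id_set, hashes):
    """Signature vector: the minimum hash value of the set under each
    hash function. Raises ValueError if `id_set` is empty."""
    return [min((a * x + b) % p for x in id_set) for a, b, p in hashes]

def resemblance(sig_a, sig_b):
    """Fraction of agreeing components; estimates |A ∩ B| / |A ∪ B|."""
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)

def conjunctive_count(size_a, size_b, rho):
    """Convert a resemblance estimate into an intersection size, using
    |A ∪ B| = |A| + |B| - |A ∩ B|, so |A ∩ B| = rho (|A| + |B|) / (1 + rho)."""
    return rho * (size_a + size_b) / (1 + rho)

if __name__ == "__main__":
    rows = ["database", "data mining", "substring", "string data", "hashing"]
    hashes = make_hashes(k=64)
    a = tuples_containing(rows, "data")
    b = tuples_containing(rows, "string")
    rho = resemblance(signature(a, hashes), signature(b, hashes))
    print("estimated rows matching both:", conjunctive_count(len(a), len(b), rho))
```

In this sketch only the signature and the single-substring count are stored per substring, so storage grows linearly with the number of substrings, while an estimate for any pair can be derived on demand. With few hash functions or very small sets the estimate is noisy.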

    Original language: English (US)
    Pages (from-to): 98-132
    Number of pages: 35
    Journal: Journal of Computer and System Sciences
    Volume: 66
    Issue number: 1
    DOI: https://doi.org/10.1016/S0022-0000(02)00031-4
    State: Published - Jul 8 2003

    Fingerprint

    Selectivity
    Query
    Monte Carlo Techniques
    Hashing
    Count
    Signature

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • Computer Networks and Communications
    • Computational Theory and Mathematics
    • Applied Mathematics

    Cite this

    Generalized substring selectivity estimation. / Chen, Zhiyuan; Korn, Flip; Koudas, Nick; Muthukrishnan, Shanmugavelayutham.

    In: Journal of Computer and System Sciences, Vol. 66, No. 1, 08.07.2003, p. 98-132.

    Chen, Z, Korn, F, Koudas, N & Muthukrishnan, S 2003, 'Generalized substring selectivity estimation', Journal of Computer and System Sciences, vol. 66, no. 1, pp. 98-132. https://doi.org/10.1016/S0022-0000(02)00031-4
    Chen, Zhiyuan ; Korn, Flip ; Koudas, Nick ; Muthukrishnan, Shanmugavelayutham. / Generalized substring selectivity estimation. In: Journal of Computer and System Sciences. 2003 ; Vol. 66, No. 1. pp. 98-132.
    @article{d4022283661f47418c27938efacb4b3e,
    title = "Generalized substring selectivity estimation",
    author = "Zhiyuan Chen and Flip Korn and Nick Koudas and Shanmugavelayutham Muthukrishnan",
    year = "2003",
    month = "7",
    day = "8",
    doi = "10.1016/S0022-0000(02)00031-4",
    language = "English (US)",
    volume = "66",
    pages = "98--132",
    journal = "Journal of Computer and System Sciences",
    issn = "0022-0000",
    publisher = "Academic Press Inc.",
    number = "1",
    }

    TY - JOUR
    T1 - Generalized substring selectivity estimation
    AU - Chen, Zhiyuan
    AU - Korn, Flip
    AU - Koudas, Nick
    AU - Muthukrishnan, Shanmugavelayutham
    PY - 2003/7/8
    Y1 - 2003/7/8
    UR - http://www.scopus.com/inward/record.url?scp=0038605881&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=0038605881&partnerID=8YFLogxK
    U2 - 10.1016/S0022-0000(02)00031-4
    DO - 10.1016/S0022-0000(02)00031-4
    M3 - Article
    AN - SCOPUS:0038605881
    VL - 66
    SP - 98
    EP - 132
    JO - Journal of Computer and System Sciences
    JF - Journal of Computer and System Sciences
    SN - 0022-0000
    IS - 1
    ER -