An experimental study of index compression and DAAT query processing methods

Antonio Mallia, Michał Siedlaczek, Torsten Suel

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Abstract

    In the last two decades, the IR community has seen numerous advances in top-k query processing and inverted index compression techniques. While newly proposed methods are typically compared against several baselines, these evaluations are often very limited, and we feel that there is no clear overall picture on the best choices of algorithms and compression methods. In this paper, we attempt to address this issue by evaluating a number of state-of-the-art index compression methods and safe disjunctive DAAT query processing algorithms. Our goal is to understand how much index compression performance impacts overall query processing speed, how the choice of query processing algorithm depends on the compression method used, and how performance is impacted by document reordering techniques and the number of results returned, keeping in mind that current search engines typically use sets of hundreds or thousands of candidates for further reranking.

    Original language: English (US)
    Title of host publication: Advances in Information Retrieval - 41st European Conference on IR Research, ECIR 2019, Proceedings
    Editors: Norbert Fuhr, Leif Azzopardi, Benno Stein, Claudia Hauff, Philipp Mayr, Djoerd Hiemstra
    Publisher: Springer-Verlag
    Pages: 353-368
    Number of pages: 16
    ISBN (Print): 9783030157111
    DOIs: https://doi.org/10.1007/978-3-030-15712-8_23
    State: Published - Jan 1 2019
    Event: 41st European Conference on Information Retrieval, ECIR 2019 - Cologne, Germany
    Duration: Apr 14 2019 to Apr 18 2019

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 11437 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 41st European Conference on Information Retrieval, ECIR 2019
    Country: Germany
    City: Cologne
    Period: 4/14/19 to 4/18/19

    Keywords

    • Compression
    • Inverted indexes
    • Query processing

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • Computer Science(all)

    Cite this

    Mallia, A., Siedlaczek, M., & Suel, T. (2019). An experimental study of index compression and DAAT query processing methods. In N. Fuhr, L. Azzopardi, B. Stein, C. Hauff, P. Mayr, & D. Hiemstra (Eds.), Advances in Information Retrieval - 41st European Conference on IR Research, ECIR 2019, Proceedings (pp. 353-368). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11437 LNCS). Springer-Verlag. https://doi.org/10.1007/978-3-030-15712-8_23

    @inproceedings{4d53d52dcf354753a6f0dd2acab5b104,
    title = "An experimental study of index compression and DAAT query processing methods",
    abstract = "In the last two decades, the IR community has seen numerous advances in top-k query processing and inverted index compression techniques. While newly proposed methods are typically compared against several baselines, these evaluations are often very limited, and we feel that there is no clear overall picture on the best choices of algorithms and compression methods. In this paper, we attempt to address this issue by evaluating a number of state-of-the-art index compression methods and safe disjunctive DAAT query processing algorithms. Our goal is to understand how much index compression performance impacts overall query processing speed, how the choice of query processing algorithm depends on the compression method used, and how performance is impacted by document reordering techniques and the number of results returned, keeping in mind that current search engines typically use sets of hundreds or thousands of candidates for further reranking.",
    keywords = "Compression, Inverted indexes, Query processing",
    author = "Antonio Mallia and Michał Siedlaczek and Torsten Suel",
    year = "2019",
    month = "1",
    day = "1",
    doi = "10.1007/978-3-030-15712-8_23",
    language = "English (US)",
    isbn = "9783030157111",
    series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
    publisher = "Springer-Verlag",
    pages = "353--368",
    editor = "Norbert Fuhr and Leif Azzopardi and Benno Stein and Claudia Hauff and Philipp Mayr and Djoerd Hiemstra",
    booktitle = "Advances in Information Retrieval - 41st European Conference on IR Research, ECIR 2019, Proceedings",

    }

    TY - GEN

    T1 - An experimental study of index compression and DAAT query processing methods

    AU - Mallia, Antonio

    AU - Siedlaczek, Michał

    AU - Suel, Torsten

    PY - 2019/1/1

    Y1 - 2019/1/1

    N2 - In the last two decades, the IR community has seen numerous advances in top-k query processing and inverted index compression techniques. While newly proposed methods are typically compared against several baselines, these evaluations are often very limited, and we feel that there is no clear overall picture on the best choices of algorithms and compression methods. In this paper, we attempt to address this issue by evaluating a number of state-of-the-art index compression methods and safe disjunctive DAAT query processing algorithms. Our goal is to understand how much index compression performance impacts overall query processing speed, how the choice of query processing algorithm depends on the compression method used, and how performance is impacted by document reordering techniques and the number of results returned, keeping in mind that current search engines typically use sets of hundreds or thousands of candidates for further reranking.

    KW - Compression

    KW - Inverted indexes

    KW - Query processing

    UR - http://www.scopus.com/inward/record.url?scp=85064869779&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85064869779&partnerID=8YFLogxK

    U2 - 10.1007/978-3-030-15712-8_23

    DO - 10.1007/978-3-030-15712-8_23

    M3 - Conference contribution

    AN - SCOPUS:85064869779

    SN - 9783030157111

    T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

    SP - 353

    EP - 368

    BT - Advances in Information Retrieval - 41st European Conference on IR Research, ECIR 2019, Proceedings

    A2 - Fuhr, Norbert

    A2 - Azzopardi, Leif

    A2 - Stein, Benno

    A2 - Hauff, Claudia

    A2 - Mayr, Philipp

    A2 - Hiemstra, Djoerd

    PB - Springer-Verlag

    ER -