Online set selection with fairness and diversity constraints

Julia Stoyanovich, Ke Yang, H. V. Jagadish

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Selection algorithms usually score individual items in isolation, and then select the top scoring items. However, often there is an additional diversity objective. Since diversity is a group property, it does not easily jibe with individual item scoring. In this paper, we study set selection queries subject to diversity and group fairness constraints. We develop algorithms for several problem settings with streaming data, where an online decision must be made on each item as it is presented. We show through experiments with real and synthetic data that fairness and diversity can be achieved, usually with modest costs in terms of quality. Our experimental evaluation leads to several important insights in online set selection. We demonstrate that theoretical guarantees on solution quality are conservative in real datasets, and that tuning the length of the score estimation phase leads to an interesting accuracy-efficiency trade-off. Further, we show that if a difference in scores is expected between groups, then these groups must be treated separately during processing. Otherwise, a solution may be derived that meets diversity constraints, but that selects lower-scoring members of disadvantaged groups.
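The abstract's point about treating groups separately can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a generic secretary-style heuristic, and the function name `online_select`, the `(group, score)` item representation, the per-group quotas, and the assumption that each group's stream length is known in advance are all hypothetical choices made for illustration. Each group gets its own warm-up (score estimation) phase, during which items are only observed; afterwards an item is accepted if it beats the best warm-up score of its own group, or if skipping it would leave the group's quota unfilled.

```python
import math
from typing import Dict, List, Sequence, Tuple

Item = Tuple[str, float]  # (group label, score); hypothetical representation


def online_select(
    stream: Sequence[Item],
    group_sizes: Dict[str, int],   # assumed known number of items per group
    quotas: Dict[str, int],        # how many items to pick from each group
    warmup_frac: float = 0.5,      # share of each group used for estimation
) -> List[Item]:
    """One pass over the stream; every decision is made on arrival."""
    warmup = {g: math.floor(warmup_frac * n) for g, n in group_sizes.items()}
    seen = {g: 0 for g in group_sizes}
    chosen = {g: 0 for g in group_sizes}
    best = {g: float("-inf") for g in group_sizes}  # per-group threshold
    selected: List[Item] = []

    for group, score in stream:
        seen[group] += 1
        open_slots = quotas[group] - chosen[group]
        if open_slots == 0:
            continue  # this group's quota is already met

        items_left = group_sizes[group] - seen[group]  # after this item
        in_warmup = seen[group] <= warmup[group]
        forced = items_left < open_slots  # skipping would break the quota

        if in_warmup:
            # Estimation phase: only track the group's best score so far.
            best[group] = max(best[group], score)

        if forced or (not in_warmup and score > best[group]):
            selected.append((group, score))
            chosen[group] += 1

    return selected


stream = [
    ("A", 0.7), ("B", 0.3), ("A", 0.6), ("B", 0.2),
    ("A", 0.9), ("B", 0.5), ("A", 0.8), ("B", 0.4),
]
picked = online_select(stream, {"A": 4, "B": 4}, {"A": 1, "B": 1})
print(picked)
```

In this toy stream, group B scores lower than group A throughout. With per-group thresholds, B's slot goes to its best item (0.5). If both groups were instead compared against one shared warm-up threshold (0.7 here), no B item would clear it, and under the same forcing rule B's slot would fall to its last item (0.4), which is the failure mode the abstract warns about.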

    Original language: English (US)
    Title of host publication: Advances in Database Technology - EDBT 2018
    Subtitle of host publication: 21st International Conference on Extending Database Technology, Proceedings
    Editors: Norman May, Erhard Rahm, Reinhard Pichler, Michael Bohlen, Shan-Hung Wu, Katja Hose
    Publisher: OpenProceedings.org
    Pages: 241-252
    Number of pages: 12
    ISBN (Electronic): 9783893180783
    DOI: https://doi.org/10.5441/002/edbt.2018.22
    State: Published - Jan 1 2018
    Event: 21st International Conference on Extending Database Technology, EDBT 2018 - Vienna, Austria
    Duration: Mar 26 2018 to Mar 29 2018

    Publication series

    Name: Advances in Database Technology - EDBT
    Volume: 2018-March
    ISSN (Electronic): 2367-2005

    Conference

    Conference: 21st International Conference on Extending Database Technology, EDBT 2018
    Country: Austria
    City: Vienna
    Period: 3/26/18 to 3/29/18

    Fingerprint

    Tuning
    Processing
    Costs
    Experiments

    ASJC Scopus subject areas

    • Information Systems
    • Software
    • Computer Science Applications

    Cite this

    Stoyanovich, J., Yang, K., & Jagadish, H. V. (2018). Online set selection with fairness and diversity constraints. In N. May, E. Rahm, R. Pichler, M. Bohlen, S-H. Wu, & K. Hose (Eds.), Advances in Database Technology - EDBT 2018: 21st International Conference on Extending Database Technology, Proceedings (pp. 241-252). (Advances in Database Technology - EDBT; Vol. 2018-March). OpenProceedings.org. https://doi.org/10.5441/002/edbt.2018.22

    @inproceedings{8808d7c195e94a10b5a197f77913b5ab,
    title = "Online set selection with fairness and diversity constraints",
    author = "Julia Stoyanovich and Ke Yang and Jagadish, {H. V.}",
    year = "2018",
    month = "1",
    day = "1",
    doi = "10.5441/002/edbt.2018.22",
    language = "English (US)",
    series = "Advances in Database Technology - EDBT",
    publisher = "OpenProceedings.org",
    pages = "241--252",
    editor = "Norman May and Erhard Rahm and Reinhard Pichler and Michael Bohlen and Shan-Hung Wu and Katja Hose",
    booktitle = "Advances in Database Technology - EDBT 2018",

    }

    TY - GEN
    T1 - Online set selection with fairness and diversity constraints
    AU - Stoyanovich, Julia
    AU - Yang, Ke
    AU - Jagadish, H. V.
    PY - 2018/1/1
    Y1 - 2018/1/1
    UR - http://www.scopus.com/inward/record.url?scp=85058352110&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=85058352110&partnerID=8YFLogxK
    U2 - 10.5441/002/edbt.2018.22
    DO - 10.5441/002/edbt.2018.22
    M3 - Conference contribution
    T3 - Advances in Database Technology - EDBT
    SP - 241
    EP - 252
    BT - Advances in Database Technology - EDBT 2018
    A2 - May, Norman
    A2 - Rahm, Erhard
    A2 - Pichler, Reinhard
    A2 - Bohlen, Michael
    A2 - Wu, Shan-Hung
    A2 - Hose, Katja
    PB - OpenProceedings.org
    ER -