Combinatorial partial monitoring game with linear feedback and its applications

Tian Lin, Bruno Abrahao, Robert Kleinberg, John C.S. Lui, Wei Chen

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Abstract

    In online learning, a player chooses actions to play and receives reward and feedback from the environment with the goal of maximizing her reward over time. In this paper, we propose the model of combinatorial partial monitoring games with linear feedback, a model which simultaneously addresses limited feedback, infinite outcome space of the environment and exponentially large action space of the player. We present the Global Confidence Bound (GCB) algorithm, which integrates ideas from both combinatorial multi-armed bandits and finite partial monitoring games to handle all the above issues. GCB only requires feedback on a small set of actions and achieves O(T^{2/3} log T) distribution-independent regret and O(log T) distribution-dependent regret (the latter assuming a unique optimal action), where T is the total number of time steps played. Moreover, the regret bounds only depend linearly on log |X| rather than |X|, where X is the action space. GCB isolates offline optimization tasks from online learning and avoids explicit enumeration of all actions in the online learning part. We demonstrate that our model and algorithm can be applied to a crowdsourcing application leading to both an efficient learning algorithm and low regret, and argue that they can be applied to a wide range of combinatorial applications constrained with limited feedback.
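
To give a feel for the scale of the quantities in the abstract, here is a minimal arithmetic sketch. It is not an implementation of GCB. It assumes a hypothetical action space in which each action selects k out of n items (this setup is an illustration, not taken from the paper), and it drops every constant hidden by the O-notation, so the numbers show growth rates only.

```python
import math


def action_space_size(n: int, k: int) -> int:
    """Number of actions if each action picks k of n items (hypothetical setup)."""
    return math.comb(n, k)


def distribution_independent_rate(T: int) -> float:
    """Growth term T^(2/3) * log T, with all constants omitted."""
    return T ** (2 / 3) * math.log(T)


def distribution_dependent_rate(T: int) -> float:
    """Growth term log T, with all constants omitted."""
    return math.log(T)


if __name__ == "__main__":
    n, k = 50, 10  # illustrative values only
    size = action_space_size(n, k)
    # log |X| stays small even when |X| is astronomically large.
    print(f"|X| = {size}, log |X| = {math.log(size):.2f}")

    for T in (10**3, 10**6):
        indep = distribution_independent_rate(T)
        dep = distribution_dependent_rate(T)
        # Dividing by T gives the per-step rate, which shrinks as T grows.
        print(f"T = {T}: T^(2/3) log T = {indep:.1f} "
              f"(per step {indep / T:.4f}), log T = {dep:.2f}")
```

Under these assumptions, the per-step value of the distribution-independent term falls as T grows, which is what a sublinear regret bound expresses, and log |X| remains modest even though |X| itself is far too large to enumerate.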

    Original language: English (US)
    Title of host publication: 31st International Conference on Machine Learning, ICML 2014
    Publisher: International Machine Learning Society (IMLS)
    Pages: 2512-2537
    Number of pages: 26
    ISBN (Electronic): 9781634393973
    State: Published - Jan 1 2014
    Event: 31st International Conference on Machine Learning, ICML 2014 - Beijing, China
    Duration: Jun 21 2014 to Jun 26 2014

    Publication series

    Name: 31st International Conference on Machine Learning, ICML 2014
    Volume: 3

    Fingerprint

    Feedback
    Monitoring
    Learning algorithms

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Computer Networks and Communications
    • Software

    Cite this

    Lin, T., Abrahao, B., Kleinberg, R., Lui, J. C. S., & Chen, W. (2014). Combinatorial partial monitoring game with linear feedback and its applications. In 31st International Conference on Machine Learning, ICML 2014 (pp. 2512-2537). (31st International Conference on Machine Learning, ICML 2014; Vol. 3). International Machine Learning Society (IMLS).

    @inproceedings{41d59b34bc4d43d58e21e561f775f4cb,
    title = "Combinatorial partial monitoring game with linear feedback and its applications",
    abstract = "In online learning, a player chooses actions to play and receives reward and feedback from the environment with the goal of maximizing her reward over time. In this paper, we propose the model of combinatorial partial monitoring games with linear feedback, a model which simultaneously addresses limited feedback, infinite outcome space of the environment and exponentially large action space of the player. We present the Global Confidence Bound (GCB) algorithm, which integrates ideas from both combinatorial multi-armed bandits and finite partial monitoring games to handle all the above issues. GCB only requires feedback on a small set of actions and achieves O(T^{2/3} log T) distribution-independent regret and O(log T) distribution-dependent regret (the latter assuming a unique optimal action), where T is the total number of time steps played. Moreover, the regret bounds only depend linearly on log |X| rather than |X|, where X is the action space. GCB isolates offline optimization tasks from online learning and avoids explicit enumeration of all actions in the online learning part. We demonstrate that our model and algorithm can be applied to a crowdsourcing application leading to both an efficient learning algorithm and low regret, and argue that they can be applied to a wide range of combinatorial applications constrained with limited feedback.",
    author = "Tian Lin and Bruno Abrahao and Robert Kleinberg and Lui, {John C.S.} and Wei Chen",
    year = "2014",
    month = "1",
    day = "1",
    language = "English (US)",
    series = "31st International Conference on Machine Learning, ICML 2014",
    publisher = "International Machine Learning Society (IMLS)",
    pages = "2512--2537",
    booktitle = "31st International Conference on Machine Learning, ICML 2014",

    }

    TY - GEN
    T1 - Combinatorial partial monitoring game with linear feedback and its applications
    AU - Lin, Tian
    AU - Abrahao, Bruno
    AU - Kleinberg, Robert
    AU - Lui, John C.S.
    AU - Chen, Wei
    PY - 2014/1/1
    Y1 - 2014/1/1
    AB - In online learning, a player chooses actions to play and receives reward and feedback from the environment with the goal of maximizing her reward over time. In this paper, we propose the model of combinatorial partial monitoring games with linear feedback, a model which simultaneously addresses limited feedback, infinite outcome space of the environment and exponentially large action space of the player. We present the Global Confidence Bound (GCB) algorithm, which integrates ideas from both combinatorial multi-armed bandits and finite partial monitoring games to handle all the above issues. GCB only requires feedback on a small set of actions and achieves O(T^{2/3} log T) distribution-independent regret and O(log T) distribution-dependent regret (the latter assuming a unique optimal action), where T is the total number of time steps played. Moreover, the regret bounds only depend linearly on log |X| rather than |X|, where X is the action space. GCB isolates offline optimization tasks from online learning and avoids explicit enumeration of all actions in the online learning part. We demonstrate that our model and algorithm can be applied to a crowdsourcing application leading to both an efficient learning algorithm and low regret, and argue that they can be applied to a wide range of combinatorial applications constrained with limited feedback.
    UR - http://www.scopus.com/inward/record.url?scp=84919902752&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=84919902752&partnerID=8YFLogxK
    M3 - Conference contribution
    AN - SCOPUS:84919902752
    T3 - 31st International Conference on Machine Learning, ICML 2014
    SP - 2512
    EP - 2537
    BT - 31st International Conference on Machine Learning, ICML 2014
    PB - International Machine Learning Society (IMLS)
    ER -