Evolving game-specific UCB alternatives for general video game playing

Ivan Bravi, Ahmed Khalifa, Christoffer Holmgård, Julian Togelius

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Abstract

    At the core of the most popular version of the Monte Carlo Tree Search (MCTS) algorithm is the UCB1 (Upper Confidence Bound) equation. This equation decides which node to explore next, and therefore shapes the behavior of the search process. If the UCB1 equation is replaced with another equation, the behavior of the MCTS algorithm changes, which might increase its performance on certain problems (and decrease it on others). In this paper, we use genetic programming to evolve replacements to the UCB1 equation targeted at playing individual games in the General Video Game AI (GVGAI) Framework. Each equation is evolved to maximize playing strength in a single game, but is then also tested on all other games in our test set. For every game included in the experiments, we found a UCB replacement that performs significantly better than standard UCB1. Additionally, evolved UCB replacements also tend to improve performance in some GVGAI games for which they are not evolved, showing that improvements generalize across games to clusters of games with similar game mechanics or algorithmic performance. Such an evolved portfolio of UCB variations could be useful for a hyper-heuristic game-playing agent, allowing it to select the most appropriate heuristics for particular games or problems in general.
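
As a rough illustration of the selection step the abstract describes, the following Python sketch shows the standard UCB1 score and a child-selection function whose scoring formula can be swapped out. The `Node` class, the function names, the default exploration constant, and the `greedy` alternative are all assumptions made for illustration; `greedy` is not one of the equations evolved in the paper, and this is not the GVGAI framework's own code.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical MCTS node holding only the statistics UCB1 needs."""
    total_reward: float = 0.0
    visits: int = 0
    children: List["Node"] = field(default_factory=list)


def ucb1(child: Node, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """Standard UCB1: mean reward plus an exploration bonus."""
    if child.visits == 0:
        # Unvisited children are tried first.
        return math.inf
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def greedy(child: Node, parent_visits: int) -> float:
    """Illustrative replacement (an assumption, not from the paper):
    mean reward only, with no exploration term."""
    return child.total_reward / child.visits if child.visits else 0.0


def select_child(
    parent: Node,
    score: Callable[[Node, int], float] = ucb1,
) -> Node:
    """Pick the child that maximizes the given scoring formula."""
    return max(parent.children, key=lambda ch: score(ch, parent.visits))
```

Passing a different `score` function to `select_child` changes which node is expanded next without touching the rest of the search. That is the hook an evolved equation would plug into in a setup like this one.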

    Original language: English (US)
    Title of host publication: Applications of Evolutionary Computation - 20th European Conference, EvoApplications 2017, Proceedings
    Publisher: Springer Verlag
    Pages: 393-406
    Number of pages: 14
    Volume: 10199 LNCS
    ISBN (Print): 9783319558486
    DOI: https://doi.org/10.1007/978-3-319-55849-3_26
    State: Published - 2017
    Event: 20th European Conference on the Applications of Evolutionary Computation, EvoApplications 2017 - Amsterdam, Netherlands
    Duration: Apr 19 2017 to Apr 21 2017

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 10199 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Other

    Other: 20th European Conference on the Applications of Evolutionary Computation, EvoApplications 2017
    Country: Netherlands
    City: Amsterdam
    Period: 4/19/17 to 4/21/17

    Fingerprint

    Video Games
    Game
    Genetic programming
    Alternatives
    Mechanics
    Replacement
    Tree Algorithms
    Experiments
    Search Algorithm
    Hyper-heuristics
    Confidence Bounds
    Test Set
    Maximise
    Heuristics
    Tend
    Decrease
    Generalise

    Keywords

    • General AI
    • Genetic programming
    • Monte-Carlo tree search

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • Computer Science (all)

    Cite this

    Bravi, I., Khalifa, A., Holmgård, C., & Togelius, J. (2017). Evolving game-specific UCB alternatives for general video game playing. In Applications of Evolutionary Computation - 20th European Conference, EvoApplications 2017, Proceedings (Vol. 10199 LNCS, pp. 393-406). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10199 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-55849-3_26

    @inproceedings{d5211e357d1e4ce199d510727fd123c3,
    title = "Evolving game-specific UCB alternatives for general video game playing",
    keywords = "General AI, Genetic programming, Monte-Carlo tree search",
    author = "Ivan Bravi and Ahmed Khalifa and Christoffer Holmg{\aa}rd and Julian Togelius",
    year = "2017",
    doi = "10.1007/978-3-319-55849-3_26",
    language = "English (US)",
    isbn = "9783319558486",
    volume = "10199 LNCS",
    series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
    publisher = "Springer Verlag",
    pages = "393--406",
    booktitle = "Applications of Evolutionary Computation - 20th European Conference, EvoApplications 2017, Proceedings",
    address = "Germany",

    }

    TY - GEN
    T1 - Evolving game-specific UCB alternatives for general video game playing
    AU - Bravi, Ivan
    AU - Khalifa, Ahmed
    AU - Holmgård, Christoffer
    AU - Togelius, Julian
    PY - 2017
    Y1 - 2017
    KW - General AI
    KW - Genetic programming
    KW - Monte-Carlo tree search
    UR - http://www.scopus.com/inward/record.url?scp=85017548504&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=85017548504&partnerID=8YFLogxK
    U2 - 10.1007/978-3-319-55849-3_26
    DO - 10.1007/978-3-319-55849-3_26
    M3 - Conference contribution
    SN - 9783319558486
    VL - 10199 LNCS
    T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    SP - 393
    EP - 406
    BT - Applications of Evolutionary Computation - 20th European Conference, EvoApplications 2017, Proceedings
    PB - Springer Verlag
    ER -