Composable planning with attributes

Amy Zhang, Adam Lerer, Sainbayar Sukhbaatar, Robert Fergus, Arthur Szlam

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The tasks that an agent will need to solve often are not known during training. However, if the agent knows which properties of the environment are important then, after learning how its actions affect those properties, it may be able to use this knowledge to solve complex tasks without training specifically for them. Towards this end, we consider a setup in which an environment is augmented with a set of user defined attributes that parameterize the features of interest. We propose a method that learns a policy for transitioning between "nearby" sets of attributes, and maintains a graph of possible transitions. Given a task at test time that can be expressed in terms of a target set of attributes, and a current state, our model infers the attributes of the current state and searches over paths through attribute space to get a high level plan, and then uses its low level policy to execute the plan. We show in 3D block stacking, grid-world games, and StarCraft® that our model is able to generalize to longer, more complex tasks at test time by composing simpler learned policies.
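
The planning step described in the abstract (searching for a path through a graph of attribute sets, then handing each hop to a low-level policy) can be illustrated with a small sketch. This is not the authors' implementation: the function name `plan_attribute_path`, the toy block-stacking attributes, and the use of plain breadth-first search over an unweighted graph are all assumptions made for illustration.

```python
from collections import deque


def plan_attribute_path(graph, start, goal):
    """Return a shortest list of attribute sets from start to goal, or None.

    `graph` maps an attribute set to the "nearby" attribute sets that a
    learned low-level policy is assumed to be able to reach from it.
    (Hypothetical helper; plain BFS stands in for the paper's search.)
    """
    if start == goal:
        return [start]
    parents = {start: None}          # visited set and back-pointers
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt in parents:
                continue
            parents[nxt] = node
            if nxt == goal:
                # Walk back-pointers to rebuild the high-level plan.
                path = [nxt]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None                      # goal unreachable in the known graph


# Toy attribute sets (hypothetical names, not taken from the paper).
EMPTY = frozenset()
RED = frozenset({"red_on_table"})
RED_BLUE = frozenset({"red_on_table", "blue_on_red"})
BLUE = frozenset({"blue_on_table"})

graph = {EMPTY: [RED, BLUE], RED: [RED_BLUE], BLUE: [EMPTY]}
plan = plan_attribute_path(graph, EMPTY, RED_BLUE)
```

In this toy graph the plan is `[EMPTY, RED, RED_BLUE]`; each consecutive pair would be passed to the low-level policy as a "nearby" transition to execute.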

Original language: English (US)
Title of host publication: 35th International Conference on Machine Learning, ICML 2018
Editors: Andreas Krause, Jennifer Dy
Publisher: International Machine Learning Society (IMLS)
Pages: 9292-9307
Number of pages: 16
Volume: 13
ISBN (Electronic): 9781510867963
State: Published - Jan 1 2018
Event: 35th International Conference on Machine Learning, ICML 2018 - Stockholm, Sweden
Duration: Jul 10 2018 to Jul 15 2018

Other

Other: 35th International Conference on Machine Learning, ICML 2018
Country: Sweden
City: Stockholm
Period: 7/10/18 to 7/15/18

Fingerprint

Planning

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software

Cite this

Zhang, A., Lerer, A., Sukhbaatar, S., Fergus, R., & Szlam, A. (2018). Composable planning with attributes. In A. Krause, & J. Dy (Eds.), 35th International Conference on Machine Learning, ICML 2018 (Vol. 13, pp. 9292-9307). International Machine Learning Society (IMLS).

Composable planning with attributes. / Zhang, Amy; Lerer, Adam; Sukhbaatar, Sainbayar; Fergus, Robert; Szlam, Arthur.

35th International Conference on Machine Learning, ICML 2018. ed. / Andreas Krause; Jennifer Dy. Vol. 13 International Machine Learning Society (IMLS), 2018. p. 9292-9307.

Zhang, A, Lerer, A, Sukhbaatar, S, Fergus, R & Szlam, A 2018, Composable planning with attributes. in A Krause & J Dy (eds), 35th International Conference on Machine Learning, ICML 2018. vol. 13, International Machine Learning Society (IMLS), pp. 9292-9307, 35th International Conference on Machine Learning, ICML 2018, Stockholm, Sweden, 7/10/18.
Zhang A, Lerer A, Sukhbaatar S, Fergus R, Szlam A. Composable planning with attributes. In Krause A, Dy J, editors, 35th International Conference on Machine Learning, ICML 2018. Vol. 13. International Machine Learning Society (IMLS). 2018. p. 9292-9307
Zhang, Amy ; Lerer, Adam ; Sukhbaatar, Sainbayar ; Fergus, Robert ; Szlam, Arthur. / Composable planning with attributes. 35th International Conference on Machine Learning, ICML 2018. editor / Andreas Krause ; Jennifer Dy. Vol. 13 International Machine Learning Society (IMLS), 2018. pp. 9292-9307
@inproceedings{c5c521bc05b14554b258806fdfcca3ae,
title = "Composable planning with attributes",
abstract = "The tasks that an agent will need to solve often are not known during training. However, if the agent knows which properties of the environment are important then, after learning how its actions affect those properties, it may be able to use this knowledge to solve complex tasks without training specifically for them. Towards this end, we consider a setup in which an environment is augmented with a set of user defined attributes that parameterize the features of interest. We propose a method that learns a policy for transitioning between {"}nearby{"} sets of attributes, and maintains a graph of possible transitions. Given a task at test time that can be expressed in terms of a target set of attributes, and a current state, our model infers the attributes of the current state and searches over paths through attribute space to get a high level plan, and then uses its low level policy to execute the plan. We show in 3D block stacking, grid-world games, and StarCraft{\circledR} that our model is able to generalize to longer, more complex tasks at test time by composing simpler learned policies.",
author = "Amy Zhang and Adam Lerer and Sainbayar Sukhbaatar and Robert Fergus and Arthur Szlam",
year = "2018",
month = "1",
day = "1",
language = "English (US)",
volume = "13",
pages = "9292--9307",
editor = "Andreas Krause and Jennifer Dy",
booktitle = "35th International Conference on Machine Learning, ICML 2018",
publisher = "International Machine Learning Society (IMLS)",

}

TY - GEN
T1 - Composable planning with attributes
AU - Zhang, Amy
AU - Lerer, Adam
AU - Sukhbaatar, Sainbayar
AU - Fergus, Robert
AU - Szlam, Arthur
PY - 2018/1/1
Y1 - 2018/1/1

N2 - The tasks that an agent will need to solve often are not known during training. However, if the agent knows which properties of the environment are important then, after learning how its actions affect those properties, it may be able to use this knowledge to solve complex tasks without training specifically for them. Towards this end, we consider a setup in which an environment is augmented with a set of user defined attributes that parameterize the features of interest. We propose a method that learns a policy for transitioning between "nearby" sets of attributes, and maintains a graph of possible transitions. Given a task at test time that can be expressed in terms of a target set of attributes, and a current state, our model infers the attributes of the current state and searches over paths through attribute space to get a high level plan, and then uses its low level policy to execute the plan. We show in 3D block stacking, grid-world games, and StarCraft® that our model is able to generalize to longer, more complex tasks at test time by composing simpler learned policies.

AB - The tasks that an agent will need to solve often are not known during training. However, if the agent knows which properties of the environment are important then, after learning how its actions affect those properties, it may be able to use this knowledge to solve complex tasks without training specifically for them. Towards this end, we consider a setup in which an environment is augmented with a set of user defined attributes that parameterize the features of interest. We propose a method that learns a policy for transitioning between "nearby" sets of attributes, and maintains a graph of possible transitions. Given a task at test time that can be expressed in terms of a target set of attributes, and a current state, our model infers the attributes of the current state and searches over paths through attribute space to get a high level plan, and then uses its low level policy to execute the plan. We show in 3D block stacking, grid-world games, and StarCraft® that our model is able to generalize to longer, more complex tasks at test time by composing simpler learned policies.

UR - http://www.scopus.com/inward/record.url?scp=85057257139&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057257139&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85057257139
VL - 13
SP - 9292
EP - 9307
BT - 35th International Conference on Machine Learning, ICML 2018
A2 - Krause, Andreas
A2 - Dy, Jennifer
PB - International Machine Learning Society (IMLS)
ER -