H∞ optimal control of unknown linear discrete-time systems

An off-policy reinforcement learning approach

Bahare Kiumarsi, Hamidreza Modares, Frank L. Lewis, Zhong-Ping Jiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper proposes a model-free H∞ control design for linear discrete-time systems using reinforcement learning (RL). A novel off-policy RL algorithm is used to solve the game algebraic Riccati equation (GARE) online, using data measured along the system trajectories. The proposed RL algorithm has the following advantages over existing model-free RL methods for solving the H∞ control problem: 1) it is data-efficient and fast, since a stream of experiences obtained from executing a fixed behavioral policy is reused to sequentially update many value functions corresponding to different learning policies; 2) the disturbance input does not need to be adjusted in a specific manner; 3) there is no bias as a result of adding a probing noise to the control input to maintain the persistence-of-excitation condition. A simulation example is used to verify the effectiveness of the proposed control scheme.

Original language: English (US)
Title of host publication: Proceedings of the 2015 7th IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2015 and Robotics, Automation and Mechatronics, RAM 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 41-46
Number of pages: 6
ISBN (Print): 9781467373364
DOI: 10.1109/ICCIS.2015.7274545
State: Published - Sep 23, 2015
Event: 7th IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2015 and the 7th IEEE International Conference on Robotics, Automation and Mechatronics, RAM 2015 - Siem Reap, Cambodia
Duration: Jul 15, 2015 - Jul 17, 2015

Other

Other: 7th IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2015 and the 7th IEEE International Conference on Robotics, Automation and Mechatronics, RAM 2015
Country: Cambodia
City: Siem Reap
Period: 7/15/15 - 7/17/15

Keywords

  • game algebraic Riccati equation
  • H∞ control
  • off-policy
  • reinforcement learning

ASJC Scopus subject areas

  • Artificial Intelligence
  • Control and Systems Engineering

Cite this

Kiumarsi, B., Modares, H., Lewis, F. L., & Jiang, Z-P. (2015). H∞ optimal control of unknown linear discrete-time systems: An off-policy reinforcement learning approach. In Proceedings of the 2015 7th IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2015 and Robotics, Automation and Mechatronics, RAM 2015 (pp. 41-46). [7274545] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCIS.2015.7274545

H∞ optimal control of unknown linear discrete-time systems: An off-policy reinforcement learning approach. / Kiumarsi, Bahare; Modares, Hamidreza; Lewis, Frank L.; Jiang, Zhong-Ping.

Proceedings of the 2015 7th IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2015 and Robotics, Automation and Mechatronics, RAM 2015. Institute of Electrical and Electronics Engineers Inc., 2015. p. 41-46 7274545.

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Kiumarsi, B, Modares, H, Lewis, FL & Jiang, Z-P 2015, H∞ optimal control of unknown linear discrete-time systems: An off-policy reinforcement learning approach. in Proceedings of the 2015 7th IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2015 and Robotics, Automation and Mechatronics, RAM 2015., 7274545, Institute of Electrical and Electronics Engineers Inc., pp. 41-46, 7th IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2015 and the 7th IEEE International Conference on Robotics, Automation and Mechatronics, RAM 2015, Siem Reap, Cambodia, 7/15/15. https://doi.org/10.1109/ICCIS.2015.7274545
Kiumarsi B, Modares H, Lewis FL, Jiang Z-P. H∞ optimal control of unknown linear discrete-time systems: An off-policy reinforcement learning approach. In Proceedings of the 2015 7th IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2015 and Robotics, Automation and Mechatronics, RAM 2015. Institute of Electrical and Electronics Engineers Inc. 2015. p. 41-46. 7274545 https://doi.org/10.1109/ICCIS.2015.7274545
Kiumarsi, Bahare; Modares, Hamidreza; Lewis, Frank L.; Jiang, Zhong-Ping. / H∞ optimal control of unknown linear discrete-time systems: An off-policy reinforcement learning approach. Proceedings of the 2015 7th IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2015 and Robotics, Automation and Mechatronics, RAM 2015. Institute of Electrical and Electronics Engineers Inc., 2015. pp. 41-46
@inproceedings{f5df300125144478a6eb57d5d8132079,
title = "H∞ optimal control of unknown linear discrete-time systems: An off-policy reinforcement learning approach",
abstract = "This paper proposes a model-free H∞ control design for linear discrete-time systems using reinforcement learning (RL). A novel off-policy RL algorithm is used to solve the game algebraic Riccati equation (GARE) online, using data measured along the system trajectories. The proposed RL algorithm has the following advantages over existing model-free RL methods for solving the H∞ control problem: 1) it is data-efficient and fast, since a stream of experiences obtained from executing a fixed behavioral policy is reused to sequentially update many value functions corresponding to different learning policies; 2) the disturbance input does not need to be adjusted in a specific manner; 3) there is no bias as a result of adding a probing noise to the control input to maintain the persistence-of-excitation condition. A simulation example is used to verify the effectiveness of the proposed control scheme.",
keywords = "game algebraic Riccati equation, H∞ control, off-policy, reinforcement learning",
author = "Bahare Kiumarsi and Hamidreza Modares and Lewis, {Frank L.} and Zhong-Ping Jiang",
year = "2015",
month = "9",
day = "23",
doi = "10.1109/ICCIS.2015.7274545",
language = "English (US)",
isbn = "9781467373364",
pages = "41--46",
booktitle = "Proceedings of the 2015 7th IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2015 and Robotics, Automation and Mechatronics, RAM 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - H∞ optimal control of unknown linear discrete-time systems

T2 - An off-policy reinforcement learning approach

AU - Kiumarsi, Bahare

AU - Modares, Hamidreza

AU - Lewis, Frank L.

AU - Jiang, Zhong-Ping

PY - 2015/9/23

Y1 - 2015/9/23

N2 - This paper proposes a model-free H∞ control design for linear discrete-time systems using reinforcement learning (RL). A novel off-policy RL algorithm is used to solve the game algebraic Riccati equation (GARE) online, using data measured along the system trajectories. The proposed RL algorithm has the following advantages over existing model-free RL methods for solving the H∞ control problem: 1) it is data-efficient and fast, since a stream of experiences obtained from executing a fixed behavioral policy is reused to sequentially update many value functions corresponding to different learning policies; 2) the disturbance input does not need to be adjusted in a specific manner; 3) there is no bias as a result of adding a probing noise to the control input to maintain the persistence-of-excitation condition. A simulation example is used to verify the effectiveness of the proposed control scheme.

AB - This paper proposes a model-free H∞ control design for linear discrete-time systems using reinforcement learning (RL). A novel off-policy RL algorithm is used to solve the game algebraic Riccati equation (GARE) online, using data measured along the system trajectories. The proposed RL algorithm has the following advantages over existing model-free RL methods for solving the H∞ control problem: 1) it is data-efficient and fast, since a stream of experiences obtained from executing a fixed behavioral policy is reused to sequentially update many value functions corresponding to different learning policies; 2) the disturbance input does not need to be adjusted in a specific manner; 3) there is no bias as a result of adding a probing noise to the control input to maintain the persistence-of-excitation condition. A simulation example is used to verify the effectiveness of the proposed control scheme.

KW - game algebraic Riccati equation

KW - H∞ control

KW - off-policy

KW - reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=84960850910&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84960850910&partnerID=8YFLogxK

U2 - 10.1109/ICCIS.2015.7274545

DO - 10.1109/ICCIS.2015.7274545

M3 - Conference contribution

SN - 9781467373364

SP - 41

EP - 46

BT - Proceedings of the 2015 7th IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2015 and Robotics, Automation and Mechatronics, RAM 2015

PB - Institute of Electrical and Electronics Engineers Inc.

ER -