H∞ control of linear discrete-time systems

Off-policy reinforcement learning

Bahare Kiumarsi, Frank L. Lewis, Zhong-Ping Jiang

Research output: Contribution to journal › Article

Abstract

In this paper, a model-free solution to the H∞ control of linear discrete-time systems is presented. The proposed approach employs off-policy reinforcement learning (RL) to solve the game algebraic Riccati equation online, using measured data along the system trajectories. As with existing model-free RL algorithms, no knowledge of the system dynamics is required. However, the proposed method has two main advantages over them. First, the disturbance input does not need to be adjusted in a specific manner. This makes the method more practical, as the disturbance cannot be specified in most real-world applications. Second, no bias arises from adding a probing noise to the control input to maintain the persistence of excitation (PE) condition. Consequently, the convergence of the proposed algorithm is not affected by probing noise. An example of H∞ control for an F-16 aircraft is given; the results show that the convergence of the new off-policy RL algorithm is insensitive to probing noise.
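For context, the game algebraic Riccati equation (GARE) named in the abstract has a standard form. Assuming generic linear dynamics x_{k+1} = A x_k + B u_k + E w_k (control input u, disturbance w) and the zero-sum stage cost x_k^\top Q x_k + u_k^\top R u_k - \gamma^2 w_k^\top w_k (these symbols are textbook conventions, not notation taken from the paper), the GARE reads

P = Q + A^\top P A - \begin{bmatrix} A^\top P B & A^\top P E \end{bmatrix} \begin{bmatrix} R + B^\top P B & B^\top P E \\ E^\top P B & E^\top P E - \gamma^2 I \end{bmatrix}^{-1} \begin{bmatrix} B^\top P A \\ E^\top P A \end{bmatrix}

The paper solves this equation online from measured trajectories, without knowledge of A, B, or E. As a model-based point of reference only (plain value iteration with known dynamics, not the authors' off-policy RL algorithm), the sketch below iterates the GARE to a fixed point; all function and variable names are hypothetical.

import numpy as np

def solve_gare(A, B, E, Q, R, gamma, iters=1000, tol=1e-10):
    # Value iteration on the discrete-time GARE for
    # x_{k+1} = A x_k + B u_k + E w_k with stage cost
    # x'Qx + u'Ru - gamma^2 w'w.  Assumes gamma exceeds the
    # attainable attenuation level, so the iteration converges.
    m, q = B.shape[1], E.shape[1]

    def saddle_blocks(P):
        # Coefficient blocks of the saddle-point condition M [u; w] = -N x
        M = np.block([
            [R + B.T @ P @ B, B.T @ P @ E],
            [E.T @ P @ B, E.T @ P @ E - gamma**2 * np.eye(q)],
        ])
        N = np.vstack([B.T @ P @ A, E.T @ P @ A])
        return M, N

    P = Q.copy()
    for _ in range(iters):
        M, N = saddle_blocks(P)
        P_next = Q + A.T @ P @ A - N.T @ np.linalg.solve(M, N)
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next

    M, N = saddle_blocks(P)
    gains = np.linalg.solve(M, N)   # stacked gain matrix [K; L]
    K, L = gains[:m], gains[m:]     # u = -K x, worst-case w = -L x
    return P, K, L

Under standard stabilizability and detectability assumptions, the state feedback u_k = -K x_k then attains the prescribed attenuation level gamma; a model-free method such as the paper's would be expected to reach the same P and K without ever forming A, B, or E.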

Original language: English (US)
Pages (from-to): 144-152
Number of pages: 9
Journal: Automatica
Volume: 78
DOIs: https://doi.org/10.1016/j.automatica.2016.12.009
State: Published - Apr 1 2017

Fingerprint

  • Reinforcement learning
  • Learning algorithms
  • Riccati equations
  • Dynamical systems
  • Trajectories
  • Aircraft

Keywords

  • H∞ control
  • Off-policy reinforcement learning
  • Optimal control

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

H∞ control of linear discrete-time systems: Off-policy reinforcement learning. / Kiumarsi, Bahare; Lewis, Frank L.; Jiang, Zhong-Ping.

In: Automatica, Vol. 78, 01.04.2017, p. 144-152.

@article{c5e1960b20cf4e0e807366304481cce6,
title = "H∞ control of linear discrete-time systems: Off-policy reinforcement learning",
abstract = "In this paper, a model-free solution to the H∞ control of linear discrete-time systems is presented. The proposed approach employs off-policy reinforcement learning (RL) to solve the game algebraic Riccati equation online using measured data along the system trajectories. Like existing model-free RL algorithms, no knowledge of the system dynamics is required. However, the proposed method has two main advantages. First, the disturbance input does not need to be adjusted in a specific manner. This makes it more practical as the disturbance cannot be specified in most real-world applications. Second, there is no bias as a result of adding a probing noise to the control input to maintain persistence of excitation (PE) condition. Consequently, the convergence of the proposed algorithm is not affected by probing noise. An example of the H∞ control for an F-16 aircraft is given. It is seen that the convergence of the new off-policy RL algorithm is insensitive to probing noise.",
keywords = "H control, Off-policy reinforcement learning, Optimal control",
author = "Bahare Kiumarsi and Lewis, {Frank L.} and Zhong-Ping Jiang",
year = "2017",
month = "4",
day = "1",
doi = "10.1016/j.automatica.2016.12.009",
language = "English (US)",
volume = "78",
pages = "144--152",
journal = "Automatica",
issn = "0005-1098",
publisher = "Elsevier Limited",

}
