Reinforcement learning for linear continuous-time systems: An incremental learning approach

Research output: Contribution to journal › Article

Abstract

In this paper, we introduce a novel reinforcement learning (RL) scheme for linear continuous-time dynamical systems. Different from traditional batch learning algorithms, an incremental learning approach is developed, which provides a more efficient way to tackle the on-line learning problem in real-world applications. We provide concrete convergence and robust analysis on this incremental-learning algorithm. An extension to solving robust optimal control problems is also given. Two simulation examples are also given to illustrate the effectiveness of our theoretical result.

Original language: English (US)
Article number: 8651896
Pages (from-to): 433-440
Number of pages: 8
Journal: IEEE/CAA Journal of Automatica Sinica
Volume: 6
Issue number: 2
DOI: 10.1109/JAS.2019.1911390
State: Published - Mar 1 2019

Fingerprint

Continuous time systems
Reinforcement learning
Learning algorithms
Dynamical systems
Concretes

Keywords

  • Adaptive optimal control
  • robust dynamic programming
  • value iteration (VI)

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Information Systems
  • Artificial Intelligence

Cite this

Reinforcement learning for linear continuous-time systems: An incremental learning approach. / Bian, Tao; Jiang, Zhong-Ping.

In: IEEE/CAA Journal of Automatica Sinica, Vol. 6, No. 2, 8651896, 01.03.2019, p. 433-440.

@article{6576b4566b0c4c7997c9e952850ca04f,
title = "Reinforcement learning for linear continuous-time systems: An incremental learning approach",
abstract = "In this paper, we introduce a novel reinforcement learning (RL) scheme for linear continuous-time dynamical systems. Different from traditional batch learning algorithms, an incremental learning approach is developed, which provides a more efficient way to tackle the on-line learning problem in real-world applications. We provide concrete convergence and robust analysis on this incremental-learning algorithm. An extension to solving robust optimal control problems is also given. Two simulation examples are also given to illustrate the effectiveness of our theoretical result.",
keywords = "Adaptive optimal control, robust dynamic programming, value iteration (VI)",
author = "Tao Bian and Zhong-Ping Jiang",
year = "2019",
month = "3",
day = "1",
doi = "10.1109/JAS.2019.1911390",
language = "English (US)",
volume = "6",
pages = "433--440",
journal = "IEEE/CAA Journal of Automatica Sinica",
issn = "2329-9274",
publisher = "IEEE Advancing Technology for Humanity",
number = "2",
}

TY - JOUR

T1 - Reinforcement learning for linear continuous-time systems

T2 - An incremental learning approach

AU - Bian, Tao

AU - Jiang, Zhong-Ping

PY - 2019/3/1

Y1 - 2019/3/1

N2 - In this paper, we introduce a novel reinforcement learning (RL) scheme for linear continuous-time dynamical systems. Different from traditional batch learning algorithms, an incremental learning approach is developed, which provides a more efficient way to tackle the on-line learning problem in real-world applications. We provide concrete convergence and robust analysis on this incremental-learning algorithm. An extension to solving robust optimal control problems is also given. Two simulation examples are also given to illustrate the effectiveness of our theoretical result.

AB - In this paper, we introduce a novel reinforcement learning (RL) scheme for linear continuous-time dynamical systems. Different from traditional batch learning algorithms, an incremental learning approach is developed, which provides a more efficient way to tackle the on-line learning problem in real-world applications. We provide concrete convergence and robust analysis on this incremental-learning algorithm. An extension to solving robust optimal control problems is also given. Two simulation examples are also given to illustrate the effectiveness of our theoretical result.

KW - Adaptive optimal control

KW - robust dynamic programming

KW - value iteration (VI)

UR - http://www.scopus.com/inward/record.url?scp=85062882417&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062882417&partnerID=8YFLogxK

U2 - 10.1109/JAS.2019.1911390

DO - 10.1109/JAS.2019.1911390

M3 - Article

AN - SCOPUS:85062882417

VL - 6

SP - 433

EP - 440

JO - IEEE/CAA Journal of Automatica Sinica

JF - IEEE/CAA Journal of Automatica Sinica

SN - 2329-9274

IS - 2

M1 - 8651896

ER -