Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design

Research output: Contribution to journal, Article

Abstract

This paper presents a novel non-model-based, data-driven adaptive optimal controller design for linear continuous-time systems with completely unknown dynamics. Inspired by the stochastic approximation theory, a continuous-time version of the traditional value iteration (VI) algorithm is presented with rigorous convergence analysis. This VI method is crucial for developing new adaptive dynamic programming methods to solve the adaptive optimal control problem and the stochastic robust optimal control problem for linear continuous-time systems. Fundamentally different from existing results, the a priori knowledge of an initial admissible control policy is no longer required. The efficacy of the proposed methodology is illustrated by two examples and a brief comparative study between VI and earlier policy-iteration methods.
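As an illustration only (the abstract does not state the algorithm), the idea of a continuous-time value iteration that needs no initial admissible policy can be sketched, assuming a known scalar model, as a forward-Euler iteration on the Riccati differential equation started from zero. The paper itself addresses the data-driven case with unknown dynamics; the model parameters `a`, `b`, `q`, `r`, the step size, and the function name below are hypothetical choices for this sketch.

```python
def riccati_value_iteration(a, b, q, r, step=0.01, iters=5000):
    """Forward-Euler iteration on the scalar Riccati differential equation.

    Assumed model (not from the abstract): dx/dt = a*x + b*u with running
    cost q*x^2 + r*u^2. Starting from p = 0 means no initial stabilizing
    (admissible) feedback gain has to be supplied.
    """
    p = 0.0
    for _ in range(iters):
        # Residual of the algebraic Riccati equation 2*a*p + q - (b*p)^2 / r = 0.
        residual = 2.0 * a * p + q - (b * p) ** 2 / r
        p += step * residual
    return p


# Open-loop unstable example: a > 0, so the zero input is not an admissible policy.
a, b, q, r = 1.0, 1.0, 1.0, 1.0
p_star = riccati_value_iteration(a, b, q, r)
gain = b * p_star / r            # feedback law u = -gain * x
closed_loop_pole = a - b * gain  # negative when the feedback stabilizes the system
```

For these values the algebraic Riccati equation reduces to p^2 - 2p - 1 = 0, whose positive root is 1 + sqrt(2), and the resulting closed-loop pole is negative even though the open-loop system is unstable.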

Original language: English (US)
Pages (from-to): 348-360
Number of pages: 13
Journal: Automatica
Volume: 71
DOI: 10.1016/j.automatica.2016.05.003
State: Published - Sep 1 2016

Fingerprint

Dynamic programming
Continuous time systems
Approximation theory
Controllers

Keywords

  • Adaptive control
  • Adaptive dynamic programming
  • Optimal control
  • Stochastic approximation
  • Value iteration

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design. / Bian, Tao; Jiang, Zhong-Ping.

In: Automatica, Vol. 71, 01.09.2016, p. 348-360.

@article{9189c07bd39143bcb19cd32b7bc50d0e,
title = "Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design",
abstract = "This paper presents a novel non-model-based, data-driven adaptive optimal controller design for linear continuous-time systems with completely unknown dynamics. Inspired by the stochastic approximation theory, a continuous-time version of the traditional value iteration (VI) algorithm is presented with rigorous convergence analysis. This VI method is crucial for developing new adaptive dynamic programming methods to solve the adaptive optimal control problem and the stochastic robust optimal control problem for linear continuous-time systems. Fundamentally different from existing results, the a priori knowledge of an initial admissible control policy is no longer required. The efficacy of the proposed methodology is illustrated by two examples and a brief comparative study between VI and earlier policy-iteration methods.",
keywords = "Adaptive control, Adaptive dynamic programming, Optimal control, Stochastic approximation, Value iteration",
author = "Tao Bian and Zhong-Ping Jiang",
year = "2016",
month = "9",
day = "1",
doi = "10.1016/j.automatica.2016.05.003",
language = "English (US)",
volume = "71",
pages = "348--360",
journal = "Automatica",
issn = "0005-1098",
publisher = "Elsevier Limited",

}

TY  - JOUR
T1  - Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design
AU  - Bian, Tao
AU  - Jiang, Zhong-Ping
PY  - 2016/9/1
Y1  - 2016/9/1
N2  - This paper presents a novel non-model-based, data-driven adaptive optimal controller design for linear continuous-time systems with completely unknown dynamics. Inspired by the stochastic approximation theory, a continuous-time version of the traditional value iteration (VI) algorithm is presented with rigorous convergence analysis. This VI method is crucial for developing new adaptive dynamic programming methods to solve the adaptive optimal control problem and the stochastic robust optimal control problem for linear continuous-time systems. Fundamentally different from existing results, the a priori knowledge of an initial admissible control policy is no longer required. The efficacy of the proposed methodology is illustrated by two examples and a brief comparative study between VI and earlier policy-iteration methods.
AB  - This paper presents a novel non-model-based, data-driven adaptive optimal controller design for linear continuous-time systems with completely unknown dynamics. Inspired by the stochastic approximation theory, a continuous-time version of the traditional value iteration (VI) algorithm is presented with rigorous convergence analysis. This VI method is crucial for developing new adaptive dynamic programming methods to solve the adaptive optimal control problem and the stochastic robust optimal control problem for linear continuous-time systems. Fundamentally different from existing results, the a priori knowledge of an initial admissible control policy is no longer required. The efficacy of the proposed methodology is illustrated by two examples and a brief comparative study between VI and earlier policy-iteration methods.
KW  - Adaptive control
KW  - Adaptive dynamic programming
KW  - Optimal control
KW  - Stochastic approximation
KW  - Value iteration
UR  - http://www.scopus.com/inward/record.url?scp=84975755115&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84975755115&partnerID=8YFLogxK
U2  - 10.1016/j.automatica.2016.05.003
DO  - 10.1016/j.automatica.2016.05.003
M3  - Article
VL  - 71
SP  - 348
EP  - 360
JO  - Automatica
JF  - Automatica
SN  - 0005-1098
ER  -