Adaptive dynamic programming for finite-horizon optimal control of linear time-varying discrete-time systems

Bo Pang, Tao Bian, Zhong-Ping Jiang

Research output: Contribution to journal › Article

Abstract

This paper studies data-driven learning-based methods for the finite-horizon optimal control of linear time-varying discrete-time systems. First, a novel finite-horizon Policy Iteration (PI) method for linear time-varying discrete-time systems is presented. Its connections with existing infinite-horizon PI methods are discussed. Then, both data-driven off-policy PI and Value Iteration (VI) algorithms are derived to find approximate optimal controllers when the system dynamics is completely unknown. Under mild conditions, the proposed data-driven off-policy algorithms converge to the optimal solution. Finally, the effectiveness and feasibility of the developed methods are validated by a practical example of spacecraft attitude control.
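For orientation, the model-based baseline that such data-driven methods aim to recover can be sketched as a backward Riccati recursion. The sketch below is not the algorithm from the paper: it assumes a quadratic cost, a scalar state and input, and fully known dynamics, and all coefficient values and function names are hypothetical.

```python
# Illustrative sketch only: model-based backward Riccati recursion for a
# scalar linear time-varying system x[k+1] = a[k]*x[k] + b[k]*u[k].
# Assumptions (not from the paper): quadratic cost, known a[k] and b[k].

def riccati_backward(a, b, q, r, q_final):
    """Return (gains, values) for the finite-horizon problem.

    Cost: sum_k (q*x[k]**2 + r*u[k]**2) + q_final*x[N]**2.
    The optimal input is u[k] = -gains[k] * x[k], and the optimal
    cost-to-go from state x at time k is values[k] * x**2.
    """
    horizon = len(a)
    values = [0.0] * (horizon + 1)
    gains = [0.0] * horizon
    values[horizon] = q_final  # terminal condition
    for k in range(horizon - 1, -1, -1):  # sweep backward in time
        p_next = values[k + 1]
        gains[k] = (b[k] * p_next * a[k]) / (r + b[k] ** 2 * p_next)
        values[k] = q + a[k] ** 2 * p_next - a[k] * p_next * b[k] * gains[k]
    return gains, values


def closed_loop_cost(a, b, q, r, q_final, gains, x0):
    """Simulate u[k] = -gains[k]*x[k] from x0 and accumulate the cost."""
    x, cost = x0, 0.0
    for k in range(len(a)):
        u = -gains[k] * x
        cost += q * x ** 2 + r * u ** 2
        x = a[k] * x + b[k] * u
    return cost + q_final * x ** 2


# Hypothetical time-varying coefficients over a horizon of 5 steps.
a = [1.0 + 0.1 * k for k in range(5)]
b = [1.0, 0.5, 1.0, 0.5, 1.0]
gains, values = riccati_backward(a, b, q=1.0, r=1.0, q_final=2.0)
cost = closed_loop_cost(a, b, 1.0, 1.0, 2.0, gains, x0=3.0)
```

Under these assumptions, the simulated closed-loop cost matches `values[0] * x0**2`, and any other gain sequence gives a cost at least as large. The data-driven setting studied in the paper differs in that `a` and `b` are not available to the designer.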

Original language: English (US)
Pages (from-to): 73-84
Number of pages: 12
Journal: Control Theory and Technology
Volume: 17
Issue number: 1
DOI: 10.1007/s11768-019-8168-8
State: Published - Feb 1 2019

Fingerprint

Policy Iteration
Adaptive Dynamics
Time-varying Systems
Finite Horizon
Discrete-time Systems
Dynamic programming
Data-driven
Dynamic Programming
Linear Time
Optimal Control
Iteration Method
Attitude control
Value Iteration
Spacecraft
Dynamical systems
Attitude Control
Infinite Horizon
System Dynamics
Controllers
Optimal Solution

Keywords

  • adaptive dynamic programming
  • Optimal control
  • policy iteration (PI)
  • time-varying system
  • value iteration (VI)

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Aerospace Engineering
  • Control and Optimization

Cite this

Adaptive dynamic programming for finite-horizon optimal control of linear time-varying discrete-time systems. / Pang, Bo; Bian, Tao; Jiang, Zhong-Ping.

In: Control Theory and Technology, Vol. 17, No. 1, 01.02.2019, p. 73-84.

@article{fa9e89e18e6f48c0816f7afda7094b4f,
title = "Adaptive dynamic programming for finite-horizon optimal control of linear time-varying discrete-time systems",
abstract = "This paper studies data-driven learning-based methods for the finite-horizon optimal control of linear time-varying discrete-time systems. First, a novel finite-horizon Policy Iteration (PI) method for linear time-varying discrete-time systems is presented. Its connections with existing infinite-horizon PI methods are discussed. Then, both data-driven off-policy PI and Value Iteration (VI) algorithms are derived to find approximate optimal controllers when the system dynamics is completely unknown. Under mild conditions, the proposed data-driven off-policy algorithms converge to the optimal solution. Finally, the effectiveness and feasibility of the developed methods are validated by a practical example of spacecraft attitude control.",
keywords = "adaptive dynamic programming, Optimal control, policy iteration (PI), time-varying system, value iteration (VI)",
author = "Bo Pang and Tao Bian and Zhong-Ping Jiang",
year = "2019",
month = "2",
day = "1",
doi = "10.1007/s11768-019-8168-8",
language = "English (US)",
volume = "17",
pages = "73--84",
journal = "Control Theory and Technology",
issn = "2095-6983",
publisher = "Springer Science + Business Media",
number = "1",
}

TY - JOUR

T1 - Adaptive dynamic programming for finite-horizon optimal control of linear time-varying discrete-time systems

AU - Pang, Bo

AU - Bian, Tao

AU - Jiang, Zhong-Ping

PY - 2019/2/1

Y1 - 2019/2/1

N2 - This paper studies data-driven learning-based methods for the finite-horizon optimal control of linear time-varying discrete-time systems. First, a novel finite-horizon Policy Iteration (PI) method for linear time-varying discrete-time systems is presented. Its connections with existing infinite-horizon PI methods are discussed. Then, both data-driven off-policy PI and Value Iteration (VI) algorithms are derived to find approximate optimal controllers when the system dynamics is completely unknown. Under mild conditions, the proposed data-driven off-policy algorithms converge to the optimal solution. Finally, the effectiveness and feasibility of the developed methods are validated by a practical example of spacecraft attitude control.

KW - adaptive dynamic programming

KW - Optimal control

KW - policy iteration (PI)

KW - time-varying system

KW - value iteration (VI)

UR - http://www.scopus.com/inward/record.url?scp=85060576402&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85060576402&partnerID=8YFLogxK

U2 - 10.1007/s11768-019-8168-8

DO - 10.1007/s11768-019-8168-8

M3 - Article

AN - SCOPUS:85060576402

VL - 17

SP - 73

EP - 84

JO - Control Theory and Technology

JF - Control Theory and Technology

SN - 2095-6983

IS - 1

ER -