Reinforcement learning and non-zero-sum game output regulation for multi-player linear uncertain systems

Adedapo Odekunle, Weinan Gao, Masoud Davari, Zhong Ping Jiang

Research output: Contribution to journal (Article)

Abstract

This paper studies the non-zero-sum game output regulation problem (GORP) for a class of continuous-time multi-player linear systems. Without knowledge of the state and input matrices, the Nash equilibrium solution, an N-tuple of feedback control policies, is learned from online data collected along the system trajectories. A key strategy is, for the first time, to combine techniques from reinforcement learning (RL), differential game theory, and output regulation for data-driven control design. Unlike the existing literature on adaptive optimal output regulation, the feedforward matrices are considered nontrivial. Theoretical analysis shows the disturbance rejection and tracking ability of the closed-loop system. Simulation results demonstrate the efficacy of the developed data-driven control approach.

Original language: English (US)
Article number: 108672
Journal: Automatica
DOI: 10.1016/j.automatica.2019.108672
State: Accepted/In press - Jan 1 2019

Fingerprint

Uncertain systems
Reinforcement learning
Disturbance rejection
Game theory
Closed loop systems
Feedback control
Linear systems
Trajectories

Keywords

  • Adaptive optimal control
  • Data-Driven control
  • Game theory
  • Output regulation
  • Reinforcement learning (RL)

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

Reinforcement learning and non-zero-sum game output regulation for multi-player linear uncertain systems. / Odekunle, Adedapo; Gao, Weinan; Davari, Masoud; Jiang, Zhong Ping.

In: Automatica, 01.01.2019.

@article{9eeaba70a8874acfae279d2085b277e5,
title = "Reinforcement learning and non-zero-sum game output regulation for multi-player linear uncertain systems",
abstract = "This paper studies the non-zero-sum game output regulation problem (GORP) for a class of continuous-time multi-player linear systems. Without the knowledge of state and input matrices, the Nash equilibrium solution, N-tuple of feedback control policy, is learned through online data collected along the system trajectories. A key strategy is, for the first time, to combine techniques from reinforcement learning (RL), differential game theory, and output regulation for data-driven control design. Different from the existing literature of adaptive optimal output regulation, the feedforward matrices are considered nontrivial. Theoretical analysis shows the disturbance rejection and tracking ability of the closed-loop system. Simulation results demonstrate the efficacy of the developed data-driven control approach.",
keywords = "Adaptive optimal control, Data-Driven control, Game theory, Output regulation, Reinforcement learning (RL)",
author = "Adedapo Odekunle and Weinan Gao and Masoud Davari and Jiang, {Zhong Ping}",
year = "2019",
month = "1",
day = "1",
doi = "10.1016/j.automatica.2019.108672",
language = "English (US)",
journal = "Automatica",
issn = "0005-1098",
publisher = "Elsevier Limited",

}

TY - JOUR

T1 - Reinforcement learning and non-zero-sum game output regulation for multi-player linear uncertain systems

AU - Odekunle, Adedapo

AU - Gao, Weinan

AU - Davari, Masoud

AU - Jiang, Zhong Ping

PY - 2019/1/1

Y1 - 2019/1/1

AB - This paper studies the non-zero-sum game output regulation problem (GORP) for a class of continuous-time multi-player linear systems. Without the knowledge of state and input matrices, the Nash equilibrium solution, N-tuple of feedback control policy, is learned through online data collected along the system trajectories. A key strategy is, for the first time, to combine techniques from reinforcement learning (RL), differential game theory, and output regulation for data-driven control design. Different from the existing literature of adaptive optimal output regulation, the feedforward matrices are considered nontrivial. Theoretical analysis shows the disturbance rejection and tracking ability of the closed-loop system. Simulation results demonstrate the efficacy of the developed data-driven control approach.

KW - Adaptive optimal control

KW - Data-Driven control

KW - Game theory

KW - Output regulation

KW - Reinforcement learning (RL)

UR - http://www.scopus.com/inward/record.url?scp=85075349918&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85075349918&partnerID=8YFLogxK

U2 - 10.1016/j.automatica.2019.108672

DO - 10.1016/j.automatica.2019.108672

M3 - Article

AN - SCOPUS:85075349918

JO - Automatica

JF - Automatica

SN - 0005-1098

M1 - 108672

ER -