Hybrid Learning in Stochastic Games and Its Application in Network Security

Quanyan Zhu, Tembine Hamidou, Tamer Başar

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

In this chapter, we consider a class of two-player nonzero-sum stochastic games with incomplete information, which is inspired by recent applications of game theory in network security. We develop fully distributed reinforcement learning algorithms, which require for each player a minimal amount of information regarding the other player. At each time, each player can be in an active mode or in a sleep mode. If a player is in an active mode, the player updates the strategy and estimates of unknown quantities using a specific pure or hybrid learning pattern. The players' intelligence and rationality are captured by the weighted linear combination of different learning patterns. We use stochastic approximation techniques to show that, under appropriate conditions, the pure or hybrid learning schemes with random updates can be studied using their deterministic ordinary differential equation (ODE) counterparts. Convergence to state-independent equilibria is analyzed for special classes of games, namely, games with two actions, and potential games. Results are applied to network security games between an intruder and an administrator, where the noncooperative behaviors are characterized well by the features of distributed hybrid learning.
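To make the abstract's setup concrete, the following is a minimal illustrative sketch (not the chapter's actual CODIPAS-RL algorithm): two players in an identical-interest 2x2 potential game, each active with some probability (otherwise in sleep mode), each estimating payoffs from its own realized rewards only and updating its mixed strategy by a weighted combination of two learning patterns. The payoff matrix, temperature, activity probability, and step sizes are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical identical-interest 2x2 coordination game (a potential game):
# both players receive A[a1, a2]; the two pure equilibria are (0,0) and (1,1).
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])

def softmax(q, temp):
    z = np.exp((q - q.max()) / temp)  # shift for numerical stability
    return z / z.sum()

q = [np.zeros(2), np.zeros(2)]        # each player's payoff estimates
x = [np.ones(2) / 2, np.ones(2) / 2]  # each player's mixed strategy

w = 0.5          # weight combining the two learning patterns (hybrid learning)
p_active = 0.8   # probability a player is in active mode at each step
temp = 0.05      # softmax temperature

for t in range(1, 20001):
    lam = 1.0 / t**0.6   # strategy step size (decreasing, per stochastic approximation)
    mu = 1.0 / t**0.5    # payoff-estimate step size
    a = [rng.choice(2, p=x[i]) for i in range(2)]
    r = A[a[0], a[1]]    # common realized payoff
    for i in range(2):
        if rng.random() > p_active:
            continue     # sleep mode: no update this round
        # Fully distributed payoff estimation: only the played action's
        # estimate moves, using the player's own realized reward.
        e = np.zeros(2)
        e[a[i]] = 1.0
        q[i] += mu * e * (r - q[i])
        # Hybrid update: weighted linear combination of two pure patterns,
        # a Boltzmann-Gibbs pattern and an imitative (strategy-weighted) one.
        bg = softmax(q[i], temp)
        im = x[i] * np.exp((q[i] - q[i].max()) / temp)
        im /= im.sum()
        target = w * bg + (1 - w) * im
        x[i] = (1 - lam) * x[i] + lam * target
        x[i] /= x[i].sum()
```

Under these (assumed) parameters, both strategies typically converge to the same pure equilibrium of the coordination game, consistent with the state-independent-equilibrium convergence the abstract describes for two-action and potential games.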

Original language: English (US)
Title of host publication: Reinforcement Learning and Approximate Dynamic Programming for Feedback Control
Publisher: John Wiley and Sons
Pages: 303-329
Number of pages: 27
ISBN (Print): 9781118104200
DOI: 10.1002/9781118453988.ch14
State: Published - Feb 7 2013


Keywords

  • Games and learning algorithms, attacker/IDS
  • Hybrid learning in games, network security
  • Multiagent games, learning and control
  • New paradigm of hybrid learning CODIPAS-RL
  • Players' information limits, of payoff functions

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Zhu, Q., Hamidou, T., & Başar, T. (2013). Hybrid Learning in Stochastic Games and Its Application in Network Security. In Reinforcement Learning and Approximate Dynamic Programming for Feedback Control (pp. 303-329). John Wiley and Sons. https://doi.org/10.1002/9781118453988.ch14
