Towards optimal control of air handling units using deep reinforcement learning and recurrent neural network

Zhengbo Zou, Xinran Yu, Semiha Ergan

Research output: Contribution to journal › Article

Abstract

Optimal control of heating, ventilation and air conditioning (HVAC) systems aims to minimize the energy consumption of equipment while maintaining the thermal comfort of occupants. Traditional rule-based control methods are not optimized for HVAC systems with continuous sensor readings and actuator controls. Recent developments in deep reinforcement learning (DRL) have enabled control of HVAC systems with continuous sensor inputs and actions, while eliminating the need to build complex thermodynamic models. DRL control includes an environment, which approximates real-world HVAC operations, and an agent, which aims to achieve optimal control over the HVAC system. Existing DRL control frameworks use simulation tools (e.g., EnergyPlus) to build DRL training environments with HVAC system information, but oversimplify building geometry. This study proposes a framework aiming to achieve optimal control over Air Handling Units (AHUs) by implementing long-short-term-memory (LSTM) networks that approximate real-world HVAC operations to build DRL training environments. The framework also implements state-of-the-art DRL algorithms (e.g., deep deterministic policy gradient) for optimal control over the AHUs. Three AHUs, each with two years of building automation system (BAS) data, were used as testbeds for evaluation. Our LSTM-based DRL training environments, built using the first year's BAS data, achieved an average mean square error of 0.0015 across 16 normalized AHU parameters. When deployed in the testing environments, which were built using the second year's BAS data of the same AHUs, the DRL agents achieved 27%–30% energy savings compared to the actual energy consumption, while maintaining the predicted percentage of discomfort (PPD) at 10%.
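The control loop described in the abstract (an LSTM surrogate environment trained on BAS data, plus a continuous-action agent balancing energy use against thermal comfort) can be illustrated with a minimal, hypothetical Python sketch. Everything below is an invented stand-in, not the authors' code: the surrogate dynamics replace the paper's trained LSTM model, the reward weights are arbitrary, and the linear actor with a crude gain search is a placeholder for the deep deterministic policy gradient agent.

```python
import numpy as np

rng = np.random.default_rng(0)

def surrogate_env(state, action):
    """Toy stand-in for the LSTM environment model: the next
    normalized AHU state drifts toward the commanded actuation."""
    next_state = 0.9 * state + 0.1 * action + rng.normal(0, 0.01, state.shape)
    energy = float(np.sum(action ** 2))                   # energy cost grows with actuation
    discomfort = float(np.sum((next_state - 0.5) ** 2))   # distance from a comfort band
    reward = -(energy + 10.0 * discomfort)                # weighted penalty (weights invented)
    return next_state, reward

def policy(state, w):
    """Linear stand-in for the DDPG actor: maps state to a bounded action."""
    return np.tanh(w * state)

# Crude search over the actor gain, standing in for DDPG's gradient
# updates, to show the "train the agent inside the surrogate" idea.
best_w, best_ret = None, -np.inf
for w in np.linspace(-1.0, 1.0, 21):
    state, ret = np.full(4, 0.8), 0.0   # 4 normalized AHU parameters, warm start
    for _ in range(50):
        state, r = surrogate_env(state, policy(state, w))
        ret += r
    if ret > best_ret:
        best_w, best_ret = w, ret

print(f"best actor gain: {best_w:.2f}, return: {best_ret:.1f}")
```

The key design point mirrored here is that the agent never touches the real AHU during training; it interacts only with a learned model of the dynamics, which is why the paper's LSTM accuracy (MSE of 0.0015 on normalized parameters) matters for the quality of the resulting policy.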

Original language: English (US)
Article number: 106535
Journal: Building and Environment
Volume: 168
DOI: 10.1016/j.buildenv.2019.106535
State: Published - Jan 15 2020


Keywords

  • Deep reinforcement learning
  • Energy consumption
  • HVAC control
  • Long-short-term-memory network
  • Thermal comfort

ASJC Scopus subject areas

  • Environmental Engineering
  • Civil and Structural Engineering
  • Geography, Planning and Development
  • Building and Construction

Cite this

Towards optimal control of air handling units using deep reinforcement learning and recurrent neural network. / Zou, Zhengbo; Yu, Xinran; Ergan, Semiha.

In: Building and Environment, Vol. 168, 106535, 15.01.2020.


@article{f3d121b4521e43da8d0e0b01c3d4ed62,
title = "Towards optimal control of air handling units using deep reinforcement learning and recurrent neural network",
abstract = "Optimal control of heating, ventilation and air conditioning (HVAC) systems aims to minimize the energy consumption of equipment while maintaining the thermal comfort of occupants. Traditional rule-based control methods are not optimized for HVAC systems with continuous sensor readings and actuator controls. Recent developments in deep reinforcement learning (DRL) have enabled control of HVAC systems with continuous sensor inputs and actions, while eliminating the need to build complex thermodynamic models. DRL control includes an environment, which approximates real-world HVAC operations, and an agent, which aims to achieve optimal control over the HVAC system. Existing DRL control frameworks use simulation tools (e.g., EnergyPlus) to build DRL training environments with HVAC system information, but oversimplify building geometry. This study proposes a framework aiming to achieve optimal control over Air Handling Units (AHUs) by implementing long-short-term-memory (LSTM) networks that approximate real-world HVAC operations to build DRL training environments. The framework also implements state-of-the-art DRL algorithms (e.g., deep deterministic policy gradient) for optimal control over the AHUs. Three AHUs, each with two years of building automation system (BAS) data, were used as testbeds for evaluation. Our LSTM-based DRL training environments, built using the first year's BAS data, achieved an average mean square error of 0.0015 across 16 normalized AHU parameters. When deployed in the testing environments, which were built using the second year's BAS data of the same AHUs, the DRL agents achieved 27{\%}–30{\%} energy savings compared to the actual energy consumption, while maintaining the predicted percentage of discomfort (PPD) at 10{\%}.",
keywords = "Deep reinforcement learning, Energy consumption, HVAC control, Long-short-term-memory network, Thermal comfort",
author = "Zhengbo Zou and Xinran Yu and Semiha Ergan",
year = "2020",
month = "1",
day = "15",
doi = "10.1016/j.buildenv.2019.106535",
language = "English (US)",
volume = "168",
journal = "Building and Environment",
issn = "0360-1323",
publisher = "Elsevier BV",

}

TY - JOUR

T1 - Towards optimal control of air handling units using deep reinforcement learning and recurrent neural network

AU - Zou, Zhengbo

AU - Yu, Xinran

AU - Ergan, Semiha

PY - 2020/1/15

Y1 - 2020/1/15

N2 - Optimal control of heating, ventilation and air conditioning (HVAC) systems aims to minimize the energy consumption of equipment while maintaining the thermal comfort of occupants. Traditional rule-based control methods are not optimized for HVAC systems with continuous sensor readings and actuator controls. Recent developments in deep reinforcement learning (DRL) have enabled control of HVAC systems with continuous sensor inputs and actions, while eliminating the need to build complex thermodynamic models. DRL control includes an environment, which approximates real-world HVAC operations, and an agent, which aims to achieve optimal control over the HVAC system. Existing DRL control frameworks use simulation tools (e.g., EnergyPlus) to build DRL training environments with HVAC system information, but oversimplify building geometry. This study proposes a framework aiming to achieve optimal control over Air Handling Units (AHUs) by implementing long-short-term-memory (LSTM) networks that approximate real-world HVAC operations to build DRL training environments. The framework also implements state-of-the-art DRL algorithms (e.g., deep deterministic policy gradient) for optimal control over the AHUs. Three AHUs, each with two years of building automation system (BAS) data, were used as testbeds for evaluation. Our LSTM-based DRL training environments, built using the first year's BAS data, achieved an average mean square error of 0.0015 across 16 normalized AHU parameters. When deployed in the testing environments, which were built using the second year's BAS data of the same AHUs, the DRL agents achieved 27%–30% energy savings compared to the actual energy consumption, while maintaining the predicted percentage of discomfort (PPD) at 10%.

AB - Optimal control of heating, ventilation and air conditioning (HVAC) systems aims to minimize the energy consumption of equipment while maintaining the thermal comfort of occupants. Traditional rule-based control methods are not optimized for HVAC systems with continuous sensor readings and actuator controls. Recent developments in deep reinforcement learning (DRL) have enabled control of HVAC systems with continuous sensor inputs and actions, while eliminating the need to build complex thermodynamic models. DRL control includes an environment, which approximates real-world HVAC operations, and an agent, which aims to achieve optimal control over the HVAC system. Existing DRL control frameworks use simulation tools (e.g., EnergyPlus) to build DRL training environments with HVAC system information, but oversimplify building geometry. This study proposes a framework aiming to achieve optimal control over Air Handling Units (AHUs) by implementing long-short-term-memory (LSTM) networks that approximate real-world HVAC operations to build DRL training environments. The framework also implements state-of-the-art DRL algorithms (e.g., deep deterministic policy gradient) for optimal control over the AHUs. Three AHUs, each with two years of building automation system (BAS) data, were used as testbeds for evaluation. Our LSTM-based DRL training environments, built using the first year's BAS data, achieved an average mean square error of 0.0015 across 16 normalized AHU parameters. When deployed in the testing environments, which were built using the second year's BAS data of the same AHUs, the DRL agents achieved 27%–30% energy savings compared to the actual energy consumption, while maintaining the predicted percentage of discomfort (PPD) at 10%.

KW - Deep reinforcement learning

KW - Energy consumption

KW - HVAC control

KW - Long-short-term-memory network

KW - Thermal comfort

UR - http://www.scopus.com/inward/record.url?scp=85074928449&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85074928449&partnerID=8YFLogxK

U2 - 10.1016/j.buildenv.2019.106535

DO - 10.1016/j.buildenv.2019.106535

M3 - Article

AN - SCOPUS:85074928449

VL - 168

JO - Building and Environment

JF - Building and Environment

SN - 0360-1323

M1 - 106535

ER -