This project implements deep reinforcement learning algorithms to optimize datacenter cooling systems using EnergyPlus simulations. It was developed as part of Washington University in St. Louis's CSE 510A: Deep Reinforcement Learning course.
With the growing popularity of deep learning and big data, data centers have become essential to modern infrastructure, driving up energy consumption for both computing and cooling. While large-scale data centers have been extensively studied, small and mid-sized data centers remain understudied despite holding 42.5% and 19.5% of market share, respectively. This project focuses on using Deep Reinforcement Learning (DRL) to optimize cooling efficiency in these smaller facilities.
As machines within a data center complete their tasks, they generate heat, creating a complex spatial cooling problem. This project leverages DRL to dynamically adapt to changing conditions such as machine workload and external temperature.
Key contributions include:
- A focus on small- to mid-sized data centers
- A novel exploration of Dueling Deep Q-Networks (DDQN) for data-center cooling
- Comparative analysis between DRL methods and traditional control approaches
- Multiple DRL algorithm implementations:
  - Dueling Deep Q-Network (DDQN) - Our novel contribution to datacenter cooling (see the sketch after this list)
  - Proximal Policy Optimization (PPO) with Generalized Advantage Estimation
  - Soft Actor-Critic (SAC) with automatic entropy tuning
- Integration with EnergyPlus for accurate building energy simulation
- Custom environment for datacenter cooling optimization
- Performance metrics and energy efficiency tracking
- Configurable hyperparameters for training
- Baselines for comparison (Random, Rules-Based Controller, Rules-Based Incremental)
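As background for the DDQN contribution listed above: a dueling architecture splits the Q-network into separate state-value and advantage streams that are recombined into Q-values. The PyTorch sketch below is a minimal, generic illustration with assumed layer sizes and names; it is not the code in `drl/ddqn.py`.

```python
import torch
import torch.nn as nn

class DuelingQNetwork(nn.Module):
    """Minimal dueling Q-network: shared trunk, then value and advantage heads."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.value_head = nn.Linear(hidden, 1)               # V(s)
        self.advantage_head = nn.Linear(hidden, n_actions)   # A(s, a)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        features = self.trunk(obs)
        value = self.value_head(features)
        advantage = self.advantage_head(features)
        # Combine streams; subtracting the mean advantage keeps Q identifiable.
        return value + advantage - advantage.mean(dim=-1, keepdim=True)
```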
- `drl/ddqn.py` - Dueling Deep Q-Network implementation
- `drl/ppo.py` - Proximal Policy Optimization implementation
- `drl/sac.py` - Soft Actor-Critic implementation
- `requirements.txt` - Project dependencies
- Anaconda - For environment management
- GitHub Desktop - For repository management
- EnergyPlus - For building energy simulation
- Sinergym - Python wrapper for EnergyPlus
This project uses the Sinergym Python package to simulate a small datacenter through the `Eplus-datacenter-mixed-continuous-stochastic-v1` environment. The environment simulates:
- A 491.3 m² building divided into two asymmetrical zones (west and east)
- Each zone equipped with an HVAC system
- Hosted servers as primary heat sources
- Stochastic weather conditions (normal weather variation amplified by 1.5 standard deviations)
- Training period from June 1st to August 31st
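A minimal interaction sketch with this environment, assuming Sinergym registers it with Gymnasium on import; the random policy and reward bookkeeping here are illustrative, not the project's training loop:

```python
import gymnasium as gym
import sinergym  # registers the Eplus-* environments with Gymnasium

# Assumes EPLUS_PATH is already configured (see the installation steps below).
env = gym.make("Eplus-datacenter-mixed-continuous-stochastic-v1")

obs, info = env.reset()
terminated = truncated = False
total_reward = 0.0
while not (terminated or truncated):
    action = env.action_space.sample()  # placeholder for a trained DRL policy
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
env.close()
print(f"Episode return: {total_reward:.2f}")
```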
- Clone the repository:

  ```bash
  git clone https://github.com/peyton-gozon/CSE510A-Datacenter-Cooling
  ```

- Navigate to the project directory:

  ```bash
  cd CSE510A-Datacenter-Cooling
  ```

- Create and activate a conda environment:

  ```bash
  conda create -n cooler
  conda activate cooler
  ```

- Install Python 3.12:

  ```bash
  conda install python=3.12
  ```

- Install required packages:

  ```bash
  pip3 install -r requirements.txt
  ```
- Configure EnergyPlus:

  - Set the `EPLUS_PATH` environment variable in your chosen algorithm file to point to your EnergyPlus installation directory (see the Python sketch after these steps)
- Run the training:

  ```bash
  cd drl
  python3 ppo.py  # or ddqn.py/sac.py for other algorithms
  ```
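The EnergyPlus configuration step above expects `EPLUS_PATH` to be set from inside the chosen algorithm file. A minimal sketch of doing that from Python follows; the installation path shown is an assumption for a default EnergyPlus 24.2.0 install and should be replaced with your own directory.

```python
import os

# EPLUS_PATH should point at the EnergyPlus installation directory.
# The path below is an assumption (default EnergyPlus 24.2.0 location); adjust it to your install.
os.environ["EPLUS_PATH"] = "/usr/local/EnergyPlus-24-2-0"
```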
Choose between different DRL algorithms based on your needs:
- `ddqn.py` - Our novel approach using Dueling DQN with a discretized action space (best performance in our tests)
- `ppo.py` - PPO with Generalized Advantage Estimation for more stable training
- `sac.py` - SAC with automatic entropy tuning for continuous action spaces
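Because `ddqn.py` acts on a discretized action space while the Sinergym environment exposes continuous heating/cooling setpoints, some mapping from discrete actions to setpoint pairs is needed. The sketch below shows one illustrative way to do this; the grid sizes, setpoint ranges, and function name are assumptions, not the repository's actual scheme.

```python
import numpy as np

# Illustrative discretization (assumed, not the repo's actual scheme):
# each discrete action indexes a (heating, cooling) setpoint pair on a grid.
HEATING_SETPOINTS = np.linspace(15.0, 22.5, 4)   # °C, assumed range
COOLING_SETPOINTS = np.linspace(22.5, 30.0, 4)   # °C, assumed range

def discrete_to_continuous(action_index: int) -> np.ndarray:
    """Map a discrete action index to a continuous [heating, cooling] setpoint pair."""
    h, c = divmod(action_index, len(COOLING_SETPOINTS))
    return np.array([HEATING_SETPOINTS[h], COOLING_SETPOINTS[c]], dtype=np.float32)

# Example: 16 discrete actions cover the 4x4 grid of setpoint pairs.
print(discrete_to_continuous(5))  # -> [17.5, 25.0] on this grid
```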
Our experiments revealed:
- DDQN significantly outperformed other approaches, showing a 35.8% improvement over the Rules-Based Incremental Controller
- PPO achieved a 10.9% improvement over the baseline
- SAC showed limited improvement (0.284%) compared to the baseline
- Weather forecasting data generally reduced model performance across configurations
- Model-free approaches like DDQN offer promising results for small- to mid-sized data centers with limited computational resources
Based on our research, we recommend:
- DDQN: Learning rate of 0.0005 decayed over 50k timesteps, γ=0.99
- PPO: Clip ratio of 0.1, 64×64 node network, linear learning rate scheduling, batch size of 64
- SAC: γ=0.99, τ=0.005, α=0.2, learning rate of 0.0003, (256, 2) network architecture with learnable α and automatic entropy tuning
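For convenience, the recommended settings above can be collected into per-algorithm configuration dictionaries. The names and keys below are illustrative assumptions; the values come directly from the list above.

```python
# Illustrative hyperparameter bundles (names/keys are assumptions; values from the list above).
DDQN_CONFIG = {
    "learning_rate": 5e-4,        # decayed over 50k timesteps
    "lr_decay_timesteps": 50_000,
    "gamma": 0.99,
}

PPO_CONFIG = {
    "clip_ratio": 0.1,
    "hidden_sizes": (64, 64),
    "lr_schedule": "linear",
    "batch_size": 64,
}

SAC_CONFIG = {
    "gamma": 0.99,
    "tau": 0.005,
    "alpha": 0.2,                 # learnable, with automatic entropy tuning
    "learning_rate": 3e-4,
    "hidden_sizes": (256, 256),   # assumed reading of the "(256, 2)" architecture above
}
```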
- Python 3.12
- EnergyPlus 24.2.0
- PyTorch 2.5.1
- Gymnasium 1.0.0
- Sinergym 3.7.0
- NumPy
- Tensorboard 2.18.0
Developed by Joseph Islam, Peyton Gozon, and Aadarsha Gopala Reddy for CSE 510A at Washington University in St. Louis.