Stable Hadamard Memory: A Unified Linear Memory Framework


πŸš€ Getting Started | πŸ”§ Usage | 🎯 Benchmarks | 🧠 Baselines | 🀝 Todo

The Stable Hadamard Memory (SHM) framework delivers scalable and robust memory for deep learning models. Using the Hadamard product for both updates and calibration, it keeps gradient flows stable and avoids vanishing or exploding gradients. πŸŽ‰ Its attention-free, parallelizable design and linear complexity make SHM well suited to long-term reasoning and large-scale tasks. ✨ If you find SHM helpful, please share your feedback, cite our work, and give it a ⭐. Your support means a lot!

Why SHM?

  • SHM provides a stable and efficient approach to neural memory construction in deep sequence models, offering a strong foundation for advanced neural architectures.
  • SHM is designed to be flexible and adaptable, making it easy to integrate into a wide range of applications and research workflows.
  • SHM math is simple, yet generic: the memory is updated by an elementwise (Hadamard) calibration of the previous state plus an additive term.
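To make the idea concrete, here is a minimal NumPy sketch of a Hadamard-style memory recurrence. The notation (a calibration matrix `C_t` multiplied elementwise into the previous memory, plus an additive update `U_t`) is an assumption based on the description above, not code taken from `shm.py`:

```python
import numpy as np

def shm_step(M_prev, C_t, U_t):
    # Generic Hadamard-style update (assumed form): M_t = C_t * M_prev + U_t.
    # `*` is the elementwise (Hadamard) product, so each memory slot is
    # calibrated independently before the additive update.
    return C_t * M_prev + U_t

# Toy rollout: calibration values near 1 keep the recurrence stable.
rng = np.random.default_rng(0)
mem = np.zeros((4, 4))
for _ in range(8):
    C = rng.uniform(0.9, 1.1, size=(4, 4))
    U = rng.normal(size=(4, 4))
    mem = shm_step(mem, C, U)
print(mem.shape)
```

Because the update is elementwise, all time steps can be combined with a parallel scan, which is what gives the framework its linear, parallelizable character.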

Several existing linear memory models arise as special cases of SHM.

πŸ“œ For more details, check out our paper, reviews, and blog posts. Please feel free to send us your suggestions; we're constantly working to improve and expand the framework.

Important

If you find this repository helpful for your work, please consider citing as follows:

@inproceedings{
le2025stable,
title={Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning},
author={Hung Le and Dung Nguyen and Kien Do and Sunil Gupta and Svetha Venkatesh},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=We5z3UEnUY}
}

πŸš€ Installation and Quick Start

⏬ Cloning the Repository

First, clone the SHM repository:

cd /path/to/your/project
git clone https://github.com/thaihungle/SHM.git

πŸ’Ώ Installing Dependencies

Python 3.8 or higher is recommended. If you use GPUs, CUDA 11 or higher is recommended. After ensuring the CUDA driver is installed correctly, you can install the other dependencies.

We recommend setting up separate dependencies for each benchmark.

Example Setup for POPGym benchmark: Python 3.8 + PyTorch 2.4.0

# Install Python
conda create -n SHM-popgym python=3.8
conda activate SHM-popgym
# Install other dependencies
pip install -r popgym_requirements.txt

Example Setup for Pomdp-baselines benchmark: Python 3.8 + PyTorch 2.4.0

# Install Python
conda create -n SHM-pomdp python=3.8
conda activate SHM-pomdp
# Install other dependencies
pip install -r pompd_requirements.txt

πŸ”§ Usage

SHM can be used as a standalone PyTorch module:

import torch
from shm import SHM

batch, length, dim = 2, 64, 16
# remove to("cuda") if you use CPU
x = torch.randn(batch, length, dim).to("cuda")
model = SHM(input_size=dim, mem_size=16, output_size=32).to("cuda")
y = model(x)

Implementation details of the SHM module can be found in shm.py. Note that when adapting SHM to specific tasks, we slightly modify the implementation to follow common practice (e.g., adding a residual shortcut).
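The residual shortcut mentioned above can be sketched as a framework-agnostic wrapper. This is a hypothetical helper for illustration (not part of `shm.py`), and it assumes the wrapped module's input and output widths match:

```python
def with_residual(module, x):
    # Hypothetical residual wrapper: add the input back onto the module's
    # output. Assumes module(x) has the same shape/width as x.
    return x + module(x)

# Usage with a stand-in "module" that doubles its input.
print(with_residual(lambda v: 2 * v, 3.0))  # 9.0
```

The same pattern applies unchanged to a PyTorch module, where `x` would be a tensor and `module` an `nn.Module`.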

🎯 Benchmarks

☝️ POPGym

POPGym is designed to benchmark memory in deep reinforcement learning. Here, we focus on the most memory-intensive tasks:

  • Autoencode
  • Battleship
  • Concentration
  • RepeatPrevious

Each task consists of 3 modes of environments: easy, medium, and hard.
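As the commands below suggest, environment names appear to be formed by concatenating the task name and the mode (naming convention inferred from the examples, not documented in the repo):

```python
# Enumerate the POPGym environment names used in this README's examples.
tasks = ["Autoencode", "Battleship", "Concentration", "RepeatPrevious"]
modes = ["Easy", "Medium", "Hard"]
envs = [task + mode for task in tasks for mode in modes]
print(envs[0])  # AutoencodeEasy
print(len(envs))  # 12
```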

Example easy training using SHM with a memory size of 128:

python train_popgym.py --env AutoencodeEasy --model shm --m 128
python train_popgym.py --env BattleshipEasy --model shm --m 128
python train_popgym.py --env ConcentrationEasy --model shm --m 128
python train_popgym.py --env RepeatPreviousEasy --model shm --m 128

Example hard training using SHM with a memory size of 32:

python train_popgym.py --env AutoencodeHard --model shm --m 32
python train_popgym.py --env BattleshipHard --model shm --m 32
python train_popgym.py --env ConcentrationHard --model shm --m 32
python train_popgym.py --env RepeatPreviousHard --model shm --m 32

Results and Logs

See the folder ./results_popgym for POPGym outputs and logs (TensorBoard is supported!). You should be able to reproduce the reported results.

Hyperparameters

We follow the well-established hyperparameters set by POPGym and tune only the memory-related hyperparameters:

  • $m$: memory size for matrix-based memory models such as SHM
  • $h$: hidden size for vector-based memory models such as GRU

For other hyperparameters, see train_popgym.py.

✌️ Pomdp-baselines

Pomdp-baselines covers several subareas of POMDPs, including meta RL, robust RL, generalization in RL, and temporal credit assignment. Here, we focus on 2 tasks:

  • Meta RL
  • Long-horizon Credit Assignment

Each task consists of several environments.

Example training using SHM (default $m=24$):

python train_pomdp.py --task meta --env wind_50_150 --model shm
python train_pomdp.py --task meta --env point_robot_50_150 --model shm

Results and Logs

See the folder ./results_pomdp for Pomdp-baselines outputs and logs (TensorBoard is supported!). You should be able to reproduce the reported results.

🧠 Baselines

☝️ POPGym

In addition to the default POPGym baselines, we have added further models.

To run experiments with these baselines, refer to train_popgym.py to add the baseline calls, then run the training command.

Example easy training using GRU with different hidden sizes:

python train_popgym.py --env AutoencodeEasy --model gru --h 256
python train_popgym.py --env AutoencodeEasy --model gru --h 512
python train_popgym.py --env AutoencodeEasy --model gru --h 1024

✌️ Pomdp-baselines

In addition to the default Pomdp-baselines models (MLP, GRU, and LSTM), we have added further models.

To run experiments with baselines, please refer to train_pomdp.py and config files for hyperparameter details.

Example training using GRU:

python train_pomdp.py --task meta --env wind --model gru
python train_pomdp.py --task meta --env point_robot --model gru
python train_pomdp.py --task credit --env key_to_door --model gru
python train_pomdp.py --task credit --env visual_match --model gru

🀝 Things to Do

  • POPGym Tasks
  • Pomdp-baselines Tasks
  • Time-series Tasks
  • LLM Tasks

Any contribution you can make is welcome.
