Getting Started | Usage | Benchmarks | Baselines | Todo
Stable Hadamard Memory (SHM) framework delivers a breakthrough in scalable and robust memory for deep learning models. Using the Hadamard product for updates and calibration, it ensures stable gradient flows while avoiding issues like vanishing or exploding gradients. SHM excels at long-term reasoning due to its attention-free, parallelizable design and linear complexity, making it ideal for large-scale tasks. If you find SHM helpful, please share your feedback, cite our work, and give it a star. Your support means a lot!
Why SHM?
- SHM provides a stable and efficient approach to neural memory construction in deep sequence models, offering a strong foundation for advanced neural architectures.
- SHM is designed to be flexible and adaptable, making it easy to integrate into a wide range of applications and research workflows.
- SHM math is simple, yet generic:

  $$M_t = M_{t-1} \odot C_t + U_t$$

  where $M_t$ is the memory matrix, $C_t$ is the calibration matrix, $U_t$ is the update matrix, and $\odot$ is the Hadamard product.

Special cases of SHM:
- SSM: $M_t$, $C_t$, and $U_t$ are vectors
- Linear Attention: $C_t = 1$
- mLSTM: $C_t$ is a scalar
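One step of this generic Hadamard update can be sketched in dependency-free Python. The `shm_step` helper below is illustrative only; the actual implementation is in shm.py:

```python
def shm_step(M_prev, C, U):
    """One generic SHM-style update: M_t = M_{t-1} * C_t + U_t (elementwise)."""
    return [
        [m * c + u for m, c, u in zip(mr, cr, ur)]
        for mr, cr, ur in zip(M_prev, C, U)
    ]

M0 = [[1.0, 2.0], [3.0, 4.0]]   # previous memory M_{t-1}
C1 = [[0.5, 0.5], [0.5, 0.5]]   # calibration matrix C_t
U1 = [[1.0, 0.0], [0.0, 1.0]]   # update matrix U_t
M1 = shm_step(M0, C1, U1)       # [[1.5, 1.0], [1.5, 3.0]]

# Linear-attention special case: C_t = 1 everywhere reduces the
# recurrence to pure accumulation, M_t = M_{t-1} + U_t.
ones = [[1.0, 1.0], [1.0, 1.0]]
M1_linattn = shm_step(M0, ones, U1)
```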
For more details, check out our paper, reviews, and blogs. Please feel free to let us know your suggestions. We're constantly working to improve and expand the framework.
Important
If you find this repository helpful for your work, please consider citing as follows:
```bibtex
@inproceedings{
le2025stable,
title={Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning},
author={Hung Le and Dung Nguyen and Kien Do and Sunil Gupta and Svetha Venkatesh},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=We5z3UEnUY}
}
```

First, clone the SHM repository:
```shell
cd /path/to/your/project
git clone https://github.com/thaihungle/SHM.git
```

Python 3.8 or higher is recommended. If you use GPUs, CUDA 11 or higher is recommended. After ensuring the CUDA driver is installed correctly, you can install the other dependencies.
We recommend setting up separate dependencies for each benchmark.
Example setup for the POPGym benchmark: Python 3.8 + PyTorch 2.4.0

```shell
# Install Python
conda create -n SHM-popgym python=3.8
conda activate SHM-popgym
# Install other dependencies
pip install -r popgym_requirements.txt
```

Example setup for the Pomdp-baselines benchmark: Python 3.8 + PyTorch 2.4.0
```shell
# Install Python
conda create -n SHM-pomdp python=3.8
conda activate SHM-pomdp
# Install other dependencies
pip install -r pompd_requirements.txt
```

SHM can be used as an independent PyTorch module:
```python
import torch
from shm import SHM

batch, length, dim = 2, 64, 16
# remove .to("cuda") if you use CPU
x = torch.randn(batch, length, dim).to("cuda")
model = SHM(input_size=dim, mem_size=16, output_size=32).to("cuda")
y = model(x)
```

Implementation details of the SHM module can be found in shm.py. Note that when adapting SHM to specific tasks, we may slightly modify the implementation to follow common practice (e.g., adding a residual shortcut).
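As an illustration of one such task-specific modification, a residual shortcut around a memory module might look like the minimal sketch below. The `ResidualWrapper` name and plain-list inputs are stand-ins for this example, not the repo's API, and it assumes the wrapped module preserves the input's shape:

```python
class ResidualWrapper:
    """Illustrative residual shortcut: y = x + f(x), assuming f preserves shape."""

    def __init__(self, module):
        # module: any callable mapping a sequence to a same-length sequence
        self.module = module

    def __call__(self, x):
        out = self.module(x)
        # add the input back onto the module's output, elementwise
        return [xi + oi for xi, oi in zip(x, out)]

# toy usage with a stand-in "memory" that just scales its input by 0.5
scale = lambda xs: [0.5 * xi for xi in xs]
wrapped = ResidualWrapper(scale)
y = wrapped([2.0, 4.0])  # [3.0, 6.0]
```

The same pattern applies to a PyTorch module whose output size matches its input size.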
POPGym is designed to benchmark memory in deep reinforcement learning. Here, we focus on the most memory-intensive tasks:
- Autoencode
- Battleship
- Concentration
- RepeatPrevious
Each task consists of 3 modes of environments: easy, medium, and hard.
Example easy training using SHM with a memory size of 128:
```shell
python train_popgym.py --env AutoencodeEasy --model shm --m 128
python train_popgym.py --env BattleshipEasy --model shm --m 128
python train_popgym.py --env ConcentrationEasy --model shm --m 128
python train_popgym.py --env RepeatPreviousEasy --model shm --m 128
```
Example hard training using SHM with a memory size of 32:
```shell
python train_popgym.py --env AutoencodeHard --model shm --m 32
python train_popgym.py --env BattleshipHard --model shm --m 32
python train_popgym.py --env ConcentrationHard --model shm --m 32
python train_popgym.py --env RepeatPreviousHard --model shm --m 32
```
Results and Logs
See the folder ./results_popgym for POPGym's outputs and logs (we support Tensorboard!). You should be able to reproduce results like this:
Hyperparameters
We follow the well-established hyperparameters set by POPGym. We only tune the memory-related hyperparameters:
- $m$: memory size for matrix-based memory models such as SHM
- $h$: hidden size for vector-based memory models such as GRU
For other hyperparameters, see train_popgym.py.
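A rough way to compare these two knobs is to count scalar state entries. The $d \times m$ memory shape below is an illustrative assumption, not the exact shape used by SHM (see shm.py for the real shapes):

```python
# Rough state-size comparison between matrix-based and vector-based memory.
# The d x m memory shape is an illustrative assumption for this sketch.
def matrix_state_entries(d, m):
    """Scalar entries held by a d x m memory matrix (matrix-based models like SHM)."""
    return d * m

def vector_state_entries(h):
    """Scalar entries held by an h-dim hidden state (vector-based models like GRU)."""
    return h

# e.g. a 16 x 128 memory matrix holds as many scalars as a 2048-dim hidden vector
assert matrix_state_entries(16, 128) == vector_state_entries(2048)
```

This is why the matrix-based runs above use much smaller $m$ than the $h$ values used for vector-based baselines.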
Pomdp-baselines covers benchmarks in several subareas of POMDPs (including meta RL, robust RL, generalization in RL, and temporal credit assignment). Here, we focus on 2 tasks:
- Meta RL
- Long-horizon Credit Assignment

Each task consists of several environments.
Example training using SHM (default settings):
```shell
python train_pomdp.py --task meta --env wind_50_150 --model shm
python train_pomdp.py --task meta --env point_robot_50_150 --model shm
```
Results and Logs
See the folder ./results_pomdp for Pomdp-baselines' outputs and logs (we support Tensorboard!). You should be able to reproduce results like this:
In addition to the default POPGym baselines, we have added the following models:
To run experiments with baselines, please refer to train_popgym.py to add the baseline calls. Then, run the training command.
Example easy training using GRU with different hidden sizes:
```shell
python train_popgym.py --env AutoencodeEasy --model gru --h 256
python train_popgym.py --env AutoencodeEasy --model gru --h 512
python train_popgym.py --env AutoencodeEasy --model gru --h 1024
```
In addition to the default Pomdp-baselines models (MLP, GRU, and LSTM), we have added the following models:
To run experiments with baselines, please refer to train_pomdp.py and config files for hyperparameter details.
Example training using GRU:
```shell
python train_pomdp.py --task meta --env wind --model gru
python train_pomdp.py --task meta --env point_robot --model gru
python train_pomdp.py --task credit --env key_to_door --model gru
python train_pomdp.py --task credit --env visual_match --model gru
```
- POPGym Tasks
- Pomdp-baseline Tasks
- Time-series Tasks
- LLM Tasks
Any contribution you can make is welcome.



