This repository contains the code for the experiments in the paper:
Title: In-context Language Learning: Architectures and Algorithms
Authors: Ekin Akyürek, Bailin Wang, Yoon Kim, Jacob Andreas
To set up the environment:
conda create -n seq_icl python=3.11
conda activate seq_icl
pip install -r requirements.txt
To run training:
python -m train experiment=dfa/lstm
python -m train experiment=dfa/retnet
python -m train experiment=dfa/gla
python -m train experiment=dfa/transformer+
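To run the four configurations above back to back, a small launcher can help. The following is a minimal sketch, not part of the repo: the names EXPERIMENTS, build_command, and run_all are hypothetical, and it assumes it is started from the repository root so that python -m train resolves. By default it only prints the commands.

```python
import subprocess
import sys

# Experiment names copied from the commands above.
EXPERIMENTS = ["dfa/lstm", "dfa/retnet", "dfa/gla", "dfa/transformer+"]


def build_command(experiment: str) -> list[str]:
    """Return the argv list equivalent to `python -m train experiment=<name>`."""
    return [sys.executable, "-m", "train", f"experiment={experiment}"]


def run_all(experiments=EXPERIMENTS, dry_run=True):
    """Run each experiment in order; with dry_run=True only print the commands."""
    for name in experiments:
        cmd = build_command(name)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sweep at the first failing run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling run_all(dry_run=False) would launch the runs sequentially; passing the argv as a list avoids shell quoting issues with names such as dfa/transformer+.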
- Add
  export PATH=$PATH:/usr/local/sbin:/usr/sbin:/sbin
  to your shell environment so that ldconfig works properly.
- The MHA in simple_lm.py uses num_heads, but the other modules use n_heads. The names should be made consistent, but they are kept as is for now.
- You might need to set up conv1d by following the commands in this issue:
git clone https://github.com/Dao-AILab/causal-conv1d.git
cd causal-conv1d
git checkout v1.0.2 # this is the highest compatible version allowed by Mamba
CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .
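The two environment notes above (ldconfig on PATH, causal-conv1d pinned to v1.0.2) can be checked before training. This is a rough sketch under stated assumptions, not part of the repo: the helper names are hypothetical, the PATH handling is POSIX-only, and the distribution name "causal-conv1d" is an assumption about how the package registers itself once installed.

```python
import os
import shutil
from importlib import metadata

# Directories from the PATH note above, where ldconfig usually lives.
SBIN_DIRS = ["/usr/local/sbin", "/usr/sbin", "/sbin"]
PINNED_CONV1D = "1.0.2"  # tag checked out in the build commands above


def extend_path(path: str, extra=SBIN_DIRS) -> str:
    """Append any missing directories to a colon-separated PATH string."""
    parts = [p for p in path.split(":") if p]
    for d in extra:
        if d not in parts:
            parts.append(d)
    return ":".join(parts)


def ldconfig_found(path: str) -> bool:
    """True if an executable named ldconfig is found on the given PATH string."""
    return shutil.which("ldconfig", path=path) is not None


def conv1d_version(dist_name: str = "causal-conv1d"):
    """Installed version of the distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)  # dist_name is an assumption
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    path = extend_path(os.environ.get("PATH", ""))
    print("ldconfig found:", ldconfig_found(path))
    print("causal-conv1d:", conv1d_version(), "(expected", PINNED_CONV1D + ")")
```

The script only reports what it finds; it does not modify the environment, so the export line still has to be added to the shell itself.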
This repo is adapted from safari. Triton implementations are taken from linear rnn.