Music to Dance as Language Translation using Sequence Models
André Correia, Luís A. Alexandre
Paper: https://arxiv.org/abs/2403.15569
To extract audio and pose features from a dataset yourself, see the example code for the AIST++ and PhantomDance datasets in the processing directory.
Otherwise, you can download the processed AIST++ and PhantomDance data below:
sudo apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6
curl -O https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Linux-x86_64.sh
bash Anaconda3-2024.02-1-Linux-x86_64.sh
conda init
conda create --name MDLT
conda activate MDLT
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip3 install numpy
pip3 install transformers
pip install "causal-conv1d>=1.2.0"
pip install mamba-ssm
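After the install steps above, it can help to confirm that every package is importable inside the MDLT environment before starting a long training run. The following is a minimal sketch, not part of the repository: the helper names `missing_packages` and `cuda_status` are hypothetical, and the import names `causal_conv1d` and `mamba_ssm` are assumed to match the modules those pip packages install.

```python
import importlib.util

# Map each pip package installed above to the module name it is expected
# to provide. The last two import names are assumptions, not taken from
# this README.
REQUIRED = {
    "torch": "torch",
    "torchvision": "torchvision",
    "torchaudio": "torchaudio",
    "numpy": "numpy",
    "transformers": "transformers",
    "causal-conv1d": "causal_conv1d",
    "mamba-ssm": "mamba_ssm",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose module cannot be located."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


def cuda_status():
    """Return True/False for CUDA availability, or None if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the check works without torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    print("missing packages:", missing_packages() or "none")
    print("CUDA available:", cuda_status())
```

If the script reports missing packages, re-run the corresponding install command with the MDLT environment activated. A CUDA status of False usually points to a driver or wheel mismatch rather than a problem with this codebase.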
Train Transformer on AIST++
python3 main_transformer_aist.py --genre 'all' --K 20 --n_epochs 50000 --infer_every 5000 --n_layer 6 --d_model 128 --val_index 0
Train Transformer on AIST++ "mLH" genre
python3 main_transformer_aist.py --genre 'mLH' --K 20 --n_epochs 50000 --infer_every 5000 --n_layer 6 --d_model 128 --val_index 0
Train Transformer on PhantomDance
python3 main_transformer_phantom.py --K 20 --n_epochs 50000 --infer_every 5000 --n_layer 6 --d_model 128 --val_type 0
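The AIST++ commands above differ only in the `--genre` value, so several runs can be generated from one template. The sketch below is illustrative and not part of the repository: `build_command` and `aist_commands` are hypothetical helpers, the flags and values are copied from the commands above (with `--infer_every 5000`), and only the genres shown in this README ('all' and 'mLH') are used.

```python
import shlex

# Hyperparameters shared by the training commands in this README.
COMMON_ARGS = {
    "--K": 20,
    "--n_epochs": 50000,
    "--infer_every": 5000,
    "--n_layer": 6,
    "--d_model": 128,
}


def build_command(script, extra_args):
    """Return the argv list for one training run of `script`."""
    argv = ["python3", script]
    # Script-specific flags come first, then the shared hyperparameters.
    for flag, value in {**extra_args, **COMMON_ARGS}.items():
        argv += [flag, str(value)]
    return argv


def aist_commands(genres, val_index=0):
    """Build one AIST++ training command per genre."""
    return [
        build_command(
            "main_transformer_aist.py",
            {"--genre": genre, "--val_index": val_index},
        )
        for genre in genres
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in aist_commands(["all", "mLH"]):
        print(shlex.join(argv))
```

To launch the runs instead of printing them, each argv list could be passed to `subprocess.run(argv, check=True)` from the repository root.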
If you use this codebase, or otherwise found our work valuable, please cite MDLT:
@misc{correia2024music,
title={Music to Dance as Language Translation using Sequence Models},
author={André Correia and Luís A. Alexandre},
year={2024},
eprint={2403.15569},
archivePrefix={arXiv},
primaryClass={cs.SD}
}