Signature‑Informed Transformer (SIT) for Asset Allocation

This is the original PyTorch implementation of SIT from the paper: SIGNATURE-INFORMED TRANSFORMER FOR ASSET ALLOCATION.

🚩 News (Aug 08, 2025): We have released SIT.

Repository structure

asset_data/full_dataset.csv – a CSV of daily prices/returns used to reproduce the paper’s experiments. It contains date‑indexed closing prices for a universe of up to 50 assets. The data are split chronologically: training covers 2000‑01‑01 to 2016‑12‑31, validation covers 2017‑01‑01 to 2019‑12‑31 and testing spans 2020‑01‑01 to 2024‑12‑31. Only the first data_pool columns (assets) are used during training.
0_get_sig_data_all.py – pre‑computes signature and cross‑signature features. It reads full_dataset.csv, splits it into the train/val/test ranges above and saves the signature tensors and future returns for multiple asset pools and window/horizon configurations. Running this script is optional but speeds up training.
run.py – entry point for training and evaluation. It wraps the experiment class in exp/ and exposes many hyper‑parameters, such as the number of assets (--data_pool), lookback window (--window_size), horizon (--horizon), model dimension (--d_model), number of transformer layers and heads, maximum position and trading cost.
runfile/test.sh – example shell script that trains SIT on three different asset pools (30, 40 and 50 assets) with different hyper‑parameter settings. Adjust the script or construct your own command lines using run.py.
results/ – contains equity curves (*_test_equity_curve.png), portfolio statistics (*_test_metrics.csv) and positions (*_test_positions.csv) generated by the example script.
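
The chronological split described for asset_data/full_dataset.csv can be sketched with pandas as follows. This is a minimal illustration, not the repository's data loader: the helper name load_and_split is hypothetical, and the Date column name is an assumption taken from the custom-CSV instructions in this README. The tiny in-memory CSV stands in for the real file.

```python
import io
import pandas as pd

def load_and_split(csv_source, data_pool):
    """Load a date-indexed table, keep the first `data_pool` assets, split by date.

    Hypothetical helper for illustration; assumes a `Date` column.
    """
    df = pd.read_csv(csv_source, parse_dates=["Date"], index_col="Date").sort_index()
    df = df.ffill()                 # forward-fill missing values
    df = df.iloc[:, :data_pool]     # only the first `data_pool` columns are used
    train = df.loc["2000-01-01":"2016-12-31"]
    val = df.loc["2017-01-01":"2019-12-31"]
    test = df.loc["2020-01-01":"2024-12-31"]
    return train, val, test

# Toy data with three assets and two gaps, standing in for full_dataset.csv.
csv_text = """Date,A,B,C
2016-12-30,100.0,50.0,10.0
2017-01-03,,51.0,11.0
2019-12-31,102.0,52.0,12.0
2020-01-02,103.0,,13.0
"""
train, val, test = load_and_split(io.StringIO(csv_text), data_pool=2)
print(len(train), len(val), len(test))
```

Forward-filling before slicing means a gap at the start of the validation or test range is filled from the previous range's last observation, which only uses past information.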

Requirements and installation

SIT requires Python 3.8+ and PyTorch 1.10+. To install the dependencies, clone the repository and run:

# clone the project (replace with your fork if necessary)
git clone https://github.com/Yoontae6719/Signature-Informed-Transformer-For-Asset-Allocation.git
cd Signature-Informed-Transformer-For-Asset-Allocation

# install python packages
pip install -r requirements.txt  # installs PyTorch, pandas, numpy, tqdm, joblib, etc

Obtain the dataset. A sample full_dataset.csv is provided under asset_data/. If you wish to experiment with your own assets, create a CSV with a Date column and one column per asset containing daily returns or prices. Missing values should be forward‑filled.
Generate signatures (MUST). Running signature extraction ahead of time speeds up training. Use:

   # create signature caches for pools of 30, 40 and 50 assets with window=60 and horizon=20
   python 0_get_sig_data_all.py

The script iterates over DATA_POOLS = [40, 50, 30] and saves pre‑computed training, validation and test tensors to signature_cache_6020/pool_{n}. If you change the --window_size and --horizon values in run.py, re‑generate the cache accordingly. A pre‑processed dataset is also available for download.
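
For readers unfamiliar with path signatures, the following NumPy sketch computes a depth-2 truncated signature of a small piecewise-linear path. It is a generic textbook construction, not the code in 0_get_sig_data_all.py: the function name signature_depth2, the truncation depth and the toy two-channel path are all assumptions made for illustration, and the script's actual feature definitions (including its cross-signatures) may differ.

```python
import numpy as np

def signature_depth2(path):
    """Depth-2 truncated signature of a piecewise-linear path.

    path: array-like of shape (T, d).
    Returns (level1, level2) with shapes (d,) and (d, d).
    """
    path = np.asarray(path, dtype=float)
    increments = np.diff(path, axis=0)      # (T-1, d) step increments
    level1 = increments.sum(axis=0)         # total displacement of the path
    # Displacement accumulated strictly before each step.
    prior = np.cumsum(increments, axis=0) - increments
    # Iterated integrals: cross terms with earlier steps plus half of each
    # step's own outer product (exact for linear segments).
    level2 = prior.T @ increments + 0.5 * increments.T @ increments
    return level1, level2

# Toy two-channel path: move along the first axis, then along the second.
toy_path = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
level1, level2 = signature_depth2(toy_path)
print(level1)
print(level2)
```

The off-diagonal entries of level2 depend on the order in which the two channels move, which is why second-level terms are often used to capture lead-lag structure. In a rolling setup, a function like this would be applied to each window of window_size observations.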

Training and evaluation

To train SIT from scratch and evaluate it on the test set, execute:

python run.py \
    --is_training 1 \
    --model_id dp30 \
    --model SIT \
    --data FULL \
    --root_path ./asset_data/ \
    --data_path full_dataset.csv \
    --data_pool 30 \
    --window_size 60 \
    --horizon 20 \
    --d_model 8 \
    --n_heads 8 \
    --num_layers 1 \
    --sig_input_dim 2 \
    --cross_sig_dim 1 \
    --hidden_c 64 \
    --ff_dim 64 \
    --temperature 1.3 \
    --trade_cost_bps 0.0 \
    --itr 3

Alternatively, run the provided script:

bash ./runfile/test.sh

which trains three configurations sequentially. Training results and test performance are saved under results/.

Important command‑line flags

Flag	Description
`--data_pool`	Number of assets to include in the portfolio (e.g., 30, 40, 50).
`--window_size`	Length of the historical window used to compute path signatures. The script `0_get_sig_data_all.py` uses a default of 60.
`--horizon`	Prediction horizon (in trading days). Default is 20.
`--temperature`	Softmax temperature used when converting predicted returns into portfolio weights; higher temperature produces more uniform allocations.
`--trade_cost_bps`	Transaction cost in basis points (e.g., 0.05 % = 5 bps).
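
The effect of --temperature and the unit of --trade_cost_bps can be illustrated with a short NumPy sketch. This is a long-only softmax for illustration only: the function names softmax_weights and bps_to_fraction are hypothetical, and the mapping SIT actually uses to turn model outputs into positions may differ.

```python
import numpy as np

def softmax_weights(scores, temperature=1.0):
    """Turn per-asset scores into weights that are positive and sum to 1."""
    z = np.asarray(scores, dtype=float) / temperature
    z = z - z.max()                 # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

def bps_to_fraction(bps):
    """Convert basis points to a fraction (1 bp = 0.01 % = 0.0001)."""
    return bps / 10_000.0

scores = [2.0, 1.0, 0.0]            # illustrative predicted scores
w_sharp = softmax_weights(scores, temperature=0.5)
w_smooth = softmax_weights(scores, temperature=1.3)
print(w_sharp.round(3), w_smooth.round(3))

# Cost of fully turning over the portfolio once at 5 bps (0.05 %).
cost = bps_to_fraction(5.0) * 1.0
print(cost)
```

Dividing the scores by a larger temperature shrinks the gaps between them before exponentiation, so the resulting weights move toward an equal-weight allocation.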

Results and metrics

After training, SIT evaluates the portfolio on the validation and test sets. The experiment class computes the conditional value‑at‑risk (CVaR) and other metrics and saves:

Equity curves – .png plots showing cumulative returns on the test set.
Metrics CSV – summary statistics such as annualised return, volatility, Sharpe ratio and CVaR.
Positions CSV – the predicted positions for each rebalancing date.

Results generated by test.sh can be found under results/.
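
As a rough guide to how the summary statistics above are commonly defined, the sketch below computes them from a series of periodic returns. It is not the experiment class's implementation: the function name portfolio_metrics, the 252-day annualisation factor, the zero risk-free rate and the 95 % CVaR level are all assumptions.

```python
import numpy as np

TRADING_DAYS = 252  # assumed annualisation factor for daily returns

def portfolio_metrics(returns, alpha=0.95, periods_per_year=TRADING_DAYS):
    """Annualised return, volatility, Sharpe ratio (zero risk-free rate) and CVaR."""
    r = np.asarray(returns, dtype=float)
    equity = np.cumprod(1.0 + r)                       # equity curve starting from 1
    ann_return = equity[-1] ** (periods_per_year / len(r)) - 1.0
    ann_vol = r.std(ddof=1) * np.sqrt(periods_per_year)
    sharpe = r.mean() / r.std(ddof=1) * np.sqrt(periods_per_year)
    var = np.quantile(r, 1.0 - alpha)                  # return at the (1 - alpha) quantile
    cvar = r[r <= var].mean()                          # mean return in the worst tail
    return {
        "annual_return": ann_return,
        "annual_volatility": ann_vol,
        "sharpe": sharpe,
        "cvar": cvar,
    }

# Illustrative daily returns, not taken from the repository's results.
sample_returns = [0.01, -0.02, 0.015, -0.005, 0.02, -0.03, 0.005, 0.01, -0.01, 0.0]
metrics = portfolio_metrics(sample_returns)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Here CVaR is reported as a return, so a more negative value indicates a heavier loss tail; some conventions flip the sign and report it as a positive loss.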

Citation

To be updated.

License

This project is open‑sourced under the MIT License. See LICENSE for details.
