---
Simple_backtest_Binance is a high-frequency trading backtesting framework. It provides wrappers around the Binance APIs and functions to analyse hour-level trading data of two cryptocurrencies, Ethereum (ETH) and Cardano (ADA).
Data collection, alpha implementation, model analysis, backtesting on backtrader, and paper trading via the Binance API are all included, written in Python.
You can find usage examples here.
Note: you are strongly encouraged to build a virtual environment with Python 3.8 or above.
To create a virtual environment, you can use conda:

    conda create -n your_env_name python=3.8
To activate or deactivate the environment, you can use:

On Linux:

    source activate your_env_name
    # To deactivate:
    source deactivate

On Windows:

    activate your_env_name
    # To deactivate:
    deactivate env_name  # or activate root
To use the tools, you need to install the packages at the required versions:

    cd proj/
    conda install -n your_env_name --file requirements.txt
    # or: python3.8 -m pip install -r requirements.txt
- Tools and functions: see func_instruction here
- Data collection
  - Kline and aggTrades data downloading & saving from the Binance Market Data API
  - Kline and aggTrades data merging
    - merging on number of trades (`trades_at_current_ts`) and buy/sell ratio (`buy_sell_ratio_at_current_ts`)
  - Notice: the data size is more than 1.5 GB per year, so pay attention to storage. You can use MongoDB to store the data.
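As a sketch of the merge step described above, the snippet below buckets aggTrades rows to the hour and derives the two merged columns. The sample rows and helper names are illustrative assumptions, not the repo's code; the `is_buyer_maker` flag follows the Binance aggTrades convention, where `True` means the taker sold.

```python
# Hypothetical mini-sample of aggTrades rows: (timestamp_ms, qty, is_buyer_maker).
agg_trades = [
    (1672531200000, 1.5, False),  # taker buy in hour 1
    (1672531260000, 0.5, True),   # taker sell in hour 1
    (1672534800000, 2.0, False),  # taker buy in hour 2
]

def hour_bucket(ts_ms):
    """Floor a millisecond timestamp to the start of its hour (in ms)."""
    return ts_ms - ts_ms % 3_600_000

def merge_stats(trades):
    """Aggregate per-hour trade count and taker buy/sell volume ratio,
    producing the columns merged onto the hourly klines."""
    stats = {}
    for ts, qty, is_buyer_maker in trades:
        b = stats.setdefault(hour_bucket(ts), {"trades": 0, "buy": 0.0, "sell": 0.0})
        b["trades"] += 1
        if is_buyer_maker:
            b["sell"] += qty  # taker sold
        else:
            b["buy"] += qty   # taker bought
    return {
        h: {
            "trades_at_current_ts": s["trades"],
            "buy_sell_ratio_at_current_ts": (
                s["buy"] / s["sell"] if s["sell"] else float("inf")
            ),
        }
        for h, s in stats.items()
    }

print(merge_stats(agg_trades))
```

The resulting dictionary, keyed by hour start, can then be joined onto the kline rows on the open-time column.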
- Alpha implementation & tuning (some of the alphas were inspired by TradingView)
- Alpha training
  - Use different mark_outs to train alphas on the core models (parallel training available). Available models:
    - Lasso regression
    - OLS & WLS
    - Transformer
    - Random forest
    - LSTM
    - GRU
    - CNN
  - The mark_outs (see here for the definition) include:
    - previous 5, 10, 20, 40, 60, 100 bar and current bar diff
    - time-weighted average price diff (tWap)
    - volume-weighted average price diff (vWap)
    - updating…
  - Models are defined in a class `MyModel` (GPU available)
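The mark_outs listed above can be sketched in plain Python. This is an illustration only: the sample series, the window choice, and the function names are assumptions, not the repo's exact definition.

```python
# Hypothetical hourly close/volume series.
closes = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
volumes = [10.0, 20.0, 10.0, 40.0, 10.0, 10.0]

def bar_diff(closes, n):
    """Diff between the current bar and the bar n steps back."""
    return closes[-1] - closes[-1 - n]

def twap(closes):
    """Time-weighted average price: each bar weighted equally."""
    return sum(closes) / len(closes)

def vwap(closes, volumes):
    """Volume-weighted average price over the window."""
    return sum(c * v for c, v in zip(closes, volumes)) / sum(volumes)

print(bar_diff(closes, 5))                        # previous-5-bar diff: 5.0
print(closes[-1] - twap(closes))                  # tWap diff mark_out: 2.5
print(closes[-1] - vwap(closes, volumes))         # vWap diff mark_out: 2.5
```

Each of these quantities, computed per bar, becomes one training target (mark_out) for the models above.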
- BackTesting
  - In this file, a customized broker class called `MyBroker` extends the `bt.brokers.BackBroker` class from the Backtrader library and is used in the main file `run_strategy.py`. The purpose of this custom broker class is to simulate slippage in the backtest strategy.
  - The `MyBroker` class has the following attributes:
    - `params`: a parameter tuple that can be passed during initialization
    - `_last_prices`: a dictionary storing the last price for each asset
    - `slippage`: the amount of slippage to simulate (in dollars or as a percentage)
    - `init_cash`: the initial cash amount for the broker
  - Several methods are overridden:
    - the `start` method, to set the initial cash amount
    - the `buy` method, to add slippage to the buy price and execute the order
    - the `sell` method, to subtract slippage from the sell price and execute the order
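The full `MyBroker` lives in the repo; the price adjustment its overridden `buy`/`sell` methods describe can be sketched as below. Percentage-style slippage and the function name are assumptions for illustration.

```python
def apply_slippage(price, slippage, is_buy):
    """Worsen the fill price by a fractional slippage (e.g. 0.001 = 10 bps):
    buys fill higher, sells fill lower, mirroring the overridden
    buy/sell methods described above."""
    return price * (1 + slippage) if is_buy else price * (1 - slippage)

print(apply_slippage(100.0, 0.001, True))   # buy fill adjusted upward
print(apply_slippage(100.0, 0.001, False))  # sell fill adjusted downward
```

A dollar-style variant would add or subtract a fixed amount instead of scaling; the README leaves both options open.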
  - To run the strategy, use `python run_strategy.py`
  - Strategies available:
    - dual_thrust
    - rbreaker
    - SmaCross
  - Follow the template `./backtrader/strategy/template.py` to customize your strategy
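To illustrate the SmaCross idea, here is a dependency-free sketch of the crossover signal. The repo's strategies are backtrader `Strategy` classes following the template above; the function name and window sizes here are assumptions.

```python
def sma(xs, n):
    """Simple moving average over the last n values."""
    return sum(xs[-n:]) / n

def sma_cross_signal(closes, fast=2, slow=4):
    """+1 when the fast SMA crosses above the slow SMA on the last bar,
    -1 on the opposite cross, 0 otherwise."""
    if len(closes) <= slow:
        return 0
    prev_fast, prev_slow = sma(closes[:-1], fast), sma(closes[:-1], slow)
    cur_fast, cur_slow = sma(closes, fast), sma(closes, slow)
    if prev_fast <= prev_slow and cur_fast > cur_slow:
        return 1   # golden cross: go long
    if prev_fast >= prev_slow and cur_fast < cur_slow:
        return -1  # death cross: go short / exit
    return 0

print(sma_cross_signal([10, 9, 8, 7, 10, 13]))  # fast SMA crosses above: 1
```

In a backtrader strategy, the same comparison would live in `next()` using `bt.ind.SMA` lines instead of raw lists.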
- Paper trading
  - see `./paper_trading/README.md` for instructions
Analytics
Since I didn't test on the whole dataset, I will hold off on a fuller analysis at this point. However, while testing on the 2023 data according to the protocols, several problems occurred repeatedly, so I would like to give some conclusions here:
- In the modeling part, Lasso, RandomForest, and CNN are not usable for either the regression task or the classification task, because they require longer run time while giving bad predictions. The Transformer is good but seems costly for numerical data. What's more, the Transformer-based model needs to be retrained on a regular trading period (normally a week or a month). It is better suited to NLP datasets; I have over 2 years of experience using Transformers to generate sentiment indicators and topic indicators, so please let me know if you are interested.
- Some of the links and modules are outdated. Please refer to `./paper_trading/README.md` for more information and better results.
- To run the project, you may need to link to your cloud database or drive. Loading all models and data onto the GPU via `torch.cuda` might be a solution. Applying for an API on Google Cloud is the best way to train and upload results at the same time. Refer here for more details.