Adds 323 Bitcoin ARIMA #326

Open
wants to merge 3 commits into
base: master
18 changes: 9 additions & 9 deletions 04 Strategy Library/00 Strategy Library/01 Strategy Library.php
Original file line number Diff line number Diff line change
@@ -634,15 +634,6 @@
'description' => 'We apply Simple Moving Averages to manage the risk of holding leveraged ETFs in an attempt to beat the S&P500',
'tags' => 'Simple Moving Average, Risk Management, S&P500, ETF'
],
[
'name' => 'Leveraged ETFs with Systematic Risk Management',
'link' => 'leveraged-etfs-with-systematic-risk-management',
'sources' => [
'The Lead-Lag Report' => 'https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2741701'
],
'description' => 'We apply Simple Moving Averages to manage risk in holding leveraged ETFs in an attempt to by the S&P500',
'tags' => 'Simple Moving Average, Risk Management, S&P500, ETF'
],
[
'name' => 'Ichimoku Clouds in the Energy Sector',
'link' => 'strategy-library/ichimoku-clouds-in-the-energy-sector',
@@ -678,6 +669,15 @@
],
'description' => "Mathematically Deriving the Optimal Entry and Liquidation Values of a Pairs Trading Process",
'tags'=>'Pairs Trading, Ornstein-Uhlenbeck Process, Optimal Stopping'
],
[
'name' => 'Forecasting Bitcoin Prices with ARIMA Models',
'link' => 'strategy-library/bitcoin-arima-forecasting',
'sources' => [
'arXiv' => 'https://arxiv.org/pdf/1904.05315.pdf'
],
'description' => "We attempt to forecast the prices of Bitcoin using an ARIMA model",
'tags'=>'Bitcoin, ARIMA, Forecasting, Time-Series Models'
]
];

@@ -0,0 +1,3 @@
<p>
In this tutorial, we attempt to forecast Bitcoin prices with an ARIMA model.
</p>
@@ -0,0 +1,3 @@
<p>
In recent years, increasingly complex models, most notably deep learning models, have been developed to predict financial time series. In this tutorial, however, we go back to a classic time-series model, ARIMA, in an attempt to forecast Bitcoin prices.
</p>
89 changes: 89 additions & 0 deletions 04 Strategy Library/1032 Bitcoin ARIMA Forecasting/03 Method.html
@@ -0,0 +1,89 @@
<p>
An ARIMA model requires a stationary time series, that is, one whose mean and variance stay roughly constant over time.
However, Bitcoin went from a few hundred dollars to tens of thousands of dollars in just a few years, so
Bitcoin prices clearly aren&apos;t stationary. To transform the price data into a stationary series, we take the first-order
difference of the log of the prices. Given X as the Bitcoin closing prices, we use the following code:
</p>

<div class="section-example-container">
<pre class="python">
import numpy as np
X = np.diff(np.log(X))
</pre>
</div>
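<p>To make the transformation concrete, here is a minimal pure-Python sketch of the same log-difference step and its inverse. The helper names <strong>log_diff</strong> and <strong>undo_log_diff</strong> and the sample prices are made up for illustration; they are not part of the algorithm&apos;s code.</p>

```python
import math


def log_diff(prices):
    """First-order difference of the log prices (the log returns)."""
    logs = [math.log(p) for p in prices]
    return [b - a for a, b in zip(logs, logs[1:])]


def undo_log_diff(first_price, diffs):
    """Rebuild the price path from the first price and the log differences."""
    prices = [first_price]
    for d in diffs:
        # exp() undoes the log, the running product undoes the differencing
        prices.append(prices[-1] * math.exp(d))
    return prices


prices = [100.0, 110.0, 99.0, 120.0]  # made-up closing prices
returns = log_diff(prices)            # one element shorter than prices
rebuilt = undo_log_diff(prices[0], returns)
```

<p>The transformed series has one fewer point than the price series, and applying the inverse to it recovers the original prices up to floating-point error.</p>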

<p>Then, to check that the transformed data is stationary, we pass it into the following function:</p>

<div class="section-example-container">
<pre class="python">
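from statsmodels.tsa.stattools import adfuller

def __is_stationary(self, X, significance_level=.05):
    # Augmented Dickey-Fuller test; index 1 of the result is the p-value
    result = adfuller(X)
    p_value = result[1]
    # a small p-value rejects the unit-root (non-stationary) null hypothesis
    return p_value < significance_level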
</pre>
</div>

<p>After taking the most recent 400 points of Bitcoin closing prices and performing the transformation, the Augmented Dickey-Fuller test returns a p-value of 2.5e-16, far below our significance level of .05, so we reject the hypothesis that the transformed series is non-stationary. Since we only want to use stationary data with our ARIMA model, our algorithm stops trading when the recent historical data is no longer stationary after the transformation.</p>
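<p>For reference, the test behind <strong>adfuller</strong> can be written as the regression below. This is the standard textbook form with a constant and no trend, which is the default setting of <strong>adfuller</strong>; the symbols are conventional notation and do not appear in the algorithm&apos;s code.</p>

```latex
\Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t,
\qquad H_0:\ \gamma = 0 \quad \text{(unit root, non-stationary)}, \qquad H_1:\ \gamma < 0 .
```

<p>A p-value below the significance level rejects the null hypothesis of a unit root, which is why the function returns True in that case.</p>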

<p>We then grid search over different ARIMA orders to find the model that minimizes the Mean Squared Error (MSE) on unseen data. The ARIMA order is represented by (p, d, q), where p is the number of past values used for Auto-Regression (hence AR in ARIMA), d is the degree of differencing (or the order of Integration, hence I in ARIMA), and q is the number of past forecast errors accounted for in future predictions (which is a Moving Average model, hence MA in ARIMA). The possible p and q values range from 0 to 5 inclusive, while the d term is kept at 1. We iterate over each (p, d, q) combination and keep the one with the lowest MSE. Before we show the code for the grid search, let&rsquo;s see how we can evaluate an ARIMA model given a single order:</p>

<div class="section-example-container">
<pre class="python">
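from collections import deque
import numpy as np
from sklearn import metrics
# assumed import: the legacy ARIMA class (statsmodels < 0.13), which matches fit(disp=0) below
from statsmodels.tsa.arima_model import ARIMA

def evaluate_arima_model(X, arima_order, oos_size=20):
    # hold out the last oos_size points as out-of-sample data
    train_data, oos_data = X[:-oos_size], X[-oos_size:]
    # fixed-length window: appending a new point drops the oldest one
    history = deque(train_data, maxlen=len(train_data))

    predictions = []
    for i in range(len(oos_data)):
        model = ARIMA(np.array(history), order=arima_order)
        model_fit = model.fit(disp=0)
        # one-step-ahead forecast
        y_hat = model_fit.forecast()[0]
        predictions.append(y_hat)
        history.append(oos_data[i])
    return metrics.mean_squared_error(oos_data, predictions)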
</pre>
</div>

<p>We essentially have a rolling window that starts on the training portion of the data, which is everything except the last <strong>oos_size</strong> points (20 by default). We fit an ARIMA model with the given (p, d, q) order to the points in the window, forecast one time-step into the future, and record the forecast. Then we shift the window one step to the right by appending the actual value, and repeat. Once we&rsquo;ve forecasted all of the held-out values, we compute the MSE by passing the actual and forecasted values into sklearn&rsquo;s <strong>mean_squared_error</strong> function. Now for the grid search:</p>

<div class="section-example-container">
<pre class="python">
def Train(X, p_values=range(6), d_values=[1], q_values=range(6)):
    # log-difference transform and stationarity check described earlier
    data = transform_data(X)

    if not is_stationary(data):
        return None

    best_score, best_pdq = float("inf"), None
    for p in p_values:
        for d in d_values:
            for q in q_values:
                order = (p, d, q)
                try:
                    mse = evaluate_arima_model(data, order)
                    if mse < best_score:
                        best_score, best_pdq = mse, order
                except Exception:
                    # skip orders for which the model fails to fit
                    continue

    return best_pdq
</pre>
</div>

<p>As described earlier, we iterate over every (p, d, q) combination, evaluate each one, and keep the order with the lowest MSE.</p>
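<p>The walk-forward evaluation and grid search can be tried without any fitting library by swapping the ARIMA fit for a stand-in forecaster. The sketch below is an illustration under that assumption only: <strong>naive_forecast</strong> is a hypothetical placeholder (the mean of the last p points) and is not an ARIMA model, and the input series is made up.</p>

```python
from collections import deque
from itertools import product


def naive_forecast(history, order):
    """Stand-in for an ARIMA fit (NOT ARIMA): mean of the last p points,
    or of the whole window when p == 0. Only p is used; d and q are ignored."""
    p, _d, _q = order
    window = list(history)[-p:] if p > 0 else list(history)
    return sum(window) / len(window)


def walk_forward_mse(series, order, oos_size=20, forecast=naive_forecast):
    """One-step-ahead forecasts over the last oos_size points, scored by MSE."""
    train, oos = series[:-oos_size], series[-oos_size:]
    history = deque(train, maxlen=len(train))  # fixed-length rolling window
    squared_errors = []
    for actual in oos:
        y_hat = forecast(history, order)
        squared_errors.append((actual - y_hat) ** 2)
        history.append(actual)  # slide the window one step to the right
    return sum(squared_errors) / len(squared_errors)


def grid_search(series, p_values=range(6), d_values=(1,), q_values=range(6)):
    """Return the (p, d, q) order with the lowest walk-forward MSE."""
    best_score, best_order = float("inf"), None
    for order in product(p_values, d_values, q_values):
        mse = walk_forward_mse(series, order)
        if mse < best_score:
            best_score, best_order = mse, order
    return best_order, best_score


# made-up alternating "returns" series of 70 points
series = [((-1) ** i) * 0.01 for i in range(70)]
best_order, best_score = grid_search(series)
```

<p>Passing the forecaster in as a parameter keeps the evaluation loop independent of the model, so a real ARIMA fit could be dropped in without changing the scoring or the search.</p>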

<p>
The trading logic is quite simple. At the start of each month, we take the past 70 points of data to find the ARIMA
order that minimizes the out-of-sample MSE. Every day, we fit an ARIMA model with that month&rsquo;s order to the 50 most
recent points of historical data and forecast one value into the future. Because our data was
transformed through a logarithm and differencing, we need to undo the transformation on the forecasted value
first; this can be seen in the <strong>__undo_forecast_transform</strong> method in <strong>Model.py</strong>
under the <strong>Algorithm</strong> section. We then calculate the percent change in price using
<strong>forecasted price / current price - 1</strong>. Finally, we emit an Insight in the direction of the percent
change, with the absolute value of the percent change as the Insight&rsquo;s weight. We use the
<strong>InsightWeightPortfolioConstructionModel</strong> so that the weight of the Insight determines the portfolio
allocation percentage, which means larger forecasted moves receive larger allocations.
</p>
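<p>As a rough sketch of the last two steps, the code below converts a forecasted log difference back into a price and derives a direction and weight from it. It assumes the forecast is a one-step log difference as produced by the transformation above. The function names and the plain-string directions are hypothetical stand-ins; this is not the code in <strong>Model.py</strong>.</p>

```python
import math


def undo_forecast_transform(current_price, forecast_log_diff):
    """Invert the log-difference transform for a one-step-ahead forecast."""
    return current_price * math.exp(forecast_log_diff)


def insight_from_forecast(current_price, forecast_log_diff):
    """Return (direction, weight) derived from the forecasted percent change."""
    forecast_price = undo_forecast_transform(current_price, forecast_log_diff)
    pct_change = forecast_price / current_price - 1
    if pct_change > 0:
        direction = "up"
    elif pct_change < 0:
        direction = "down"
    else:
        direction = "flat"
    # weight is the absolute percent change, so larger moves get larger allocations
    return direction, abs(pct_change)


# made-up example: current price 10,000 and a forecasted log difference of log(1.02)
direction, weight = insight_from_forecast(10000.0, math.log(1.02))
```

<p>In this example the forecasted price is 2% above the current price, so the sketch yields an upward direction with a weight of roughly 0.02.</p>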

@@ -0,0 +1,6 @@
<div class="qc-embed-frame" style="display: inline-block; position: relative; width: 100%; min-height: 100px; min-width: 300px;">
<div class="qc-embed-dummy" style="padding-top: 56.25%;"></div>
<div class="qc-embed-element" style="position: absolute; top: 0; bottom: 0; left: 0; right: 0;">
<iframe class="qc-embed-backtest" height="100%" width="100%" style="border: 1px solid #ccc; padding: 0; margin: 0;" src="https://www.quantconnect.com/terminal/processCache?request=embedded_backtest_5f73c6cec527c2665a194c8fc9dd94e1.html"></iframe>
</div>
</div>
@@ -0,0 +1,7 @@
<p>The performance of the algorithm was poor. It achieved a Sharpe Ratio of only 0.262, while simply holding Bitcoin over the same period would have yielded a Sharpe Ratio of 1.688.</p>
<p>Here are some ideas for improvement:</p>
<ul>
<li>Experimenting with ARIMA variants and other time-series models (e.g., SARIMAX, ARFIMA, VAR)</li>
<li>Applying different types of transformations to achieve stationarity</li>
</ul>
<p>If you come across any interesting results, we&rsquo;d love to hear about them in the Community Forum.</p>
@@ -0,0 +1,6 @@
<ol>
<li>
Amin Azari (2019). Bitcoin Price Prediction: An ARIMA Approach. CoRR, abs/1904.05315.
<a href="https://arxiv.org/pdf/1904.05315.pdf">Online Copy</a>.
</li>
</ol>