This repository demonstrates how to model and forecast the U.S. Treasury yield curve using real market data and machine learning techniques.
Overview
The project combines two main components:
- Yield Curve Fitting with Gaussian Processes
  - Uses daily U.S. Treasury constant-maturity yields from 1 month to 30 years.
  - Applies Gaussian Process Regression (GPR) to fit a smooth yield curve.
  - Provides confidence intervals around the estimated curve.
- Forecasting with Random Forests
  - Trains a Random Forest Regressor to predict the next-day 10-year yield.
  - Uses lagged yields across multiple maturities as features.
  - Is evaluated with Mean Absolute Error (MAE).
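The curve-fitting component can be sketched roughly as follows. This is a minimal illustration, not the code in `yield_curve_ml.py`: the function name `fit_yield_curve`, the kernel choice, the log-maturity transform, and the 1.96-standard-deviation band are all assumptions, and the yields are made-up placeholders rather than market data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel


def fit_yield_curve(maturities, yields, grid):
    """Fit a GP to one day's curve; return the mean and an approximate 95% band on `grid`."""
    # Assumption: fit in log-maturity so the short tenors are not crowded together.
    X = np.log(np.asarray(maturities, dtype=float)).reshape(-1, 1)
    y = np.asarray(yields, dtype=float)

    # Smooth RBF component plus a small white-noise term (hypothetical kernel choice).
    kernel = ConstantKernel(1.0) * RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-2)
    gpr = GaussianProcessRegressor(
        kernel=kernel, normalize_y=True, n_restarts_optimizer=5, random_state=0
    )
    gpr.fit(X, y)

    X_grid = np.log(np.asarray(grid, dtype=float)).reshape(-1, 1)
    mean, std = gpr.predict(X_grid, return_std=True)
    return mean, mean - 1.96 * std, mean + 1.96 * std


# Placeholder yields in percent, NOT real market data.
maturities = [1 / 12, 0.25, 0.5, 1, 2, 3, 5, 7, 10, 20, 30]
example_yields = [5.3, 5.3, 5.2, 5.0, 4.7, 4.5, 4.4, 4.4, 4.4, 4.7, 4.6]
grid = np.geomspace(1 / 12, 30, 100)
mean, lower, upper = fit_yield_curve(maturities, example_yields, grid)
```

The `lower` and `upper` arrays are what an uncertainty band around the fitted curve would be drawn from.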
Data
- Source: Federal Reserve Economic Data (FRED)
- Series: DGS1MO, DGS3MO, DGS6MO, DGS1, DGS2, DGS3, DGS5, DGS7, DGS10, DGS20, DGS30
- Frequency: Daily (business days)
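One way these series might be loaded is sketched below, assuming the FRED reader in `pandas-datareader` is used. The helper names `maturity_in_years` and `load_yields` are hypothetical and are not taken from the script.

```python
# FRED series IDs listed above, ordered from shortest to longest maturity.
SERIES = ["DGS1MO", "DGS3MO", "DGS6MO", "DGS1", "DGS2", "DGS3",
          "DGS5", "DGS7", "DGS10", "DGS20", "DGS30"]


def maturity_in_years(series_id):
    """Convert an ID such as 'DGS3MO' or 'DGS10' to a maturity in years."""
    tenor = series_id[len("DGS"):]
    if tenor.endswith("MO"):
        return int(tenor[:-2]) / 12.0
    return float(tenor)


def load_yields(start, end):
    """Download the daily yields from FRED as a DataFrame; needs network access."""
    # Imported lazily so the helper above works without pandas-datareader installed.
    from pandas_datareader import data as pdr

    yields = pdr.DataReader(SERIES, "fred", start, end)
    # Drop dates on which every series is missing (for example, market holidays).
    return yields.dropna(how="all")
```

For example, `load_yields("2020-01-01", "2024-12-31")` would return one column per series, and `maturity_in_years` gives the x-axis value for each column when plotting or fitting a curve.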
Installation
Clone the repository, then install the dependencies:

    pip install -r requirements.txt
Dependencies include:
- pandas
- numpy
- matplotlib
- scikit-learn
- pandas-datareader
Usage
Run the main script:

    python yield_curve_ml.py

This will:
- Download yield data from FRED.
- Fit the Gaussian Process yield curve for the most recent date.
- Train a Random Forest model and forecast the next-day 10-year yield.
- Display visualizations of the curve fit and forecast performance.
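The forecasting step could look roughly like the sketch below. It is an illustration under assumptions, not the script's actual code: the helper names, the number of lags, the 80/20 chronological split, and the forest size are all hypothetical, and synthetic random-walk data stands in for the FRED yields.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error


def make_lagged_dataset(yields, target="DGS10", n_lags=3):
    """Build features from today's and earlier yields; the target is tomorrow's yield."""
    features = {
        f"{col}_lag{k}": yields[col].shift(k)
        for col in yields.columns
        for k in range(n_lags)  # lag 0 is today's value
    }
    X = pd.DataFrame(features)
    y = yields[target].shift(-1).rename("target")  # next-day value
    data = pd.concat([X, y], axis=1).dropna()
    return data.drop(columns="target"), data["target"]


def train_and_evaluate(X, y, test_frac=0.2):
    """Split chronologically (no shuffling), fit the forest, and score with MAE."""
    split = int(len(X) * (1 - test_frac))
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X.iloc[:split], y.iloc[:split])
    mae = mean_absolute_error(y.iloc[split:], model.predict(X.iloc[split:]))
    return model, mae


# Synthetic random-walk yields, NOT real market data.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=400)
fake = pd.DataFrame(
    3.0 + rng.normal(0, 0.05, size=(400, 3)).cumsum(axis=0),
    index=dates,
    columns=["DGS2", "DGS5", "DGS10"],
)
X, y = make_lagged_dataset(fake)
model, mae = train_and_evaluate(X, y)
print(f"Test MAE: {mae:.3f} percentage points")
```

Splitting by time rather than at random keeps future observations out of the training set, which matters for any next-day forecast.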
Output
- Yield curve plots with uncertainty bands.
- Forecast vs. actual plots for the 10-year yield.
- Feature importance rankings.
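A feature importance ranking of this kind can be derived from a fitted forest's `feature_importances_` attribute. The sketch below uses a hypothetical helper, `rank_feature_importances`, and toy data in which only one column drives the target; the column names are illustrative only.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def rank_feature_importances(model, feature_names):
    """Return the forest's impurity-based importances as a Series, largest first."""
    importances = pd.Series(model.feature_importances_, index=feature_names)
    return importances.sort_values(ascending=False)


# Toy data, NOT real yields: only the first column influences the target.
rng = np.random.default_rng(0)
X = pd.DataFrame(
    rng.normal(size=(300, 3)),
    columns=["DGS10_lag0", "DGS2_lag0", "DGS30_lag0"],
)
y = 2.0 * X["DGS10_lag0"] + rng.normal(0, 0.1, size=300)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
ranking = rank_feature_importances(model, X.columns)
print(ranking)
```

Impurity-based importances sum to one across features, so each value can be read as a share of the forest's total split gain.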
Possible Future Extensions
- Forecast the full yield curve using multi-output regressors.
- Apply sequential models such as LSTMs.
- Experiment with alternative curve fitting methods (Nelson–Siegel, Svensson).
References
- Federal Reserve Bank of St. Louis (FRED) Treasury Constant Maturity Rates
- Rasmussen & Williams, Gaussian Processes for Machine Learning
- Breiman, Random Forests