DML_RF

This repository contains a clean and minimal Python implementation of Double Machine Learning (DML) using Random Forest . The script performs causal inference by estimating the Average Treatment Effect (ATE) through three different methods and visualizing the bias.

Ordinary Least Squares (OLS): Baseline estimation without controlling for selection bias.
Naive DML: Splits the sample into two parts — one for estimating nuisance parameters, the other for estimating the ATE.
Cross-Fitting DML: Splits the sample into two and alternates roles between folds to estimate nuisance parameters and ATE, reducing overfitting and bias.

The goal is to compare the effectiveness of these methods with visualized plot

Key Features

🔍 Compares OLS, naive DML, and cross-fitting DML
🌲 Uses Random Forest to model nuisance components
🧪 Easy-to-modify structure with real world data
📉 Demonstrates the advantage of cross-fitting in reducing overfitting bias

Applications

Causal inference from observational data
Econometrics and policy evaluation
Research settings requiring robust ATE estimation

File Overview

DML_RF.ipynb: Contains the full pipeline for simulation, estimation, and comparison across the three ATE estimation methods.

Requirements

Python 3.8+
pandas
numpy
sklearn

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
DML_RF.html		DML_RF.html
DML_RF.ipynb		DML_RF.ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

DML_RF

Key Features

Applications

File Overview

Requirements

About

Uh oh!

Releases

Packages

Languages

gfshi/DML_RandomForest

Folders and files

Latest commit

History

Repository files navigation

DML_RF

Key Features

Applications

File Overview

Requirements

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages