Name	Name	Last commit message	Last commit date
parent directory ..
baselines	baselines
plots	plots
.gitignore	.gitignore
README.md	README.md
alphafold.ipynb	alphafold.ipynb
ballots.ipynb	ballots.ipynb
census_education.ipynb	census_education.ipynb
census_healthcare.ipynb	census_healthcare.ipynb
census_healthcare_ppboot.ipynb	census_healthcare_ppboot.ipynb
census_income.ipynb	census_income.ipynb
census_income_covshift.ipynb	census_income_covshift.ipynb
forest.ipynb	forest.ipynb
galaxies.ipynb	galaxies.ipynb
gene_expression.ipynb	gene_expression.ipynb
plankton.ipynb	plankton.ipynb
power_analysis.ipynb	power_analysis.ipynb
tree_cover_ptd.ipynb	tree_cover_ptd.ipynb
utils.py	utils.py

Name

Last commit message

Last commit date

plots

census_education.ipynb

census_healthcare.ipynb

census_healthcare_ppboot.ipynb

census_income.ipynb

census_income_covshift.ipynb

forest.ipynb

galaxies.ipynb

gene_expression.ipynb

Examples

This folder contains a suite of examples on real data:

Proteomic Analysis with AlphaFold (alphafold.ipynb)
Galaxy Classification with Computer Vision (galaxies.ipynb)
Gene Expression Analysis with Transformers (gene_expression.ipynb)
Deforestation Analysis with Computer Vision (forest.ipynb)
Health Insurance Analysis with Boosting Trees (census_healthcare.ipynb)
Income Analysis with Boosting Trees (covariate shift) (census_income_covshift.ipynb)
Plankton Counting with Computer Vision (label shift) (plankton.ipynb)
Ballot Counting with Computer Vision (ballots.ipynb)
Income Analysis with Boosting Trees (census_income.ipynb)

Each notebook runs a simulation that forms a dataframe containing confidence intervals produced by different methods (PPI, classical, imputation), for different values of labeled data set size n and over different trials. Based on the computed dataframe, the notebook plots:

five randomly chosen intervals for PPI and the classical method, and the imputed interval;
the average interval width for PPI and the classical method, together with a scatterplot of the widths from the five random draws.

Each notebook also compares PPI and classical inference in terms of the number of labeled examples needed to reject a natural null hypothesis in the analyzed problem.

The notebook tree_cover_ptd.ipynb shows how to use the Predict-Then-Debias (PTD) estimator from Kluger et al. (2025), 'Prediction-Powered Inference with Imputed Covariates and Nonuniform Sampling,' https://arxiv.org/abs/2501.18577.

Finally, there is a notebook that shows how to compute the optimal n and N given a cost constraint (power_analysis.ipynb).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Examples

FilesExpand file tree

examples

Directory actions

More options

Directory actions

More options

Latest commit

History

examples

Folders and files

parent directory

README.md

Examples