PanelSplit: a tool for panel data analysis

PanelSplit is a Python package designed to facilitate time series cross-validation when working with multiple entities (aka panel data). This tool is useful for handling panel data in various stages throughout the data pipeline, including feature engineering, hyper-parameter tuning, and model estimation.

Installation

You can install PanelSplit using pip:

pip install panelsplit

Documentation

To read the documentation, visit here.

Example Usage

import pandas as pd
from panelsplit import PanelSplit

# Generate example data
num_countries = 2
years = range(2001, 2004)
num_years = len(years)

data_dict = {
    'country_id': [c for c in range(1, num_countries + 1) for _ in years],
    'year': [year for _ in range(num_countries) for year in years],
    'y': np.random.normal(0, 1, num_countries * num_years),
    'x1': np.random.normal(0, 1, num_countries * num_years),
    'x2': np.random.normal(0, 1, num_countries * num_years)
}

panel_data = pd.DataFrame(data_dict)
panel_split = PanelSplit(periods = panel_data.year, n_splits =2)

splits = panel_split.split()

for train_idx, test_idx in splits:
    print("Train:"); display(panel_data.loc[train_idx]) 
    print("Test:"); display(panel_data.loc[test_idx])

For more examples and detailed usage instructions, refer to the examples directory in this repository. Also feel free to check out an article I wrote about PanelSplit.

Background

Work on panelsplit started at EconAI in December 2023 and has been under active development since then.

Contributing

Contributions to PanelSplit are welcome! If you encounter any issues or have suggestions for improvements, please feel free to open an issue or submit a pull request on GitHub.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 274 Commits
.github/workflows		.github/workflows
examples		examples
panelsplit		panelsplit
tests		tests
.gitignore		.gitignore
CITATION.cff		CITATION.cff
CNAME		CNAME
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PanelSplit: a tool for panel data analysis

Installation

Documentation

Example Usage

Background

Contributing

License

About

Releases 7

Packages

Contributors 2

Languages

License

4Freye/panelsplit

Folders and files

Latest commit

History

Repository files navigation

PanelSplit: a tool for panel data analysis

Installation

Documentation

Example Usage

Background

Contributing

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 7

Packages 0

Contributors 2

Languages

Packages