Scikit-longitudinal

A specialised Python library for longitudinal data analysis built on Scikit-learn

💡 About The Project

Scikit-longitudinal (Sklong) is a machine learning library designed to analyse longitudinal data (currently focused on classification tasks). It offers tools and models for processing, analysing, and predicting longitudinal data, with a user-friendly interface that integrates with the Scikit-learn ecosystem.

Wait, what is Longitudinal Data, in layman's terms?

Longitudinal data is a "time-lapse" of the same subject, entity, or group tracked over successive time periods, similar to checking in on patients to see how they change. For instance, doctors may monitor a patient's blood pressure, weight, and cholesterol every year for a decade to identify health trends or risk factors. Such data is more useful for predicting future outcomes than a one-time survey because it captures how measurements evolve, the patterns they follow, and potential cause-and-effect relationships over time.
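To make the idea concrete, here is a minimal sketch in plain Python of a "wide" longitudinal table, where each row is one patient and each measurement is repeated once per wave. The column names (`blood_pressure_wave_1` and so on), the helper functions, and the values are made up for illustration; they are not taken from a real dataset or from the library.

```python
# Toy wide-format longitudinal records: one row per patient, one column per
# (measurement, wave) pair. All names and values here are illustrative only.
patients = [
    {"id": "p1", "blood_pressure_wave_1": 120, "blood_pressure_wave_2": 128, "blood_pressure_wave_3": 135},
    {"id": "p2", "blood_pressure_wave_1": 118, "blood_pressure_wave_2": 117, "blood_pressure_wave_3": 119},
]


def trajectory(record, measurement):
    """Return the values of one measurement ordered by wave number."""
    prefix = f"{measurement}_wave_"
    waves = sorted(
        (int(name[len(prefix):]), value)
        for name, value in record.items()
        if name.startswith(prefix)
    )
    return [value for _, value in waves]


def change_over_time(record, measurement):
    """Difference between the last and the first recorded wave."""
    values = trajectory(record, measurement)
    return values[-1] - values[0]


for patient in patients:
    print(patient["id"], trajectory(patient, "blood_pressure"), change_over_time(patient, "blood_pressure"))
```

A one-time survey would hold only one of these columns per patient, so the upward trend for `p1` would not be visible.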

Not enough?

For more scientific details, you can refer to our paper published in the Journal of Open Source Software (JOSS).
For more technical details, visit the official documentation.

🛠️ Installation

Note

Using Jupyter Notebook, Marimo, Google Colab, or JupyterLab? Head to the Getting Started section of the documentation, where we explain it all! 🎉 Also note that Scikit-longitudinal works on Python 3.10 through 3.13.

To install Scikit-longitudinal:

✅ Install the latest version:

pip install Scikit-longitudinal

To install a specific version:

pip install Scikit-longitudinal==0.1.0

Need Ray-backed parallelism? Install the optional extra:

pip install "Scikit-longitudinal[parallelisation]"

If this extra is missing, parallel features automatically prompt you to install it.
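Before installing, it can help to confirm that the environment matches the requirements stated above. The following is a minimal sketch, not part of Scikit-longitudinal: the helper names are hypothetical, and it assumes the Ray-backed extra makes a module named `ray` importable.

```python
import importlib.util
import sys

# Supported range as stated in the note above: Python 3.10 up to 3.13.
MIN_PYTHON = (3, 10)
MAX_PYTHON = (3, 13)


def python_supported(version=None):
    """Return True if (major, minor) falls inside the supported range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_PYTHON <= (major, minor) <= MAX_PYTHON


def ray_available():
    """Return True if a module named `ray` can be found (assumed module name)."""
    return importlib.util.find_spec("ray") is not None


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    print("Ray importable:", ray_available())
```

If `ray_available()` returns `False`, the parallel features would presumably need the optional extra shown above.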

🚀 Getting Started

Here's how to analyse longitudinal data with Scikit-longitudinal:

from sklearn.metrics import classification_report

from scikit_longitudinal.data_preparation import LongitudinalDataset
from scikit_longitudinal.estimators.ensemble.lexicographical.lexico_gradient_boosting import LexicoGradientBoostingClassifier

dataset = LongitudinalDataset('./stroke.csv')  # Note: this is a fictional dataset. Use yours!
dataset.load_data_target_train_test_split(
  target_column="class_stroke_wave_4",
)

# Pre-set or manually set your temporal dependencies
dataset.setup_features_group(input_data="elsa")

model = LexicoGradientBoostingClassifier(
  features_group=dataset.feature_groups(),
  threshold_gain=0.00015  # Refer to the API for more hyper-parameters and their meaning
)

model.fit(dataset.X_train, dataset.y_train)
y_pred = model.predict(dataset.X_test)

# Classification report
print(classification_report(dataset.y_test, y_pred))
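The `features_group` argument carries the temporal structure: which columns are the same attribute measured at different waves. The sketch below is a hypothetical helper, not part of Scikit-longitudinal, that derives such a grouping from column names. It assumes columns follow a `<attribute>_wave_<n>` convention (as the target column above does) and that a grouping is a list of lists of column indices ordered by wave; check the API documentation for the exact format the library expects.

```python
import re
from collections import defaultdict

# Assumed naming convention: "<attribute>_wave_<number>".
WAVE_PATTERN = re.compile(r"^(?P<attribute>.+)_wave_(?P<wave>\d+)$")


def group_columns_by_wave(columns):
    """Group column indices by attribute, ordered from earliest to latest wave.

    Hypothetical helper for illustration. Returns (groups, non_longitudinal),
    where `groups` is a list of index lists and `non_longitudinal` holds the
    indices of columns that do not match the naming convention.
    """
    by_attribute = defaultdict(list)
    non_longitudinal = []
    for index, name in enumerate(columns):
        match = WAVE_PATTERN.match(name)
        if match is None:
            non_longitudinal.append(index)
        else:
            by_attribute[match["attribute"]].append((int(match["wave"]), index))
    groups = [[index for _, index in sorted(pairs)] for pairs in by_attribute.values()]
    return groups, non_longitudinal


# Illustrative column names only.
columns = ["age", "smoke_wave_1", "cholesterol_wave_1", "smoke_wave_2", "cholesterol_wave_2"]
groups, static = group_columns_by_wave(columns)
print(groups)  # [[1, 3], [2, 4]]
print(static)  # [0]
```

Here `smoke` occupies columns 1 and 3 and `cholesterol` columns 2 and 4, while `age` is treated as a non-longitudinal column.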

📝 How to Cite

If you use Sklong in your research, please cite our paper:

@article{Provost2025,
    doi = {10.21105/joss.08481},
    url = {https://doi.org/10.21105/joss.08481},
    year = {2025},
    publisher = {The Open Journal},
    volume = {10},
    number = {112},
    pages = {8481},
    author = {Provost, Simon and Freitas, Alex A.},
    title = {Scikit-Longitudinal: A Machine Learning Library for Longitudinal Classification in Python},
    journal = {Journal of Open Source Software}
}

🔐 License

Scikit-longitudinal is licensed under the MIT License.
