Refactor temporalscope #17

Merged 8 commits on Sep 21, 2024
4 changes: 4 additions & 0 deletions .github/workflows/license-compliance.yml
@@ -17,6 +17,9 @@ jobs:
     steps:
       - name: Checkout Code
         uses: actions/checkout@v4
+        with:
+          fetch-depth: 0 # Fetch full history to ensure we're on a branch
+          ref: ${{ github.head_ref }} # Ensure the branch is checked out

       - name: Fix License Header
         uses: apache/skywalking-eyes/header@v0.6.0
@@ -31,6 +34,7 @@ jobs:
           author_name: License Bot
           author_email: license_bot@github.com
           message: 'chore: automatic application of license header'
+          push: true # Ensure the changes are pushed back to the branch

   check_dependencies:
     runs-on: ubuntu-latest
7 changes: 7 additions & 0 deletions .pre-commit-config.yaml
@@ -25,14 +25,21 @@ repos:
    rev: v0.6.5
    hooks:
      - id: ruff
        # Exclude tests and tutorials
        exclude: "^(test/|tutorial_notebooks/)"
        # No args needed, uses pyproject.toml settings
      - id: ruff-format
        args: ["--line-length=120"]
        # No need for --ignore options here, as ruff-format is for applying automatic fixes.

  - repo: https://github.com/codespell-project/codespell
    rev: v2.3.0
    hooks:
      - id: codespell
        additional_dependencies:
          - tomli
        args: ["--ignore-words-list=Nam"]


  - repo: https://github.com/rhysd/actionlint
    rev: v1.7.1
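The `exclude` pattern added to the ruff hook is a regular expression, and pre-commit applies such patterns with Python's `re.search` against repo-relative paths. The following is a minimal sketch of what that pattern does and does not skip; the file paths are hypothetical and only serve as illustration.

```python
import re

# Pattern copied from the `exclude` key of the ruff hook in the hunk above.
EXCLUDE = re.compile(r"^(test/|tutorial_notebooks/)")


def is_excluded(path: str) -> bool:
    """Return True if a repo-relative path matches the exclude pattern."""
    return EXCLUDE.search(path) is not None


# Hypothetical paths, not taken from the repository.
candidates = [
    "src/temporalscope/example.py",
    "test/unit/test_loader.py",
    "tutorial_notebooks/intro.ipynb",
    "src/test/helpers.py",
]

# Keep only the files the hook would still lint.
linted = [p for p in candidates if not is_excluded(p)]
print(linted)
```

Because the pattern is anchored with `^`, only top-level `test/` and `tutorial_notebooks/` directories are skipped; a nested directory such as `src/test/` would still be linted.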
48 changes: 40 additions & 8 deletions README.md
@@ -29,14 +29,46 @@
 ---
 <!-- SPHINX-START -->

-| | |
-| --- | --- |
-| Compatibility | ![Python Version](https://img.shields.io/badge/python-3.10%2B-blue) ![Linux Compatible](https://img.shields.io/badge/OS-Linux-blue) |
-| License | ![License](https://img.shields.io/badge/License-Apache%202.0-green) |
-| Code Quality | [![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://docs.astral.sh/ruff/) ![Checked with mypy](https://www.mypy-lang.org/static/mypy_badge.svg)|
-| Build Tools | [![Hatch project](https://img.shields.io/badge/%F0%9F%A5%9A-Hatch-4051b5.svg)](https://hatch.pypa.io/latest/) |
-| CI/CD | [![pre-commit.ci status](https://results.pre-commit.ci/badge/github/philip-ndikum/TemporalScope/main.svg)](https://results.pre-commit.ci/latest/github/philip-ndikum/TemporalScope/main) [![codecov](https://codecov.io/gh/philip-ndikum/TemporalScope/branch/main/graph/badge.svg)](https://codecov.io/gh/philip-ndikum/TemporalScope)|
-| Security | [![OpenSSF Best Practices](https://www.bestpractices.dev/projects/9424/badge)](https://www.bestpractices.dev/projects/9424) [![Security: Bandit](https://img.shields.io/badge/security-bandit-yellow.svg)](https://github.com/PyCQA/bandit)|
+<div align="center">
+<table>
+<thead>
+<tr>
+<th>Compatibility</th>
+<th>License</th>
+<th>Code Quality</th>
+<th>Build Tools</th>
+<th>CI/CD</th>
+<th>Security</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>
+<img src="https://img.shields.io/badge/python-3.10%2B-blue" alt="Python Version"><br>
+<img src="https://img.shields.io/badge/OS-Linux-blue" alt="Linux Compatible">
+</td>
+<td>
+<img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
+</td>
+<td>
+<a href="https://docs.astral.sh/ruff/"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json" alt="Ruff"></a><br>
+<img src="https://www.mypy-lang.org/static/mypy_badge.svg" alt="Checked with mypy">
+</td>
+<td>
+<a href="https://hatch.pypa.io/latest/"><img src="https://img.shields.io/badge/%F0%9F%A5%9A-Hatch-4051b5.svg" alt="Hatch project"></a>
+</td>
+<td>
+<a href="https://results.pre-commit.ci/latest/github/philip-ndikum/TemporalScope/main"><img src="https://results.pre-commit.ci/badge/github/philip-ndikum/TemporalScope/main.svg" alt="pre-commit.ci status"></a><br>
+<a href="https://codecov.io/gh/philip-ndikum/TemporalScope"><img src="https://codecov.io/gh/philip-ndikum/TemporalScope/branch/main/graph/badge.svg" alt="codecov"></a>
+</td>
+<td>
+<a href="https://www.bestpractices.dev/projects/9424"><img src="https://www.bestpractices.dev/projects/9424/badge" alt="OpenSSF Best Practices"></a><br>
+<a href="https://github.com/PyCQA/bandit"><img src="https://img.shields.io/badge/security-bandit-yellow.svg" alt="Security: Bandit"></a>
+</td>
+</tr>
+</tbody>
+</table>
+</div>

 ---
 **TemporalScope** is an open-source Python package designed to bridge the gap between scientific research and practical industry applications for analyzing the temporal dynamics of feature importance in AI & ML time series models. Developed in alignment with Linux Foundation standards and licensed under Apache 2.0, it builds on tools such as Boruta-SHAP and SHAP, using modern window partitioning algorithms to tackle challenges like non-stationarity and concept drift. The tool is flexible and extensible, allowing for bespoke enhancements and algorithms, and supports frameworks like Pandas, Polars, and Modin. Additionally, the optional *Clara LLM* modules (etymology from the word _Clarity_) are intended to serve as a model-validation tool to support explainability efforts (XAI). **Note**: TemporalScope is currently in **beta and pre-release** phase so some installation methods may not work as expected on all platforms. Please check the `CONTRIBUTIONS.md` for the full roadmap.
29 changes: 17 additions & 12 deletions .github/SCIENTIFIC_LITERATURE.md → SCIENTIFIC_LITERATURE.md
@@ -1,15 +1,20 @@
-### SCIENTIFIC_LITERATURE.md
-
-This document lists key literature that has informed the development of this package. Please note that this is not a conclusive list but highlights the most relevant works.
-
-| **Category** | **Title** | **Authors** | **Publication** | **Summary** |
-|--------------|-----------|-------------|-----------------|-------------|
-| **Regulatory Literature** | [Machine learning algorithms for financial asset price forecasting](https://arxiv.org/abs/2004.01504) | Ndikum, P. | arXiv preprint, 2020 | Discusses the application of machine learning algorithms for forecasting financial asset prices, with implications for regulatory frameworks. |
-| **Regulatory Literature** | [Advancing Investment Frontiers: Industry-grade Deep Reinforcement Learning for Portfolio Optimization](https://arxiv.org/abs/2403.07916) | Ndikum, P., & Ndikum, S. | arXiv preprint, 2024 | Explores deep reinforcement learning approaches for portfolio optimization, emphasizing industry-grade applications and regulatory considerations. |
-| **Scientific Literature** | [SHAP-based insights for aerospace PHM: Temporal feature importance, dependencies, robustness, and interaction analysis](https://www.sciencedirect.com/science/article/pii/S2590123024000872) | Alomari, Y., & Andó, M. | Results in Engineering, 2024 | This paper explores SHAP-based methods for analyzing temporal feature importance in aerospace predictive health management. |
-| **Scientific Literature** | [Feature importance explanations for temporal black-box models](https://arxiv.org/pdf/2102.11934) | Sood, A., & Craven, M. | AAAI Conference on Artificial Intelligence, 2022 | Introduces the TIME framework for explaining temporal black-box models using feature importance. |
-| **Scientific Literature** | [WindowSHAP: An efficient framework for explaining time-series classifiers based on Shapley values](https://doi.org/10.1016/j.jbi.2023.104438) | Nayebi, A., Tipirneni, S., Reddy, C. K., Foreman, B., & Subbian, V. | Journal of Biomedical Informatics, 2023 | Proposes the WindowSHAP framework to explain time-series classifiers, improving both computational efficiency and explanation quality. |
-| **Scientific Literature** | [The sliding window and SHAP theory—an improved system with a long short-term memory network model for state of charge prediction in electric vehicle application](https://doi.org/10.3390/en14123692) | Gu, X., See, K. W., Wang, Y., Zhao, L., & Pu, W. | Energies, 2021 | Combines sliding window and SHAP theories to enhance LSTM-based SOC prediction models for electric vehicles. |
+## Engineering Design
+
+This document lists key literature that has informed the development of this package. Please note that this is not a conclusive list but highlights the most relevant works. Our design is explicitly built for flexibility, unlike other time series machine learning and deep learning packages that often enforce rigid preprocessing constraints. We intentionally adopt familiar software engineering patterns, inspired by scikit-learn, to provide a modular and adaptable framework. The only assumption we impose is that features must be organized in a context window prior to the target variable. This allows users to focus on their core applications while ensuring compatibility with SHAP and other explainability methods.
+
+
+| **Category** | **Title** | **Authors** | **Publication** | **Summary** |
+|-------------------------|------------------------------------------------------------------------------------------------------------------|----------------------------------------------------|---------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| **Regulatory Literature** | [Machine learning algorithms for financial asset price forecasting](https://arxiv.org/abs/2004.01504) | Ndikum, P. | arXiv preprint, 2020 | Discusses the application of machine learning algorithms for forecasting financial asset prices, with implications for regulatory frameworks. |
+| **Regulatory Literature** | [Advancing Investment Frontiers: Industry-grade Deep Reinforcement Learning for Portfolio Optimization](https://arxiv.org/abs/2403.07916) | Ndikum, P., & Ndikum, S. | arXiv preprint, 2024 | Explores deep reinforcement learning approaches for portfolio optimization, emphasizing industry-grade applications and regulatory considerations. |
+| **Scientific Literature** | [SHAP-based insights for aerospace PHM: Temporal feature importance, dependencies, robustness, and interaction analysis](https://www.sciencedirect.com/science/article/pii/S2590123024000872) | Alomari, Y., & Andó, M. | Results in Engineering, 2024 | This paper explores SHAP-based methods for analyzing temporal feature importance in aerospace predictive health management. |
+| **Scientific Literature** | [Feature importance explanations for temporal black-box models](https://arxiv.org/pdf/2102.11934) | Sood, A., & Craven, M. | AAAI Conference on Artificial Intelligence, 2022 | Introduces the TIME framework for explaining temporal black-box models using feature importance. |
+| **Scientific Literature** | [WindowSHAP: An efficient framework for explaining time-series classifiers based on Shapley values](https://doi.org/10.1016/j.jbi.2023.104438) | Nayebi, A., Tipirneni, S., Reddy, C. K., et al. | Journal of Biomedical Informatics, 2023 | Proposes the WindowSHAP framework to explain time-series classifiers, improving both computational efficiency and explanation quality. |
+| **Scientific Literature** | [The sliding window and SHAP theory—an improved system with a long short-term memory network model for state of charge prediction in electric vehicle application](https://doi.org/10.3390/en14123692) | Gu, X., See, K. W., Wang, Y., et al. | Energies, 2021 | Combines sliding window and SHAP theories to enhance LSTM-based SOC prediction models for electric vehicles. |
+| **Scientific Literature** | [Cross-Frequency Time Series Meta-Forecasting](https://arxiv.org/abs/2302.02077) | Van Ness, M., Shen, H., Wang, H., et al. | arXiv preprint, 2023 | Proposes the CFA model, capable of handling varying frequencies in time series data, supporting flexible universal model assumptions in time series forecasting. |
+| **Scientific Literature** | [Unified Training of Universal Time Series Forecasting Transformers](https://arxiv.org/abs/2402.02592) | Woo, G., Liu, C., Kumar, A., et al. | arXiv preprint, 2024 | Introduces Moirai, a transformer model that scales universally across multiple time series forecasting tasks without heavy preprocessing constraints. |
+| **Scientific Literature** | [Universal Time-Series Representation Learning: A Survey](https://arxiv.org/abs/2401.03717) | Trirat, P., Shin, Y., Kang, J., et al. | arXiv preprint, 2024 | Provides a comprehensive survey of universal models for time series, outlining how generalization across datasets is achieved with minimal assumptions. |
+

 ### Partitioning Guidelines

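The Engineering Design paragraph in this hunk states one assumption: features are organized in a context window that precedes the target variable. A minimal sketch of that layout follows. It is not TemporalScope's API; `make_windows` and its parameters are hypothetical names introduced only for this illustration.

```python
def make_windows(series, window):
    """Turn a 1-D sequence into (features, target) pairs.

    Each feature row holds `window` consecutive observations, and the
    target is the observation that immediately follows them.
    """
    if window < 1 or window >= len(series):
        raise ValueError("window must be between 1 and len(series) - 1")
    rows = []
    for end in range(window, len(series)):
        features = list(series[end - window:end])  # context window
        target = series[end]                       # value right after it
        rows.append((features, target))
    return rows


# Toy data, chosen only for illustration.
pairs = make_windows([10, 11, 13, 12, 15], window=3)
for features, target in pairs:
    print(features, "->", target)
```

Each row can then be handed to any tabular model, which is the property that keeps this layout compatible with SHAP-style explainers that expect a fixed set of feature columns.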
18 changes: 15 additions & 3 deletions codecov.yml
@@ -1,9 +1,21 @@
+# codecov.yml
+
 coverage:
   status:
     project:
       default:
-        target: auto
-        threshold: 5%
+        target: auto # Automatically adjusts based on the current project coverage
+        threshold: 80% # Allow up to an 80% drop in overall coverage during beta phase
+        # Note: This permits a large drop in overall project coverage during the fast-paced development phase.
+
     patch:
       default:
-        informational: true
+        informational: true # Informational only; won't fail the pipeline
+        target: 5% # Set a minimal target of 5% coverage for new code patches
+        threshold: 90% # Allow up to a 90% drop on new code coverage (extremely lenient)
+        # Note: This ensures that even with low coverage, the pipeline won’t block PRs, but you still get coverage insights.
+
+parsers:
+  python:
+    include:
+      - "src/temporalscope/**" # Focus coverage checks only on main source files, excluding tests and docs for now
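The comments in this hunk describe how `target` and `threshold` interact. As a simplified reading of Codecov's documented semantics (a sketch, not Codecov's implementation), a status succeeds when the new coverage is no more than `threshold` percentage points below the target, where `auto` means the base commit's coverage. The function name and the sample coverage numbers are assumptions made for this example.

```python
def status_passes(head, base, target="auto", threshold=0.0):
    """Return True if a coverage status would be reported as success.

    head, base and threshold are percentages (0 to 100). With
    target="auto" the required coverage is the base commit's coverage.
    """
    required = base if target == "auto" else float(target)
    return head >= required - threshold


# Project status: hypothetical drop from 85% to 10% coverage.
print(status_passes(head=10.0, base=85.0, target="auto", threshold=5.0))   # old 5% threshold
print(status_passes(head=10.0, base=85.0, target="auto", threshold=80.0))  # new 80% threshold

# Patch status: a 5% target with a 90% threshold accepts any patch coverage.
print(status_passes(head=0.0, base=85.0, target=5, threshold=90.0))
```

Under this reading the new settings make both checks pass in almost every case, which matches the stated intent of not blocking PRs during the beta phase.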
43 changes: 36 additions & 7 deletions pyproject.toml
@@ -64,6 +64,13 @@ dependencies = [
   "ruff",
   "jupyterlab",
   "notebook",
+  "commitizen==3.29.0",
+  "mypy", # Include dependencies for QA scripts
+  "bandit", # Include dependencies for QA scripts
+  "black", # Include dependencies for QA scripts
+  "pytest", # Include pytest for running tests
+  "pytest-cov", # Include pytest-cov for coverage if needed
+  "docformatter", # Add docformatter for docstring formatting
   "commitizen",
 ]

@@ -100,14 +107,13 @@ log_date_format = "%Y-%m-%d %H:%M:%S"
 minversion = "6.0"
 filterwarnings = "ignore"

+[tool.black]
+line-length = 120 # Set Black's line length to 120 for consistency
+
 [tool.ruff]
-extend-exclude = ["*.pyc"]
+extend-exclude = ["*.pyc", "test/*", "tutorial_notebooks/*"]
 target-version = "py310"
-line-length = 88
-
-
-[tool.ruff.format]
-docstring-code-format = true
+line-length = 120 # Consistent line length across all tools

 [tool.ruff.lint]
 select = [
@@ -133,7 +139,12 @@ select = [
   # docstring rules
   "D", # flake8-docstrings
 ]
+
 ignore = [
+  "D400", # Ignore "First line should end with a period" for docstrings.
+  "D401", # Ignore "First line should be in imperative mood" for docstrings.
+  "D415", # Ignore "First line should end with a period, question mark, or exclamation point."
+  "E501", # Ignore "Line too long" in docstrings/comments for exceeding 120 characters.
   "PERF203", # `try`-`except` within a loop incurs performance overhead
   "PERF401", # Use a list comprehension to create a transformed list
   "PLR1714", # repeated-equality-comparison
@@ -154,7 +165,7 @@ ignore = [
 "docs/conf.py" = ["A001", "D103"]

 [tool.mypy]
-files = "temporalscope"
+files = "src/temporalscope"
 python_version = "3.10"
 ignore_missing_imports = true
 warn_unreachable = true
@@ -178,6 +189,24 @@ check = "ruff check {args}"
 fix = "ruff check --fix"
 format = "ruff format {args}"
 format-check = "ruff format --check {args}"
+docformat = """
+docformatter --check --recursive --wrap-summaries 120 --wrap-descriptions 120 src/temporalscope || \
+docformatter --in-place --recursive --wrap-summaries 120 --wrap-descriptions 120 src/temporalscope
+"""
+# Automated developer Q&A script
+quality-assurance = """
+pytest &&
+docformatter --check --recursive --wrap-summaries 120 --wrap-descriptions 120 src/temporalscope || \
+docformatter --in-place --recursive --wrap-summaries 120 --wrap-descriptions 120 src/temporalscope
+ruff check src/temporalscope --output-format=full --show-files --show-fixes &&
+mypy src/temporalscope --ignore-missing-imports --show-error-codes --warn-unreachable &&
+bandit -r src/temporalscope
+"""
+generate-kernel = """
+python -m ipykernel install --user --name temporalscope-kernel --display-name "TemporalScope"
+echo "Jupyter kernel 'TemporalScope' created. You can now use it in Jupyter notebooks."
+"""
+

 [tool.commitizen]
 version = "0.1.0"
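The `docformat` script above relies on the shell idiom `check || fix`: the in-place rewrite runs only when the `--check` pass exits non-zero. The sketch below restates that control flow in Python so the behaviour is easy to see. It is not part of the PR; `run_with_fallback` is a hypothetical helper, and the two commands are stand-ins that invoke the current interpreter rather than `docformatter`.

```python
import subprocess
import sys


def run_with_fallback(check_cmd, fix_cmd, run=subprocess.run):
    """Run check_cmd; if it exits non-zero, run fix_cmd instead.

    Returns the exit code of the last command that ran, mirroring the
    exit status of the shell expression `check || fix`.
    """
    result = run(check_cmd)
    if result.returncode == 0:
        return 0
    return run(fix_cmd).returncode


# Stand-in commands (assumptions): a "check" that fails and a "fix" that succeeds.
failing_check = [sys.executable, "-c", "raise SystemExit(1)"]
passing_fix = [sys.executable, "-c", "pass"]

exit_code = run_with_fallback(failing_check, passing_fix)
print("final exit code:", exit_code)
```

One consequence of this pattern is that a failed check is masked whenever the fix succeeds, so the script reports success after reformatting rather than flagging that files were changed.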