Releases: DynamicsAndNeuralSystems/pyspi
PySPI v1.1.1
Bug Fixes
- Correction to the directionality of all time-lagged mutual information (`tlmi_`) SPIs. Previously, `tlmi` was implemented as an undirected SPI, resulting in only the bottom half of the SPI matrix being computed. The SPI has since been confirmed to be a directed measure and will now output asymmetric SPI tables.
- Replaced all instances of `np.NaN` with `np.nan` to ensure compatibility with the latest versions of NumPy.
- Updated unit tests to reflect changes to the expected output for `tlmi` SPI tables.
Affected SPIs
The following SPIs, for which only the bottom half of the output tables was previously computed (yielding a symmetric matrix), will now produce asymmetric tables due to their directionality:
- `tlmi_gaussian`
- `tlmi_kraskov_NN-4`
- `tlmi_kraskov_NN-4_DCE`
- `tlmi_kernel_W-0.25`
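As a minimal sketch of what this means in practice (assuming the standard `Calculator` workflow shown elsewhere in these notes, and that the results table is a pandas DataFrame), a `tlmi` table should no longer be symmetric:

```python
import numpy as np
from pyspi.calculator import Calculator

data = np.random.randn(3, 200)  # 3 processes, 200 observations each
calc = Calculator(dataset=data)
calc.compute()

# Directed SPI: the full (asymmetric) table is now populated
tlmi = calc.table['tlmi_gaussian'].to_numpy()
print(np.allclose(tlmi, tlmi.T, equal_nan=True))  # expected: False
```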
PySPI v1.1.0
Support for Python 3.10+
`pyspi` 1.1.0 adds support for Python versions 3.10, 3.11, and 3.12, and is therefore now compatible with Python 3.8-3.12.
Key Changes
- Lifted constraints on some dependency version requirements, allowing users on Python 3.8-3.12 to run pyspi.
- Added a `normalise` parameter to the `Calculator` to allow the user to control whether their dataset is normalised before computing SPIs. Defaults to `True`. Users who wish to skip normalisation of their dataset can pass the `normalise` flag to the `Calculator` object as follows:

```python
calc = Calculator(dataset=..., normalise=False)
```
- Added an SPI computation results summary table, which reports the number of SPIs successfully computed as well as those that return NaN outputs.
- Added reporting of the time taken to compute the SPIs.
Testing
- Added CI runners to support wider Python version coverage.
PySPI v1.0.3
SPI Reproducibility Fix
`pyspi` v1.0.3 is a patch update that addresses inconsistent results observed in several Information Theoretic and Convergent Cross-Mapping (`ccm`) SPIs when running multiple trials on the same benchmarking dataset. As of this update, all 284 SPIs should now produce identical results across multiple runs on the same dataset.
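As an illustrative check of this (a sketch assuming the standard `Calculator` workflow, and that `calc.table` is a pandas DataFrame supporting `.equals`), two runs over the same data should now match exactly:

```python
import numpy as np
from pyspi.calculator import Calculator

data = np.random.randn(5, 100)  # 5 processes, 100 observations each

calc1 = Calculator(dataset=data)
calc1.compute()

calc2 = Calculator(dataset=data)
calc2.compute()

# With fixed random seeds, repeated runs should produce identical tables
assert calc1.table.equals(calc2.table)
```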
Key Changes:
- Upgraded the jidt dependency from `v1.5` to `v1.6.1`. jidt `v1.6.1` includes a new `NOISE_SEED` property for all jidt calculators, enabling consistent results across multiple runs. For more information, see here. Since jidt is self-contained within the pyspi package, upgrading the jidt version should not introduce any breaking changes for users who have already installed pyspi.
- Added random seed support within pyspi for jidt-based SPIs. All SPIs that rely on the jidt library now utilise a fixed random seed to ensure reproducibility across runs.
- Introduced a random seed for the Convergent Cross-Mapping (`ccm`) SPI. The `ccm` SPI now uses a fixed random seed, addressing the previously observed stochasticity in its outputs.
Important Note to Users:
The addition of fixed random seeds for the affected SPIs may result in slightly different output values compared to previous versions of pyspi. This difference is a consequence of the improved consistency and reproducibility of the SPI outputs. Please keep this in mind when making exact numerical comparisons with previous versions of pyspi.
Affected SPIs:
The following SPIs, which previously produced varying outputs across multiple trials, should now yield consistent results:
- `ccm` (all 9 estimators)
- `cce_kozachenko`
- `ce_kozachenko`
- `di_kozachenko`
- `je_kozachenko`
- `si_kozachenko_k-1`
PySPI v1.0.2
New SPI - Gromov-Wasserstein Distance (GWτ)
This minor patch update introduces a new distance-based SPI, GWτ (called `gwtau` in pyspi). An in-depth tutorial for incorporating new SPIs into the existing pyspi framework, using `gwtau` as a prototypical example, is now available in the documentation.
What is it?
Based on the algorithm proposed by Kravtsova et al. (2023), GWτ is a new distance measure for comparing time series data, especially suited for biological applications. It works by representing each time series as a metric space and computing the distances from the start of each time series to every point. These distance distributions are then compared using the Wasserstein distance, which finds the optimal way to match the distances between two time series, making it robust to shifts and perturbations. The "tau" in GWτ emphasises that this distance measure is based on comparing the distributions of distances from the root (i.e., the starting point) to all other points in each time series, which is analogous to comparing the branch lengths in two tree-like structures. GWτ can be computed efficiently and is scalable.
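To make this concrete, here is a minimal sketch of the root-distance comparison described above; this illustrates the concept only and is not pyspi's actual `gwtau` implementation (the function name and the use of SciPy's `wasserstein_distance` are assumptions):

```python
import numpy as np
from scipy.stats import wasserstein_distance

def gwtau_sketch(x: np.ndarray, y: np.ndarray) -> float:
    """Compare two 1D time series via their distributions of
    distances from the root (the first observation)."""
    dist_x = np.abs(x - x[0])  # distances from the root of x to every point
    dist_y = np.abs(y - y[0])  # distances from the root of y to every point
    # Optimal 1D transport cost between the two distance distributions
    return wasserstein_distance(dist_x, dist_y)

t = np.linspace(0, 10, 100)
print(gwtau_sketch(np.sin(t), np.cos(t)))
```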
How can I use it?
Currently, both the default (`subset = all`) SPI set and the fast (`subset = fast`) subset include `gwtau`. This means you do not have to do anything, unless you would like to compute `gwtau` in isolation. Simply instantiate the calculator object and compute SPIs as usual. You can access the matrix of pairwise interactions for `gwtau` using its identifier in the results table:

```python
calc = Calculator(dataset=...)
calc.compute()
gwtau_results = calc.table['gwtau']
```
For technical details about the specific implementation of `gwtau`, such as the theoretical properties of this distance measure, see the original paper by Kravtsova et al. (2023). You can also find the original MATLAB implementation of the algorithm in this GitHub repository.
PySPI v1.0.1
Bug Fixes
File location handling improvement for the `filter_spis` function:
- Modified the `filter_spis` function to allow the user to specify the exact location of the source config YAML file.
- Implemented a default file mechanism where, if no file is specified by the user, the function defaults to using the pre-defined `config.yaml` file located in the script's directory as the source file.
- Updated unit tests to reflect the changes.
PySPI v1.0.0
Introduction to pyspi v1.0.0
This major release (1.0.0) brings several updates to pyspi including optional dependency checks and the ability to filter SPIs based on keywords.
Highlights of this release
- SPI Filtering: A new `filter_spis` function has been added to the `pyspi.utils` module. This function allows users to create subsets of SPIs based on keywords (e.g., "linear", "non-linear"). It takes three arguments:
  - `keywords`: a list of one or more labels to filter the SPIs, e.g., `["linear", "signed"]`.
  - `output_name`: the name of the output YAML file, defaulting to `{random_string}_config.yaml` if no name is provided as an argument.
  - `configfile`: the path to the source config file. If no configfile is provided, defaults to using `config.yaml` in the pyspi directory.
Example usage:
```python
from pyspi.utils import filter_spis

# using the default config.yaml as the source file
filter_spis(keywords=["linear", "signed"], output_name="linear_signed")  # returns `linear_signed.yaml` in cwd

# or using a user-specified configfile as the source file
filter_spis(keywords=["linear", "signed"], output_name="linear_signed", configfile="myconfig.yaml")
```
A new YAML file is saved in the current working directory with the filtered subset of SPIs. This filtered config file can be loaded into the Calculator object using the `configfile` argument, as would be the case for a typical custom YAML file (see the docs for more info):

```python
calc = Calculator(configfile="./linear_signed.yaml")
```
- Optional Dependency Checks: When instantiating a Calculator object, pyspi now automatically checks for optional dependencies (Java and Octave). If any dependencies are missing, the user is notified of which SPIs will be excluded and which missing dependencies are responsible. The user can then choose to proceed with a reduced set of SPIs or install the missing dependencies.
- Restructured SPI Config File: The SPI configuration YAML file has been restructured to include the following keys for each base SPI:
  - `labels`: base-SPI-specific labels (e.g., linear, non-linear, signed) that can be used by the filter function to create user-specified subsets of SPIs.
  - `dependencies`: external/system dependencies required by the base SPI (e.g., Octave for the integrated information SPIs).
  - `configs`: estimator settings and configurations, e.g., `EmpiricalCovariance` for the Covariance base SPI.
Example YAML:
Here is an example of how the `phi_star_t1_norm-0` SPI would be specified:

```yaml
IntegratedInformation:
  labels:
    - undirected
    - nonlinear
    - unsigned
    - bivariate
    - time-dependent
  dependencies:
    - octave
  configs:
    - phitype: "star"
```
Breaking Changes
This major version release introduces breaking changes for users who rely on custom SPI subsets (i.e., custom YAML files). Users relying on the pyspi default and pre-defined subsets are unaffected by these changes.
- The `octaveless` subset has been removed, as it is no longer necessary due to the automatic dependency checks. Users without Octave installed can now run pyspi without specifying `octaveless` as a subset in the Calculator object.
- Users who want to define a custom subset of SPIs should follow the new guide in the documentation to ensure their custom YAML file conforms to the new structure with labels, dependencies, and configs as keys.
Migration Guide
If you are an existing user of pyspi and have custom SPI subsets (custom YAML files), follow these steps to migrate to the new version:
- Review the updated structure of the SPI configuration YAML file (see the above example), which now includes labels, dependencies, and configs keys for each base SPI.
- Update your custom YAML files to match the new structure.
- If you were previously using the octaveless subset, you no longer need to specify it when instantiating the Calculator object. The dependency checks will automatically exclude Octave-dependent SPIs if Octave is not installed.
For more detailed instructions and examples, refer to the updated documentation.
Documentation
The pyspi documentation has been updated to reflect the new features and changes introduced in this release. You can find the latest documentation here.
Testing
- Added unit tests for the new `filter_spis` function.
- Added unit tests for the `CalculatorFrame` and `CorrelationFrame`.
- Updated the workflow file for GitHub Actions to use the latest checkout and Python setup actions.
PySPI v0.4.2
Introduction
This patch release brings a few minor updates including a new high contrast logo for dark mode users, improved SPI unit testing (with a new benchmarking dataset) and fixes for potential security vulnerability issues.
Highlights of this release
- New high contrast logo for dark-mode users.
- Improved SPI unit testing with z-scoring approach to flag SPIs with differing outputs.
- New coupled map lattice (CML) benchmarking dataset.
- Fix for potential security vulnerability issues in scikit-learn.
What's Changed
- Replaced the old `standard_normal.npy` benchmarking dataset with a coupled map lattice (`cml7.npy`), along with its associated `.pkl` file containing the benchmark values (`CML7_benchmark_tables.pkl`) generated in a fresh Ubuntu environment.
- Updated the README to automatically select either the regular or new dark mode logo based on the user's theme.
- Added a new `conftest.py` file for pytest to customise the unit testing outputs.
- Added a new `pyproject.toml` file for configuring the package for publishing to PyPI.
New features
- Improved SPI unit testing with a new coupled map lattice benchmarking dataset (`cml7.npy`) consisting of 7 processes and 100 observations per process.
- Z-scoring approach in the unit testing pipeline to flag potential changes in SPI outputs as a result of algorithmic changes, etc. SPIs with outputs differing from the benchmark by more than a specified threshold are "flagged" and summarised in a table (see the sketch after this list).
- Added a dark-mode pyspi logo to the README, which is shown for users with the dark-mode GitHub theme.
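For illustration, here is a minimal sketch of the z-scoring idea (the function name, data structures, threshold, and the example SPI identifier are all hypothetical, not pyspi's actual test code):

```python
def flag_spis(new_outputs, benchmark_mean, benchmark_std, threshold=2.0):
    """Flag SPIs whose new output deviates from the benchmark mean
    by more than `threshold` standard deviations."""
    flagged = []
    for spi, value in new_outputs.items():
        sigma = benchmark_std[spi]
        z = 0.0 if sigma == 0 else (value - benchmark_mean[spi]) / sigma
        if abs(z) > threshold:
            flagged.append(spi)
    return flagged

# Hypothetical example: this SPI's output shifted well beyond 2 sigma
print(flag_spis({"cov_EmpiricalCovariance": 0.9},
                {"cov_EmpiricalCovariance": 0.1},
                {"cov_EmpiricalCovariance": 0.05}))
```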
Bug Fixes
- Fixed a scikit-learn security vulnerability issue with severity "high" (pertaining to denial of service) by upgrading scikit-learn from version `0.24.1` to version `1.0.1`.
- Fixed an Int64 deprecation issue (cannot import name `Int64Index` from `pandas`) by pinning pandas to version `1.5.0`.
- Fixed an unknown-character issue for Windows users resulting from not specifying an encoding when loading the README in `setup.py`. The encoding is now fixed to `utf-8` for consistency across platforms.
PySPI v0.4.1
Introduction
PySPI v0.4.1 introduces several minor changes to the existing README and migrates the documentation from "readthedocs" to an all-new "GitBook" page. Simple unit testing has also been incorporated for each of the SPIs, using a benchmarking dataset to check the consistency of outputs.
What's Changed
- Removal of old /docs directory
- Addition of a /tests directory for unit testing
- Updated README
- Addition of CODE_OF_CONDUCT.md and SECURITY.md
New features
- Basic unit testing incorporated into a GitHub Actions workflow.
- Updated README file with links to the new GitBooks hosted documentation to replace the old "readthedocs" documentation.
- Added a code of conduct markdown file (CODE_OF_CONDUCT.md).
- Added a security policy markdown file (SECURITY.md).
Bug Fixes
- Fixed a PyTorch security vulnerability issue with severity "critical" (pertaining to arbitrary code execution) by updating torch from version `1.10.0` to `1.13.1`.
PySPI v0.4
- The directed info measure now uses the entropy rate in its calculation, to more closely resemble the streaming method described in the literature.
- The code now (mostly) uses Black formatting for readability.
Paper release
This release is the version that was used for computing the results in the paper.