
[ENH] add a difference transformer to series transformations #2729


Open

wants to merge 6 commits into main

Conversation

TinaJin0228
Contributor

Reference Issues/PRs

fixes #1553

What does this implement/fix? Explain your changes.

This implements a transformer that computes the n-th order differences of a time series.

The input time series is expected to have a NumPy inner-type, and the transformation is performed using numpy.diff().

To preserve the original shape of the series along the time axis, the first `order` element(s) of the output are filled with NaN.
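
For illustration, a minimal sketch of the underlying computation (the helper name nth_order_difference is hypothetical and not part of the PR):

import numpy as np


def nth_order_difference(X, order=1):
    """Sketch: n-th order differences along the time axis with NaN padding.

    Assumes X has shape (n_channels, n_timepoints). np.diff shortens the
    time axis by `order`, so `order` leading NaNs are prepended to each
    channel to restore the original length.
    """
    diffs = np.diff(X, n=order, axis=1)
    pad = np.full((X.shape[0], order), np.nan)
    return np.hstack([pad, diffs])

For example, nth_order_difference(np.array([[1.0, 4.0, 9.0, 16.0]])) returns [[nan, 3., 5., 7.]].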

Does your contribution introduce a new dependency? If yes, which one?

No. It only depends on NumPy.

Any other comments?

No.

PR checklist

For all contributions
  • I've added myself to the list of contributors. Alternatively, you can use the @all-contributors bot to do this for you after the PR has been merged.
  • The PR title starts with either [ENH], [MNT], [DOC], [BUG], [REF], [DEP] or [GOV] indicating whether the PR topic is related to enhancement, maintenance, documentation, bugs, refactoring, deprecation or governance.
For new estimators and functions
  • I've added the estimator/function to the online API documentation.
  • (OPTIONAL) I've added myself as a __maintainer__ at the top of relevant files and want to be contacted regarding its maintenance. Unmaintained files may be removed. This is for the full file, and you should not add yourself if you are just making minor changes or do not want to help maintain its contents.
For developers with write access
  • (OPTIONAL) I've updated aeon's CODEOWNERS to receive notifications about future changes to these files.

@TinaJin0228 requested a review from TonyBagnall as a code owner on April 6, 2025 08:08
@aeon-actions-bot added the enhancement (New feature, improvement request or other non-bug code enhancement) and transformations (Transformations package) labels on Apr 6, 2025
@aeon-actions-bot
Contributor

Thank you for contributing to aeon

I have added the following labels to this PR based on the title: [ enhancement ].
I have added the following labels to this PR based on the changes made: [ transformations ]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • Run pre-commit checks for all files
  • Run mypy typecheck tests
  • Run all pytest tests and configurations
  • Run all notebook example tests
  • Run numba-disabled codecov tests
  • Stop automatic pre-commit fixes (always disabled for drafts)
  • Disable numba cache loading
  • Push an empty commit to re-run CI checks

@TinaJin0228
Contributor Author

It seems that one of the automated checks failed with errors indicating it couldn't find the merge base or my commit SHA:

Warning: Unable to find merge base between 45d0390d1a441264943f8963547409bb36935d2d and 8f78ec98e57b49640bd59f5570f35efb9fea7e67
Error: Unable to locate the commit sha: 45d0390d1a441264943f8963547409bb36935d2d
Error: Please verify that the commit sha is correct, and increase the 'fetch_depth' input if needed
Warning: fatal: bad object 45d0390d1a441264943f8963547409bb36935d2d
Warning: If this pull request is from a forked repository, please set the checkout action `repository` input to the same repository as the pull request.
Warning: This can be done by setting actions/checkout `repository` to ${{ github.event.pull_request.head.repo.full_name }}

@MatthewMiddlehurst (Member) left a comment

Thanks, a few comments.

if not isinstance(order, int) or order < 1:
    raise ValueError(f"`order` must be a positive integer, but got {order}")
self.order = order
super().__init__(axis=axis)
Member


Remove the axis parameter. This is used to determine the shape of the time series internally. We only want to apply this to series, not between channels.

Contributor Author


The "axis" is inherited from BaseSeriesTransformer. Should "axis = 1" be used to indicate that the time series are all in rows, with shape (n_channels, n_timepoints)?

Member


Yes, that's telling the base class to convert the series to (n_channels, n_timepoints) before passing it to _fit and other functions.

Contributor Author


got it
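
To make that concrete, a small illustrative example (values chosen arbitrarily): with the (n_channels, n_timepoints) layout, np.diff along axis=1 differences each channel's time series independently and never mixes channels.

import numpy as np

# Two-channel series in (n_channels, n_timepoints) layout.
X = np.array([[1.0, 4.0, 9.0, 16.0],
              [2.0, 2.0, 2.0, 2.0]])

# Differencing along axis=1 acts on the time axis of each channel only.
print(np.diff(X, n=1, axis=1))
# [[3. 5. 7.]
#  [0. 0. 0.]]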

Comment on lines 79 to 80
if not isinstance(order, int) or order < 1:
    raise ValueError(f"`order` must be a positive integer, but got {order}")
Member


I would do this in fit
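
A possible sketch of that suggestion, assuming the check moves into the transformer's _fit (following the usual aeon BaseSeriesTransformer override pattern; this is not the PR's final code):

def _fit(self, X, y=None):
    # Validate the hyperparameter at fit time rather than in __init__,
    # so the constructor only stores the parameter unchanged.
    if not isinstance(self.order, int) or self.order < 1:
        raise ValueError(
            f"`order` must be a positive integer, but got {self.order}"
        )
    return self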


dt1 = DifferenceTransformer(order=1)
Xt1 = dt1.fit_transform(X)
expected1 = np.array([[np.nan, 3.0, 5.0, 7.0, 9.0, 11.0]])
Member


IMO better to return a smaller series than include NaNs. Possibly a parameter if you think it is worth it but by default change the shape.
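
One way that could look, with a hypothetical pad_nan flag (the name and default here are assumptions for illustration, not the PR's API):

import numpy as np

def difference(X, order=1, pad_nan=False):
    """Order-th differences along axis 1; output is shorter by default."""
    Xt = np.diff(X, n=order, axis=1)
    if pad_nan:
        # Optionally prepend NaNs to keep the original time-axis length.
        Xt = np.hstack([np.full((X.shape[0], order), np.nan), Xt])
    return Xt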


def test_diff():
    """Tests basic first and second order differencing."""
    X = np.array([[1.0, 4.0, 9.0, 16.0, 25.0, 36.0]])
Member


Can you also test multivariate series in another test, perhaps?
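
A possible sketch of such a test, assuming the default output drops the first `order` time points as suggested above (expected values hand-computed for these illustrative inputs):

import numpy as np

def test_diff_multivariate():
    """Differencing should be applied to each channel independently."""
    X = np.array([[1.0, 4.0, 9.0, 16.0],
                  [10.0, 8.0, 5.0, 1.0]])
    dt = DifferenceTransformer(order=1)
    Xt = dt.fit_transform(X)
    expected = np.array([[3.0, 5.0, 7.0],
                         [-2.0, -3.0, -4.0]])
    np.testing.assert_allclose(Xt, expected)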

@TinaJin0228
Contributor Author

I have made some modifications according to your comments. @MatthewMiddlehurst
