Skip to content

[ENH] Kolmogorov Arnold Block for NBeats #1751

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

Sohaib-Ahmed21
Copy link
Contributor

@Sohaib-Ahmed21 Sohaib-Ahmed21 commented Jan 13, 2025

Description

Fixes: #1741

This PR adds Kolmogorov Arnold(KAN) Blocks in NBeats and also does refactoring of NBeats. Implementation of KAN blocks' layers is taken from original paper code.

Changes in Structure

  • Introduced the NBEATSKAN module, which enables usage of KAN blocks within the NBEATS architecture.

  • Integrated KANLayer logic, implemented in kan_layer.py, which handles KAN-specific operations such as:

    • Spline coefficient computation,
    • Grid initialization and updates, etc.
  • Imported KANLayer to submodules.py for block operations, allowing NBEATSKAN to delegate block-level behavior through use_kan=True.

  • Added the NBEATSAdapter class to encapsulate common methods shared by both NBEATS and NBEATSKAN, including:

    • Standard training, forward logic, etc.
    • Excludes block initialization (__init__), which is separately defined in each class to maintain architectural flexibility.

GridUpdateCallback

When training KAN-based models, the grid can be iteratively refined during training for better performance.

To support this, logic from the original [pykan](https://github.com/KindXiaoming/pykan) implementation has been adapted to define a custom callback named GridUpdateCallback. This callback automatically updates the grid at specified training steps, improving model accuracy and convergence.

This callback has been tested successfully and demonstrates improved results in practice.

An example usage is provided in:
examples/nbeats_with_kan.py

Checklist

  • Linked issues (if existing)
  • Used pre-commit hooks when committing to ensure that code is compliant with hooks. Install hooks with pre-commit install.
    To run hooks independent of commit, execute pre-commit run --all-files

Make sure to have fun coding!

@Sohaib-Ahmed21
Copy link
Contributor Author

@fkiraly @benHeid this PR is ready for review. Kindly review it. Thanks!

@Sohaib-Ahmed21
Copy link
Contributor Author

@fkiraly @benHeid can you kindly review/merge it so I integrate NBEATSX modification in NBEATS without conflicts as I have asked @julian-fong and he is not working on NBEATSX.

Copy link
Collaborator

@benHeid benHeid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fully reviewed yet. Will continue in the next days. But I share my current comments so that you already receive some feedbacks.

When training KANs, the grid can be iteratively be refined. I wonder, if there is a way to implement this also here. However, this might probably more difficult and require changes to the trainer. So probably out of scope for this PR. Do you have opinions on that?

@@ -0,0 +1,528 @@
import numpy as np
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The license of the original implementation is MIT. So in theory it is okay to copy the file. However, please add some credits at the top of the file.

Alternatively, we could think about adding KAN as a dependency.

@fkiraly do you have any additions on that matter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will add the appropriate credits at the top of the file. Additionally, I agree that adding KAN as a dependency—perhaps as a soft dependency—seems like a good idea, especially considering its increasing relevance in time series forecasting.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fkiraly pinging you again to check if this is okay for you :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which package are we exactly planning to add as a soft dep?

If it is a single layer, I think copying it over and including the license is perhaps better for now, because we do not have machinery to manage soft dependencies (like scikit-base or similar).

The proposed design in here sktime/enhancement-proposals#39 would allow that, but right now I think this would require a significant amounts of custom code to handle.

Or, is there an easy way that I am not seeing how the soft dependency import would work for part of the NN?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which package are we exactly planning to add as a soft dep?

It is pykan library , reference https://pypi.org/project/pykan/

If it is a single layer, I think copying it over and including the license is perhaps better for now, because we do not have machinery to manage soft dependencies (like scikit-base or similar).

The proposed design in here sktime/enhancement-proposals#39 would allow that, but right now I think this would require a significant amounts of custom code to handle.

Or, is there an easy way that I am not seeing how the soft dependency import would work for part of the NN?

yes it is a single layer, only used in NBEATS. Also the library pykan has much more, but we only need this. I have implemented what you already suggested above. I have copied it over and included the license.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, makes sense, as long as the license points to the original source and is provided in full form (assuming it hsa the usual reqiurement to reproduce)

@Sohaib-Ahmed21
Copy link
Contributor Author

Not fully reviewed yet. Will continue in the next days. But I share my current comments so that you already receive some feedbacks.

Thanks! Will address these reviews soon.

@Sohaib-Ahmed21
Copy link
Contributor Author

When training KANs, the grid can be iteratively be refined. I wonder, if there is a way to implement this also here. However, this might probably more difficult and require changes to the trainer. So probably out of scope for this PR. Do you have opinions on that?

I'll explore this and share my thoughts.

@Sohaib-Ahmed21
Copy link
Contributor Author

Sohaib-Ahmed21 commented Jan 23, 2025

@benHeid I have addressed the reviews. Kindly review the updated PR.

When training KANs, the grid can be iteratively be refined. I wonder, if there is a way to implement this also here. However, this > might probably more difficult and require changes to the trainer. So probably out of scope for this PR. Do you have opinions on > that?

To address this, I have taken logic from original implementation of pykan library and made custom Callback i.e. GridUpdateCallback which updates grid after specified steps. This has been tested and working fine with improved results. Your thoughts will be helpful in this regard.

@Sohaib-Ahmed21
Copy link
Contributor Author

Sohaib-Ahmed21 commented Jan 26, 2025

@benHeid @fkiraly I have addressed the reviews. Kindly review it so I proceed integrating NBEATSX without conflicts.

Copy link
Collaborator

@benHeid benHeid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have one last comment. But I also would like to hear @fkiraly opinion on the license of the kan_layer

@fkiraly fkiraly changed the title Kolmogorov Arnold Block for NBeats [ENH] Kolmogorov Arnold Block for NBeats Jul 6, 2025
@Sohaib-Ahmed21 Sohaib-Ahmed21 requested a review from benHeid July 7, 2025 05:11
@Sohaib-Ahmed21
Copy link
Contributor Author

@benHeid I have addressed the reviews. Kindly review the updated PR, thanks!

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Could you kindly also add some explanation in the PR fist post about the changes to the structure, and what goes in the common base class? What is going on with the grid callback?

Also requesting review by @phoeenniixx and @PranavBhatP

Optionally, could you check in the layers module whether we can use that to avoid duplicate layer specifications?

evaluate x on B-spline bases

Args:
-----
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please use Parameters like in sktime and follow the numpydoc style docstrings, No need to add colon (:)

):
"""'
Initialize a KANLayer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this docstring above just after the class definition?

shown to consistently outperform N-BEATS.

Args:
stack_types: One of the following values: “generic”, “seasonality" or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use numpydoc style doctsring

forecasting <http://arxiv.org/abs/1905.10437>`_. The network has (if
used as ensemble) outperformed all other methods including ensembles of
traditional statical methods in the M4 competition. The M4 competition is
arguably the most important benchmark for univariate time series forecasting.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to mention the paper for KAN blocks here and how it works (just an overview) and how it is different from the original NBeats

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned the reference with details.

@Sohaib-Ahmed21
Copy link
Contributor Author

Could you kindly also add some explanation in the PR fist post about the changes to the structure, and what goes in the common base class? What is going on with the grid callback?

Updated the first post to include explanations on the structure changes, common base class, and grid callback as requested.

Optionally, could you check in the layers module whether we can use that to avoid duplicate layer specifications?

I checked it and currently blocks are reused from sub_modules.py and are not duplicated.

@Sohaib-Ahmed21
Copy link
Contributor Author

@fkiraly @benHeid kindly review/merge this pull request, thanks!

@phoeenniixx
Copy link
Contributor

Optionally, could you check in the layers module whether we can use that to avoid duplicate layer specifications?

I checked it and currently blocks are reused from sub_modules.py and are not duplicated.

I think we can also move some of the layers (like KAN layer) to the layers right? I think these layers can be reused by other models as well?

Also, a question for @fkiraly: Imo, we could also move networks present in sub_modules to layers, or is it better to keep them here along with NBeats only, as other models might not use them?

@Sohaib-Ahmed21
Copy link
Contributor Author

I think we can also move some of the layers (like KAN layer) to the layers right? I think these layers can be reused by other models as well?

Yes, in the future, KANLayer might be reused...I think then GridUpdateCallback should also be moved to layers. Thoughts?

@phoeenniixx
Copy link
Contributor

phoeenniixx commented Jul 12, 2025

I think we can also move some of the layers (like KAN layer) to the layers right? I think these layers can be reused by other models as well?

Yes, in the future, KANLayer might be reused...I think then GridUpdateCallback should also be moved to layers. Thoughts?

Yes this can also be added!
I think we need to decide which parts of this PR are reusable and can be moved to layers

But this is a callback and not essentially a neural layer, so I think moving it to layers may not be a good idea, rather a similar folder for callbacks? Not sure, what do you think?

@Sohaib-Ahmed21
Copy link
Contributor Author

Yes this can also be added! I think we need to decide which parts of this PR are reusable and can be moved to layers

But this is a callback and not essentially a neural layer, so I think moving it to layers may not be a good idea, rather a similar folder for callbacks? Not sure, what do you think?

What I was essentially thinking was that KANLayer in terms of my experience is not used much across the library...so the current implementation also sounds logical to me. Maybe we can ask from @benHeid also about what to include or the current implementation sounds good?

@phoeenniixx
Copy link
Contributor

phoeenniixx commented Jul 12, 2025

What I was essentially thinking was that KANLayer in terms of my experience is not used much across the library...so the current implementation also sounds logical to me.

I agree, it is not used anywhere right now, but we may need it in future, then it will be handy to have it in layers - A single API point for reusable layers.

As you said, maybe @benHeid, @fkiraly can also provide us a new perspective...

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 13, 2025

Also, a question for @fkiraly: Imo, we could also move networks present in sub_modules to layers, or is it better to keep them here along with NBeats only, as other models might not use them?

I think ultimately all layers should move to layers, although moving layers not changed or affected by this PR is out of scope. But I would move new layers, or layers that now get imported by both nbeats variants to there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ENH] Kolmogorov Arnold Block for NBeats network
4 participants