[BUG] Correctly set lagged variables to known when lag >= horizon #1910
Codecov Report
All modified and coverable lines are covered by tests ✅

Coverage Diff (main vs. #1910)
Coverage: 86.36%
Files: 96
Lines: 7801
Branches: 0
Hits: 6737
Misses: 1064
Partials: 0
Sounds good, well spotted!
If I may ask, how did you notice this?
I had first implemented lagged features manually in my input data for one of my projects and was worried about data leakage. When I noticed there was a
I see! Could we add that as a test then?
Given that we consider this a bug, we should add a test to prevent it from recurring.
Thanks!
Review of this would be appreciated, @fnhirwa, @phoeenniixx, @PranavBhatP
looks good!
Looks good to me😊
Reference Issues/PRs
Fixes #1909
What does this implement/fix? Explain your changes.
This PR fixes how lagged variables are assigned to the known or unknown variable lists. Lagged variables that originate from known variables are set to known. All other lagged variables are set to known only if lag >= horizon, which avoids data leakage.
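To illustrate the rule described above, here is a minimal standalone sketch. It is not the code changed in this PR: the function name `split_lagged_variables`, its parameters, and the `<name>_lagged_by_<lag>` naming are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def split_lagged_variables(
    lags: Dict[str, List[int]],
    known_reals: List[str],
    horizon: int,
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: split lagged variable names into (known, unknown).

    A lagged variable is treated as known if its source variable is already
    known, or if lag >= horizon, because every value needed over the
    prediction window was then observed before the window starts.
    """
    known: List[str] = []
    unknown: List[str] = []
    for name, lag_values in lags.items():
        for lag in lag_values:
            lagged_name = f"{name}_lagged_by_{lag}"  # assumed naming scheme
            if name in known_reals or lag >= horizon:
                known.append(lagged_name)
            else:
                # lag < horizon on an unknown variable would expose
                # future values inside the prediction window
                unknown.append(lagged_name)
    return known, unknown


known, unknown = split_lagged_variables(
    lags={"price": [1, 7], "sales": [3, 7, 14]},
    known_reals=["price"],
    horizon=7,
)
print(known)
print(unknown)
```

With a horizon of 7, only `sales_lagged_by_3` stays unknown in this example: `sales` is not a known variable and a lag of 3 is shorter than the horizon, so some of its lagged values would fall inside the prediction window.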
What should a reviewer concentrate their feedback on?
Did you add any tests for the change?
For now I have only checked it against the code snippet from the issue, but it might need a unit test to keep this behaviour from changing again.
Any other comments?
PR checklist
Install hooks with pre-commit install. To run hooks independent of commit, execute pre-commit run --all-files.