Stepwise LR scheduler #20211
base: master
Conversation
for more information, see https://pre-commit.ci
…ytorch-lightning into ddp-strategy-alias
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           master   #20211      +/-  ##
=========================================
- Coverage      88%      79%      -9%
=========================================
  Files         267      264       -3
  Lines       23380    23325      -55
=========================================
- Hits        20481    18366    -2115
- Misses       2899     4959    +2060
```
Hi @Borda. Do I need to make any changes to the PR?
This looks good, thank you for the contribution @01AbhiSingh. Ideally we would add a test to verify the behavior described in #17544: the current test suite can't detect this change, which is usually a sign of insufficient coverage. Would you be willing to contribute such a test?
Yes, sure, let me look into it.
Hi @lantiga, do you want a new test written from scratch, or should I make the changes in a preexisting file? All the tests pass, so I can't tell which test would need the change; if it's a preexisting file, it would be very helpful if you could point out the test I should modify.
Hey @01AbhiSingh, sorry for the wait. You can take inspiration from:
and add a new test where scheduling goes across epoch boundaries. Maybe @falckt can help too?
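A hedged sketch of the step-counting behavior such a test would pin down (pure Python, no Lightning dependency; the function name and arguments are hypothetical stand-ins for the Trainer's `max_epochs`, per-epoch batch count, and the scheduler's `frequency`):

```python
def scheduler_steps(max_epochs, batches_per_epoch, frequency):
    """Count scheduler.step() calls for an interval="step" scheduler,
    tracking the global step across epoch boundaries rather than
    resetting the counter each epoch."""
    calls = 0
    global_step = 0
    for _ in range(max_epochs):
        for _ in range(batches_per_epoch):
            global_step += 1
            # With 3 batches/epoch and frequency=5, the scheduler fires
            # at global steps 5 and 10, i.e. across epoch boundaries.
            if global_step % frequency == 0:
                calls += 1
    return calls

assert scheduler_steps(max_epochs=4, batches_per_epoch=3, frequency=5) == 2
```

A real test would drive a `Trainer` with a mocked scheduler and compare `call_count` against this expectation.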
Done, please check.
Hey @01AbhiSingh, can you change the import from `from lightning.pytorch import Trainer` to `from lightning.pytorch import Trainer, LightningModule`? This should fix the failing test.
…pytorch-lightning into stepwiseLRscheduler
for more information, see https://pre-commit.ci
Yeah, my bad. I forgot to add it even after seeing it. Done, please check.
This is the test that is currently failing.
Should I add this and try to run the test again?
Go for it :) You can also run this kind of test locally with
I actually tried to run the test locally with the method you suggested, but this error keeps showing up.

Edit: I've solved this problem; I will update the PR once it's running perfectly in my local environment. Thanks :)

Another edit 😝: updated the PR, please check.
…utils.data import DataLoader, TensorDataset
…pytorch-lightning into stepwiseLRscheduler
for more information, see https://pre-commit.ci
…pytorch-lightning into stepwiseLRscheduler
for more information, see https://pre-commit.ci
The test passes in my local environment but not in the repo's CI.
I think this time it is all done. Can you please check once? @lantiga
Looks good, added a couple of comments
```python
trainer.fit(model)

# Debug print statements
print(f"Mocked scheduler step calls: {mocked_sched.call_count}")
```
Please remove the debug statements; I'd just convert them to asserts that compare the values with expected ones.
```python
def training_step(self, batch, batch_idx):
    # Add print statement to track batch index and global step
    if hasattr(self, "trainer"):
        print(f"Batch idx: {batch_idx}, Global step: {self.trainer.global_step}")
```
Print statements in tests are not super helpful; just use asserts so the test will break if we don't get the expected value here.
```python
# Assert that the scheduler was called the expected number of times
# Allow for a small difference due to environment or rounding discrepancies
assert abs(mocked_sched.call_count - expected_steps) <= 1, (
```
I'm not sure why there should be rounding discrepancies. Shouldn't this be fully deterministic?
Actually, the test was passing in my local environment but not in the CI/CD pipeline for some reason, and I forgot to change it afterwards. Let me correct it ASAP.
What does this PR do?
Fixes #17544
Hi @awaelchli. Can you please verify the changes I made? If they are correct, I will also take up and fix any failing tests.
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:
Reviewer checklist
- [ ] Is this pull request ready for review? (if not, please submit in draft mode)
- [ ] Check that all items from **Before submitting** are resolved
- [ ] Make sure the title is self-explanatory and the description concisely explains the PR
- [ ] Add labels and milestones (and optionally projects) to the PR so it can be classified

📚 Documentation preview 📚: https://pytorch-lightning--20211.org.readthedocs.build/en/20211/