-
Notifications
You must be signed in to change notification settings - Fork 34
Bug 1742093 - Refactor upload management logic to be exactly like in glean-core #988
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1742093 - Refactor upload management logic to be exactly like in glean-core #988
Conversation
The worker will be responsible for assembling ping requests and performing the actual ping uploads.
i.e. Copied missing tests from https://github.com/mozilla/glean/blob/main/glean-core/src/upload/mod.rs
|
Adding @badboy here as a reviewer as well, because he knows this mechanism very well and also I think it wouldn't hurt to get 2 pairs of eyes looking at this change to be more confident we caught any issues before merging. |
Build size report
|
In the previous implementation we were tying the ability to retry sending recovarebly failed upload request to the rate limiter state i.e. a user could only retry after the rate limiter time window was over. In order to do that we introduced a new state to the rate limiter, the Stopped state. When max recoverable failures was fit, we would out the rate limiter in that state. It would work just a like the Throttled state in the sense that the rate limiter would only get out of it once the time window was complete. That is actually not how the implementation in glean-core works. In there we simply stop the upload worker sending it a Done signal once max recorable failures is hit. If the worker is re-started immediatelly after it will be allowed to get new tasks. That is now how the implementation works in this repository as well. I wonder if tying the retries to the rate limiter window is still a beter idea, but that has not caused us problems up until now in glean-core, so for now, I'll leave it like this.'
fc34fb7 to
32a966d
Compare
badboy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Minimal things inline, but no need for me to re-review.
| assert.strictEqual(resultInternalServerError.result, UploadResultStatus.Success); | ||
| assert.strictEqual(resultInternalServerError.status, 500); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Success: failure.
:D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah, yeah naming is hard.
| continue; | ||
| // TODO(bug1727076): Actually set a timer to continue once timeout is complete. | ||
| // Note that the timer must be cleared in case an abort signal is issued. | ||
| case UploadTaskTypes.Wait: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this fall through to the Done case? If yes, all good (though maybe add a comment // explicit fallthrough).
If not it should explicitly return or it will run into the throttling 3 times immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeap it falls through.
Co-authored-by: Jan-Erik Rediger <badboy@archlinux.us>
With the previous design it was hard and hacky to enforce the rate limit defaults at the correct time.
The new design (or old, depends on how you look at it) allows to do that easily.
It also has the following added benefits 1) logic is better encapsulated in different structures 2) the design is almost exactly the same as the one in glean-core, so transfer of knowledge and ensurace of standardized functionality is easier to do 3) it doesn't have an internal dispatcher instance 4) it's just easier to reason about it, IMO.
Note: I suggest reviewing this PR commit by commit. I did my best to isolate changes, but this is a big change and there is no going around that.
Pull Request checklist
glean/folder, run:npm run testRuns all testsnpm run lintRuns all lintersCHANGELOG.mdor an explanation of why it does not need one