Add retry mechanism with linear backoff for rate-limited requests in snippets #2656
Conversation
Interesting. To get merged, we would have to add tests to exercise the code's logic paths (mocking the actual requests). We would need to update docs as well. I think this could be a good improvement for those who require it.
@@ -237,6 +237,19 @@ If either of these is set to zero, the limits will be ignored.

To pass arbitrary HTTP headers in every HTTP request use `url_request_headers`.

/// new | New 10.16
I usually use these "New" blocks just to inform the user when a new feature was added. You should describe the settings outside of the block, where we talk about `url_request_headers`, `url_max_size`, and `url_timeout`. Eventually, I'll remove the really old "New" blocks, but the documentation of the options should remain.
@0xLucca I see this PR has stalled for a couple of weeks. Do you plan to continue working on it, or have you abandoned it?
Hey @facelessuser, I haven't had a chance to add the tests yet, but I'm aiming to do so this week.
Cool, that's all I was looking for. I can never tell, so I just like to check in to see intentions.
Hey @facelessuser! I've added the tests and updated the documentation. For the doc update, I'm not entirely sure if this aligns with what you had in mind. Could you take a look and let me know if any changes are needed? Regarding the tests, when I run them individually they all pass:
However, running the whole class causes some of them to fail:
It seems to be a concurrency-related issue, or perhaps some shared state between tests. Let me know what you think; I'm happy to adjust things based on your feedback.
I imagine some of the network stuff is going to have to be mocked. We aren't really trying to unit test the network library, but rather how we respond when it reports a failure, etc. I haven't looked too closely at it yet, but we don't really need to be doing real network calls; simulating them lets us make sure we are handling things appropriately based on our expectations of the returns.
Just to clarify, all the network calls in the tests are already mocked, so no real network requests are being made.
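For reference, a minimal sketch of how such mocking can look with `unittest.mock`, assuming the tests patch `urllib.request.urlopen`; the helper, URLs, and assertions here are illustrative, not the PR's actual test code:

```python
import unittest
from unittest import mock
import urllib.request


def fake_response(status, payload=b''):
    """A minimal stand-in for the object urlopen() returns."""
    resp = mock.MagicMock()
    resp.status = status   # attribute used on Python 3.9+
    resp.code = status     # older attribute on earlier versions
    resp.read.return_value = payload
    resp.__enter__.return_value = resp  # support `with urlopen(...) as r:`
    return resp


class TestSnippetRetry(unittest.TestCase):

    @mock.patch('urllib.request.urlopen')
    def test_rate_limited_then_ok(self, urlopen):
        # side_effect yields one value per call, so each test gets
        # fresh, isolated responses (no shared state between tests).
        urlopen.side_effect = [fake_response(429), fake_response(200, b'content')]

        first = urllib.request.urlopen('https://example.com/snippet.md')
        second = urllib.request.urlopen('https://example.com/snippet.md')

        self.assertEqual(first.status, 429)
        self.assertEqual(second.read(), b'content')
        self.assertEqual(urlopen.call_count, 2)


if __name__ == '__main__':
    unittest.main()
```

Using `side_effect` with a fresh list per test, rather than a module-level mock, is one way to avoid the cross-test interference described above.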
s_lines.append('')
return s_lines

except urllib.error.HTTPError as e:
I thought we were checking the status earlier; why would we now need to check it here? Are we actually getting a raised exception now instead of just being able to check the status? This isn't inherently bad per se, but I'm curious what's actually getting exercised in the code.
Actually, maybe we don't need a try/except. Since we are checking the status, that should be handled earlier. If we are getting an HTTP error, we have more issues than just a bad status, and we should probably fail.
I do see in the urllib code that some statuses can raise an exception. It looks like 429 may do this. If that is the case, then the earlier 429 check is not needed.
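For context, a small sketch of that behavior with the default `urllib` opener (the URL is illustrative): `urlopen` raises `urllib.error.HTTPError` for 4xx/5xx statuses rather than returning a response to inspect.

```python
import urllib.error
import urllib.request

try:
    # The default opener raises HTTPError for 4xx/5xx responses,
    # so a rate-limited request surfaces as an exception rather
    # than as a response object whose .status is 429.
    urllib.request.urlopen('https://example.com/rate-limited')
except urllib.error.HTTPError as e:
    # HTTPError still carries the status code and headers, so a
    # retry decision can be made from the exception itself.
    if e.code == 429:
        print('rate limited; Retry-After:', e.headers.get('Retry-After'))
    else:
        raise
```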
time.sleep(wait)
continue
raise SnippetMissingError(f"Cannot download snippet '{url}': {str(e)}")
except Exception as e:
This should not be here. Now we are altering the return. The code may be raising a ValueError because the payload is too big, but now it's being swallowed, retried, and the exception type changed. This can cause existing tests to fail. We should break here by just calling `raise` without a new exception. This will just cause the original exception to bubble up.
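To illustrate the point, a short sketch (the `fetch` helper is hypothetical): a bare `raise` re-raises the active exception unchanged, whereas wrapping it in `SnippetMissingError` would replace its type and message.

```python
def fetch(url):
    # Hypothetical helper: simulate a payload that exceeds the size limit.
    raise ValueError(f"content size exceeds limit for '{url}'")


def download(url):
    try:
        return fetch(url)
    except Exception:
        # A bare `raise` re-raises the active exception unchanged, so the
        # ValueError reaches the caller with its original type, message,
        # and traceback instead of being swallowed or rewrapped.
        raise


try:
    download('https://example.com/snippet.md')
except ValueError as e:
    print(type(e).__name__, '-', e)  # ValueError - content size exceeds limit ...
```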
# Fail if status is not OK
status = response.status if util.PY39 else response.code

if status == 429:  # Rate limited
This should be simplified as below, which should handle all bad statuses:

```python
if status != 200:
    if status == 429 and attempt < self.max_retries:
        wait = self.backoff_factor * attempt
        time.sleep(wait)
        continue
    raise SnippetMissingError(f"Cannot download snippet '{url}'")
```

Additionally, if we are capturing the rate-limited status here, why do we need to check for `urllib.error.HTTPError` later?
@0xLucca I've added some comments. I suspect that because you are altering how exceptions are handled, you are also altering how the extension behaves, causing tests to fail.
It looks like there are a number of issues; I'm working through them.
There were a number of issues.
I went ahead and merged the work. Feel free to create another PR if you find issues, or create a bug issue.
Thanks! I will give this a try and let you know.
This PR adds a retry mechanism to handle rate-limited requests (HTTP 429) when downloading snippets from URLs. The implementation includes:

New configuration options:

- `max_retries`: Maximum number of retry attempts (default: 3)
- `backoff_factor`: Backoff factor for retry attempts (default: 2)

Linear backoff strategy:

- Each retry waits `backoff_factor * attempt` seconds

Error handling:

This change improves the reliability of snippet downloads when dealing with rate-limited APIs while maintaining backward compatibility with existing configurations.
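For illustration, a minimal sketch of how these pieces could fit together, combining the simplified status handling suggested in the review with the linear backoff described above. `max_retries`, `backoff_factor`, and `SnippetMissingError` follow the names used in this PR; the surrounding function and the choice to catch 429 via `HTTPError` are assumptions, not the merged code:

```python
import time
import urllib.error
import urllib.request


class SnippetMissingError(Exception):
    """Stand-in for the extension's own exception type."""


def download_with_retry(url, max_retries=3, backoff_factor=2):
    """Retry on HTTP 429, waiting backoff_factor * attempt seconds each time."""
    for attempt in range(1, max_retries + 1):
        try:
            with urllib.request.urlopen(url) as response:
                return response.read()
        except urllib.error.HTTPError as e:
            # urllib raises HTTPError for bad statuses, so a 429 is
            # caught here rather than read off the response object.
            if e.code == 429 and attempt < max_retries:
                time.sleep(backoff_factor * attempt)  # 2s, 4s, ... by default
                continue
            raise SnippetMissingError(f"Cannot download snippet '{url}': {e}")
```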