
Add retry mechanism with linear backoff for rate-limited requests in snippets #2656


Closed
wants to merge 4 commits

Conversation


@0xLucca commented Apr 30, 2025

This PR adds a retry mechanism to handle rate-limited requests (HTTP 429) when downloading snippets from URLs. The implementation includes:

  • New configuration options:

    • max_retries: Maximum number of retry attempts (default: 3)
    • backoff_factor: Backoff factor for retry attempts (default: 2)
  • Linear backoff strategy:

    • Each retry attempt waits for backoff_factor * attempt seconds
    • For example, with default settings:
      • First retry: 2 seconds
      • Second retry: 4 seconds
      • Third retry: 6 seconds
  • Error handling:

    • Specific handling for HTTP 429 (Too Many Requests) responses
    • Graceful error propagation after all retries are exhausted
    • Maintains existing error handling for other HTTP status codes

This change improves the reliability of snippet downloads when dealing with rate-limited APIs while maintaining backward compatibility with existing configurations.
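For illustration only, here is a minimal standalone sketch of the linear backoff loop described above; the function name, signature, and use of urllib here are assumptions for this example, not the PR's actual implementation:

import time
import urllib.error
import urllib.request


def download_with_retry(url, max_retries=3, backoff_factor=2, timeout=10):
    """Fetch `url`, retrying on HTTP 429 with linear backoff."""
    for attempt in range(1, max_retries + 2):  # initial request plus up to `max_retries` retries
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.read()
        except urllib.error.HTTPError as e:
            if e.code == 429 and attempt <= max_retries:
                # Linear backoff: waits 2s, 4s, 6s with the defaults above.
                time.sleep(backoff_factor * attempt)
                continue
            raise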

@gir-bot added the S: needs-review, C: snippets, and C: source labels Apr 30, 2025
@facelessuser
Owner

Interesting. To get merged, we would have to add tests to exercise the code's logic paths (mocking the actual requests). We would need to update docs as well.

I think this could be a good improvement for those who require it.

@gir-bot added the C: docs label Apr 30, 2025
@@ -237,6 +237,19 @@ If either of these is set to zero, the limits will be ignored.

To pass arbitrary HTTP headers in every HTTP request use `url_request_headers`.

/// new | New 10.16
Owner

I usually use these "New" blocks just to inform the user when a new feature was added. You should describe the settings outside of the block where we talk about url_request_headers, url_max_size, and url_timeout. Eventually, I'll remove the really old "New" blocks, but the documentation of the options should remain.

@facelessuser
Owner

@0xLucca I see this PR has stalled for a couple of weeks. Do you plan to continue working on this, or have you abandoned it?

@0xLucca
Author

0xLucca commented May 12, 2025

Hey @facelessuser, I haven’t had a chance to add the tests yet, but I’m aiming to do so this week.

@facelessuser
Owner

Cool, that's all I was looking for. I can never tell, so I just like to check in to see intentions.

@gir-bot added the C: tests label May 15, 2025
@0xLucca
Author

0xLucca commented May 15, 2025

Hey @facelessuser!

I've added the tests and updated the documentation.

For the doc update, I'm not entirely sure if this aligns with what you had in mind. Could you take a look and let me know if any changes are needed?

Regarding the tests, when I run them individually they all pass:

python3 -m pytest tests/test_extensions/test_snippets.py::TestURLSnippetsRetry::test_successful_retry

However, running the whole class causes some of them to fail:

python3 -m pytest tests/test_extensions/test_snippets.py::TestURLSnippetsRetry

It seems to be a concurrency-related issue, or perhaps some shared state between tests.

Let me know what you think; I'm happy to adjust things based on your feedback.

@facelessuser
Owner

I imagine some of the network stuff is going to have to be mocked. We aren't really trying to unit test the network library, but rather how we respond when it returns a failure, etc.

I haven't looked too closely at it yet, but we don't really need to be doing real network calls; we should be simulating them so we can make sure we are handling things appropriately based on our expectations of the returns.

@0xLucca
Author

0xLucca commented May 16, 2025

Just to clarify, all the network calls in the tests are already mocked, so no real network requests are being made.
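For reference, one generic way to stub out the network layer with unittest.mock, reusing the hypothetical download_with_retry sketch from the description above; this is an illustration, not the PR's actual test code:

import urllib.error
from unittest import mock


def test_retry_after_rate_limit():
    # First call raises HTTP 429; second call returns a successful response.
    rate_limited = urllib.error.HTTPError(
        'https://example.com/snippet.md', 429, 'Too Many Requests', {}, None
    )
    ok_response = mock.MagicMock()
    ok_response.__enter__.return_value.read.return_value = b'snippet content'

    with mock.patch('urllib.request.urlopen', side_effect=[rate_limited, ok_response]), \
         mock.patch('time.sleep'):  # skip the real backoff wait in tests
        assert download_with_retry('https://example.com/snippet.md') == b'snippet content'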

s_lines.append('')
return s_lines

except urllib.error.HTTPError as e:
Owner

I thought we were checking status earlier; why would we now need to check it here? Are we actually getting a raised exception now instead of just being able to check the status? This isn't inherently bad per se, but I'm curious what's actually getting exercised in the code.

Owner

Actually, maybe we don't need a try/except. Since we are checking status, that should be handled earlier. If we are getting an HTTP error, we have more issues than just a bad status; we should probably fail.

Owner

I do see in the urllib code that some statuses can raise an exception. It looks like 429 may do this. If that is the case, then the earlier 429 check is not needed.
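For context, the default urllib opener does raise HTTPError for 4xx/5xx responses, including 429, so the rate-limit handling would end up in the exception path rather than in a status check. A generic illustration with a placeholder URL:

import urllib.error
import urllib.request

try:
    # With the default opener, a 429 response raises HTTPError, so a
    # `response.status == 429` check after a successful urlopen() never runs.
    urllib.request.urlopen('https://example.com/rate-limited-endpoint')
except urllib.error.HTTPError as e:
    if e.code == 429:
        ...  # back off and retry here
    else:
        raise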

time.sleep(wait)
continue
raise SnippetMissingError(f"Cannot download snippet '{url}': {str(e)}")
except Exception as e:
Owner

This should not be here. Now we are altering the return. We may be raising a ValueError because the payload is too big, but now it's being swallowed, retried, and the exception type changed. This can cause existing tests to fail. We should break here by just calling `raise` without a new exception, which will let the original exception bubble up.
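To make the distinction concrete, a small self-contained sketch of the two behaviors; the names are generic and not taken from the PR:

def risky_download():
    # Stand-in for the download step; mirrors the "payload too big" ValueError.
    raise ValueError('refusing to download, payload too big')


def wrapped():
    try:
        risky_download()
    except Exception as e:
        # Re-wrapping changes the exception type the caller sees.
        raise RuntimeError(str(e))


def bubbled():
    try:
        risky_download()
    except Exception:
        # A bare `raise` re-raises the original exception unchanged,
        # so the caller still sees the ValueError.
        raise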

# Fail if status is not OK
status = response.status if util.PY39 else response.code

if status == 429: # Rate limited
Owner

This should be simplified as below, which should handle all bad statuses.

if status != 200:
    if status == 429 and attempt < self.max_retries:
        wait = self.backoff_factor * attempt
        time.sleep(wait)
        continue
    raise SnippetMissingError(f"Cannot download snippet '{url}'")

Additionally, if we are capturing the rate-limited status here, why do we need to check for urllib.error.HTTPError later?

@facelessuser
Owner

@0xLucca I've added some comments. I suspect that because you are altering how exceptions are handled, you are now changing how the extension behaves, causing tests to fail.

@facelessuser
Owner

It looks like there are a number of issues; I'm working through them.

@facelessuser
Owner

There were a number of issues.

  1. How exceptions were being handled.
  2. Global config in the testing infrastructure was changed, but only the local processor was restored; the global config was left altered, causing the next test to get unexpected results. Additionally, the order in which tests are run in a test class is not guaranteed.
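A generic pattern for the second issue: snapshot any shared configuration in setUp and restore it in tearDown so results do not depend on test execution order. The config dictionary and class below are illustrative, not the project's actual test infrastructure:

import copy
import unittest

GLOBAL_CONFIG = {'max_retries': 3, 'backoff_factor': 2}


class TestWithIsolatedConfig(unittest.TestCase):
    def setUp(self):
        # Snapshot shared state before each test.
        self._saved = copy.deepcopy(GLOBAL_CONFIG)

    def tearDown(self):
        # Restore it afterwards so later tests see pristine defaults.
        GLOBAL_CONFIG.clear()
        GLOBAL_CONFIG.update(self._saved)

    def test_mutates_config(self):
        GLOBAL_CONFIG['max_retries'] = 5
        self.assertEqual(GLOBAL_CONFIG['max_retries'], 5)

    def test_sees_defaults(self):
        # Passes regardless of whether test_mutates_config ran first.
        self.assertEqual(GLOBAL_CONFIG['max_retries'], 3)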

facelessuser added a commit that referenced this pull request May 16, 2025
This is a refactor of the work done by @0xLucca and replaces and resolves #2656
@facelessuser
Owner

@0xLucca, I refactored your work in PR #2669, fixing various issues. It was just easier for me to quickly resolve since you had it 99% there.

@facelessuser
Owner

I went ahead and merged the work. Feel free to create another PR if you find issues or create a bug issue.

@0xLucca
Author

0xLucca commented May 19, 2025

> I went ahead and merged the work. Feel free to create another PR if you find issues or create a bug issue.

Thanks! I will give this a try and let you know.
