Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle Requests' ChunkedEncodingError exceptions when downloading files #1960

Closed
rmartin16 opened this issue Aug 14, 2024 · 4 comments · Fixed by #2041
Closed

Handle Requests' ChunkedEncodingError exceptions when downloading files #1960

rmartin16 opened this issue Aug 14, 2024 · 4 comments · Fixed by #2041
Labels
bug A crash or error in behavior.

Comments

@rmartin16
Copy link
Member

rmartin16 commented Aug 14, 2024

Describe the bug

A download failure mode for Requests is a ChunkedEncodingError and Briefcase does not handle this when downloading arbitrary files.

For instance:

2024-08-14T08:50:59.1255448Z [verifyapp] Installing support package...
2024-08-14T08:50:59.1358324Z Using custom support package https://github.com/rmartin16/python-standalone-releases/releases/download/continuous/cpython-3.9-aarch64-unknown-linux-gnu-noopt.tar.gz
2024-08-14T08:50:59.3866734Z Downloading cpython-3.9-aarch64-unknown-linux-gnu-noopt.tar.gz...
2024-08-14T08:51:00.2657989Z    \ ━━━━━━━━━━━━━━━━━━━━━━━╸                           47.9% • 00:01
2024-08-14T08:51:00.2659826Z 
2024-08-14T08:51:00.2902173Z 
2024-08-14T08:51:06.1506679Z Log saved to /home/runner/work/briefcase-arch-test/briefcase-arch-test/verifyapp/logs/briefcase.2024_08_14-08_51_00.create.log
2024-08-14T08:51:06.1572024Z 
2024-08-14T08:51:06.1636586Z 
2024-08-14T08:51:06.1643694Z 
2024-08-14T08:51:06.1659243Z Traceback (most recent call last):
2024-08-14T08:51:06.1664565Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/urllib3/response.py", line 748, in _error_catcher
2024-08-14T08:51:06.1687342Z     yield
2024-08-14T08:51:06.1689610Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/urllib3/response.py", line 894, in _raw_read
2024-08-14T08:51:06.1699939Z     raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
2024-08-14T08:51:06.1704736Z urllib3.exceptions.IncompleteRead: IncompleteRead(20971520 bytes read, 22835863 more expected)
2024-08-14T08:51:06.1708533Z 
2024-08-14T08:51:06.1709127Z The above exception was the direct cause of the following exception:
2024-08-14T08:51:06.1709865Z 
2024-08-14T08:51:06.1710088Z Traceback (most recent call last):
2024-08-14T08:51:06.1713666Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/requests/models.py", line 820, in generate
2024-08-14T08:51:06.1722126Z     yield from self.raw.stream(chunk_size, decode_content=True)
2024-08-14T08:51:06.1723614Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/urllib3/response.py", line 1060, in stream
2024-08-14T08:51:06.1737314Z     data = self.read(amt=amt, decode_content=decode_content)
2024-08-14T08:51:06.1738885Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/urllib3/response.py", line 949, in read
2024-08-14T08:51:06.1750958Z     data = self._raw_read(amt)
2024-08-14T08:51:06.1752209Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/urllib3/response.py", line 902, in _raw_read
2024-08-14T08:51:06.1762120Z     self._fp.close()
2024-08-14T08:51:06.1762900Z   File "/usr/lib/python3.9/contextlib.py", line 135, in __exit__
2024-08-14T08:51:06.1767196Z     self.gen.throw(type, value, traceback)
2024-08-14T08:51:06.1768320Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/urllib3/response.py", line 772, in _error_catcher
2024-08-14T08:51:06.1778412Z     raise ProtocolError(arg, e) from e
2024-08-14T08:51:06.1780954Z urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(20971520 bytes read, 22835863 more expected)', IncompleteRead(20971520 bytes read, 22835863 more expected))
2024-08-14T08:51:06.1781773Z 
2024-08-14T08:51:06.1782017Z During handling of the above exception, another exception occurred:
2024-08-14T08:51:06.1782547Z 
2024-08-14T08:51:06.1782744Z Traceback (most recent call last):
2024-08-14T08:51:06.1784508Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/bin/briefcase", line 8, in <module>
2024-08-14T08:51:06.1785451Z     sys.exit(main())
2024-08-14T08:51:06.1786402Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/briefcase/__main__.py", line 29, in main
2024-08-14T08:51:06.1789169Z     command(**options)
2024-08-14T08:51:06.1790447Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/briefcase/commands/create.py", line 949, in __call__
2024-08-14T08:51:06.1801804Z     state = self.create_app(app, **full_options(state, options))
2024-08-14T08:51:06.1803740Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/briefcase/commands/create.py", line 888, in create_app
2024-08-14T08:51:06.1814779Z     self.install_app_support_package(app=app)
2024-08-14T08:51:06.1816771Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/briefcase/commands/create.py", line 313, in install_app_support_package
2024-08-14T08:51:06.1823926Z     support_file_path = self._download_support_package(app)
2024-08-14T08:51:06.1825991Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/briefcase/commands/create.py", line 374, in _download_support_package
2024-08-14T08:51:06.1831846Z     return self.tools.file.download(
2024-08-14T08:51:06.1833614Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/briefcase/integrations/file.py", line 214, in download
2024-08-14T08:51:06.1840003Z     self._fetch_and_write_content(response, filename)
2024-08-14T08:51:06.1841931Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/briefcase/integrations/file.py", line 250, in _fetch_and_write_content
2024-08-14T08:51:06.1845802Z     for data in response.iter_content(chunk_size=1024 * 1024):
2024-08-14T08:51:06.1847381Z   File "/runner/work/briefcase-arch-test/briefcase-arch-test/venv/lib/python3.9/site-packages/requests/models.py", line 822, in generate
2024-08-14T08:51:06.1857449Z     raise ChunkedEncodingError(e)
2024-08-14T08:51:06.1859009Z requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(20971520 bytes read, 22835863 more expected)', IncompleteRead(20971520 bytes read, 22835863 more expected))

Steps to reproduce

Cannot be reproduced at will.

Expected behavior

Catch ChunkedEncodingError for file downloads. Or maybe just catch any exception that Requests can raise since they should all mean the same thing: the download was unsuccessful.

Screenshots

No response

Environment

  • Operating System: virtualized arm64 Bullseye on GitHub Linux runner 2.319.0
  • Python version: 3.9
  • Software versions:

Logs

No response

Additional context

No response

@rmartin16 rmartin16 added the bug A crash or error in behavior. label Aug 14, 2024
@freakboy3742
Copy link
Member

Agreed we should be catching any error that clearly comes from the download - the question is how to determine what that canonical list is.

@sarayourfriend
Copy link
Contributor

sarayourfriend commented Oct 17, 2024

I'd like to work on this. I agree with @freakboy3742, catching all the errors would be a good idea.

Requests handles the possible underlying urllib3 errors and translates them into ChunkedEncodingError, ContentDecodingError, ConnectionError and RequestsSSLError. Catching those should suffice to cover the relevant errors for this code. (Expand for detailed review of the requests and urllib3 code)

Here's the relevant code in requests: https://github.com/psf/requests/blob/7335bbf480adc8e6aa88feb2022797a549a00aa3/src/requests/models.py#L819-L828

For reference, it can raise the following:

  • ChunkedEncodingError when catching ProtocolError
  • ContentDecodingError when catching DecodeError
  • ConnectionError when catching ReadTimeoutError
  • RequestsSSLError when catching SSLError. RequestsSSLError is a subclass of ConnectionError.

urllib3's stream function (that one in the requests code that is called as self.raw.stream) ultimately calls to read_chunked which will raise:

  • ResponseNotChunked: caught by requests as a subclass of ProtocolError
  • BodyNotHttplibCompatible: subclass of HTTPError. Not it's possible to reach it with this briefcase code, best to not handle this and wait for a bug report if it is possible? It is not handled by requests.

read_chunked uses the _error_catcher context manager around the code that actually reads the chunks. That context manager more or less translates any read-time error into a ProtocolError, SSLError, or ReadTimeoutError.

These could be caught and translated into BadNetworkResourceError. Let me know what y'all think, I can open a PR for that later today if it sounds good.

@freakboy3742
Copy link
Member

Wow... that's a seriously thorough analysis - nice work!

I agree that we don't need to keep the fidelity of all these different failure modes - "download didn't work" (with some extra error message context if it's illuminating) is enough granularity at the level that Briefcase is surfacing these errors.

At least at present, BadNetworkResourceError is framed as a "returns a HTTP status code" error response; these error modes are more in the "Didn't even get to the point of returning a status code" bucket. That said, the only error codes we really care about are 404 and 500, and 404 is already mapped to MissingNetworkResourceError, so I suspect the fix here will involve moving BadNetworkResourceError towards a more generic "network bad" error (that includes the HTTP status code in the error message if one is available). However, I wouldn't be opposed to adding a new exception (ConnectionFailedError?`) to cover the "no status code" case if that makes more sense in the implementation or error handling.

On a related note - if you're up for a slightly larger piece of related work, I'll draw your attention to #2039. I'm fairly sure this ticket can be closed out with a relatively minor modification to exception handling; but if there's a need for anything more substantial, it might be worth taking a step back and re-visiting the assumption that requests is the right tool for the job in this case.

@sarayourfriend
Copy link
Contributor

Thanks for the tip on that related issue. I would be very interested to work on that httpx issue. I'll try the simple (:crossed_fingers:) approach we've discussed here for this exception. While httpx's streaming API is slightly different to requests, I think the changes in this particular code path to use httpx will mostly be in File.download. In other words, I think it's possible to address this in _fetch_and_write_content somewhat agnostic of the HTTP client library, with the understanding that to use httpx anywhere, we'd have to swap references for the exception classes anyway.

Well, in any case, I'll try out the simple fix we've discussed and see how it goes 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A crash or error in behavior.
Projects
None yet
3 participants