Skip to content

gh-118107: Fix zipimporter ZIP64 handling. #118108

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
May 7, 2024
Merged

Conversation

jsirois
Copy link
Contributor

@jsirois jsirois commented Apr 19, 2024

The code that handles too large files and offsets was missing an import
and was manipulating a tuple as if it was a list.

The code that handles too large files and offsets was missing an import.
@ghost
Copy link

ghost commented Apr 19, 2024

All commit authors signed the Contributor License Agreement.
CLA signed

Copy link
Contributor

@itamaro itamaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for catching this and the quick patch!

could you add a test case that covers this code path please?

@jsirois
Copy link
Contributor Author

jsirois commented Apr 19, 2024

could you add a test case that covers this code path please?

@itamaro Absolutely. In Pex my test builds a >4GB file to add to the zip to trigger this (I run the test if cat is present and use that in 32 doubling operations to grow a file with a single newline into the required size). Is it OK to add a test that generates a >4GB temp file? I was wary since CPython has such a diverse dev community and I suspect a diverse range of personal and otherwise machines that its tests run on.

@Eclips4
Copy link
Member

Eclips4 commented Apr 19, 2024

could you add a test case that covers this code path please?

@itamaro Absolutely. In Pex my test builds a >4GB file to add to the zip to trigger this (I run the test if cat is present and use that in 32 doubling operations to grow a file with a single newline into the required size). Is it OK to add a test that generates a >4GB temp file? I was wary since CPython has such a diverse dev community and I suspect a diverse range of personal and otherwise machines that its tests run on.

That's a good point!
Sadly, we do not have helper like skip_no_disk_space in our support package (It's already present in Lib/test/test_largefile.py). Maybe it's a good reason to add it to the support? If so, it's should be done in a separate PR.

@itamaro
Copy link
Contributor

itamaro commented Apr 19, 2024

right, I think either using skip_no_disk_space + requires('largefile' ...) to gate the test, or making it a bigmemtest if it can be done in-memory instead of using actual files on disk

@jsirois
Copy link
Contributor Author

jsirois commented Apr 19, 2024

Maybe it's a good reason to add it to the support? If so, it's should be done in a separate PR.

@Eclips4 and @itamaro how about I add a test in this PR with code similar to that large file test setup code you pointed me at with an eye to parameterization of the file size and then if I get another positive ACK that that code should be moved to support and de-deduped, then I'll detour to pull it out to a new PR?

@itamaro
Copy link
Contributor

itamaro commented Apr 19, 2024

@Eclips4 and @itamaro how about I add a test in this PR with code similar to that large file test setup code you pointed me at with an eye to parameterization of the file size and then if I get another positive ACK that that code should be moved to support and de-deduped, then I'll detour to pull it out to a new PR?

sounds good to me

@@ -810,6 +812,27 @@ def testZip64CruftAndComment(self):
files = self.getZip64Files()
self.doTest(".py", files, "f65536", comment=b"c" * ((1 << 16) - 1))

def testZip64LargeFile(self):
support.requires(
Copy link
Contributor Author

@jsirois jsirois Apr 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had introduced the decorator from Lib/test/test_largefile.py and then stumbled upon the resources support (-u) that seems more widely used; so I went with it instead. The tests only run (and pass) with -ulargefile:

./python -m test test_zipimport -ulargefile -v -m testZip64LargeFile
== CPython 3.13.0a6+ (heads/fix-issue-118107:8d55e2f226, Apr 19 2024, 20:12:18) [GCC 11.4.0]
== Linux-5.15.146.1-microsoft-standard-WSL2-x86_64-with-glibc2.35 little-endian
== Python build: debug
== cwd: /home/jsirois/dev/python/cpython/build/test_python_worker_191158æ
== CPU count: 16
== encodings: locale=UTF-8 FS=utf-8
== resources (1): largefile

Using random seed: 3331549713
0:00:00 load avg: 3.80 Run 1 test sequentially
0:00:00 load avg: 3.80 [1/1] test_zipimport
testZip64LargeFile (test.test_zipimport.CompressedZipImportTestCase.testZip64LargeFile) ... ok
testZip64LargeFile (test.test_zipimport.UncompressedZipImportTestCase.testZip64LargeFile) ... ok

----------------------------------------------------------------------
Ran 2 tests in 42.905s

OK
test_zipimport passed in 43.0 sec

== Tests result: SUCCESS ==

1 test OK.

Total duration: 43.0 sec
Total tests: run=2 (filtered)
Total test files: run=1/1 (filtered)
Result: SUCCESS

values = struct.unpack_from(f"<{min(num_extra_values, 3)}Q",
extra_data, offset=4)
import struct
values = list(struct.unpack_from(f"<{min(num_extra_values, 3)}Q",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new test caught the need for the list here. Prior to the fix you'd see:

ERROR: testZip64LargeFile (test.test_zipimport.UncompressedZipImportTestCase.testZip64LargeFile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<frozen importlib._bootstrap_external>", line 1510, in _path_importer_cache
KeyError: '/home/jsirois/dev/python/cpython/build/test_python_worker_140690æ/junk95142.zip'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jsirois/dev/python/cpython/Lib/test/test_zipimport.py", line 89, in wrapper
    return fun(*args, **kwargs)
           ~~~^^^^^^^^^^^^^^^^^
  File "/home/jsirois/dev/python/cpython/Lib/test/test_zipimport.py", line 89, in wrapper
    return fun(*args, **kwargs)
           ~~~^^^^^^^^^^^^^^^^^
  File "/home/jsirois/dev/python/cpython/Lib/test/test_zipimport.py", line 849, in testZip64LargeFile
    self.doTestWithPreBuiltZip(".py", "large_file")
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
  File "/home/jsirois/dev/python/cpython/Lib/test/test_zipimport.py", line 157, in doTestWithPreBuiltZip
    mod = importlib.import_module(".".join(modules))
          ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "/home/jsirois/dev/python/cpython/Lib/importlib/__init__.py", line 88, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1322, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1262, in _find_spec
  File "<frozen importlib._bootstrap_external>", line 1553, in find_spec
  File "<frozen importlib._bootstrap_external>", line 1525, in _get_spec
  File "<frozen importlib._bootstrap_external>", line 1512, in _path_importer_cache
  File "<frozen importlib._bootstrap_external>", line 1488, in _path_hooks
  File "<frozen zipimport>", line 98, in __init__
  File "<frozen zipimport>", line 528, in _read_directory
AttributeError: 'tuple' object has no attribute 'pop'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another good catch! glad you added the test case!

Copy link
Contributor

@itamaro itamaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, this lgtm!

values = struct.unpack_from(f"<{min(num_extra_values, 3)}Q",
extra_data, offset=4)
import struct
values = list(struct.unpack_from(f"<{min(num_extra_values, 3)}Q",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another good catch! glad you added the test case!

…dsr1J.rst

Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
@gpshead gpshead self-assigned this Apr 22, 2024
@jsirois
Copy link
Contributor Author

jsirois commented May 5, 2024

Alrighty @gpshead, I think this is good to go.

Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checked-in files contain mtimes, so they aren't reproducible. A few bytes were different when I re-ran the command.

I don't think this should block the PR.

@encukou encukou self-assigned this May 7, 2024
@encukou
Copy link
Member

encukou commented May 7, 2024

Thanks! Deleting the files and running the test now gives bit-for-bit identical results.

I've added zipimport_data to TESTSUBDIRS in Makefile.

@encukou encukou merged commit 49258ef into python:main May 7, 2024
33 checks passed
@jsirois jsirois deleted the fix-issue-118107 branch May 7, 2024 14:00
SonicField pushed a commit to SonicField/cpython that referenced this pull request May 8, 2024
Add missing import to code that handles too large files and offsets.
Use list, not tuple, for a mutable sequence.

Add tests to prevent similar mistakes.

---------

Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
facebook-github-bot pushed a commit to facebookincubator/cinder that referenced this pull request Apr 5, 2025
Summary:
My fix matches upstream

python/cpython#118108 (comment)

Reviewed By: itamaro

Differential Revision: D72491515

fbshipit-source-id: 8a469cb9c6612997dd42e8a349c947df32a6586a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants