
bpo-34990: Change pyc headers to use 64-bit timestamps #19651


Closed

@ammaraskar (Member) commented Apr 22, 2020

This is a more permanent alternative to the compileall issue. As Benjamin originally suggested in the PEP 552 pull request, this changes both the timestamp and size fields to 64-bit integers. In hash-based pyc files, the space those fields would otherwise occupy is padded with zeroes.
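For readers unfamiliar with the header format, here is a rough sketch of the layouts being discussed. The 16-byte layout follows PEP 552 (magic, flags, then a 32-bit mtime and 32-bit size). The two 64-bit variants are only an assumption based on the description above, not a copy of the patch, and the function names are made up for illustration.

```python
import struct
import importlib.util

MAGIC = importlib.util.MAGIC_NUMBER  # 4 bytes, specific to the interpreter version


def header_32(mtime: int, size: int) -> bytes:
    """PEP 552 timestamp header: magic, flags=0, 32-bit mtime, 32-bit size."""
    return MAGIC + struct.pack("<III", 0, mtime & 0xFFFFFFFF, size & 0xFFFFFFFF)


def header_64(mtime: int, size: int) -> bytes:
    """Assumed layout for this PR: the same fields widened to 64 bits."""
    return MAGIC + struct.pack("<IQQ", 0, mtime, size)


def hashed_header_64(source_hash: bytes, check_source: bool = True) -> bytes:
    """Assumed hash-based layout: 8-byte hash, then 8 zero bytes of padding."""
    if len(source_hash) != 8:
        raise ValueError("PEP 552 source hashes are 8 bytes long")
    # Lowest flag bit marks a hash-based pyc; the next bit is check_source.
    flags = 0b01 | (0b10 if check_source else 0)
    return MAGIC + struct.pack("<I", flags) + source_hash + bytes(8)
```

Under these assumptions both new header variants have the same length, which is why the hash-based form needs the zero padding.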

It is possible that I missed a few tests since the larger 8-byte field can masquerade as the two 4-byte fields easily.

As a bonus, this also lets us support large source files, though I question whether anyone is really compiling Python files larger than 2 GB or 4 GB.

https://bugs.python.org/issue34990

@bmwiedemann
Contributor

In practice, none of those fields really needs to be 64-bit, as long as timestamps are handled the way it is specified, with & 0xffff_ffff. And I guess the change makes backporting to existing releases hard. Though we all hope that nobody will use python-3.7 in 18 years, we cannot be certain.
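A small sketch of why the mask is enough, assuming the writer and the reader both apply it: a timestamp that no longer fits in 32 bits is truncated the same way on both sides, so the staleness comparison still works. The helper names are hypothetical and do not come from the CPython source.

```python
MASK = 0xFFFFFFFF


def stored_mtime(source_mtime: int) -> int:
    """Value written into the 4-byte mtime field of the pyc header."""
    return source_mtime & MASK


def is_stale(stored: int, source_mtime: int) -> bool:
    """Compare the stored field against the masked mtime of the source file."""
    return stored != (source_mtime & MASK)


large_mtime = 2**32 + 1_000  # does not fit in 32 bits
field = stored_mtime(large_mtime)

print(field)                             # the truncated value that gets stored
print(is_stale(field, large_mtime))      # unchanged source: not stale
print(is_stale(field, large_mtime + 1))  # touched source: stale
```

The trade-off is that two mtimes exactly 2**32 seconds apart become indistinguishable, which is unlikely to matter in practice.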

@brettcannon brettcannon removed their request for review April 22, 2020 21:25
@ammaraskar
Member Author

> handling of timestamps is done the way it is specified with &0xffff_ffff

Sorry, I might be missing something: is this specified in a PEP or some other document? Or are you referring to the mask in _bootstrap_external.py?

> And I guess the change makes backporting to existing releases hard

That's true, I don't particularly mind backporting the changes to 3.7 as I think this qualifies as a bugfix, though the release manager will have final say on that.


I pinged Stephane about their original PR. If they give me the go-ahead to take it over, I can update it so we can compare which approach is better.

@bmwiedemann
Contributor

I learned it from https://bugs.python.org/issue34990#msg327750

I went through git history to find it:
5136ac0c added source_size = source_stats['size'] & 0xFFFFFFFF for bpo-13645,
and for PEP 552 this concept was extended to source_mtime in 42aa93b8,
though neither mentioned it in text anywhere.

For backporting, compatibility might be a consideration. What happens if such new .pyc files created by python-3.7.x are encountered by python-3.7.0?

@ammaraskar
Member Author

> I went through git history to find it

Thank you for investigating! I'm speculating, but as best I can tell, & 0xFFFFFFFF was added to deal with the fact that stat->st_size could be a 64-bit value. The mask likely serves to make sure it fits in the 4-byte field.
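To illustrate that guess (this is only a sketch, not the code from _bootstrap_external.py): packing a size above 4 GiB into an unsigned 32-bit field fails unless it is masked first.

```python
import struct

big_size = 5 * 1024**3  # a hypothetical 5 GiB source file

# Without the mask the value does not fit into an unsigned 32-bit field.
try:
    struct.pack("<I", big_size)
    fits_unmasked = True
except struct.error:
    fits_unmasked = False

# With the mask only the low 32 bits are kept, so packing succeeds.
packed = struct.pack("<I", big_size & 0xFFFFFFFF)
```

The masked value is no longer the real file size, but it is still a usable consistency check as long as the reader masks the size it gets from stat the same way.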

> What happens if such new .pyc files created by python-3.7.x are encountered by python-3.7.0?

Aah, I see what you mean. Now that I look at it, I don't think there's any precedent for backporting changes to the bytecode format mid-release. I think what ends up happening is that the MAGIC value change forces the pyc files to be regenerated; however, this does break pyc-only distributions.
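A minimal sketch of the magic check behind that reasoning, assuming a loader that simply compares the first four bytes of the header; `magic_matches` is a hypothetical helper, not an importlib function.

```python
import importlib.util


def magic_matches(pyc_header: bytes) -> bool:
    """Return True if the header starts with this interpreter's magic number."""
    return pyc_header[:4] == importlib.util.MAGIC_NUMBER


# Header written by the running interpreter: accepted.
current = importlib.util.MAGIC_NUMBER + bytes(12)

# Header carrying a different (made-up) magic value: rejected.
other = b"\x00\x00\r\n" + bytes(12)

print(magic_matches(current), magic_matches(other))
```

When a source file sits next to the pyc, a mismatch just triggers recompilation. With a sourceless pyc there is nothing to recompile from, which is the breakage mentioned above.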

I'll leave it to the experts, but I don't think this can be safely backported.


I've gotten the go-ahead from Stephane to continue their work from the previous PR, so worst case we can apply that to 3.7 and 3.8.

@ammaraskar ammaraskar reopened this Apr 25, 2020
@ammaraskar
Member Author

Closing in favor of #19708
