Description
Hi,
in Debian we backported the changes from #27926 to CPython 3.10, and indeed most pyc files that we generate when installing Python packages are now bit-for-bit reproducible. Only a single pyc file still sometimes comes out differently after installation: /usr/lib/python3.10/json/__pycache__/decoder.cpython-310.pyc. I tracked down the reason to its use of the variable name _m and filed this as Debian bug 1010368. More specifically, that pyc file has one content in 1/3 of the cases and another in the remaining 2/3. The diff between the pyc files is:
@@ -1,8 +1,8 @@
00000000: 6f0d 0d0a 0300 0000 5371 fe33 17b6 dd59 o.......Sq.3...Y
00000010: e300 0000 0000 0000 0000 0000 0000 0000 ................
00000020: 0001 0000 0040 0000 0073 0800 0000 6500 .....@...s....e.
-00000030: 0100 6400 5300 2901 4e29 01da 025f 6da9 ..d.S.).N)..._m.
-00000040: 0072 0200 0000 7202 0000 00fa 0f2f 746d .r....r....../tm
+00000030: 0100 6400 5300 2901 4e29 015a 025f 6da9 ..d.S.).N).Z._m.
+00000040: 0072 0100 0000 7201 0000 00fa 0f2f 746d .r....r....../tm
00000050: 702f 6465 636f 6465 722e 7079 da08 3c6d p/decoder.py..<m
00000060: 6f64 756c 653e 0100 0000 7302 0000 0008 odule>....s.....
00000070: 00
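The two dumps differ at offset 0x3b (0xda versus 0x5a) and in the two reference indices that follow at 0x42 and 0x47. As a minimal sketch, assuming the convention from CPython's Python/marshal.c that the high bit of a marshal type byte is FLAG_REF (0x80), the differing byte can be decoded like this. The helper name split_type_byte is made up for illustration and is not part of any library:

```python
FLAG_REF = 0x80  # high bit of a marshal type byte (see Python/marshal.c)


def split_type_byte(byte):
    """Split one marshal type byte into (type_code, has_flag_ref)."""
    return chr(byte & 0x7F), bool(byte & FLAG_REF)


# The differing byte at offset 0x3b in the two dumps above:
print(split_type_byte(0xDA))  # ('Z', True)
print(split_type_byte(0x5A))  # ('Z', False)
```

Both bytes carry the same type code Z (a short interned ASCII string, here _m); only the flag differs. The reference indices then shift from 2 to 1, which is consistent with the string no longer occupying a slot in the reference table.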
I can make this problem trigger on a different variable name than _m via the following patch:
--- a/Lib/types.py
+++ b/Lib/types.py
@@ -37,8 +37,8 @@ _ag = _ag()
 AsyncGeneratorType = type(_ag)
 
 class _C:
-    def _m(self): pass
-MethodType = type(_C()._m)
+    def _b(self): pass
+MethodType = type(_C()._b)
 
 BuiltinFunctionType = type(len)
 BuiltinMethodType = type([].append)     # Same as BuiltinFunctionType
With that patch, Python files containing the variable name _b are now sometimes unreproducible.
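To compare results between installations, one option is to compile a one-line file that only mentions the name and fingerprint the resulting pyc. The following is a rough sketch, not the procedure used for the Debian bug: it assumes hash-based pyc files (py_compile's CHECKED_HASH mode) and runs the compilation in a fresh interpreter each time. pyc_digest is a hypothetical helper name.

```python
import hashlib
import pathlib
import subprocess
import sys

# Executed in a fresh interpreter so each compilation starts from a clean state.
_COMPILE_SNIPPET = (
    "import py_compile, sys; "
    "py_compile.compile(sys.argv[1], cfile=sys.argv[2], doraise=True, "
    "invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH)"
)


def pyc_digest(source_path, pyc_path, python=sys.executable):
    """Compile source_path to pyc_path in a new process and hash the result."""
    subprocess.run(
        [python, "-c", _COMPILE_SNIPPET, str(source_path), str(pyc_path)],
        check=True,
    )
    return hashlib.sha256(pathlib.Path(pyc_path).read_bytes()).hexdigest()
```

Running pyc_digest on a file that contains only _b (or _m) on several fresh installations and collecting the digests should show whether more than one variant occurs.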
I don't seem to be the first to stumble across the _m variable: #78903 (comment)
In the Debian bug referenced above, Chris Lamb states that there is no semantic difference between the two pyc files, and Keith Amling points out that the difference is whether FLAG_REF is set.
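One way to check the "no semantic difference" claim independently is to unmarshal both files and compare the resulting code objects. This is only a sketch: it assumes the 16-byte pyc header used since CPython 3.7, the helper names load_code and same_semantics are invented here, and it has to be run with the same Python version that produced the files, because the marshal format is version specific.

```python
import marshal

PYC_HEADER_SIZE = 16  # magic, flags, then 8 bytes of hash or mtime/size (CPython 3.7+)


def load_code(pyc_bytes):
    """Unmarshal the code object that follows the pyc header."""
    return marshal.loads(pyc_bytes[PYC_HEADER_SIZE:])


def same_semantics(pyc_a, pyc_b):
    """True if the headers match and the unmarshalled code objects compare equal."""
    return (
        pyc_a[:PYC_HEADER_SIZE] == pyc_b[:PYC_HEADER_SIZE]
        and load_code(pyc_a) == load_code(pyc_b)
    )
```

If same_semantics returns True for two byte strings that are not identical, the difference lies only in how the objects were serialized, not in what they contain.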
Since pyc files containing the variable _m are only sometimes unreproducible across new Python installations but stable on any one installation, this might just as well be a Python packaging bug in Debian, or maybe we need to backport more than just #27926 to 3.10 to make pyc files stable across different installations. I would therefore appreciate any input or advice you can give on this issue.
Thanks!