Description
Bug report
Currently the marshal module will emit a previously-unseen object flagged as a potential reference from later objects, unless the object has a reference count of 1. See line 305 in b27b57c.
This is an overly conservative heuristic: it's easy to construct cases where an object has a reference count greater than 1 but is not actually referenced by any other object about to be marshaled, so FLAG_REF is set when it does not need to be.
This makes marshal output unstable: identical values can serialize to different bytes depending on accidents of reference counting behavior in the code calling marshal.dumps.
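A minimal sketch of the effect, assuming CPython and the default marshal version. The 0x80 value for FLAG_REF mirrors the constant in Python/marshal.c; the helper name split_type_byte and the variable names are only illustrative. The first dump is taken while extra references to the float exist, so the flag should be set; whether the second dump carries the flag depends on how many references the interpreter happens to hold during the call, which is exactly the instability described above.

```python
import marshal

FLAG_REF = 0x80  # high bit of the type byte, mirroring Python/marshal.c


def split_type_byte(data):
    """Return (type code without the flag, whether FLAG_REF is set)."""
    return data[0] & ~FLAG_REF, bool(data[0] & FLAG_REF)


value = float("1.5")
holders = [value, value]  # extra references that are not part of the dump

# Here the reference count is certainly greater than 1.
shared_dump = marshal.dumps(value)
# Here the reference count depends on how the interpreter passes the temporary.
fresh_dump = marshal.dumps(float("1.5"))

shared_type, shared_flag = split_type_byte(shared_dump)
fresh_type, fresh_flag = split_type_byte(fresh_dump)

print("FLAG_REF with extra references:", shared_flag)
print("FLAG_REF on a temporary:", fresh_flag)
print("bytes after the type byte match:", shared_dump[1:] == fresh_dump[1:])
```

Both dumps load back to the same value; at most the flag bit of the first byte differs, even though nothing else in either dump refers to the float.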
I ran into this because the Cinder JIT is able to reduce unnecessary increfs, and that resulted in some importlib tests failing on comparison of marshal output at cpython/Lib/test/test_importlib/test_abc.py, lines 870 to 871 in b27b57c, when the reference count of code_object in that method is 1.
This previously caused issues in distutils reproducibility, resulting in a partial fix that applies only to interned strings: #8226
It would be better if marshal actually determined which objects have multiple parents in the DAG and deterministically set FLAG_REF or not based on that.
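A rough sketch of that idea in Python, not a proposal for the actual C implementation: walk the object graph once before writing, count how many times each object is reached, and flag only objects reached more than once. The function names count_parents and needs_flag_ref are hypothetical, and only a few container types are handled (marshal itself also supports others, such as code objects).

```python
from collections import Counter

SEQUENCE_TYPES = (tuple, list, set, frozenset)


def count_parents(root):
    """Count how many times each object is reached while walking root."""
    reached = Counter()
    stack = [root]
    while stack:
        obj = stack.pop()
        reached[id(obj)] += 1
        if reached[id(obj)] > 1:
            continue  # children were already visited; this also stops cycles
        if isinstance(obj, dict):
            for key, item in obj.items():
                stack.append(key)
                stack.append(item)
        elif isinstance(obj, SEQUENCE_TYPES):
            stack.extend(obj)
    return reached


def needs_flag_ref(root):
    """Return ids of objects that have more than one parent under root."""
    return {oid for oid, count in count_parents(root).items() if count > 1}


shared = (1.5, 2.5)
lonely = (3.5,)  # also bound to a name, so its reference count is above 1
payload = [shared, shared, lonely]

flagged = needs_flag_ref(payload)
print(id(shared) in flagged)
print(id(lonely) in flagged)
```

In this sketch the decision depends only on the shape of the graph being marshaled, so an object like lonely, which has extra references outside the payload but a single parent inside it, would not be flagged.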