-
-
Notifications
You must be signed in to change notification settings - Fork 33.3k
bpo-45653: Freeze parts of the encodings package #30030
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ec2bf3f to
e835a2f
Compare
Python/codecs.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... This should not be necessary. It is already handled by FrozenImporter.find_spec() (in Lib/importlib/_bootstrap.py). If path is not getting set then something went wrong and needs to be fixed.
Is it here to provide a fallback for the config->stdlib_dir == NULL case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something is not working right when embedding Python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😞
I'll try to take a look this week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I spent some time looking at this today. I take back what I said: "This should not be necessary." The approach you took is probably good enough until we can find a better solution. I plan on troubleshooting the test_embed failures if you don't figure them out first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that sys._stdlib_dir is set to None. This can happen in some embedding scenarios (for now). This is the reason why FrozenImporter.find_spec() doesn't populate encodings.__path__ in the failing tests. It is why _set_encodings_path() is failing.
The problem is that sys._stdlib_dir is set to None. sys._stdlib_dir is set from _Py_GetStdlibDir(), which returns the value calculated by the getpath.c code during runtime init. In some embedding cases that code refuses to extrapolate the stdlib dir, so it ends up None.
FrozenImporter.find_spec() uses sys._stdlib_dir to figured out the encodings.__path__ entry to add. If the stdlib dir is unknown then it doesn't add any. This is also why _set_encodings_path() isn't working.
We have several options:
- extrapolate
sys._stdlib_dirsome other way - fall back to using the non-frozen encodings module
- freeze all "encodings" submodules
(2) effectively accomplishes the same thing as (1), though it doesn't actually update sys._stdlib_dir. Furthermore, we already know it works. (2) also has the benefit of being very simple, since we'd use the normal import machinery unchanged. (Note that (1) and (2) are not guaranteed to find the stdlib dir. However, with (2) that failure mode already exists, so embedders would already have to deal with it.) (3) would get what we want but would make the compiled binary bigger and would add a bunch of noise to make output when building.
So I recommend (2).
(2) involves 2 things: drop _set_encodings_path() here, and update FrozenImporter.find_spec().
diff --git a/Lib/importlib/_bootstrap.py b/Lib/importlib/_bootstrap.py
index afb95f4e1d..4200afae7c 100644
--- a/Lib/importlib/_bootstrap.py
+++ b/Lib/importlib/_bootstrap.py
@@ -941,6 +941,9 @@ def find_spec(cls, fullname, path=None, target=None):
origin=cls._ORIGIN,
is_package=ispkg)
filename, pkgdir = cls._resolve_filename(origname, fullname, ispkg)
+ if ispkg and not pkgdir:
+ # We can't resolve __path__, so Fall back to the path finder.
+ return None
spec.loader_state = type(sys.implementation)(
filename=filename,
origname=origname,|
The question is, do we still need this version, or is Kumar's original #29788 sufficient? I'd like to have a single PR that we can actually merge. :-) |
Signed-off-by: Christian Heimes <christian@python.org> Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
0ce223d to
eef175c
Compare
The PR is inspired by Kumar's PR GH-29788. I extended
_PyCodecRegistry_Initto set theencodings.__path__to[os.path.join(config->stdlib_dir, "encodings)].Signed-off-by: Christian Heimes christian@python.org
Co-authored-by: Kumar Aditya 59607654+kumaraditya303@users.noreply.github.com
https://bugs.python.org/issue45653