pip's subprocesses have different module search path from parent process. #12309

Open
1 task done
haampie opened this issue Oct 2, 2023 · 8 comments
Labels
state: needs discussion This needs some more discussion type: feature request Request for a new feature

Comments

@haampie

haampie commented Oct 2, 2023

Description

Pip creates subprocesses, such as this one which runs hooks:

    python = self.python_executable
    self._subprocess_runner(
        [python, abspath(str(script)), hook_name, td],
        cwd=self.source_dir,
        extra_environ=extra_environ
    )

The issue is that this subprocess may run with different module search paths than the pip python process itself.

For example, when you run pip under python -S like this:

python -S -m pip install --no-build-isolation ...

the system and user site-packages directories are dropped from sys.path; only PYTHONPATH and the standard library are considered.

However, pip (or rather the vendored pyproject_hooks) then runs hooks by just executing sys.executable without the -S flag, meaning that hooks from system and user site-packages are executed.

There are no environment variables equivalent to -S. Closest is PYTHONNOUSERSITE=1, but that only disables user site-packages, not system site-packages -- it's equivalent to lowercase python -s, not python -S.
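A minimal way to observe the mismatch, using only the standard library (this is an illustrative sketch, not pip's code): start a parent interpreter with -S, have it spawn a child via sys.executable the way the hook runner does, and compare sys.flags.no_site in both. The function name below is hypothetical.

```python
import subprocess
import sys

# Script run by the parent interpreter (started with -S). It spawns a child
# through sys.executable, without re-adding -S, and reports both flag values.
PARENT_SCRIPT = """
import subprocess, sys
child = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.flags.no_site)"],
    check=True, capture_output=True, text=True,
)
print(sys.flags.no_site, child.stdout.strip())
"""


def no_site_in_parent_and_child() -> tuple[int, int]:
    """Return (parent no_site, child no_site) for a parent started with -S."""
    result = subprocess.run(
        [sys.executable, "-S", "-c", PARENT_SCRIPT],
        check=True, capture_output=True, text=True,
    )
    parent_flag, child_flag = result.stdout.split()
    return int(parent_flag), int(child_flag)


if __name__ == "__main__":
    print(no_site_in_parent_and_child())
```

The parent reports that site was disabled, while the child does not, because nothing in the environment carries the -S option across the process boundary.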

Expected behavior

If pip is executed under python -S, also run subprocesses with python -S.

@haampie haampie added S: needs triage Issues/PRs that need to be triaged type: bug A confirmed bug or unintended behavior labels Oct 2, 2023
@haampie
Author

haampie commented Oct 2, 2023

FWIW, I don't immediately see public API for knowing whether -S is passed to Python, so maybe it's impossible to know in a portable way.

So, either pip shouldn't run subprocesses, or it should accept some environment variable like PIP_PYTHONFLAGS.
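For reference, CPython does expose the command-line state through sys.flags: sys.flags.no_site reflects -S and sys.flags.no_user_site reflects -s (the latter is also set by PYTHONNOUSERSITE). A rough sketch of how a subprocess command could mirror those two flags follows. The helper names are hypothetical and not part of pip or pyproject_hooks, and the sketch deliberately ignores other options such as -I, -E, and -P.

```python
import sys
from types import SimpleNamespace


def inherited_site_flags(flags=sys.flags) -> list[str]:
    """Hypothetical helper: map the parent's site-related flags to CLI options."""
    options = []
    if flags.no_site:        # parent was started with -S
        options.append("-S")
    if flags.no_user_site:   # parent was started with -s or PYTHONNOUSERSITE
        options.append("-s")
    return options


def hook_command(script: str, *args: str) -> list[str]:
    """Hypothetical helper: build a child command that mirrors -S / -s."""
    return [sys.executable, *inherited_site_flags(), script, *args]


if __name__ == "__main__":
    # Simulate a parent started with "python -S" without restarting Python.
    fake_flags = SimpleNamespace(no_site=1, no_user_site=0)
    print(inherited_site_flags(fake_flags))
    print(hook_command("hook_script.py", "hook_name"))
```

Whether pip should do something like this is a design question for the maintainers; the sketch only shows that the information is available to the parent process.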

@pfmoore
Member

pfmoore commented Oct 2, 2023

Pip has to run subprocesses, it's essentially part of the PEP 517 specification. It's possible we should be more explicit about to what extent we "inherit" information from the parent process and to what extent we explicitly configure the subprocess ("the subprocess is not isolated using -S" would be an example of that, reflecting the current behaviour).

But taking a step back, why do you need this behaviour? What actual use case requires that pip copy the state of the -S option from the invoking Python, and why is it not possible to achieve the same results in a different way (for example, by using a clean virtual environment to invoke pip)? Also, to what extent is this just about -S? Are there not equally good arguments for matching other Python configuration values, such as -s, -P, -I, -E, or the various environment variables Python responds to (both copying them to the subprocess, or explicitly removing them from the subprocess, for example PYTHONPATH)? Depending on the use case(s), we might need to consider a more general approach to "invoking Python" than simply running sys.executable - but that's a major change, so there needs to be a good reason to think it's worth it.

In practical terms this seems like a rather niche requirement, given that we've basically had few if any requests for this over the years, so I'd like to see some evidence of how significant it is before putting too much time into designing an interface.

@pfmoore pfmoore added state: needs discussion This needs some more discussion type: feature request Request for a new feature and removed type: bug A confirmed bug or unintended behavior S: needs triage Issues/PRs that need to be triaged labels Oct 2, 2023
@pradyunsg
Member

For posterity, I suggested filing an issue about this over at spack/spack#40224.

@pradyunsg
Member

And, yea a more concise description of the use case would be appreciated.

@haampie
Author

haampie commented Oct 3, 2023

> What actual use case requires that pip copy the state of the -S option

In Spack we resolve dependencies ourselves, so we want to avoid pip's "build isolation", since that would make pip do a separate solve + fetch + install of build deps.

So, we use pip install --no-build-isolation, and make build deps detectable through PYTHONPATH.

In principle this would be enough, even if the user has different versions of the same Python packages installed in, say, ~/.local, since PYTHONPATH has higher precedence.

However, things break when hooks are executed. For example setuptools does something along the lines of:

for ep in importlib.metadata.entry_points(group="setuptools.finalize_distribution_options"):
    ep.load()(self)

This scans all installed packages for entrypoints, even those that aren't dependencies of the package to be installed.

So, this could pick up some completely unrelated package in ~/.local that happens to define the relevant entrypoint.

My hope was that just running python -S -m pip install --no-build-isolation ... was enough to prevent that, but it's not, because -S isn't propagated to subprocesses where those hooks are located and executed.

> Why is it not possible to achieve the same results in a different way (for example, by using a clean virtual environment to invoke pip)?

That's a workaround we currently implement. If Python itself is not installed by Spack, and may have system site-packages, we create a temporary virtual environment (let's call it a build environment), which drops user and system site-packages also in sub-processes.

But this approach has a major downside, namely that upon installation, generated shebangs point to the temporary build environment, instead of the underlying python, so they have to be patched post-hoc. I don't see any way to prevent that.
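For illustration only, post-hoc patching of a simple shebang could look roughly like the sketch below. This is not Spack's implementation: the function names and paths are hypothetical, and it does not handle pip's special treatment of long shebang lines or Windows launcher executables.

```python
import os
import tempfile
from pathlib import Path


def rewrite_shebang(script: Path, old_python: str, new_python: str) -> bool:
    """Hypothetical helper: swap the interpreter in a plain '#!' first line.

    Returns True if the file was rewritten, False if the first line did not
    point at old_python.
    """
    data = script.read_bytes()
    first_line, newline, rest = data.partition(b"\n")
    if not first_line.startswith(b"#!"):
        return False
    if first_line[2:].strip() != os.fsencode(old_python):
        return False
    script.write_bytes(b"#!" + os.fsencode(new_python) + newline + rest)
    return True


def demo_rewrite() -> str:
    """Write a fake console script, patch it, and return its new first line."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "tool"
        script.write_text("#!/tmp/build-env/bin/python\nprint('hi')\n")
        rewrite_shebang(script, "/tmp/build-env/bin/python", "/usr/bin/python3")
        return script.read_text().splitlines()[0]


if __name__ == "__main__":
    print(demo_rewrite())
```

Patching after the fact also changes the file's content, which is why any hashes recorded at install time no longer match the rewritten script.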


Hope that explains things. If you know of another approach that allows us to run pip w/o user/system site-packages search paths, and without the downside of broken shebangs, I'd be interested.

@pfmoore
Member

pfmoore commented Oct 3, 2023

So would an option to change the shebang lines work for you? That’s much less difficult, and IIRC it’s been requested before, so would help in more than just this case.

@haampie
Author

haampie commented Oct 3, 2023

Yeah, that would work 👍.

Would be much better if pip did this, cause then

  1. pip's standard handling of long shebang lines kicks in
  2. content hashes of generated executables would be correct

Do you know if apart from shebangs there are other instances where pip copies over the sys.executable path into some file?

@pfmoore
Member

pfmoore commented Oct 3, 2023

Related: #11483, #6278, #1351 (comment)

> Do you know if apart from shebangs there are other instances where pip copies over the sys.executable path into some file?

Probably in scripts in the scripts folder in the wheel - see #10661
