-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Closed
Closed
Copy link
Labels
Description
In https://heroku.support/890604 an app using pip editable installs combined with multiple buildpacks started seeing build failures when using newer setuptools.
The output from a reduced testcase looks like:
remote: Traceback (most recent call last):
remote: File "/app/.heroku/python/bin/demo_say_hello", line 33, in <module>
remote: sys.exit(load_entry_point('demobuildissue', 'console_scripts', 'demo_say_hello')())
remote: File "/app/.heroku/python/bin/demo_say_hello", line 22, in importlib_load_entry_point
remote: for entry_point in distribution(dist_name).entry_points
remote: File "/app/.heroku/python/lib/python3.8/importlib/metadata.py", line 504, in distribution
remote: return Distribution.from_name(distribution_name)
remote: File "/app/.heroku/python/lib/python3.8/importlib/metadata.py", line 177, in from_name
remote: raise PackageNotFoundError(name)
remote: importlib.metadata.PackageNotFoundError: demobuildissue
Investigating I found that:
- between setuptools v47.1.1 and 47.2.0 the wrapper scripts created for editable-install packages (only) were refactored to use a different module for resolving the target package (in Decrease start-up time of editable-installed entry points on newer versions of Python pypa/setuptools#2194)
- this new module no longer looks at the
.egg-link
files insite-packages
, and only the standard Python import paths (incl.pth
files)
The reason this causes breakage in the multi-buildpack case is:
- the build system runs builds in a different directory (
/tmp/build_*
) from which the app will be run at runtime (/app
) - much of Python + its tooling expects it's location not to change after install (ie: it's not relocatable) and as such the Python buildpack has to use several path rewriting hacks to make this work
- one such hack is rewriting the
.pth
files added tosite-packages
for editable installs (installs where the package is effectively symlinked from its source directory rather than being packaged and copied into site-packages) so that the absolute paths inside reference/app
instead of the/tmp/...
paths ofBUILD_DIR
- this rewrite occurs right at the end of the Python compile step, since at that point it's presumed all Python commands have been run (eg: the
bin/post_compile
hook has already been called) - however if the app is using multiple buildpacks, this leaves the contents of
BUILD_DIR
in a state that only works at runtime, and not for subsequent buildpacks - which causes issues if it uses say the NodeJS buildpack's heroku-postbuild hook to run a script that calls a Python package installed in editable mode - the only reason this issue wasn't seen before, is that the Python buildpack doesn't rewrite the paths in the ez_install
.egg-link
files that are added tosite-packages
(meaning they still use theBUILD_DIR
paths) and old setuptools used to check the egg-link files in addition to the pth files when performing package resolution, but now it only checks the latter.
The path rewriting in question is here:
heroku-buildpack-python/bin/steps/eggpath-fix2
Lines 3 to 4 in bce5bf4
# rewrite build dir in egg links to /app so things are found at runtime | |
find .heroku/python/lib/python*/site-packages/ -name "*.pth" -print0 2> /dev/null | xargs -r -0 -n 1 sed -i -e "s#$(pwd)#/app#" &> /dev/null |
None of the solutions are great for this, since the buildpack API doesn't provide a way to "run this script after all other buildpacks, but before the end of the build" (and such a feature probably isn't a good idea anyway).
Ideas so far:
- Move Python’s .pth rewriting to be a runtime
.profile.d/
script (since it’s not until runtime that we can guarantee that no further buildpacks will run) - Skip the .pth rewriting entirely, and instead use a
.profile.d/
script to (at runtime) add a symlink back from/tmp/build*/
to/app
, such that any stray references to the build dir “just work” - Continue to use .pth rewriting, but add more bandaids (eg instead of replacing the
/tmp/build*/...
paths, duplicate them with the/app/...
variants, and hope that having the redundant paths doesn’t break anything else). (I checked whether pth supported relative paths but it doesn't appear to.) - Spend some time seeing how much actually breaks if we were to switch builds back to running in
/app
(though I already know of several buildpacks that will, so this would be a substantial project of feature flags, communication, outreach etc)
Ideally we'd choose (4), but that's not a quick change to make.