Skip to content

Editable installs don't work with setuptools 47.2.0+ and multi-buildpack #1006

@edmorley

Description

@edmorley

In https://heroku.support/890604 an app using pip editable installs combined with multiple buildpacks started seeing build failures when using newer setuptools.

The output from a reduced testcase looks like:

remote: Traceback (most recent call last):
remote:   File "/app/.heroku/python/bin/demo_say_hello", line 33, in <module>
remote:     sys.exit(load_entry_point('demobuildissue', 'console_scripts', 'demo_say_hello')())
remote:   File "/app/.heroku/python/bin/demo_say_hello", line 22, in importlib_load_entry_point
remote:     for entry_point in distribution(dist_name).entry_points
remote:   File "/app/.heroku/python/lib/python3.8/importlib/metadata.py", line 504, in distribution
remote:     return Distribution.from_name(distribution_name)
remote:   File "/app/.heroku/python/lib/python3.8/importlib/metadata.py", line 177, in from_name
remote:     raise PackageNotFoundError(name)
remote: importlib.metadata.PackageNotFoundError: demobuildissue

Investigating I found that:

The reason this causes breakage in the multi-buildpack case is:

  • the build system runs builds in a different directory (/tmp/build_*) from which the app will be run at runtime (/app)
  • much of Python + its tooling expects it's location not to change after install (ie: it's not relocatable) and as such the Python buildpack has to use several path rewriting hacks to make this work
  • one such hack is rewriting the .pth files added to site-packages for editable installs (installs where the package is effectively symlinked from its source directory rather than being packaged and copied into site-packages) so that the absolute paths inside reference /app instead of the /tmp/... paths of BUILD_DIR
  • this rewrite occurs right at the end of the Python compile step, since at that point it's presumed all Python commands have been run (eg: the bin/post_compile hook has already been called)
  • however if the app is using multiple buildpacks, this leaves the contents of BUILD_DIR in a state that only works at runtime, and not for subsequent buildpacks - which causes issues if it uses say the NodeJS buildpack's heroku-postbuild hook to run a script that calls a Python package installed in editable mode
  • the only reason this issue wasn't seen before, is that the Python buildpack doesn't rewrite the paths in the ez_install .egg-link files that are added to site-packages (meaning they still use the BUILD_DIR paths) and old setuptools used to check the egg-link files in addition to the pth files when performing package resolution, but now it only checks the latter.

The path rewriting in question is here:

# rewrite build dir in egg links to /app so things are found at runtime
find .heroku/python/lib/python*/site-packages/ -name "*.pth" -print0 2> /dev/null | xargs -r -0 -n 1 sed -i -e "s#$(pwd)#/app#" &> /dev/null

None of the solutions are great for this, since the buildpack API doesn't provide a way to "run this script after all other buildpacks, but before the end of the build" (and such a feature probably isn't a good idea anyway).

Ideas so far:

  1. Move Python’s .pth rewriting to be a runtime .profile.d/ script (since it’s not until runtime that we can guarantee that no further buildpacks will run)
  2. Skip the .pth rewriting entirely, and instead use a .profile.d/ script to (at runtime) add a symlink back from /tmp/build*/ to /app, such that any stray references to the build dir “just work”
  3. Continue to use .pth rewriting, but add more bandaids (eg instead of replacing the /tmp/build*/... paths, duplicate them with the /app/... variants, and hope that having the redundant paths doesn’t break anything else). (I checked whether pth supported relative paths but it doesn't appear to.)
  4. Spend some time seeing how much actually breaks if we were to switch builds back to running in /app (though I already know of several buildpacks that will, so this would be a substantial project of feature flags, communication, outreach etc)

Ideally we'd choose (4), but that's not a quick change to make.

GUS-W-7828034.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions