Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Idea for major (breaking) change: merge cache and launch directories #23

Closed
sclevine opened this issue Oct 29, 2018 · 16 comments
Closed
Assignees

Comments

@sclevine
Copy link
Member

sclevine commented Oct 29, 2018

Note: This proposal has been edited. Comments below it may be outdated. See the edit history for previous versions.

After reviewing the various initial buildpack implementations, I've observed these patterns, anti-patterns, and limitations:

  1. Layers with similar contents tend to be created in both the cache and launch directories.
  2. Buildpacks tend to build abstractions that persist these layers to either or both of the directories depending on whether they are needed by buildpack code or by app code.
  3. Buildpacks use conventions in the build plan to indicate whether a layer should be provided to other buildpacks via the cache or made available at launch-time (see Consider adding launch and build booleans as top-level keys in the build plan #22).
  4. When a layer is present in both cache and launch directories, buildpack code is often complex because of the different possible starting states of the cache, launch, and app directories (metadata vs. no-metadata, cached vs. not cached, vendored vs. not vendored).
  5. In some cases, ensuring that entire layers from the last build are restored may be desirable (see Persistent build metadata #8).
  6. Occasionally, buildpacks need to cache dependencies that shouldn't be accessible to other buildpacks. It seems reasonable to decouple these concepts.
  7. Cached layers often have the same metadata as corresponding launch layers.
  8. It is currently impossible for a /bin/build script to make a symlink from the app directory to a layer that remains unbroken during both build and launch. Separately, relative symlinks from layers to the app directory are easy to break when copying layer directories between cache and launch dirs.

Proposal

Instead of providing separate launch and cache directories to each buildpack, we could provide a single <layers> directory with the following extra, top-level fields in each <layer>.toml:

launch = false # layer is available at launch time (default: false)
build = false # layer is available to subsequent buildpacks (default: false)
cache = false # restore layer contents on the next build (default: false)
persist-cache = false # guarantee that layer contents are recovered (default: false, ignored if !cache)

[metadata] # all user-provided metadata now nested here

Rules:

  • Metadata written to <layer>.toml is always restored.
  • If launch && cache && !persist-cache, but the local layer contents do not match the remote layer contents, then the cached local layer is deleted before the build and the remote metadata is provided. This ensures that recovered local layers are never out of sync with remote layers.
  • If !launch && cache && persist-cache, the build fails unless/until we decide to support a persistent cache that isn't used in the remote image.
  • If a layer changes from launch to !launch, then the remote layer is deleted.
  • A platform may choose to cache layers locally when cache && persist-cache as long as the cached layers are only restored when they are identical to the remote launch layers.
  • To guarantee consistent behavior between builds, !build layers should always be moved such that they are inaccessible to subsequent buildpacks.

The combined <layers> directory would continue to look like this:

my.buildpack.id/my-layer/              # directory contents
my.buildpack.id/my-layer.toml          # metadata for my-layer
...

The new interface would be:
Executable: /bin/build <platform[AR]> <layers[EI]>, Working Dir: <app[AI]>

Input Description
/dev/stdin Build plan from detection (TOML)
<platform>/env/ User-provided environment variables for build
<platform>/# Platform-specific extensions
Output Description
[exit status] Success (0) or failure (1+)
/dev/stdout Logs (info)
/dev/stderr Logs (warnings, errors)
<layers>/launch.toml Launch metadata (see File: launch.toml)
<layers>/<layer>.toml Layer content metadata
<layers>/<layer>/bin/ Binaries for subsequent buildpacks and/or launch
<layers>/<layer>/lib/ Shared libraries for subsequent buildpacks and/or launch
<layers>/<layer>/profile.d/ Scripts sourced by bash before launch
<layers>/<layer>/include/ C/C++ headers for subsequent buildpacks
<layers>/<layer>/pkgconfig/ Search path for pkg-config for subsequent buildpacks
<layers>/<layer>/env/ Env vars for launch/build, set before env.build or env.run
<layers>/<layer>/env.build/ Env vars for set for subsequent buildpacks
<layers>/<layer>/env.run/ Env vars for set before profile.d scripts are sourced
<layers>/<layer>/* Other content for subsequent buildpacks and/or launch

READ THIS: New Behavior Introduced:

  • Clearing the cache when the local layer contents don't match the remote layer contents means that "stale" launch layers are never re-used. For example, this means that a Node.js build that jumps back and forth between two VMs would sometimes need to rebuild the node modules from scratch, even if they are cached on both VMs. This means that buildpacks need to perform objectively fewer metadata comparisons for the same effect (not just less copying). However, it is less efficient. I'm okay with this behavior change, because the previous behavior is easy to replicate with two layers if necessary (and it requires the exact same logic). In addition, this new behavior is safer and more similar to the current v2a/b globally-persistent buildpack cache behavior. Another way to understand this change is "locally-recovered launch layers must always match the previous build."
  • persist-cache requires downloading layers from the registry before the build, but provides a guaranteed, globally-persistent cache.
  • The <layers>/<layer>/env/ directory is split into <layers>/<layer>/env/build/ and <layers>/<layer>/env/run/ to make the behavior of the directory more clear and to provide a safer, more convenient way to set runtime environment variables.

Advantages:

  • Buildpack code would be less complex
  • Paths to the same dependency would always be the same
  • Symlinks from the app dir to the layer would remain valid for build and launch
  • Relative symlinks from the layers to the app dir would remain valid for build and launch
  • Local disk usage would be reduced by as much as 50%
  • Possibly safer due to guarantee that data intended for launch always matches the previous build

Disadvantages:

  • Slightly more TOML writing needed (when providing dependencies to subsequent buildpacks)
  • More complicated for the lifecycle to implement
  • Less efficient when a "cached-launch" layer is used instead of a separate cache and launch layers because "stale" launch layers are never re-used
  • Less efficient if buildpacks use persist-cache due to registry downloading

Possible Extensions:

  • We could allow !launch && cache && persist-cache in the future using a dedicated image repository (or tag in the same repository). This would be entirely backwards-compatible when introduced.

Thoughts? If desired, we should make this change before ratifying v3.0.0 of the specification.

@nebhale @jkutner @hone @ekcasey @dgodd @jchesterpivotal @ameyer-pivotal

@dgodd
Copy link

dgodd commented Oct 29, 2018

I am very excited about the simplification this allows for buildpack authors, and removing their code to copy between cache and launch layers.

I don't personally object to cache-from = "launch", but was surprised at your including it, since:

  1. You have previously appeared to think it had terrble performance characteristics.
  2. Even without cache-from = "launch", this spec appears to include all of the abilities of the current specification but with easier use for buildpack authors

My feeling is to try and get some benchmarks for performance of cache-from = "launch" and consider that tiny portion of this spec based on that.

Everything else looks great to me.

@sclevine
Copy link
Member Author

I don't personally object to cache-from = "launch", but was surprised at your including it, since:

  1. You have previously appeared to think it had terrble performance characteristics.
  2. Even without cache-from = "launch", this spec appears to include all of the abilities of the current specification but with easier use for buildpack authors

My concern is that wide use of this functionality would unnecessarily increase build time and data transfer. Currently, builds don't have any mandatory network requirements. If buildpack authors use this feature often, FS layer downloads from the registry would be required in many or most scenarios before a build can start. However, considering:

A platform may choose to cache layers locally when cache-from = "launch" as long as the cached layers are only restored when they are identical to the remote launch layers.

as well as requests for the functionality in certain cases (#8, as well as the need to preserve node_modules in the app directory for node.js buildpacks), I'm convinced we should keep it as an option.

@sclevine
Copy link
Member Author

Note: I removed the build plan changes due to feedback pointing out that it wouldn't work without adding a provider key to each build plan entry. See the edit history for the original version.

@djoyahoy
Copy link
Member

djoyahoy commented Oct 29, 2018

This would be a welcome change, especially when working on buildpacks for interpreted languages, such as nodejs, where there are multiple starting states and edge cases to handle: vendored vs unvendored node_modules, location of the package.json, how to handle the npm-cache, etc.

In practice, there is a lot of minutiae (and many code branches... and many potential bugs/pitfalls) in determining where and when to place dependencies, all while trying to gain an understanding of the V3 spec as a new buildpack author. Any complexity we can push into the platform while maintaining functionality (our cache and launch layers almost always end up the same) is useful.

@sclevine
Copy link
Member Author

New proposal for the build plan: #24
(This is a workable alternative to the build plan changes that were proposed here.)

@dgodd
Copy link

dgodd commented Oct 30, 2018

As @djoyahoy mentioned this simplifies things immensely for buildpacks for interpreted languages. I recently created a ruby buildpack to be used for a blog post about how to create buildpacks. In doing so I created the same buildpack multiple times from scratch to help myself explain the buildpack to my co-author and even with that we had bugs in our code related to copying things (eg. ruby) between the cache dir and the launch dir.

Combining the directories would remove about half (of my admittedly simplistic) buildpack. (https://github.com/dgodd/ruby-cnb-buildpack)

I also found it very hard to explain to my co-author, or in the blog post why we had to copy directories around so much, and why that was better than not doing so.

One of my fears is that with the current copy around approach, almost no buildpacks will correctly implement the ability for later buildpacks to use their interpreter during staging, either due to incorrect copying (or not) of directories, or environment variables that point to launch directories in both staging and launch (and ofcourse the launch directory will often not be there during staging)

@sclevine
Copy link
Member Author

Updated to reflect 10/30/2018 WG discussion. Note the "Behavior Changes" section for the complexity/performance trade-offs.

Please leave feedback, or 👍 if you would like this change to be integrated into the spec and lifecycle.

@sclevine sclevine self-assigned this Oct 31, 2018
@nebhale
Copy link
Contributor

nebhale commented Oct 31, 2018

If we're going ahead with the combination of layers, then we should move the export fields into the <layer>/<layer>.toml file. No need for a proliferation of files generally.

@sclevine
Copy link
Member Author

@nebhale updated

@nebhale
Copy link
Contributor

nebhale commented Oct 31, 2018

We currently have two ways to specify environment variables depending on whether the layer is used for build or launch. Given that many times the same environment variables need to be set in both cases ($JAVA_HOME, $JAVA_OPTS) should these be treated as two different groups of environment variables?

If they should be different, should there be two different ways of specifying those variables, or should the .profile.d style (a super-set of the /env/<KEY> style) continue to exist?

@sclevine
Copy link
Member Author

sclevine commented Nov 1, 2018

should these be treated as two different groups of environment variables?

Given that this proposal allows a layer to be used for both build and launch, I think that layer will need the ability to specify separate build-time and run-time environment variables. I think it's worth exploring whether we should have a shortcut for setting them at the same time.

If they should be different, should there be two different ways of specifying those variables, or should the .profile.d style (a super-set of the /env/ style) continue to exist?

profile.d is useful for dynamic modification of the environment at runtime. For example, some APM integrations use it to manipulate service binding credentials in CF. It's not useful at build-time, because the build environment doesn't change between buildpacks.

I went ahead and split <layers>/<layer>/env/ into <layers>/<layer>/env/build/ and <layers>/<layer>/env/run/. This should make the behavior of the env directory more clear (probably important given that layers can be dual-purpose). It should also provide an alternative, safer way to set runtime environment variables. Additionally, this will let us safely introduce <layers>/<layer>/env/both/ if we decide it's useful.

@sclevine
Copy link
Member Author

sclevine commented Nov 9, 2018

Update: After considering the different combinations of options and applying to them to various buildpack use cases, I think we can reduce the four options to just three:

  1. launch = true - for launch
  2. build = true - for subsequent buildpacks
  3. cache = true - recover layer from last build

Where:

  • If launch && cache, the layer is always restored (either from the VM or remote layer)
  • If build && cache, the layer is the last copy available locally
  • If launch && !cache, only metadata is restored
  • If build && !cache, the layer is not kept

This should further reduce buildpack complexity.

@jkutner
Copy link
Member

jkutner commented Nov 9, 2018

@sclevine that's a nice improvement. Are those the defaults you're proposing? Do we want to default cache to true though? Certainly the current behavior of the spec (and classic buildpacks) is not to do this. If something needs to be explicitly cached, it not much effort to add cached = true to the metadata.

@jkutner
Copy link
Member

jkutner commented Nov 9, 2018

@sclevine What is the use case for build = false? Does build = false affect any of those examples you provided that don't reference use build?

@sclevine
Copy link
Member Author

sclevine commented Nov 9, 2018

Sorry, didn't mean to suggest defaults yet. Defaults could be all false (layer is just a temporary storage location for the current buildpack), or just launch = true.

  • build = false would mean that the layer is not visible to subsequent buildpacks
  • launch = false means that layer is not included in the final image

@jkutner
Copy link
Member

jkutner commented Nov 10, 2018

@sclevine i understand the mechanics of build = false. but i'm wondering when that would be useful. i suspect it's less about visibility and more about the env vars or bins on the PATH.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants