Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AutoPull Images in cacheFrom Should not be default behavior #6925

Open
adawalli opened this issue Dec 1, 2021 · 4 comments
Open

AutoPull Images in cacheFrom Should not be default behavior #6925

adawalli opened this issue Dec 1, 2021 · 4 comments

Comments

@adawalli
Copy link

adawalli commented Dec 1, 2021

Expected behavior

PR #1495 introduced pre-pulling behavior as the default when the cacheFrom keyword is used (with no option to disable this, I believe).

In my humble opinion, this is a mistake and honestly can introduce security risks.

Let me explain. If the first or second command of a Dockerfile is something like apt-get update && apt-get dist-upgrade -y, the user might think that they are actually getting the latest security updates periodically. Now, let's say the dockerfile only changes towards the end (changes default shell for example). In skaffold's current behavior, if I want to use cacheFrom to speed up builds, it downloads the latest image, which from a Dockerfile perspective only shows a change on the last line. it's possible that another security update will never be run again until the cache images are invalidated!

The way we previously handled this with our docker builds in gitlab runners was something like this

docker pull <image>:latest
docker image prune -a --force --filter "until=12h"
docker build --cache-from <image>:latest .....

In this way, we would invalidate the cache anytime it had cooled off for more than 12 hours. With the current implementation of skaffold, this isn't really possible.

The following file will always pull the cache even with tryImportMissing: false (which is likely an unrelated command, but confusing all the same)

apiVersion: skaffold/v2beta24
kind: Config
build:
  artifacts:
    - image: myimage
      context: .
      docker:
        cacheFrom:
          - "myimage:latest"
  local:
    useBuildkit: true
    push: false
    concurrency: 0
    tryImportMissing: false

A couple of possible solutions from (my) preferred to least preferred

  • Skaffold remove this "always pull" cacheFrom behavior as the default policy with an option to enable it
  • Skaffold leave the default behavior as is, but allow an option to disable it on a per image basis
  • evaluate tryImportMissing ...and don't import cache files if they are...missing
  • Use Patching to remove cacheFrom stanzas conditionally based on some variable - this is painful and devs are likely going to forget to set variables for this
  • Add an environment variable near the top of the Dockerfiles with the date, that changes every 24 hours, and thus invalidates the cache - this is error prone since devs will forget to put this in their files. Also, we will now waste time pulling a cache image that we might know is invalid.
  • Create a secondary service that intentionally invalidates my cache every night (e.g., docker pull scratch && docker tag scratch <myimage>:latest && docker push <myimage>:latest) - This is getting pretty gross in my opinion
  • Stop using caching completely - Unfortunately, we can't afford to do this...our images are time-consuming to build

Information

  • Skaffold version: v1.35.0
  • Operating system: Ubuntu, 18.04.6 LTS
  • Installed via: skaffold.dev
@adawalli
Copy link
Author

adawalli commented Dec 1, 2021

@dgageot FYI

@adawalli adawalli changed the title AutoPull Images in cacheFrom Should not be default Policy AutoPull Images in cacheFrom Should not be default behavior Dec 1, 2021
@adawalli
Copy link
Author

adawalli commented Dec 2, 2021

@briandealwis I believe this one should be considered a fairly high priority as it potentially introduces poor security postures into folk's images - and they will likely have no idea.

@tejal29
Copy link
Member

tejal29 commented Dec 15, 2021

@adawalli Thanks for the issue and the explaining the security concerns here. I like your top 2 of your preferred options i.e. enable or disable it via config for two reasons

  1. In dev loop, developers do want the subsequent builds to be faster.
  2. For CI/CD purposes, skaffold should always build fresh or users should have an option invalidate cache after X mins.

We will look into this as part of efforts to enable users to securely build software with best security practices.

@tejal29 tejal29 added priority/p1 High impact feature/bug. and removed priority/p2 May take a couple of releases labels Dec 15, 2021
@cyraid
Copy link

cyraid commented May 21, 2024

How's this one coming along? Did something change since? Personally I like the second one as I'm coming up with instances where I'd like to invalidate the cache of only one artifact.

Being able to name an artifact, and reference it like:

artifacts:
- name: api-server
  image: some.registry/url/api-server
- name: web-server
  image: some.registry/url/web-server
skaffold invalidate -p release api-server

I believe you can currently do:

skaffold build -b some.registry/url/api-server -p release --cache-artifacts=false

But who can remember that? ^_^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants