Regression: Kaniko 1.7 unstable authentication against GCP Artifact Registry #1893

Open
deedubs opened this issue Jan 25, 2022 · 17 comments
Labels
area/registry (For all bugs having to do with pushing/pulling into registries); priority/p0 (Highest priority. Break user flow. We are actively looking at delivering it.); regression

@deedubs

deedubs commented Jan 25, 2022

Actual behavior
While building several containers against GCP Artifact Registry via skaffold we are getting intermittent authentication failures.

INFO[0000] Retrieving image gcr.io/kaniko-project/executor:v1.5.1@sha256:c6166717f7fe0b7da44908c986137ecfeab21f31ec3992f6e128fff8a94be8a5 from registry gcr.io 
E0124 14:27:12.856809       1 metadata.go:166] while reading 'google-dockercfg-url' metadata: http status code: 404 while fetching url http://metadata.google.internal./computeMetadata/v1/instance/attributes/google-dockercfg-url
INFO[0000] Built cross stage deps: map[]                
INFO[0000] Retrieving image manifest gcr.io/kaniko-project/executor:v1.5.1@sha256:c6166717f7fe0b7da44908c986137ecfeab21f31ec3992f6e128fff8a94be8a5 
INFO[0000] Returning cached image manifest              
INFO[0000] Executing 0 build triggers                   
INFO[0000] Skipping unpacking as no commands require it. 
INFO[0000] Taking snapshot of full filesystem...        
INFO[0000] Pushing image to us-east4-docker.pkg.dev/******/platform/containers/tools/kaniko:abaee2d 
INFO[0001] Pushed image to 1 destinations               
Building [bases/alpine]...
E0124 14:27:20.443958       1 aws_credentials.go:77] while getting AWS credentials NoCredentialProviders: no valid providers in chain. Deprecated.
	For verbose messaging see aws.Config.CredentialsChainVerboseErrors
error checking push permissions -- make sure you entered the correct tag name, and that you are authenticated correctly, and try again: checking push permission for "us-east4-docker.pkg.dev/******/platform/containers/bases/alpine:abaee2d": creating push check transport for us-east4-docker.pkg.dev failed: GET https://us-east4-docker.pkg.dev/v2/token?scope=repository%3A******%2Fplatform%2Fcontainers%2Fbases%2Falpine%3Apush%2Cpull&service=us-east4-docker.pkg.dev: UNAUTHORIZED: authentication failed

Prior to invoking skaffold we issue:

docker-credential-gcr configure-docker --registries=us-east4-docker.pkg.dev

Expected behavior
We expect pushes to continue to work throughout the whole build.

Additional Information

  • Google Internal Case 29377744
  • Kaniko Image (fully qualified with digest) gcr.io/kaniko-project/executor:v1.7.0-debug@sha256:88dacc7ea3f5c04709eae96776693c717869405364b19d6e78850fe54c63c6a2
@imjasonh
Collaborator

There have been some bugs with v1.7.0 related to auth, specifically against GCR, that caused us to roll back :latest to point at v1.6.0.

I believe these issues are fixed at head. Until v1.8.0 is out (#1871), could you try your build with the latest commit-tagged image, built from a7425d1, and let me know if that works for you?

gcr.io/kaniko-project/executor:a7425d1fd0442b58dc24698285102176365a28d9@sha256:939b0a1a0aaad97a06db665291ac2270a9abe538af4198000046f743d1e61cd4

If it does, then you should get the fix when v1.8.0 is released (and until then you can use the commit-tagged image).

If not, please let me know so we can find and fix the issue.

@deedubs
Author

deedubs commented Jan 25, 2022

Confirmed that our pipelines can build against Artifact Registry using a7425d1.

We'll continue to use the commit-tagged image. Thanks so much for the quick response!

Off topic: while bisecting my way from 1.6 to 1.7, I noticed the GCR helpers. Is it even necessary to call docker-credential-gcr manually as a pre-step?

@imjasonh
Collaborator

I don't think it should be necessary* -- in v1.6.0 and v1.7.0 it was initialized in the Dockerfile (setting up /kaniko/.config/gcloud/docker_credential_gcr_config.json which the helper uses), and at head the cred helpers aren't technically needed in the image since the same logic is embedded in kaniko itself -- it looks for creds available in the environment and will use those even if the cred helpers aren't available or initialized.

So in all cases it should be okay to omit any cred helper initialization pre-step, as far as I know.

*if you test and find out that it is necessary, please let me know!

@deedubs
Author

deedubs commented Jan 25, 2022

When I dropped both the call to docker-credential-gcr and its installation, I get:

time="2022-01-25T20:57:16Z" level=error msg="No matching credentials were found for \"us-east4-docker.pkg.dev\""
time="2022-01-25T20:57:16Z" level=error msg="No matching credentials were found for \"us-east4-docker.pkg.dev\""
time="2022-01-25T20:57:16Z" level=error msg="No matching credentials were found for \"us-east4-docker.pkg.dev\""
time="2022-01-25T20:57:16Z" level=fatal msg="deleting pod: context canceled" subtask=tools/skaffold task=Buil

Note that this is being invoked via Tekton:

steps:
    - name: skaffold-build
      image: gcr.io/k8s-skaffold/skaffold:v1.35.1@sha256:edd5fefb172bb60396fed6b83868cfec38be8083e81b3c1aa8d3ec5cac66c09f
      workingDir: $(workspaces.source.path)
      script: |
        skaffold build \
          --default-repo=us-east4-docker.pkg.dev/$(params.DEFAULT_REPO) \
          --output="{{range \$index, \$artifact := .Builds}}{{if \$index}},{{end}}{{\$artifact.Tag}}{{end}}" \
          --file-output=/tekton/results/IMAGES

@imjasonh
Collaborator

Well that's a little surprising to me. 🤔

It works with a step to initialize the cred helper? What's that look like?

And this is with the kaniko executor @main? Or v1.6.0 or 1.7.0?

@deedubs
Author

deedubs commented Jan 26, 2022

When we run

docker-credential-gcr configure-docker --registries=us-east4-docker.pkg.dev

followed by the skaffold build invocation, it works.

This is with the commit-tagged version.

@nmousouros

I thought Docker Hub had the issue, but apparently I had authentication issues with the :latest tag. I didn't realize that you had rolled back to 1.6. Now, with the new release of 1.8, we still get the authentication error.

@imjasonh
Collaborator

> I thought Docker Hub had the issue, but apparently I had authentication issues with the :latest tag. I didn't realize that you had rolled back to 1.6. Now, with the new release of 1.8, we still get the authentication error.

The original issue seemed to be reporting issues authenticating with GCR/AR, not Dockerhub. Are you saying you also have issues with Dockerhub now?

In any case, especially where auth is involved, it's useful to tell whether you can successfully authorize a push to your registry using docker push or another similar tool. If that works and Kaniko doesn't, it's a bug in Kaniko.
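One way to run that comparison without Docker is to replay the token request that appears in the error logs above. The following is a minimal sketch, not Kaniko's own logic: the function names are hypothetical, and it assumes the registry accepts an OAuth2 access token as the password for the user name `oauth2accesstoken`.

```python
import base64
import urllib.error
import urllib.request
from urllib.parse import urlencode


def build_token_request(registry, repository, access_token):
    """Return (url, headers) for a push/pull token request like the one in the logs."""
    query = urlencode({
        "scope": f"repository:{repository}:push,pull",
        "service": registry,
    })
    url = f"https://{registry}/v2/token?{query}"
    # Assumption: an OAuth2 access token is passed as the Basic-auth password
    # for the special user name "oauth2accesstoken".
    raw = f"oauth2accesstoken:{access_token}".encode()
    headers = {"Authorization": "Basic " + base64.b64encode(raw).decode()}
    return url, headers


def check_push_auth(registry, repository, access_token, timeout=10.0):
    """Send the request and return the HTTP status code (401 means auth failed)."""
    url, headers = build_token_request(registry, repository, access_token)
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

If `check_push_auth` returns 200 with the same credentials that fail inside Kaniko, that points at Kaniko's credential lookup rather than at the registry or the service account.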

@nmousouros

Yes, we do use Docker Hub; sorry for not saying this clearly.

I cannot say with confidence that this is a bug in Kaniko. It happened randomly around the time 1.7 was released and then fixed itself, which I now think was due to 1.6 being tagged as latest again. At the time we thought it was something with Docker Hub. It happened again when 1.8 was released yesterday: most of our pushes are failing, but not all, and yes, we can push with docker push. I know I am being vague. If there is more information I could give to help, please let me know.

@BarthV

BarthV commented Mar 10, 2022

I've hit the same issue here.

  • My build jobs run in GKE pods (so they have a generic, unauthorized GCS service account exposed by GKE).
  • I mount a JSON secret as a file inside the container.
  • I set the env var GOOGLE_APPLICATION_CREDENTIALS=/gcp-builder-sa/token.json.
  • I run echo "{\"credHelpers\":{\"europe-west1-docker.pkg.dev\":\"gcr\"}}" > /kaniko/.docker/config.json
  • I'm also using the cache feature (on the same gcr repo) and the mirror feature over the gcr mirror.

Using executor:v1.8.0-debug:

error checking push permissions --
make sure you entered the correct tag name, and that you are authenticated correctly, and try again: checking push permission for
   "europe-west1-docker.pkg.dev/foo/bar/mayapp:f64ca23c": creating push check transport for europe-west1-docker.pkg.dev failed:
   GET https://europe-west1-docker.pkg.dev/v2/token?scope=repository%3Afoo%2Fbar%2Fmyapp%3Apush%2Cpull&service=europe-west1-docker.pkg.dev:
   UNAUTHORIZED: authentication failed

Using executor:v1.7.0-debug:

WARN[0000] Skip running docker-credential-gcr as user provided docker configuration exists at /kaniko/.docker/config.json
E0310 17:12:37.509233      18 aws_credentials.go:77] while getting AWS credentials NoCredentialProviders: no valid providers in chain. Deprecated.
	For verbose messaging see aws.Config.CredentialsChainVerboseErrors

error checking push permissions -- make sure you entered the correct tag name, and that you are authenticated correctly, and try again: checking push permission for
    "europe-west1-docker.pkg.dev/foo/bar/myapp:39038142": creating push check transport for europe-west1-docker.pkg.dev failed:
    GET https://europe-west1-docker.pkg.dev/v2/token?scope=repository%3Afoo%2Fbar%2Fmyapp%3Apush%2Cpull&service=europe-west1-docker.pkg.dev:
    UNAUTHORIZED: authentication failed

Using executor:v1.6.0-debug:

WARN[0000] Skip running docker-credential-gcr as user provided docker configuration exists at /kaniko/.docker/config.json
E0310 17:14:13.558756      18 aws_credentials.go:77] while getting AWS credentials NoCredentialProviders: no valid providers in chain. Deprecated.
	For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO[0000] Using dockerignore file: /builds/foo/bar/myapp/.dockerignore 
[...]
INFO[0013] Pushing image to europe-west1-docker.pkg.dev/foo/bar/myapp:1db4ec45 
INFO[0014] Pushing image to europe-west1-docker.pkg.dev/foo/bar/myapp:latest 
INFO[0014] Pushed image to 2 destinations

Is this somehow related?
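For anyone reproducing this setup, the Docker config written by the echo command in the list above can also be generated programmatically. This is a minimal sketch; `add_cred_helper` is a hypothetical helper name, and it assumes the standard `credHelpers` map in Docker's config.json, where the value `gcr` selects the `docker-credential-gcr` binary.

```python
import json
from pathlib import Path


def add_cred_helper(config_path, registry, helper="gcr"):
    """Register a credential helper for one registry in a Docker config.json.

    Existing keys in the file (for example "auths") are preserved.
    """
    path = Path(config_path)
    text = path.read_text().strip() if path.exists() else ""
    config = json.loads(text) if text else {}
    # "credHelpers" maps a registry host to the suffix of a
    # docker-credential-<suffix> binary that must be on PATH.
    config.setdefault("credHelpers", {})[registry] = helper
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2))
    return config
```

Calling `add_cred_helper("/kaniko/.docker/config.json", "europe-west1-docker.pkg.dev")` would produce the same `credHelpers` entry as the echo command, without overwriting other entries already in the file.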

@imjasonh
Collaborator

@BarthV could you do me a favor and try this build without your ~/.docker/config.json file mounted in? Since v1.7.0, the cred helper logic has been compiled into the Kaniko binary. It should pick up your token.json creds when pushing to *-docker.pkg.dev, but those creds are only checked after the Docker config JSON.

If removing that file causes your push to work again, that would be a great signal that the cred helper fallback is working, and it would give us an option for others facing similar auth issues.
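The lookup order described here can be illustrated with a small sketch. This is not Kaniko's actual code: `pick_credential_source` and its return labels are hypothetical, and the logic only models the two steps named in this comment (Docker config JSON first, then the compiled-in Google fallback for `*-docker.pkg.dev`).

```python
def pick_credential_source(registry, docker_config, env):
    """Return a label describing where credentials for `registry` would come from.

    docker_config: parsed contents of config.json ({} if the file is absent).
    env: mapping of environment variables.
    """
    # Step 1: the Docker config JSON is consulted first.
    helpers = docker_config.get("credHelpers", {})
    if registry in helpers:
        return "docker-config credHelper (docker-credential-" + helpers[registry] + ")"
    if registry in docker_config.get("auths", {}):
        return "docker-config auths entry"

    # Step 2: only if the config has nothing for this registry, fall back to
    # Google credentials found in the environment (illustrative assumption).
    key_file = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if registry.endswith("-docker.pkg.dev") and key_file:
        return "application credentials at " + key_file

    return "anonymous"
```

Under this model, a config.json that names a helper for the target registry takes precedence over the environment credentials, so a broken or missing helper would fail the push even though a valid token.json is mounted.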

@BarthV

BarthV commented Mar 11, 2022

@imjasonh

GOOGLE_APPLICATION_CREDENTIALS env var set to the token.json file path
No ~/.docker/config.json at all

| Version | Working? |
|---------|----------|
| v1.6.0  | ✔️ |
| v1.7.0  | ❌ |
| v1.8.0  | ✔️ |

GOOGLE_APPLICATION_CREDENTIALS env var set to the token.json file path
~/.docker/config.json loaded with unused third-party external credentials (non-gcr, non-credHelper)

| Version | Working? |
|---------|----------|
| v1.6.0  | ❌ |
| v1.7.0  | ❌ |
| v1.8.0  | ✔️ |

GOOGLE_APPLICATION_CREDENTIALS env var set to the token.json file path
~/.docker/config.json loaded with gcr credHelpers for the target registry

| Version | Working? |
|---------|----------|
| v1.6.0  | ✔️ |
| v1.7.0  | ❌ |
| v1.8.0  | ❌ |

So far, removing the config.json file looks good on v1.8.0. It even works when using a file with unused credentials 👍

@ionosphere80

I can't get authentication to work with GAR using version 1.8 and any of the methods in the previous post.

@beehivewarrior

I also cannot get authentication to work with GAR using any of the above.

@deedubs
Author

deedubs commented Feb 6, 2023

@beehivewarrior are you using a GKE cluster? You need to ensure your cluster has the required OAuth scope.

@beehivewarrior

@deedubs Ah, that makes sense. Thanks!
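To check which OAuth scopes a GKE node's default service account actually has, the scopes can be read from the metadata server from inside a pod. This is a minimal sketch: the helper names are hypothetical, and because the thread does not say which scope is required, the scope to look for is left as a caller-supplied parameter.

```python
import urllib.request

# Metadata server endpoint listing the scopes of the default service account.
SCOPES_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/scopes"
)


def parse_scopes(body):
    """Parse the newline-separated scope list returned by the metadata server."""
    return {line.strip() for line in body.splitlines() if line.strip()}


def fetch_scopes(timeout=2.0):
    """Fetch the scopes; this only works on GCE/GKE, where the server is reachable."""
    request = urllib.request.Request(SCOPES_URL, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_scopes(response.read().decode())


def has_scope(scopes, required):
    """Return True if `required` (a full scope URI chosen by the caller) is present."""
    return required in scopes
```

For example, `has_scope(fetch_scopes(), "https://www.googleapis.com/auth/cloud-platform")` tests for the broad cloud-platform scope; that particular scope is used here only as a placeholder, not as the one this thread refers to.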

aaron-prindle added the priority/p0, regression, and area/registry labels May 30, 2023
@jessequinn

1.12.1 is doing the same now. I had no issues with a GCP GSA and a private Artifact Registry yesterday. This is the first time I am seeing these errors.

aaron-prindle added this to the v1.20.0 milestone Nov 29, 2023
aaron-prindle self-assigned this Nov 29, 2023
aaron-prindle modified the milestones: v1.20.0, v1.21.0 Jan 17, 2024
aaron-prindle modified the milestones: v1.21.0, v1.22.0 Feb 29, 2024