ErrImageNeverPull with trivy.command = filesystem or rootfs #1978

Open
chary1112004 opened this issue Apr 4, 2024 · 10 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/backlog Higher priority than priority/awaiting-more-evidence. target/kubernetes Issues relating to kubernetes cluster scanning

Comments

@chary1112004

chary1112004 commented Apr 4, 2024

What steps did you take and what happened:

Hi,

We see an issue when we configure trivy.command = filesystem or trivy.command = rootfs: the scan job sometimes ends up with status ErrImageNeverPull.

Here is the log of the scan job:

kubectl logs scan-vulnerabilityreport-755cd9546-k7wz6 -n trivy-system
Defaulted container "k8s-cluster" out of: k8s-cluster, 9797c3dc-a05b-4d8c-9e03-537c5348af40 (init), 4c278c3b-6eb8-449d-be86-2111c6f58d38 (init)
Error from server (BadRequest): container "k8s-cluster" in pod "scan-vulnerabilityreport-755cd9546-k7wz6" is waiting to start: ErrImageNeverPull

And this is the message we get when we describe the scan pod:

...
  containerStatuses:
  - image: k8s.io/kubernetes:1.25.16-eks-508b6b3
    imageID: ""
    lastState: {}
    name: k8s-cluster
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        message: Container image "k8s.io/kubernetes:1.25.16-eks-508b6b3" is not present
          with pull policy of Never
        reason: ErrImageNeverPull
...
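
For anyone triaging the same symptom, the waiting reason can be checked programmatically. This is a minimal sketch, assuming the pod was dumped with `kubectl get pod <name> -n trivy-system -o json` and parsed into a dict; `never_pull_failures` is a hypothetical helper, not part of trivy-operator or kubectl.

```python
from typing import Any


def never_pull_failures(pod: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (container name, image) pairs stuck in ErrImageNeverPull."""
    status = pod.get("status") or {}
    # Init containers and regular containers report their state separately.
    statuses = (status.get("initContainerStatuses") or []) + (
        status.get("containerStatuses") or []
    )
    failures = []
    for cs in statuses:
        waiting = (cs.get("state") or {}).get("waiting") or {}
        if waiting.get("reason") == "ErrImageNeverPull":
            failures.append((cs["name"], cs["image"]))
    return failures


# Status trimmed from the describe output above.
pod = {
    "status": {
        "containerStatuses": [
            {
                "image": "k8s.io/kubernetes:1.25.16-eks-508b6b3",
                "name": "k8s-cluster",
                "state": {"waiting": {"reason": "ErrImageNeverPull"}},
            }
        ]
    }
}
print(never_pull_failures(pod))
```

With the status shown above, the helper reports the `k8s-cluster` container and its `k8s.io/kubernetes` image reference, which is the one the kubelet refuses to pull.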

Any suggestion to resolve this issue would be very much appreciated!

Thanks!

Environment:

  • Trivy-Operator version (use trivy-operator version): 0.18.3
  • Kubernetes version (use kubectl version): 1.25
  • OS (macOS 10.15, Windows 10, Ubuntu 19.10 etc):
@chary1112004 chary1112004 added the kind/bug Categorizes issue or PR as related to a bug. label Apr 4, 2024
@chen-keinan chen-keinan added priority/backlog Higher priority than priority/awaiting-more-evidence. target/kubernetes Issues relating to kubernetes cluster scanning labels Apr 4, 2024
@chen-keinan
Contributor

@chary1112004 thanks for reporting this issue. I have never experienced it, so I'll have to investigate and update you.

@chen-keinan
Contributor

@chary1112004 I tried to investigate this, but no luck: I'm unable to reproduce it.

@chary1112004
Author

@chen-keinan did you try to reproduce this with Trivy deployed to EKS?

@chen-keinan
Contributor

@chary1112004 nope, but I do not think it's related to the cloud provider setting; it looks more like a cluster configuration issue.

@chary1112004
Author

@chen-keinan sorry, I just meant Kubernetes

@rknightion

I also get this on EKS when using Bottlerocket nodes (no idea if normal AL23 nodes also have it).

@rickymulder

rickymulder commented May 20, 2024

Also happens in a disconnected openshift environment.
What I specifically see is that the hash matches the tag of the kubelet version, so it's trying to pull a matching image. I just don't understand where it's getting the idea to pull from k8s.io; that's nowhere in my config.

I also have .Values.operator.infraAssessmentScannerEnabled: false, so I don't suspect it's the nodeCollector. Any other ideas?

@titansmc
Contributor

I am also seeing this, let me know if I can provide any configuration details:

helm upgrade --install trivy-operator aqua/trivy-operator \
  --namespace trivy-system \
  --create-namespace \
  -f values.yaml \
  --version 0.21.4

values.yaml:

nodeCollector:
  useNodeSelector: false
#  excludeNodes: node-role.kubernetes.io/control-plane=true
trivy:
  ignoreUnfixed: true
  command: filesystem
operator:
  controllerCacheSyncTimeout: 25m
trivyOperator:
  scanJobPodTemplateContainerSecurityContext:
    runAsUser: 0

@ondrejmo

I have the same issue. The cluster is running v1.29.5+k3s1 on Ubuntu 22.04 and Trivy-operator is deployed using:

---

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: trivy-system

resources:
  - trivy-operator.yml
  - https://raw.githubusercontent.com/aquasecurity/trivy-operator/v0.21.1/deploy/static/trivy-operator.yaml

patches:
  - patch: |-
      - op: replace
        path: /data/OPERATOR_METRICS_EXPOSED_SECRET_INFO_ENABLED
        value: "false"
      - op: replace
        path: /data/OPERATOR_METRICS_CONFIG_AUDIT_INFO_ENABLED
        value: "false"
      - op: replace
        path: /data/OPERATOR_METRICS_RBAC_ASSESSMENT_INFO_ENABLED
        value: "false"
      - op: replace
        path: /data/OPERATOR_METRICS_INFRA_ASSESSMENT_INFO_ENABLED
        value: "false"
      - op: replace
        path: /data/OPERATOR_METRICS_IMAGE_INFO_ENABLED
        value: "false"
      - op: replace
        path: /data/OPERATOR_METRICS_CLUSTER_COMPLIANCE_INFO_ENABLED
        value: "false"
      - op: replace
        path: /data/OPERATOR_CONCURRENT_SCAN_JOBS_LIMIT
        value: "3"
    target:
      kind: ConfigMap
      name: trivy-operator-config
  - patch: |-
      - op: replace
        path: /data/trivy.command
        value: "rootfs"
    target:
      kind: ConfigMap
      name: trivy-operator-trivy-config
  - patch: |-
      - op: replace
        path: /data/scanJob.podTemplateContainerSecurityContext
        value: "{\"allowPrivilegeEscalation\":false,\"capabilities\":{\"drop\":[\"ALL\"]},\"privileged\":false,\"readOnlyRootFilesystem\":true,\"runAsUser\":0}"
    target:
      kind: ConfigMap
      name: trivy-operator
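
The escaped JSON string in the last patch is easy to get wrong by hand. As a rough sketch (plain Python standard library, nothing trivy-operator specific; the variable names are illustrative), the same value can be generated by encoding the security context twice: once into compact JSON, and once more to get a double-quoted scalar with escaped quotes.

```python
import json

# Security context taken from the patch above.
security_context = {
    "allowPrivilegeEscalation": False,
    "capabilities": {"drop": ["ALL"]},
    "privileged": False,
    "readOnlyRootFilesystem": True,
    "runAsUser": 0,
}

# Compact JSON, which is what ends up stored in the ConfigMap value.
compact = json.dumps(security_context, separators=(",", ":"))

# Encoding the string once more yields a double-quoted, escaped scalar
# that can be pasted after "value:" in the patch.
yaml_value = json.dumps(compact)
print(yaml_value)
```

Decoding the printed value twice with `json.loads` should give back the original dictionary, which is a quick way to verify a hand-edited string.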

@ondrejmo

I did a hard-restart of the cluster (rebooted all nodes, deleted & re-created all pods) and it seems to have fixed the issue for me.

6 participants