Artifact input file permission problems #9651

Closed
2 of 3 tasks
chr-b opened this issue Sep 21, 2022 · 20 comments · Fixed by #10664
Labels
  • area/executor
  • P2: Important. All bugs with >=3 thumbs up that aren’t P0 or P1, plus: Any other bugs deemed important
  • solution/suggested: A solution to the bug has been suggested. Someone needs to implement it.
  • type/bug
  • type/regression: Regression from previous behavior (a specific type of bug)

Comments

@chr-b

chr-b commented Sep 21, 2022

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when I tested with :latest
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

The high level problem is as follows:

  • Workflow step A creates an output artifact
  • Workflow step B consumes the output artifact
  • When the input artifact (in B) is an entire directory instead of a single file, the input artifact directory ends up with the wrong permissions. This prevents reading the input artifact unless the step runs as the root user.

The problem can be reproduced with the two workflows below.
Note: I added securityContext only to make the workflows below reproducible. My original workflows have no securityContext, but the container images define a non-root user in the Dockerfile.

Additional context:
Deployed Argo Workflows from the official Helm Chart version 0.19.0, which uses version 3.4.0 by default. The problem remains when setting images.tag to latest in the Helm values.yaml file.
The artifact repository is GCP GCS. The workflows are executed with a service account that has the required permissions to access GCS.

Version

v3.4.0

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

The following workflow **works** as expected:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-passing-ok-
spec:
  entrypoint: overall
  templates:
  - name: overall
    dag:
      tasks:
        - name: step-a
          template: step-a
        - name: step-b
          template: step-b
          depends: "step-a"
          arguments:
            artifacts:
            - name: result
              from: "{{tasks.step-a.outputs.artifacts.result}}"
  - name: step-a
    outputs:
      artifacts:
        - name: result
          path: /tmp/results/a.txt
    script:
      image: debian:bullseye-slim
      command: [bash]
      source: |
        mkdir /tmp/results
        echo "abc" > /tmp/results/a.txt
  - name: step-b
    inputs:
      artifacts:
      - name: result
        path: /tmp/results/a.txt
        #mode: 0644
        #recurseMode: true
    script:
      image: debian:bullseye-slim
      command: [bash]
      source: |
        set -e
        ls -l /tmp/
        ls -l /tmp/results/
        cat /tmp/results/a.txt
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000

The following workflow fails:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-passing-fail-
spec:
  entrypoint: overall
  templates:
  - name: overall
    dag:
      tasks:
        - name: step-a
          template: step-a
        - name: step-b
          template: step-b
          depends: "step-a"
          arguments:
            artifacts:
            - name: result
              from: "{{tasks.step-a.outputs.artifacts.result}}"
  - name: step-a
    outputs:
      artifacts:
        - name: result
          path: /tmp/results
    script:
      image: debian:bullseye-slim
      command: [bash]
      source: |
        mkdir /tmp/results
        echo "abc" > /tmp/results/a.txt
  - name: step-b
    inputs:
      artifacts:
      - name: result
        path: /tmp/results
        #mode: 0644
        #recurseMode: true
    script:
      image: debian:bullseye-slim
      command: [bash]
      source: |
        set -e
        ls -l /tmp/
        ls -l /tmp/results/
        cat /tmp/results/a.txt
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000


Logs from the workflow controller

time="2022-09-21T17:56:57.159Z" level=info msg="Processing workflow" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:56:57.168Z" level=info msg="Updated phase  -> Running" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:56:57.168Z" level=info msg="DAG node artifact-passing-fail-dxfjp initialized Running" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:56:57.168Z" level=info msg="All of node artifact-passing-fail-dxfjp.step-a dependencies [] completed" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:56:57.168Z" level=info msg="Pod node artifact-passing-fail-dxfjp-3288406837 initialized Pending" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:56:57.184Z" level=info msg="Created pod: artifact-passing-fail-dxfjp.step-a (artifact-passing-fail-dxfjp-step-a-3288406837)" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:56:57.184Z" level=info msg="TaskSet Reconciliation" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:56:57.184Z" level=info msg=reconcileAgentPod namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:56:57.197Z" level=info msg="Workflow update successful" namespace=argo-workflows phase=Running resourceVersion=195589168 workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:07.184Z" level=info msg="Processing workflow" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:07.185Z" level=info msg="Task-result reconciliation" namespace=argo-workflows numObjs=1 workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:07.185Z" level=info msg="task-result changed" namespace=argo-workflows nodeID=artifact-passing-fail-dxfjp-3288406837 workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:07.186Z" level=info msg="node changed" namespace=argo-workflows new.message= new.phase=Succeeded new.progress=0/1 nodeID=artifact-passing-fail-dxfjp-3288406837 old.message= old.phase=Pending old.progress=0/1 workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:07.186Z" level=info msg="All of node artifact-passing-fail-dxfjp.step-b dependencies [step-a] completed" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:07.186Z" level=info msg="Pod node artifact-passing-fail-dxfjp-3238073980 initialized Pending" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:07.202Z" level=info msg="Created pod: artifact-passing-fail-dxfjp.step-b (artifact-passing-fail-dxfjp-step-b-3238073980)" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:07.203Z" level=info msg="TaskSet Reconciliation" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:07.203Z" level=info msg=reconcileAgentPod namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:07.221Z" level=info msg="Workflow update successful" namespace=argo-workflows phase=Running resourceVersion=195589289 workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.203Z" level=info msg="Processing workflow" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.204Z" level=info msg="Task-result reconciliation" namespace=argo-workflows numObjs=2 workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.204Z" level=info msg="task-result changed" namespace=argo-workflows nodeID=artifact-passing-fail-dxfjp-3238073980 workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.204Z" level=info msg="node changed" namespace=argo-workflows new.message="Error (exit code 2)" new.phase=Failed new.progress=0/1 nodeID=artifact-passing-fail-dxfjp-3238073980 old.message= old.phase=Pending old.progress=0/1 workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.204Z" level=info msg="node unchanged" namespace=argo-workflows nodeID=artifact-passing-fail-dxfjp-3288406837 workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.205Z" level=info msg="Outbound nodes of artifact-passing-fail-dxfjp set to [artifact-passing-fail-dxfjp-3238073980]" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.205Z" level=info msg="node artifact-passing-fail-dxfjp phase Running -> Failed" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.205Z" level=info msg="node artifact-passing-fail-dxfjp finished: 2022-09-21 17:57:17.20509817 +0000 UTC" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.205Z" level=info msg="Checking daemoned children of artifact-passing-fail-dxfjp" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.205Z" level=info msg="TaskSet Reconciliation" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.205Z" level=info msg=reconcileAgentPod namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.205Z" level=info msg="Updated phase Running -> Failed" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.205Z" level=info msg="Marking workflow completed" namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.205Z" level=info msg="Checking daemoned children of " namespace=argo-workflows workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.210Z" level=info msg="cleaning up pod" action=deletePod key=argo-workflows/artifact-passing-fail-dxfjp-1340600742-agent/deletePod
time="2022-09-21T17:57:17.224Z" level=info msg="Workflow update successful" namespace=argo-workflows phase=Failed resourceVersion=195589401 workflow=artifact-passing-fail-dxfjp
time="2022-09-21T17:57:17.256Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo-workflows/artifact-passing-fail-dxfjp-step-a-3288406837/labelPodCompleted
time="2022-09-21T17:57:17.256Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo-workflows/artifact-passing-fail-dxfjp-step-b-3238073980/labelPodCompleted


Logs from your workflow's wait container

time="2022-09-21T17:57:12.196Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2022-09-21T17:57:12.196Z" level=info msg="No output parameters"
time="2022-09-21T17:57:12.196Z" level=info msg="No output artifacts"
time="2022-09-21T17:57:12.196Z" level=info msg="GCS Save path: /tmp/argo/outputs/logs/main.log, key: argo-workflows/artifact-passing-fail-dxfjp/artifact-passing-fail-dxfjp-step-b-3238073980/main.log"
time="2022-09-21T17:57:12.354Z" level=info msg="Save artifact" artifactName=main-logs duration=158.361385ms error="<nil>" key=argo-workflows/artifact-passing-fail-dxfjp/artifact-passing-fail-dxfjp-step-b-3238073980/main.log
time="2022-09-21T17:57:12.354Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2022-09-21T17:57:12.354Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2022-09-21T17:57:12.369Z" level=info msg="Create workflowtaskresults 201"
time="2022-09-21T17:57:12.370Z" level=info msg="stopping progress monitor (context done)" error="context canceled"
time="2022-09-21T17:57:12.370Z" level=info msg="Alloc=24976 TotalAlloc=31317 Sys=34770 NumGC=5 Goroutines=10"
time="2022-09-21T17:57:02.384Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/artifacts/result.tgz
time="2022-09-21T17:57:02.384Z" level=info msg="Successfully saved file: /tmp/argo/outputs/artifacts/result.tgz"
time="2022-09-21T17:57:02.384Z" level=info msg="GCS Save path: /tmp/argo/outputs/logs/main.log, key: argo-workflows/artifact-passing-fail-dxfjp/artifact-passing-fail-dxfjp-step-a-3288406837/main.log"
time="2022-09-21T17:57:02.674Z" level=info msg="Save artifact" artifactName=main-logs duration=290.043663ms error="<nil>" key=argo-workflows/artifact-passing-fail-dxfjp/artifact-passing-fail-dxfjp-step-a-3288406837/main.log
time="2022-09-21T17:57:02.674Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2022-09-21T17:57:02.674Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2022-09-21T17:57:02.693Z" level=info msg="Create workflowtaskresults 201"
time="2022-09-21T17:57:02.694Z" level=info msg="stopping progress monitor (context done)" error="context canceled"
time="2022-09-21T17:57:02.694Z" level=info msg="Deadline monitor stopped"
time="2022-09-21T17:57:02.694Z" level=info msg="Alloc=23224 TotalAlloc=48300 Sys=51410 NumGC=6 Goroutines=11"
@sarabala1979
Member

@chr-b Can you try on v3.3.9?

@chr-b
Author

chr-b commented Sep 24, 2022

Hi @sarabala1979 ,

The workflow artifact-passing-fail- from my examples works when using the v3.3.9 container images.

@stale

stale bot commented Oct 15, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is a mentoring request, please provide an update here. Thank you for your contributions.

@stale stale bot added the problem/stale label (This has not had a response in some time) Oct 15, 2022
@chr-b
Author

chr-b commented Oct 25, 2022

This problem still exists with Argo v3.4.2.
This breaks every workflow that uses input artifacts and a non-root user.

@stale stale bot removed the problem/stale label (This has not had a response in some time) Oct 25, 2022
@sebltm

sebltm commented Nov 4, 2022

Can confirm we have the same issue. This is really blocking us.

@k-ebu

k-ebu commented Nov 11, 2022

same issue with v3.4.3

@PacoDu

PacoDu commented Nov 16, 2022

I also confirm the issue with v3.4.3, I cannot use any artifact input directory with a non-root user.

@aneja-arun1

aneja-arun1 commented Nov 20, 2022

This is impacting us as well, with v3.4.3.
I am not able to access the artifact input directory because it has the following permissions:
drwx------ root root
No user other than root is able to access these files.
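
To make the listing above concrete, here is a minimal sketch of the usual Unix owner/group/other check applied to a directory. The helper name `canEnterDir` is hypothetical, the uid 1000 / gid 3000 values are taken from the reproduction workflow's securityContext, and supplementary groups and Linux capabilities are deliberately ignored, so treat it as an illustration rather than what the kernel or Argo actually runs.

```go
package main

import "fmt"

// canEnterDir reports whether a process with the given uid/gid can list and
// traverse a directory with permission bits perm, owned by ownerUID/ownerGID.
// Hypothetical helper: supplementary groups and capabilities are ignored.
func canEnterDir(perm uint32, ownerUID, ownerGID, uid, gid int) bool {
	if uid == 0 {
		return true // root normally bypasses the permission bits
	}
	var bits uint32
	switch {
	case uid == ownerUID:
		bits = (perm >> 6) & 0o7 // owner triplet
	case gid == ownerGID:
		bits = (perm >> 3) & 0o7 // group triplet
	default:
		bits = perm & 0o7 // "other" triplet
	}
	// Listing needs read (4); entering needs execute/search (1).
	return bits&0o5 == 0o5
}

func main() {
	// drwx------ root root, accessed as uid 1000 / gid 3000.
	fmt.Println(canEnterDir(0o700, 0, 0, 1000, 3000)) // false
	// drwxr-xr-x root root, same user.
	fmt.Println(canEnterDir(0o755, 0, 0, 1000, 3000)) // true
	// drw-r--r-- root root: readable but not searchable.
	fmt.Println(canEnterDir(0o644, 0, 0, 1000, 3000)) // false
}
```

The last case is a reminder that a mode such as 0644 carries no execute bit, so by itself it does not make a directory traversable for a non-owner.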

@sarabala1979 sarabala1979 added the P2 label (Important. All bugs with >=3 thumbs up that aren’t P0 or P1, plus: Any other bugs deemed important) and removed the P3 label (Low priority) Nov 28, 2022
@sarabala1979
Member

@aneja-arun1 @PacoDu can you uncomment the lines below and try again?

        mode: 0644
        recurseMode: true

@chr-b
Author

chr-b commented Nov 29, 2022

Hi @sarabala1979 ,
Adding mode and recurseMode to the input file artifact does not resolve the problem.

@dcd000

dcd000 commented Dec 21, 2022

Hi all,
Same issue with v3.4.4.
I think the problem is related to #8292
That PR replaces the untar function in workflow/executor/executor.go with a Go-based implementation that creates the directories with 0o700 permissions (owner only) instead of 0o755 (which also allows read/execute for group and others).

https://github.com/argoproj/argo-workflows/pull/8292/files#diff-791eed50c295312394166c66addd3b676cc50ba3126730304c1fab2d5cac7a23R836

I tested changing the directory permissions to 0o755, and it works now.

@stale

stale bot commented Jan 21, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is a mentoring request, please provide an update here. Thank you for your contributions.

@stale stale bot added the problem/stale label (This has not had a response in some time) Jan 21, 2023
@rab-skybrid

Bump. This is still a blocker issue for us.

@stale stale bot removed the problem/stale label (This has not had a response in some time) Jan 23, 2023
@jrguerrero

We have the same problem. We are still waiting for a fix before we update our environment.

@iainlbc

iainlbc commented Jan 24, 2023

Just ran into this as well when trying to read from /src as nonroot user

W
fatal: failed to read object 871b916c4754bccccdd30cad6eac717b9d896cc2: Permission denied
time="2023-01-24T15:03:08.155Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 128

@RenePinnow

Same problem here.

@sandeepk8s
Contributor

sandeepk8s commented Feb 9, 2023

Same problem here. We bumped the version from 3.3.8 to 3.4.4 and it's blocking us; we may need to go back to 3.3.8.

@graillus

graillus commented Feb 9, 2023

Had the same problem upgrading to 3.4.5.

For now I'm setting the securityContext in all WorkflowSpec fields to work around the issue:

spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 1000

Hope it helps

@sandeepk8s
Contributor

sandeepk8s commented Feb 10, 2023

I did some testing on v3.4.4. Using the example above, I tried only a few modes (700, 755, and 644) as well as no mode (the default). In the end I used mode: 700 for our workflows, and it worked. Below are screenshots for each case. The problem is that we have a lot of workflows in different namespaces, and I'm not sure there is an easy way to set a default mode for the artifact path 😞

[image: screenshots of each test case]

@sandeepk8s
Contributor

sandeepk8s commented Feb 10, 2023

Thanks @graillus and @dcd000, your comments were helpful in my tests. After trying all the methods, the one below worked for me. I got it working by making two changes:

  1. Controller configmap, executor section
executor: |
    image: org.com:port/argoproj/argoexec:v3.4.4
    securityContext:
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 1000
  2. Controller configmap, workflow specs
workflowDefaults: | 
    spec:
      securityContext:
        runAsUser: 1000
        runAsNonRoot: true

@caelan-io caelan-io added the solution/suggested label (A solution to the bug has been suggested. Someone needs to implement it.) Feb 23, 2023
@alexec alexec added the type/regression label (Regression from previous behavior (a specific type of bug)) Mar 11, 2023
alexec pushed a commit that referenced this issue Mar 11, 2023
Signed-off-by: Sandeep Vagulapuram <sandeeppuram7@gmail.com>
terrytangyuan pushed a commit that referenced this issue Mar 29, 2023
Signed-off-by: Sandeep Vagulapuram <sandeeppuram7@gmail.com>
JPZ13 pushed a commit to pipekit/argo-workflows that referenced this issue Jul 4, 2023
…10664)

Signed-off-by: Sandeep Vagulapuram <sandeeppuram7@gmail.com>