ArtifactGC fails when workflows are retried #13161
Comments
This looks like a duplicate of #12845, which has a fix out in #13066
Also, v3.5.5 is neither |
All logs etc. are from our v3.5.5 installation, hence why I put it there. I tested
Indeed it does. I clearly missed it when searching through past issues for this bug. Thank you so much, happy to hear you are already working on a fix! We are enjoying working with Argo Workflows, it's a great tool 👏
Oh, gotcha, sorry about that then. It might be more clear to write "v3.5.5 +
That's because it was mistakenly filed as a discussion by the user, so it wouldn't pop up on issue search 😅 Also GH didn't unify search for Discussions when they made that feature, so things do get lost and duplicated between them 😕 That's one of the reasons I don't turn on Discussions in my personal repos
Of course, no worries! I can see how it was not clear to you. I'll keep that in mind in case I report anything in the future 🙂
That makes sense, it's neither the first nor last time this has happened 😅
Pre-requisites
:latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
What happened/what did you expect to happen?
When running a workflow with retries using artifact storage, the artifact garbage collection fails if there are any failed runs. While I haven't done a deep dive into the code, my theory is that the WorkflowArtifactGCTask is generated incorrectly. Failed runs have an artifact listed but no storage location - this is only found with the successful run. The artifact gc pod deletes logs and artifacts until an artifact without a storage location is encountered, where it produces the error shown in the logs below:
Artifact gc task:
Logs from the artifact gc pod:
From my understanding of how Argo Workflows works, I'd expect there to be some kind of check/guard on the status of the workflow/container before including its artifacts in garbage collection (although I don't know how that would work with templates producing multiple artifacts).
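A minimal sketch of the kind of guard I have in mind, written in Python purely for illustration. The Artifact type, its location field, the node IDs and split_for_gc are all hypothetical and do not mirror the actual Argo Workflows code or CRD schema; it only shows the idea of skipping artifacts that have no storage location instead of failing on them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Artifact:
    """Hypothetical, simplified artifact record (not the real Argo schema)."""
    name: str
    node_id: str
    # Assumed field: an object-store key, or None if nothing was ever uploaded.
    location: Optional[str] = None


def split_for_gc(artifacts: List[Artifact]) -> Tuple[List[Artifact], List[Artifact]]:
    """Partition artifacts into those that can be deleted and those to skip."""
    deletable, skipped = [], []
    for art in artifacts:
        # Guard: an artifact without a storage location was never uploaded,
        # so there is nothing to delete and it should not fail the GC task.
        if art.location:
            deletable.append(art)
        else:
            skipped.append(art)
    return deletable, skipped


# Example: one failed attempt (no location) and one successful retry.
artifacts = [
    Artifact("output", "retry-attempt-0"),
    Artifact("output", "retry-attempt-1", "bucket/path/output.tgz"),
]
deletable, skipped = split_for_gc(artifacts)
print([a.node_id for a in deletable])
print([a.node_id for a in skipped])
```

Whether such a check belongs where the WorkflowArtifactGCTask is generated or in the artifact gc pod itself is something I can't judge without knowing the code.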
Version
v3.5.5 - Bitnami helm chart v6.7.2
Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
Logs from the workflow controller
Logs from your workflow's wait container