fix: fix missing artifacts for stopped workflows. Fixes #12401 #12402
Conversation
Signed-off-by: Garett MacGowan <garettsoftware@gmail.com>
…tputs in status is now considered. Signed-off-by: Garett MacGowan <garettsoftware@gmail.com>
Signed-off-by: Garett MacGowan <garettsoftware@gmail.com>
Signed-off-by: Garett MacGowan <garettsoftware@gmail.com>
Thanks for being on top of any new regressions that may have been caused by your PR! I glanced the other day, but then it sounded like you were saying that it was a UI issue. From going through that Issue now, it looks like this is not a direct fix to that issue, right? In that case, do you want to remove "Fixes" from your description and instead say something like "Related to"? Otherwise, this will automatically close that Issue.

These are two separate issues! #12331 must have been caused by another PR. I fixed that issue in #12397. This PR fixes #12401, which occurs when you stop a workflow. My previous PR (#11947) didn't guarantee that artifacts would get archived for stopped workflows, although it fixed the garbage collection.

Oh, okay!
workflow/controller/operator.go
Outdated
```diff
@@ -802,7 +802,7 @@ func (woc *wfOperationCtx) persistUpdates(ctx context.Context) {
 func (woc *wfOperationCtx) checkTaskResultsCompleted() bool {
 	taskResultsCompleted := woc.wf.Status.GetTaskResultsCompleted()
-	if len(taskResultsCompleted) == 0 {
+	if len(taskResultsCompleted) == 0 && woc.wf.Status.Outputs != nil {
```
It does seem like we want to make sure there are actually Outputs.
But now it makes me wonder, when do the Outputs get set? Is it possible to arrive here when Outputs==nil
because they haven't been set yet?
@juliev0 Happy Holidays!

Outputs are first initialized and populated (at a node level) in `taskResultReconciliation()`. Outputs are subsequently initialized and populated to the global scope `wf.Status.Outputs` in `woc.podReconciliation()` via `woc.addOutputsToGlobalScope(newState.Outputs)`, assuming `Parameters` or `Artifacts` are defined for any node. Therefore, `Outputs == nil` when either `woc.taskResultReconciliation()` hasn't been run yet, or `Parameters` or `Artifacts` aren't defined for any node in the workflow.

As long as `checkTaskResultsCompleted()` has been run after `taskResultReconciliation()`, `&& woc.wf.Status.Outputs != nil` is meaningful.
`checkTaskResultsCompleted()` is used in two places:

- Before `woc.garbageCollectArtifacts(ctx)` (after `taskResultReconciliation()`):

```go
// Do artifact GC if all task results are completed.
if woc.checkTaskResultsCompleted() {
	if err := woc.garbageCollectArtifacts(ctx); err != nil {
		woc.log.WithError(err).Error("failed to GC artifacts")
		return
	}
} else {
	woc.log.Debug("Skipping artifact GC")
}
```
- Before `woc.deleteTaskResults(ctx)` as part of `persistUpdates(ctx)` (called in conjunction with `woc.wf.Status.Phase.Completed()`, which ensures that the workflow has either succeeded, failed, or errored, and therefore after `checkTaskResultsCompleted()` has had a chance to run):

```go
// Make sure the TaskResults are incorporated into WorkflowStatus before we delete them.
if woc.wf.Status.Phase.Completed() && woc.checkTaskResultsCompleted() {
	if err := woc.deleteTaskResults(ctx); err != nil {
		woc.log.WithError(err).Warn("failed to delete task-results")
	}
}
```
Happy holidays!

Trying to follow. So, based on your statement about how "Outputs are subsequently initialized and populated to the global scope `wf.Status.Outputs` in `woc.podReconciliation()`", doesn't that mean that `Outputs` could also be nil in the case that `taskResultReconciliation()` had been run but `podReconciliation()` had not?
Oops, I meant to say:

- Before `woc.deleteTaskResults(ctx)` as part of `persistUpdates(ctx)` (called in conjunction with `woc.wf.Status.Phase.Completed()`, which ensures that the workflow has either succeeded, failed, or errored, and therefore after `woc.podReconciliation()` has had a chance to run).

I missed the obvious (case 3), which I just introduced, inside `markWorkflowPhase`, which is called at the end of the operator loop under various conditions.

```go
switch phase {
case wfv1.WorkflowSucceeded, wfv1.WorkflowFailed, wfv1.WorkflowError:
	// wait for all daemon nodes to get terminated before marking workflow completed & make sure task results are complete as well.
	if phase.Completed() && !woc.hasDaemonNodes() && woc.checkTaskResultsCompleted() {
		woc.log.Info("Marking workflow completed")
```
Good catch, yes, you are right. The case you identified can happen for `woc.garbageCollectArtifacts(ctx)` (case 1), which gets called right after `woc.taskResultReconciliation()`, but before `woc.podReconciliation()`. I think a race condition could occur where the `taskResultInformer` hasn't received an update yet, and therefore garbage collection is allowed to happen because `wf.Status.Outputs == nil`.

Since `woc.wf.Status.Outputs != nil` is meant to check that outputs are in global scope (for archiving purposes), maybe it's best that we pull that logic up to `if phase.Completed() && !woc.hasDaemonNodes() && woc.checkTaskResultsCompleted() {`.

For example:

```go
switch phase {
case wfv1.WorkflowSucceeded, wfv1.WorkflowFailed, wfv1.WorkflowError:
	taskResultsCompleted := woc.checkTaskResultsCompleted()
	// wait for all daemon nodes to get terminated before marking workflow completed & make sure task results are complete as well.
	if phase.Completed() && !woc.hasDaemonNodes() && (taskResultsCompleted || (!taskResultsCompleted && woc.wf.Status.Outputs == nil)) {
		woc.log.Info("Marking workflow completed")
```
The above doesn't work. I'm thinking through it and will propose another solution.
@juliev0 I suggest the following changes. Notice the new `considerStatusOutputs bool` parameter for `checkTaskResultsCompleted`:

```go
func (woc *wfOperationCtx) checkTaskResultsCompleted(considerStatusOutputs bool) bool {
	taskResultsCompleted := woc.wf.Status.GetTaskResultsCompleted()
	if considerStatusOutputs && woc.wf.Status.Outputs != nil && len(taskResultsCompleted) == 0 {
		return false
	}
	if !considerStatusOutputs && len(taskResultsCompleted) == 0 {
		return false
	}
	for _, completed := range taskResultsCompleted {
		if !completed {
			return false
		}
	}
	return true
}
```

Case 1)

```go
// Do artifact GC if all task results are completed.
if woc.checkTaskResultsCompleted(false) {
	if err := woc.garbageCollectArtifacts(ctx); err != nil {
		woc.log.WithError(err).Error("failed to GC artifacts")
		return
	}
} else {
	woc.log.Debug("Skipping artifact GC")
}
```

Case 2)

```go
// Make sure the TaskResults are incorporated into WorkflowStatus before we delete them.
if woc.wf.Status.Phase.Completed() && woc.checkTaskResultsCompleted(true) {
	if err := woc.deleteTaskResults(ctx); err != nil {
		woc.log.WithError(err).Warn("failed to delete task-results")
	}
}
```

Case 3)

```go
switch phase {
case wfv1.WorkflowSucceeded, wfv1.WorkflowFailed, wfv1.WorkflowError:
	// wait for all daemon nodes to get terminated before marking workflow completed & make sure task results are complete as well.
	if phase.Completed() && !woc.hasDaemonNodes() && woc.checkTaskResultsCompleted(true) {
		woc.log.Info("Marking workflow completed")
```
Sorry, going back... let's start from "what was the original intention of this new check?" To handle which use case?
@juliev0 If we don't do this check, the operator will hang in marking the workflow as completed if the workflow has no nodes with outputs, because `woc.checkTaskResultsCompleted` will return false if `len(taskResultsCompleted) == 0`.
`len(taskResultsCompleted) == 0` can happen if there are no outputs for any nodes, so we can't always say that task results have not completed if `len(taskResultsCompleted) == 0`. If `wf.Status.Outputs` is defined and `len(taskResultsCompleted) == 0`, then we can say that task results have not completed.

If `wf.Status.Outputs` is defined (outputs or parameters do exist in at least one node), and `taskResultsCompleted` is an empty map, then we know that the wait containers aren't done executing.

The new usage of `woc.checkTaskResultsCompleted(true)` in `case wfv1.WorkflowSucceeded, wfv1.WorkflowFailed, wfv1.WorkflowError:` is to prevent the workflow from archiving before the status is updated with the outputs.
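As a sanity check on the semantics described above, here is a standalone sketch of the proposed `considerStatusOutputs` logic with the real Argo structs replaced by plain parameters (`outputsSet` stands in for `woc.wf.Status.Outputs != nil`); this is illustrative only, not the actual controller code:

```go
// Toy demonstration of the considerStatusOutputs semantics discussed above.
// All names and parameters here are simplified stand-ins, not the real Argo API.
package main

import "fmt"

func checkTaskResultsCompleted(considerStatusOutputs bool, outputsSet bool, taskResultsCompleted map[string]bool) bool {
	// Outputs exist somewhere in the workflow, but no task result has been
	// recorded yet: the wait containers can't be done.
	if considerStatusOutputs && outputsSet && len(taskResultsCompleted) == 0 {
		return false
	}
	if !considerStatusOutputs && len(taskResultsCompleted) == 0 {
		return false
	}
	for _, completed := range taskResultsCompleted {
		if !completed {
			return false
		}
	}
	return true
}

func main() {
	// Outputs set but no task result recorded yet: not complete.
	fmt.Println(checkTaskResultsCompleted(true, true, nil)) // false
	// No outputs anywhere in the workflow: an empty map is fine.
	fmt.Println(checkTaskResultsCompleted(true, false, nil)) // true
	// One result still pending: not complete.
	fmt.Println(checkTaskResultsCompleted(true, true, map[string]bool{"pod-a": false})) // false
}
```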
@juliev0 I just updated my last two comments since I did a really poor job in answering the first time.
workflow/controller/operator.go
Outdated
```diff
@@ -802,7 +802,7 @@ func (woc *wfOperationCtx) persistUpdates(ctx context.Context) {
 func (woc *wfOperationCtx) checkTaskResultsCompleted() bool {
 	taskResultsCompleted := woc.wf.Status.GetTaskResultsCompleted()
-	if len(taskResultsCompleted) == 0 {
+	if len(taskResultsCompleted) == 0 && woc.wf.Status.Outputs != nil {
 		return false
 	}
```
Sorry, do you mind if we first go back to the original code and talk about the case you were trying to take care of with `if len(taskResultsCompleted) == 0 {`?
BTW, maybe we should rename `taskResultsCompleted` to something like `taskResultCompletionStatus`, since it sounds like we're talking about the number completed here, but we're not?
As far as I can tell, we're looking at the case in which no taskresults have ever even been initialized as incomplete, right?
Yeah, we could probably do with a rename.
Yes, we're looking at the case where no task results have been initialized as incomplete yet.
Is this the case we're concerned about, or is it more generally speaking the case in which a task result would be initialized in the future and hasn't been yet? As far as I can tell, it's once a Pod exists that a new task result will get added? So, wouldn't it be the same case if say this map has one entry and it's completed, but another task result simply hasn't been initialized yet?
💩 Yes, you're right. We should be handling that. I'm going to think on this a bit more & try to come up with something better.
I've flipped the way I'm looking at this and reformulated the check as `checkTaskResultsInProgress`. When a task result hasn't completed yet, it is added to a `TaskResultsInProgress` map. Once it is completed, it is removed from the map.

```go
func (woc *wfOperationCtx) checkTaskResultsInProgress() bool {
	taskResultsInProgress := woc.wf.Status.GetTaskResultsInProgress()
	if len(taskResultsInProgress) != 0 {
		return true
	}
	return false
}
```
In all three cases, `!woc.checkTaskResultsInProgress()` is called. All tests seem to be passing & the code is much easier to understand. I'm re-running the test script from my last PR to ensure the garbage collection race condition doesn't exist. #11947 (comment).

Assuming all the tests pass locally, I'll push and you can review @juliev0.
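To illustrate the in-progress bookkeeping described above, here is a toy sketch (the type and method names are hypothetical stand-ins, not the real Argo API): an entry is added when a task result is created and removed once it completes, so an empty map means nothing is in flight.

```go
// Hypothetical sketch of "task results in progress" bookkeeping, keyed by pod name.
// Not the actual Argo Workflows implementation.
package main

import "fmt"

type wfStatus struct {
	taskResultsInProgress map[string]bool
}

// markTaskResultInProgress records that a task result was created but its
// wait container has not finished reporting outputs yet.
func (s *wfStatus) markTaskResultInProgress(podName string) {
	if s.taskResultsInProgress == nil {
		s.taskResultsInProgress = map[string]bool{}
	}
	s.taskResultsInProgress[podName] = true
}

// markTaskResultCompleted removes the entry once reconciliation has
// incorporated the result into the workflow status.
func (s *wfStatus) markTaskResultCompleted(podName string) {
	delete(s.taskResultsInProgress, podName)
}

// checkTaskResultsInProgress mirrors the check in the PR: any entry at all
// means some task result is still being processed.
func (s *wfStatus) checkTaskResultsInProgress() bool {
	return len(s.taskResultsInProgress) != 0
}

func main() {
	s := &wfStatus{}
	s.markTaskResultInProgress("pod-a")
	fmt.Println(s.checkTaskResultsInProgress()) // true: pod-a still in flight
	s.markTaskResultCompleted("pod-a")
	fmt.Println(s.checkTaskResultsInProgress()) // false: map is empty again
}
```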
@juliev0 The case of "no task results have been initialized as incomplete yet" doesn't actually seem to be an issue based on my testing (with the new approach).
… 12401. Signed-off-by: Garett MacGowan <garettsoftware@gmail.com>
Signed-off-by: Garett MacGowan <garettsoftware@gmail.com>
workflow/controller/operator.go
Outdated
```diff
@@ -240,7 +240,7 @@ func (woc *wfOperationCtx) operate(ctx context.Context) {
 	woc.taskResultReconciliation()

 	// Do artifact GC if all task results are completed.
-	if woc.checkTaskResultsCompleted() {
+	if !woc.checkTaskResultsInProgress() {
```
We don't need to check if `Phase.Completed()` for this case?
The artifact gc strategy protects us for this case. We could conceivably add this though. It would make things a bit more efficient.
plus, if we have common logic between this and the other calls, we might be able to have a common function, which would be nice
I'm actually kind of thinking about a common function which checks for the Workflow being completed, plus compares the number of pods that were run to the number of task results that got incorporated into the `WorkflowStatus.Outputs`, which could be called `Workflow.ReconciliationComplete()` or something like that. Does that make any sense?

This method could be called at any place in the code, and it should work.
I know in this case you were only concerned about which taskResults were saved at all, as opposed to which ones had actually gone through `podReconciliation()` (incorporated into `WorkflowStatus.Outputs` as well), so maybe the common function wouldn't be used here if it doesn't make sense.
(just edited my comment 2 comments back because it wasn't clear what i was talking about)
workflow/controller/operator.go
Outdated
```diff
-	// wait for all daemon nodes to get terminated before marking workflow completed
-	if markCompleted && !woc.hasDaemonNodes() {
+	// wait for all daemon nodes to get terminated before marking workflow completed & make sure task results are complete as well.
+	if phase.Completed() && !woc.hasDaemonNodes() && !woc.checkTaskResultsInProgress() {
```
We don't need `podReconciliation()` to have updated the Workflow Status for this one?
workflow/controller/operator.go
Outdated
```diff
@@ -788,7 +788,7 @@ func (woc *wfOperationCtx) persistUpdates(ctx context.Context) {
 	}

 	// Make sure the TaskResults are incorporated into WorkflowStatus before we delete them.
-	if woc.wf.Status.Phase.Completed() && woc.checkTaskResultsCompleted() {
+	if woc.wf.Status.Phase.Completed() && !woc.checkTaskResultsInProgress() {
```
Do we also need to make sure that the podReconciliation has occurred to incorporate the taskResults into the Workflow's Status, or is that not needed based on where this is being called from?

Something I alluded to in my other comment, but I'm kind of wondering if there could be a risk in only checking whether there's a task result in progress - that requires that it got added to the map in the first place. It may be okay if it's only called from a particular place in the code, I suppose. But I was thinking: isn't the simplest thing to check if the Workflow is complete, plus confirm that all TaskResults have been fully processed (therefore equal to the number of Pods run), which could be a function called from anywhere? Just let me know if you disagree! I haven't looked that carefully.
What you are saying is probably the safest solution. Here's what I'm thinking now: As part of … Assuming the above issue is resolved, it should be okay to use something like:

```go
func (woc *wfOperationCtx) checkReconciliationComplete() bool {
	numCompletedNodes := len(woc.wf.Status.Nodes.Filter(func(x wfv1.NodeStatus) interface{} {
		return x.Phase.Completed()
	}))
	return woc.wf.Status.Phase.Completed() && numCompletedNodes == len(woc.wf.Status.GetTaskResultsCompleted())
}
```

Thoughts?
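A self-contained toy version of the `checkReconciliationComplete()` idea, with simplified stand-in types instead of the real `wfv1` structs, might look like this:

```go
// Toy model of the proposal above: reconciliation is considered complete when
// the workflow is done and the number of completed nodes matches the number
// of completed task results. Types are illustrative stand-ins only.
package main

import "fmt"

type nodeStatus struct{ completed bool }

func checkReconciliationComplete(workflowCompleted bool, nodes []nodeStatus, taskResultsCompleted map[string]bool) bool {
	numCompletedNodes := 0
	for _, n := range nodes {
		if n.completed {
			numCompletedNodes++
		}
	}
	numCompletedResults := 0
	for _, done := range taskResultsCompleted {
		if done {
			numCompletedResults++
		}
	}
	return workflowCompleted && numCompletedNodes == numCompletedResults
}

func main() {
	nodes := []nodeStatus{{true}, {true}}
	// Only one of two task results incorporated: not reconciled yet.
	fmt.Println(checkReconciliationComplete(true, nodes, map[string]bool{"a": true})) // false
	// Both results incorporated: reconciled.
	fmt.Println(checkReconciliationComplete(true, nodes, map[string]bool{"a": true, "b": true})) // true
}
```

As the follow-up comments note, this counting approach turned out to be unreliable when a workflow is terminated before its wait containers start, which is why an alternative was ultimately pushed.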
Thanks for all the flexibility on the approach! I'm on vacation now, so may not get a chance to look at your response until after 1/1, unfortunately. I'll get back to you then.

Have a good vacation! I will take a crack at what I've suggested above.
…j#12401 Signed-off-by: Garett MacGowan <garettsoftware@gmail.com>
@juliev0 I had to make some tweaks to the implementation I suggested above, but I think I've got something more along the lines of what you were thinking.

Working through the test failures. They have to do with termination as far as I can tell.

Unfortunately, the number of pods run is not a reliable indicator of wait container execution. If the workflow is terminated as soon as it is created, the wait container does not start up. I'm going to push an alternative solution.
…robustness of reconciliation completion check. Add check for workflow completion to reconciliation completion check. Add call to addOutputsToGlobalScope to ensure that reconciled outputs are included in archived workflows. Fixes: argoproj#12401 Signed-off-by: Garett MacGowan <garettsoftware@gmail.com>
…ent. Fixes argoproj#12401. Signed-off-by: Garett MacGowan <garettsoftware@gmail.com>
@juliev0 I feel like the tests fail 1/3 of the time because the argo exec image download keeps timing out. It's super annoying to have to keep submitting empty commits to re-run the CI. Is there anything in the works to address this?
Signed-off-by: Garett MacGowan <garettsoftware@gmail.com>
Sorry, I know the feeling. There was a discussion of potentially implementing automatic retries for failures in the CI, but somebody decided against it since it would hide actual failures, and they preferred to address failures. As for the argoexec image download timing out, I'm not sure why that's happening so frequently. Is it just today, or have you seen it in the past too?

@Garett-MacGowan Thanks for keeping an eye on the new incoming bugs. Now that I've merged this, it would be great if you could continue to keep an eye on anything that could be caused by this, just in case, such as related to output parameters/artifacts, or workflows not being cleaned up, etc.

Thanks for merging! Will do.

This one has been happening for at least the last few days. I can understand not wanting to automatically retry test failures, but setup failures (like image downloads) fall into a different category. Perhaps we can increase the timeout for that? Maybe it's happening when there is a particularly high load on the image server?

Very true. Definitely feel free to create a PR for that if you're motivated! (I for one am only spending a limited amount of time working on this repo these days.)
argoproj#12402) Signed-off-by: Garett MacGowan <garettsoftware@gmail.com> Signed-off-by: Raymond Chow <rchow@atlassian.com>
Hi all,

We face a regression problem after upgrading to release … We have a global artifact which is used and exported in two different steps WITH different values.

In the previous release, we could get the current global artifact from the last step (e2e-test-output-testupdate). I found there is a NEW change here:

```go
// Add outputs to global scope here to ensure that they are reflected in archive.
woc.addOutputsToGlobalScope(newNode.Outputs)
```

Which will overwrite the global artifact after every task, and should be the reason our process failed. We can find the overwriting log here: …

I don't know if I found the correct code and don't know how to solve that. Can you please help check if my investigation is correct? I also posted a comment in the argo-workflow channel. Thanks!

Maybe also cc @Garett-MacGowan for help. Thanks!
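For illustration, the overwrite behavior being described — each `addOutputsToGlobalScope`-style call replaces any existing global artifact with the same `globalName`, so the most recently reconciled node wins — can be sketched as follows (names here are hypothetical, not the real Argo API):

```go
// Minimal sketch of "last writer wins" for global artifacts.
// Not the actual Argo Workflows implementation.
package main

import "fmt"

type artifact struct {
	globalName string
	value      string
}

// addToGlobalScope mimics the overwrite: each node's outputs replace any
// existing global artifact with the same globalName.
func addToGlobalScope(global map[string]string, outputs []artifact) {
	for _, a := range outputs {
		global[a.globalName] = a.value
	}
}

func main() {
	global := map[string]string{}
	// A later step exports the updated value...
	addToGlobalScope(global, []artifact{{"testUpdate", "change update!!!"}})
	// ...but re-reconciling an earlier node overwrites it again,
	// which matches the regression reported above.
	addToGlobalScope(global, []artifact{{"testUpdate", "test update"}})
	fmt.Println(global["testUpdate"]) // test update
}
```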
Maybe also cc @Garett-MacGowan for help. Thanks! |
@zhangtbj Do you have a workflow example to reproduce? I can see if I can fix it for the next release.

This makes it sound like you are running these steps in parallel. If that's the case, then I think you're getting expected behavior. Either way, send along a workflow that reproduces the issue and I will have a better idea of what you're trying to do.
Hi @Garett-MacGowan, thanks for your reply. We are not running these steps in parallel, but it is also not a simple workflow. Please check:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
generation: 21
generateName: test-artifact-12402-
namespace: action-dev2
spec:
activeDeadlineSeconds: 28800
arguments: {}
entrypoint: test-root
serviceAccountName: test
templates:
- inputs: {}
metadata: {}
name: test-root
outputs: {}
steps:
- - arguments: {}
name: inject-input-artifact
template: inject-input-artifact
- - arguments:
artifacts:
- from: '{{workflow.outputs.artifacts.testInput}}'
name: testInput
- from: '{{workflow.outputs.artifacts.testUpdate}}'
name: testUpdate
name: start-entrypoint
template: main
- - arguments:
artifacts:
- from: '{{workflow.outputs.artifacts.testUpload}}'
name: testUpload
name: upload-output-type-instances-step
template: upload-output-type-instances
- - arguments:
artifacts:
- from: '{{workflow.outputs.artifacts.testUpdate}}'
name: testUpdate
name: upload-update-type-instances-step
template: upload-update-type-instances
- inputs: {}
metadata: {}
name: main
outputs: {}
steps:
- - arguments:
artifacts:
- from: '{{workflow.outputs.artifacts.testInput}}'
name: input
name: cp
template: cp
- - arguments:
artifacts:
- from: '{{workflow.outputs.artifacts.testUpdate}}'
name: input-parameters
name: render-update-ti
template: main-render-update-ti-template
- - arguments:
artifacts:
- from: '{{steps.cp.outputs.artifacts.upload}}'
name: testUpload
name: output-testupload
template: output-testupload
- - arguments:
artifacts:
- from: '{{steps.render-update-ti.outputs.artifacts.render}}'
name: testUpdate
name: output-testupdate
template: output-testupdate
- container:
image: alpine:3.7
command: [sh, -c]
args: ["sleep 1; echo -n test input > /testInput.yaml; echo -n test update > /testUpdate.yaml"]
name: inject-input-artifact
outputs:
artifacts:
- globalName: testInput
name: testInput
path: /testInput.yaml
- globalName: testUpdate
name: testUpdate
path: /testUpdate.yaml
- container:
command: [sh, -c]
args: ["sleep 1; cp /input /upload"]
image: alpine:3.7
name: ""
resources: {}
inputs:
artifacts:
- name: input
path: /input
metadata: {}
name: cp
outputs:
artifacts:
- name: upload
path: /upload
- container:
image: alpine:3.7
command: [sh, -c]
args: ["sleep 1; echo -n change update!!! > /render.yml"]
metadata: {}
name: main-render-update-ti-template
outputs:
artifacts:
- name: render
path: /render.yml
- container:
args:
- sleep 1
command:
- sh
- -c
image: alpine:3.18.4
name: ""
resources: {}
inputs:
artifacts:
- name: testUpload
path: /typeinstance
metadata: {}
name: output-testupload
outputs:
artifacts:
- globalName: testUpload
name: testUpload
path: /typeinstance
- container:
args:
- sleep 1
command:
- sh
- -c
image: alpine:3.18.4
name: ""
resources: {}
inputs:
artifacts:
- name: testUpdate
path: /typeinstance
metadata: {}
name: output-testupdate
outputs:
artifacts:
- globalName: testUpdate
name: testUpdate
path: /typeinstance
- container:
command: [sh, -c]
args: ["sleep 1; cat /upload/typeInstances/testUpload; cat /upload/typeInstances/testUpload > /upload/result.yaml"]
image: alpine:3.18.4
name: ""
resources: {}
inputs:
artifacts:
- name: testUpload
path: /upload/typeInstances/testUpload
metadata: {}
name: upload-output-type-instances
outputs:
artifacts:
- globalName: uploadresult
name: uploadresult
path: /upload/result.yaml
- container:
command: [sh, -c]
args: ["sleep 1; cat /upload/typeInstances/testUpdate"]
image: alpine:3.18.4
name: ""
resources: {}
inputs:
artifacts:
- name: testUpdate
path: /upload/typeInstances/testUpdate
metadata: {}
name: upload-update-type-instances
  outputs: {}

If you run this workflow several times, you will find the final global output artifact will be changed to:

But actually, it should use the artifact from the later step, like:

Please let me know if anything is not clear. Thanks!
@zhangtbj I have a hypothesis for what is wrong. I'm working through how I can properly test it. Can you open an issue with the information above? I will submit a fix linked to that issue. Please mention this issue in the new issue. Thanks!

Hi @Garett-MacGowan, sure thing. I opened a new task: #12554 and linked this issue in the new issue. Thanks!
Fixes #12401
Motivation
Stopped workflows would not have artifacts available in the UI, which can be particularly annoying when debugging a workflow.
Modifications
Ensure workflow is archived only once all task results are finished processing.
Verification
Added an Argo server test to ensure that the artifact is available. Manually verified that artifacts are now present in the UI.