Skip to content

Conversation

@Adrien-D
Copy link
Member

Fixes #13395

Motivation

We're encountering an infinite loop because DFS sorter doesn't account for cycles in the graph. If a node is revisited while it’s still being processed, we can enter a cycle

Modifications

To fix this, I track nodes that are currently being visited (i.e., "visiting") in addition to those that are already visited ("discovered") and skip the nodes we are currently visiting.

Verification

Tested on a workflow that reproduces the error, and multiple others to make sure nothing is broken.

Here is the workflow I used
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: stack-overflow-test-
spec:
  podGC:
    strategy: OnWorkflowCompletion
  entrypoint: main
  templates:
    # Main function
    - name: main
      steps:
      - - name: step-a
          template: step-a
          arguments:
            artifacts:
              - name: test
                archive:
                  none: {}
                s3:
                  bucket: my-bucket
                  key: foo/message.txt
                  endpoint: minio.argo.svc.cluster.local:9000
                  insecure: true
                  accessKeySecret:
                    name: my-minio-cred
                    key: accesskey
                  secretKeySecret:
                    name: my-minio-cred
                    key: secretkey
      
      - - name: step-b
          template: step-b
          arguments:
            parameters:
            - name: test
              value: "{{=sprig.fromJson(steps['step-a'].outputs.parameters['test'])['this']}}"
          

      - - name: step-c
          when: "{{=steps['step-b'].outputs.parameters['test'] != ''}}"
          template: step-c
          arguments:
            parameters:
            - name: test
              value: "{{=sprig.fromJson(steps['step-b'].outputs.parameters['test'])['that']}}"
          

      - - name: step-d
          when: "{{=steps['step-b'].outputs.parameters['test'] != ''}}"
          arguments:
            artifacts:
            - name: test
              from: '{{steps.step-b.outputs.artifacts.test}}' 
          inline: 
            inputs:
              artifacts:
              - name: test
                path: /tmp/test
                archive:
                  none: {}
            outputs:
              artifacts:
              - name: test
                path: /tmp/test
                archive:
                  none: {}
                s3:
                  bucket: my-bucket
                  key: foo/message.txt
                  endpoint: minio.argo.svc.cluster.local:9000
                  insecure: true
                  accessKeySecret:
                    name: my-minio-cred
                    key: accesskey
                  secretKeySecret:
                    name: my-minio-cred
                    key: secretkey
            container:
              image: docker.io/argoproj/argosay:v2
              command: [/argosay, echo, 'This is a test']

    - name: step-a
      inputs:
        artifacts:
        - name: test
          path: /tmp/test
          archive:
            none: {}
      outputs:
        artifacts:
        - name: test
          path: /tmp/test
          archive:
            none: {}
        parameters:
        - name: test
          valueFrom:
            path: /tmp/test
      container:
        image: docker.io/argoproj/argosay:v2
        command: [/argosay, echo, 'This is a test']
    - name: step-b
      inputs:
        parameters:
        - name: test
      outputs:
        artifacts:
        - name: test
          path: /tmp/test
          archive:
            none: {}
        parameters:
        - name: test
          valueFrom:
            path: /tmp/test
      script:
        image: docker.io/library/alpine:3
        command: [/bin/sh, -xec]
        env:
        - name: TEST
          value: '{{inputs.parameters.test}}'
        args:
        - |
          echo '{"that": "'"${TEST}"'"}' > /tmp/test
    - name: step-c
      inputs:
        parameters:
        - name: test
      
      container:
        image: docker.io/argoproj/argosay:v2
        command: [/argosay, echo, 'This is a test']
Normal render Fast render
Screenshot 2025-04-15 at 17 31 07 Screenshot 2025-04-15 at 17 31 15

Documentation

…3395

Signed-off-by: Adrien Delannoy <a.delannoyfr@gmail.com>
@Adrien-D Adrien-D force-pushed the fix-dfs-recursion branch from 60eb576 to 97eaa3a Compare April 15, 2025 15:59
if (this.discovered.has(n)) {
return;
}
if (this.visiting.has(n)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Adrien-D - your Javascript/Typescript knowledge is more current than mine. Do you know how a Set's has function does object comparison? I saw some stuff that said it was a reference comparison and others saying it's a deep equals comparison. In both cases, would a node ID be better?

For the reference comparison, if an equivalent node gets compared but has a different reference due to react deep copying it on the next render, wouldn't has return a false negative?

Similarly for the deep equality, if the node gets a field added or modified, wouldn't has also produce a false negative as well?

I'll default to your expertise, but flagging in case these scenarios are valid

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JPZ13 Great catch, you're absolutely right. Set.has() is reference-based, example here. So that could definitely lead to a false comparison, but not in this case here because Node is actually the node ID (naming is confusing I agree). So it's a string that shouldn't lead to this issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's in a name? 😁

Thanks, @Adrien-D - approving now

if (this.discovered.has(n)) {
return;
}
if (this.visiting.has(n)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's in a name? 😁

Thanks, @Adrien-D - approving now

Copy link
Member

@terrytangyuan terrytangyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

@terrytangyuan terrytangyuan merged commit 364bde2 into argoproj:main Apr 17, 2025
17 checks passed
@Joibel
Copy link
Member

Joibel commented Apr 23, 2025

/cherry-pick release-3.6

gcp-cherry-pick-bot bot pushed a commit that referenced this pull request Apr 23, 2025
…4391)

Signed-off-by: Adrien Delannoy <a.delannoyfr@gmail.com>
Joibel pushed a commit that referenced this pull request Apr 23, 2025
…erry-pick #14391) (#14420)

Signed-off-by: Adrien Delannoy <a.delannoyfr@gmail.com>
Co-authored-by: Adrien Delannoy <a.delannoyfr@gmail.com>
kim-codefresh added a commit to codefresh-io/argo-workflows that referenced this pull request May 20, 2025
…abilities fixes (Cr 28355) (#358)

* fix: bump deps for k8schain to fix ecr-login (argoproj#14008) (release-3.6 cherry-pick) (argoproj#14174)

* fix(ci): python sdk release process (release-3.6) (argoproj#14183)

Signed-off-by: Alan Clucas <alan@clucas.org>

* docs: clarify qps/burst on controller (cherry-pick argoproj#14190) (argoproj#14192)

Signed-off-by: Tim Collins <tim@thecollins.team>
Co-authored-by: Tim Collins <45351296+tico24@users.noreply.github.com>

* fix(api/jsonschema): use unchanging JSON Schema version (cherry-pick argoproj#14092) (argoproj#14256)

Signed-off-by: Roger Peppe <rogpeppe@gmail.com>
Co-authored-by: Roger Peppe <rogpeppe@gmail.com>

* fix(api/jsonschema): use working `$id` (cherry-pick argoproj#14257) (argoproj#14258)

Signed-off-by: Roger Peppe <rogpeppe@gmail.com>
Co-authored-by: Roger Peppe <rogpeppe@gmail.com>

* docs: autogenerate tested k8s versions and centralize config (argoproj#14176) (release-3.6) (argoproj#14262)

Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Signed-off-by: Alan Clucas <alan@clucas.org>
Co-authored-by: Mason Malone <651224+MasonM@users.noreply.github.com>

* chore(deps): bump minio-go to newer version (argoproj#14185) (release-3.6) (argoproj#14261)

Co-authored-by: Vaibhav Kaushik <vaibhavkaushik@salesforce.com>

* fix: split pod controller from workflow controller (argoproj#14129) (release-3.6) (argoproj#14263)

* chore(deps): fix snyk (argoproj#14264) (release-3.6) (argoproj#14268)

* chore: revert to correct k8s versions

Accidental bump from argoproj#14176 cherry-pick

Signed-off-by: Alan Clucas <alan@clucas.org>

* chore(deps): bump github.com/go-jose/go-jose/v3 from 3.0.3 to 3.0.4 in the go_modules group (cherry-pick argoproj#14231) (argoproj#14269)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix: wait for workflow informer to sync before pod informer (cherry-pick argoproj#14248) (argoproj#14266)

Signed-off-by: Rohan K <rohankmr414@gmail.com>
Co-authored-by: Rohan K <rohankmr414@gmail.com>

* fix(cli): remove red from log colour selection. Fixes argoproj#6740 (cherry-pick argoproj#14215) (argoproj#14278)

Signed-off-by: Prabakaran Kumaresshan <4676330+nixphix@users.noreply.github.com>
Co-authored-by: Prabakaran Kumaresshan <4676330+nixphix@users.noreply.github.com>

* fix: correct semaphore configmap keys for multiple semaphores (argoproj#14184) (release-3.6) (argoproj#14281)

* fix: don't print help for non-validation errors. Fixes argoproj#14234 (cherry-pick argoproj#14249) (argoproj#14283)

Signed-off-by: Koichi Shimada <jumpe1programming@gmail.com>
Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Co-authored-by: koichi <51446844+jumpe1@users.noreply.github.com>
Co-authored-by: Mason Malone <651224+MasonM@users.noreply.github.com>

* docs: fix kubernetes versions (release-3.6) (argoproj#14273)

Signed-off-by: Alan Clucas <alan@clucas.org>

* fix(workflow/sync): use RWMutex to prevent concurrent map access (cherry-pick argoproj#14321) (argoproj#14322)

Signed-off-by: Ryan Currah <ryan@currah.ca>
Co-authored-by: Ryan Currah <ryan@currah.ca>

* chore(lint): update golangci-lint to 2.1.1 (argoproj#14390) (cherry-pick release-3.6) (argoproj#14417)

* chore: bump golang 1.23->1.24 (argoproj#14385) (cherry-pick release-3.6) (argoproj#14418)

* fix: gracefully handle invalid CronWorkflows and simplify logic.  (cherry-pick argoproj#14197) (argoproj#14419)

Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>

* fix: prevent dfs sorter infinite recursion on cycle. Fixes argoproj#13395 (cherry-pick argoproj#14391) (argoproj#14420)

Signed-off-by: Adrien Delannoy <a.delannoyfr@gmail.com>
Co-authored-by: Adrien Delannoy <a.delannoyfr@gmail.com>

* chore(deps): bump github.com/expr-lang/expr from 1.16.9 to 1.17.0 (argoproj#14307) (release-3.6) (argoproj#14421)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps)!: update k8s and argo-events (release-3.6) (argoproj#14424)

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: william.vanhevelingen <william.vanhevelingen@acquia.com>
Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Signed-off-by: William Van Hevelingen <william.vanhevelingen@acquia.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: William Van Hevelingen <William.VanHevelingen@acquia.com>
Co-authored-by: Mason Malone <651224+MasonM@users.noreply.github.com>

* fix: correct retry logic (argoproj#13734) (release-3.6) (argoproj#14428)

Signed-off-by: isubasinghe <isitha@pipekit.io>
Signed-off-by: Alan Clucas <alan@clucas.org>
Co-authored-by: Isitha Subasinghe <isitha@pipekit.io>

* fix: manual retries exit handler cleanup. Fixes argoproj#14180 (argoproj#14181) (release-3.6) (argoproj#14429)

Signed-off-by: isubasinghe <isitha@pipekit.io>
Signed-off-by: Alan Clucas <alan@clucas.org>
Co-authored-by: Isitha Subasinghe <isitha@pipekit.io>

* fix: correct manual retry logic. Fixes argoproj#14124 (argoproj#14328) (release-3.6) (argoproj#14430)

Signed-off-by: oninowang <oninowang@tencent.com>
Signed-off-by: Alan Clucas <alan@clucas.org>
Co-authored-by: jswxstw <jswxstw@gmail.com>

* fix: disable ALPN in argo-server as a workaround (argoproj#14433)

Signed-off-by: Alan Clucas <alan@clucas.org>

* result of codegen

Signed-off-by: Kim <kim.aharfi@codefresh.io>

* fix:lint

Signed-off-by: Kim <kim.aharfi@codefresh.io>

---------

Signed-off-by: Alan Clucas <alan@clucas.org>
Signed-off-by: Tim Collins <tim@thecollins.team>
Signed-off-by: Roger Peppe <rogpeppe@gmail.com>
Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Rohan K <rohankmr414@gmail.com>
Signed-off-by: Prabakaran Kumaresshan <4676330+nixphix@users.noreply.github.com>
Signed-off-by: Koichi Shimada <jumpe1programming@gmail.com>
Signed-off-by: Ryan Currah <ryan@currah.ca>
Signed-off-by: Adrien Delannoy <a.delannoyfr@gmail.com>
Signed-off-by: william.vanhevelingen <william.vanhevelingen@acquia.com>
Signed-off-by: William Van Hevelingen <william.vanhevelingen@acquia.com>
Signed-off-by: isubasinghe <isitha@pipekit.io>
Signed-off-by: oninowang <oninowang@tencent.com>
Signed-off-by: Kim <kim.aharfi@codefresh.io>
Co-authored-by: Alan Clucas <alan@clucas.org>
Co-authored-by: gcp-cherry-pick-bot[bot] <98988430+gcp-cherry-pick-bot[bot]@users.noreply.github.com>
Co-authored-by: Tim Collins <45351296+tico24@users.noreply.github.com>
Co-authored-by: Roger Peppe <rogpeppe@gmail.com>
Co-authored-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Co-authored-by: Vaibhav Kaushik <vaibhavkaushik@salesforce.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Rohan K <rohankmr414@gmail.com>
Co-authored-by: Prabakaran Kumaresshan <4676330+nixphix@users.noreply.github.com>
Co-authored-by: koichi <51446844+jumpe1@users.noreply.github.com>
Co-authored-by: Ryan Currah <ryan@currah.ca>
Co-authored-by: Adrien Delannoy <a.delannoyfr@gmail.com>
Co-authored-by: William Van Hevelingen <William.VanHevelingen@acquia.com>
Co-authored-by: Isitha Subasinghe <isitha@pipekit.io>
Co-authored-by: jswxstw <jswxstw@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Artifact "Cycle" Crashes UI

4 participants