High latency during workflow submission (e.g. validation of longer Workflows with many templateRefs) #13403

Closed
4 tasks done
gpomykala opened this issue Jul 26, 2024 · 2 comments
Labels
area/api, area/cli, area/sdks, area/workflow-templates, P3, solution/duplicate, type/bug

Comments

gpomykala commented Jul 26, 2024

Pre-requisites

  • I have double-checked my configuration
  • I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened? What did you expect to happen?

Workflow submission latency is unreasonably high: it takes much longer than a call to create a Workflow CR via kube-apiserver normally takes.
It affects submission via:

  • Argo UI
  • Argo CLI
  • Go Argo client

There is also a linear correlation between the number of workflow template references and the overall submission time.

➜  Documents time argo submit --from workflowtemplate/10-echos  
...    
argo submit --from workflowtemplate/10-echos  0.17s user 0.04s system 6% cpu 3.365 total
➜  Documents time argo submit --from workflowtemplate/20-echos
...
argo submit --from workflowtemplate/20-echos  0.16s user 0.05s system 2% cpu 7.520 total
➜  Documents time argo submit --from workflowtemplate/40-echos
...     
argo submit --from workflowtemplate/40-echos  0.31s user 0.08s system 2% cpu 15.558 total
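
As a rough check of the linear relationship, the wall-clock totals from the three runs above can be divided by the number of templateRefs in each template. This is only a minimal sketch over three data points, written in Python for convenience; the variable names are illustrative and nothing here comes from the Argo codebase.

```python
# Wall-clock totals copied from the `time argo submit` runs above.
# Keys are the number of templateRefs, values are seconds.
measurements = {10: 3.365, 20: 7.520, 40: 15.558}

# Cost per templateRef for each run. If submission time were purely
# linear in the number of references, these values would be equal.
per_ref = {n: total / n for n, total in measurements.items()}

for n, cost in per_ref.items():
    print(f"{n:>2} templateRefs: {cost:.3f} s per reference")
```

The per-reference cost stays in a narrow band (roughly 0.34 to 0.39 s), which is consistent with submission time growing approximately linearly with the number of references.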

Workflow submissions also do not parallelize very well: batch submissions take considerably longer than a single workflow submission.

#!/bin/bash

# Function to run argo submit
run_argo() {
  argo submit --from workflowtemplate/10-echos > /dev/null
}

export -f run_argo

# Measure the time to run 300 instances in parallel
time parallel -j 300 run_argo ::: {1..300}

➜  Documents ./submit_parallel.sh

real	0m13.796s
user	0m40.132s
sys	0m16.141s

The submission time does not seem to have an upper bound: we've seen batch submissions take hundreds of seconds for more sophisticated workflows of ~80 steps, each referencing some template.

The main culprits are:

  • the same code path is used to submit a workflow, whether via Argo Server or argoKubeClient
  • in all cases, validate.ValidateWorkflow is called when a workflow is created
  • validate.ValidateWorkflow relies on a template resolver that has no caching
  • the validation logic recursively scans the workflow body, following templates, their contents, nested references, etc.
  • the validation logic does not use parallelism at all (it recurses within a single goroutine)

As a result, templates are fetched from kube-apiserver, deserialized, and copied in memory every time validation encounters a template reference. This is not the only contributor to overall execution time, but it is the major one.
There is also considerable load on kube-apiserver: we've seen as many as 150 GET template calls for every POST that creates a workflow resource.
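
To illustrate why the missing cache matters, here is a minimal sketch of the idea in Python. It is not the Argo implementation (which is Go), and `FakeTemplateStore`, `CachingResolver`, and `validate` are hypothetical names made up for this example. It assumes, for simplicity, exactly one lookup per templateRef; the real validation path may issue more.

```python
from collections import Counter


class FakeTemplateStore:
    """Hypothetical stand-in for kube-apiserver that counts GET calls."""

    def __init__(self, templates):
        self._templates = templates
        self.gets = Counter()

    def get(self, name):
        self.gets[name] += 1
        # Return a copy to mimic a fresh deserialization on every call.
        return dict(self._templates[name])


class CachingResolver:
    """Hypothetical resolver that memoizes lookups for one validation pass."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def get(self, name):
        if name not in self._cache:
            self._cache[name] = self._store.get(name)
        return self._cache[name]


def validate(task_refs, resolver):
    """Resolve each task's templateRef, as a validation walk would."""
    for ref in task_refs:
        resolver.get(ref)


templates = {
    "echo-1": {"entrypoint": "print-message-1"},
    "echo-2": {"entrypoint": "print-message-2"},
}
# Same shape as 40-echos: 10 references to echo-1, 30 to echo-2.
refs = ["echo-1"] * 10 + ["echo-2"] * 30

uncached_store = FakeTemplateStore(templates)
validate(refs, uncached_store)
uncached_gets = sum(uncached_store.gets.values())

cached_store = FakeTemplateStore(templates)
validate(refs, CachingResolver(cached_store))
cached_gets = sum(cached_store.gets.values())

print(uncached_gets, cached_gets)  # prints "40 2"
```

In this sketch the cache lives only for a single validation pass, so it cannot serve a stale template to a later submission. Whether that scope is right for Argo is a design question for the maintainers.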

I discussed this issue in the #argo-workflows Slack channel; hopefully the problem gets more traction here.

Version(s)

v3.5.8

Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

YAML files
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: echo-1
spec:
  entrypoint: print-message-1
  templates:
    - name: print-message-1
      inputs:
        parameters:
          - name: message
            value: "hello world!"
      container:
        image: docker/whalesay
        command: [cowsay]
        args: ["{{inputs.parameters.message}}"]
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: echo-2
spec:
  entrypoint: print-message-2
  templates:
    - name: print-message-2
      inputs:
        parameters:
          - name: message
            value: "hello world!"
      container:
        image: docker/whalesay
        command: [cowsay]
        args: ["{{inputs.parameters.message}}"]
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: 10-echos
spec:
  entrypoint: start 
  activeDeadlineSeconds: 1800 
  templates:
    - name: start 
      dag:
        tasks:
          - name: echo-1
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-1
          - name: echo-2
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-2
          - name: echo-3
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-3
          - name: echo-4
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-4
          - name: echo-5
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-5
          - name: echo-6
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-6
          - name: echo-7
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-7
          - name: echo-8
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-8
          - name: echo-9
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-9
          - name: echo-10
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-10
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: 20-echos
spec:
  entrypoint: start 
  activeDeadlineSeconds: 1800 
  templates:
    - name: start 
      dag:
        tasks:
          - name: echo-1
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-1
          - name: echo-2
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-2
          - name: echo-3
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-3
          - name: echo-4
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-4
          - name: echo-5
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-5
          - name: echo-6
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-6
          - name: echo-7
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-7
          - name: echo-8
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-8
          - name: echo-9
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-9
          - name: echo-10
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-10
          - name: echo-11
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-11
          - name: echo-12
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-12
          - name: echo-13
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-13
          - name: echo-14
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-14
          - name: echo-15
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-15
          - name: echo-16
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-16
          - name: echo-17
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-17
          - name: echo-18
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-18
          - name: echo-19
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-19
          - name: echo-20
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-20
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: 40-echos
spec:
  entrypoint: start 
  activeDeadlineSeconds: 1800 
  templates:
    - name: start 
      dag:
        tasks:
          - name: echo-1
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-1
          - name: echo-2
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-2
          - name: echo-3
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-3
          - name: echo-4
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-4
          - name: echo-5
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-5
          - name: echo-6
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-6
          - name: echo-7
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-7
          - name: echo-8
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-8
          - name: echo-9
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-9
          - name: echo-10
            templateRef:
              name: echo-1
              template: print-message-1
            arguments:
              parameters:
                - name: message
                  value: echo-10
          - name: echo-11
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-11
          - name: echo-12
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-12
          - name: echo-13
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-13
          - name: echo-14
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-14
          - name: echo-15
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-15
          - name: echo-16
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-16
          - name: echo-17
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-17
          - name: echo-18
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-18
          - name: echo-19
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-19
          - name: echo-20
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-20
          - name: echo-21
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-21
          - name: echo-22
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-22
          - name: echo-23
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-23
          - name: echo-24
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-24
          - name: echo-25
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-25
          - name: echo-26
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-26
          - name: echo-27
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-27
          - name: echo-28
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-28
          - name: echo-29
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-29
          - name: echo-30
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-30
          - name: echo-31
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-31
          - name: echo-32
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-32
          - name: echo-33
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-33
          - name: echo-34
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-34
          - name: echo-35
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-35
          - name: echo-36
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-36
          - name: echo-37
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-37
          - name: echo-38
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-38
          - name: echo-39
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-39
          - name: echo-40
            templateRef:
              name: echo-2
              template: print-message-2
            arguments:
              parameters:
                - name: message
                  value: echo-40
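
The N-echos manifests above are highly repetitive, so other sizes (for example the ~80-step case mentioned earlier) could be generated rather than written by hand. The following is a minimal sketch in Python, assuming the same layout as the manifests above: the first 10 tasks reference `echo-1` and the rest reference `echo-2`. The function name `echo_tasks_template` is hypothetical.

```python
def echo_tasks_template(n: int) -> str:
    """Return a WorkflowTemplate manifest named '<n>-echos' with n DAG tasks."""
    lines = [
        "---",
        "apiVersion: argoproj.io/v1alpha1",
        "kind: WorkflowTemplate",
        "metadata:",
        f"  name: {n}-echos",
        "spec:",
        "  entrypoint: start",
        "  activeDeadlineSeconds: 1800",
        "  templates:",
        "    - name: start",
        "      dag:",
        "        tasks:",
    ]
    for i in range(1, n + 1):
        # Mirror the manifests above: tasks 1-10 use echo-1, the rest echo-2.
        ref = 1 if i <= 10 else 2
        lines += [
            f"          - name: echo-{i}",
            "            templateRef:",
            f"              name: echo-{ref}",
            f"              template: print-message-{ref}",
            "            arguments:",
            "              parameters:",
            "                - name: message",
            f"                  value: echo-{i}",
        ]
    return "\n".join(lines) + "\n"


# Build an 80-task manifest as a string; write it to a file or pipe it
# to your cluster tooling as needed.
manifest_80 = echo_tasks_template(80)
```

The generated text still depends on the `echo-1` and `echo-2` WorkflowTemplates defined at the top of the YAML, so those need to exist in the cluster before submitting from the generated template.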

Logs from the workflow controller

kubectl logs -n argo deploy/workflow-controller | grep ${workflow}

Logs from your workflow's wait container

Neither Argo Server nor argo-kube-client produces meaningful logs that would help troubleshoot this issue.

Member

agilgur5 commented Jul 26, 2024

Workflow submission latency is unreasonably high: it takes much longer than a call to create a Workflow CR via kube-apiserver normally takes.

  • in all cases, validate.ValidateWorkflow is called when a workflow is created
  • validate.ValidateWorkflow relies on a template resolver that has no caching
  • the validation logic recursively scans the workflow body, following templates, their contents, nested references, etc.
  • the validation logic does not use parallelism at all (it recurses within a single goroutine)

Your logic sounds correct to me -- validation is the main additional piece during submission and it can probably be optimized as you point out. Based on my understanding of the codebase, it also sounds about right that it would disproportionately affect longer Workflows with more template references. So that all lines up.

There have been 1 or 2 PRs to optimize template resolution (let me find them later), but otherwise it likely hasn't been optimized before.

Also I modified your links to use permalinks so that they don't become outdated.

I discussed this issue at #argo-workflows Slack, hopefully this problem gets more traction here

There are only a couple of people who actively respond there. I did give you the green light on a few of your suggestions there. An issue is definitely better for tracking purposes and going over implementation details.

@agilgur5 agilgur5 added the P3 Low priority label Jul 26, 2024
@agilgur5 agilgur5 changed the title from "High latency during workflow submission" to "High latency during workflow submission (e.g. validation of longer Workflows with many templateRefs)" Jul 26, 2024
@agilgur5 agilgur5 changed the title from "High latency during workflow submission (e.g. validation of longer Workflows with many templateRefs)" to "High latency during workflow submission (ex: validation of longer Workflows with many templateRefs)" Jul 26, 2024
@agilgur5 agilgur5 changed the title from "High latency during workflow submission (ex: validation of longer Workflows with many templateRefs)" to "High latency during workflow submission (e.g.: validation of longer Workflows with many templateRefs)" Jul 26, 2024
@agilgur5 agilgur5 changed the title from "High latency during workflow submission (e.g.: validation of longer Workflows with many templateRefs)" to "High latency during workflow submission (e.g. validation of longer Workflows with many templateRefs)" Jul 26, 2024
@agilgur5
Member

Looks like this duplicates an older issue: #7418

@argoproj argoproj locked as resolved and limited conversation to collaborators Sep 20, 2024
@agilgur5 agilgur5 added the solution/duplicate This issue or PR is a duplicate of an existing one label Sep 20, 2024