Getting Error: assignment to entry in nil map when hooks are used #11521

Closed
2 of 3 tasks
sandeepk8s opened this issue Aug 4, 2023 · 8 comments · Fixed by #11535
Labels
area/hooks area/workflow-templates type/bug type/regression Regression from previous behavior (a specific type of bug)

Comments

@sandeepk8s
Contributor

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when I tested with :latest
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

After upgrading from 3.4.4 to 3.4.9, I'm encountering this error: Error: assignment to entry in nil map

I found that a recent change in operator.go is causing the issue
https://github.com/argoproj/argo-workflows/blame/cb1713d01542a7233d9bcb6646cc3c3409c5d870/workflow/controller/operator.go#L405

I tested on 3.4.8 and the workflow runs fine, but on 3.4.9 it throws the error.
I suspect the cause is the hooks. A sample workflow that reproduces the error is below.

Version

3.4.9

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

spec:
  templates:
    - name: main
      inputs: {}
      outputs: {}
      metadata: {}
      steps:
        - - name: step1
            template: hello
            arguments: {}
    - name: hello
      inputs: {}
      outputs: {}
      metadata: {}
      container:
        name: ''
        image: alpine:3.6
        command:
          - sh
          - '-c'
        args:
          - echo 'hello'
        resources: {}
    - name: send-status
      inputs:
        parameters:
          - name: state
      outputs: {}
      metadata: {}
      container:
        name: ''
        image: alpine:3.6
        command:
          - sh
          - '-c'
        args:
          - echo '{{inputs.parameters.state}}'
        resources: {}
  entrypoint: main
  arguments: {}
  hooks:
    exit:
      template: send-status
      arguments:
        parameters:
          - name: state
            value: INPROGRESS
    running:
      template: send-status
      arguments:
        parameters:
          - name: state
            value: SUCCESSFUL
      expression: workflow.status == "Running"

Logs from the workflow controller

kubectl logs -n argo deploy/workflow-controller | grep ${workflow}
time="2023-08-04T16:16:30.239Z" level=info msg="Processing workflow" namespace=testing workflow=lifecycle-hook-qrwsh
time="2023-08-04T16:16:30.245Z" level=error msg="Recovered from panic" namespace=testing r="assignment to entry in nil map" stack="goroutine 289 [running]:\nruntime/debug.Stack()\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x65\ngithub.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).operate.func2()\n\t/go/src/github.com/argoproj/argo-workflows/workflow/controller/operator.go:193 +0xbc\npanic({0x1f4c880, 0x2579ce0})\n\t/usr/local/go/src/runtime/panic.go:890 +0x263\ngithub.com/argoproj/argo-workflows/v3/workflow/util.MergeTo(0xc000470b98?, 0xc000c51680)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/util/merge.go:45 +0x305\ngithub.com/argoproj/argo-workflows/v3/workflow/util.JoinWorkflowSpec(0xc000c50598, 0xc000470b98, 0xc0008f1c18)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/util/merge.go:76 +0x1bb\ngithub.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).setStoredWfSpec(0xc000b260b0)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/controller/operator.go:3714 +0x2e5\ngithub.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).setExecWorkflow(0xc000b260b0, {0x25a2108, 0xc0001a6000})\n\t/go/src/github.com/argoproj/argo-workflows/workflow/controller/operator.go:3601 +0x211\ngithub.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).operate(0xc000b260b0, {0x25a2108, 0xc0001a6000})\n\t/go/src/github.com/argoproj/argo-workflows/workflow/controller/operator.go:207 +0x1f2\ngithub.com/argoproj/argo-workflows/v3/workflow/controller.(*WorkflowController).processNextItem(0xc0008eea00, {0x25a2108, 0xc0001a6000})\n\t/go/src/github.com/argoproj/argo-workflows/workflow/controller/controller.go:769 +0x759\ngithub.com/argoproj/argo-workflows/v3/workflow/controller.(*WorkflowController).runWorker(0xc000114ea0?)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/controller/controller.go:691 
+0x9e\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)\n\t/go/pkg/mod/k8s.io/apimachinery@v0.24.3
+0x3e\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x0?, {0x257ffc0, 0xc000556030}, 0x1, 0xc000194960)\n\t/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:156 +0xb6\nk8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0?, 0x3b9aca00, 0x0, 0x0?, 0x0?)\n\t/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:133 +0x89\nk8s.io/apimachinery/pkg/util/wait.Until(0x0?, 0x0?, 0x0?)\n\t/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:90 +0x25\ncreated by github.com/argoproj/argo-workflows/v3/workflow/controller.(*WorkflowController).Run\n\t/go/src/github.com/argoproj/argo-workflows/workflow/controller/controller.go:318 +0x1a45\n" workflow=lifecycle-hook-qrwsh
time="2023-08-04T16:16:30.245Z" level=info msg="Updated phase  -> Error" namespace=testing workflow=lifecycle-hook-qrwsh
time="2023-08-04T16:16:30.245Z" level=info msg="Updated message  -> assignment to entry in nil map" namespace=testing workflow=lifecycle-hook-qrwsh
time="2023-08-04T16:16:30.245Z" level=info msg="Marking workflow completed" namespace=testing workflow=lifecycle-hook-qrwsh
time="2023-08-04T16:16:30.245Z" level=info msg="Checking daemoned children of " namespace=testing workflow=lifecycle-hook-qrwsh
time="2023-08-04T16:16:30.252Z" level=info msg="cleaning up pod" action=deletePod key=testing/lifecycle-hook-qrwsh-1340600742-agent/deletePod
time="2023-08-04T16:16:30.255Z" level=info msg="Workflow to be dehydrated" Workflow Size=1300
time="2023-08-04T16:16:30.265Z" level=info msg="Workflow update successful" namespace=testing phase=Error resourceVersion=136010817 workflow=lifecycle-hook-qrwsh
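
For context, the message in the log above is Go's standard runtime error for writing to a map that was never initialized. The following is a minimal standalone sketch of that failure class, not the Argo Workflows code: the `Hook`, `Spec`, `mergeHooks`, and `tryMerge` names are hypothetical stand-ins chosen for illustration.

```go
package main

import "fmt"

// Hook and Spec are hypothetical stand-ins, not the Argo Workflows types.
type Hook struct {
	Template string
}

type Spec struct {
	Hooks map[string]Hook // nil until explicitly initialized
}

// mergeHooks copies hooks from src into dst without checking whether
// dst.Hooks was initialized. Writing to a nil map panics at runtime.
func mergeHooks(dst, src *Spec) {
	for name, hook := range src.Hooks {
		dst.Hooks[name] = hook
	}
}

// tryMerge runs mergeHooks and converts a panic into an error, similar in
// spirit to the "Recovered from panic" line in the controller log.
func tryMerge(dst, src *Spec) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	mergeHooks(dst, src)
	return nil
}

func main() {
	src := &Spec{Hooks: map[string]Hook{"running": {Template: "send-status"}}}
	dst := &Spec{} // Hooks is nil here, so the merge panics
	fmt.Println(tryMerge(dst, src))
}
```

Reading from a nil map is safe in Go; only the write panics, which is why a spec with no hooks of its own can fail only once hooks from another source are merged into it.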

Logs from your workflow's wait container

kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded
@terrytangyuan
Member

@GeunSam2 Would you like to take a look at this?

@GeunSam2
Contributor

GeunSam2 commented Aug 4, 2023

I'll check it ASAP.

@toyamagu-2021
Member

I cannot reproduce this in v3.4.9 running on a k3d cluster.

@sandeepk8s
Contributor Author

I'm on k8s 1.22 with Argo Workflows v3.4.9 (executor image v3.4.9).

This is with the sample workflow I posted above.


@toyamagu-2021
Member

toyamagu-2021 commented Aug 6, 2023

OK, thanks! But I cannot reproduce this problem with the following k8s versions:

  • K8s 1.26
  • K8s 1.22.17: k3d cluster create --image "rancher/k3s:v1.22.17-k3s1"
  • K8s 1.22.3: k3d cluster create --image "rancher/k3s:v1.22.3-k3s1"
  • K8s 1.22.2: k3d cluster create --image "rancher/k3s:v1.22.2-k3s1"

My install procedure is the same as the Quick Start.
Does your problem also occur on a clean install?
(It might be caused by an in-place upgrade, for example of the CRDs.)

metadata:
  name: issue-11521
  namespace: argo
  uid: 36dae1b5-6e21-4218-8948-e9e93591a306
  resourceVersion: '1069'
  generation: 4
  creationTimestamp: '2023-08-06T14:09:37Z'
  labels:
    workflows.argoproj.io/completed: 'true'
    workflows.argoproj.io/creator: system-serviceaccount-argo-argo-server
    workflows.argoproj.io/phase: Succeeded
  annotations:
    workflows.argoproj.io/pod-name-format: v2
  managedFields:
    - manager: argo
      operation: Update
      apiVersion: argoproj.io/v1alpha1
      time: '2023-08-06T14:09:37Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:labels:
            .: {}
            f:workflows.argoproj.io/creator: {}
        f:spec: {}
    - manager: workflow-controller
      operation: Update
      apiVersion: argoproj.io/v1alpha1
      time: '2023-08-06T14:09:57Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:workflows.argoproj.io/pod-name-format: {}
          f:labels:
            f:workflows.argoproj.io/completed: {}
            f:workflows.argoproj.io/phase: {}
        f:status: {}
spec:
  templates:
    - name: main
      inputs: {}
      outputs: {}
      metadata: {}
      steps:
        - - name: step1
            template: hello
            arguments: {}
    - name: hello
      inputs: {}
      outputs: {}
      metadata: {}
      container:
        name: ''
        image: alpine:3.6
        command:
          - sh
          - '-c'
        args:
          - echo 'hello'
        resources: {}
    - name: send-status
      inputs:
        parameters:
          - name: state
      outputs: {}
      metadata: {}
      container:
        name: ''
        image: alpine:3.6
        command:
          - sh
          - '-c'
        args:
          - echo '{{inputs.parameters.state}}'
        resources: {}
  entrypoint: main
  arguments: {}
  hooks:
    exit:
      template: send-status
      arguments:
        parameters:
          - name: state
            value: INPROGRESS
    running:
      template: send-status
      arguments:
        parameters:
          - name: state
            value: SUCCESSFUL
      expression: workflow.status == "Running"
status:
  phase: Succeeded
  startedAt: '2023-08-06T14:09:37Z'
  finishedAt: '2023-08-06T14:09:57Z'
  progress: 3/3
  nodes:
    issue-11521:
      id: issue-11521
      name: issue-11521
      displayName: issue-11521
      type: Steps
      templateName: main
      templateScope: local/issue-11521
      phase: Succeeded
      startedAt: '2023-08-06T14:09:37Z'
      finishedAt: '2023-08-06T14:09:47Z'
      progress: 2/2
      resourcesDuration:
        cpu: 6
        memory: 6
      children:
        - issue-11521-3920630365
        - issue-11521-3187500750
      outboundNodes:
        - issue-11521-3430472454
    issue-11521-2012696352:
      id: issue-11521-2012696352
      name: issue-11521.onExit
      displayName: issue-11521.onExit
      type: Pod
      templateName: send-status
      templateScope: local/issue-11521
      phase: Succeeded
      startedAt: '2023-08-06T14:09:47Z'
      finishedAt: '2023-08-06T14:09:50Z'
      progress: 1/1
      resourcesDuration:
        cpu: 4
        memory: 4
      inputs:
        parameters:
          - name: state
            value: INPROGRESS
      outputs:
        exitCode: '0'
      hostNodeName: k3d-k3s-default-server-0
    issue-11521-3187500750:
      id: issue-11521-3187500750
      name: issue-11521.hooks.running
      displayName: issue-11521.hooks.running
      type: Pod
      templateName: send-status
      templateScope: local/issue-11521
      phase: Succeeded
      startedAt: '2023-08-06T14:09:37Z'
      finishedAt: '2023-08-06T14:09:40Z'
      progress: 1/1
      resourcesDuration:
        cpu: 3
        memory: 3
      inputs:
        parameters:
          - name: state
            value: SUCCESSFUL
      outputs:
        exitCode: '0'
      hostNodeName: k3d-k3s-default-server-0
    issue-11521-3430472454:
      id: issue-11521-3430472454
      name: issue-11521[0].step1
      displayName: step1
      type: Pod
      templateName: hello
      templateScope: local/issue-11521
      phase: Succeeded
      boundaryID: issue-11521
      startedAt: '2023-08-06T14:09:37Z'
      finishedAt: '2023-08-06T14:09:40Z'
      progress: 1/1
      resourcesDuration:
        cpu: 3
        memory: 3
      outputs:
        exitCode: '0'
      hostNodeName: k3d-k3s-default-server-0
    issue-11521-3920630365:
      id: issue-11521-3920630365
      name: issue-11521[0]
      displayName: '[0]'
      type: StepGroup
      templateScope: local/issue-11521
      phase: Succeeded
      boundaryID: issue-11521
      startedAt: '2023-08-06T14:09:37Z'
      finishedAt: '2023-08-06T14:09:47Z'
      progress: 1/1
      resourcesDuration:
        cpu: 3
        memory: 3
      children:
        - issue-11521-3430472454
  conditions:
    - type: PodRunning
      status: 'False'
    - type: Completed
      status: 'True'
  resourcesDuration:
    cpu: 10
    memory: 10
  artifactRepositoryRef:
    default: true
    artifactRepository: {}
  artifactGCStatus:
    notSpecified: true

@sandeepk8s
Contributor Author

Yes, I did a clean install (upgrade) using the namespace-install.yaml file from the releases page (3.4.9).

@toyamagu-2021
Member

toyamagu-2021 commented Aug 6, 2023

Thank you! This is not completely resolved yet, but my observation is as follows:

spec:
  templates:
    - name: main
      inputs: {}
      outputs: {}
      metadata: {}
      steps:
        - - name: step1
            template: argosay
            arguments: {}
    - name: argosay
      inputs: {}
      outputs: {}
      metadata: {}
      container:
        name: ''
        image: argoproj/argosay:v2
        command:
          - /bin/sh
          - '-c'
        args:
          - /argosay
        resources: {}
  entrypoint: main
  arguments: {}
  hooks:
    running:
      template: argosay
      arguments: {}
      expression: workflow.status == "Running"

@toyamagu-2021
Member

toyamagu-2021 commented Aug 6, 2023

@sandeepk8s
I think this is a regression that occurs when submitting a WorkflowTemplate with a hook, and it will be fixed by #11535.
It might also occur when WorkflowTemplateDefaults is used with a hook.
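
A common guard against this class of panic is to initialize the destination map before writing merged entries into it. The sketch below illustrates that idea only. It is not the change made in #11535, whose contents are not shown in this thread, and the `Hook`, `Spec`, and `mergeHooks` names are hypothetical. Letting hooks already present on the destination win over merged ones is also an assumption made for the example.

```go
package main

import "fmt"

// Hook and Spec are hypothetical stand-ins, not the Argo Workflows types.
type Hook struct {
	Template string
}

type Spec struct {
	Hooks map[string]Hook
}

// mergeHooks copies hooks from src into dst. It allocates dst.Hooks when it
// is nil, and keeps any hook that dst already defines under the same name.
func mergeHooks(dst, src *Spec) {
	if len(src.Hooks) == 0 {
		return // nothing to merge, so leave dst untouched
	}
	if dst.Hooks == nil {
		dst.Hooks = make(map[string]Hook, len(src.Hooks))
	}
	for name, hook := range src.Hooks {
		if _, exists := dst.Hooks[name]; !exists {
			dst.Hooks[name] = hook
		}
	}
}

func main() {
	tmpl := &Spec{Hooks: map[string]Hook{"running": {Template: "send-status"}}}
	wf := &Spec{} // Hooks is nil, as for a workflow that defines no hooks itself
	mergeHooks(wf, tmpl)
	fmt.Println(len(wf.Hooks), wf.Hooks["running"].Template)
}
```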

Thank you from the bottom of my heart for your patient debugging process.

@agilgur5 agilgur5 added area/workflow-templates area/hooks type/regression Regression from previous behavior (a specific type of bug) labels Aug 31, 2023
dpadhiar pushed a commit to dpadhiar/argo-workflows that referenced this issue May 9, 2024
…11535)

Signed-off-by: Dillen Padhiar <dillen_padhiar@intuit.com>

5 participants