Helm upgrade fails when you install spark-operator chart for the first time with webhook enabled #1364

Closed
danielyahn opened this issue Oct 7, 2021 · 3 comments

Comments

@danielyahn

I have a use case where I include the spark-operator chart as a dependency of my chart and control whether it is deployed with a Helm value.

For example, with

# Chart.yaml
apiVersion: v2
name: hello-world
description: A Helm chart for Hello World
dependencies:
  - name: spark-operator
    version: 1.1.6
    repository: https://googlecloudplatform.github.io/spark-on-k8s-operator
    condition: spark-operator.enabled

# values.yaml
spark-operator:
  serviceAccounts:
    spark:
      create: true
      name: spark-service-user
    sparkoperator:
      create: true
      name: spark-operator-service-user
  sparkJobNamespace: spark
  webhook:
    enable: true

It works fine when I helm install my chart with spark-operator.enabled=true.

However, if I helm install the chart with spark-operator.enabled=false and then try to helm upgrade with spark-operator.enabled=true, it errors out because the service account spark-operator-service-user is missing. I think the service accounts are not being created during the upgrade because the ServiceAccount manifest only has a pre-install hook, not a pre-upgrade hook.
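If that diagnosis is right, one possible fix would be to list both hook events on the ServiceAccount. The fragment below is only a sketch under that assumption: it is not copied from the chart's templates, and the manifest layout is hypothetical. The `helm.sh/hook` annotation and its comma-separated value list are standard Helm; the name is taken from the values.yaml above.

```yaml
# Hypothetical ServiceAccount manifest (illustrative, not the chart's actual template).
apiVersion: v1
kind: ServiceAccount
metadata:
  # Name taken from spark-operator.serviceAccounts.sparkoperator.name above.
  name: spark-operator-service-user
  annotations:
    # Run the hook on first install and on upgrade, so the account also
    # gets created when the subchart is first enabled via helm upgrade.
    "helm.sh/hook": pre-install, pre-upgrade
```

With only pre-install listed, Helm would skip this resource during an upgrade, which would match the missing service account reported by the cleanup job.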

As for the actual error: helm upgrade just times out because the spark-operator-webhook-cleanup job cannot run. I can see the following error message when I inspect the job's events (with the kubectl describe command).

Events:
  Type     Reason        Age                  From            Message
  ----     ------        ----                 ----            -------
  Warning  FailedCreate  67s (x5 over 3m37s)  job-controller  Error creating: pods "hello-world-spark-operator-webhook-cleanup-" is forbidden: error looking up service account spark/spark-operator-service-user: serviceaccount "spark-operator-service-user" not found
@julienlau

Did you retry now that #1359 is fixed in 1.1.27?


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.


This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

@github-actions github-actions bot closed this as not planned on Oct 13, 2024