Role and RoleBinding not installed for webhook-init in Helm pre-hook #1379

Merged (2 commits) on Nov 1, 2021

zzvara commented Oct 30, 2021

The missing Role and RoleBinding cause this issue: fluxcd/flux2#2030

Tested with the following chart:

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: spark-operator-development
  namespace: development
spec:
  chart:
    spec:
      chart: spark-operator
      sourceRef:
        kind: HelmRepository
        name: enliven-systems
        namespace: kube-system
      version: 1.1.9
  interval: 30m0s
  install:
    crds: CreateReplace
    replace: true
    remediation:
      retries: -1
    disableWaitForJobs: true
    disableWait: true
    timeout: "2m"
  upgrade:
    crds: CreateReplace
    remediation:
      retries: -1
    disableWaitForJobs: true
  uninstall:
    timeout: "1m"
  values:
    image:
      tag: v1beta2-1.2.3-3.1.1
      pullPolicy: Always
    webhook:
      enable: true
      namespaceSelector: "name=development"
    sparkJobNamespace: development
    resyncInterval: 10
    metrics:
      enable: false
    controllerThreads: 1
    batchScheduler:
      enable: true
    resourceQuotaEnforcement:
      enable: true
    rbac:
      createClusterRole: true
      createRole: true
    resources:
      requests:
        memory: 250Mi
        cpu: 100m
      limits:
        memory: 250Mi
        cpu: 500m

Also, with Flux 2 the chart seems to stay in the state "Release reconciliation succeeded".

liyinan926 merged commit 6e149e8 into kubeflow:master Nov 1, 2021

jehof commented Nov 11, 2021

This seems to be a breaking change: the existing ClusterRole and ClusterRoleBinding (from a previous installation) are deleted when running helm upgrade with this version of the spark-operator chart. The ClusterRole and ClusterRoleBinding are no longer part of the deployment and are now hook resources.

I tried fixing my helm upgrade problem by adding the pre-upgrade hook and the before-hook-creation delete policy, but the resources are still removed on upgrade.
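
For readers unfamiliar with Helm hooks, the annotations mentioned above would look roughly like this on a ClusterRole. This is only a minimal sketch, not the chart's actual rbac.yaml: the resource name is taken from the upgrade log in this thread, and the empty rules list is a placeholder.

```yaml
# Illustrative sketch only; not copied from the spark-operator chart.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: spark-operator  # name as it appears in the helm debug log
  annotations:
    # Create this resource as a hook before install and before upgrade.
    "helm.sh/hook": pre-install,pre-upgrade
    # Delete any earlier copy of the hook resource before creating it again.
    "helm.sh/hook-delete-policy": before-hook-creation
rules: []  # placeholder; the real rules are omitted here
```

With these annotations Helm manages the resource as a hook rather than as part of the release manifest, which would be consistent with the "Starting delete for ... ClusterRole" lines in the log.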

I noticed this behavior while updating our spark-operator from 1.1.6 to 1.1.10. The ClusterRole and ClusterRoleBinding were gone and the operator was logging errors.

% make install-operator stage=lab
You're installing/upgrading components of Apache Spark operator.
(BE SURE TO HAVE THE RIGHT KUBECONFIG!)
Continue? (Y/N): y
history.go:56: [debug] getting history for release spark-operator
upgrade.go:139: [debug] preparing upgrade for spark-operator
upgrade.go:147: [debug] performing update for spark-operator
upgrade.go:319: [debug] creating upgraded release for spark-operator
client.go:299: [debug] Starting delete for "spark-operator" ServiceAccount
client.go:128: [debug] creating 1 resource(s)
client.go:299: [debug] Starting delete for "spark-operator" ClusterRole
client.go:128: [debug] creating 1 resource(s)
client.go:299: [debug] Starting delete for "spark-operator" ClusterRoleBinding
client.go:128: [debug] creating 1 resource(s)
client.go:218: [debug] checking 5 resources for changes
client.go:267: [debug] Deleting "spark-operator" in ...
client.go:267: [debug] Deleting "spark-operator" in ...
wait.go:48: [debug] beginning wait for 5 resources with timeout of 5m0s
upgrade.go:154: [debug] updating status for upgraded release for spark-operator

The two log lines

client.go:267: [debug] Deleting "spark-operator" in ...
client.go:267: [debug] Deleting "spark-operator" in ...

remove the previously added ClusterRole and ClusterRoleBinding.

Running helm upgrade again fixes the deployment.

jbhalodia-slack pushed a commit to jbhalodia-slack/spark-operator that referenced this pull request Oct 4, 2024
…e-hook` (kubeflow#1379)

* Update rbac.yaml

* Update Chart.yaml