spark-operator

Version: 2.1.0 AppVersion: 2.1.0

A Helm chart for Spark on Kubernetes operator.

Homepage: https://github.com/kubeflow/spark-operator

Introduction

This chart bootstraps a deployment of the Kubernetes Operator for Apache Spark using the Helm package manager.

Prerequisites

  • Helm >= 3
  • Kubernetes >= 1.16

Previous Helm Chart

The previous spark-operator Helm chart hosted at helm/charts has been moved to this repository in accordance with the Deprecation timeline. Note that a few things have changed between this version and the old version:

  • This repository only supports Helm chart installations using Helm 3+ since the apiVersion on the chart has been marked as v2.
  • Previous versions of the Helm chart have not been migrated, and the chart version was reset to 1.0.0 from the outset. If you need an old version of the chart, run helm pull incubator/sparkoperator --version <your-version> until you are ready to move to this repository's version (see the example after this list).
  • Several configuration properties have changed; carefully review the Values section below to make sure your configuration is aligned with the new values.
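
For example, an old chart version can be pulled from the archived helm/charts incubator repository (a minimal sketch; the repository URL below points at the archived incubator charts and is an assumption about where the old chart is still hosted):

# Add the archived incubator repo, then pull the old chart version you need
helm repo add incubator https://charts.helm.sh/incubator

helm pull incubator/sparkoperator --version <your-version>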

Usage

Add Helm Repo

helm repo add spark-operator https://kubeflow.github.io/spark-operator

helm repo update

See helm repo for command documentation.

Install the chart

helm install [RELEASE_NAME] spark-operator/spark-operator

For example, if you want to create a release named spark-operator in the spark-operator namespace:

helm install spark-operator spark-operator/spark-operator \
    --namespace spark-operator \
    --create-namespace

Note that by passing the --create-namespace flag to the helm install command, helm will create the release namespace if it does not exist.

See helm install for command documentation.
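
Chart values can also be overridden at install time with the --set flag. A minimal sketch (the release name, namespace, and chosen values are only examples; any key from the Values section below can be set this way, and the spark-jobs namespace is assumed to already exist):

# Install with two replicas of the controller and an extra Spark job namespace
helm install spark-operator spark-operator/spark-operator \
    --namespace spark-operator \
    --create-namespace \
    --set controller.replicas=2 \
    --set "spark.jobNamespaces={default,spark-jobs}"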

Upgrade the chart

helm upgrade [RELEASE_NAME] spark-operator/spark-operator [flags]

See helm upgrade for command documentation.
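
For example, to upgrade a release named spark-operator in the spark-operator namespace while keeping the values set at install time (a sketch; --reuse-values assumes you do not want to pick up new chart defaults):

helm upgrade spark-operator spark-operator/spark-operator \
    --namespace spark-operator \
    --reuse-values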

Uninstall the chart

helm uninstall [RELEASE_NAME]

This removes all the Kubernetes resources associated with the chart and deletes the release. The CRDs are not removed and have to be deleted manually.
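
For example, the CRDs can be removed with kubectl (a sketch; the CRD names below belong to the sparkoperator.k8s.io API group and may differ between chart versions, so list them first to verify):

# List the operator's CRDs, then delete them explicitly
kubectl get crds | grep sparkoperator.k8s.io

kubectl delete crd sparkapplications.sparkoperator.k8s.io scheduledsparkapplications.sparkoperator.k8s.io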

See helm uninstall for command documentation.

Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| nameOverride | string | "" | String to partially override release name. |
| fullnameOverride | string | "" | String to fully override release name. |
| commonLabels | object | {} | Common labels to add to the resources. |
| image.registry | string | "docker.io" | Image registry. |
| image.repository | string | "kubeflow/spark-operator" | Image repository. |
| image.tag | string | If not set, the chart appVersion will be used. | Image tag. |
| image.pullPolicy | string | "IfNotPresent" | Image pull policy. |
| image.pullSecrets | list | [] | Image pull secrets for private image registry. |
| controller.replicas | int | 1 | Number of replicas of controller. |
| controller.workers | int | 10 | Reconcile concurrency; higher values might increase memory usage. |
| controller.logLevel | string | "info" | Configure the verbosity of logging, can be one of debug, info, error. |
| controller.driverPodCreationGracePeriod | string | "10s" | Grace period after a successful spark-submit during which "driver pod not found" errors will be retried. Useful if the driver pod can take some time to be created. |
| controller.maxTrackedExecutorPerApp | int | 1000 | Specifies the maximum number of executor pods that can be tracked by the controller per SparkApplication. |
| controller.uiService.enable | bool | true | Specifies whether to create a service for the Spark web UI. |
| controller.uiIngress.enable | bool | false | Specifies whether to create an ingress for the Spark web UI. controller.uiService.enable must be true to enable ingress. |
| controller.uiIngress.urlFormat | string | "" | Ingress URL format. Required if controller.uiIngress.enable is true. |
| controller.uiIngress.ingressClassName | string | "" | Optionally set the ingressClassName. |
| controller.batchScheduler.enable | bool | false | Specifies whether to enable a batch scheduler for Spark job scheduling. If enabled, users can specify the batch scheduler name in the SparkApplication. |
| controller.batchScheduler.kubeSchedulerNames | list | [] | Specifies a list of kube-scheduler names for scheduling Spark pods. |
| controller.batchScheduler.default | string | "" | Default batch scheduler to be used if not specified by the user. If specified, this value must be either "volcano" or "yunikorn". Specifying any other value will cause the controller to error on startup. |
| controller.serviceAccount.create | bool | true | Specifies whether to create a service account for the controller. |
| controller.serviceAccount.name | string | "" | Optional name for the controller service account. |
| controller.serviceAccount.annotations | object | {} | Extra annotations for the controller service account. |
| controller.serviceAccount.automountServiceAccountToken | bool | true | Auto-mount service account token to the controller pods. |
| controller.rbac.create | bool | true | Specifies whether to create RBAC resources for the controller. |
| controller.rbac.annotations | object | {} | Extra annotations for the controller RBAC resources. |
| controller.labels | object | {} | Extra labels for controller pods. |
| controller.annotations | object | {} | Extra annotations for controller pods. |
| controller.volumes | list | [{"emptyDir":{"sizeLimit":"1Gi"},"name":"tmp"}] | Volumes for controller pods. |
| controller.nodeSelector | object | {} | Node selector for controller pods. |
| controller.affinity | object | {} | Affinity for controller pods. |
| controller.tolerations | list | [] | List of node taints to tolerate for controller pods. |
| controller.priorityClassName | string | "" | Priority class for controller pods. |
| controller.podSecurityContext | object | {"fsGroup":185} | Security context for controller pods. |
| controller.topologySpreadConstraints | list | [] | Topology spread constraints rely on node labels to identify the topology domain(s) that each node is in. Ref: Pod Topology Spread Constraints. The labelSelector field in a topology spread constraint will be set to the selector labels for controller pods if not specified. |
| controller.env | list | [] | Environment variables for controller containers. |
| controller.envFrom | list | [] | Environment variable sources for controller containers. |
| controller.volumeMounts | list | [{"mountPath":"/tmp","name":"tmp","readOnly":false}] | Volume mounts for controller containers. |
| controller.resources | object | {} | Pod resource requests and limits for controller containers. Note that each job submission spawns a JVM within the controller pods using "/usr/local/openjdk-11/bin/java -Xmx128m". Kubernetes may kill these Java processes at will to enforce resource limits. When that happens, you will see the error 'failed to run spark-submit for SparkApplication [...]: signal: killed'; in that case, you may want to increase the memory limits. |
| controller.securityContext | object | {"allowPrivilegeEscalation":false,"capabilities":{"drop":["ALL"]},"privileged":false,"readOnlyRootFilesystem":true,"runAsNonRoot":true} | Security context for controller containers. |
| controller.sidecars | list | [] | Sidecar containers for controller pods. |
| controller.podDisruptionBudget.enable | bool | false | Specifies whether to create a pod disruption budget for the controller. Ref: Specifying a Disruption Budget for your Application. |
| controller.podDisruptionBudget.minAvailable | int | 1 | The number of pods that must be available. Requires controller.replicas to be greater than 1. |
| controller.pprof.enable | bool | false | Specifies whether to enable pprof. |
| controller.pprof.port | int | 6060 | Specifies pprof port. |
| controller.pprof.portName | string | "pprof" | Specifies pprof service port name. |
| controller.workqueueRateLimiter.bucketQPS | int | 50 | Specifies the average rate of items processed by the workqueue rate limiter. |
| controller.workqueueRateLimiter.bucketSize | int | 500 | Specifies the maximum number of items that can be in the workqueue at any given time. |
| controller.workqueueRateLimiter.maxDelay.enable | bool | true | Specifies whether to enable max delay for the workqueue rate limiter. This is useful to avoid losing events when the workqueue is full. |
| controller.workqueueRateLimiter.maxDelay.duration | string | "6h" | Specifies the maximum delay duration for the workqueue rate limiter. |
| webhook.enable | bool | true | Specifies whether to enable the webhook. |
| webhook.replicas | int | 1 | Number of replicas of webhook server. |
| webhook.logLevel | string | "info" | Configure the verbosity of logging, can be one of debug, info, error. |
| webhook.port | int | 9443 | Specifies webhook port. |
| webhook.portName | string | "webhook" | Specifies webhook service port name. |
| webhook.failurePolicy | string | "Fail" | Specifies how unrecognized errors are handled. Available options are Ignore or Fail. |
| webhook.timeoutSeconds | int | 10 | Specifies the timeout seconds of the webhook; the value must be between 1 and 30. |
| webhook.resourceQuotaEnforcement.enable | bool | false | Specifies whether to enable ResourceQuota enforcement for SparkApplication resources. |
| webhook.serviceAccount.create | bool | true | Specifies whether to create a service account for the webhook. |
| webhook.serviceAccount.name | string | "" | Optional name for the webhook service account. |
| webhook.serviceAccount.annotations | object | {} | Extra annotations for the webhook service account. |
| webhook.serviceAccount.automountServiceAccountToken | bool | true | Auto-mount service account token to the webhook pods. |
| webhook.rbac.create | bool | true | Specifies whether to create RBAC resources for the webhook. |
| webhook.rbac.annotations | object | {} | Extra annotations for the webhook RBAC resources. |
| webhook.labels | object | {} | Extra labels for webhook pods. |
| webhook.annotations | object | {} | Extra annotations for webhook pods. |
| webhook.sidecars | list | [] | Sidecar containers for webhook pods. |
| webhook.volumes | list | [{"emptyDir":{"sizeLimit":"500Mi"},"name":"serving-certs"}] | Volumes for webhook pods. |
| webhook.nodeSelector | object | {} | Node selector for webhook pods. |
| webhook.affinity | object | {} | Affinity for webhook pods. |
| webhook.tolerations | list | [] | List of node taints to tolerate for webhook pods. |
| webhook.priorityClassName | string | "" | Priority class for webhook pods. |
| webhook.podSecurityContext | object | {"fsGroup":185} | Security context for webhook pods. |
| webhook.topologySpreadConstraints | list | [] | Topology spread constraints rely on node labels to identify the topology domain(s) that each node is in. Ref: Pod Topology Spread Constraints. The labelSelector field in a topology spread constraint will be set to the selector labels for webhook pods if not specified. |
| webhook.env | list | [] | Environment variables for webhook containers. |
| webhook.envFrom | list | [] | Environment variable sources for webhook containers. |
| webhook.volumeMounts | list | [{"mountPath":"/etc/k8s-webhook-server/serving-certs","name":"serving-certs","readOnly":false,"subPath":"serving-certs"}] | Volume mounts for webhook containers. |
| webhook.resources | object | {} | Pod resource requests and limits for webhook pods. |
| webhook.securityContext | object | {"allowPrivilegeEscalation":false,"capabilities":{"drop":["ALL"]},"privileged":false,"readOnlyRootFilesystem":true,"runAsNonRoot":true} | Security context for webhook containers. |
| webhook.podDisruptionBudget.enable | bool | false | Specifies whether to create a pod disruption budget for the webhook. Ref: Specifying a Disruption Budget for your Application. |
| webhook.podDisruptionBudget.minAvailable | int | 1 | The number of pods that must be available. Requires webhook.replicas to be greater than 1. |
| spark.jobNamespaces | list | ["default"] | List of namespaces in which to run Spark jobs. If the list contains an empty string, all namespaces are allowed. Make sure the namespaces already exist. |
| spark.serviceAccount.create | bool | true | Specifies whether to create a service account for Spark applications. |
| spark.serviceAccount.name | string | "" | Optional name for the Spark service account. |
| spark.serviceAccount.annotations | object | {} | Optional annotations for the Spark service account. |
| spark.serviceAccount.automountServiceAccountToken | bool | true | Auto-mount service account token to the Spark application pods. |
| spark.rbac.create | bool | true | Specifies whether to create RBAC resources for Spark applications. |
| spark.rbac.annotations | object | {} | Optional annotations for the Spark application RBAC resources. |
| prometheus.metrics.enable | bool | true | Specifies whether to enable Prometheus metrics scraping. |
| prometheus.metrics.port | int | 8080 | Metrics port. |
| prometheus.metrics.portName | string | "metrics" | Metrics port name. |
| prometheus.metrics.endpoint | string | "/metrics" | Metrics serving endpoint. |
| prometheus.metrics.prefix | string | "" | Metrics prefix, will be added to all exported metrics. |
| prometheus.podMonitor.create | bool | false | Specifies whether to create a pod monitor. Note that Prometheus metrics must be enabled as well. |
| prometheus.podMonitor.labels | object | {} | Pod monitor labels. |
| prometheus.podMonitor.jobLabel | string | "spark-operator-podmonitor" | The label to use to retrieve the job name from. |
| prometheus.podMonitor.podMetricsEndpoint | object | {"interval":"5s","scheme":"http"} | Prometheus metrics endpoint properties. prometheus.metrics.portName will be used as the port. |
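
As an illustration of how these keys are used, the following command enables the Prometheus pod monitor and turns up logging on both components (a sketch; the release name and namespace are examples, and creating a pod monitor additionally requires the Prometheus Operator CRDs to be installed in the cluster):

helm upgrade --install spark-operator spark-operator/spark-operator \
    --namespace spark-operator \
    --create-namespace \
    --set prometheus.podMonitor.create=true \
    --set controller.logLevel=debug \
    --set webhook.logLevel=debug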

Maintainers

| Name | Email | Url |
|------|-------|-----|
| yuchaoran2011 | yuchaoran2011@gmail.com | https://github.com/yuchaoran2011 |
| ChenYi015 | github@chenyicn.net | https://github.com/ChenYi015 |