|
| 1 | +# Kubeflow Spark Operator |
1 | 2 | [Go Report Card](https://goreportcard.com/report/github.com/kubeflow/spark-operator) |
2 | 3 |
|
3 | | -**This is not an officially supported Google product.** |
| 4 | +## Overview |
| 5 | +The Kubernetes Operator for Apache Spark aims to make specifying and running [Spark](https://github.com/apache/spark) applications as easy and idiomatic as running other workloads on Kubernetes. It uses |
| 6 | +[Kubernetes custom resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) |
| 7 | +for specifying, running, and surfacing status of Spark applications. For a complete reference of the custom resource definitions, please refer to the [API Definition](docs/api-docs.md). For details on its design, please refer to the [design doc](docs/design.md). It requires Spark 2.3 and above that supports Kubernetes as a native scheduler backend. |
4 | 8 |
|
5 | | -## Community |
| 9 | +The Kubernetes Operator for Apache Spark currently supports the following list of features: |
6 | 10 |
|
7 | | -* Join our [Slack](https://kubernetes.slack.com/messages/CALBDHMTL) channel on [Kubernetes on Slack](https://slack.k8s.io/). |
8 | | -* Check out [who is using the Kubernetes Operator for Apache Spark](docs/who-is-using.md). |
| 11 | +* Supports Spark 2.3 and up. |
| 12 | +* Enables declarative application specification and management of applications through custom resources. |
| 13 | +* Automatically runs `spark-submit` on behalf of users for each `SparkApplication` eligible for submission. |
| 14 | +* Provides native [cron](https://en.wikipedia.org/wiki/Cron) support for running scheduled applications. |
| 15 | +* Supports customization of Spark pods beyond what Spark natively is able to do through the mutating admission webhook, e.g., mounting ConfigMaps and volumes, and setting pod affinity/anti-affinity. |
| 16 | +* Supports automatic application re-submission for updated `SparkApplication` objects with updated specification. |
| 17 | +* Supports automatic application restart with a configurable restart policy. |
| 18 | +* Supports automatic retries of failed submissions with optional linear back-off. |
| 19 | +* Supports mounting local Hadoop configuration as a Kubernetes ConfigMap automatically via `sparkctl`. |
| 20 | +* Supports automatically staging local application dependencies to Google Cloud Storage (GCS) via `sparkctl`. |
| 21 | +* Supports collecting and exporting application-level metrics and driver/executor metrics to Prometheus. |
9 | 22 |
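The "linear back-off" in the retry feature above can be illustrated with a minimal sketch. It assumes the wait before each retry is the configured base interval multiplied by the attempt number. That formula, the function name `linear_backoff_delays`, and its parameters are illustrative assumptions, not the operator's actual code or API.

```python
from typing import List


def linear_backoff_delays(base_interval_seconds: int, max_retries: int) -> List[int]:
    """Return the wait (in seconds) before each retry attempt.

    Assumption for illustration: delay = base interval * attempt number,
    so the wait grows by a constant step with every failed attempt.
    """
    if base_interval_seconds < 0 or max_retries < 0:
        raise ValueError("base_interval_seconds and max_retries must be non-negative")
    # Attempts are numbered from 1, so the first retry waits one base interval.
    return [base_interval_seconds * attempt for attempt in range(1, max_retries + 1)]


if __name__ == "__main__":
    # Hypothetical settings: 10-second base interval, up to 3 retries.
    for attempt, delay in enumerate(linear_backoff_delays(10, 3), start=1):
        print(f"retry {attempt}: wait {delay}s")
```

With a constant step, the total wait stays predictable, whereas an exponential scheme would roughly double the delay on every attempt.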
|
10 | 23 | ## Project Status |
11 | 24 |
|
@@ -72,26 +85,11 @@ If you are running the Kubernetes Operator for Apache Spark on Google Kubernetes |
72 | 85 |
|
73 | 86 | For more information, check the [Design](docs/design.md), [API Specification](docs/api-docs.md) and detailed [User Guide](docs/user-guide.md). |
74 | 87 |
|
75 | | -## Overview |
76 | | - |
77 | | -The Kubernetes Operator for Apache Spark aims to make specifying and running [Spark](https://github.com/apache/spark) applications as easy and idiomatic as running other workloads on Kubernetes. It uses |
78 | | -[Kubernetes custom resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) |
79 | | -for specifying, running, and surfacing status of Spark applications. For a complete reference of the custom resource definitions, please refer to the [API Definition](docs/api-docs.md). For details on its design, please refer to the [design doc](docs/design.md). It requires Spark 2.3 and above that supports Kubernetes as a native scheduler backend. |
80 | | - |
81 | | -The Kubernetes Operator for Apache Spark currently supports the following list of features: |
82 | | - |
83 | | -* Supports Spark 2.3 and up. |
84 | | -* Enables declarative application specification and management of applications through custom resources. |
85 | | -* Automatically runs `spark-submit` on behalf of users for each `SparkApplication` eligible for submission. |
86 | | -* Provides native [cron](https://en.wikipedia.org/wiki/Cron) support for running scheduled applications. |
87 | | -* Supports customization of Spark pods beyond what Spark natively is able to do through the mutating admission webhook, e.g., mounting ConfigMaps and volumes, and setting pod affinity/anti-affinity. |
88 | | -* Supports automatic application re-submission for updated `SparkApplication` objects with updated specification. |
89 | | -* Supports automatic application restart with a configurable restart policy. |
90 | | -* Supports automatic retries of failed submissions with optional linear back-off. |
91 | | -* Supports mounting local Hadoop configuration as a Kubernetes ConfigMap automatically via `sparkctl`. |
92 | | -* Supports automatically staging local application dependencies to Google Cloud Storage (GCS) via `sparkctl`. |
93 | | -* Supports collecting and exporting application-level metrics and driver/executor metrics to Prometheus. |
94 | | - |
95 | 88 | ## Contributing |
96 | 89 |
|
97 | 90 | Please check [CONTRIBUTING.md](CONTRIBUTING.md) and the [Developer Guide](docs/developer-guide.md) out. |
| 91 | + |
| 92 | +## Community |
| 93 | + |
| 94 | +* Join our [Kubeflow Slack Channel](https://kubeflow.slack.com/archives/C06627U3XU3) |
| 95 | +* Check out [who is using the Kubernetes Operator for Apache Spark](docs/who-is-using.md). |