Closed
Description
Spark Connect
Spark 3.5 ships Spark Connect (first released in Spark 3.4), a client-server architecture with its own thin client API.
The use case seems to be thin clients that connect to an already running Spark driver.
This probably means that the operator needs to be able to start Spark Connect servers independently of Spark applications and publish a Service for Connect clients.
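To make the thin-client use case concrete, here is a minimal PySpark sketch of a client attaching to a server published by the operator. The Service name `spark-connect-server` and the namespace `default` are hypothetical; 15002 is the default Spark Connect gRPC port. It assumes the client has PySpark 3.4 or later installed with the Connect extras.

```python
def connect_url(service: str, namespace: str, port: int = 15002) -> str:
    """Build an sc:// URL for a server exposed through a cluster-internal Service."""
    # The service and namespace values are placeholders, not names the operator guarantees.
    return f"sc://{service}.{namespace}.svc.cluster.local:{port}"


def open_session(url: str):
    """Open a remote Spark session against a running Spark Connect server."""
    # Imported lazily so that the URL helper stays usable without PySpark installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.remote(url).getOrCreate()
```

A client inside the cluster could then call `open_session(connect_url("spark-connect-server", "default"))` and use the returned session like a regular `SparkSession`, while the driver keeps running in the server pod.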
Roadmap
Rough roadmap to GA:
- POC: set up a Spark Connect server with Kubernetes as the resource manager, plus a basic integration test
- minimal CRD: drop the StatefulSet, minimal configuration for the server (JVM properties, logging)
- server
  - deployment with one replica
  - JVM arg overrides
  - config overrides
  - env overrides
  - log configuration and aggregation with Vector
  - pod overrides
  - resource requests
  - status and transition events
  - reconciliation operation (paused, stopped, etc)
- executor
  - JVM arg overrides
  - config overrides
  - env overrides
  - log configuration and aggregation
  - resource requests
  - pod affinity
- server
- add preliminary documentation
- expose Prometheus metrics
- integrate with the history server. See: doc: comment on spark history integration #559
- integrate with the listener op
- create a new demo
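
As a rough illustration of the server items in the roadmap (a Deployment with one replica plus JVM arg, config, env, and resource request overrides), the sketch below assembles such a manifest as a plain Python dict. It is not the operator's actual output: the function name, the label, the container name, and the way properties are passed as `--conf` arguments are all assumptions made for illustration. The Spark property keys used are existing Spark settings.

```python
CONNECT_PORT = 15002  # default Spark Connect gRPC port


def build_server_deployment(name, image, config=None, env=None, jvm_args=None,
                            cpu="1", memory="2Gi"):
    """Return a Deployment manifest (as a dict) for a single Spark Connect server pod."""
    # Base properties; user-supplied config overrides win on key collisions.
    props = {
        "spark.master": "k8s://https://kubernetes.default.svc",
        "spark.connect.grpc.binding.port": str(CONNECT_PORT),
    }
    props.update(config or {})
    if jvm_args:
        # Assumption: JVM arg overrides are forwarded as driver Java options.
        props["spark.driver.extraJavaOptions"] = " ".join(jvm_args)

    # Render each property as a `--conf key=value` pair, sorted for stable output.
    args = []
    for key in sorted(props):
        args += ["--conf", f"{key}={props[key]}"]

    labels = {"app.kubernetes.io/name": name}
    container = {
        "name": "spark-connect",  # hypothetical container name
        "image": image,
        "args": args,
        "env": [{"name": k, "value": v} for k, v in sorted((env or {}).items())],
        "ports": [{"name": "grpc", "containerPort": CONNECT_PORT}],
        "resources": {"requests": {"cpu": cpu, "memory": memory}},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,  # the roadmap calls for a deployment with one replica
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }
```

For example, `build_server_deployment("spark-connect", "example/spark:3.5", config={"spark.executor.instances": "2"})` yields a manifest whose container arguments include `spark.executor.instances=2`. How those arguments reach the server start command is left open here.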
Related PRs
- feat: add support for SparkConnect #539
- fix(spark-k8s): refactor for Spark Connect docker-images#1034
- feat: Add Deployments to ClusterResources operator-rs#992
- feat: add DeploymentConditionBuilder operator-rs#993
- feat: add spark connect client to image build flows docker-images#1051
- doc: comment on spark history integration #559
- feat: expose services with listener classes #562
Status
Done