Commit 2aef79a

rvesse authored and srowen committed
[SPARK-25023] More detailed security guidance for K8S
## What changes were proposed in this pull request?

Highlights specific security issues to be aware of with Spark on K8S and recommends K8S mechanisms that should be used to secure clusters.

## How was this patch tested?

N/A - Documentation only

CC felixcheung tgravescs skonto

Closes apache#23013 from rvesse/SPARK-25023.

Authored-by: Rob Vesse <rvesse@dotnetrdf.org>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
1 parent 4ac8f9b commit 2aef79a

1 file changed: +15 -1 lines changed

docs/running-on-kubernetes.md

Lines changed: 15 additions & 1 deletion
@@ -15,7 +15,19 @@ container images and entrypoints.**
 # Security
 
 Security in Spark is OFF by default. This could mean you are vulnerable to attack by default.
-Please see [Spark Security](security.html) and the specific security sections in this doc before running Spark.
+Please see [Spark Security](security.html) and the specific advice below before running Spark.
+
+## User Identity
+
+Images built from the project provided Dockerfiles do not contain any [`USER`](https://docs.docker.com/engine/reference/builder/#user) directives. This means that the resulting images will be running the Spark processes as `root` inside the container. On unsecured clusters this may provide an attack vector for privilege escalation and container breakout. Therefore security conscious deployments should consider providing custom images with `USER` directives specifying an unprivileged UID and GID.
+
+Alternatively the [Pod Template](#pod-template) feature can be used to add a [Security Context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#volumes-and-file-systems) with a `runAsUser` to the pods that Spark submits. Please bear in mind that this requires cooperation from your users and as such may not be a suitable solution for shared environments. Cluster administrators should use [Pod Security Policies](https://kubernetes.io/docs/concepts/policy/pod-security-policy/#users-and-groups) if they wish to limit the users that pods may run as.
+
+## Volume Mounts
+
+As described later in this document under [Using Kubernetes Volumes](#using-kubernetes-volumes) Spark on K8S provides configuration options that allow for mounting certain volume types into the driver and executor pods. In particular it allows for [`hostPath`](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath) volumes which as described in the Kubernetes documentation have known security vulnerabilities.
+
+Cluster administrators should use [Pod Security Policies](https://kubernetes.io/docs/concepts/policy/pod-security-policy/) to limit the ability to mount `hostPath` volumes appropriately for their environments.
 
 # Prerequisites
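As a rough illustration of the Pod Template approach mentioned in the User Identity text added by this hunk, a template along the following lines could set a non-root user for the pods Spark submits. This is a sketch and not part of the commit: the UID `1000` is an arbitrary placeholder, and it assumes a Spark version that includes the Pod Template feature, with the file supplied through the `spark.kubernetes.driver.podTemplateFile` and `spark.kubernetes.executor.podTemplateFile` properties.

```yaml
# Hypothetical pod template; all values here are illustrative assumptions.
apiVersion: v1
kind: Pod
spec:
  securityContext:
    runAsUser: 1000      # placeholder unprivileged UID, choose one suited to your image
    runAsNonRoot: true   # kubelet refuses to start containers that would run as UID 0
```

As the added text points out, this relies on each user supplying the template, so it complements cluster-side enforcement and does not replace it.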

@@ -214,6 +226,8 @@ Starting with Spark 2.4.0, users can mount the following types of Kubernetes [vo
 * [emptyDir](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir): an initially empty volume created when a pod is assigned to a node.
 * [persistentVolumeClaim](https://kubernetes.io/docs/concepts/storage/volumes/#persistentvolumeclaim): used to mount a `PersistentVolume` into a pod.
 
+**NB:** Please see the [Security](#security) section of this document for security issues related to volume mounts.
+
 To mount a volume of any of the types above into the driver pod, use the following configuration property:
 
 ```
