[SPARK-542] Add and document Driver<->Executor TLS support. #188
Conversation
conf/spark-env.sh

```bash
else
    BASE64_D="base64 -D" # BSD
fi
```
How about `base64 --decode`? It should be the same on both platforms:
https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man1/base64.1.html
http://manpages.ubuntu.com/manpages/zesty/man1/base64.1.html
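For illustration, a minimal sketch of the suggested simplification, assuming both GNU and BSD `base64` accept `--decode` as the linked man pages indicate:

```bash
# If both platforms accept --decode, the GNU/BSD branch collapses to:
BASE64_D="base64 --decode"
# used later in spark-env.sh like so:
echo "${KEYSTORE_BASE64}" | ${BASE64_D} > "${MESOS_SANDBOX}/server.jks"
```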
@ArtRand I think sandbox access is protected, but depending on the environment it would be good to support both options. For example, DC/OS gives you better handling of environment secrets via the secret store, while pure Mesos does not provide such support out of the box. Would it be possible that the end user would like to store all credentials in their keystore? I hope I'm not missing anything here.
little edits
docs/security.md

```
@@ -37,26 +40,35 @@ Both stores must be base64 encoded, for example:

**Note:** The base64 string of the keystore will probably be much longer than the snippet above, spanning 50 lines or so.

With this and the password `secret` for the keystore and the private key, your JSON options file will look like this:
Add the stores to your secrets in the DC/OS Secret store, for example if your base64 encoded keystores and truststores are server.jks.base64 and trust.jks.base64, respectively then do the following:
```
Add the stores to your secrets in the DC/OS secret store. For example, if your base64-encoded keystores and truststores are `server.jks.base64` and `trust.jks.base64`, respectively, then use the following commands to add them to the secret store:
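Concretely, the commands might look like this (a sketch; the `dcos security secrets create` invocation matches the one used in the tests later in this thread, and the `/keystore` and `/truststore` paths are the examples used below):

```bash
dcos security secrets create /keystore --value-file server.jks.base64
dcos security secrets create /truststore --value-file trust.jks.base64
```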
docs/security.md

```json
    }
  }
}
```

In this case you're adding two secrets `/truststore` and `/keystore` that you will need to pass to the Spark Driver and Executors. You will need to add the following configurations to your `dcos spark run` command:
In this case, you are adding....
You must add the following configurations...
docs/security.md

```
Make sure to connect to the DC/OS cluster only using an SSL connection (i.e. by using an `https://<dcos-url>`). Use the following command to set your DC/OS URL:
Importantly, the `spark.mesos.driver.labels` and `spark.mesos.task.labels` must be set as shown. If you upload your secret with another path (e.g. not `/keystore` and `/truststore`) then change the `name` in the value accordingly. Lastly, `spark.mesos.task.labels` must have the `DCOS_SPACE:<dcos_space>` label as well, to have access to the secret. See the [Secrets Documentation about SPACES][13] for more details about Spaces, but usually you want `/spark` as shown.
```
Note: The `spark.mesos.driver.labels` and `spark.mesos.task.labels` must be set as shown. If you upload your secret with another path (e.g. not `/keystore` and `/truststore`), then change the `name` in the value accordingly. Lastly, `spark.mesos.task.labels` must also have the `DCOS_SPACE:<dcos_space>` label in order to access the secret. See the [Secrets Documentation about spaces][13] for more details about spaces. Usually, you will want to set the space label to `/spark`, as shown.
Thanks, @ArtRand. Should we add a regression test for this?
docs/security.md

```
[11]: https://docs.mesosphere.com/1.9/overview/architecture/components/
[12]: http://docs.oracle.com/javase/8/docs/technotes/tools/unix/keytool.html
[13]: https://docs.mesosphere.com/service-docs/spark/v2.0.1-2.2.0-1/run-job/
```
As referenced above, this link is supposed to point to "Secrets Documentation about SPACES".
docs/security.md

```
| Variable                   | Description                                     |
|----------------------------|-------------------------------------------------|
| `spark.ssl.enabled`        | Whether to enable SSL (default: `false`).       |
| `spark.ssl.keyStoreBase64` | Base64 encoded blob containing a Java keystore. |
```
I think this is no longer required if using secrets.
docs/security.md

```
--conf spark.ssl.trustStore=trust.jks \ # this MUST be set this way
--conf spark.ssl.trustStorePassword=<truststore password> \
--conf spark.mesos.driver.labels=DCOS_SECRETS_DIRECTIVE:[{\"name\"\:\"/keystore\"\,\"type\"\:\"ENVIRONMENT\"\,\"environment\"\:{\"name\"\:\"KEYSTORE_BASE64\"}}\,{\"name\"\:\"/truststore\"\,\"type\"\:\"ENVIRONMENT\"\,\"environment\"\:{\"name\"\:\"TRUSTSTORE_BASE64\"}}] \
--conf spark.mesos.task.labels=DCOS_SECRETS_DIRECTIVE:[{\"name\"\:\"/keystore\"\,\"type\"\:\"ENVIRONMENT\"\,\"environment\"\:{\"name\"\:\"KEYSTORE_BASE64\"}}\,{\"name\"\:\"/truststore\"\,\"type\"\:\"ENVIRONMENT\"\,\"environment\"\:{\"name\"\:\"TRUSTSTORE_BASE64\"}}],DCOS_SPACE:/spark, \
```
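For readability, the backslash-escaped label value in the two `labels` lines above unescapes to this JSON array (the escaping is only there to survive shell and CLI parsing):

```json
[
  {"name": "/keystore",   "type": "ENVIRONMENT", "environment": {"name": "KEYSTORE_BASE64"}},
  {"name": "/truststore", "type": "ENVIRONMENT", "environment": {"name": "TRUSTSTORE_BASE64"}}
]
```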
This is a pretty long command. Did you consider setting some of these configs automatically in the CLI? Or, do you want the user to be aware of each config being set?
conf/spark-env.sh

```bash
    echo "spark-env: Copying krb config from $KRB5_CONFIG_BASE64 to /etc/" >&2
    echo "${KRB5_CONFIG_BASE64}" | ${BASE64_D} > /etc/krb5.conf
else
    echo "spark-env: No kerberos KDC config found" >&2
fi

if [[ -n "${KEYSTORE_BASE64}" ]]; then
    echo "spark-env: Copying keystore to SANDBOX" >&2
    echo "${KEYSTORE_BASE64}" | ${BASE64_D} > ${MESOS_SANDBOX}/server.jks
```
I'd suggest using file-based secrets, which are more secure than writing the contents of the keystore to the Mesos sandbox.

This will be replaced with file-based secrets.

I'm going to reopen this PR and replace the env-based secrets with file-based secrets.
Force-pushed from ca8ae07 to 2a7a40a
I have replaced the env-based secrets with file-based secrets, pending this Apache Spark PR: apache/spark#19437
Just a few nits. Thanks!
tests/test_spark.py

```python
truststore_path = os.path.join(resources_folder, '{}.base64'.format(truststore_file))
keystore_secret = '__dcos_base64__keystore'
truststore_secret = '__dcos_base64__truststore'
shakedown.run_dcos_command('security secrets create /{} --value-file {}'.format(keystore_secret, keystore_path))
```
How come this works now? I remember having to jump through some hoops to get the secrets API to work (https://github.com/mesosphere/spark-build/blob/master/tests/utils.py#L179). Could you homogenize the way secrets are added in the tests, either by using the SecretsHandler or by changing/removing it?
tests/test_spark.py

```python
@@ -318,6 +318,53 @@ def test_secrets():
        LOGGER.warn("Error when deleting secret, {}".format(r.content))


# Skip DC/OS < 1.10, because it doesn't have support for file-based secrets.
@pytest.mark.skip("Enable when SPARK-22131 is merged and released in DC/OS Spark")
```
I assume you've tested this locally. How about we test with a specific Docker image (one that has the patch, built from master) instead of the universe package? I don't like the idea of merging skipped tests.
docs/security.md

```
--conf spark.ssl.protocol=TLS \
--conf spark.ssl.trustStore=trust.jks \ # this MUST be set this way
--conf spark.ssl.trustStorePassword=<truststore password> \
--conf spark.mesos.driver.secret.names=__dcos_base64__keystore,__dcos_base64__truststore \
```
@keithchambers should probably review this. We could add some sugar to the CLI so these extra configs aren't required (like we do with Kerberos), but it becomes the same thing.
docs/security.md

```
--conf spark.mesos.driver.secret.filenames=server.jks,trust.jks \
--conf spark.mesos.executor.secret.names=__dcos_base64__keystore,__dcos_base64__truststore \
--conf spark.mesos.executor.secret.filenames=server.jks,trust.jks \
--conf spark.mesos.task.labels=DCOS_SPACE:/spark \
```
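Taken together, the file-based submission would look roughly like this (a sketch assembled from the conf lines above; `spark.ssl.keyStore=server.jks`, the placeholder passwords, and the example class/jar are assumptions, not part of the diff):

```bash
dcos spark run --submit-args="\
  --conf spark.ssl.enabled=true \
  --conf spark.ssl.keyStore=server.jks \
  --conf spark.ssl.keyStorePassword=<keystore password> \
  --conf spark.ssl.keyPassword=<key password> \
  --conf spark.ssl.protocol=TLS \
  --conf spark.ssl.trustStore=trust.jks \
  --conf spark.ssl.trustStorePassword=<truststore password> \
  --conf spark.mesos.driver.secret.names=__dcos_base64__keystore,__dcos_base64__truststore \
  --conf spark.mesos.driver.secret.filenames=server.jks,trust.jks \
  --conf spark.mesos.executor.secret.names=__dcos_base64__keystore,__dcos_base64__truststore \
  --conf spark.mesos.executor.secret.filenames=server.jks,trust.jks \
  --conf spark.mesos.task.labels=DCOS_SPACE:/spark \
  --class org.apache.spark.examples.SparkPi https://<path-to-jar>"
```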
We should probably add this via the CLI automatically.
Would it be more intuitive for the secret:file pairs to be together? E.g., `--conf spark.mesos.executor.secrets=secret1:file1,secret2:file2`
Force-pushed from 24d75c8 to e223ddf
docs/security.md

```
}
In this case, you are adding two secrets `/truststore` and `/keystore`.
You must add the following configurations to your `dcos spark run` command.
The ones in parentheses are optional:
```
Good point, @keithchambers. Actually, that was the old verbose command. Here is the simplified one. What do you think?
Force-pushed from 6c5246b to e7b869b
cli/dcos-spark/submit_builder.go

```go
	paths = append(paths, args.truststoreSecretPath)
	filenames = append(filenames, trustStoreFileName)
}
args.properties["spark.mesos.driver.secret.names"] = strings.Join(paths, ",")
```
Will this overwrite someone's specified secrets if they've already specified some? I.e.:

```
dcos spark run --submit-args="\
...
--conf spark.mesos.driver.secret.names=awsaccess \
--conf spark.mesos.driver.secret.filenames=awsaccess \
..."
```
```go
@@ -43,6 +43,8 @@ type sparkArgs struct {
	keytabSecretPath   string
	tgtSecretPath      string
	tgtSecretValue     string
	keystoreSecretPath string
```
What about adding a password field too, since the other "magic" configs are here also?
cli/dcos-spark/submit_builder.go

```go
// Make sure passwords are set
if _, ok := args.properties["spark.ssl.keyPassword"]; !ok {
	return "", errors.New("Need to provide key password with keystore: " +
		"--conf spark.ssl.keyPassword=<password>")
```
Again, I think we should add a DC/OS-specific flag here; otherwise your TLS args end up in both the DC/OS-specific args and the submit-args.
docs/security.md

```
| Variable                      | Description                      | Default Value |
|-------------------------------|----------------------------------|---------------|
| `spark.ssl.enabledAlgorithms` | Allowed ciphers                  | JVM defaults  |
| `spark.ssl.keyPassword`       | The password for the private key | None          |
```
Change the default value to REQUIRED? @sascala?
…onfig properties.
Force-pushed from 904a4a0 to 17bccac
cli/dcos-spark/submit_builder.go

```go
if args.keystorePassword == "" || args.privateKeyPassword == "" {
	return "", errors.New("Need to provide keystore password and key password with keystore")
} else {
	args.properties["spark.ssl.keyStorePassword"] = args.keystorePassword
```
Nit/question: any reason the password configs are set here instead of in `setupTLSArgs`?
No particular reason. I'll move the setting into `setupTLSArgs` and just leave the validation check here.
```python
Then calculates the value of pi.
"""

check_secret(sys.argv[2], sys.argv[3])
```
Does this also check the secret in the executors, or just the driver?
I didn't bother adding a test for executor secrets because the CLI code (wrt TLS) is the same for driver and executor secrets.
```
dcos package install --options=options.json spark
**Note:** If you have specified a space for your secrets other than the default value,
```
I think you can specify this with a "magic" flag? d2iq-archive/dcos-commons#1872
Which magic flag? The link ^^ doesn't seem relevant?
```
Make sure to connect to the DC/OS cluster only using an SSL connection (i.e. by using an `https://<dcos-url>`). Use the following command to set your DC/OS URL:
**Note:** If you specify environment-based secrets with `spark.mesos.[driver|executor].secret.envkeys`,
the keystore and truststore secrets will also show up as environment-based secrets,
due to the way secrets are implemented. You can ignore these extra environment variables.
```
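For context, a sketch of the environment-based variant the note refers to; the secret name `/mysecret` and env key `MY_SECRET` are hypothetical:

```bash
--conf spark.mesos.driver.secret.names=/mysecret \
--conf spark.mesos.driver.secret.envkeys=MY_SECRET \
```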
This should also be mentioned in Limitations (https://docs.mesosphere.com/service-docs/spark/2.1.0-2.2.0-1/limitations/)
Adding ...
```python
@@ -286,5 +290,51 @@ def test_cli_multiple_spaces():
        " --class ", "org.apache.spark.examples.SparkPi"])


# Skip DC/OS < 1.10, because it doesn't have support for file-based secrets.
@pytest.mark.skipif('shakedown.dcos_version_less_than("1.10")')
```
Need to skip on open DC/OS also.
Added
tests/test_spark.py

```python
# Skip DC/OS < 1.10, because it doesn't have support for file-based secrets.
@pytest.mark.skipif('shakedown.dcos_version_less_than("1.10")')
@pytest.mark.sanity
def test_driver_executor_tls():
```
Could we make the job rely on a file being served by the driver (i.e. the channel affected by TLS)? This may be more work than necessary, because we'd need to enable/check the `--files` flag.
```go
for _, taskType := range taskTypes {
	appendToProperty(fmt.Sprintf("spark.mesos.%s.secret.names", taskType), joinedPaths, args)
	appendToProperty(fmt.Sprintf("spark.mesos.%s.secret.filenames", taskType), joinedFilenames, args)
	appendToPropertyIfSet(fmt.Sprintf("spark.mesos.%s.secret.envkeys", taskType), joinedEnvkeys, args)
```
👍
```python
def check_secret(secret_name, secret_content):
    '''
    Make sure the extra secret envvar and secret file show up in driver.
```
Check the driver and executor?
TLS in Spark
This PR is about two things: adding Driver<->Executor TLS support, and documenting it.
How it works
Spark has native support for TLS with Java KeyStores. It requires the KeyStore and optionally the TrustStore (hereafter referred to as Stores) to be in the sandbox of the Driver and the Executors.
The solution implemented here uses environment-based secrets. The user base64-encodes their Stores and puts them in the DC/OS secret store. Then you add the appropriate configuration (either on the command line or with a properties file) and the base64 string is injected into the Driver and Executor environments. The pre-start script `spark-env.sh` notices these environment variables and decodes them to files.
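Pieced together from the snippets in this thread, that flow looks roughly like this (the secret path and filenames are the examples used above):

```bash
# 1. Base64-encode the keystore.
base64 < server.jks > server.jks.base64

# 2. Put it in the DC/OS secret store.
dcos security secrets create /keystore --value-file server.jks.base64

# 3. At launch the secret is injected as KEYSTORE_BASE64, and spark-env.sh
#    decodes it into the sandbox:
echo "${KEYSTORE_BASE64}" | ${BASE64_D} > ${MESOS_SANDBOX}/server.jks
```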
The question

This solution is similar to the older Kerberos implementation, which is why I asked @triclambert and @vishnu2kmohan to review this PR. My question is: is this solution inherently insecure, such that we should not even allow/advise customers to use it? Is exposing the Stores files in the sandbox a common practice?
The better alternative is to use file-based secrets, as we do for Kerberos. Unfortunately, the MesosCoarseGrainedSchedulerBackend (the Driver) does not have secrets support, which it needs in order to start the Executors with secrets. I recently added secret support to the Dispatcher (in Spark), and it would be a trivial change to add it to the Driver as well (but it would require the ASF review process).
Thanks for the input.