Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[SPARK-19720][CORE] Redact sensitive information from SparkSubmit con…
…sole ## What changes were proposed in this pull request? This change redacts senstive information (based on `spark.redaction.regex` property) from the Spark Submit console logs. Such sensitive information is already being redacted from event logs and yarn logs, etc. ## How was this patch tested? Testing was done manually to make sure that the console logs were not printing any sensitive information. Here's some output from the console: ``` Spark properties used, including those specified through --conf and those from the properties file /etc/spark2/conf/spark-defaults.conf: (spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted)) (spark.authenticate,false) (spark.executorEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted)) ``` ``` System properties: (spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted)) (spark.authenticate,false) (spark.executorEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted)) ``` There is a risk if new print statements were added to the console down the road, sensitive information may still get leaked, since there is no test that asserts on the console log output. I considered it out of the scope of this JIRA to write an integration test to make sure new leaks don't happen in the future. Running unit tests to make sure nothing else is broken by this change. Author: Mark Grover <mark@apache.org> Closes apache#17047 from markgrover/master_redaction.
- Loading branch information