Skip to content

[SPARK-2718] [yarn] Handle quotes and other characters in user args. #1724

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

vanzin
Copy link
Contributor

@vanzin vanzin commented Aug 1, 2014

Due to the way Yarn runs things through bash, normal quoting doesn't
work as expected. This change applies the necessary voodoo to the user
args to avoid issues with bash and special characters.

The change also uncovered an issue with the event logger app name
sanitizing code; it wasn't cleaning up all "bad" characters, so
sometimes it would fail to create the log dirs. I just added some
more bad character replacements.

Due to the way Yarn runs things through bash, normal quoting doesn't
work as expected. This change applies the necessary voodoo to the user
args to avoid issues with bash and special characters.

The change also uncovered an issue with the event logger app name
sanitizing code; it wasn't cleaning up all "bad" characters, so
sometimes it would fail to create the log dirs. I just added some
more bad character replacements.
@SparkQA
Copy link

SparkQA commented Aug 1, 2014

QA tests have started for PR 1724. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17713/consoleFull

@vanzin
Copy link
Contributor Author

vanzin commented Aug 1, 2014

Screenshot of RM app list:

Mental note: always test both.
@SparkQA
Copy link

SparkQA commented Aug 2, 2014

QA results for PR 1724:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17713/consoleFull

@andrewor14
Copy link
Contributor

Does this work with backslashes as well?

@vanzin
Copy link
Contributor Author

vanzin commented Aug 4, 2014

Backslashes seem to work fine. Added test to be sure.

@SparkQA
Copy link

SparkQA commented Aug 4, 2014

QA tests have started for PR 1724. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17881/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 5, 2014

QA results for PR 1724:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17881/consoleFull

arg.charAt(i) match {
case '$' => escaped.append("\\$")
case '"' => escaped.append("\\\"")
case '\'' => escaped.append("'\\\"'\\\"'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to parse this in my head, looks like a single quote becomes this: '\"'\"'. Could you explain this? It looks like we're wrapping a single quote in a pair of double quotes in a pair of single quotes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is the tricky one. Escaping single quotes inside a single-quoted string does not work.

So what you have to do is close the previous string (remember the whole thing is wrapped in single quotes), and start a new string, delimited by double quotes, with a single single quote in it. So basically, for a string containing a single single quote, you're concatenating three different strings:

  • An empty string at the start: ''
  • The single quote, wrapped in double quotes: "'"
  • An empty string at the end: ''

And since all this is already inside double quotes in the bash script itself, you need to also escape the double quotes. Fun.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess a different approach could be:

'\\''

(i.e., same string-concatenating trick, but escaping the single quote instead of wrapping it in double quotes) A little more readable, maybe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Yeah that looks simpler

@andrewor14
Copy link
Contributor

test this please

@SparkQA
Copy link

SparkQA commented Aug 6, 2014

QA tests have started for PR 1724. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17987/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 6, 2014

QA results for PR 1724:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17987/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 6, 2014

QA tests have started for PR 1724. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18073/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 6, 2014

QA results for PR 1724:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18073/consoleFull

@andrewor14
Copy link
Contributor

failed tests are not related, test this please

@andrewor14
Copy link
Contributor

test this please...

@pwendell
Copy link
Contributor

pwendell commented Aug 9, 2014

Jenkins, test this please.

@SparkQA
Copy link

SparkQA commented Aug 9, 2014

QA tests have started for PR 1724. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18245/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 9, 2014

QA results for PR 1724:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18245/consoleFull

@andrewor14
Copy link
Contributor

Failed test is a flaky test that has been failing in many other PRs. This LGTM.

@vanzin
Copy link
Contributor Author

vanzin commented Aug 14, 2014

Jenkins retest this please.

@SparkQA
Copy link

SparkQA commented Aug 14, 2014

QA tests have started for PR 1724. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18544/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 14, 2014

QA results for PR 1724:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18544/consoleFull

@vanzin
Copy link
Contributor Author

vanzin commented Aug 18, 2014

Ping.

@andrewor14
Copy link
Contributor

@vanzin I'm going to quickly test this on a yarn cluster first, and then I'll merge it.

@andrewor14
Copy link
Contributor

Thanks, I've merged this into master and 1.1

@asfgit asfgit closed this in 6201b27 Aug 18, 2014
asfgit pushed a commit that referenced this pull request Aug 18, 2014
Due to the way Yarn runs things through bash, normal quoting doesn't
work as expected. This change applies the necessary voodoo to the user
args to avoid issues with bash and special characters.

The change also uncovered an issue with the event logger app name
sanitizing code; it wasn't cleaning up all "bad" characters, so
sometimes it would fail to create the log dirs. I just added some
more bad character replacements.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #1724 from vanzin/SPARK-2718 and squashes the following commits:

cc84b89 [Marcelo Vanzin] Review feedback.
c1a257a [Marcelo Vanzin] Add test for backslashes.
55571d4 [Marcelo Vanzin] Unbreak yarn-client.
515613d [Marcelo Vanzin] [SPARK-2718] [yarn] Handle quotes and other characters in user args.

(cherry picked from commit 6201b27)
Signed-off-by: Andrew Or <andrewor14@gmail.com>
@vanzin vanzin deleted the SPARK-2718 branch August 18, 2014 21:19
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
Due to the way Yarn runs things through bash, normal quoting doesn't
work as expected. This change applies the necessary voodoo to the user
args to avoid issues with bash and special characters.

The change also uncovered an issue with the event logger app name
sanitizing code; it wasn't cleaning up all "bad" characters, so
sometimes it would fail to create the log dirs. I just added some
more bad character replacements.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes apache#1724 from vanzin/SPARK-2718 and squashes the following commits:

cc84b89 [Marcelo Vanzin] Review feedback.
c1a257a [Marcelo Vanzin] Add test for backslashes.
55571d4 [Marcelo Vanzin] Unbreak yarn-client.
515613d [Marcelo Vanzin] [SPARK-2718] [yarn] Handle quotes and other characters in user args.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants