-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-2718] [yarn] Handle quotes and other characters in user args. #1724
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Due to the way Yarn runs things through bash, normal quoting doesn't work as expected. This change applies the necessary voodoo to the user args to avoid issues with bash and special characters. The change also uncovered an issue with the event logger app name sanitizing code; it wasn't cleaning up all "bad" characters, so sometimes it would fail to create the log dirs. I just added some more bad character replacements.
QA tests have started for PR 1724. This patch merges cleanly. |
Mental note: always test both.
QA results for PR 1724: |
Does this work with backslashes as well? |
Backslashes seem to work fine. Added test to be sure. |
QA tests have started for PR 1724. This patch merges cleanly. |
QA results for PR 1724: |
arg.charAt(i) match { | ||
case '$' => escaped.append("\\$") | ||
case '"' => escaped.append("\\\"") | ||
case '\'' => escaped.append("'\\\"'\\\"'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to parse this in my head, looks like a single quote becomes this: '\"'\"'
. Could you explain this? It looks like we're wrapping a single quote in a pair of double quotes in a pair of single quotes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is the tricky one. Escaping single quotes inside a single-quoted string does not work.
So what you have to do is close the previous string (remember the whole thing is wrapped in single quotes), and start a new string, delimited by double quotes, with a single single quote in it. So basically, for a string containing a single single quote, you're concatenating three different strings:
- An empty string at the start: ''
- The single quote, wrapped in double quotes: "'"
- An empty string at the end: ''
And since all this is already inside double quotes in the bash script itself, you need to also escape the double quotes. Fun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess a different approach could be:
'\\''
(i.e., same string-concatenating trick, but escaping the single quote instead of wrapping it in double quotes) A little more readable, maybe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. Yeah that looks simpler
test this please |
QA tests have started for PR 1724. This patch merges cleanly. |
QA results for PR 1724: |
QA tests have started for PR 1724. This patch merges cleanly. |
QA results for PR 1724: |
failed tests are not related, test this please |
test this please... |
Jenkins, test this please. |
QA tests have started for PR 1724. This patch merges cleanly. |
QA results for PR 1724: |
Failed test is a flaky test that has been failing in many other PRs. This LGTM. |
Jenkins retest this please. |
QA tests have started for PR 1724. This patch merges cleanly. |
QA results for PR 1724: |
Ping. |
@vanzin I'm going to quickly test this on a yarn cluster first, and then I'll merge it. |
Thanks, I've merged this into master and 1.1 |
Due to the way Yarn runs things through bash, normal quoting doesn't work as expected. This change applies the necessary voodoo to the user args to avoid issues with bash and special characters. The change also uncovered an issue with the event logger app name sanitizing code; it wasn't cleaning up all "bad" characters, so sometimes it would fail to create the log dirs. I just added some more bad character replacements. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #1724 from vanzin/SPARK-2718 and squashes the following commits: cc84b89 [Marcelo Vanzin] Review feedback. c1a257a [Marcelo Vanzin] Add test for backslashes. 55571d4 [Marcelo Vanzin] Unbreak yarn-client. 515613d [Marcelo Vanzin] [SPARK-2718] [yarn] Handle quotes and other characters in user args. (cherry picked from commit 6201b27) Signed-off-by: Andrew Or <andrewor14@gmail.com>
Due to the way Yarn runs things through bash, normal quoting doesn't work as expected. This change applies the necessary voodoo to the user args to avoid issues with bash and special characters. The change also uncovered an issue with the event logger app name sanitizing code; it wasn't cleaning up all "bad" characters, so sometimes it would fail to create the log dirs. I just added some more bad character replacements. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes apache#1724 from vanzin/SPARK-2718 and squashes the following commits: cc84b89 [Marcelo Vanzin] Review feedback. c1a257a [Marcelo Vanzin] Add test for backslashes. 55571d4 [Marcelo Vanzin] Unbreak yarn-client. 515613d [Marcelo Vanzin] [SPARK-2718] [yarn] Handle quotes and other characters in user args.
Due to the way Yarn runs things through bash, normal quoting doesn't
work as expected. This change applies the necessary voodoo to the user
args to avoid issues with bash and special characters.
The change also uncovered an issue with the event logger app name
sanitizing code; it wasn't cleaning up all "bad" characters, so
sometimes it would fail to create the log dirs. I just added some
more bad character replacements.