[SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for executor launch with leading arguments #15579


Closed
wants to merge 3 commits

Conversation


@sheepduke sheepduke commented Oct 21, 2016

What changes were proposed in this pull request?

Sometimes it is rather useful if Spark can run with extra leading arguments.

For example, Spark is currently unaware of NUMA; making it NUMA-aware leads to a performance boost when there are many remote memory allocations.

So with this patch, it is possible to run Spark under a leading command.
This patch is not only for NUMA support: it is a more general mechanism that makes it possible to integrate Spark with valgrind or other trace-analysis tools.
In a word, it provides many possibilities for performance tuning in different use cases.
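As an illustrative sketch only (the variable and class names below are assumptions, not code from this patch), the idea is simply to prepend the prefix command to the executor launch command line:

```shell
# Hypothetical sketch: a prefix command taken from an environment variable
# is prepended to the executor launch command. Names are illustrative only.
SPARK_COMMAND_PREFIX="numactl --cpunodebind=0 --membind=0"  # bind to NUMA node 0
BASE_CMD="java org.apache.spark.executor.CoarseGrainedExecutorBackend"
LAUNCH_CMD="${SPARK_COMMAND_PREFIX} ${BASE_CMD}"
echo "${LAUNCH_CMD}"
```

The same mechanism works for any tool that wraps a child process, e.g. `valgrind` or `strace`, since the prefix is just prepended verbatim.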

TODO
Currently it only works for YARN. Support for Mesos and Standalone is on the way...

How was this patch tested?

It was tested using BigBench on the cluster described below:

Cluster topology: 1 master + 4 slaves (Spark on YARN)
CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz (72 cores)
Memory: 128 GB (2 NUMA nodes)
NIC: 1 x 10 Gb/sec
Disk: write 1.5 GB/sec, read 5 GB/sec
SW version: Hadoop-5.7.0 + Spark-2.0.0

More testing is still in progress and this thread will be updated with the results. Because this is considered a useful feature, we would like to publish it here early.

How to use?

Currently the environment variable SPARK_COMMAND_PREFIX provides a prefix command that is placed before the Spark executor command (YARN only for now).
Let's take strace as an example.
In spark-env.sh file, specify the following:

SPARK_COMMAND_PREFIX="strace -ttT"

By doing this, Spark will be launched under strace, which reports when each syscall was made and how long it took.

For more information about how to use strace, please refer to the man page.
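Concretely, the spark-env.sh entry above amounts to something like the following sketch (the exact executor command line varies by deployment and is shown here only as a placeholder):

```shell
# spark-env.sh sketch: every executor launch gets prefixed with strace.
export SPARK_COMMAND_PREFIX="strace -ttT"

# The effective executor command would then look roughly like:
#   strace -ttT java ... CoarseGrainedExecutorBackend ...
# where "<executor args>" stands in for the real, deployment-specific arguments.
EFFECTIVE_CMD="${SPARK_COMMAND_PREFIX} java <executor args>"
echo "${EFFECTIVE_CMD}"
```

Note that `-tt` prints timestamps with microsecond precision and `-T` prints the time spent in each syscall, which is what makes this useful for latency analysis.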

@rxin
Contributor

rxin commented Oct 21, 2016

This seems really weird to do (and also only works in YARN).

@rxin
Contributor

rxin commented Oct 21, 2016

Please also review https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

Thanks a lot!

@sheepduke
Author

This is rather useful sometimes because you want to add some extra tuning arguments like numactl.
Otherwise it is not even possible to achieve that.

Yes, it only works with YARN for now because that is where our requirement is.

In the future, support may be added for other cluster managers.

Any ideas or potential improvements?

@mridulm
Contributor

mridulm commented Oct 21, 2016

Btw, curious if you have actually tested this on YARN - I have a feeling it won't work.

@srowen
Member

srowen commented Oct 21, 2016

I'm still not clear this is worth doing just for numactl, which few or no OSes will have installed by default.

@sheepduke
Author

@mridulm Yes, I tested it on our cluster (5 nodes including 1 master). My colleague tested it with some benchmarks. It seems that NUMA binding helps a lot for applications that have very bad cache locality (like data lake workloads).

@sheepduke
Author

@srowen Hi, at first we wanted to add support for numactl only, but later we decided it is better to allow an arbitrary prefix command. It is not only for NUMA, but also for other optimization tools.

This feature makes life a lot easier when Spark is called from a script.

@srowen
Member

srowen commented Oct 24, 2016

What's another use case? I can't think of one. We wouldn't do this with env variables anyway. (PS you need to fix up the title/description in any event)

@SparkQA

SparkQA commented Oct 24, 2016

Test build #3370 has finished for PR 15579 at commit a24aff9.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jerryshao
Contributor

@srowen, another use case would be trace tools like strace, which trace the system calls of a process. One way of using strace is to prepend it to the command being executed.

@sheepduke, is it possible with your current solution to set different NUMA parameters per executor? Since the parameter is set through an env variable, which cannot change at runtime, won't this lead to a situation where all the executors are running in the same NUMA zone (if there are multiple executors on one NM)?
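To illustrate the concern: a single environment variable yields one fixed prefix for every executor on a NodeManager. A hypothetical per-executor scheme (not part of this patch; the variable names are made up for illustration) would have to vary the NUMA node per executor, e.g. round-robin:

```shell
# Hypothetical per-executor NUMA assignment (NOT in this patch): round-robin
# executors across NUMA nodes instead of one fixed env-var prefix for all.
NUM_NUMA_NODES=2
PLAN=""
for executor_id in 0 1 2 3; do
  node=$(( executor_id % NUM_NUMA_NODES ))   # alternate between nodes 0 and 1
  PLAN="${PLAN}executor ${executor_id} -> numactl --cpunodebind=${node} --membind=${node}
"
done
printf "%s" "$PLAN"
```

With a single env-var prefix, every executor would instead get the same `--cpunodebind`/`--membind` values, which is exactly the limitation being pointed out.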

@chenghao-intel
Contributor

@srowen
Besides numactl, there are profiling tools like valgrind, strace, and VTune, and also system-call hooks that we may need before the executor process is launched.

I agree it is probably not that secure to provide the prefix command via configuration; this is an early code review to see whether the community will accept it.

@rxin Follow-up commits with unit tests and Standalone and Mesos mode support will be added soon.

@sheepduke please update the PR description and title first.

@chenghao-intel
Contributor

Oh, thank you @jerryshao, just noticed you gave input too. :)

@sheepduke sheepduke changed the title Added support for extra command in front of spark. [SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for running with extra optimization such as NUMA Oct 25, 2016
@sheepduke sheepduke changed the title [SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for running with extra optimization such as NUMA [SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for running with extra arguments for optimization Oct 25, 2016
@sheepduke sheepduke changed the title [SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for running with extra arguments for optimization [SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for running with extra arguments Oct 25, 2016
@sheepduke sheepduke changed the title [SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for running with extra arguments [SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for running Spark with extra leading arguments Oct 25, 2016
@sheepduke sheepduke changed the title [SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for running Spark with extra leading arguments [SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for executor launch with leading arguments Oct 25, 2016

@sheepduke
Author

Pull request closed. Please move to #15579

@xwu-intel

Typo. @sheepduke
Pls move to #16411
