SPARK-4040. Update documentation to exemplify use of local (n) value, fo... #2964
Can one of the admins verify this patch?
(bump) Any thoughts on this? I'd also like to roll some more improvements into it in a follow-up.
{% highlight scala %}
val conf = new SparkConf()
-             .setMaster("local")
+             .setMaster("local[1]")
I actually thought the default behavior of local was 2 threads, but in the code I don't think that's true. I thought @mateiz mentioned one time that it's better to run with minimal parallelism by default to expose issues that might only appear when there are multiple executors.
In any event, given that, and the thrust of this doc change, is it good to encourage people to use 1 worker? How about explicitly 2? Making it explicit is a small good thing anyway.
Hi Sean. I like the idea of running with 2 threads and making it explicit; that's the main purpose of the PR. I'll update that (and the // stuff), and then rebase this PR.
0ea9a4b
to
6bcab3f
Compare
Okay, updated. After this I think we can look into some deeper updates to the streaming docs as well. (FYI @srowen) Looking good now?
             .setAppName("CountingSheep")
             .set("spark.executor.memory", "1g")
val sc = new SparkContext(conf)
{% endhighlight %}
Note that we can have more than 1 worker in local mode, and in cases like spark streaming, we may actually
require one to prevent any sort of starvation issues.
One note on this, the threads shouldn't be called "workers", since that means something else in our distributed cluster mode. It's better to call them threads here.
Also, on this line, capitalize Spark Streaming.
Thanks for adding these clarifications, it's a good idea.
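For readers following the thread, here is a minimal sketch of the streaming case the note refers to. It is not part of the patch. It assumes a receiver-based input (a socket text stream), where the receiver occupies one local thread, so at least one more thread is needed to process the received data. The object name, app name, host, and port are hypothetical.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical example object; not taken from the PR.
object LocalStreamingSketch {
  def main(args: Array[String]): Unit = {
    // "local[2]" asks for two local threads: one for the socket receiver,
    // one for processing batches. With "local" or "local[1]" the receiver
    // would take the only thread and no batch would ever be processed.
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("LocalStreamingSketch") // hypothetical app name

    // One-second batch interval.
    val ssc = new StreamingContext(conf, Seconds(1))

    // Assumed source: a text server listening on localhost:9999.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Print the number of lines received in each batch.
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Running the same sketch with `setMaster("local")` should start without error but print no batch counts, which is the starvation the note warns about.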
6bcab3f
to
3f22a91
Compare
3f22a91
to
35b5a5e
Compare
(bump) All set on this one? Or shall we wait until after the upcoming Spark release?
This looks fine to merge into 1.2; will do so. Thanks!
… fo...
This is a minor docs update which helps to clarify the way local[n] is used for streaming apps.
Author: jay@apache.org <jayunit100>
Closes #2964 from jayunit100/SPARK-4040 and squashes the following commits:
35b5a5e [jay@apache.org] SPARK-4040: Update documentation to exemplify use of local (n) value.
(cherry picked from commit 868cd4c)
Signed-off-by: Matei Zaharia <matei@databricks.com>