-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-18750][yarn] Avoid using "mapValues" when allocating containers. #16667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
That method is prone to stack overflows when the input map is really large; instead, use plain "map". Also includes a unit tests that was tested and caused stack overflows without the fix.
Test build #71751 has finished for PR 16667 at commit
|
Argh, api not available in old hadoop... fix coming. |
Test build #71756 has finished for PR 16667 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine. I'm surprised there is a difference but take your word for it if it fixes it. It should be functionally equivalent anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. Just a couple small nits.
import org.apache.hadoop.fs.CommonConfigurationKeysPublic | ||
import org.apache.hadoop.net.DNSToSwitchMapping | ||
import org.apache.hadoop.yarn.api.records._ | ||
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not used that I see, remove
yarnConf, resource) | ||
|
||
val totalTasks = 32 * 1024 | ||
val totalContainers = totalTasks / 16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any particular reason for the 16 here? I assume its just random selected that shows the issue but perhaps add in a comment.
val count = containers.size / hosts.size / 2 | ||
|
||
val hostToContainerMap = new HashMap[String, Set[ContainerId]]() | ||
hosts.keys.take(hosts.size / 2).zipWithIndex.foreach { case (host, i) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar wouldn't hurt to have small description here for someone looking at it later
Test build #71867 has finished for PR 16667 at commit
|
+1 |
That method is prone to stack overflows when the input map is really large; instead, use plain "map". Also includes a unit test that was tested and caused stack overflows without the fix. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #16667 from vanzin/SPARK-18750. (cherry picked from commit 76db394) Signed-off-by: Tom Graves <tgraves@yahoo-inc.com>
That method is prone to stack overflows when the input map is really large; instead, use plain "map". Also includes a unit test that was tested and caused stack overflows without the fix. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #16667 from vanzin/SPARK-18750. (cherry picked from commit 76db394) Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
That method is prone to stack overflows when the input map is really large; instead, use plain "map". Also includes a unit test that was tested and caused stack overflows without the fix. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes apache#16667 from vanzin/SPARK-18750.
…ainers. apache#16667 without adding LocalityPlacementStrategySuite.scala
That method is prone to stack overflows when the input map is really large; instead, use plain "map". Also includes a unit test that was tested and caused stack overflows without the fix. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes apache#16667 from vanzin/SPARK-18750.
That method is prone to stack overflows when the input map is really
large; instead, use plain "map". Also includes a unit test that was
tested and caused stack overflows without the fix.