-
Notifications
You must be signed in to change notification settings - Fork 28.5k
[SPARK-2140] Updating heap memory calculation for YARN stable and alpha. #2253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Can one of the admins verify this patch? |
can you just remove the entire calculateAMMemory functions and just use the args.amMemory where they are called (ClientBase.createContainerLaunchContext)? |
Jenkins, test this please |
Hey Tom, All that mod business was done so that we allocate container memory in multiples of minimum resource capability. I think this was required (atleast a long while ago). If that is not the case, then we need to definitely clean all that complicated computation up ! |
Hey Mridul, Yarn actually handles doing the multiple for you. Perhaps really early 0.23 builds didn't do this. For instance minimum container size of 512m and you ask for 700m, it will round that up and give you a 1g container. Obviously one thing it doesn't do now is change your heap size. It will be exactly what the user requests vs us rounding it up. If we think that is important then we can put it back in. I am leaning on the side of removing unneeded logic and giving the user the size they really request. Also as you will notice that logic hasn't been there in the hadoop 2.x port anyway which I think more people are using at this point. |
Jenkins, test this please |
I think it's preferable to give the user the size they actually request. This avoids them requesting the same size later under different conditions and unexpectedly OOMing. Though it's possibly worth warning users about how much their containers sizes got rounded up? |
not sure why jenkins isn't running on this.... try again. test this please |
Independent of Jenkins, I notice that some of the test suites fails sometimes. This is unrelated to this patch, just thought I'd mention it. |
Jenkins, test this please |
It's been down this afternoon |
Jenkins, test this please |
Can one of the admins verify this patch? |
Is this merge blocking on Jenkins? I can provide a JUnit test report if you want? |
Jenkins, test this please |
yes just want to get a clean jenkins run. Otherwise it looks good. Jenkins should be back up now. |
I was looking at this a bit more and I think our checks around the memory args in ClientBase.validateArgs are now out of date. It used to be that the memory specific by the user had to be greater then the memory overhead, but not we just take that on so those checks are no longer valid. Would you mind taking a look at those and fixing along with this? If not we can file a separate jira for it also since it wasn't introduced by you. |
Jenkins, test this please |
@mateiz @pwendell @andrewor14 anyone know why jenkins isn't picking this pr up? |
Manually triggered Jenkins using my new script; automated triggering should be added shortly. |
QA tests have started for PR 2253 at commit
|
LGTM, I was actually making a similar change in parallel. |
The changes look fine. I'm waiting for Jenkins. I'll file a separate jira to handle changing ClientBase.validateArgs. |
Has Jenkins been stuck for others, or is my pr just special? |
@copester It should actually be testing now, triggered manually by my private hook: https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/19/consoleFull. My current theory is that the Jenkins plugin is recognizing new PRs but not comments on those PRs. We're in the process of building a new mechanism for triggering PRs to replace the Jenkins plugin; I hope to have this live by this afternoon. |
@JoshRosen thanks. I'm brand new to the Spark community and just wanted to make sure my workflow was correct per testing. |
QA tests have finished for PR 2253 at commit
|
+1. Thanks @copester |
Updated pull request, reflecting YARN stable and alpha states. I am getting intermittent test failures on my own test infrastructure. Is that tracked anywhere yet? Author: Chris Cope <ccope@resilientscience.com> Closes #2253 from copester/master and squashes the following commits: 5ad89da [Chris Cope] [SPARK-2140] Removing calculateAMMemory functions since they are no longer needed. 52b4e45 [Chris Cope] [SPARK-2140] Updating heap memory calculation for YARN stable and alpha. (cherry picked from commit ed1980f) Signed-off-by: Thomas Graves <tgraves@apache.org>
Updated pull request, reflecting YARN stable and alpha states. I am getting intermittent test failures on my own test infrastructure. Is that tracked anywhere yet?