[SPARK-7200] Check that memory is not leaked in TaskMemoryManagerSuite's after()#17402
[SPARK-7200] Check that memory is not leaked in TaskMemoryManagerSuite's after()#17402jsoltren wants to merge 4 commits intoapache:masterfrom
Conversation
…ryManagerSuite Cleans up all memory used by tests explicitly in after(). This new function has the JUnit @after tag and thus is guaranteed to run, even if an @test method throws an exception. Testing: Ran tests locally. Hacked an existing test and made sure that a failure was reported in the original test, and not in after(). Verified test logging in test-reports/org.apache.spark.memory.TaskMemoryManagerSuite.xml.
|
ok to test |
|
Test build #75129 has finished for PR 17402 at commit
|
vanzin
left a comment
There was a problem hiding this comment.
Please update the title. You're not adding a cleanup, you're adding a check that no memory was leaked after tests ran.
|
|
||
| public class TaskMemoryManagerSuite { | ||
|
|
||
| static TaskMemoryManager manager; |
There was a problem hiding this comment.
Does not need to be static. Also should be private.
|
|
||
| @After | ||
| public void after() { | ||
| Assert.assertEquals(0, manager.getMemoryConsumptionForThisTask()); |
There was a problem hiding this comment.
Add Assert.assertNotNull(manager).
Also, probably minor, but if a test fails, these checks will probably fail in weird ways and might mask the original error... I'd try to insert a failure in a test and seeing what happens.
There was a problem hiding this comment.
@squito had the same concern. I purposefully leaked some memory at the end of leakedPageMemoryIsDetected and got this very helpful trace:
[info] Test org.apache.spark.memory.TaskMemoryManagerSuite.leakedPageMemoryIsDetected started
[error] Test org.apache.spark.memory.TaskMemoryManagerSuite.leakedPageMemoryIsDetected failed: java.lang.AssertionError: expected:<0> but was:<4096>, took 0.007 sec
[error] at org.apache.spark.memory.TaskMemoryManagerSuite.after(TaskMemoryManagerSuite.java:34)
[error] ...
There was a problem hiding this comment.
This sounds like this assert triggering, not the actual test failing?
I'm talking about is leaking memory and inserting a fail("blah") in a test and seeing what's reported.
| final MemoryBlock block = manager.allocatePage(4096, c); // leak memory | ||
| Assert.assertEquals(4096, manager.getMemoryConsumptionForThisTask()); | ||
| Assert.assertEquals(4096, manager.cleanUpAllAllocatedMemory()); | ||
| manager.freePage(block, c); |
There was a problem hiding this comment.
Is this really needed? Why isn't the call above freeing this memory?
There was a problem hiding this comment.
It should. We can check for that instead with:
Assert.assertEquals(0, manager.cleanUpAllAllocatedMemory());
|
Hi @jsoltren, is this still active? |
|
I'm still waiting on feedback from @JoshRosen on my earlier question. I can update this patch in response to @vanzin's review feedback if it would help to get this checked in. IMO this is a pretty trivial fix that we should get checked in regardless. |
|
I see. Thank you for your input. |
|
At this point I'd like to get this checked in whether or not @JoshRosen chimes in here. @HyukjinKwon @vanzin I've updated this based on the previous review feedback received. Thanks! |
|
I think that the scope / intent of the original SPARK-7200 JIRA was to include this type of logic as a generic assertion in multiple suites so that we gain additional implicit assertions in suites which would otherwise maybe not have them, whereas this patch looks mostly like cleanup/refactoring of a single suite's existing assertions. I don't have any objections to this patch but I don't see it as a big win either. I don't think that this PR's specific changes will actually help to prevent any bugs because we don't appear to be adding assertions which weren't already present in the existing code. |
|
Test build #76992 has finished for PR 17402 at commit
|
|
@jsoltren do you plan to follow up on Josh's comments? Seems like the scope of the bug was a little larger than you thought. |
|
Yes, I'll post an updated change once I wrap up a couple of other things...
@vanzin The scope of the bug (specifically the text "after each suite") was vague so this was a purposefully small proof-of-concept change. I tried to be clear about this in JIRA comments and in the PR description. That being said: yes, the direction I'll go is reviewing more existing tests and seeing where it would make sense to put a check for leaked memory. Ideally I'd put this in an after() or afterAll() block but I need to do some more digging to figure out exactly where. |
|
ping @jsoltren |
|
Pretty sure we can close this out. |
Closes apache#11494 Closes apache#14158 Closes apache#16803 Closes apache#16864 Closes apache#17455 Closes apache#17936 Closes apache#19377 Added: Closes apache#19380 Closes apache#18642 Closes apache#18377 Closes apache#19632 Added: Closes apache#14471 Closes apache#17402 Closes apache#17953 Closes apache#18607 Also cc srowen vanzin HyukjinKwon gatorsmile cloud-fan to see if you have other PRs to close. Author: Xingbo Jiang <xingbo.jiang@databricks.com> Closes apache#19669 from jiangxb1987/stale-prs.
Cleans up all memory used by tests explicitly in after(). This new
function has the JUnit @after tag and thus is guaranteed to run, even
if an @test method throws an exception.
Testing:
Ran tests locally.
Hacked an existing test and made sure that a failure was reported in
the original test, and not in after().
Verified test logging in
test-reports/org.apache.spark.memory.TaskMemoryManagerSuite.xml.
What changes were proposed in this pull request?
Updates TaskMemoryManagerSuite based on Josh Rosen's comments in SPARK-7200. It would be good to get some feedback as to whether or not this is the right direction.
How was this patch tested?
See "Testing" above.