Skip to content

Conversation

@LuciferYang
Copy link
Contributor

@LuciferYang LuciferYang commented Oct 12, 2023

What changes were proposed in this pull request?

This pr replaces refQueue.poll() with refQueue.remove() in the test case reference to sub iterator should not be available after completion to ensure that a PhantomReference object can be retrieved from refQueue.

Why are the changes needed?

#43325 replaces Reference#isEnqueued with Reference#refersTo(null) to eliminate the use of deprecated APIs.

However, there are some differences between ref.isEnqueued and ref.refersTo(null).

  • The ref.isEnqueued method is used to check whether this PhantomReference object has been added to its reference queue by the garbage collector. When the garbage collector decides to recycle an object, if this object has one or more PhantomReference, then these PhantomReference will be added to their reference queues. So, if ref.isEnqueued returns true, it means that this PhantomReference has been added to the reference queue, which means that the object it references has been recycled by the garbage collector.

  • The ref.refersTo(null) method is used to check whether this PhantomReference object refers to the specified object. In the current code, ref.refersTo(null) is used to check whether ref still refers to sub. If ref.refersTo(null) returns true, it means that ref no longer refers to sub, which means that sub might have been recycled by the garbage collector. But this does not mean that this ref has been added to the reference queue.

So we can see the following test failure in GA:

https://github.com/apache/spark/actions/runs/6484510414/job/17608536854

[info] - reference to sub iterator should not be available after completion *** FAILED *** (287 milliseconds)
[info]   null did not equal java.lang.ref.PhantomReference@11e8f090 (CompletionIteratorSuite.scala:67)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
[info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
[info]   at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
[info]   at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
[info]   at org.apache.spark.util.CompletionIteratorSuite.$anonfun$new$3(CompletionIteratorSuite.scala:67)

To solve this issue, this PR replaces refQueue.poll() with refQueue.remove() to allow for waiting until ref is put into refQueue and can be retrieved from refQueue.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Pass GitHub Actions

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the CORE label Oct 12, 2023
}
assert(ref.refersTo(null))
assert(refQueue.poll() === ref)
assert(refQueue.remove() === ref)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove is a blocking method, should we add a timeout(ms)? like .remove(1000)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also cc @srowen

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh yeah might be safer to add a timeout

@LuciferYang
Copy link
Contributor Author

LuciferYang commented Oct 12, 2023

OK, 24fe2c0 add 1000ms timeout

@LuciferYang
Copy link
Contributor Author

LuciferYang commented Oct 12, 2023

Some test tasks failed, I will re-trigger them later

@LuciferYang
Copy link
Contributor Author

[info] *** 1 TEST FAILED ***
[error] Failed: Total 3689, Failed 1, Errors 0, Passed 3688, Ignored 10, Canceled 2
[error] Failed tests:
[error] 	org.apache.spark.scheduler.HealthTrackerIntegrationSuite

The failed test case is not related to the current PR.

@LuciferYang LuciferYang changed the title [SPARK-45499][CORE][TESTS][FOLLOWUP] Use ReferenceQueue#remove instead of ReferenceQueue#poll [SPARK-45499][CORE][TESTS][FOLLOWUP] Use ReferenceQueue#remove instead of ReferenceQueue#poll Oct 12, 2023
@LuciferYang
Copy link
Contributor Author

Merged into master, thanks @srowen @dongjoon-hyun @yaooqinn

@LuciferYang LuciferYang deleted the ref-remove branch October 18, 2023 05:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants