Skip to content

8344332: (bf) Migrate DirectByteBuffer away from jdk.internal.ref.Cleaner #25289

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 13 commits into from

Conversation

kimbarrett
Copy link

@kimbarrett kimbarrett commented May 18, 2025

This change makes java.nio no longer use jdk.internal.ref.Cleaner to manage
native memory for Direct-X-Buffers. Instead it uses bespoke PhantomReferences
and a dedicated ReferenceQueue. This differs from PR 22165, which proposed to
use java.lang.ref.Cleaner.

This change is algorithmically similar to the two previous versions:
JDK-6857566 and JDK-8156500 (current mainline). The critical function is
Bits::reserveMemory(). For both of those versions and this change, a thread
calls that function and tries to reserve some space. If it fails, then it
keeps trying until all cleaners deactivated (cleared) by prior GCs have been
cleaned. If reservation still fails, then it invokes the GC to try to
deactivate more cleaners for cleaning. After that GC it keeps trying the
reservation and waiting for cleaning, with sleeps to avoid a spin loop,
eventually either succeeding or giving up and throwing OOME.

Retaining that algorithmic approach is one of the goals of this change, since
it has been successfully in use since JDK 9 (and was originally developed and
extensively tested in JDK 8).

The key to this approach is having a way to determine that deactivated
cleaners have been cleaned. JDK-6857566 accomplished this by having waiting
threads help the reference processor until there was no available work.
JDK-8156500 waits for the reference processor to quiesce, relying on its
immediate processing of cleaners. java.lang.ref.Cleaner doesn't provide a way
to do this, which is why this change rolls its own Cleaner-like mechanism from
the underlying primitives. Like JDK-6857566, this change has waiting threads
help with cleaning references. This was a potentially undesirable feature of
JDK-6857566, as arbitrary allocating threads were invoking arbitrary cleaners.
(Though by the time of JDK-6857566 the cleaners were only used by DBB, and
became internal-only somewhere around that time as well.) That's not a concern
here, as the cleaners involved are only from DBB, and we know what they look
like.

As noted in the discussion of JDK-6857566, it's good to have DBB cleaning
being done off the reference processing thread, as it may be expensive and
slow down enqueuing other pending references. JDK-6857566 only did some of
that, and JDK-8156500 lost that feature. This change moves all of the DBB
cleaning off of the reference processing thread. (So does PR 22165.)

Neither JDK-6857566 nor this change are completely precise. For both, a thread
may find there is no available work while other threads have work in progress.
Making this change more precise seems to cost complexity and performance.
JDK-8156500 is precise in this respect, so we're losing that. But this
imprecision wasn't known to cause problems for JDK-6857566, and there hasn't
been any evidence of problems with this change either.

During the development of JDK-6857566 it was noticed that parallel cleaning
didn't seem to have much (if any) performance benefit. That seems to be true
for this change as well.

PR 22165 uses java.lang.ref.Cleaner to manage cleaning. That class doesn't
provide a good way to detect progress toward or completion of cleaning of
deactivated cleaners from prior GCs. So PR 22165 uses a somewhat clumsy and
unreliable mechanism (the canaries) to try to do that. A proposal for such
functionality was discussed (in PR 22165) but deemed (probably rightly so) too
intrusive. An unpublished alternative was less intrusive, but still might
raise questions. The change being proposed here avoids changing or using that
class, and performs at least as well.

Another issue with PR 22165 is that if we are indeed out of memory and on our
way to OOME, each allocating thread may come up against the slow path lock in
Bits::reserveMemory, and in turn perform 9 full GCs and then OOME. That seems
kind of pathological. For JDK-6857566, JDK-8156500, and this change, an
allocating thread only performs 1 full GC before OOME.

One issue with this change is that it incorporates a near-copy of the
CleanableList class from java.lang.ref.Cleaner. Possible future work would
merge the two into a common utility. There's another potential client for
this: java.desktop/share/classes/sun/java2d/Disposer.java. I tried using a
hashtable for this change (as with Disposer), but the CleanableList performed
significantly better.

A well-known issue with all of these approaches is -XX:+DisableExplicitGC. If
used, then the GCs to request reference processing don't happen. That will
likely lead to OOME, though the sleeps might provide an opportunity for
automatic GCs to occur, maybe sometimes dodging OOME that way.

https://mail.openjdk.org/pipermail/core-libs-dev/2013-October/021547.html
Thread for discussion of development of JDK-6857566

Testing: mach5 tier1-6

Many runs of new tests micro/org/openjdk/bench/java/nio/DirectByteBuffer{GC,Churn}
(thanks for those @shipilev), and jdk/java/nio/Buffer/DirectByteBufferAlloc
for various versions of this change.

The test java/nio/Buffer/DirectByteBufferAlloc.java can be run explicitly as a
benchmark. But the arguments suggested in that file cause the measurements to
be dominated by full GC times, swamping any other differences. Increasing the
value of XX:MaxDirectMemorySize from 128m to 1024m provides a more useful
comparison.

Result of running that test with -XX:MaxDirectMemorySize=1024m, with other
options as suggested in the file, and comparing the periodic per thread
ms/allocation outputs, produces results like this:

this change PR 22165 baseline
avg 0.76165 1.00368 1.02719
stddev 0.12396 0.18361 0.27210
min 0.54 0.6 0.59
max 1.54 2.13 2.39

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)

Issue

  • JDK-8344332: (bf) Migrate DirectByteBuffer away from jdk.internal.ref.Cleaner (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25289/head:pull/25289
$ git checkout pull/25289

Update a local copy of the PR:
$ git checkout pull/25289
$ git pull https://git.openjdk.org/jdk.git pull/25289/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25289

View PR using the GUI difftool:
$ git pr show -t 25289

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25289.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented May 18, 2025

👋 Welcome back kbarrett! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 18, 2025

@kimbarrett This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8344332: (bf) Migrate DirectByteBuffer away from jdk.internal.ref.Cleaner

Reviewed-by: rriggs, bchristi

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 81 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label May 18, 2025
@openjdk
Copy link

openjdk bot commented May 18, 2025

@kimbarrett The following labels will be automatically applied to this pull request:

  • core-libs
  • nio

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added nio nio-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels May 18, 2025
@mlbridge
Copy link

mlbridge bot commented May 18, 2025

@AlanBateman
Copy link
Contributor

/reviewers 2 reviewer

@openjdk
Copy link

openjdk bot commented May 18, 2025

@AlanBateman
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).

* questions.
*/

package jdk.internal.nio;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation/internal classes for this area are in sun.nio (for historical reasons). Probably best to keep them together.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Done.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But MhUtil was added to the newly created jdk.internal.invoke package in #20972 instead of adding it to the pre‑existing sun.invoke package.

@kimbarrett
Copy link
Author

BTW, I'm fine with a suggestion to wait until JDK 26 before integrating this.

if (RESERVE_GC_EPOCH == cleanedEpoch) {
// Increment with overflow to 0, so the value can
// never equal the initial/reset cleanedEpoch value.
RESERVE_GC_EPOCH = Integer.max(0, RESERVE_GC_EPOCH + 1);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also do the following which avoids the branch in Integer.max:

Suggested change
RESERVE_GC_EPOCH = Integer.max(0, RESERVE_GC_EPOCH + 1);
RESERVE_GC_EPOCH = (RESERVE_GC_EPOCH + 1) & Integer.MAX_VALUE;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the use of Integer.max is clearer, any performance difference is insignificant, and the compiler can
make that change as an optimization if appropriate. So I'm not inclined to accept this suggestion.

@@ -197,7 +206,7 @@ class Direct$Type$Buffer$RW$$BO$
#if[rw]
super(-1, 0, cap, cap, fd, isSync, segment);
address = addr;
cleaner = Cleaner.create(this, unmapper);
cleaner = (unmapper == null) ? null : BufferCleaner.register(this, unmapper);
Copy link

@ExE-Boss ExE-Boss May 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenJDK unfortunately uses the less accessible spaces1:

Suggested change
cleaner = (unmapper == null) ? null : BufferCleaner.register(this, unmapper);
cleaner = (unmapper == null) ? null : BufferCleaner.register(this, unmapper);

Footnotes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenJDK has not considered adoption of tabs-for-accessibility and sticks to using spaces across the codebase.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't notice that a tab had snuck in there. Thanks for spotting.

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial PR description is copied into every email. The detail in the PR is appreciated but can be less intrusive if included in a comment after the initial description.

if (sleeps >= MAX_SLEEPS) {
break;
}
for (int sleeps = 0; true; ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More typical coding pattern in openjdk code. Here and elsewhere in this PR.

Suggested change
for (int sleeps = 0; true; ) {
while (true) {
int sleeps = 0;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not the same thing, and doesn't do what's needed here. Perhaps you meant

int sleeps = 0;
while (true) {

I like limiting the scope of the variable. Is that a suggestion or a request to change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, your form does better limit the scope of the loop, and is correct as is; (just looks unusual)

import java.lang.ref.ReferenceQueue;
import java.util.Objects;
import sun.nio.Cleaner;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A class cleaner describing the overall objective (an excerpt from the PR description) would be useful.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assume you meant "class comment". I've added such.

} catch (Throwable x) {
// Long-standing behavior: when deallocation fails, VM exits.
if (System.err != null) {
new Error("Cleaner terminated abnormally", x).printStackTrace();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message would be more useful to identify this as a Buffer Cleaner terminated abnormally.

Copy link
Author

@kimbarrett kimbarrett May 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code was cribbed from #22165, but your suggestion has led me to
discover that wasn't right. I'm going to need to do some rework of the exception handling around cleaning.

/**
* {@code Cleaner} represents an object and a cleaning action.
*/
public interface Cleaner {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be renamed NIOCleaner or NIOBufClenaer or something to avoid the ambiguity between the other cleaner.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intentionally (re)used the "Cleaner" name to avoid a bunch of renames that
would increase the size of the change and distract from the meat of it. I
think the name to use might be affected by how the implementation of the set
of cleanup objects might get merged between the new java.nio.BufferCleaner and
java.lang.ref.Cleaner. Perhaps the java.lang.ref.Cleaner.Cleanable interface
should be used throughout? I didn't want to expand this change to include
those kinds of questions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be in favor of renaming to avoid ambiguity. There is already java.lang.ref.Cleaner and jdk.internal.ref.Cleaner, and now this Cleaner is really the same thing as java.lang.ref.Cleaner.Cleanable.
It could be done as a follow-on change, if need-be.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that jdk.internal.ref.Cleaner is no longer used after this change. PR 22165 would have
removed it. I'd like to leave some of the cleanup, including long-term choice of naming that
internal class, for later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that's fine.

if (System.err != null) {
new Error("nio Cleaner terminated abnormally", x).printStackTrace();
}
System.exit(1);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same behavior as jdk.internal.ref.Cleaner, for which this class is
substituting in the new regime for DBB management. PR 22165 (and earlier
versions of this PR) put this in the DBB's Deallocator::run method, but I
think it's both clearer here, and better to leave the Deallocator as it was in
mainline and be more consistent with the mainline code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think the prior placement from PR22165 doesn't cover the unmapper
case, but only the DBB.Deallocator case, so (I think) was a hidden change from
the baseline use of jdk.internal.ref.Cleaner.

@kimbarrett
Copy link
Author

It's been a while since anyone other than me did or said anything here. Any takers?
@AlanBateman @RogerRiggs @shipilev ?

private static final int MAX_SLEEPS = 9;

private static final Object RESERVE_SLOWPATH_LOCK = new Object();
private static int RESERVE_GC_EPOCH = 0; // Never negative.
Copy link
Member

@bchristi-git bchristi-git Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment on how this is used? Also, it seems more like a "counter".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some comment updates that I hope address this.

Comment on lines 129 to 131
// Keep trying to reserve until either succeed or there is no
// further cleaning available from prior GCs. If the latter then
// GC to hopefully find more cleaning to do.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment could better describe what happens. For instance, only one thread at a time will GC, and each thread only GCs once (assuming I'm reading the code right).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some comment updates that I hope address this.

Copy link
Member

@bchristi-git bchristi-git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. I have some comments.

private static boolean tryReserveOrClean(long size, long cap)
throws InterruptedException
{
JavaLangRefAccess jlra = SharedSecrets.getJavaLangRefAccess();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better to have jlra here, as a local, versus where it was as a constant?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it doesn't matter, and this puts it close to where it's used.

/**
* {@code Cleaner} represents an object and a cleaning action.
*/
public interface Cleaner {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be in favor of renaming to avoid ambiguity. There is already java.lang.ref.Cleaner and jdk.internal.ref.Cleaner, and now this Cleaner is really the same thing as java.lang.ref.Cleaner.Cleanable.
It could be done as a follow-on change, if need-be.

// If being cleaned explicitly by application, rather than via
// reference processing by BufferCleaner, clear the referent so
// reference processing is disabled for this object.
clear();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference.clear() does not have any JMM guarantees. However, I've not found anything that might race with accessing the referent.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you are concerned about here. However, the referents for
these objects aren't ever accessed by other than the GC. And we don't really
care about the potential race between clear() and GC; the atomic removal from
the cleanerList deals with any timing issues there.

this.action = action;
}

public void clean() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be @Override

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Done.

}
}

// Cribbed from jdk.internal.ref.CleanerImpl.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe sometime later this can be shared instead of copied.

Copy link
Author

@kimbarrett kimbarrett Jun 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the PR description:

Possible future work would merge the two into a common utility.

I intend to file a JBS issue for that if this change is approved.

Comment on lines +239 to +243
synchronized (cleanerList) {
if (cleaningThread != null) {
return;
}
cleaningThread = new CleaningThread();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think double-checked locking could work well here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but why bother?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reduce the thread-safety cost of every register() (after the first) down to a volatile read, instead of an Object.lock().

But perhaps this isn't that hot a code path - depending on how frequently the app creates (and registers) buffers.

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 4, 2025
@kimbarrett
Copy link
Author

Thanks for reviews, @RogerRiggs and @bchristi-git

@kimbarrett
Copy link
Author

/integrate

@openjdk
Copy link

openjdk bot commented Jul 4, 2025

Going to push as commit 21f2e9a.
Since your change was applied there have been 81 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jul 4, 2025
@openjdk openjdk bot closed this Jul 4, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jul 4, 2025
@openjdk
Copy link

openjdk bot commented Jul 4, 2025

@kimbarrett Pushed as commit 21f2e9a.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@kimbarrett kimbarrett deleted the direct-buffer-cleaner branch July 4, 2025 04:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated nio nio-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

5 participants