-
Notifications
You must be signed in to change notification settings - Fork 6k
8356228: NMT does not record reserved memory base address correctly #25719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
8356228: NMT does not record reserved memory base address correctly #25719
Conversation
👋 Welcome back azafari! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
@afshin-zafari The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
/label hotspot-runtime |
@afshin-zafari The |
@afshin-zafari
|
// Once the memory is reserved, NMT can track it as mtClassShared. | ||
// During an aligned reserve, some extra sub-regions of the memory region may be released due to alignment requirements. | ||
// NMT will ignore releasing sub-regions (i.e., contained in other regions) that have mtClassShared tags. | ||
MemTracker::record_virtual_memory_tag(rs, mtClassShared); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to capture this logic in a test, so we can always check for this behavior?
Where in the code do we release and check for the special case of mtClassShared
?
I'm not 100% sure I understand why we have to first reserve with mtNone
and only then change it to mtClassShared
. Why can't we reserve with mtClassShared
right from the start?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to capture this logic in a test, so we can always check for this behavior?
Work on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where in the code do we release and check for the special case of
mtClassShared
?
In virtualMemoryTracker.cpp, remove_released_region(adddress, size)
in an if
block as this:
if (reserved_rgn->mem_tag() == mtClassShared) {
if (reserved_rgn->contain_region(addr, size)) {
// This is an unmapped CDS region, which is part of the reserved shared
// memory region.
// See special handling in VirtualMemoryTracker::add_reserved_region also.
return true;
}
if (size > reserved_rgn->size()) {
// This is from release the whole region spanning from archive space to class space,
// so we release them altogether.
ReservedMemoryRegion class_rgn(addr + reserved_rgn->size(),
(size - reserved_rgn->size()));
ReservedMemoryRegion* cls_rgn = _reserved_regions->find(class_rgn);
assert(cls_rgn != nullptr, "Class space region not recorded?");
assert(cls_rgn->mem_tag() == mtClass, "Must be class mem tag");
remove_released_region(reserved_rgn);
remove_released_region(cls_rgn);
return true;
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure I understand why we have to first reserve with
mtNone
and only then change it tomtClassShared
. Why can't we reserve withmtClassShared
right from the start?
The sequence is as follows:
All the reserve/release operations here are from os::
namespace and therefore will call NMT tracking API
1- CDS requests for reserving S
bytes at address Base
with alignment A
and memTag M
.
2- Due to OS alignment limitations regarding A
, new base XBase
to be used with new size XS
. Read X as extra here.
3- Do instead, reserve XS
bytes at XBase
.
4- If succeeds, memory would be like <--s1---><----S----><----s2--->
, where XS = s1 + S + s2
and XBase + s1 == Base
5- Release extra parts s1
and s2
.
6- request at step 1 above is fulfilled now.
If at step 1, the memTag M
is mtClassShared
, the releases at step 5 will be ignored by NMT. Which means that there are regions in NMT lists that are actually released (at os level) but NMT takes them still as reserved. Therefore, further reservations by os at those regions (e.g., for stack) will intersect with existing NMT regions and the NMT assertions fail. So the memTag is to be mtNone
to bypass/avoid the NMT ignore branch.
If we want to track the new memory region as belongs to CDS, we need to set the memTag of the region to mtClassShared
with the corresponding MemTracker
API.(as is done here in this PR)
The special-case handling of CDS in NMT breaks abstractions and has caused a number of bugs. I'm worried that trying to workaround this special-case code makes the code even more susceptible for future bugs. So, therefore I'm interested to here about the planes you have to replace the NMT implementation with a tree. Isn't that going to remove the CDS special-case code and maybe be even solve this issue without the need for this fix? If not, feel free to ignore my comment. |
In the new implementation of NMT with tree, there will be no special-case handling neither for CDS nor for any other memory types. Thanks for bringing this up. |
Do we need to bother with this fix then? Should we wait for the VMATree and just verify? |
The problem here stems from the following facts:
So, if ArchiveBuffer memory is reserved with CDS tag, there might be some released regions that NMT ignored and then later complains their intersection with other regions (e.g., thread stack regions observed in the issue).
Solution is to reserve the memory with no tag (mtNone) and if it succeeds change the tag of the region to CDS for further trackings.
Tests:
linux-x64: runtime/NMT jtreg tests passed
linux-aarch64: tier1 to tier3 passed
Progress
Issue
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25719/head:pull/25719
$ git checkout pull/25719
Update a local copy of the PR:
$ git checkout pull/25719
$ git pull https://git.openjdk.org/jdk.git pull/25719/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25719
View PR using the GUI difftool:
$ git pr show -t 25719
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25719.diff
Using Webrev
Link to Webrev Comment