-
Notifications
You must be signed in to change notification settings - Fork 6k
8357662: applications/runthese/RunThese8H.java Crash: 'Failed to uncommit metaspace' #25720
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
👋 Welcome back coleenp! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
I'm not sure this is safe. Could you explain how you come to the conclusion that failing to uncommit shouldn't be considered fatal? If the failure mode really is that we failed to split the memory mapping then I don't think we can be sure that the OS left the memory reservation the way the JVM has book keeped it. In other situations like this we have found that this could lead to other threads allocating into memory that another sub-system thinks that it has reserved (and owns), which can lead to various memory corruption issues. |
If pd_uncommit_memory fails, it returns a boolean. If this is unsafe, shouldn't pd_uncommit_memory have a fatal error? |
I think we should have a fatal error. I argue for that here: (You will likely have to read surrounding comments to get the context of that discussion) There was an action item in there to create a bug report for this. I'm fine to be proven wrong w.r.t. this but I think there needs to be some very convincing argument to prove that the OS leaves the JVM in a safe state when we hit the various error parts in the kernel that returns ENOMEM. |
I don't see any evidence of a bug filed to examine whether uncommit failures should always/never/sometimes be a fatal error. As the code is now, it's up to the caller, which is a situation that could go badly wrong if the caller guesses wrong. I'd rather it always be a fatal error inside our os code that does uncommit in the cases that would corrupt memory. If that's the case. Not really sure what to do about this other than close it as WNF. |
Make the fatal error a log warning and return. The metaspace commit accounting is done after this return. Tested with this always returning here, and it seems okay. Tested with this change tier 1-4 and 8.
Progress
Issue
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25720/head:pull/25720
$ git checkout pull/25720
Update a local copy of the PR:
$ git checkout pull/25720
$ git pull https://git.openjdk.org/jdk.git pull/25720/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25720
View PR using the GUI difftool:
$ git pr show -t 25720
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25720.diff
Using Webrev
Link to Webrev Comment