Skip to content

Conversation

@YaSuenag
Copy link
Member

@YaSuenag YaSuenag commented Sep 5, 2020

If Agent_OnAttach() in JVMTI agent which is attempted to load via JVMTI.agent_load dcmd is failed, it would not be unloaded.
We've discussed it on serviceability-dev. This PR is a continuation of that.

This PR also includes to call Agent_OnUnload() when Agent_OnAttach() failed.

How to reproduce:

  1. Build JVMTI agent for test
$ git clone https://github.com/YaSuenag/jvmti-examples.git
$ cd jvmti-examples/helloworld/out/build
$ cmake ../..
  1. Run JShell

  2. Load JVMTI agent via jcmd <PID> JVMTI.agent_load with "error" ("error" means Agent_OnAttach() returns JNI_ERR)

$ jcmd
89456 jdk.jshell.execution.RemoteExecutionControl 45651
89547 sun.tools.jcmd.JCmd
89436 jdk.jshell/jdk.internal.jshell.tool.JShellToolProvider
$ jcmd 89436 JVMTI.agent_load `pwd`/libhelloworld.so error
89436:
return code: -1
  1. Check loaded libraries via jcmd <PID> VM.dynlibs
$ jcmd 89436 VM.dynlibs | grep libhelloworld
7f2f8b06b000-7f2f8b06c000 r--p 00000000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so
7f2f8b06c000-7f2f8b06d000 r-xp 00001000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so
7f2f8b06d000-7f2f8b06e000 r--p 00002000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so
7f2f8b06e000-7f2f8b06f000 r--p 00002000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so
7f2f8b06f000-7f2f8b070000 rw-p 00003000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8252657: JVMTI agent is not unloaded when Agent_OnAttach is failed

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/19/head:pull/19
$ git checkout pull/19

@YaSuenag
Copy link
Member Author

YaSuenag commented Sep 5, 2020

/test

@openjdk
Copy link

openjdk bot commented Sep 5, 2020

@YaSuenag you need to get approval to run the tests in tier1 for commits up until 3cc731a

@bridgekeeper
Copy link

bridgekeeper bot commented Sep 5, 2020

👋 Welcome back ysuenaga! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Sep 5, 2020

@YaSuenag The following labels will be automatically applied to this pull request: hotspot`, `hotspot-runtime`, `serviceability. When this pull request is ready to be reviewed, an RFR email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label (add|remove) "label" command.

@openjdk openjdk bot added serviceability serviceability-dev@openjdk.org hotspot-runtime hotspot-runtime-dev@openjdk.org hotspot hotspot-dev@openjdk.org labels Sep 5, 2020
@edvbld
Copy link
Member

edvbld commented Sep 5, 2020

/test approve

@YaSuenag
Copy link
Member Author

YaSuenag commented Sep 5, 2020

/issue add JDK-8252657

@YaSuenag
Copy link
Member Author

YaSuenag commented Sep 5, 2020

/test

@openjdk openjdk bot changed the title JVMTI agent is not unloaded when Agent_OnAttach is failed 8252657: JVMTI agent is not unloaded when Agent_OnAttach is failed Sep 5, 2020
@openjdk
Copy link

openjdk bot commented Sep 5, 2020

@YaSuenag The primary solved issue for a PR is set through the PR title. Since the current title does not contain an issue reference, it will now be updated.

@openjdk
Copy link

openjdk bot commented Sep 5, 2020

Could not create test job

@dholmes-ora
Copy link
Member

/test

@openjdk
Copy link

openjdk bot commented Sep 8, 2020

Could not create test job

@YaSuenag
Copy link
Member Author

YaSuenag commented Sep 8, 2020

/test

@openjdk
Copy link

openjdk bot commented Sep 8, 2020

@YaSuenag you need to get approval to run the tests in tier1 for commits up until 3cc731a

@YaSuenag
Copy link
Member Author

YaSuenag commented Sep 8, 2020

@edvbld Can you approve me to run tier1 tests with /test PR command again?

@YaSuenag
Copy link
Member Author

/test

@openjdk
Copy link

openjdk bot commented Sep 11, 2020

@YaSuenag you need to get approval to run the tests in tier1 for commits up until 3cc731a

@YaSuenag YaSuenag marked this pull request as ready for review September 11, 2020 00:22
@openjdk openjdk bot added the rfr Pull request is ready for review label Sep 11, 2020
@mlbridge
Copy link

mlbridge bot commented Sep 11, 2020

Webrevs

@openjdk openjdk bot removed the hotspot-runtime hotspot-runtime-dev@openjdk.org label Sep 23, 2020
@sspitsyn
Copy link
Contributor

sspitsyn commented Oct 2, 2020

I was initially wrong by supporting this, and now I share David's concerns about unclear semantics of this. The questions are:

  • Q1: Is it necessary to call the Agent_OnUnload()?
  • Q2: Would it be a JVMTI spec violation to call the Agent_OnAttach() multiple times? (It seems to be the case to me.)
  • Q3: What has to be done for statically linked agent?
  • Q4: Should the agent be correctly loadable in the first place? What were the reasons its loading to fail?

Yes, at least, a CSR is needed for this.

@YaSuenag
Copy link
Member Author

YaSuenag commented Oct 2, 2020

  • Q1: Is it necessary to call the Agent_OnUnload()?

JVMTI spec of Agent_OnUnload() says this function will be called when the agent library will be unloaded by platform specific mechanism. OTOH it also says Agent_OnUnload() will be called both at VM termination and by other reasons.
The spec don't say for the case if Agent_OnAttach() would be failed. IMHO Agent_OnUnload() should be called because this PR would unload library if Agent_OnAttach() failed.

  • Q2: Would it be a JVMTI spec violation to call the Agent_OnAttach() multiple times? (It seems to be the case to me.)

Agent_OnAttach() should be called only once per attach request, but VM should accept multiple attach request for same agent library.

For example, we can add multiple -agentlib and -agentpath request as below. JVMTI agent might change behavior due to arguments or configuration file.

-agentlib:test=profile=A -agentlib:test=profile=B -agentpath:/path/to/libtest=profile=C

Agent developers should have responsibility for the behavior when more than one agent is loaded at a time.

  • Q3: What has to be done for statically linked agent?

JVMTI spec says "unless it is statically linked into the executable", so I think we can ignore about Agent_OnUnload_L() in this PR.

  • Q4: Should the agent be correctly loadable in the first place? What were the reasons its loading to fail?

Agent (Agent_OnAttach()) might fail due to error in agent logic. For example, some agents load configuration file at initialization. If the user gives wrong value, it will fail.

Yes, at least, a CSR is needed for this.

I will file CSR for this PR after this discussion.

@YaSuenag
Copy link
Member Author

If we can change the spec that agent library would not be unloaded when Agent_OnAttach() failed, we can change like webrev.00. It is simple, and similar behavior with Agent_OnLoad(). It might be prefer for JVMTI agent developers.

fisk pushed a commit to fisk/jdk that referenced this pull request Oct 28, 2020
Simplify timed join to use awaiNanos.
@YaSuenag
Copy link
Member Author

YaSuenag commented Nov 12, 2020

In case of Agent_OnLoad(), if it is failed (it returns other than zero), JVM is aborted and Agent_OnUnload() is not called. I think it is compliant with JVMTI spec of Agent_OnUnload() which says uncontrolled shutdown (aborting JVM) is an exception to this rule.

I will add CSR for this fix, but I want to discuss what we should do before. I like that Agent_OnUnload() wouldn't be called when Agent_OnAttach() is failed if we can change the spec - it is consistent and friendly with Agent_OnLoad().

@mlbridge
Copy link

mlbridge bot commented Dec 16, 2020

Mailing list message from Yasumasa Suenaga on serviceability-dev:

Hi David,

"This function will be called by the VM when the library is about to be unloaded. The library will be unloaded (unless it is statically linked into the executable) and this function will be called if some platform specific mechanism causes the unload (an unload mechanism is not specified in this document) or the library is (in effect) unloaded by the termination of the VM. VM termination includes normal termination and VM failure, including start-up failure, but not, of course, uncontrolled shutdown. An implementation may also choose to not call this function if the Agent_OnAttach/Agent_OnAttach_L function reported an error (returned a non-zero value)."

Thank you so much!
According to this, we can fix this bug simply. (We can unload agent library without `Agent_OnUnload()` call)

However I think following sentence should be removed because DLL unloading mechanism is not hooked, and also `Agent_OnUnload()` wouldn't be called by dll unloading - it depends on VM termination / failure / non-zero value from entry points.

"this function will be called if some platform specific mechanism causes the unload (an unload mechanism is not specified in this document)"

If you are OK, I will update CSR.

Cheers,

Yasumasa

On 2020/12/16 11:09, David Holmes wrote:

Hi Yasumasa,

Sorry for the delay getting back to this.

On 1/12/2020 4:41 pm, Yasumasa Suenaga wrote:

Hi David,

On 2020/12/01 14:59, David Holmes wrote:

Looking at the original webrev from September:

https://cr.openjdk.java.net/~ysuenaga/JDK-8252657/webrev.00/src/hotspot/share/prims/jvmtiExport.cpp.cdiff.html

The suggestion to unload the library if Agent_OnAttach fails seems quite reasonable - it is what happens if no Agent_OnAttach function can be found in the agent.

But the specification is lacking because it simply states any such error is "ignored" - which is an oversimplification and I think was really intended to contrast with the abort behaviour if Agent_OnLoad fails. In addition the specification indicates that Agent_OnUnload is called any time the agent library is unloaded, except for "uncontrolled shutdown". This unfortunately suggests that before unloading the library after OnAttach fails, we also call OnUnload. That seems wrong and in fact the VM does not do that - and noone seems to have complained.

Also note the specification states an agent "must export a start-up function ..." but doesn't say what happens if it doesn't do so.

My gut feeling for what the specification should say here is that if the start-up function does not exist, or the call to Agent_OnAttach reports an error, then the agent library is immediately unloaded with no attempt to call the agent shutdown function (Agent_OnUnload).

That's what I wanted to suggest!
I understand we need to consider following case in this issue, is it right?

Agent_OnLoad / Agent_OnLoad_L does not exist:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (JVM will abort)

Agent_OnLoad / Agent_OnLoad_L failed:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (JVM will abort)

These cases are already well covered.

Agent_OnAttach does not exist:
???? - Agent_OnUnload is not called
???? - DLL is unloaded

Agent_OnAttach_L does not exist:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (static link)

These cases are kind of implicitly covered, but could be clarified.

Agent_OnAttach failed:
???? - Agent_OnUnload is not called
???? - DLL is unloaded

Agent_OnAttach_L faled:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (static link)

These are the problematic cases.

But with my "compatibility glasses" on this may be too strong to retrofit to the specification as other VMs may behave differently. So as I stated in the CSR request we probably want to allow the current hotspot behaviour but not mandate it (unless we check with all the other VM implementations that such a specification is okay with them).

Ok, for example, can we change the spec as following?
(Of course, this is not all)

```
diff --git a/src/hotspot/share/prims/jvmti.xml b/src/hotspot/share/prims/jvmti.xml
index 44553b8065f..57c87b1a71b 100644
--- a/src/hotspot/share/prims/jvmti.xml
+++ b/src/hotspot/share/prims/jvmti.xml
@@ -688,7 +688,8 @@ Agent_OnUnload_L(JavaVM *vm)</example>
????? mechanism causes the unload (an unload mechanism is not specified in this document)
????? or the library is (in effect) unloaded by the termination of the VM whether through
????? normal termination or VM failure, including start-up failure.
-??? Uncontrolled shutdown is, of course, an exception to this rule.
+??? Uncontrolled shutdown is, of course, an exception to this rule, and also it might not be called
+??? when start-up function does not exist or fails (returns non-zero value).

I think this is what was originally proposed and as I said then I don't like burying this detail inside the reference to uncontrolled shutdown. We should make this much more explicit which requires some reworking of the existing far-too-long sentence.

"This function will be called by the VM when the library is about to be unloaded. The library will be unloaded (unless it is statically linked into the executable) and this function will be called if some platform specific mechanism causes the unload (an unload mechanism is not specified in this document) or the library is (in effect) unloaded by the termination of the VM. VM termination includes normal termination and VM failure, including start-up failure, but not, of course, uncontrolled shutdown. An implementation may also choose to not call this function if the Agent_OnAttach/Agent_OnAttach_L function reported an error (returned a non-zero value)."

Cheers,
David
-----

????? Note the distinction between this function and the
????? <eventlink id="VMDeath">VM Death event</eventlink>: for the VM Death event
????? to be sent, the VM must have run at least to the point of initialization and a valid
```

I agree that the more general issue of re-loading an agent is a separate issue.

Thanks!

Yasumasa

David
-----

On 1/12/2020 3:21 pm, David Holmes wrote:

On 1/12/2020 3:19 pm, David Holmes wrote:

On 1/12/2020 2:46 pm, Yasumasa Suenaga wrote:

Hi Chris, David,

Currently Agent_OnUnload() is not called when Agent_OnLoad() is failed - JVM will abort.
Should we re-think this behavior?

We should we rethink that? It is probably one of the clearest parts of the spec. If Agent_Onload fails that is considered a fatal error - end of story.

I meant, of course, "Why should we rethink that?".

David

The issue is with Agent_onAttach and how its failure should, or should not, impact Agent_OnUnload.

David
-----

https://github.com/YaSuenag/jvmti-examples/tree/master/helloworld

```
$ java -agentpath:/path/to/libhelloworld.so=error --version
Hello World from Agent_OnLoad()
?? options = error
Error occurred during initialization of VM
agent library failed to init: /path/to/libhelloworld.so
```

Thanks,

Yasumasa

On 2020/12/01 11:44, David Holmes wrote:

On 1/12/2020 11:45 am, Chris Plummer wrote:

On Fri, 2 Oct 2020 07:27:43 GMT, Yasumasa Suenaga <ysuenaga at openjdk.org> wrote:

* Q3: What has to be done for statically linked agent?

JVMTI spec says "unless it is statically linked into the executable", so I think we can ignore about Agent_OnUnload_L() in this PR.

I don't think that makes sense. If you call it for dynamically linked then you need to call it for statically linked also.

Agreed. Even though you can't physically unload the statically linked library, if it is logically unloaded by some mechanism, then Agent_OnUnload_L is supposed to be called.

David
-----

@mlbridge
Copy link

mlbridge bot commented Dec 16, 2020

Mailing list message from David Holmes on serviceability-dev:

On 16/12/2020 4:33 pm, Yasumasa Suenaga wrote:

Hi David,

"This function will be called by the VM when the library is about to
be unloaded. The library will be unloaded (unless it is statically
linked into the executable) and this function will be called if some
platform specific mechanism causes the unload (an unload mechanism is
not specified in this document) or the library is (in effect) unloaded
by the termination of the VM. VM termination includes normal
termination and VM failure, including start-up failure, but not, of
course, uncontrolled shutdown. An implementation may also choose to
not call this function if the Agent_OnAttach/Agent_OnAttach_L function
reported an error (returned a non-zero value)."

Thank you so much!
According to this, we can fix this bug simply. (We can unload agent
library without `Agent_OnUnload()` call)

Yes I think it is reasonable to allow this. But of course there must be
a general consensus.

However I think following sentence should be removed because DLL
unloading mechanism is not hooked, and also `Agent_OnUnload()` wouldn't
be called by dll unloading - it depends on VM termination / failure /
non-zero value from entry points.

"this function will be called if some platform specific mechanism causes
the unload (an unload mechanism is not specified in this document)"

I'm not sure you mean exactly by DLL unloading mechanism is not hooked.
But that sentence has to remain because it makes it clear that the
decision as to when unloading occurs is up to an implementation and is
not defined by the specification.

Cheers,
David
-----

If you are OK, I will update CSR.

Cheers,

Yasumasa

On 2020/12/16 11:09, David Holmes wrote:

Hi Yasumasa,

Sorry for the delay getting back to this.

On 1/12/2020 4:41 pm, Yasumasa Suenaga wrote:

Hi David,

On 2020/12/01 14:59, David Holmes wrote:

Looking at the original webrev from September:

https://cr.openjdk.java.net/~ysuenaga/JDK-8252657/webrev.00/src/hotspot/share/prims/jvmtiExport.cpp.cdiff.html

The suggestion to unload the library if Agent_OnAttach fails seems
quite reasonable - it is what happens if no Agent_OnAttach function
can be found in the agent.

But the specification is lacking because it simply states any such
error is "ignored" - which is an oversimplification and I think was
really intended to contrast with the abort behaviour if Agent_OnLoad
fails. In addition the specification indicates that Agent_OnUnload
is called any time the agent library is unloaded, except for
"uncontrolled shutdown". This unfortunately suggests that before
unloading the library after OnAttach fails, we also call OnUnload.
That seems wrong and in fact the VM does not do that - and noone
seems to have complained.

Also note the specification states an agent "must export a start-up
function ..." but doesn't say what happens if it doesn't do so.

My gut feeling for what the specification should say here is that if
the start-up function does not exist, or the call to Agent_OnAttach
reports an error, then the agent library is immediately unloaded
with no attempt to call the agent shutdown function (Agent_OnUnload).

That's what I wanted to suggest!
I understand we need to consider following case in this issue, is it
right?

Agent_OnLoad / Agent_OnLoad_L does not exist:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (JVM will abort)

Agent_OnLoad / Agent_OnLoad_L failed:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (JVM will abort)

These cases are already well covered.

Agent_OnAttach does not exist:
???? - Agent_OnUnload is not called
???? - DLL is unloaded

Agent_OnAttach_L does not exist:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (static link)

These cases are kind of implicitly covered, but could be clarified.

Agent_OnAttach failed:
???? - Agent_OnUnload is not called
???? - DLL is unloaded

Agent_OnAttach_L faled:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (static link)

These are the problematic cases.

But with my "compatibility glasses" on this may be too strong to
retrofit to the specification as other VMs may behave differently.
So as I stated in the CSR request we probably want to allow the
current hotspot behaviour but not mandate it (unless we check with
all the other VM implementations that such a specification is okay
with them).

Ok, for example, can we change the spec as following?
(Of course, this is not all)

```
diff --git a/src/hotspot/share/prims/jvmti.xml
b/src/hotspot/share/prims/jvmti.xml
index 44553b8065f..57c87b1a71b 100644
--- a/src/hotspot/share/prims/jvmti.xml
+++ b/src/hotspot/share/prims/jvmti.xml
@@ -688,7 +688,8 @@ Agent_OnUnload_L(JavaVM *vm)</example>
????? mechanism causes the unload (an unload mechanism is not
specified in this document)
????? or the library is (in effect) unloaded by the termination of
the VM whether through
????? normal termination or VM failure, including start-up failure.
-??? Uncontrolled shutdown is, of course, an exception to this rule.
+??? Uncontrolled shutdown is, of course, an exception to this rule,
and also it might not be called
+??? when start-up function does not exist or fails (returns non-zero
value).

I think this is what was originally proposed and as I said then I
don't like burying this detail inside the reference to uncontrolled
shutdown. We should make this much more explicit which requires some
reworking of the existing far-too-long sentence.

"This function will be called by the VM when the library is about to
be unloaded. The library will be unloaded (unless it is statically
linked into the executable) and this function will be called if some
platform specific mechanism causes the unload (an unload mechanism is
not specified in this document) or the library is (in effect) unloaded
by the termination of the VM. VM termination includes normal
termination and VM failure, including start-up failure, but not, of
course, uncontrolled shutdown. An implementation may also choose to
not call this function if the Agent_OnAttach/Agent_OnAttach_L function
reported an error (returned a non-zero value)."

Cheers,
David
-----

????? Note the distinction between this function and the
????? <eventlink id="VMDeath">VM Death event</eventlink>: for the VM
Death event
????? to be sent, the VM must have run at least to the point of
initialization and a valid
```

I agree that the more general issue of re-loading an agent is a
separate issue.

Thanks!

Yasumasa

David
-----

On 1/12/2020 3:21 pm, David Holmes wrote:

On 1/12/2020 3:19 pm, David Holmes wrote:

On 1/12/2020 2:46 pm, Yasumasa Suenaga wrote:

Hi Chris, David,

Currently Agent_OnUnload() is not called when Agent_OnLoad() is
failed - JVM will abort.
Should we re-think this behavior?

We should we rethink that? It is probably one of the clearest
parts of the spec. If Agent_Onload fails that is considered a
fatal error - end of story.

I meant, of course, "Why should we rethink that?".

David

The issue is with Agent_onAttach and how its failure should, or
should not, impact Agent_OnUnload.

David
-----

https://github.com/YaSuenag/jvmti-examples/tree/master/helloworld

```
$ java -agentpath:/path/to/libhelloworld.so=error --version
Hello World from Agent_OnLoad()
?? options = error
Error occurred during initialization of VM
agent library failed to init: /path/to/libhelloworld.so
```

Thanks,

Yasumasa

On 2020/12/01 11:44, David Holmes wrote:

On 1/12/2020 11:45 am, Chris Plummer wrote:

On Fri, 2 Oct 2020 07:27:43 GMT, Yasumasa Suenaga
<ysuenaga at openjdk.org> wrote:

* Q3: What has to be done for statically linked agent?

JVMTI spec says "unless it is statically linked into the
executable", so I think we can ignore about Agent_OnUnload_L()
in this PR.

I don't think that makes sense. If you call it for dynamically
linked then you need to call it for statically linked also.

Agreed. Even though you can't physically unload the statically
linked library, if it is logically unloaded by some mechanism,
then Agent_OnUnload_L is supposed to be called.

David
-----

@mlbridge
Copy link

mlbridge bot commented Dec 16, 2020

Mailing list message from Yasumasa Suenaga on serviceability-dev:

Hi David,

I'm not sure you mean exactly by DLL unloading mechanism is not hooked.

I want to say we do not hook `FreeLibrary()` on Windows and `dlclose()` on Linux.
Can we describe ""this function may be called - " at here?

Cheers,
Yasumasa

On 2020/12/16 15:48, David Holmes wrote:

On 16/12/2020 4:33 pm, Yasumasa Suenaga wrote:

Hi David,

"This function will be called by the VM when the library is about to be unloaded. The library will be unloaded (unless it is statically linked into the executable) and this function will be called if some platform specific mechanism causes the unload (an unload mechanism is not specified in this document) or the library is (in effect) unloaded by the termination of the VM. VM termination includes normal termination and VM failure, including start-up failure, but not, of course, uncontrolled shutdown. An implementation may also choose to not call this function if the Agent_OnAttach/Agent_OnAttach_L function reported an error (returned a non-zero value)."

Thank you so much!
According to this, we can fix this bug simply. (We can unload agent library without `Agent_OnUnload()` call)

Yes I think it is reasonable to allow this. But of course there must be a general consensus.

However I think following sentence should be removed because DLL unloading mechanism is not hooked, and also `Agent_OnUnload()` wouldn't be called by dll unloading - it depends on VM termination / failure / non-zero value from entry points.

"this function will be called if some platform specific mechanism causes the unload (an unload mechanism is not specified in this document)"

I'm not sure you mean exactly by DLL unloading mechanism is not hooked. But that sentence has to remain because it makes it clear that the decision as to when unloading occurs is up to an implementation and is not defined by the specification.

Cheers,
David
-----

If you are OK, I will update CSR.

Cheers,

Yasumasa

On 2020/12/16 11:09, David Holmes wrote:

Hi Yasumasa,

Sorry for the delay getting back to this.

On 1/12/2020 4:41 pm, Yasumasa Suenaga wrote:

Hi David,

On 2020/12/01 14:59, David Holmes wrote:

Looking at the original webrev from September:

https://cr.openjdk.java.net/~ysuenaga/JDK-8252657/webrev.00/src/hotspot/share/prims/jvmtiExport.cpp.cdiff.html

The suggestion to unload the library if Agent_OnAttach fails seems quite reasonable - it is what happens if no Agent_OnAttach function can be found in the agent.

But the specification is lacking because it simply states any such error is "ignored" - which is an oversimplification and I think was really intended to contrast with the abort behaviour if Agent_OnLoad fails. In addition the specification indicates that Agent_OnUnload is called any time the agent library is unloaded, except for "uncontrolled shutdown". This unfortunately suggests that before unloading the library after OnAttach fails, we also call OnUnload. That seems wrong and in fact the VM does not do that - and noone seems to have complained.

Also note the specification states an agent "must export a start-up function ..." but doesn't say what happens if it doesn't do so.

My gut feeling for what the specification should say here is that if the start-up function does not exist, or the call to Agent_OnAttach reports an error, then the agent library is immediately unloaded with no attempt to call the agent shutdown function (Agent_OnUnload).

That's what I wanted to suggest!
I understand we need to consider following case in this issue, is it right?

Agent_OnLoad / Agent_OnLoad_L does not exist:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (JVM will abort)

Agent_OnLoad / Agent_OnLoad_L failed:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (JVM will abort)

These cases are already well covered.

Agent_OnAttach does not exist:
???? - Agent_OnUnload is not called
???? - DLL is unloaded

Agent_OnAttach_L does not exist:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (static link)

These cases are kind of implicitly covered, but could be clarified.

Agent_OnAttach failed:
???? - Agent_OnUnload is not called
???? - DLL is unloaded

Agent_OnAttach_L faled:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (static link)

These are the problematic cases.

But with my "compatibility glasses" on this may be too strong to retrofit to the specification as other VMs may behave differently. So as I stated in the CSR request we probably want to allow the current hotspot behaviour but not mandate it (unless we check with all the other VM implementations that such a specification is okay with them).

Ok, for example, can we change the spec as following?
(Of course, this is not all)

```
diff --git a/src/hotspot/share/prims/jvmti.xml b/src/hotspot/share/prims/jvmti.xml
index 44553b8065f..57c87b1a71b 100644
--- a/src/hotspot/share/prims/jvmti.xml
+++ b/src/hotspot/share/prims/jvmti.xml
@@ -688,7 +688,8 @@ Agent_OnUnload_L(JavaVM *vm)</example>
????? mechanism causes the unload (an unload mechanism is not specified in this document)
????? or the library is (in effect) unloaded by the termination of the VM whether through
????? normal termination or VM failure, including start-up failure.
-??? Uncontrolled shutdown is, of course, an exception to this rule.
+??? Uncontrolled shutdown is, of course, an exception to this rule, and also it might not be called
+??? when start-up function does not exist or fails (returns non-zero value).

I think this is what was originally proposed and as I said then I don't like burying this detail inside the reference to uncontrolled shutdown. We should make this much more explicit which requires some reworking of the existing far-too-long sentence.

"This function will be called by the VM when the library is about to be unloaded. The library will be unloaded (unless it is statically linked into the executable) and this function will be called if some platform specific mechanism causes the unload (an unload mechanism is not specified in this document) or the library is (in effect) unloaded by the termination of the VM. VM termination includes normal termination and VM failure, including start-up failure, but not, of course, uncontrolled shutdown. An implementation may also choose to not call this function if the Agent_OnAttach/Agent_OnAttach_L function reported an error (returned a non-zero value)."

Cheers,
David
-----

????? Note the distinction between this function and the
????? <eventlink id="VMDeath">VM Death event</eventlink>: for the VM Death event
????? to be sent, the VM must have run at least to the point of initialization and a valid
```

I agree that the more general issue of re-loading an agent is a separate issue.

Thanks!

Yasumasa

David
-----

On 1/12/2020 3:21 pm, David Holmes wrote:

On 1/12/2020 3:19 pm, David Holmes wrote:

On 1/12/2020 2:46 pm, Yasumasa Suenaga wrote:

Hi Chris, David,

Currently Agent_OnUnload() is not called when Agent_OnLoad() is failed - JVM will abort.
Should we re-think this behavior?

We should we rethink that? It is probably one of the clearest parts of the spec. If Agent_Onload fails that is considered a fatal error - end of story.

I meant, of course, "Why should we rethink that?".

David

The issue is with Agent_onAttach and how its failure should, or should not, impact Agent_OnUnload.

David
-----

https://github.com/YaSuenag/jvmti-examples/tree/master/helloworld

```
$ java -agentpath:/path/to/libhelloworld.so=error --version
Hello World from Agent_OnLoad()
?? options = error
Error occurred during initialization of VM
agent library failed to init: /path/to/libhelloworld.so
```

Thanks,

Yasumasa

On 2020/12/01 11:44, David Holmes wrote:

On 1/12/2020 11:45 am, Chris Plummer wrote:

On Fri, 2 Oct 2020 07:27:43 GMT, Yasumasa Suenaga <ysuenaga at openjdk.org> wrote:

* Q3: What has to be done for statically linked agent?

JVMTI spec says "unless it is statically linked into the executable", so I think we can ignore about Agent_OnUnload_L() in this PR.

I don't think that makes sense. If you call it for dynamically linked then you need to call it for statically linked also.

Agreed. Even though you can't physically unload the statically linked library, if it is logically unloaded by some mechanism, then Agent_OnUnload_L is supposed to be called.

David
-----

@mlbridge
Copy link

mlbridge bot commented Dec 18, 2020

Mailing list message from David Holmes on serviceability-dev:

On 16/12/2020 5:36 pm, Yasumasa Suenaga wrote:

Hi David,

I'm not sure you mean exactly by DLL unloading mechanism is not hooked.

I want to say we do not hook `FreeLibrary()` on Windows and `dlclose()`
on Linux.
Can we describe ""this function may be called - " at here?

I think it is clear this refers to the VM doing the unloading not
somebody issuing direct dlclose/FreeLibrary commands.

David

Cheers,
Yasumasa

On 2020/12/16 15:48, David Holmes wrote:

On 16/12/2020 4:33 pm, Yasumasa Suenaga wrote:

Hi David,

"This function will be called by the VM when the library is about to
be unloaded. The library will be unloaded (unless it is statically
linked into the executable) and this function will be called if some
platform specific mechanism causes the unload (an unload mechanism
is not specified in this document) or the library is (in effect)
unloaded by the termination of the VM. VM termination includes
normal termination and VM failure, including start-up failure, but
not, of course, uncontrolled shutdown. An implementation may also
choose to not call this function if the
Agent_OnAttach/Agent_OnAttach_L function reported an error (returned
a non-zero value)."

Thank you so much!
According to this, we can fix this bug simply. (We can unload agent
library without `Agent_OnUnload()` call)

Yes I think it is reasonable to allow this. But of course there must
be a general consensus.

However I think following sentence should be removed because DLL
unloading mechanism is not hooked, and also `Agent_OnUnload()`
wouldn't be called by dll unloading - it depends on VM termination /
failure / non-zero value from entry points.

"this function will be called if some platform specific mechanism
causes the unload (an unload mechanism is not specified in this
document)"

I'm not sure you mean exactly by DLL unloading mechanism is not
hooked. But that sentence has to remain because it makes it clear that
the decision as to when unloading occurs is up to an implementation
and is not defined by the specification.

Cheers,
David
-----

If you are OK, I will update CSR.

Cheers,

Yasumasa

On 2020/12/16 11:09, David Holmes wrote:

Hi Yasumasa,

Sorry for the delay getting back to this.

On 1/12/2020 4:41 pm, Yasumasa Suenaga wrote:

Hi David,

On 2020/12/01 14:59, David Holmes wrote:

Looking at the original webrev from September:

https://cr.openjdk.java.net/~ysuenaga/JDK-8252657/webrev.00/src/hotspot/share/prims/jvmtiExport.cpp.cdiff.html

The suggestion to unload the library if Agent_OnAttach fails seems
quite reasonable - it is what happens if no Agent_OnAttach
function can be found in the agent.

But the specification is lacking because it simply states any such
error is "ignored" - which is an oversimplification and I think
was really intended to contrast with the abort behaviour if
Agent_OnLoad fails. In addition the specification indicates that
Agent_OnUnload is called any time the agent library is unloaded,
except for "uncontrolled shutdown". This unfortunately suggests
that before unloading the library after OnAttach fails, we also
call OnUnload. That seems wrong and in fact the VM does not do
that - and noone seems to have complained.

Also note the specification states an agent "must export a
start-up function ..." but doesn't say what happens if it doesn't
do so.

My gut feeling for what the specification should say here is that
if the start-up function does not exist, or the call to
Agent_OnAttach reports an error, then the agent library is
immediately unloaded with no attempt to call the agent shutdown
function (Agent_OnUnload).

That's what I wanted to suggest!
I understand we need to consider following case in this issue, is
it right?

Agent_OnLoad / Agent_OnLoad_L does not exist:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (JVM will abort)

Agent_OnLoad / Agent_OnLoad_L failed:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (JVM will abort)

These cases are already well covered.

Agent_OnAttach does not exist:
???? - Agent_OnUnload is not called
???? - DLL is unloaded

Agent_OnAttach_L does not exist:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (static link)

These cases are kind of implicitly covered, but could be clarified.

Agent_OnAttach failed:
???? - Agent_OnUnload is not called
???? - DLL is unloaded

Agent_OnAttach_L faled:
???? - Agent_OnUnload is not called
???? - DLL is not unloaded (static link)

These are the problematic cases.

But with my "compatibility glasses" on this may be too strong to
retrofit to the specification as other VMs may behave differently.
So as I stated in the CSR request we probably want to allow the
current hotspot behaviour but not mandate it (unless we check with
all the other VM implementations that such a specification is okay
with them).

Ok, for example, can we change the spec as following?
(Of course, this is not all)

```
diff --git a/src/hotspot/share/prims/jvmti.xml
b/src/hotspot/share/prims/jvmti.xml
index 44553b8065f..57c87b1a71b 100644
--- a/src/hotspot/share/prims/jvmti.xml
+++ b/src/hotspot/share/prims/jvmti.xml
@@ -688,7 +688,8 @@ Agent_OnUnload_L(JavaVM *vm)</example>
????? mechanism causes the unload (an unload mechanism is not
specified in this document)
????? or the library is (in effect) unloaded by the termination of
the VM whether through
????? normal termination or VM failure, including start-up failure.
-??? Uncontrolled shutdown is, of course, an exception to this rule.
+??? Uncontrolled shutdown is, of course, an exception to this
rule, and also it might not be called
+??? when start-up function does not exist or fails (returns
non-zero value).

I think this is what was originally proposed and as I said then I
don't like burying this detail inside the reference to uncontrolled
shutdown. We should make this much more explicit which requires some
reworking of the existing far-too-long sentence.

"This function will be called by the VM when the library is about to
be unloaded. The library will be unloaded (unless it is statically
linked into the executable) and this function will be called if some
platform specific mechanism causes the unload (an unload mechanism
is not specified in this document) or the library is (in effect)
unloaded by the termination of the VM. VM termination includes
normal termination and VM failure, including start-up failure, but
not, of course, uncontrolled shutdown. An implementation may also
choose to not call this function if the
Agent_OnAttach/Agent_OnAttach_L function reported an error (returned
a non-zero value)."

Cheers,
David
-----

????? Note the distinction between this function and the
????? <eventlink id="VMDeath">VM Death event</eventlink>: for the
VM Death event
????? to be sent, the VM must have run at least to the point of
initialization and a valid
```

I agree that the more general issue of re-loading an agent is a
separate issue.

Thanks!

Yasumasa

David
-----

On 1/12/2020 3:21 pm, David Holmes wrote:

On 1/12/2020 3:19 pm, David Holmes wrote:

On 1/12/2020 2:46 pm, Yasumasa Suenaga wrote:

Hi Chris, David,

Currently Agent_OnUnload() is not called when Agent_OnLoad() is
failed - JVM will abort.
Should we re-think this behavior?

We should we rethink that? It is probably one of the clearest
parts of the spec. If Agent_Onload fails that is considered a
fatal error - end of story.

I meant, of course, "Why should we rethink that?".

David

The issue is with Agent_onAttach and how its failure should, or
should not, impact Agent_OnUnload.

David
-----

https://github.com/YaSuenag/jvmti-examples/tree/master/helloworld

```
$ java -agentpath:/path/to/libhelloworld.so=error --version
Hello World from Agent_OnLoad()
?? options = error
Error occurred during initialization of VM
agent library failed to init: /path/to/libhelloworld.so
```

Thanks,

Yasumasa

On 2020/12/01 11:44, David Holmes wrote:

On 1/12/2020 11:45 am, Chris Plummer wrote:

On Fri, 2 Oct 2020 07:27:43 GMT, Yasumasa Suenaga
<ysuenaga at openjdk.org> wrote:

* Q3: What has to be done for statically linked agent?

JVMTI spec says "unless it is statically linked into the
executable", so I think we can ignore about
Agent_OnUnload_L() in this PR.

I don't think that makes sense. If you call it for
dynamically linked then you need to call it for statically
linked also.

Agreed. Even though you can't physically unload the statically
linked library, if it is logically unloaded by some mechanism,
then Agent_OnUnload_L is supposed to be called.

David
-----

@YaSuenag
Copy link
Member Author

@dholmes-ora Thank you for explanation! I updated CSR. Could you review it?

@openjdk openjdk bot removed the csr Pull request needs approved CSR before integration label Jan 14, 2021
@YaSuenag
Copy link
Member Author

CSR has been approved. I updated jvmti.xml to follow it (I haven't make any changes in other files in latest commit)
Could you review again?

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Yasumasa,

Spec update looks good.

VM tweak to call dll_unload on failure looks good.

I don't understand the test change though.

Thanks,
David

@openjdk
Copy link

openjdk bot commented Jan 14, 2021

@YaSuenag This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8252657: JVMTI agent is not unloaded when Agent_OnAttach is failed

Reviewed-by: dholmes, sspitsyn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 18 new commits pushed to the master branch:

  • 4f881ba: 8258652: Assert in JvmtiThreadState::cur_stack_depth() can noticeably slow down debugging single stepping
  • d18d26e: 8259350: Add some internal debugging APIs to the debug agent
  • a6b2162: 8259278: Optimize Vector API slice and unslice operations
  • da6bcf9: 8255019: Shenandoah: Split STW and concurrent mark into separate classes
  • aba3431: 8258956: Memory Leak in StringCoding on ThreadLocal resultCached StringCoding.Result
  • 8554fe6: 8253866: Security Libs Terminology Refresh
  • c2a3c7e: 8259727: Remove redundant "target" arguments to methods in Links
  • 1620664: 8259723: Move Table class to formats.html package
  • 38a1201: 8258912: Remove JVM options CountJNICalls and CountJVMCalls
  • be57cf1: 8226416: MonitorUsedDeflationThreshold can cause repeated async deflation requests
  • ... and 8 more: https://git.openjdk.java.net/jdk/compare/51e14f2e2ab322ef551437845925e08a997c74fc...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 14, 2021
@YaSuenag
Copy link
Member Author

Hi Yasumasa,

Spec update looks good.

VM tweak to call dll_unload on failure looks good.

Thanks!

I don't understand the test change though.

To check the agent is unloaded, call VM.dynlibs dcmd and check whether agent path is not contained.
libReturnError.so would be loaded only once in AttachReturnError.java, so it is expected to unload when Agent_OnAttach() fails.

@mlbridge
Copy link

mlbridge bot commented Jan 14, 2021

Mailing list message from David Holmes on serviceability-dev:

On 14/01/2021 6:29 pm, Yasumasa Suenaga wrote:

On Thu, 14 Jan 2021 07:19:06 GMT, David Holmes <dholmes at openjdk.org> wrote:

Hi Yasumasa,

Spec update looks good.

VM tweak to call dll_unload on failure looks good.

Thanks!

I don't understand the test change though.

To check the agent is unloaded, call VM.dynlibs dcmd and check whether agent path is not contained.
libReturnError.so would be loaded only once in AttachReturnError.java, so it is expected to unload when `Agent_OnAttach()` fails.

Got it! Thanks.

David
-----

Copy link
Contributor

@sspitsyn sspitsyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Yasumasa,

Both the CSR and the fix look good.
Thank you for your patience with the process.

Thanks,
Serguei

@YaSuenag
Copy link
Member Author

/integrate

@openjdk openjdk bot closed this Jan 15, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 15, 2021
@openjdk
Copy link

openjdk bot commented Jan 15, 2021

@YaSuenag Since your change was applied there have been 21 commits pushed to the master branch:

  • e3b548a: 8257736: InputStream from BodyPublishers.ofInputStream() leaks when IOE happens
  • 978bed6: 8259522: Apply java.io.Serial annotations in java.desktop
  • bf28f92: 8259713: Fix comments about ResetNoHandleMark in deoptimization
  • 4f881ba: 8258652: Assert in JvmtiThreadState::cur_stack_depth() can noticeably slow down debugging single stepping
  • d18d26e: 8259350: Add some internal debugging APIs to the debug agent
  • a6b2162: 8259278: Optimize Vector API slice and unslice operations
  • da6bcf9: 8255019: Shenandoah: Split STW and concurrent mark into separate classes
  • aba3431: 8258956: Memory Leak in StringCoding on ThreadLocal resultCached StringCoding.Result
  • 8554fe6: 8253866: Security Libs Terminology Refresh
  • c2a3c7e: 8259727: Remove redundant "target" arguments to methods in Links
  • ... and 11 more: https://git.openjdk.java.net/jdk/compare/51e14f2e2ab322ef551437845925e08a997c74fc...master

Your commit was automatically rebased without conflicts.

Pushed as commit 90960c5.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@YaSuenag YaSuenag deleted the JDK-8252657 branch January 16, 2021 13:06
zzambers pushed a commit to zzambers/jdk that referenced this pull request Feb 28, 2023
caojoshua pushed a commit to caojoshua/jdk that referenced this pull request Apr 6, 2023
c = log(n) * log(log(n)) * 3 / 2, where n is available processor.
it's only reasonable to use CICompilerCount=1 in debug.

Co-authored-by: Xin Liu <xxinliu@amazon.com>
caojoshua pushed a commit to caojoshua/jdk that referenced this pull request Apr 10, 2023
c = log(n) * log(log(n)) * 3 / 2, where n is available processor.
it's only reasonable to use CICompilerCount=1 in debug.

Co-authored-by: Xin Liu <xxinliu@amazon.com>
gnu-andrew pushed a commit to gnu-andrew/jdk that referenced this pull request Aug 18, 2023
openjdk-notifier bot pushed a commit that referenced this pull request Apr 15, 2024
* Even more review comments

* Re-write of atomic copy loops

* Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark}
lahodaj pushed a commit to lahodaj/jdk that referenced this pull request Mar 17, 2025
…enjdk#19)

* Erase pattern signature for records

* Erase pattern signature for deconstructors
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org

Development

Successfully merging this pull request may close these issues.

5 participants