Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JNI methods cannot be attached inside clinit #277

Closed
mikehearn opened this issue Jun 5, 2014 · 9 comments · Fixed by #283
Closed

JNI methods cannot be attached inside clinit #277

mikehearn opened this issue Jun 5, 2014 · 9 comments · Fixed by #283

Comments

@mikehearn
Copy link
Contributor

Consider FooClass. -> System.loadLibrary() -> libfoo -> clazz = JNI_FindClass("FooClass"); JNI_RegisterNatives(clazz, "nativeMethod", ...);

On Avian the JNI_FindClass call fails and returns null, presumably because it's called inside the class static initialiser so the class didn't init yet. This means that the scrypt library cannot initialise and indeed crashes the VM because it ends up passing a null class ptr to RegisterNatives.

@dicej
Copy link
Member

dicej commented Jun 5, 2014

Can you provide a complete test case? I just tried to reproduce it with the following patch, but was unable to do so.

diff --git a/test/JNI.java b/test/JNI.java
index 955bfc9..8c99405 100644
--- a/test/JNI.java
+++ b/test/JNI.java
@@ -5,6 +5,7 @@ import java.lang.reflect.Field;
 public class JNI {
   static {
     System.loadLibrary("test");
+    expect(JNI.class == findClass("JNI"));
   }

   private static void expect(boolean v) {
@@ -59,6 +60,8 @@ public class JNI {

   private static native Object testLocalRef(Object o);

+  private static native Class findClass(String name);
+
   public static int method242() { return 242; }

   public static final int field950 = 950;
diff --git a/test/jni.cpp b/test/jni.cpp
index 8b8d6dc..8dfd9b9 100644
--- a/test/jni.cpp
+++ b/test/jni.cpp
@@ -107,6 +107,19 @@ Java_JNI_testLocalRef(JNIEnv* e, jclass, jobject o)
   return e->NewLocalRef(o);
 }

+extern "C" JNIEXPORT jclass JNICALL
+Java_JNI_findClass(JNIEnv* e, jclass, jstring name)
+{
+  const char* chars = e->GetStringUTFChars(name, 0);
+  if (chars) {
+    jclass c = e->FindClass(chars);
+    e->ReleaseStringUTFChars(name, chars);
+    return c;
+  } else {
+    return 0;
+  }
+}
+
 extern "C" JNIEXPORT jobject JNICALL
 Java_Buffers_allocateNative(JNIEnv* e, jclass, jint capacity)
 {

@mikehearn
Copy link
Contributor Author

OK, so it turns out that I am being confused by the stack traces Avian generates. Sorry, I don't have much experience of stepping through a JITC VM in a debugger.

The stack trace I get looks like this, after I added an assert that c is != NULL at the top of RegisterNatives. Because this argument comes directly from a call to FindClass I assumed it had returned null.

thread #1: tid = 0x1a03, 0x00007fff8c3ac866 libsystem_kernel.dylib__pthread_kill + 10, stop reason = signal SIGABRT frame #0: 0x00007fff8c3ac866 libsystem_kernel.dylib__pthread_kill + 10
frame #1: 0x00007fff8f28a35c libsystem_pthread.dylibpthread_kill + 92 frame #2: 0x00007fff869d9b1a libsystem_c.dylibabort + 125
frame #3: 0x00000001000047b9 avianavian::system::crash() + 9 at signal.cpp:157 frame #4: 0x00000001000037b3 avianabort(this=0x0000000104907f30) + 17 at posix.cpp:898
frame #5: 0x000000010003820f avianvoid avian::util::abort<vm::Thread*>(t=0x0000000105009a08) + 47 at abort.h:33 frame #6: 0x0000000100038253 avianvoid avian::util::expectvm::Thread*(t=0x0000000105009a08, v=false) + 51 at abort.h:40
frame #7: 0x0000000100038289 avianvoid avian::util::assert<vm::Thread*>(t=0x0000000105009a08, v=false) + 41 at abort.h:49 frame #8: 0x00000001000b7aab avianRegisterNatives(t=0x0000000105009a08, c=0x0000000000000000, methods=0x0000000104bd8060, methodCount=1) + 59 at jnienv.cpp:3309
frame #9: 0x0000000104bd53b0 scrypt5558926548229267165libJNI_OnLoad + 80 frame #10: 0x0000000100098a27 avianvm::runOnLoadIfFound(t=0x0000000105009a08, library=0x0000000107602d90) + 103 at classpath-common.h:139
frame #11: 0x00000001000a132f avianvm::loadLibrary(t=0x0000000105009a08, path=0x0000000105000111, name=0x00007fff5fbfcfc0, mapName=false, runOnLoad=true, throw_=true) + 1439 at classpath-common.h:239 frame #12: 0x00000001000a16b1 avianloadLibrary(t=0x0000000105009a08, unnamed_arg=0x000000010a7c5ab0, arguments=0x00007fff5fbfd1d0) + 705 at classpath-openjdk.cpp:1898
frame #13: 0x0000000100073540 avianinvokeNativeFast(t=0x0000000105009a08, method=0x000000010a7c5ab0, function=0x00000001000a13f0) + 224 at compile.cpp:7157 frame #14: 0x000000010008c4cb avianinvokeNative2(t=0x0000000105009a08, method=0x000000010a7c5ab0) + 139 at compile.cpp:7334
frame #15: 0x00000001000715be avian`invokeNative(t=0x0000000105009a08) + 430 at compile.cpp:7368

Stepping through in the debugger though, I end up quite confused. What I actually see is:

env->FindClass -> findClass -> resolveClass -> resolveSystemClass which fails to find the class and then calls vm::throw with an exception. After attempting to step over the vm::throw statement I end up hitting that assert, so I assume at this point Avian unwinds the stack and I don't get to see what caught the exception, presumably something deep inside the JNI machinery.

So the issue is quite clearly that it's getting stuck trying to find the class that's in the process of classloading. To see what I see, do this:

git clone https://github.com/wg/scrypt.git
cd scrypt
make
mvn package

$AVIAN_HOME/build/darwin-x86_64-debug-openjdk-src/avian -cp $HOME/.m2/repository/junit/junit/4.11/junit-4.11.jar:/$HOME/.m2/repository/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar:target/scrypt-1.4.0.jar:target/test-classes org.junit.runner.JUnitCore com.lambdaworks.crypto.test.SCryptTest

@joshuawarner32
Copy link
Collaborator

IIRC, when you throw an exception in either vm-internal code or JNI code, it doesn't immediately unwind the stack - rather, it sets an exception flag on the thread, and waits until we return out of native code before unwinding the java stack. The expectation is that the native code will detect the presence of the exception (either by reading the flag, or from return codes) and return as quickly as possible.

In short, I think seeing a throw in native code and then returning null is a normal course of action.

@mikehearn
Copy link
Contributor Author

That doesn't seem to be what happens in lldb, but debuggers are weird :) I can well believe actual control flow is not the same as what appears to happen inside the debugger.

At any rate, I'm not sure why your test passes and yet this code goes wrong. It feels like the same issue! :)

@joshuawarner32
Copy link
Collaborator

Sorry, I think I just confused the issue.

I was referring to how the scrypt code should probably read something like:

    //...
    jclass cls = (*env)->FindClass(env, "com/lambdaworks/crypto/SCrypt");
    if((*env)->ExceptionCheck()) return -1;
    int r = (*env)->RegisterNatives(env, cls, methods, 1);
    //...

That issue aside, my best guess would be that the problem is that in the JNI_OnLoad, we haven't registered any of the application class loaders - and so the only one available is the system one, which only loads from the internal jar.

@dicej, does that sound plausible?

@mikehearn
Copy link
Contributor Author

Ah yes. Scrypt lib does indeed load a library from its own jar. I thought it unpacked it first but maybe not.

@dicej
Copy link
Member

dicej commented Jun 23, 2014

On Mon, 23 Jun 2014, Mike Hearn wrote:

So the issue is quite clearly that it's getting stuck trying to find the
class that's in the process of classloading. To see what I see, do this:

git clone https://github.com/wg/scrypt.git
cd scrypt
make
mvn package

$AVIAN_HOME/build/darwin-x86_64-debug-openjdk-src/avian -cp $HOME/.m2/repository/junit/junit/4.11/junit-4.11.jar:/$HOME/.m2/repository/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar:target/scrypt-1.4.0.jar:target/test-classes org.junit.runner.JUnitCore com.lambdaworks.crypto.test.SCryptTest

Thanks for the test case. The problem is that FindClass uses the class
loader of the calling class, which is normally correct, except that it's
java.lang.ClassLoader in the case of JNI_OnLoad, which means it's the
system class loader. Since com.lambdaworks.crypto.SCrypt does not exist
in the system class loader (only the application class loader, it's not
found).

Looking at openjdk/hotspot/src/share/vm/prims/jni.cpp, I see that HotSpot
considers JNI_OnLoad and JNI_OnUnload special cases and uses an
alternative strategy for determining the calling class loader. Avian will
need to do the same. I'll give it a shot when I have a chance.

@mikehearn
Copy link
Contributor Author

Thanks! I can see there's an easy workaround on the scrypt library side. Let me know if I can help in any way.

@mikehearn
Copy link
Contributor Author

Thanks guys, you really do rock.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants