[CoreCLR] Automatically detach current thread from JNI #10316

simonrozsival · 2025-07-22T08:44:02Z

In #10198 I added code which attaches current thread to JNI (https://github.com/dotnet/android/pull/10198/files#diff-fc0414b3741163879db7993fd1fb42fa76e5305bb54c0e9478e077b3094e7aa7R42-R46). If this code is called from a thread pool thread, the thread won't be detached from JNI when exitting.

This PR adds the recommended steps to automatically detach the thread from JNI when exitting: https://developer.android.com/training/articles/perf-jni#threads

/cc @grendello @jonathanpeppers

jonathanpeppers

Is there a way to add a test for this?

Or did MAUI's tests crash on exit? So, we'd need to somehow run an extra test suite?

jonpryor · 2025-07-22T13:58:54Z

What's odd is that I think we have a test for this? Maybe?

From the abort message in #10314:

7-21 15:25:22.624  1610  1610 F DEBUG   : Abort message: 'art/runtime/thread.cc:1223] Native thread exited without calling DetachCurrentThread: Thread[15,tid=4355,Native,Thread*=0x7f3c09434600,peer=0x12c89580,"Thread-234"]'

which suggests that if we:

Create a new Thread
Do some JNI work on that thread, and
Exit the thread

then the assertion should be triggered, no?

We have tests that do that, e.g.:

android/tests/Mono.Android-Tests/Mono.Android-Tests/Java.Interop/JnienvTest.cs

Lines 329 to 380 in 510fc08

    
           [Test] 
        
           public void ConversionsAndThreadsAndInstanceMappingsOhMy () 
        
           { 
        
           	IntPtr lrefJliArray = JNIEnv.NewObjectArray<int> (new[]{1}); 
        
           	IntPtr grefJliArray = JNIEnv.NewGlobalRef (lrefJliArray); 
        
           	JNIEnv.DeleteLocalRef (lrefJliArray); 
        
           	Java.Lang.Object[] jarray = (Java.Lang.Object[]) 
        
           		JNIEnv.GetArray (grefJliArray, JniHandleOwnership.DoNotTransfer, typeof(Java.Lang.Object)); 
        
           	Exception ignore_t1 = null; 
        
           	Exception ignore_t2 = null; 
        
           	var t1 = new Thread (() => { 
        
           		int[] output_array1 = new int[1]; 
        
           		for (int i = 0; i < 2000; ++i) { 
        
           			Console.WriteLine ("# t1 iter: {0}", i); 
        
           			try { 
        
           				JNIEnv.CopyObjectArray (grefJliArray, output_array1); 
        
           			} catch (Exception e) { 
        
           				ignore_t1 = e; 
        
           				break; 
        
           			} 
        
           		} 
        
           	}); 
        
           	var t2 = new Thread (() => { 
        
           		for (int i = 0; i < 2000; ++i) { 
        
           			Console.WriteLine ("# t2 iter: {0}", i); 
        
           			try { 
        
           				JNIEnv.GetArray<int>(jarray); 
        
           			} catch (Exception e) { 
        
           				ignore_t2 = e; 
        
           				break; 
        
           			} 
        
           		} 
        
           	}); 
        
           	t1.Start (); 
        
           	t2.Start (); 
        
           	t1.Join (); 
        
           	t2.Join (); 
        
           	for (int i = 0; i < jarray.Length; ++i) { 
        
           		jarray [i].Dispose (); 
        
           		jarray [i]  = null; 
        
           	} 
        
           	JNIEnv.DeleteGlobalRef (grefJliArray); 
        
           	Assert.IsNull (ignore_t1, string.Format ("No exception should be thrown [t1]! Got: {0}", ignore_t1)); 
        
           	Assert.IsNull (ignore_t2, string.Format ("No exception should be thrown [t2]! Got: {0}", ignore_t2)); 
        
           }

so why didn't we hit this before?

Are thread pool threads different from "normal" new System.Threading.Thread()?

Do we need to do "different" kinds of JNI calls on the created thread to trigger the assertion? It likely wouldn't be a bad idea to have a test that does:

var t = new Thread(() => {
    var list = new Java.Util.ArrayList();
    list.Add(new Java.Lang.String("a");
    list.Add(new Java.Lang.Integer(42);
});
t.Start();
t.Join();

and do a bit more than straight JNIEnv calls on the created thread.

The assert message, as-is, does not imply to me that we need to worry about app exit. (And that's ignoring the fact that "app exit" is a very nebulous concept on Android in the first place!)

grendello · 2025-08-04T09:37:39Z

@jonpryor I think the situation here is a bit different, at least from looking at the crash stack trace attached to #10314 (comment)

07-21 14:55:47.473  2121  2121 F DEBUG   :     #00 pc 0000000000539ff7  /system/lib64/libart.so (art::DumpCheckpoint::Run(art::Thread*)+439)
07-21 14:55:47.473  2121  2121 F DEBUG   :     #01 pc 000000000053cac6  /system/lib64/libart.so (art::ThreadList::RunCheckpoint(art::Closure*)+278)
07-21 14:55:47.473  2121  2121 F DEBUG   :     #02 pc 000000000053ddf5  /system/lib64/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+357)
07-21 14:55:47.473  2121  2121 F DEBUG   :     #03 pc 00000000004ffea4  /system/lib64/libart.so (art::Runtime::Abort()+660)
07-21 14:55:47.473  2121  2121 F DEBUG   :     #04 pc 0000000000178d71  /system/lib64/libart.so (art::LogMessage::~LogMessage()+2865)
07-21 14:55:47.473  2121  2121 F DEBUG   :     #05 pc 00000000005276ff  /system/lib64/libart.so (art::Thread::ThreadExitCallback(void*)+223)
07-21 14:55:47.473  2121  2121 F DEBUG   :     #06 pc 000000000008590e  /system/lib64/libc.so (pthread_key_clean_all()+142)
07-21 14:55:47.473  2121  2121 F DEBUG   :     #07 pc 000000000008549b  /system/lib64/libc.so (pthread_exit+75)
07-21 14:55:47.473  2121  2121 F DEBUG   :     #08 pc 0000000000084ef6  /system/lib64/libc.so (__pthread_start(void*)+54)
07-21 14:55:47.473  2121  2121 F DEBUG   :     #09 pc 00000000000296eb  /system/lib64/libc.so (__start_thread+11)
07-21 14:55:47.473  2121  2121 F DEBUG   :     #10 pc 000000000001ce55  /system/lib64/libc.so (__bionic_clone+53)

This is 100% native + ART. @simonrozsival might be onto something with this PR (btw, why was it closed?) the thread mentioned in the abort message (PID 4651) is created to run a test:

07-21 14:55:47.368  4496  4651 D monodroid: [precompiled] p/invoke found
07-21 14:55:47.375  4496  4651 D monodroid: clr_external_assembly_probe ("xunit.assert.ni.dll"...)
07-21 14:55:47.375  4496  4651 D monodroid: clr_external_assembly_probe ("xunit.assert.dll"...)
07-21 14:55:47.383  4496  4651 I DOTNET  :      [PASS] Charge_State
07-21 14:55:47.393  4496  4651 I DOTNET  :      [PASS] Charge_Level
07-21 14:55:47.395  4496  4651 I DOTNET  :      [PASS] Charge_Power
07-21 14:55:47.396  4496  4651 I DOTNET  :      [PASS] Unsubscribe_BatteryInfoChanged_Does_Not_Crash
07-21 14:55:47.399  4496  4651 I DOTNET  :      [PASS] App_Is_Not_Lower_Power_mode
07-21 14:55:47.407  4496  4651 W art     : Native thread exiting without having called DetachCurrentThread (maybe it's going to use a pthread_key_create destructor?): Thread[17,tid=4651,Native,Thread*=0x7fc955cf4a00,peer=0x12c5a3a0,"Thread-212"]
07-21 14:55:47.408  4496  4651 F art     : art/runtime/thread.cc:1223] Native thread exited without calling DetachCurrentThread: Thread[17,tid=4651,Native,Thread*=0x7fc955cf4a00,peer=0x12c5a3a0,"Thread-212"]
07-21 14:55:47.408  4496  4646 F DOTNET  : Aborting process.

If the GC bridge kicks in while the test is running and on the same thread that crashes, this would explain the error. Our test assumes that calling a JNI method actually attaches the thread to JNI, which might not be the case.
In the MonoVM runtime we always attach every thread automatically, this is not necessarily the case with CoreCLR (or it might be a bug in CoreCLR - we don't control how/when it attaches threads)

simonrozsival · 2025-08-04T10:20:43Z

btw, why was it closed?

I was not able to finish the work on the PR and fully test it, so I closed it for the time being. We can definitely revisit the PR.

If the GC bridge kicks in while the test is running and on the same thread that crashes, this would explain the error.

I don't know if this could be caused by the GC itself (GC has a background thread and the GC bridge has its own separate background thread, both of which are attached to JNI but both should live thoughout app's lifetime). The changes to the ensure_jnienv method affect other pieces of code, not just the GC bridge.

My hypothesis was that when this code is reached on a threadpool thread then the TP thread is attached to JNI and it is likely that this thread will be exitted at some point. This would cause the app to crash because we don't have the automatic detach machanism implemented. I wasn't able to reproduce the crash locally before I left on vacation, so I abandoned this attempt at that time.

grendello · 2025-08-04T13:06:25Z

@simonrozsival does threadpool actually attach threads to JNI? If I remember correctly, both MonoVM and CoreCLR use the same managed threadpool implementation? If it is indeed true, then the code crashing on CoreCLR but working on MonoVM would actually be a runtime issue. On MonoVM we get called back whenever any managed thread is created/destroyed, so we can attach/detach accordingly. On CoreCLR we don't have this opportunity, so it is on the runtime to do the right thing.

I used GC bridge as an example of what might occur, it's hard to tell from the stack trace that this is really the case - as you say, it can be anything that creates the thread, attaches it, but then fails to detach. If it's a managed thread, then it will exit only after the workload running it it finishes (I hope that this is the case), so there's still the chance to do the right thing there, a chance that maybe is missed for some reason?

TBH, I would love if CoreCLR called us back the same way MonoVM does on any thread creation/distruction, so that we can control things like JNI/JVM thread attachment/detachment.

simonrozsival · 2025-08-04T13:16:53Z

@simonrozsival does threadpool actually attach threads to JNI? If I remember correctly, both MonoVM and CoreCLR use the same managed threadpool implementation?

If at some point the ensure_jnienv method is called and the thread isn't attached to JNI yet, it will be attached to JNI. So, if we register the automatic cleanup at this point, it should work just fine and there is no need for us to attach threads to JNI unnecessarily.

simonrozsival added 3 commits July 22, 2025 09:43

Avoid attaching GC thread

4f67bcc

Register thread cleanup

d472807

Attach threads as deamons

803762c

simonrozsival requested review from grendello and jonathanpeppers as code owners July 22, 2025 08:44

jonathanpeppers reviewed Jul 22, 2025

View reviewed changes

simonrozsival closed this Jul 28, 2025

github-actions bot locked and limited conversation to collaborators Sep 4, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[CoreCLR] Automatically detach current thread from JNI #10316

[CoreCLR] Automatically detach current thread from JNI #10316

Uh oh!

simonrozsival commented Jul 22, 2025

Uh oh!

jonathanpeppers left a comment

Uh oh!

jonpryor commented Jul 22, 2025

Uh oh!

grendello commented Aug 4, 2025

Uh oh!

simonrozsival commented Aug 4, 2025

Uh oh!

grendello commented Aug 4, 2025

Uh oh!

simonrozsival commented Aug 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

[CoreCLR] Automatically detach current thread from JNI #10316

[CoreCLR] Automatically detach current thread from JNI #10316

Uh oh!

Conversation

simonrozsival commented Jul 22, 2025

Uh oh!

jonathanpeppers left a comment

Choose a reason for hiding this comment

Uh oh!

jonpryor commented Jul 22, 2025

Uh oh!

grendello commented Aug 4, 2025

Uh oh!

simonrozsival commented Aug 4, 2025

Uh oh!

grendello commented Aug 4, 2025

Uh oh!

simonrozsival commented Aug 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants