-
Notifications
You must be signed in to change notification settings - Fork 263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] ndk-stack failing with UnicodeDecodeError due to NativeTombstoneManager #1993
Comments
Assuming it also worked in r24, I suspect this is a result of the Python 2 -> 3 migration? @enh-google started an internal thread with some of the folks that know this system better. I'll nudge them to update here so I don't get something wrong playing telephone. The short answer right now is that they suspect the bug is actually elsewhere, because those characters shouldn't show up in the log at all. Agreed that ndk-stack shouldn't choke in malformed input either way though 👍
I've actually never heard of this, and the Google results are pretty slim. Where should I be looking? |
yeah, basically from your dump.txt (thanks for including that!) it looks like somewhere we have some C code passing one byte too many (the '\0' terminator) to some Java code that commits Java's utf mistake -- https://en.wikipedia.org/wiki/UTF-8#Modified_UTF-8 -- to turn that into the two bytes you're seeing. so someone will want to hunt that down and fix it in the OS. but, as danalbert says, we should fix the script too, if possible. (python is correct --- this is invalid utf8. but if we can, we should just let it through anyway.)
i don't think they meant what they said. that's just the part of the OS that deals with moving tombstones to dropbox (from wheret hey go to the Play Console). the public API is https://developer.android.com/reference/android/app/ApplicationExitInfo#getTraceInputStream() but -- as evidenced by how hard it was for me to find that even though i knew what i was looking for -- that's not easy to find via web search. (hopefully that's because folks are just happy with the Play Console, but i suspect there are a lot of bad wheel-reinventions out there too.) |
Gotcha. Are you still trying to figure out who the likely culprit is? Can I leave filing that OS bug to you, and I'll go poke at ndk-stack? |
i'm 99% sure it's the person i added to the internal thread. i was assuming they'd just fix it rather than file a bug... but now you've made me look, it's actually jmgao's bug. so i guess i'll be fixing that :-) |
interestingly, looking at /proc/self/attr/current on my desktop and on a random android device that happens to be plugged in, i think there are just bugs here where folks are echo(1)/write(2)ing into this file and accidentally leaving a '\n' or '0' included. the kernel just treats these as arbitrary byte sequences and keeps the cruft, and then other bits of userspace are surprised to see random control characters handed back later. i'll see what our selinux supremo says before touching anything (especially because i only see readers in our codebase, so i don't actually know where my hypothetical writes are coming from). |
Managers hate this one weird trick to avoid having to fix bugs!
I don't think that sounds right, the kernel converts between a u32 internal value and the human readable string when reading and writing the file. My uninformed guess is that there's a null terminator in the cil files or compiled sepolicy? |
@DanAlbert I've found it by searching here: Thank you all for the context and support |
yeah, sorry, i followed up further on the internal thread but forgot about this. i couldn't find where anything in the platform was writing to this... ah, it's those same selinux macros that mean i can never find what i'm looking for in selinux (and it's always procattr.c i'm looking for, and as usual i regret looking). so, yes, they do this for some reason:
i still think this is an selinux bug, and i think they usually don't notice because their getter is written in C, uses strdup(), and thus truncates '\0' anyway. (the '\n' on my desktop is another matter entirely!) inside the kernel i see this in selinux_setprocattr():
so i'm pretty confused at this point... without docs explaining why any of this makes sense, it feels like bug layered on top of bug? (had they not heard of i think the least effort fix for us is to switch debuggerd off /proc//attr/current and over to getpidcon(), and continue to ignore the fact that libselinux needs to be rewritten from scratch for another decade... (you do have to wonder how much cruft will have to be in the eventual [presumably rust at this point] rewrite for bug compatibility!) (and, of course, the only part any users care about is that danalbert changes ndk-stack to just treat logs as a sequence of bytes rather than necessarily valid utf8.) |
Which I'd meant to do yesterday, but stuff happened. It'll be in r27 for sure, but it probably won't be in the canary until next week. |
actually, it's more of a PITA to use libselinux (and it's awful anyway), so i've just gone with the one-liner "drop a trailing '\0'" code: https://android-review.googlesource.com/c/platform/system/core/+/2930438 |
Had to go work on other stuff for a few weeks, but fixed with https://r.android.com/c/2972694. |
libselinux has an off-by-one that causes it to pass the trailing '\0' to the kernel as if it's part of the security context, and the kernel dutifully hands it back, since it's an uninterpreted byte array as far as the kernel is concerned. libselinux accidentally hides this bug by treating it as a C string and calling strdup(), but debuggerd doesn't because it reads the file into a std::string. We could switch to libselinux's getcon()/getpidcon(), but (a) libselinux is awful (see above) and (b) not currently accessible to apexes (and it doesn't seem like a great idea to make it accessible). So just manually drop the last byte from the context we read ourselves, if it happens to be a '\0'. Bug: android/ndk#1993 Test: treehugger Change-Id: I8e7605ac5e618007a8da635cb6f45b0778dc167c
libselinux has an off-by-one that causes it to pass the trailing '\0' to the kernel as if it's part of the security context, and the kernel dutifully hands it back, since it's an uninterpreted byte array as far as the kernel is concerned. libselinux accidentally hides this bug by treating it as a C string and calling strdup(), but debuggerd doesn't because it reads the file into a std::string. We could switch to libselinux's getcon()/getpidcon(), but (a) libselinux is awful (see above) and (b) not currently accessible to apexes (and it doesn't seem like a great idea to make it accessible). So just manually drop the last byte from the context we read ourselves, if it happens to be a '\0'. Bug: android/ndk#1993 Test: treehugger Change-Id: I8e7605ac5e618007a8da635cb6f45b0778dc167c
libselinux has an off-by-one that causes it to pass the trailing '\0' to the kernel as if it's part of the security context, and the kernel dutifully hands it back, since it's an uninterpreted byte array as far as the kernel is concerned. libselinux accidentally hides this bug by treating it as a C string and calling strdup(), but debuggerd doesn't because it reads the file into a std::string. We could switch to libselinux's getcon()/getpidcon(), but (a) libselinux is awful (see above) and (b) not currently accessible to apexes (and it doesn't seem like a great idea to make it accessible). So just manually drop the last byte from the context we read ourselves, if it happens to be a '\0'. Bug: android/ndk#1993 Test: treehugger Change-Id: I8e7605ac5e618007a8da635cb6f45b0778dc167c
libselinux has an off-by-one that causes it to pass the trailing '\0' to the kernel as if it's part of the security context, and the kernel dutifully hands it back, since it's an uninterpreted byte array as far as the kernel is concerned. libselinux accidentally hides this bug by treating it as a C string and calling strdup(), but debuggerd doesn't because it reads the file into a std::string. We could switch to libselinux's getcon()/getpidcon(), but (a) libselinux is awful (see above) and (b) not currently accessible to apexes (and it doesn't seem like a great idea to make it accessible). So just manually drop the last byte from the context we read ourselves, if it happens to be a '\0'. Bug: android/ndk#1993 Test: treehugger Change-Id: I8e7605ac5e618007a8da635cb6f45b0778dc167c
libselinux has an off-by-one that causes it to pass the trailing '\0' to the kernel as if it's part of the security context, and the kernel dutifully hands it back, since it's an uninterpreted byte array as far as the kernel is concerned. libselinux accidentally hides this bug by treating it as a C string and calling strdup(), but debuggerd doesn't because it reads the file into a std::string. We could switch to libselinux's getcon()/getpidcon(), but (a) libselinux is awful (see above) and (b) not currently accessible to apexes (and it doesn't seem like a great idea to make it accessible). So just manually drop the last byte from the context we read ourselves, if it happens to be a '\0'. Bug: android/ndk#1993 Test: treehugger Change-Id: I8e7605ac5e618007a8da635cb6f45b0778dc167c
Description
Hey all,
I'm having problem running
ndk-stack
since upgrading to NDK 25.1 and 26.1Specifically I'm invoking it with either
stdin
by piping adb logcat or passing a-dump
, but the tool fails to work in both cases. The tool works correctly with NDK 23.1.It seems like
ndk-stack
fails to read the input due to non UTF-8 chars in the log.I'm uploading a dump taken from an Android emulator:
dump.txt
Specifically the non UTF-8 chars seems to be the following:
It would be great if
ndk-stack
were to not fail on those characters, specifically also becauseNativeTombstoneManager
is part of the Android SDK (and everything worked fine in NDK 23).Stacktrace on NDK 26.1
Stacktrace on NDK 25.1
Affected versions
r26
Canary version
No response
Host OS
Mac
Host OS version
macOS 14.2.1
Affected ABIs
arm64-v8a
Build system
CMake
Other build system
No response
minSdkVersion
21
Device API level
32
The text was updated successfully, but these errors were encountered: