Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SunMD5 format crashes on ARM #5296

Open
claudioandre-br opened this issue Apr 26, 2023 · 26 comments
Open

SunMD5 format crashes on ARM #5296

claudioandre-br opened this issue Apr 26, 2023 · 26 comments
Assignees

Comments

@claudioandre-br
Copy link
Member

claudioandre-br commented Apr 26, 2023

This is macOS on ARM:

It's more about documenting, but maybe someone can reproduce the issue and find a misuse of 'some' API by the format.

Note: it is a test-full run.

Testing: SunMD5 [MD5 128/128 ASIMD 4x2]... (4xOMP) run_tests.sh: line 62: 13013 Abort trap: 6
     $JTR_BIN -test-full=0 --format=cpu

Exit status: 134

Everything else looks ok:

All 402 formats passed self-tests!
 Ok: -test-full=0 --format=cpu

If fails (to me) when:

  • always if group_sz is 64;
  • only if group_sz is 64;
  • It seems that more than one type of failure happens (?).
@magnumripper
Copy link
Member

Probably something unaligned.

@magnumripper magnumripper self-assigned this Apr 27, 2023
magnumripper added a commit to magnumripper/john that referenced this issue Apr 29, 2023
Variables were declared as char and cast to int, for no apparent reason.

Hopefully closes openwall#5296 but I couldn't reproduce as my gear apparently
happened to end up aligned.
magnumripper added a commit to magnumripper/john that referenced this issue Apr 29, 2023
Variables were declared as char and cast to int, for no apparent reason.

Hopefully closes openwall#5296 but I couldn't reproduce as my gear apparently
happened to end up aligned.
magnumripper added a commit to magnumripper/john that referenced this issue Apr 29, 2023
Variables were declared as char and cast to int, for no apparent reason.

Hopefully closes openwall#5296 but I couldn't reproduce as my gear apparently
happened to end up aligned.
magnumripper added a commit to magnumripper/john that referenced this issue Apr 30, 2023
Variables were declared as char and cast to int, for no apparent reason.

Hopefully closes openwall#5296 but I couldn't reproduce as my gear apparently
happened to end up aligned.
magnumripper added a commit to magnumripper/john that referenced this issue Apr 30, 2023
Variables were declared as char and cast to int, for no apparent reason.

Hopefully closes openwall#5296 but I couldn't reproduce as my gear apparently
happened to end up aligned.
@magnumripper
Copy link
Member

I managed to reproduce now. The trick is to use eg. OMP_NUM_THREADS=3 but it still doesn't happen on Linux, only the M1. My current fixes in #5299 doesn't help.

@magnumripper
Copy link
Member

I don't quite understand this backtrace.

Process 73326 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x4a03000)
    frame #0: 0x00000001021e3328 libcrypto.3.dylib`_armv8_sve_probe
libcrypto.3.dylib`:
->  0x1021e3328 <+0>: eor    z0.d, z0.d, z0.d
    0x1021e332c <+4>: ret    

libcrypto.3.dylib`:
    0x1021e3330 <+0>: xar    z0.d, z0.d, z0.d, #0x20
    0x1021e3334 <+4>: ret    
Target 0: (john) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x4a03000)
  * frame #0: 0x00000001021e3328 libcrypto.3.dylib`_armv8_sve_probe
    frame #1: 0x00000001021e3830 libcrypto.3.dylib`OPENSSL_cpuid_setup + 600
    frame #2: 0x000000019029c1e0 dyld`invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const::$_0::operator()() const + 168
    frame #3: 0x00000001902dde94 dyld`invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 340
    frame #4: 0x00000001902d11a4 dyld`invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 528
    frame #5: 0x000000019027c2d8 dyld`dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const + 296
    frame #6: 0x00000001902d01cc dyld`dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 192
    frame #7: 0x00000001902dd958 dyld`dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 516
    frame #8: 0x0000000190298864 dyld`dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 448
    frame #9: 0x0000000190298c18 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 220
    frame #10: 0x0000000190298bf4 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 184
    frame #11: 0x0000000190298bf4 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 184
    frame #12: 0x000000019029c26c dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const::$_1::operator()() const + 112
    frame #13: 0x0000000190298d98 dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 304
    frame #14: 0x00000001902bc984 dyld`dyld4::APIs::runAllInitializersForMain() + 468
    frame #15: 0x00000001902812d0 dyld`dyld4::prepare(dyld4::APIs&, dyld3::MachOAnalyzer const*) + 3480
    frame #16: 0x000000019027fe18 dyld`start + 1964

Not a single line of it refers to john, does it?

@magnumripper
Copy link
Member

OK, that lldb crash happens with any format - must be some totally unrelated problem with lldb itself (it and the gcc are the ones included in MacOS though).

@magnumripper
Copy link
Member

magnumripper commented Apr 30, 2023

Without lldb working, my only clue is this:

Will run 3 OpenMP threads
Testing: SunMD5 [MD5 128/128 ASIMD 4x2]... (3xOMP) Bus error: 10

I think I'll try installing a real gcc and gdb and see what happens. Edit: Apparently I already had gcc-12 and that's what built john. But Homebrew doesn't supply gdb for arm64. I'll try building with native clang "gcc" and see if anything changes.

@magnumripper
Copy link
Member

magnumripper commented Apr 30, 2023

Unfortunately a build with native clang "gcc" never segfaults, so I can't reproduce. I even tried --format=cpu in case some other format would be interfering.

magnumripper added a commit to magnumripper/john that referenced this issue Apr 30, 2023
Variables were declared as char and cast to int, for no apparent reason.
These changes were made while researching openwall#5296 but didn't help at all.
The aliasing violations are real though, so committing anyway.
magnumripper added a commit to magnumripper/john that referenced this issue May 5, 2023
Variables were declared as char and cast to int, for no apparent reason.
These changes were made while researching openwall#5296 but didn't help at all.
The aliasing violations are real though, so committing anyway.

Also tweak OpenMP pragma; Move a private variable inside the block.
@magnumripper
Copy link
Member

magnumripper commented May 10, 2023

Lots of comments in #5299 but here's a better place.

I no longer think this is about alignment at all. Putting that debug print with correct brackets results in a never segfault nor fail (and never shows misalignment).

And to recap, when compiled with GNU gcc it never crashes and we've also ruled out OpenSSL as --without-openssl doesn't make a difference.

Oh, and also my fixes for the aliasing violations doesn't make a difference - so it's not the compiler acting up on UB.

@solardiz
Copy link
Member

The mentions of __stack_chk_fail in backtraces suggest that we have an out of bounds write on the stack.

@solardiz
Copy link
Member

If we can detect the issue with ASan, then that's what we should be using to debug it - not later crashes in a non-ASan build. Can you show that report from ASan you're referring to?

@magnumripper
Copy link
Member

magnumripper commented May 10, 2023

Note to self: I couldn't use --enable-asan and --without-openssl together. Build failed (at least make debug did) with complaints about AES stuff.

Edit: It's the combo --without-openssl and make debug that fails. Opening an issue for that.

@magnumripper
Copy link
Member

I can't reproduce any problem with ASan enabled. I tried the #5299 branch as well as current bleeding-jumbo, make debug or not, and threads between 1 and 29.

@solardiz
Copy link
Member

ASAN shows the line where there is a buffer overflow (when, for example) zeros are moved to a variable (a private group_sz).

What did you refer to, @claudioandre-br? Can you post that ASan report? Thanks!

@claudioandre-br
Copy link
Member Author

Not many build logs left, I didn't find anything. When I have news I'll post.

magnumripper added a commit that referenced this issue May 16, 2023
Variables were declared as char and cast to int, for no apparent reason.
These changes were made while researching #5296 but didn't help at all.
The aliasing violations are real though, so committing anyway.

Also tweak OpenMP pragma; Move a private variable inside the block.
@magnumripper
Copy link
Member

If you could get a core file after such crash it would probably help

@claudioandre-br claudioandre-br changed the title SunMD5 format crashes on Mac M1 SunMD5 format crashes on ARM Jan 5, 2024
@claudioandre-br
Copy link
Member Author

claudioandre-br commented Jan 5, 2024

Ubuntu 22 LTS (easy to reproduce using a cloud ARM instance and a simple --test --format=sunmd5 command line).

gcc version: 11.4.0

Benchmarking: SunMD5 [MD5 128/128 ASIMD 4x2]... (2xOMP) [New Thread 0x4000016fa080 (LWP 9426)]
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated

Thread 2 "john" received signal SIGABRT, Aborted.
(gdb) bt
#0  __pthread_kill_implementation (threadid=70368768270464, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x00004000006ff254 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x00004000006ba67c in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00004000006a7130 in __GI_abort () at ./stdlib/abort.c:79
#4  0x00004000006f3308 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x4000007d3778 "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:155
#5  0x0000400000775868 in __GI___fortify_fail (msg=msg@entry=0x4000007d3760 "stack smashing detected") at ./debug/fortify_fail.c:26
#6  0x0000400000775834 in __stack_chk_fail () at ./debug/stack_chk_fail.c:24
#7  0x0000aaaaaad5601c in crypt_all._omp_fn.0 () at sunmd5_fmt_plug.c:527
#8  0x000040000063b80c in ?? () from /lib/aarch64-linux-gnu/libgomp.so.1
#9  0x0000005e0000005e in ?? ()
$ run/john --test --format=sunmd5
Will run 2 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 ASIMD 4x2]... (2xOMP) *** stack smashing detected ***: terminated
Aborted (core dumped)

$ coredumpctl list
TIME                          PID  UID  GID SIG     COREFILE EXE                             SIZE
Fri 2024-01-05 20:44:52 UTC 10102 1001 1002 SIGABRT present  /home/circleci/project/run/john 9.3M


@claudioandre-br
Copy link
Member Author

@solardiz
Copy link
Member

solardiz commented Jan 5, 2024

core.john.1001.158250485e9b4c2a88e9ee4de80025f4.10102.1704487492000000.zip

I guess for best debugging, this should go along with a corresponding john binary, but there isn't one in that archive.

@solardiz
Copy link
Member

solardiz commented Jan 5, 2024

Thinking of possible reasons why this issue would only show up on Arm, we could want to try setting SIMD_PARA_MD5 to 2 on an x86-64 SSE* or 128-bit AVX/XOP build with ASan and see what happens. We rarely use that value on x86_64 - per x86-64.h we only do on gcc 4.5.x or when building for XOP, which we might not have tried combining with ASan.

@claudioandre-br
Copy link
Member Author

If you enable ASAN, no errors are seen:

A regular ./configure --enable-asan && make ...

Should we add something?

  • -D_FORTIFY_SOURCE=1
  • fstack-protector
  • fstack-protector-all

gcc -DAC_BUILT -DJOHN_ASIMD -c -g -Og -fsanitize=address -I/usr/local/include -DARCH_LITTLE_ENDIAN=1 -DPREFER_FLOCK -Wall -fno-omit-frame-pointer -Wno-deprecated-declarations -Wformat-extra-args -Wunused-but-set-variable -Wdate-time -D_POSIX_SOURCE -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -fopenmp -I/usr/local/include -DCL_SILENCE_DEPRECATION -funroll-loops sunmd5_fmt_plug.c -o sunmd5_fmt_plug.o


$ while ../run/john --test --format=sunmd5; do :; done
[...]
Will run 2 OpenMP threads
NOTE: This is a debug build, speed will be lower than normal
Benchmarking: SunMD5 [MD5 128/128 ASIMD 4x2]... (2xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	173 c/s real, 87.2 c/s virtual
[...]

@claudioandre-br
Copy link
Member Author

The non-simd build works fine.

If you remove the stack protection (to get another hint).

You get a bus error.

Will run 2 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 ASIMD 4x2]... (2xOMP) Bus error (core dumped)

And gdb is useless.

(gdb) bt
#0  0x0000002300000022 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

@claudioandre-br
Copy link
Member Author

Maybe it is possible to reproduce using X86_64.

(gdb) bt
#0  SIMDmd5body (_data=<optimized out>, out=0x555559309c00, reload_state=reload_state@entry=0x0, SSEi_flags=SSEi_flags@entry=0) at simd-intrinsics.c:408
#1  0x000055555571fd73 in crypt_all._omp_fn.0 () at sunmd5_fmt_plug.c:789
#2  0x00007ffff7969a16 in GOMP_parallel (fn=fn@entry=0x55555571e8c0 <crypt_all._omp_fn.0>, data=data@entry=0x7fffffffd9e0, num_threads=8, num_threads@entry=0, flags=flags@entry=0)
    at ../../../src/libgomp/parallel.c:178
#3  0x000055555571dfe8 in crypt_all (pcount=<optimized out>, salt=<optimized out>) at sunmd5_fmt_plug.c:527
#4  0x000055555573a0fb in is_key_right (format=format@entry=0x5555559005a0 <fmt_sunmd5>, index=index@entry=19, binary=binary@entry=0x555556b66f24, 
    ciphertext=ciphertext@entry=0x5555558690f8 "$md5$rounds=904$Vc3VgyFx44iS8.Yu$Scf90iLWN6O6mT9TA06NK/", plaintext=plaintext@entry=0x55555587315f "test", 
    is_test_fmt_case=is_test_fmt_case@entry=0, dbsalt=0x5555592fd670) at formats.c:592
#5  0x000055555573d2b0 in is_key_right (is_test_fmt_case=0, dbsalt=0x5555592fd670, plaintext=0x55555587315f "test", 
    ciphertext=0x5555558690f8 "$md5$rounds=904$Vc3VgyFx44iS8.Yu$Scf90iLWN6O6mT9TA06NK/", binary=<optimized out>, index=19, format=0x5555559005a0 <fmt_sunmd5>) at formats.c:588
#6  fmt_self_test_body (full_lvl=-1, db=0x555558fdfcc0, salt_copy=<optimized out>, binary_copy=<optimized out>, format=0x5555559005a0 <fmt_sunmd5>) at formats.c:1295
#7  fmt_self_test (format=format@entry=0x5555559005a0 <fmt_sunmd5>, db=db@entry=0x555558fdfcc0) at formats.c:2045
#8  0x000055555572c87d in benchmark_all () at bench.c:876
#9  0x000055555574585a in john_run () at john.c:1672
#10 main (argc=<optimized out>, argv=<optimized out>) at john.c:2092
diff --git a/src/x86-64.h b/src/x86-64.h
index f77c44ac13..39eec60dca 100644
--- a/src/x86-64.h
+++ b/src/x86-64.h
@@ -214,7 +214,7 @@
 #define SIMD_COEF_32 16
 #define SIMD_COEF_64 8
 #elif __AVX2__
-#define SIMD_COEF_32 8
+#define SIMD_COEF_32 4
 #define SIMD_COEF_64 4
 #elif __SSE2__
 #define SIMD_COEF_32 4
@@ -258,7 +258,7 @@
 #elif defined(__GNUC__) && (GCC_VERSION < 40600 || defined(__XOP__)) // 4.6.0
 #define SIMD_PARA_MD5                  2
 #else
-#define SIMD_PARA_MD5                  3
+#define SIMD_PARA_MD5                  2
 #endif
 #endif
 

If that is correct, the format should not use SIMD at all, it is buggy.

@solardiz
Copy link
Member

solardiz commented Jan 6, 2024

Thank you @claudioandre-br for these additional tests.

If you enable ASAN, no errors are seen:

If you remove the stack protection (to get another hint).

I find these comments confusing, as it's not clear what they're comparing against. Surely we did see the crash in CircleCI with ASan enabled, so why wouldn't we in some other(?) test with ASan (and in what way it differs)? Perhaps you do know what compiler flags were used in CircleCI? Perhaps you also know where you ran those other tests (so can mention that)?

 #elif __AVX2__
-#define SIMD_COEF_32 8
+#define SIMD_COEF_32 4

That's wrong, things are not supposed to work with incorrect settings like that. To correctly obtain SIMD_COEF_32 = 4, simply build with --enable-simd=avx.

@claudioandre-br
Copy link
Member Author

claudioandre-br commented Jan 6, 2024

Surely we did see the crash in CircleCI with ASan enabled

I'm not sure if I've done this test previously with ASAN enabled. I removed or edited some previous comments as "everything was" 100% remote and only sure a regular build fails on mac M2 [1].

Now, using SSH (so, I can rebuild the way I want, re-test and so son), I was unable to crash when enabling ASAN.

Even forcing CPPFLAGS=sanitize=address (-O2 is used in these cases).

So, I didn't receive anything from ASAN.

[1]

  • Also, inside Launchpad when deploying the ARM snap package. Less than 10 times, in the last 5 months maybe.
  • And now, using an ARM machine via SSH.

solardiz pushed a commit that referenced this issue Jan 9, 2024
SunMD5 format crashes on ARM (macOS and Linux)
See #5296.

Signed-off-by: Claudio André <claudioandre.br@gmail.com>
@kholia
Copy link
Member

kholia commented Apr 14, 2024

I can't reproduce this problem on an aarch64 box here.

ubuntu@ubuntu:~/john/src$ uname -a
Linux ubuntu 5.10.160-rockchip #34 SMP Thu Mar 21 16:59:26 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

ubuntu@ubuntu:~/john/src$ ../run/john --test --format=sunmd5
Will run 8 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 ASIMD 4x2]... (8xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	1280 c/s real, 207 c/s virtual

ubuntu@ubuntu:~/john/src$ cat /proc/cpuinfo 
processor	: 0
BogoMIPS	: 48.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x2
CPU part	: 0xd05
CPU revision	: 0
ubuntu@ubuntu:~/john/src$ cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.4 LTS"

ubuntu@ubuntu:~/john/src$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

@solardiz
Copy link
Member

Thank you @kholia. It's great that you're still around!

I don't recall all detail of this issue (was the issue disappearing with ASan for us?), but if you have time you could also try an --enable-asan build.

@claudioandre-br
Copy link
Member Author

The last time I saw it failing was on Apr 7 at https://launchpadlibrarian.net/723587866/buildlog_snap_ubuntu_jammy_arm64_john-the-ripper_BUILDING.txt.gz

Random. A second run managed to succeed.

:: Testing: Stribog-512, raw Streebog [GOST R 34.11-2012 64/64]... (4xOMP) PASS
:: Testing: STRIP, Password Manager [PBKDF2-SHA1 128/128 ASIMD 4x]... (4xOMP) PASS
:: Testing: SunMD5 [MD5 128/128 ASIMD 4x2]... (4xOMP) *** stack smashing detected ***: terminated
:: run_tests.sh: line 93: 22374 Aborted                 (core dumped) "$JTR_BIN" -test-full=0 --format=cpu
::  FAILED: -test-full=0 --format=cpu

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants