Description
Environment
| Category | Detail |
|---|---|
| Operating System | Ubuntu 20.04 |
| LLVM Version | llvm-project-llvmorg-20.1.6 |
| Host Architecture | x86_64 |
| Target Triple | aarch64-linux-gnu |
| Sanitizers | ASan (AddressSanitizer), UBSan (UndefinedBehaviorSanitizer) |
| Tool Used | llvm-mc-assemble-fuzzer |
| Build | Debug (With Assertions Enabled) |
Summary
There is an assertion failure (Assertion 'ReadCount == 1' failed) in the llvm::MCAsmLexer::peekTok(bool) function. The crash occurs consistently when a specific malformed AArch64 assembly line is parsed after another malformed line.
The critical observation is that the first malformed line (el1, x12) by itself does NOT crash the fuzzer; it correctly reports syntax errors. However, when the same line is followed by a second malformed line (sqdmulh s15, s3, v4.s[0), the assertion triggers and the fuzzer crashes. This suggests a state management issue: parsing the first malformed line may leave the lexer's internal state inconsistent, so that a later peekTok call fails while the second instruction is being processed.
Affected Component(s)
- AArch64 Backend
- MC Assembler Parser (MCAsmLexer, AArch64AsmParser)
- Potentially MCAsmParser state management across multiple instruction parsing attempts.
Steps to Reproduce
- Prepare the input files:
  - token-bug-min (does NOT crash when run alone). Create a file named token-bug-min with the following content:

        el1, x12

  - token-bug (reproduces the crash). Create a file named token-bug with the following content:

        el1, x12
        sqdmulh s15, s3, v4.s[0

- Set up environment variables for sanitizers:

      export ASAN_OPTIONS="halt_on_error=0:abort_on_error=0:coverage=1"
      export UBSAN_OPTIONS="halt_on_error=0:abort_on_error=0:print_stacktrace=1"

- Demonstrate token-bug-min behavior (no crash):

      ./llvm-mc-assemble-fuzzer \
        --triple=aarch64-linux-gnu \
        --mattr=+all \
        token-bug-min

  The output is correct (no crash):

      INFO: Seed: 705475483
      INFO: Loaded 1 modules (1910 inline 8-bit counters): 1910 [0x174a178, 0x174a8ee),
      INFO: Loaded 1 PC tables (1910 PCs): 1910 [0x174a8f0,0x1752050),
      ./llvm-mc-assemble-fuzzer: Running 1 inputs 1 time(s) each.
      Running: token-bug-min
      Unknown buffer:1:4: error: unknown token in expression
      el1, x12
      ^
      Unknown buffer:1:4: error: invalid operand
      el1, x12
      ^
      Executed token-bug-min in 1 ms
      ***
      *** NOTE: fuzzing was not performed, you have only
      *** executed the target code on a fixed set of inputs.
      ***

- Run llvm-mc-assemble-fuzzer with the two-line token-bug input (reproduces crash):

      ./llvm-mc-assemble-fuzzer \
        --triple=aarch64-linux-gnu \
        --mattr=+all \
        token-bug

  The output is incorrect (fuzzer crash with assertion):

INFO: Seed: 979337232
INFO: Loaded 1 modules (1910 inline 8-bit counters): 1910 [0x174a178, 0x174a8ee),
INFO: Loaded 1 PC tables (1910 PCs): 1910 [0x174a8f0,0x1752050),
./llvm-mc-assemble-fuzzer: Running 1 inputs 1 time(s) each.
Running: token-bug
Unknown buffer:1:4: error: unknown token in expression
el1, x12
^
Unknown buffer:1:4: error: invalid operand
el1, x12
^
llvm-mc-assemble-fuzzer: /llvm-project-llvmorg-20.1.6/llvm/include/llvm/MC/MCParser/MCAsmLexer.h:117: const llvm::AsmToken llvm::MCAsmLexer::peekTok(bool): Assertion `ReadCount == 1' failed.
==2290937== ERROR: libFuzzer: deadly signal
#0 0x4b4cb0 in __sanitizer_print_stack_trace (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x4b4cb0)
#1 0x460fc8 in fuzzer::PrintStackTrace() (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x460fc8)
#2 0x446113 in fuzzer::Fuzzer::CrashCallback() (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x446113)
#3 0x7fd18552b41f (/lib/x86_64-linux-gnu/libpthread.so.0+0x1441f)
#4 0x7fd18512700a in __libc_signal_restore_set /build/glibc-B3wQXB/glibc-2.31/signal/../sysdeps/unix/sysv/linux/internal-signals.h:86:3
#5 0x7fd18512700a in raise /build/glibc-B3wQXB/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:48:3
#6 0x7fd185106858 in abort /build/glibc-B3wQXB/glibc-2.31/stdlib/abort.c:79:7
#7 0x7fd185106728 in __assert_fail_base /build/glibc-B3wQXB/glibc-2.31/assert/assert.c:94:3
#8 0x7fd185117fd5 in __assert_fail /build/glibc-B3wQXB/glibc-2.31/assert/assert.c:103:3
#9 0x577808 in llvm::MCAsmLexer::peekTok(bool) /llvm-project-llvmorg-20.1.6/llvm/include/llvm/MC/MCParser/MCAsmLexer.h:117:5
#10 0x5032da in (anonymous namespace)::AArch64AsmParser::parseOptionalMulOperand(llvm::SmallVectorImpl<std::unique_ptr<llvm::MCParsedAsmOperand, std::default_delete<llvm::MCParsedAsmOperand> > >&) /llvm-project-llvmorg-20.1.6/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp:4835:25
#11 0x4fd38c in (anonymous namespace)::AArch64AsmParser::parseOperand(llvm::SmallVectorImpl<std::unique_ptr<llvm::MCParsedAsmOperand, std::default_delete<llvm::MCParsedAsmOperand> > >&, bool, bool) /home/zhangqi/project/issta2024_asfuzzer/llvm-project-llvmorg-20.1.6/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp:5003:10
#12 0x4e56f3 in (anonymous namespace)::AArch64AsmParser::parseInstruction(llvm::ParseInstructionInfo&, llvm::StringRef, llvm::SMLoc, llvm::SmallVectorImpl<std::unique_ptr<llvm::MCParsedAsmOperand, std::default_delete<llvm::MCParsedAsmOperand> > >&) /home/zhangqi/project/issta2024_asfuzzer/llvm-project-llvmorg-20.1.6/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp:5321:11
#13 0x570f54 in llvm::MCTargetAsmParser::parseInstruction(llvm::ParseInstructionInfo&, llvm::StringRef, llvm::AsmToken, llvm::SmallVectorImpl<std::unique_ptr<llvm::MCParsedAsmOperand, std::default_delete<llvm::MCParsedAsmOperand> > >&) /home/zhangqi/project/issta2024_asfuzzer/llvm-project-llvmorg-20.1.6/llvm/include/llvm/MC/MCParser/MCTargetAsmParser.h:446:12
#14 0x952abc in (anonymous namespace)::AsmParser::parseAndMatchAndEmitTargetInstruction((anonymous namespace)::ParseStatementInfo&, llvm::StringRef, llvm::AsmToken, llvm::SMLoc) /home/zhangqi/project/issta2024_asfuzzer/llvm-project-llvmorg-20.1.6/llvm/lib/MC/MCParser/AsmParser.cpp:2327:42
#15 0x9439de in (anonymous namespace)::AsmParser::parseStatement((anonymous namespace)::ParseStatementInfo&, llvm::MCAsmParserSemaCallback*) /home/zhangqi/project/issta2024_asfuzzer/llvm-project-llvmorg-20.1.6/llvm/lib/MC/MCParser/AsmParser.cpp:2317:10
#16 0x937303 in (anonymous namespace)::AsmParser::Run(bool, bool) /home/zhangqi/project/issta2024_asfuzzer/llvm-project-llvmorg-20.1.6/llvm/lib/MC/MCParser/AsmParser.cpp:999:19
#17 0x4bb403 in AssembleInput(char const*, llvm::Target const*, llvm::SourceMgr&, llvm::MCContext&, llvm::MCStreamer&, llvm::MCAsmInfo&, llvm::MCSubtargetInfo&, llvm::MCInstrInfo&, llvm::MCTargetOptions&) /home/zhangqi/project/issta2024_asfuzzer/llvm-project-llvmorg-20.1.6/llvm/tools/llvm-mc-assemble-fuzzer/llvm-mc-assemble-fuzzer.cpp:129:18
#18 0x4b82d4 in AssembleOneInput(unsigned char const*, unsigned long) /home/zhangqi/project/issta2024_asfuzzer/llvm-project-llvmorg-20.1.6/llvm/tools/llvm-mc-assemble-fuzzer/llvm-mc-assemble-fuzzer.cpp:234:19
#19 0x4bbc14 in LLVMFuzzerTestOneInput /llvm-project-llvmorg-20.1.6/llvm/tools/llvm-mc-assemble-fuzzer/llvm-mc-assemble-fuzzer.cpp:243:10
#20 0x4477d1 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x4477d1)
#21 0x432f42 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x432f42)
#22 0x4389f6 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x4389f6)
#23 0x4616b2 in main (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x4616b2)
#24 0x7fd185108082 in __libc_start_main /build/glibc-B3wQXB/glibc-2.31/csu/../csu/libc-start.c:308:16
#25 0x40d5ed in _start (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x40d5ed)
NOTE: libFuzzer has rudimentary signal handlers.
Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal
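The steps above can be collected into one script. This is a minimal sketch and not part of the original report: the FUZZER variable and the run_case helper are names introduced here, and the trailing newline at the end of each input file is an assumption, since the report does not say whether the files end with one.

```shell
#!/bin/sh
# Sketch only. FUZZER is an assumed location for the binary; override as needed:
#   FUZZER=/path/to/llvm-mc-assemble-fuzzer sh repro.sh
FUZZER="${FUZZER:-./llvm-mc-assemble-fuzzer}"

# printf makes the line breaks explicit.
# Assumption: each input line ends with a newline.
printf 'el1, x12\n' > token-bug-min
printf 'el1, x12\nsqdmulh s15, s3, v4.s[0\n' > token-bug

# run_case is a hypothetical helper: run one input and print its exit status
# so that a crash on one input does not hide the result of the other.
run_case() {
    "$FUZZER" --triple=aarch64-linux-gnu --mattr=+all "$1"
    echo "$1: exit status $?"
}

if [ -x "$FUZZER" ]; then
    run_case token-bug-min
    run_case token-bug
else
    echo "fuzzer binary not found at $FUZZER; inputs were created only"
fi
```

Printing the exit status for each input makes the difference between the two runs visible without reading the full log.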
Expected Behavior
The llvm-mc-assemble-fuzzer should robustly handle multiple erroneous lines. It should report specific syntax errors for both malformed instructions (el1, x12 and sqdmulh s15, s3, v4.s[0) and then exit gracefully with a non-zero status code, without crashing or triggering any assertions. The internal state of the lexer/parser should be properly reset or managed between instructions, so that an error on one line cannot lead to a crash on a later line.
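The expected behavior can be phrased as a check over a captured log: diagnostics are present, and neither an assertion message nor the libFuzzer crash banner appears. The sketch below is illustrative only; check_log is a hypothetical helper name, and the patterns are taken from the log lines shown in this report.

```shell
#!/bin/sh
# check_log is a hypothetical helper, not part of LLVM or libFuzzer.
# It prints PASS when the log holds syntax errors but no crash markers.
check_log() {
    log="$1"
    # An assertion message or the libFuzzer crash banner means the run crashed.
    if grep -q -e "Assertion .* failed" -e "libFuzzer: deadly signal" "$log"; then
        echo "FAIL"
        return 1
    fi
    # A graceful run on malformed input should still report a syntax error.
    if grep -q "error:" "$log"; then
        echo "PASS"
        return 0
    fi
    echo "NO-DIAGNOSTICS"
    return 1
}
```

A possible use is to redirect both output streams of a run into a file (for example `> run.log 2>&1`) and then call `check_log run.log`.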
Actual Behavior
The tool crashes with an Assertion 'ReadCount == 1' failed error in llvm::MCAsmLexer::peekTok(bool). The assertion triggers while processing the second line of the input (sqdmulh s15, s3, v4.s[0), after the first line (el1, x12) has already been processed and has produced syntax errors. According to the stack trace, peekTok is called from AArch64AsmParser::parseOptionalMulOperand during operand parsing.
This behavior suggests that processing the initial malformed instruction el1, x12 leaves the MCAsmLexer or a related parsing component in a corrupted or inconsistent state. When the parser then attempts to process the subsequent instruction sqdmulh s15, s3, v4.s[0, this state leads to the assertion being triggered in peekTok.
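One way to probe the state-corruption hypothesis is to run the second line on its own, with and without a trailing newline. The sketch below is a suggestion rather than something taken from the report: the file names token-bug-second and token-bug-second-noeol and the FUZZER variable are hypothetical, and no outcome is assumed.

```shell
#!/bin/sh
# Sketch only. FUZZER is an assumed location for the binary.
FUZZER="${FUZZER:-./llvm-mc-assemble-fuzzer}"

# Hypothetical inputs: the second malformed line alone,
# once with a trailing newline and once without.
printf 'sqdmulh s15, s3, v4.s[0\n' > token-bug-second
printf 'sqdmulh s15, s3, v4.s[0' > token-bug-second-noeol

for input in token-bug-second token-bug-second-noeol; do
    if [ -x "$FUZZER" ]; then
        "$FUZZER" --triple=aarch64-linux-gnu --mattr=+all "$input"
        echo "$input: exit status $?"
    else
        echo "fuzzer binary not found at $FUZZER; skipping $input"
    fi
done
```

If either single-line input also triggers the assertion, the first line would not be required for the crash, and the explanation above would need to be revisited. If neither does, that would be consistent with state carried over from the first line.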
Full Crash Output / Stack Trace (from token-bug input)
➜ bin git:(master) ✗ cat token-bug
el1, x12
sqdmulh s15, s3, v4.s[0
➜ bin git:(master) ✗ export ASAN_OPTIONS="coverage=1"
export UBSAN_OPTIONS="print_stacktrace=1"
./llvm-mc-assemble-fuzzer \
--triple=aarch64-linux-gnu \
--mattr=+all \
token-bug
INFO: Seed: 901112419
INFO: Loaded 1 modules (1910 inline 8-bit counters): 1910 [0x174a178, 0x174a8ee),
INFO: Loaded 1 PC tables (1910 PCs): 1910 [0x174a8f0,0x1752050),
./llvm-mc-assemble-fuzzer: Running 1 inputs 1 time(s) each.
Running: token-bug
Unknown buffer:1:4: error: unknown token in expression
el1, x12
^
Unknown buffer:1:4: error: invalid operand
el1, x12
^
llvm-mc-assemble-fuzzer: /llvm-project-llvmorg-20.1.6/llvm/include/llvm/MC/MCParser/MCAsmLexer.h:117: const llvm::AsmToken llvm::MCAsmLexer::peekTok(bool): Assertion `ReadCount == 1' failed.
==2288831== ERROR: libFuzzer: deadly signal
#0 0x4b4cb0 in __sanitizer_print_stack_trace (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x4b4cb0)
#1 0x460fc8 in fuzzer::PrintStackTrace() (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x460fc8)
#2 0x446113 in fuzzer::Fuzzer::CrashCallback() (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x446113)
#3 0x7fd89828e41f (/lib/x86_64-linux-gnu/libpthread.so.0+0x1441f)
#4 0x7fd897e8a00a in __libc_signal_restore_set /build/glibc-B3wQXB/glibc-2.31/signal/../sysdeps/unix/sysv/linux/internal-signals.h:86:3
#5 0x7fd897e8a00a in raise /build/glibc-B3wQXB/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:48:3
#6 0x7fd897e69858 in abort /build/glibc-B3wQXB/glibc-2.31/stdlib/abort.c:79:7
#7 0x7fd897e69728 in __assert_fail_base /build/glibc-B3wQXB/glibc-2.31/assert/assert.c:94:3
#8 0x7fd897e7afd5 in __assert_fail /build/glibc-B3wQXB/glibc-2.31/assert/assert.c:103:3
#9 0x577808 in llvm::MCAsmLexer::peekTok(bool) /llvm-project-llvmorg-20.1.6/llvm/include/llvm/MC/MCParser/MCAsmLexer.h:117:5
#10 0x5032da in (anonymous namespace)::AArch64AsmParser::parseOptionalMulOperand(llvm::SmallVectorImpl<std::unique_ptr<llvm::MCParsedAsmOperand, std::default_delete<llvm::MCParsedAsmOperand> > >&) /llvm-project-llvmorg-20.1.6/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp:4835:25
#11 0x4fd38c in (anonymous namespace)::AArch64AsmParser::parseOperand(llvm::SmallVectorImpl<std::unique_ptr<llvm::MCParsedAsmOperand, std::default_delete<llvm::MCParsedAsmOperand> > >&, bool, bool) /llvm-project-llvmorg-20.1.6/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp:5003:10
#12 0x4e56f3 in (anonymous namespace)::AArch64AsmParser::parseInstruction(llvm::ParseInstructionInfo&, llvm::StringRef, llvm::SMLoc, llvm::SmallVectorImpl<std::unique_ptr<llvm::MCParsedAsmOperand, std::default_delete<llvm::MCParsedAsmOperand> > >&) /llvm-project-llvmorg-20.1.6/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp:5321:11
#13 0x570f54 in llvm::MCTargetAsmParser::parseInstruction(llvm::ParseInstructionInfo&, llvm::StringRef, llvm::AsmToken, llvm::SmallVectorImpl<std::unique_ptr<llvm::MCParsedAsmOperand, std::default_delete<llvm::MCParsedAsmOperand> > >&) /llvm-project-llvmorg-20.1.6/llvm/include/llvm/MC/MCParser/MCTargetAsmParser.h:446:12
#14 0x952abc in (anonymous namespace)::AsmParser::parseAndMatchAndEmitTargetInstruction((anonymous namespace)::ParseStatementInfo&, llvm::StringRef, llvm::AsmToken, llvm::SMLoc) /llvm-project-llvmorg-20.1.6/llvm/lib/MC/MCParser/AsmParser.cpp:2327:42
#15 0x9439de in (anonymous namespace)::AsmParser::parseStatement((anonymous namespace)::ParseStatementInfo&, llvm::MCAsmParserSemaCallback*) /llvm-project-llvmorg-20.1.6/llvm/lib/MC/MCParser/AsmParser.cpp:2317:10
#16 0x937303 in (anonymous namespace)::AsmParser::Run(bool, bool) /llvm-project-llvmorg-20.1.6/llvm/lib/MC/MCParser/AsmParser.cpp:999:19
#17 0x4bb403 in AssembleInput(char const*, llvm::Target const*, llvm::SourceMgr&, llvm::MCContext&, llvm::MCStreamer&, llvm::MCAsmInfo&, llvm::MCSubtargetInfo&, llvm::MCInstrInfo&, llvm::MCTargetOptions&) /llvm-project-llvmorg-20.1.6/llvm/tools/llvm-mc-assemble-fuzzer/llvm-mc-assemble-fuzzer.cpp:129:18
#18 0x4b82d4 in AssembleOneInput(unsigned char const*, unsigned long) /llvm-project-llvmorg-20.1.6/llvm/tools/llvm-mc-assemble-fuzzer/llvm-mc-assemble-fuzzer.cpp:234:19
#19 0x4bbc14 in LLVMFuzzerTestOneInput /llvm-project-llvmorg-20.1.6/llvm/tools/llvm-mc-assemble-fuzzer/llvm-mc-assemble-fuzzer.cpp:243:10
#20 0x4477d1 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x4477d1)
#21 0x432f42 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x432f42)
#22 0x4389f6 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x4389f6)
#23 0x4616b2 in main (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x4616b2)
#24 0x7fd897e6b082 in __libc_start_main /build/glibc-B3wQXB/glibc-2.31/csu/../csu/libc-start.c:308:16
#25 0x40d5ed in _start (/llvm-project-llvmorg-20.1.6/build-fuzzer/bin/llvm-mc-assemble-fuzzer+0x40d5ed)
NOTE: libFuzzer has rudimentary signal handlers.
Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal
Surprisingly, an input containing only the first malformed instruction is handled correctly:
➜ bin git:(master) ✗ cat token-bug-min
el1, x12
➜ bin git:(master) ✗ export ASAN_OPTIONS="coverage=1"
export UBSAN_OPTIONS="print_stacktrace=1"
./llvm-mc-assemble-fuzzer \
--triple=aarch64-linux-gnu \
--mattr=+all \
token-bug-min
INFO: Seed: 1000704437
INFO: Loaded 1 modules (1910 inline 8-bit counters): 1910 [0x174a178, 0x174a8ee),
INFO: Loaded 1 PC tables (1910 PCs): 1910 [0x174a8f0,0x1752050),
./llvm-mc-assemble-fuzzer: Running 1 inputs 1 time(s) each.
Running: token-bug-min
Unknown buffer:1:4: error: unknown token in expression
el1, x12
^
Unknown buffer:1:4: error: invalid operand
el1, x12
^
Executed token-bug-min in 1 ms
***
*** NOTE: fuzzing was not performed, you have only
*** executed the target code on a fixed set of inputs.
***