[ARM] reglists can not be defined as out operands #62455

Open
@Rot127

The following LDM instruction loads from memory and writes to a list of registers:
0x90,0xe8,0x0e,0x00 - ldm.w r0, {r1, r2, r3} (little endian - thumb)

In the td file the reglist is incorrectly defined as an in operand. I tried to fix this by defining it as an out operand, but the disassembler now breaks:

3fc: e890 000e    	ldm.w	r2, llvm-objdump: /home/user/repos/llvm/llvm/include/llvm/MC/MCInst.h:70: unsigned int llvm::MCOperand::getReg() const: Assertion `isReg() && "This is not a register operand!"' failed

The patched td files and a binary (the first instruction of _start is the ldm instruction from above) are in this commit: Rot127/llvm-capstone@6344dae

There are two reasons for the failure:

1. The indices used to access MI->Operands are off.

The length of the register list is only known at runtime, and the indices
are not updated after the register list has been decoded.

The reglist is encoded as an immediate in the instruction bytes, but the
disassembler decodes it into multiple register operands. Because the reglist
is now an out operand, these registers are stored at the beginning of the
operands vector.

For example, when ARMInstPrinter::printPredicateOperand() prints the
predicate, it is given a "hardcoded" index of 2. This index points to the
predicate operand as defined in the td files. But because the writeback (r0)
takes index 0 and the reglist (r1-r3) takes indices 1, 2 and 3, index 2 no
longer refers to the predicate and the printing fails.

So the disassembly fails because the ARM disassembler and printer cannot
handle operand indices that change dynamically.
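
To make the index shift concrete, here is a minimal toy model in C++. It does not use the real LLVM classes: `ToyOperand`, `ToyKind`, `kStaticPredIdx` and both functions are hypothetical names made up for this illustration, and the predicate is simplified to a single operand. It only mirrors the layout described above (r0 at index 0, the reglist slot at index 1, the predicate at index 2 in the td definition).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for llvm::MCOperand; only what the example needs.
enum class ToyKind { Reg, Pred };

struct ToyOperand {
  ToyKind kind;
  std::string text;
};

// Index of the predicate in the static (td file) layout described above:
// 0 = r0, 1 = reglist, 2 = predicate.
constexpr std::size_t kStaticPredIdx = 2;

// Operands after decoding `ldm.w r0, {r1, r2, r3}` with the reglist as an
// out operand: the single reglist slot has expanded into three registers.
std::vector<ToyOperand> decodedLdmOperands() {
  return {
      {ToyKind::Reg, "r0"},  // index 0
      {ToyKind::Reg, "r1"},  // index 1, reglist starts here
      {ToyKind::Reg, "r2"},  // index 2, where the printer expects the predicate
      {ToyKind::Reg, "r3"},  // index 3
      {ToyKind::Pred, "al"}, // index 4, actual predicate position
  };
}

// True if the hardcoded index still lands on a predicate operand.
bool staticPredIndexHitsPredicate(const std::vector<ToyOperand> &ops) {
  return kStaticPredIdx < ops.size() &&
         ops[kStaticPredIdx].kind == ToyKind::Pred;
}
```

In this model `staticPredIndexHitsPredicate(decodedLdmOperands())` is false: index 2 holds r2, and the predicate has moved to index 4.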

2. Register lists are printed by iterating from operand i to MI->getNumOperands().

Because reglists were always (incorrectly) defined as in operands, they were
always at the end of the MI->Operands vector.

So the ARMInstPrinter::printRegisterList() method got the index of the
first register in the list and printed every operand up to MI->getNumOperands().

This doesn't work if an out reglist is at the beginning of the
MI->Operands vector, because it would just print all remaining operands,
including the in operands.
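
The effect of the "print until the end" loop can be sketched with plain strings. This is a simplified model, not the actual ARMInstPrinter code: `printListToEnd` is a hypothetical helper, operands are represented by their printed names, and the predicate is again reduced to a single operand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Prints every operand from `first` up to the end of the vector as a
// register list, which is how the loop described above behaves.
std::string printListToEnd(const std::vector<std::string> &ops,
                           std::size_t first) {
  std::string out = "{";
  for (std::size_t i = first; i < ops.size(); ++i) {
    if (i != first)
      out += ", ";
    out += ops[i];
  }
  return out + "}";
}
```

With the reglist at the end, `printListToEnd({"r0", "al", "r1", "r2", "r3"}, 2)` gives `{r1, r2, r3}`. With the reglist at the front, `printListToEnd({"r0", "r1", "r2", "r3", "al"}, 1)` gives `{r1, r2, r3, al}`, so the predicate is swallowed into the list.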

Solution?

Can anyone think of a better solution than saving the length of the reglist
and applying it as an offset whenever an operand is printed?
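
For reference, the offset-based workaround could look roughly like the sketch below. This is only an illustration under assumptions: `ToyInst`, `reglistStart`, `reglistLen`, `adjustIndex`, `printBoundedList` and `makeLdmExample` are hypothetical names, and nothing like them exists in MCInst today. The idea is that the decoder records where the list starts and how long it is, and the printer maps every static index through that information.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: operands by printed name, plus the reglist position and
// length that the decoder would have to record (hypothetical fields).
struct ToyInst {
  std::vector<std::string> ops;
  std::size_t reglistStart;
  std::size_t reglistLen;
};

// Maps an index from the static td layout (reglist = one slot) to the
// decoded layout (reglist = reglistLen operands).
std::size_t adjustIndex(const ToyInst &mi, std::size_t staticIdx) {
  if (staticIdx <= mi.reglistStart)
    return staticIdx;
  return staticIdx + mi.reglistLen - 1;
}

// Prints only the operands that belong to the list, not everything up to
// the end of the operand vector.
std::string printBoundedList(const ToyInst &mi) {
  std::string out = "{";
  for (std::size_t i = 0; i < mi.reglistLen; ++i) {
    if (i != 0)
      out += ", ";
    out += mi.ops[mi.reglistStart + i];
  }
  return out + "}";
}

// `ldm.w r0, {r1, r2, r3}` with the reglist decoded at indices 1 to 3.
ToyInst makeLdmExample() {
  return ToyInst{{"r0", "r1", "r2", "r3", "al"}, 1, 3};
}
```

In this model the static predicate index 2 maps to `adjustIndex(makeLdmExample(), 2) == 4`, and `printBoundedList(makeLdmExample())` yields `{r1, r2, r3}`. The drawback is that every printer method taking an operand index would have to go through such a mapping.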
