Skip to content

Fix broken disassembly of floating point immediates on big endian hosts #2222

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 21, 2023

Conversation

huth
Copy link
Contributor

@huth huth commented Dec 20, 2023

Disassembling single floating points with immediate values currently gives wrong results on big endian hosts (like s390x), e.g.:

 ./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
  0  f2 3c 44 22 40 49 0e 56  fadd.s     #0.000000, fp0

While it should be (like on x86):

 ./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
  0  f2 3c 44 22 40 49 0e 56  fadd.s     #3.141500, fp0

The problem is that these single float values are supposed to be stored in the 32-bit "simm" field of struct cs_m68k_op (see e.g. the printing of M68K_FPU_SIZE_SINGLE in printAddressingMode() in M68KInstPrinter.c), but currently the immediate is only written to the 64-bit "imm" field of the union in cs_m68k_op. This works on little endian systems, since the least significant bytes overlap in the union there. For example, let's assume that the value 0x01020304 gets written to "imm":

 04 03 02 01 00 00 00 00    uint64_t imm
 xx xx xx xx xx xx xx xx    double dimm;
 xx xx xx xx .. .. .. ..    float simm;

But on big endian hosts, the important bytes do not overlap, so "simm" is always zero there:

 00 00 00 00 01 02 03 04    uint64_t imm
 xx xx xx xx xx xx xx xx    double dimm;
 xx xx xx xx .. .. .. ..    float simm;

To fix the problem, let's always set "simm" explicitly, this works on both, big endian and little endian hosts.

Thanks to Michal Schulz for his initial analysis of the problem (in #1710) and to Travis Finkenauer for providing an easy example to reproduce the issue (in #1931).

Closes: #1710

Disassembling single floating points with immediate values currently
gives wrong results on big endian hosts (like s390x), e.g.:

 ./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
  0  f2 3c 44 22 40 49 0e 56  fadd.s     #0.000000, fp0

While it should be (like on x86):

 ./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
  0  f2 3c 44 22 40 49 0e 56  fadd.s     capstone-engine#3.141500, fp0

The problem is that these single float values are supposed to be stored
in the 32-bit "simm" field of struct cs_m68k_op (see e.g. the printing
of M68K_FPU_SIZE_SINGLE in printAddressingMode() in M68KInstPrinter.c),
but currently the immediate is only written to the 64-bit "imm" field
of the union in cs_m68k_op. This works on little endian systems, since
the least significant bytes overlap in the union there. For example,
let's assume that the value 0x01020304 gets written to "imm":

 04 03 02 01 00 00 00 00    uint64_t imm
 xx xx xx xx xx xx xx xx    double dimm;
 xx xx xx xx .. .. .. ..    float simm;

But on big endian hosts, the important bytes do not overlap, so "simm"
is always zero there:

 00 00 00 00 01 02 03 04    uint64_t imm
 xx xx xx xx xx xx xx xx    double dimm;
 xx xx xx xx .. .. .. ..    float simm;

To fix the problem, let's always set "simm" explicitly, this works on
both, big endian and little endian hosts.

Thanks to Michal Schulz for his initial analysis of the problem
(in capstone-engine#1710) and to Travis Finkenauer for providing an easy example
to reproduce the issue (in capstone-engine#1931).

Closes: capstone-engine#1710
Copy link
Collaborator

@Rot127 Rot127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Though what is about case 0x5 and the default case?
Aren't they broken as well? If yes, it would be great if you could quickly fix them here as well.

@huth
Copy link
Contributor Author

huth commented Dec 20, 2023

@Rot127 : case 0x05 is fine since it is about double floats that have 64 bit length - so this aliases perfectly well with the "imm" that gets written by get_ea_mode_op(). And the "default" case does not seem to use get_ea_mode_op() at all, i.e. it does not set the "imm" field at all, so I assume that's not affected.

@Rot127
Copy link
Collaborator

Rot127 commented Dec 20, 2023

@kabeor Please take a look.

@kabeor
Copy link
Member

kabeor commented Dec 21, 2023

Looks cool, thanks!

@kabeor kabeor merged commit e3a2b4c into capstone-engine:next Dec 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

M68K: Float disassembly (immediate) broken on BigEndian targets
3 participants