Skip to content

Bit reversing instructions #1382

Open
Open
@MaxGraey

Description

@MaxGraey

Future features mentioned a nice set for instruction set future improvements. For example bswap and bswap16 but missing bireverse/bitrev/rbit operations which I think quite important due to it could be useful for wide range of compression algorithms, Cooley–Tukey's FFT, FHT, dyadic rational numbers and etc.

Clang has builtins for that:
__builtin_bitreverse8
__builtin_bitreverse16
__builtin_bitreverse32
__builtin_bitreverse64

which quite useful due to implement bit reversing is quite challenging and some architectures like ARM64 could be lowered to single instruction RBIT.

So it would be interesting to hear your opinion on whether to add bitrev to post-MVP?

Proposed unary instructions:

i32.bitrev8
i32.bitrev16
i32.bitrev
i64.bitrev

Implementation details for runtimes

For x64 / ARMv7 / RISCV / MIPS / POWER and etc:

  • i32.bitrev8:
    one read from 8-bit lookup table
  • i32.bitrev16:
    two reads from 8-bit lookup table
  • i32.bitrev:
    four reads from 8-bit lookup table
  • i64.bitrev:
    use Knuth's method from Hacker's delight or 8 lookups if it faster on target architecture.

For more modern x64 architecture like Zen3 and Skylake bit reversing could be implemented via PDEP / PEXT instructions.

For ARM64:

  • i32.bitrev8:
    rbit w1, w0
    lsr  w0, w1, #24
  • i32.bitrev16:
    rbit w1, w0
    lsr  w0, w1, #16
  • i32.bitrev:
    rbit w0, w1
  • i64.bitrev:
    rbit x0, x1

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions