Description
Note to x86: x86 is not part of this list, because we cannot generate all tables in C.
Refer to capstone-engine/llvm-capstone#13 for details.
Note about changes introduced with auto-sync: For a preview of the changes coming in v6, please take a look at the WIP release guide.
This issue tracks the auto-sync refactoring and implementation effort of architecture modules.
The table below lists the responsible developers for each architecture.
In progress
Arch | CS PR | llvm-capstone PR | Part of (planned) release | Assigned developer(s) | Based on LLVM repo |
---|---|---|---|---|---|
SPARC | #2704 | capstone-engine/llvm-capstone#81 | v6 | @Rot127 | LLVM-project |
.td edits upstreamed
Most LLVM td files are missing some information about instructions (memory reads/writes, operands incorrectly assigned as in/out, etc.). Since we rely on this information, we need to fix it. Those fixes should be upstreamed to LLVM.
- Alpha (no longer maintained)
- ARM
- AArch64
- MIPS
- PPC
- SPARC
- Xtensa: Fix Xtensa reachable assert. llvm-capstone#62
Done
Arch | PR | Part of release | Assigned developer(s) | LLVM repo |
---|---|---|---|---|
Alpha | #2071 | v6 | @R33v0LT | LLVM-project (release v3.0) |
ARC | #2570 | v6 | @R33v0LT | LLVM-project |
AArch64 | #2026 | v6 | @Rot127 | LLVM-project |
ARM | #1949 | v6 | @Rot127 | LLVM-project |
PPC | #2013 | v6 | @Rot127 | LLVM-project |
TriCore | #1973 | v5 | @imbillow | TriDis |
HPPA | #2265 | v6 | @R33v0LT | Not Auto-sync based |
LoongArch | #2349 | v6 | @jiegec | LLVM-project |
MIPS | #2410 | v6 | @wargio | LLVM-project |
SystemZ | #2462 | v6 | @Rot127 | LLVM-project |
Xtensa | #2380 | v6 | @imbillow | LLVM-project |
BPF | #2568 | v6 | @Roeegg2 | RFC, Linux kernel docs |
Arch extensions
Adding CPU extensions which are not part of upstream LLVM is easier now.
They are tracked here.
Arch | Extension name | issue | previous attempt/notes | Done |
---|---|---|---|---|
PPC | VLE | #2241 | https://lists.llvm.org/pipermail/llvm-dev/2014-July/074613.html | No |
PPC | PS (Paired-Single) | None | https://reviews.llvm.org/D85137 | Yes |
Mips | NanoMips | None | Mediatek LLVM: https://github.com/MediaTek-Labs/llvm-project/tree/mtk-pub/nanomips-llvm16, more context: rizinorg/ideas#5 | Yes |
Mips | EE | None | Not in LLVM, see: #940 (comment) | No |
Effort level of not refactored/implemented archs
Arch | Number of operand groups | Generates | Note | Implementation type | Difficulty level |
---|---|---|---|---|---|
AVR | ~3 | Yes | None | New | Easy |
CSKY | ~7 | Yes | None | New | Medium |
DirectX | ~1 | Yes | Deviates from common design. | New | Medium-Hard |
EVM | ~2 | Not tested | Very small module, llvm repo: https://github.com/etclabscore/evm_llvm | New | Easy |
Hexagon | ~2 | No | Deviates from common design. | New | Hard |
Lanai | ~10 | Yes | None | New | Easy |
M68k | ~28 | Yes | None | Refactor | Medium |
MSP430 | ~6 | Yes | None | New | Easy |
SPIRV | ~9 | No | td files faulty | New | Medium |
VE | ~8 | Yes | None | New | Medium |
XCore | ~15 | No | td files faulty | Refactor | Medium |
Note to RISC-V: RISC-V will not be generated via LLVM because the LLVM architecture definitions are not precise enough for our use case. Instead, a SAIL based generator will be used (#2392).
Legend
- Number of operand groups: Operand groups which have distinct print functions. Indicates the effort to implement the LLVM <-> CS mapping code (fill cs_detail and the like).
- Generates: inc files generate with the most recent backends.
- Note: Anything worth noting.
- Implementation type: Refactor the current implementation or implement a new arch module.
- Difficulty level: Guessed difficulty of this arch (based on the points above and on complexity, such as the number of instructions). Note that even "Easy" means you have to familiarize yourself with how the LLVM definitions and the updater work. My guess is it will take at least a week of work.
Getting started
- If you would like to refactor an architecture module or implement a new one, please comment here and we will add you. We can also point you to important information.
- Please open a draft PR once you have made the first commit, so the progress is visible and there is a place for discussion.
- Please refer to the auto-sync documentation to learn how to refactor or implement an architecture with auto-sync.
TODO for refactored archs
List of missing things which should be done before v6 to get a nice round package.
Capstone
- Missing alias support for SystemZ - SystemZ misses optional alias details #2738
- Missing alias support for LoongArch - SystemZ misses optional alias details #2738
- Update docs with ASUpdater.py instructions.
- Modernize Capstone testing #1984
- Update all archs to LLVM 18
- Remove tablegen files from suite.
- Add CS assert version and add the asserts to the LLVM files again.
- Wrap all possible code into CAPSTONE_DIET.
- Run 0x0 to 0xffffffff as input once on ARM, PPC, AArch64 (with details enabled) to check for segfaults.
- name2id docs. Parameter max should be changed to table size and in the loop be max - 1
- Consider having alias details and real details live side by side, so users do not need to decide for one (how would this play together with CAPSTONE_DIET?).
- Possibly [Auto-Sync] Generate general instruction encoding format #2152
- AArch64 missing details tasks #2196
- Expose PPC instruction formats on the public interface
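The exhaustive-input item in the list above (feeding every value from 0x0 to 0xffffffff to a decoder) could be driven by a loop like the following. This is only a sketch under stated assumptions: `sweep_words`, `decode_fn` and `stub_decode` are hypothetical names that do not exist in Capstone, and the words are emitted as little-endian bytes, which does not affect coverage when the full range is swept. In a real run the callback would wrap a Capstone handle with details enabled and call the disassembler; here a stand-in decoder is used so the harness stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: returns non-zero if the bytes decoded. */
typedef int (*decode_fn)(const uint8_t *bytes, size_t len, void *ctx);

/*
 * Feeds every 32-bit word in [first, last] to `decode` as four
 * little-endian bytes and returns how many words decoded.
 * The loop exits via an explicit equality check, so it does not
 * overflow when last == 0xffffffff.
 */
uint64_t sweep_words(uint32_t first, uint32_t last, decode_fn decode, void *ctx)
{
	assert(decode != NULL);
	assert(first <= last);

	uint64_t decoded = 0;
	uint32_t word = first;
	for (;;) {
		const uint8_t bytes[4] = {
			(uint8_t)(word & 0xffu),
			(uint8_t)((word >> 8) & 0xffu),
			(uint8_t)((word >> 16) & 0xffu),
			(uint8_t)((word >> 24) & 0xffu),
		};
		if (decode(bytes, sizeof(bytes), ctx))
			decoded++;
		if (word == last)
			break;
		word++;
	}
	return decoded;
}

/*
 * Stand-in decoder (an assumption for illustration only): counts its
 * calls in `ctx` and "accepts" words whose lowest byte is non-zero.
 */
int stub_decode(const uint8_t *bytes, size_t len, void *ctx)
{
	uint64_t *calls = ctx;
	(*calls)++;
	return len == 4 && bytes[0] != 0;
}
```

Calling `sweep_words(0x0u, 0xffffffffu, decoder, ctx)` with a real decoder callback covers the whole range; a crash inside the callback would then show up as a segfault of the harness itself, which is what the item asks to check for.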
LLVM revisions
- ARM td files
- PPC td files (have there been any)
- ARM missing pop alias (d64f749)
- Segfaults for PPC
- AArch64 missing FeatureAll check
- More PPC predicates
- PPC false instruction definitions (which should be alias)
- AArch64 - Open issue about missing memory operand info.
- AArch64 - Open issue about patterns not assigned to their instructions (SVE, SME).
- capstone-engine/llvm-capstone@c3484b1
- Remove tPOP and tPUSH as real isntructions. llvm-capstone#39
- Fix BL definition. BL is no call and does not read SP. llvm-capstone#40
Auto-Sync
- Add refactor setting to auto-sync updater.
- Add auto-sync unit tests
- Translate template functions as functions, not as macros.
Backends
- Generate decoding/printing macros as functions, if there is only a single version (allows proper debugging, which would be a blessing).
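To illustrate why emitting a function instead of a macro helps debugging, here is a minimal sketch of the two styles. All names (`DEFINE_decode_imm`, `decode_imm_8`, `decode_imm`) are hypothetical and are not taken from the generated Capstone or LLVM code; the immediate-masking logic is only a placeholder for a real decoding or printing routine.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Macro style: the template argument is pasted into the function name
 * and body. A debugger sees the whole body as one expanded line, so
 * stepping through it and setting breakpoints is awkward.
 */
#define DEFINE_decode_imm(bits) \
	uint32_t decode_imm_##bits(uint32_t raw) \
	{ \
		return raw & ((UINT32_C(1) << (bits)) - 1u); \
	}
DEFINE_decode_imm(8)

/*
 * Function style: the template argument becomes an ordinary parameter.
 * The body has real source lines and can be stepped through normally.
 */
uint32_t decode_imm(uint32_t raw, unsigned bits)
{
	assert(bits > 0 && bits < 32);
	return raw & ((UINT32_C(1) << bits) - 1u);
}
```

Both variants compute the same value, for example `decode_imm_8(0x1ff)` and `decode_imm(0x1ff, 8)` each yield `0xff`. When only a single instantiation exists, the macro adds nothing over the plain function.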
ARM
- Add general alias and alias operand handling.
- Add vector layout information
- Set post_index when base register is tied. Just to make sure to hit every case.
- Encoding info
- Move data type insn mapping to own auto-sync class.
- error C4146: unary minus operator applied to unsigned type, result still unsigned and linker errors on Windows, VS2022 latest version #2193
PPC
- Encoding info
AArch64
- Encoding info