[AAELF64] AArch64 Veneer Types Recognized by Binary Analysis Tools. #333

ilinpv · 2025-06-11T17:05:26Z

The patch introduces definitions for standard veneers on AArch64 to improve recognition by binary analysis tools.

maksfb · 2025-06-12T04:09:46Z

aaelf64/aaelf64.rst

+  __AArch64Abs[XO]LongThunk_<target>:
+    B <target>


Is LLD the only linker that generates this type of veneer? I don't know if the inclusion of AbsLong in the name was intentional.

This one is observed in LLD. At this point, in addition to LLD, I've gathered veneer names produced by ld.bfd, mold (I guess @rui314 knows more about mold thunks), and the Go linker.

I just implemented a long-range extension thunk in mold. I chose this code sequence: https://github.com/rui314/mold/blob/47f6c2839c15cd0e982956c760428617ea35a0e9/src/arch-arm64.cc#L604-L614

maksfb · 2025-06-12T04:21:00Z

aaelf64/aaelf64.rst

+
+  __AArch64AbsLongThunk_<target>:
+    LDR X16, =<target>
+    BR  X16


Is there an expectation that the literal pool follows the instruction sequence?

MaskRay · 2025-06-12T05:12:34Z

Thanks! The description looks good to me.

smithp35 · 2025-06-12T14:39:23Z

aaelf64/aaelf64.rst

+
+  __AArch64AbsLongThunk_<target>:
+    LDR X16, =<target>
+    BR  X16


Yes. In theory a linker could put it in a separate section within 1 MiB; however, that would require some special-case code to ensure it stayed in range, for not much gain. So in practice every veneer implementation that I've seen has the literal immediately afterwards.

Although not an issue with the above implementation, a veneer with an odd number of instructions may need 4 bytes of padding to ensure 8-byte alignment of the literal pool (for targets that have strict alignment enabled).
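To make the padding point concrete, here is a minimal Python sketch of the arithmetic. The function name `literal_padding` and the example addresses are illustrative assumptions, not taken from any linker.

```python
INSN_SIZE = 4      # every AArch64 instruction is 4 bytes
LITERAL_ALIGN = 8  # a 64-bit literal is naturally 8-byte aligned


def literal_padding(veneer_addr, num_insns):
    """Return the padding bytes needed between the code and its 64-bit literal.

    veneer_addr is the (hypothetical) address of the first veneer instruction.
    """
    end_of_code = veneer_addr + num_insns * INSN_SIZE
    return -end_of_code % LITERAL_ALIGN


# Two instructions (LDR, BR) at an 8-byte-aligned address need no padding.
print(literal_padding(0x1000, 2))
# A hypothetical three-instruction veneer at the same address needs 4 bytes.
print(literal_padding(0x1000, 3))
```

The same two-instruction veneer would also need 4 bytes of padding if it started at an address that is only 4-byte aligned, which is why the start alignment matters as much as the instruction count.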

smithp35 · 2025-06-12T14:45:25Z

aaelf64/aaelf64.rst

+  __AArch64Abs[XO]LongThunk_<target>:
+    B <target>


The name is a side-effect of LLD's implementation. There is a single veneer that chooses whether it is short or long depending on the distance.

I think there could be a patch to LLD that changes the name when the choice is made, although to date it's not been important enough to implement.

At the moment ld.bfd doesn't implement this shorter form of veneer. I don't know about mold.

smithp35 · 2025-06-12T14:57:30Z

aaelf64/aaelf64.rst

+    MOVK X16, #:abs_g2_nc:<target>, LSL #32
+    MOVK X16, #:abs_g3:<target>,    LSL #48
+    BR   X16
+


There is the ld.bfd veneer which is similar to the __AArch64AbsLongThunk_ except it is also position independent.

Instead of loading the address directly, it loads the offset from the PC:

    LDR X16, =.Loffset_to_target
    ADR X17, #0                  // X17 = current PC
    ADD X16, X16, X17
    BR  X16
  .Loffset_to_target:
    .xword target - (. - 12)     // R_AARCH64_PREL64 target + 12

I don't know what the naming convention is for GNU ld. I think they may have a stubs section without individual names for each veneer.
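The arithmetic of the position-independent sequence quoted above can be checked with a small Python model. This is only a sketch: it assumes the literal sits immediately after the four instructions, and the function name `bfd_pic_veneer_target` is made up for illustration.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def bfd_pic_veneer_target(veneer_addr, target):
    """Model the LDR/ADR/ADD/BR sequence and return the branch destination."""
    adr_pc = veneer_addr + 4         # ADR X17, #0 is the second instruction
    literal_addr = veneer_addr + 16  # four 4-byte instructions precede the literal

    # .xword target - (. - 12), where '.' is the literal's own address.
    literal = (target - (literal_addr - 12)) & MASK64

    x16 = literal               # LDR X16 loads the PC-relative offset
    x17 = adr_pc                # ADR X17, #0 yields the address of the ADR itself
    x16 = (x16 + x17) & MASK64  # ADD X16, X16, X17
    return x16                  # BR X16 branches here


# Forward and backward targets both resolve to the intended address.
print(hex(bfd_pic_veneer_target(0x10000, 0x2_0000_0000)))
print(hex(bfd_pic_veneer_target(0x8_0000_0000, 0x1000)))
```

The `- 12` in the literal works because `. - 12` is the address of the `ADR` instruction, so the stored offset and the value in X17 cancel out wherever the veneer is loaded.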

It seems the Go linker generates such a GOT-based veneer. However, I'm not sure which relocation is generated for that kind of veneer, or whether we can rely on it in BOLT. Looking at the code, it seems they have a jump relocation that refers to the veneer.

smithp35 · 2025-06-12T15:03:04Z

aaelf64/aaelf64.rst

+    ADD  X16, X16, :lo12:<target>
+    BR   X16
+
+Note that ``<target>`` may be an entry in the PLT.


That's the case for all of the veneers described here. If there is a B or BL to the PLT entry for the target and the PLT is more than 128 MiB away from the B/BL, a veneer will be generated.

Although it is not part of the veneer code, when there is an indirect branch to the PLT entry, care is needed to add a BTI to the PLT entry if we're generating a program that sets AARCH64_FEATURE_1_BTI.

smithp35 · 2025-06-12T15:06:02Z

aaelf64/aaelf64.rst

+
+__AArch64BTIThunk_ BTI Landing Pad Veneers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+


We may be able to handle some of this by referring to https://github.com/ARM-software/abi-aa/blob/main/sysvabi64/sysvabi64.rst#971tool-requirements-for-generating-bti-instructions

Wilco1 · 2025-06-12T20:25:08Z

aaelf64/aaelf64.rst

+    MOVK X16, #:abs_g3:<target>,    LSL #48
+    BR   X16
+
+Note that some of the MOVK instructions may be omitted if their corresponding 16-bit segments of the address are zero and do not need to be explicitly set.


You can omit a MOVK if its 16-bit value is 0x0000 when MOVZ is used, or 0xffff when MOVN is used. It's unlikely that linkers optimize anything but the top 16 bits, given that this veneer will only be needed if the distance is over 4GB.
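As a rough illustration of the MOVZ case, the Python sketch below builds the instruction list for an absolute address and skips every all-zero 16-bit chunk. It deliberately goes further than a linker is likely to (it drops any zero chunk, not just the top one) and it does not model the MOVN form; the helper name `movz_movk_sequence` is hypothetical.

```python
def movz_movk_sequence(addr, reg="X16"):
    """Return MOVZ/MOVK mnemonics that materialise addr, omitting zero chunks."""
    if not 0 <= addr < (1 << 64):
        raise ValueError("address must fit in 64 bits")

    # Split the address into four 16-bit chunks, lowest first.
    chunks = [(shift, (addr >> shift) & 0xFFFF) for shift in (0, 16, 32, 48)]
    nonzero = [(shift, value) for shift, value in chunks if value != 0]
    if not nonzero:
        return [f"MOVZ {reg}, #0x0"]  # address 0 still needs one instruction

    insns = []
    for n, (shift, value) in enumerate(nonzero):
        # The first instruction zeroes the other bits; the rest keep them.
        op = "MOVZ" if n == 0 else "MOVK"
        suffix = f", LSL #{shift}" if shift else ""
        insns.append(f"{op} {reg}, #0x{value:x}{suffix}")
    return insns


for line in movz_movk_sequence(0x0000_0001_2345_0000):
    print(line)
```

A tool that pattern-matches this veneer therefore has to accept between one and four move instructions before the `BR`, rather than a fixed-length sequence.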

Wilco1 · 2025-06-12T20:36:25Z

aaelf64/aaelf64.rst

+    ADD  X16, X16, :lo12:<target>
+    BR   X16
+
+Note that ``<target>`` may be an entry in the PLT.


Agreed - and if lazy binding is disabled linkers can shortcut the PLT by directly loading from the GOT.

This is also better than using a literal load in the .text section and needing 2 variants for each veneer...

It appears we need a bit of ABI design on linker veneers...

Wilco1 · 2025-06-12T20:45:39Z

aaelf64/aaelf64.rst

+
+  __AArch64AbsLongThunk_<target>:
+    LDR X16, =<target>
+    BR  X16


It would be feasible to first emit a number of veneers and only then the data.

However, it's not obvious why we need both this and the execute-only variant. There are an infinite number of possible veneer sequences, but for each range we need only one. We can always emit only execute-only veneers.

Note that none of the code models currently support a .text size larger than 2GB, so you'd need assembler trickery with absolute symbols to force these >4GB veneers.

Wilco1 · 2025-06-13T12:58:46Z

aaelf64/aaelf64.rst

+    MOVK X16, #:abs_g2_nc:<target>, LSL #32
+    MOVK X16, #:abs_g3:<target>,    LSL #48
+    BR   X16
+


If the GOT is within 4GB, doing ADRP/LDR/BR from the GOT would be a better option since it avoids placing literals in .text. And if not, it makes sense to make this veneer execute-only too and use MOVZ/MOVK for the offset.

smithp35

There's definitely some room for improvement in thunks. Would need a small amount of implementation work in linkers.

smithp35 · 2025-06-13T13:35:34Z

aaelf64/aaelf64.rst

+    ADD  X16, X16, :lo12:<target>
+    BR   X16
+
+Note that ``<target>`` may be an entry in the PLT.


Yes, at the stage that veneers are generated we'll know whether there's a .got.plt entry for the target, so we should be able to load from .got.plt when -znow is used. Off the top of my head:

    ADRP x16, :got: target@got.plt
    LDR  x16, [x16, :got_lo12: target@got.plt]
    BR   x16

smithp35 · 2025-06-13T13:48:38Z

aaelf64/aaelf64.rst

+    MOVK X16, #:abs_g2_nc:<target>, LSL #32
+    MOVK X16, #:abs_g3:<target>,    LSL #48
+    BR   X16
+


One possible limitation for some linkers is that veneers are often added quite late and the GOT may be fixed at this point. I think it should be possible to add to the GOT for ld.lld.

I would expect that, for most programs, if the GOT were within 4 GiB then so would the destination function, so the ADRP/ADD sequence could be used.

smithp35 · 2025-06-13T13:53:11Z

aaelf64/aaelf64.rst

+
+  __AArch64AbsLongThunk_<target>:
+    LDR X16, =<target>
+    BR  X16


I've occasionally seen people with embedded systems (with the MMU not yet set up) have a boot loader in low memory that jumps to a high address.

I suspect we could default to using the ADRP thunk (4 GiB range) universally, either by using it when in range and falling back to the longer sequence when out of range, or, for lld at least, by defaulting --pic-veneer to true, which forces use of the ADRP thunk.
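A sketch of that fallback policy in Python, purely to show the range checks involved. The labels "short", "adrp" and "long" and the function `choose_veneer` are illustrative names, not LLD's; the ranges used are the architectural ±128 MiB for B/BL and ±4 GiB for ADRP.

```python
B_RANGE = 128 * 1024 * 1024  # B/BL reach +/-128 MiB from the branch
ADRP_RANGE = 4 * 1024 ** 3   # ADRP reaches +/-4 GiB, in 4 KiB pages
PAGE_MASK = ~0xFFF


def choose_veneer(veneer_addr, target):
    """Pick the shortest veneer form that can reach target from veneer_addr."""
    offset = target - veneer_addr
    if -B_RANGE <= offset < B_RANGE:
        return "short"  # a single B <target>

    # ADRP works on page addresses, so compare page to page.
    page_delta = (target & PAGE_MASK) - (veneer_addr & PAGE_MASK)
    if -ADRP_RANGE <= page_delta < ADRP_RANGE:
        return "adrp"   # ADRP / ADD / BR

    return "long"       # absolute sequence (literal load or MOVZ/MOVK)


print(choose_veneer(0x1000, 0x2000))    # target close by
print(choose_veneer(0, 0x4000_0000))    # 1 GiB away
print(choose_veneer(0, 0x2_0000_0000))  # 8 GiB away
```

The boot-loader case mentioned above is the one that lands in the "long" branch: a low-memory caller jumping to a high absolute address that is out of ADRP range.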

[AAELF64] AArch64 Veneer Types Recognized by Binary Analysis Tools.

49ddf46

ilinpv mentioned this pull request Jun 11, 2025

Consider an ABI extension to define metadata for binary analysis #297

Open

maksfb reviewed Jun 12, 2025

smithp35 reviewed Jun 12, 2025

Wilco1 reviewed Jun 13, 2025

smithp35 reviewed Jun 13, 2025

Merge branch 'ARM-software:main' into main

6361033

smithp35 mentioned this pull request Jul 11, 2025

ELF: Introduce R_AARCH64_PATCHINST relocation type. llvm/llvm-project#133534

Open
