
[[riscv-elf]]
= RISC-V ELF Specification
ifeval::["{docname}" == "riscv-elf"]
include::prelude.adoc[]
endif::[]

== Code models

The RISC-V architecture constrains how positions in the address space can be
addressed.  No single instruction can refer to an arbitrary memory position
using a literal as its argument.  Instead, several instructions are combined to
refer to a memory position via its literal; where that is not possible, other
data structures help the code address the memory space.  The coding conventions
governing their use are known as code models.

However, some code models can't access the whole address space. The linker may
raise an error if it cannot adjust the instructions to access the target address
in the current code model.

=== Medium low code model

The medium low code model, or `medlow`, allows the code to address the whole RV32
address space or the lowest 2 GiB and highest 2 GiB of the RV64 address space
(`0x0` ~ `0x000000007FFFF7FF` and `0xFFFFFFFF7FFFF800` ~ `0xFFFFFFFFFFFFFFFF`).
A 32-bit address literal is produced by pairing `lui` with a load or store
instruction when referring to an object, or with `addi` when calculating an
address.

The following instructions show how to load a value, store a value, or calculate
an address in the `medlow` code model.

[,asm]
----
    # Load value from a symbol
    lui  a0, %hi(symbol)
    lw   a0, %lo(symbol)(a0)

    # Store value to a symbol
    lui  a0, %hi(symbol)
    sw   a1, %lo(symbol)(a0)

    # Calculate address
    lui  a0, %hi(symbol)
    addi a0, a0, %lo(symbol)
----

NOTE: The ranges on RV64 are not `0x0` ~ `0x000000007FFFFFFF` and
`0xFFFFFFFF80000000` ~ `0xFFFFFFFFFFFFFFFF` due to RISC-V's sign-extension of
immediates; the following code fragments show where the ranges come from:

[,asm]
----
# Largest positive number:
lui a0, 0x7ffff # a0 = 0x7ffff000
addi a0, a0, 0x7ff # a0 = a0 + 2047 = 0x000000007FFFF7FF

# Smallest negative number:
lui a0, 0x80000 # a0 = 0xffffffff80000000
addi a0, a0, -0x800 # a0 = a0 + -2048 = 0xFFFFFFFF7FFFF800
----
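
The same arithmetic can be replayed outside the assembler.  The following Python
sketch is illustrative only; the helper names `sign_extend` and `lui_addi_rv64`
are assumptions made for this example and are not defined by this
specification.  It models how RV64 sign-extends both the `lui` result and the
`addi` immediate, which is where the two range limits come from.

[,python]
----
MASK64 = (1 << 64) - 1

def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def lui_addi_rv64(hi20, lo12):
    """Model `lui rd, hi20` followed by `addi rd, rd, lo12` on RV64."""
    upper = sign_extend(hi20 << 12, 32)  # lui sign-extends its 32-bit result
    lower = sign_extend(lo12, 12)        # addi sign-extends its 12-bit immediate
    return (upper + lower) & MASK64

largest_positive = lui_addi_rv64(0x7FFFF, 0x7FF)
smallest_negative = lui_addi_rv64(0x80000, -0x800)
print(hex(largest_positive))   # 0x7ffff7ff
print(hex(smallest_negative))  # 0xffffffff7ffff800
----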

=== Medium any code model

The medium any code model, or `medany`, allows the code to address the range
between -2 GiB and +2 GiB from its position.  A signed 32-bit offset, relative
to the value of the `pc` register, is produced by pairing `auipc` with a load
or store instruction when referring to an object, or with `addi` when
calculating an address.

As a special edge case, undefined weak symbols must still be supported. Their
addresses will be 0, which may be out of range depending on the address at which
the code is linked. Any references to possibly-undefined weak symbols should be
made indirectly through the GOT, as is done for position-independent code. Not
doing so is deprecated, and a future version of this specification will require
using the GOT rather than merely advising it.

NOTE: This is not yet a requirement as existing toolchains predating this part
of the specification do not adhere to this, and without improvements to linker
relaxation support doing so would regress performance and code size.

The following instructions show how to load a value, store a value, or calculate
an address in the medany code model.

[,asm]
----
         # Load value from a symbol
.Ltmp0:  auipc a0, %pcrel_hi(symbol)
         lw    a0, %pcrel_lo(.Ltmp0)(a0)

         # Store value to a symbol
.Ltmp1:  auipc a0, %pcrel_hi(symbol)
         sw    a1, %pcrel_lo(.Ltmp1)(a0)

         # Calculate address
.Ltmp2:  auipc a0, %pcrel_hi(symbol)
         addi  a0, a0, %pcrel_lo(.Ltmp2)
----
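
As a hedged illustration of how the `%pcrel_hi` and `%pcrel_lo` values relate to
each other, the following Python sketch splits a PC-relative offset into the two
immediates and reassembles it.  The function names are hypothetical, and the
range check assumes RV64, where the `auipc` immediate must fit in a signed
20-bit field.

[,python]
----
def pcrel_split(symbol, pc, addend=0):
    """Split (symbol + addend - pc) into the auipc and low-12 immediates."""
    offset = symbol + addend - pc
    hi20 = (offset + 0x800) >> 12   # round so that lo12 lands in [-2048, 2047]
    lo12 = offset - (hi20 << 12)
    if not -(1 << 19) <= hi20 < (1 << 19):
        raise ValueError("target is out of range for auipc on RV64")
    return hi20, lo12

def auipc_addi(pc, hi20, lo12):
    """Model `auipc rd, hi20` at address `pc` followed by `addi rd, rd, lo12`."""
    return pc + (hi20 << 12) + lo12

hi, lo = pcrel_split(0x10FFC, 0x10000)
print(hi, lo)                           # 1 -4
print(hex(auipc_addi(0x10000, hi, lo)))  # 0x10ffc
----

The `+ 0x800` rounding compensates for the sign extension of the low 12 bits:
an offset of `0xFFC` is expressed as one page forward and four bytes back.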

NOTE: Although the generated code is technically position independent, it's not
suitable for ELF shared libraries due to differing symbol interposition rules;
for that, please use the medium position independent code model below.

=== Medium position independent code model

This model is similar to the medium any code model, but uses the
<<Global Offset Table,global offset table>> (GOT) for non-local symbol addresses.

[,asm]
----
         # Load value from a local symbol
.Ltmp0:  auipc a0, %pcrel_hi(symbol)
         lw    a0, %pcrel_lo(.Ltmp0)(a0)

         # Store value to a local symbol
.Ltmp1:  auipc a0, %pcrel_hi(symbol)
         sw    a1, %pcrel_lo(.Ltmp1)(a0)

         # Calculate address of a local symbol
.Ltmp2:  auipc a0, %pcrel_hi(symbol)
         addi  a0, a0, %pcrel_lo(.Ltmp2)

         # Calculate address of non-local symbol
.Ltmp3:  auipc  a0, %got_pcrel_hi(symbol)
         l[w|d] a0, %pcrel_lo(.Ltmp3)(a0)
----

== Dynamic Linking

Any functions that use registers in a way that is incompatible with
the calling convention of the ABI in use must be annotated with
`STO_RISCV_VARIANT_CC`, as defined in <<Symbol Table>>.

NOTE: Vector registers have a variable size depending on the hardware
implementation and can be quite large. Saving/restoring all these vector
arguments in a run-time linker's lazy resolver would use a large amount of
stack space and hurt performance. The `STO_RISCV_VARIANT_CC` attribute requires
the run-time linker to resolve the symbol directly to prevent saving/restoring
any vector registers.

== {Cpp} Name Mangling

{Cpp} name mangling for RISC-V follows
the _Itanium {Cpp} ABI_ <<itanium-cxx-abi>>;
plus mangling for RISC-V vector data types and vector mask types,
which are defined in the following section.

See the "Type encodings" section in _Itanium {Cpp} ABI_
for more detail on how to mangle types. Note that `__bf16` is mangled in the
same way as `std::bfloat16_t`.

=== Name Mangling for Vector Data Types, Vector Mask Types and Vector Tuple Types.

The vector data types and vector mask types, as defined in the section
<<Vector type sizes and alignments>>, are treated as vendor-extended types in
the _Itanium {Cpp} ABI_ <<itanium-cxx-abi>>. The mangled name for
these types is `"u"<len>"__rvv_"<type-name>`. Specifically,
the type name is prefixed with `__rvv_`, which is prefixed by
a decimal string giving the length of `__rvv_<type-name>`, which is prefixed by "u".

For example:

[,c]
----
    void foo(vint8m1_t x);
----

is mangled as

[,c]
----
    _Z3foou15__rvv_vint8m1_t
----

[source,abnf]
----
mangled-name = "u" len "__rvv_" type-name

len = nonzero *DIGIT
nonzero = "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"

type-name = identifier-nondigit *identifier-char
identifier-nondigit = ALPHA / "_"
identifier-char = identifier-nondigit / DIGIT
----
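
The mangling rule can be sketched in a few lines of Python.  This is an
illustrative helper rather than part of any toolchain: the names
`mangle_rvv_type` and `mangle_function` are hypothetical, the sketch assumes
that type names may contain digits (as `vint8m1_t` requires), and it only
handles free functions whose parameters are all vector types.

[,python]
----
import re

# Identifier shape assumed for type names: a letter or underscore, followed
# by letters, digits or underscores.
_TYPE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def mangle_rvv_type(type_name):
    """Return the vendor-extended mangling `u<len>__rvv_<type-name>`."""
    if not _TYPE_NAME.fullmatch(type_name):
        raise ValueError(f"not a valid type name: {type_name!r}")
    vendor_name = "__rvv_" + type_name
    return f"u{len(vendor_name)}{vendor_name}"

def mangle_function(name, param_types):
    """Mangle a free function whose parameters are all RVV types."""
    params = "".join(mangle_rvv_type(t) for t in param_types)
    return f"_Z{len(name)}{name}{params}"

print(mangle_function("foo", ["vint8m1_t"]))  # _Z3foou15__rvv_vint8m1_t
----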

== ELF Object Files

The ELF object file format for RISC-V follows the
_Generic System V Application Binary Interface_ <<gabi>>
("gABI"); this specification only describes RISC-V-specific definitions.

=== File Header

The section below lists the defined RISC-V-specific values for several ELF
header fields; any fields not listed in this section have no RISC-V-specific
values.

e_ident::
  EI_CLASS::: Specifies the base ISA, either RV32 or RV64.
  Linking RV32 and RV64 code together is not supported.
+
--
[horizontal]
[[ELFCLASS64]]
ELFCLASS64:::: ELF-64 Object File
[horizontal]
[[ELFCLASS32]]
ELFCLASS32:::: ELF-32 Object File
--
  EI_DATA::: Specifies the endianness; either big-endian or little-endian.
  Linking big-endian and little-endian code together is not supported.
+
--
[horizontal]
ELFDATA2LSB:::: Little-endian Object File
ELFDATA2MSB:::: Big-endian Object File
--

e_machine:: Identifies the machine this ELF file targets.  Always contains
EM_RISCV (243) for RISC-V ELF files.

e_flags:: Describes the format of this ELF file.  These flags are used by the
linker to disallow linking ELF files with incompatible ABIs together.
<<e-flags-layout>> shows the layout of e_flags, and flag details are listed
below.
+
[[e-flags-layout]]
.Layout of e_flags
[cols="1,2,1,1,3,5"]
[width=80%]
|===
| Bit 0 | Bits 1 - 2 | Bit 3 | Bit 4 | Bits 5 - 23 | Bits 24 - 31

| RVC   | Float ABI  | RVE   | TSO   | *Reserved*  | *Non-standard extensions*
|===

+
--
  EF_RISCV_RVC (0x0001)::: This bit is set when the binary targets the C ABI,
  which allows instructions to be aligned to 16-bit boundaries (the base RV32
  and RV64 ISAs only allow 32-bit instruction alignment).  When linking
  objects which specify EF_RISCV_RVC, the linker is permitted to use RVC
  instructions such as C.JAL in the linker relaxation process.

  EF_RISCV_FLOAT_ABI_SOFT (0x0000):::
[[EF_RISCV_FLOAT_ABI_SOFT]]
  EF_RISCV_FLOAT_ABI_SINGLE (0x0002):::
[[EF_RISCV_FLOAT_ABI_SINGLE]]
  EF_RISCV_FLOAT_ABI_DOUBLE (0x0004):::
[[EF_RISCV_FLOAT_ABI_DOUBLE]]
  EF_RISCV_FLOAT_ABI_QUAD (0x0006):::
[[EF_RISCV_FLOAT_ABI_QUAD]] These flags identify the floating point
  ABI in use for this ELF file.  They store the largest floating-point type
  that ends up in registers as part of the ABI (but do not control if code
  generation is allowed to use floating-point internally).  The rule is that
  if you have a floating-point type in a register, then you also have all
  smaller floating-point types in registers.  For example _DOUBLE would
  store "float" and "double" values in F registers, but would not store "long
  double" values in F registers.  If none of the float ABI flags are set, the
  object is taken to use the soft-float ABI.

  EF_RISCV_FLOAT_ABI (0x0006)::: This macro is used as a mask to test for one
  of the above floating-point ABIs, e.g.,
  `(e_flags & EF_RISCV_FLOAT_ABI) == EF_RISCV_FLOAT_ABI_DOUBLE`.

[[EF_RISCV_RVE]]
  EF_RISCV_RVE (0x0008)::: This bit is set when the binary targets the E ABI.

  EF_RISCV_TSO (0x0010)::: This bit is set when the binary requires the RVTSO
  memory consistency model.

Until such a time that the *Reserved* bits (0x00ffffe0) are allocated by future
versions of this specification, they shall not be set by standard software.
Non-standard extensions are free to use bits 24-31 for any purpose. This may
conflict with other non-standard extensions.

NOTE: There is no provision for compatibility between conflicting uses of the
e_flags bits reserved for non-standard extensions, and many standard RISC-V
tools will ignore them. Do not use them unless you control both the toolchain
and the operating system, and the ABI differences are so significant they
cannot be done with a .RISCV.attributes tag nor an ELF note, such as using a
different syscall ABI.

==== Policy for Merge Objects With Different File Headers

This section describes the behavior when the input files come with different
file headers.

`e_ident` and `e_machine` should have exactly the same value in every input
file; otherwise the linker should raise an error.

`e_flags` has a different policy for each field:

  RVC::: Input files can have different values for the RVC field; the linker
  should set EF_RISCV_RVC in the output if any of the input objects has it
  set.

  Float ABI::: The linker should report an error if the input object files
  have different values for the float ABI field.

  RVE::: The linker should report an error if the input object files have
  different values for the RVE field.

  TSO::: Input files can have different values for the TSO field; the linker
  should set this field if any of the input objects have the TSO field set.

NOTE: The static linker may ignore the compatibility checks if all fields in the
`e_flags` are zero and all sections in the input file are non-executable
sections.

--
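
The field layout and the merge policy above can be summarised with a short
Python sketch.  It is a simplified model under stated assumptions, not a linker
implementation: the function names are hypothetical, reserved and non-standard
bits are passed through from the first input untouched, and the exemption for
inputs with all-zero flags and no executable sections is not modelled.

[,python]
----
EF_RISCV_RVC = 0x0001
EF_RISCV_FLOAT_ABI = 0x0006   # mask covering bits 1-2
EF_RISCV_RVE = 0x0008
EF_RISCV_TSO = 0x0010

FLOAT_ABI_NAMES = {0x0000: "soft", 0x0002: "single",
                   0x0004: "double", 0x0006: "quad"}

def decode_e_flags(e_flags):
    """Break e_flags into the standard fields."""
    return {
        "rvc": bool(e_flags & EF_RISCV_RVC),
        "float_abi": FLOAT_ABI_NAMES[e_flags & EF_RISCV_FLOAT_ABI],
        "rve": bool(e_flags & EF_RISCV_RVE),
        "tso": bool(e_flags & EF_RISCV_TSO),
    }

def merge_e_flags(inputs):
    """Combine the e_flags of the input objects, or raise on a mismatch."""
    first = inputs[0]
    merged = first
    for flags in inputs[1:]:
        if (flags ^ first) & EF_RISCV_FLOAT_ABI:
            raise ValueError("input objects use different float ABIs")
        if (flags ^ first) & EF_RISCV_RVE:
            raise ValueError("input objects disagree on the RVE field")
        # RVC and TSO are set in the output if any input sets them.
        merged |= flags & (EF_RISCV_RVC | EF_RISCV_TSO)
    return merged

merged = merge_e_flags([0x0005, 0x0004, 0x0014])
print(hex(merged), decode_e_flags(merged))
----

In this example all three inputs use the double-precision float ABI, so the
merge succeeds and the output carries both the RVC and the TSO bit.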

=== String Tables

There are no RISC-V specific definitions relating to ELF string tables.

=== Symbol Table

st_other:: The lower 2 bits are used to specify a symbol's visibility. The
remaining 6 bits have no defined meaning in the ELF gABI. We use the highest
bit to mark functions that do not follow the standard calling convention for
the ABI in use.
+
The defined processor-specific `st_other` flags are listed in <<rv-st-other>>.
+
[[rv-st-other]]
.RISC-V-specific `st_other` flags
[cols="3,1"]
[width=60%]
|===
| Name                 | Mask

| STO_RISCV_VARIANT_CC | 0x80
|===
+
See <<Dynamic Linking>> for the meaning of `STO_RISCV_VARIANT_CC`.

`__global_pointer$` must be exported in the dynamic symbol table of dynamically-linked
executables if there are any GP-relative accesses present in the executable.
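
As a small illustration of the `st_other` layout, the following Python sketch
separates the visibility bits from the RISC-V-specific flag.  The function name
is hypothetical; the visibility names follow the gABI values 0 to 3.

[,python]
----
STV_MASK = 0x03               # lower 2 bits: symbol visibility (gABI)
STO_RISCV_VARIANT_CC = 0x80   # highest bit: non-standard calling convention

VISIBILITY = ("default", "internal", "hidden", "protected")

def describe_st_other(st_other):
    """Return (visibility name, whether STO_RISCV_VARIANT_CC is set)."""
    return VISIBILITY[st_other & STV_MASK], bool(st_other & STO_RISCV_VARIANT_CC)

print(describe_st_other(0x82))  # ('hidden', True)
----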

=== Relocations

RISC-V is a classical RISC architecture that has densely packed non-word
sized instruction immediate values. While the linker can make relocations on
arbitrary memory locations, many of the RISC-V relocations are designed for
use with specific instructions or instruction sequences. RISC-V has several
instruction specific encodings for PC-Relative address loading, jumps,
branches and the RVC compressed instruction set.

The purpose of this section is to describe the RISC-V specific instruction
sequences with their associated relocations in addition to the general purpose
machine word sized relocations that are used for symbol addresses in the
Global Offset Table or DWARF meta data.

<<reloc-table>> provides details of the RISC-V ELF relocations; the meaning of each
column is given below:

Enum:: The number of the relocation, encoded in the r_info field

ELF Reloc Type:: The name of the relocation, omitting the prefix of `R_RISCV_`.

Type:: Whether the relocation is a static or dynamic relocation:
+
- A static relocation relocates a location in a relocatable file, processed by a static linker.
- A dynamic relocation relocates a location in an executable or shared object, processed by a run-time linker.
- `Both`: Some relocation types are used by both static relocations and dynamic relocations.

Field:: Describes the set of bits affected by this relocation; see <<Field Symbols>> for the definitions of the individual types

Calculation:: Formula for how to resolve the relocation value; definitions of the
              symbols can be found in <<Calculation Symbols>>

Description:: Additional information about the relocation

[[reloc-table]]
.Relocation types
[cols=">2,6,3,6,11"]
[width=100%]
|===
| Enum          | ELF Reloc Type   | Type    | Field / Calculation  | Description

.2+| 0       .2+| NONE          .2+| None    |                   .2+|
                                            <|
.2+| 1       .2+| 32            .2+| Both    | _word32_          .2+| 32-bit relocation
                                            <| S + A
.2+| 2       .2+| 64            .2+| Both    | _word64_          .2+| 64-bit relocation
                                            <| S + A
.2+| 3       .2+| RELATIVE      .2+| Dynamic | _wordclass_       .2+| Adjust a link address (A) to its load address (B + A)
                                            <| B + A
.2+| 4       .2+| COPY          .2+| Dynamic |                   .2+| Must be in executable; not allowed in shared library
                                            <|
.2+| 5       .2+| JUMP_SLOT     .2+| Dynamic | _wordclass_       .2+| Indicates the symbol associated with a PLT entry
                                            <| S
.2+| 6       .2+| TLS_DTPMOD32  .2+| Dynamic | _word32_          .2+|
                                            <| TLSMODULE
.2+| 7       .2+| TLS_DTPMOD64  .2+| Dynamic | _word64_          .2+|
                                            <| TLSMODULE
.2+| 8       .2+| TLS_DTPREL32  .2+| Dynamic | _word32_          .2+|
                                            <| S + A - TLS_DTV_OFFSET
.2+| 9       .2+| TLS_DTPREL64  .2+| Dynamic | _word64_          .2+|
                                            <| S + A - TLS_DTV_OFFSET
.2+| 10      .2+| TLS_TPREL32   .2+| Dynamic | _word32_          .2+|
                                            <| S + A + TLSOFFSET
.2+| 11      .2+| TLS_TPREL64   .2+| Dynamic | _word64_          .2+|
                                            <| S + A + TLSOFFSET
.2+| 12      .2+| TLSDESC       .2+| Dynamic | See <<TLS Descriptors>> .2+|
                                            <| TLSDESC(S+A)
.2+| 16      .2+| BRANCH        .2+| Static  | _B-Type_          .2+| 12-bit PC-relative branch offset
                                            <| S + A - P
.2+| 17      .2+| JAL           .2+| Static  | _J-Type_          .2+| 20-bit PC-relative jump offset
                                            <| S + A - P
.2+| 18      .2+| CALL          .2+| Static  | _U+I-Type_        .2+| *Deprecated, please use CALL_PLT instead* 32-bit PC-relative function call, macros `call`, `tail`
                                            <| S + A - P
.2+| 19      .2+| CALL_PLT      .2+| Static  | _U+I-Type_        .2+| 32-bit PC-relative function call, macros `call`, `tail` (PIC)
                                            <| S + A - P
.2+| 20      .2+| GOT_HI20      .2+| Static  | _U-Type_          .2+| High 20 bits of 32-bit PC-relative GOT access, `%got_pcrel_hi(symbol)`
                                            <| G + GOT + A - P
.2+| 21      .2+| TLS_GOT_HI20  .2+| Static  | _U-Type_          .2+| High 20 bits of 32-bit PC-relative TLS IE GOT access, macro `la.tls.ie`
                                            <|
.2+| 22      .2+| TLS_GD_HI20   .2+| Static  | _U-Type_          .2+| High 20 bits of 32-bit PC-relative TLS GD GOT reference, macro `la.tls.gd`
                                            <|
.2+| 23      .2+| PCREL_HI20    .2+| Static  | _U-Type_          .2+| High 20 bits of 32-bit PC-relative reference, `%pcrel_hi(symbol)`
                                            <| S + A - P
.2+| 24      .2+| PCREL_LO12_I  .2+| Static  | _I-Type_          .2+| Low 12 bits of a 32-bit PC-relative, `%pcrel_lo(address of %pcrel_hi)`, the addend must be 0
                                            <| S - P
.2+| 25      .2+| PCREL_LO12_S  .2+| Static  | _S-Type_          .2+| Low 12 bits of a 32-bit PC-relative, `%pcrel_lo(address of %pcrel_hi)`, the addend must be 0
                                            <| S - P
.2+| 26      .2+| HI20          .2+| Static  | _U-Type_          .2+| High 20 bits of 32-bit absolute address, `%hi(symbol)`
                                            <| S + A
.2+| 27      .2+| LO12_I        .2+| Static  | _I-Type_          .2+| Low 12 bits of 32-bit absolute address, `%lo(symbol)`
                                            <| S + A
.2+| 28      .2+| LO12_S        .2+| Static  | _S-Type_          .2+| Low 12 bits of 32-bit absolute address, `%lo(symbol)`
                                            <| S + A
.2+| 29      .2+| TPREL_HI20    .2+| Static  | _U-Type_          .2+| High 20 bits of TLS LE thread pointer offset, `%tprel_hi(symbol)`
                                            <|
.2+| 30      .2+| TPREL_LO12_I  .2+| Static  | _I-Type_          .2+| Low 12 bits of TLS LE thread pointer offset, `%tprel_lo(symbol)`
                                            <|
.2+| 31      .2+| TPREL_LO12_S  .2+| Static  | _S-Type_          .2+| Low 12 bits of TLS LE thread pointer offset, `%tprel_lo(symbol)`
                                            <|
.2+| 32      .2+| TPREL_ADD     .2+| Static  |                   .2+| TLS LE thread pointer usage, `%tprel_add(symbol)`
                                            <|
.2+| 33      .2+| ADD8          .2+| Static  | _word8_           .2+| 8-bit label addition
                                            <| V + S + A
.2+| 34      .2+| ADD16         .2+| Static  | _word16_          .2+| 16-bit label addition
                                            <| V + S + A
.2+| 35      .2+| ADD32         .2+| Static  | _word32_          .2+| 32-bit label addition
                                            <| V + S + A
.2+| 36      .2+| ADD64         .2+| Static  | _word64_          .2+| 64-bit label addition
                                            <| V + S + A
.2+| 37      .2+| SUB8          .2+| Static  | _word8_           .2+| 8-bit label subtraction
                                            <| V - S - A
.2+| 38      .2+| SUB16         .2+| Static  | _word16_          .2+| 16-bit label subtraction
                                            <| V - S - A
.2+| 39      .2+| SUB32         .2+| Static  | _word32_          .2+| 32-bit label subtraction
                                            <| V - S - A
.2+| 40      .2+| SUB64         .2+| Static  | _word64_          .2+| 64-bit label subtraction
                                            <| V - S - A
.2+| 41      .2+| GOT32_PCREL   .2+| Static  | _word32_          .2+| 32-bit difference between the GOT entry for a symbol and the current location
                                            <| G + GOT + A - P
.2+| 42      .2+| *Reserved*    .2+| -       |                   .2+| Reserved for future standard use
                                            <|
.2+| 43      .2+| ALIGN         .2+| Static  |                   .2+| Alignment statement. The addend indicates the number of bytes occupied by `nop` instructions at the relocation offset. The alignment boundary is specified by the addend rounded up to the next power of two.
                                            <|
.2+| 44      .2+| RVC_BRANCH    .2+| Static  | _CB-Type_         .2+| 8-bit PC-relative branch offset
                                            <| S + A - P
.2+| 45      .2+| RVC_JUMP      .2+| Static  | _CJ-Type_         .2+| 11-bit PC-relative jump offset
                                            <| S + A - P
.2+| 46-50   .2+| *Reserved*    .2+| -       |                   .2+| Reserved for future standard use
                                            <|
.2+| 51      .2+| RELAX         .2+| Static  |                   .2+| Instruction can be relaxed, paired with a normal relocation at the same address
                                            <|
.2+| 52      .2+| SUB6          .2+| Static  | _word6_           .2+| Local label subtraction
                                            <| V - S - A
.2+| 53      .2+| SET6          .2+| Static  | _word6_           .2+| Local label assignment
                                            <| S + A
.2+| 54      .2+| SET8          .2+| Static  | _word8_           .2+| Local label assignment
                                            <| S + A
.2+| 55      .2+| SET16         .2+| Static  | _word16_          .2+| Local label assignment
                                            <| S + A
.2+| 56      .2+| SET32         .2+| Static  | _word32_          .2+| Local label assignment
                                            <| S + A
.2+| 57      .2+| 32_PCREL      .2+| Static  | _word32_          .2+| 32-bit PC relative
                                            <| S + A - P
.2+| 58      .2+| IRELATIVE     .2+| Dynamic | _wordclass_       .2+| Relocation against a non-preemptible ifunc symbol
                                            <| `ifunc_resolver(B + A)`
.2+| 59      .2+| PLT32         .2+| Static  | _word32_          .2+| 32-bit relative offset to a function or its PLT entry
                                            <| S + A - P
.2+| 60      .2+| SET_ULEB128   .2+| Static  | _ULEB128_         .2+| Must be placed immediately before a SUB_ULEB128 with the same offset. Local label assignment <<uleb128-note,*note>>
                                            <| S + A
.2+| 61      .2+| SUB_ULEB128   .2+| Static  | _ULEB128_         .2+| Must be placed immediately after a SET_ULEB128 with the same offset. Local label subtraction <<uleb128-note,*note>>
                                            <| V - S - A
.2+| 62      .2+| TLSDESC_HI20      .2+| Static  | _U-Type_          .2+| High 20 bits of a 32-bit PC-relative offset into a TLS descriptor entry, `%tlsdesc_hi(symbol)`
                                            <| S + A - P
.2+| 63      .2+| TLSDESC_LOAD_LO12 .2+| Static  | _I-Type_          .2+| Low 12 bits of a 32-bit PC-relative offset into a TLS descriptor entry, `%tlsdesc_load_lo(address of %tlsdesc_hi)`, the addend must be 0
                                            <| S - P
.2+| 64      .2+| TLSDESC_ADD_LO12  .2+| Static  | _I-Type_          .2+| Low 12 bits of a 32-bit PC-relative offset into a TLS descriptor entry, `%tlsdesc_add_lo(address of %tlsdesc_hi)`, the addend must be 0
                                            <| S - P
.2+| 65      .2+| TLSDESC_CALL      .2+| Static  |                   .2+| Annotate call to TLS descriptor resolver function, `%tlsdesc_call(address of %tlsdesc_hi)`, for relaxation purposes only
                                            <|
.2+| 66-191  .2+| *Reserved*                          .2+| -       |                   .2+| Reserved for future standard use
                                            <|
.2+| 192-255 .2+| *Reserved*                          .2+| -       |                   .2+| Reserved for nonstandard ABI extensions
                                            <|
|===

Nonstandard extensions are free to use relocation numbers 192-255 for any
purpose.  These relocations may conflict with other nonstandard extensions.

This section and later ones contain fragments written in assembler. The precise
assembler syntax, including that of the relocations, is described in the
_RISC-V Assembly Programmer's Manual_ <<rv-asm>>.

[[uleb128-note]]
NOTE: The assembler must allocate sufficient space to accommodate the final
value for the `R_RISCV_SET_ULEB128` and `R_RISCV_SUB_ULEB128` relocation pair
and fill the space with a single ULEB128-encoded value.
This is achieved by prepending the redundant `0x80` byte as necessary.
The linker must not alter the length of the ULEB128-encoded value.
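
To make the padding requirement concrete, the following Python sketch encodes a
value as ULEB128 and optionally pads it to a fixed number of bytes using
redundant `0x80` continuation bytes.  The function names and the `length`
parameter are assumptions made for this example; real assemblers and linkers
may structure this differently.

[,python]
----
def encode_uleb128(value, length=None):
    """Encode a non-negative integer, optionally padded to `length` bytes."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative values only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # continuation bit: more bytes follow
        else:
            out.append(byte)
            break
    if length is not None:
        if len(out) > length:
            raise ValueError("value does not fit in the reserved space")
        while len(out) < length:
            out[-1] |= 0x80           # mark the previous last byte as continued
            out.append(0x00)          # and terminate with an empty group
    return bytes(out)

def decode_uleb128(data):
    """Decode a single ULEB128 value from the start of `data`."""
    result = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            break
    return result

padded = encode_uleb128(1, length=3)
print(padded.hex())            # 818000
print(decode_uleb128(padded))  # 1
----

Because the padded form decodes to the same value, a linker can rewrite the
field in place without changing its length.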

==== Calculation Symbols

<<var-reloc-calc>> provides details on the variables used in relocation
calculation:

[[var-reloc-calc]]
.Variables used in relocation calculation
[%autowidth]
|===
| Variable  | Description

| A         | Addend field in the relocation entry associated with the symbol
| B         | Base address of a shared object loaded into memory
| G         | Offset of the symbol into the GOT (Global Offset Table)
| GOT       | Address of the GOT (Global Offset Table)
| P         | Position of the relocation
| S         | Value of the symbol in the symbol table
| V         | Value at the position of the relocation
| GP        | Value of `__global_pointer$` symbol
| TLSMODULE | TLS module index for the object containing the symbol
| TLSOFFSET | TLS static block offset (relative to `tp`) for the object containing the symbol
|===
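
The calculations in the relocation table can be read as small formulas over
these variables.  The Python sketch below evaluates a handful of them; the
`CALCULATIONS` table and the `calculate` helper are hypothetical names used only
for illustration, and the example assumes the common pattern in which a label
difference is emitted as an `ADD32`/`SUB32` pair over a field that initially
holds 0.

[,python]
----
MASK32 = (1 << 32) - 1

# Relocation name (without the R_RISCV_ prefix) -> calculation.
# `v` is a dict holding the variables from the table above.
CALCULATIONS = {
    "32":       lambda v: v["S"] + v["A"],
    "RELATIVE": lambda v: v["B"] + v["A"],
    "BRANCH":   lambda v: v["S"] + v["A"] - v["P"],
    "GOT_HI20": lambda v: v["G"] + v["GOT"] + v["A"] - v["P"],
    "ADD32":    lambda v: v["V"] + v["S"] + v["A"],
    "SUB32":    lambda v: v["V"] - v["S"] - v["A"],
}

def calculate(reloc, **variables):
    """Evaluate the calculation for `reloc` with the given variables."""
    return CALCULATIONS[reloc](variables)

# Label difference `end - start`, with end = 0x1040 and start = 0x1000.
field = 0
field = calculate("ADD32", V=field, S=0x1040, A=0) & MASK32
field = calculate("SUB32", V=field, S=0x1000, A=0) & MASK32
print(hex(field))  # 0x40
----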

**Global Pointer**: It is assumed that program startup code will load the value
of the `__global_pointer$` symbol into register `gp` (aka `x3`).

==== Field Symbols

<<var-reloc-field>> provides details on the variables used in relocation fields:

[[var-reloc-field]]
.Variables used in relocation fields
[%autowidth]
|===
| Variable    | Description

| _word6_     | Specifies the 6 least significant bits of a _word8_ field
| _word8_     | Specifies an 8-bit word
| _word16_    | Specifies a 16-bit word
| _word32_    | Specifies a 32-bit word
| _word64_    | Specifies a 64-bit word
| _ULEB128_   | Specifies variable-length data encoded in ULEB128 format.
| _wordclass_ | Specifies a _word32_ field for ILP32 or a _word64_ field for LP64
| _B-Type_    | Specifies a field as the immediate field in a B-type instruction
| _CB-Type_   | Specifies a field as the immediate field in a CB-type instruction
| _CI-Type_   | Specifies a field as the immediate field in a CI-type instruction
| _CJ-Type_   | Specifies a field as the immediate field in a CJ-type instruction
| _I-Type_    | Specifies a field as the immediate field in an I-type instruction
| _S-Type_    | Specifies a field as the immediate field in an S-type instruction
| _U-Type_    | Specifies a field as the immediate field in a U-type instruction
| _J-Type_    | Specifies a field as the immediate field in a J-type instruction
| _U+I-Type_  | Specifies a field as the immediate fields in a U-type and I-type instruction pair
|===
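
To show what "the immediate field" means in practice, the following Python
sketch patches a value into the immediate bits of U-type, I-type and S-type
instruction words.  The helper names are hypothetical; the bit positions follow
the base instruction formats, and the example words are the encodings of
`lui a0, 0`, `addi a0, a0, 0` and `sw a1, 0(a0)`.

[,python]
----
def patch_u_type(insn, hi20):
    """Write a 20-bit value into bits 31:12 (lui, auipc)."""
    return (insn & 0x00000FFF) | ((hi20 & 0xFFFFF) << 12)

def patch_i_type(insn, imm12):
    """Write a 12-bit value into bits 31:20 (addi, loads, jalr)."""
    return (insn & 0x000FFFFF) | ((imm12 & 0xFFF) << 20)

def patch_s_type(insn, imm12):
    """Write a 12-bit value into bits 31:25 and 11:7 (stores)."""
    imm = imm12 & 0xFFF
    return (insn & 0x01FFF07F) | ((imm >> 5) << 25) | ((imm & 0x1F) << 7)

print(hex(patch_u_type(0x00000537, 0x12345)))  # 0x12345537
print(hex(patch_i_type(0x00050513, 0x678)))    # 0x67850513
print(hex(patch_s_type(0x00B52023, 0x678)))    # 0x66b52c23
----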

==== Constants

<<const-reloc-field>> provides details on the constants used in relocation fields:

[[const-reloc-field]]
.Constants used in relocation fields
[cols="3,1"]
[width=30%]
|===
| Name           | Value

| TLS_DTV_OFFSET | 0x800
|===

==== Absolute Addresses

32-bit absolute addresses in position dependent code are loaded with a pair
of instructions which have an associated pair of relocations:
`R_RISCV_HI20` plus `R_RISCV_LO12_I` or `R_RISCV_LO12_S`.

The `R_RISCV_HI20` refers to an `LUI` instruction containing the high
20-bits to be relocated to an absolute symbol address. The `LUI` instruction
is used in conjunction with one or more I-Type instructions (add immediate or
load) with `R_RISCV_LO12_I` relocations or S-Type instructions (store) with
`R_RISCV_LO12_S` relocations.
The addresses for the pair of relocations are
calculated as follows:

[horizontal]
HI20:: `(symbol_address + 0x800) >> 12`
LO12:: `symbol_address`

The following assembly and relocations show loading an absolute address:

[,asm]
----
    lui  a0, %hi(symbol)     # R_RISCV_HI20 (symbol)
    addi a0, a0, %lo(symbol) # R_RISCV_LO12_I (symbol)
----

A symbol can be loaded in multiple fragments using different addends, where
multiple instructions associated with `R_RISCV_LO12_I`/`R_RISCV_LO12_S` share a
single `R_RISCV_HI20`. The HI20 values for the multiple fragments must be
identical, a condition met when the symbol is sufficiently aligned.

[,asm]
----
    lui a0, 0       # R_RISCV_HI20 (symbol)
    lw a1, 0(a0)    # R_RISCV_LO12_I (symbol)
    lw a2, 0(a0)    # R_RISCV_LO12_I (symbol+4)
    lw a3, 0(a0)    # R_RISCV_LO12_I (symbol+8)
    lw a0, 0(a0)    # R_RISCV_LO12_I (symbol+12)
----
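
The following Python sketch illustrates the HI20/LO12 split and the condition
under which several fragments can share one `R_RISCV_HI20`.  The function names
are hypothetical and the addresses are made up for the example.

[,python]
----
def hi20(address):
    """Upper 20 bits, rounded so that the signed low part compensates."""
    return ((address + 0x800) >> 12) & 0xFFFFF

def lo12(address):
    """Low 12 bits, interpreted as a signed value in [-2048, 2047]."""
    low = address & 0xFFF
    return low - 0x1000 if low >= 0x800 else low

def can_share_hi20(symbol, addends):
    """True if every symbol+addend yields the same HI20 value."""
    return len({hi20(symbol + a) for a in addends}) == 1

addr = 0x12345FFC
print(hex(hi20(addr)), lo12(addr))                 # 0x12346 -4
print(can_share_hi20(0x10000, [0, 4, 8, 12]))      # True
print(can_share_hi20(0x107F8, [0, 4, 8, 12]))      # False
----

In the last call the symbol is only 8-byte aligned and the 16-byte object
straddles the point where the rounded high part changes, so the fragments
cannot share a single `lui`.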

==== Global Offset Table

For position independent code in dynamically linked objects, each shared
object contains a GOT (Global Offset Table), which contains addresses of
global symbols (objects and functions) referred to by the dynamically
linked shared object. The GOT in each shared library is filled in by the
dynamic linker during program loading, or on the first call to extern functions.

To avoid dynamic relocations within the text segment of position independent
code the GOT is used for indirection. Instead of code loading virtual addresses
directly, as can be done in static code, addresses are loaded from the GOT.
This allows runtime binding to external objects and functions at the expense of
a slightly higher runtime overhead for access to extern objects and functions.

==== Procedure Linkage Table

The PLT (Procedure Linkage Table) exists to allow function calls between
dynamically linked shared objects. Each dynamic object has its own
GOT (Global Offset Table) and PLT (Procedure Linkage Table).

The first entry of a shared object PLT is a special entry that calls
`_dl_runtime_resolve` to resolve the GOT offset for the called function.
The `_dl_runtime_resolve` function in the dynamic loader resolves the
GOT offsets lazily on the first call to any function, except when
`LD_BIND_NOW` is set in which case the GOT entries are populated by the
dynamic linker before the executable is started. Lazy resolution of GOT
entries is intended to speed up program loading by deferring symbol
resolution to the first time the function is called. The first entry
in the PLT occupies two 16-byte entries:

[,asm]
----
1:  auipc  t2, %pcrel_hi(.got.plt)
    sub    t1, t1, t3               # shifted .got.plt offset + hdr size + 12
    l[w|d] t3, %pcrel_lo(1b)(t2)    # _dl_runtime_resolve
    addi   t1, t1, -(hdr size + 12) # shifted .got.plt offset
    addi   t0, t2, %pcrel_lo(1b)    # &.got.plt
    srli   t1, t1, log2(16/PTRSIZE) # .got.plt offset
    l[w|d] t0, PTRSIZE(t0)          # link map
    jr     t3
----

Subsequent function entry stubs in the PLT take up 16 bytes and load a
function pointer from the GOT. On the first call to a function, the
entry redirects to the first PLT entry which calls `_dl_runtime_resolve`
and fills in the GOT entry for subsequent calls to the function:

[,asm]
----
1:  auipc   t3, %pcrel_hi(function@.got.plt)
    l[w|d]  t3, %pcrel_lo(1b)(t3)
    jalr    t1, t3
    nop
----
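
The register arithmetic in the first PLT entry can be replayed in Python to see
how the `.got.plt` offset is recovered.  This is a sketch under the assumptions
stated in the text above: unresolved `.got.plt` slots hold the address of the
first PLT entry, the header is 32 bytes, and each later entry is 16 bytes.  The
function and parameter names are hypothetical.

[,python]
----
PLT_HEADER_SIZE = 32   # the first entry occupies two 16-byte slots
PLT_ENTRY_SIZE = 16

def got_plt_offset(plt0, index, ptr_size):
    """Replay the header's arithmetic for a first call through entry `index`."""
    entry = plt0 + PLT_HEADER_SIZE + index * PLT_ENTRY_SIZE
    t1 = entry + 12     # `jalr t1, t3` is the third instruction; t1 = its address + 4
    t3 = plt0           # unresolved slot: control arrives at the first entry
    t1 = t1 - t3                     # sub  t1, t1, t3
    t1 = t1 - (PLT_HEADER_SIZE + 12)  # addi t1, t1, -(hdr size + 12)
    shift = (PLT_ENTRY_SIZE // ptr_size).bit_length() - 1  # log2(16/PTRSIZE)
    return t1 >> shift               # srli t1, t1, log2(16/PTRSIZE)

print(got_plt_offset(0x1000, 3, ptr_size=8))  # 24
print(got_plt_offset(0x1000, 3, ptr_size=4))  # 12
----

The result is the entry index scaled by the pointer size, which is what the
resolver needs to locate the slot to fill in.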

==== Procedure Calls

`R_RISCV_CALL` and `R_RISCV_CALL_PLT` relocations are associated with
pairs of instructions (`AUIPC+JALR`) generated by the `CALL` or `TAIL`
pseudoinstructions.  Originally, these relocations had slightly different
behavior, but that has turned out to be unnecessary, and they are now
interchangeable.  `R_RISCV_CALL` is deprecated; use `R_RISCV_CALL_PLT`
instead.

With linker relaxation enabled, the `AUIPC` instruction in the `AUIPC+JALR` pair has
either an `R_RISCV_CALL` or an `R_RISCV_CALL_PLT` relocation, together with an
`R_RISCV_RELAX` relocation indicating the instruction sequence can be relaxed during linking.

Procedure call linker relaxation allows the `AUIPC+JALR` pair to be relaxed
to the `JAL` instruction when the procedure or PLT entry is within (-1MiB to
+1MiB-2) of the instruction pair.

The pseudoinstruction:

[,asm]
----
    call symbol
    call symbol@plt
----

expands to the following assembly and relocation:

[,asm]
----
    auipc ra, 0           # R_RISCV_CALL (symbol), R_RISCV_RELAX (symbol)
    jalr  ra, ra, 0
----

and when symbol has an `@plt` suffix it expands to:

[,asm]
----
    auipc ra, 0           # R_RISCV_CALL_PLT (symbol), R_RISCV_RELAX (symbol)
    jalr  ra, ra, 0
----

==== PC-Relative Jumps and Branches

Unconditional jump (J-Type) instructions have a `R_RISCV_JAL` relocation
that can represent an even signed 21-bit offset (-1MiB to +1MiB-2).

Branch (SB-Type) instructions have a `R_RISCV_BRANCH` relocation that
can represent an even signed 13-bit offset (-4096 to +4094).
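
The two ranges above follow from the immediate widths. The sketch below is only an illustration, and the helper name `fits_pcrel` is hypothetical rather than part of any toolchain. It checks whether an offset is even and representable in a signed immediate of the given width.

[,python]
----
JAL_BITS = 21     # R_RISCV_JAL: -1MiB to +1MiB-2
BRANCH_BITS = 13  # R_RISCV_BRANCH: -4096 to +4094

def fits_pcrel(offset, bits):
    """Return True if offset is even and fits a signed immediate of `bits` bits."""
    limit = 1 << (bits - 1)
    return offset % 2 == 0 and -limit <= offset < limit
----

For example, `fits_pcrel(4094, BRANCH_BITS)` holds while `fits_pcrel(4096, BRANCH_BITS)` does not.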

==== PC-Relative Symbol Addresses

32-bit PC-relative references to symbol addresses use sequences of
instructions such as the `AUIPC+ADDI` instruction pair expanded from
the `la` pseudoinstruction. In position independent code, such a sequence
typically has an associated pair of relocations: `R_RISCV_PCREL_HI20` plus
`R_RISCV_PCREL_LO12_I` or `R_RISCV_PCREL_LO12_S`.

The `R_RISCV_PCREL_HI20` relocation refers to an `AUIPC` instruction
containing the high 20 bits to be relocated to a symbol relative to the
program counter address of the `AUIPC` instruction. The `AUIPC`
instruction is used in conjunction with one or more I-Type instructions
(add immediate or load) with `R_RISCV_PCREL_LO12_I` relocations or S-Type
instructions (store) with `R_RISCV_PCREL_LO12_S` relocations.

The `R_RISCV_PCREL_LO12_I` or `R_RISCV_PCREL_LO12_S` relocations contain
a label pointing to an instruction in the same section with an
`R_RISCV_PCREL_HI20` relocation entry that points to the target symbol:

* At label: `R_RISCV_PCREL_HI20` relocation entry -> symbol
* `R_RISCV_PCREL_LO12_I` relocation entry -> label

To compute the 12-bit immediate of the add, load or store instruction, the
linker follows the label to the `R_RISCV_PCREL_HI20` relocation entry
associated with the `AUIPC` instruction and uses its target symbol. The values
for the pair of relocations are calculated like this:

[horizontal]
HI20:: `(symbol_address - hi20_reloc_offset + 0x800) >> 12`
LO12:: `symbol_address - hi20_reloc_offset`

The following instruction has a signed 12-bit immediate, so the `0x800` rounding
term may add 1 to the value of the preceding high 20-bit relocation to
compensate for a negative low part.
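
As a worked illustration of the HI20/LO12 formulas, the following sketch splits a PC-relative distance into the two immediates and then rebuilds it the way `AUIPC` plus a sign-extended 12-bit immediate would. The function names `split_pcrel`, `sign_extend` and `rebuild` are hypothetical helpers, and the distance is assumed to fit in 32 bits.

[,python]
----
def split_pcrel(symbol_address, hi20_reloc_offset):
    """Return the (HI20, LO12) immediate fields for a PC-relative reference."""
    delta = symbol_address - hi20_reloc_offset
    hi20 = ((delta + 0x800) >> 12) & 0xFFFFF  # rounding term absorbs a negative LO12
    lo12 = delta & 0xFFF                      # low 12 bits, sign-extended when used
    return hi20, lo12

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def rebuild(hi20, lo12):
    """Distance produced by AUIPC followed by an instruction with a 12-bit immediate."""
    return sign_extend(hi20 << 12, 32) + sign_extend(lo12, 12)
----

With a symbol at `0x12FFC` and the `AUIPC` at `0x10000`, the distance is `0x2FFC`: the low part is `0xFFC` (that is, -4 once sign-extended), so HI20 becomes 3 rather than 2.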

Note that the compiler-emitted instructions for PC-relative symbol addresses are
not necessarily sequential or in pairs. The only constraint is that the label
referenced by an `R_RISCV_PCREL_LO12_I` or `R_RISCV_PCREL_LO12_S` relocation
points to an instruction with a valid HI20 PC-relative relocation against
the symbol.

Here is an assembly example showing the relocation types:

[,asm]
----
label:
    auipc t0, %pcrel_hi(symbol)   # R_RISCV_PCREL_HI20 (symbol)
    lui t1, 1
    lw t2, %pcrel_lo(label)(t0)   # R_RISCV_PCREL_LO12_I (label)
    add t2, t2, t1
    sw t2, %pcrel_lo(label)(t0)   # R_RISCV_PCREL_LO12_S (label)
----

==== Relocation for Alignment

The relocation type `R_RISCV_ALIGN` marks a location that must be aligned to
`N` bytes, where `N` is the smallest power of two that is greater than the value
of the addend field. For example, `R_RISCV_ALIGN` with addend value 2 means align
to 4 bytes, and `R_RISCV_ALIGN` with addend value 4 means align to 8 bytes. This
relocation is only required if the containing section has any `R_RISCV_RELAX`
relocations. `R_RISCV_ALIGN` points to the beginning of the padding bytes,
and the instruction that actually needs to be aligned is located at the point
of `R_RISCV_ALIGN` plus its addend.

To ensure the linker can always satisfy the required alignment solely by
deleting bytes, the compiler or assembler must emit an `R_RISCV_ALIGN` relocation
and then insert `N` - <<IALIGN>> padding bytes before the location that needs to
be aligned. The location may be marked by an alignment directive such as `.align`,
`.p2align` or `.balign`, or emitted by the compiler directly. The addend value of
the relocation is the number of padding bytes.

The compiler and assembler must ensure that the padding bytes are valid instructions
without side effects, such as `nop` or `c.nop`, and that those instructions
are aligned to IALIGN if possible.

The linker may remove part of the padding bytes during linking to meet
the alignment requirement, and must make sure the remaining padding bytes are
still valid instructions, each aligned to at least IALIGN bytes.

Here is an example showing how `R_RISCV_ALIGN` is used:
[,asm]
----

0x0    c.nop           # R_RISCV_ALIGN with addend 2
0x2    add t1, t2, t3  # This instruction must be aligned to 4 bytes.

----


NOTE: The `R_RISCV_ALIGN` relocation is needed because linker relaxation can shrink
preceding code during the linking process, which may cause an aligned location
to become misaligned.

NOTE: IALIGN[[IALIGN]] means the instruction-address alignment constraint. IALIGN is 4
bytes in the base ISA, but some ISA extensions, including the compressed ISA
extension, relax IALIGN to 2 bytes. IALIGN may not take on any value other than
4 or 2. This term is also defined in `The RISC-V Instruction Set Manual` with a
similar meaning, the only difference being it is specified in terms of the number
of bits instead of the number of bytes.

NOTE: Here is pseudocode to determine the alignment of an `R_RISCV_ALIGN` relocation:
[,python]
----
# input:
#   addend: addend value of relocation with R_RISCV_ALIGN type.
# output:
#   Alignment of this relocation.

def align(addend):
  ALIGN = 1
  while addend >= ALIGN:
    ALIGN *= 2
  return ALIGN
----
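
Building on the alignment rule, the sketch below shows one way a linker could decide how many padding bytes to delete once the final address of the padding is known. It is an illustration only: `alignment_of` and `padding_to_delete` are hypothetical names, and the address passed in is assumed to be the post-relaxation address of the first padding byte.

[,python]
----
def alignment_of(addend):
    """Smallest power of two greater than the addend (same rule as above)."""
    alignment = 1
    while addend >= alignment:
        alignment *= 2
    return alignment

def padding_to_delete(padding_address, addend):
    """Number of padding bytes to remove so the following instruction is aligned."""
    alignment = alignment_of(addend)
    needed = (-padding_address) % alignment  # bytes still required to reach alignment
    if needed > addend:
        raise ValueError("not enough padding to reach the required alignment")
    return addend - needed
----

For the example above, padding placed at address `0x0` with addend 2 can be deleted entirely, while the same padding at address `0x2` must be kept.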

=== Thread Local Storage

RISC-V adopts the ELF Thread Local Storage Model in which ELF objects define
`.tbss` and `.tdata` sections and `PT_TLS` program headers that contain the
TLS "initialization images" for new threads. The `.tbss` and `.tdata` sections
are not referenced directly like regular segments; rather, they are copied or
allocated to the thread local storage space of newly created threads.
See _ELF Handling For Thread-Local Storage_ <<tls>>.

In the ELF Thread Local Storage Model, TLS offsets are used instead of pointers.
The ELF TLS sections are initialization images for the thread local variables of
each new thread. A TLS offset defines an offset into the dynamic thread vector
which is pointed to by the TCB (Thread Control Block). RISC-V uses Variant I as
described by the ELF TLS specification, with `tp` containing the address one
past the end of the TCB.

There are various thread local storage models for statically allocated or
dynamically allocated thread local storage. <<tls-model>> lists the
thread local storage models:

[[tls-model]]
.TLS models
[cols="1,2"]
[width=70%]
|===
| Mnemonic | Model

| TLS LE   | Local Exec
| TLS IE   | Initial Exec
| TLS LD   | Local Dynamic
| TLS GD   | Global Dynamic
|===

The program linker, in the case of static TLS, or the dynamic linker, in the case
of dynamic TLS, allocates TLS offsets for storage of thread local variables.

NOTE: `Global Dynamic` model is also known as `General Dynamic` model.

==== Local Exec

Local exec is a form of static thread local storage. This model is used
when static linking as the TLS offsets are resolved during program linking.

Variable attribute:: `+__thread int i __attribute__((tls_model("local-exec")));+`

Example assembler load and store of a thread local variable `i` using the
`%tprel_hi`, `%tprel_add` and `%tprel_lo` assembler functions. The emitted
relocations are in comments.

[,asm]
----
    lui  a5,%tprel_hi(i)           # R_RISCV_TPREL_HI20 (symbol)
    add  a5,a5,tp,%tprel_add(i)    # R_RISCV_TPREL_ADD (symbol)
    lw   t0,%tprel_lo(i)(a5)       # R_RISCV_TPREL_LO12_I (symbol)
    addi t0,t0,1
    sw   t0,%tprel_lo(i)(a5)       # R_RISCV_TPREL_LO12_S (symbol)
----

The `%tprel_add` assembler function does not return a value and is used purely
to associate the `R_RISCV_TPREL_ADD` relocation with the `add` instruction.

==== Initial Exec

Initial exec is a form of static thread local storage that can be used in
shared libraries that use thread local storage. TLS relocations are performed
at load time. `dlopen` calls to libraries that use thread local storage may fail
when using the initial exec thread local storage model as TLS offsets must all
be resolved at load time. This model uses the GOT to resolve TLS offsets.

Variable attribute:: `+__thread int i __attribute__((tls_model("initial-exec")));+`
ELF flags:: DF_STATIC_TLS

Example assembler load and store of a thread local variable `i` using the
`la.tls.ie` pseudoinstruction (the emitted TLS relocations are shown with the
expansion of the pseudoinstruction below):

[,asm]
----
    la.tls.ie a5,i
    add  a5,a5,tp
    lw   t0,0(a5)
    addi t0,t0,1
    sw   t0,0(a5)
----

The assembler pseudoinstruction:

[,asm]
----
    la.tls.ie a5,symbol
----

expands to the following assembly instructions and relocations:

[,asm]
----
label:
    auipc a5, 0                   # R_RISCV_TLS_GOT_HI20 (symbol)
    {ld,lw} a5, 0(a5)             # R_RISCV_PCREL_LO12_I (label)
----

==== Global Dynamic

RISC-V local dynamic and global dynamic TLS models generate equivalent object code.
The global dynamic thread local storage model is used for PIC shared libraries and
handles the case where more than one library uses thread local variables, and
additionally allows libraries to be loaded and unloaded at runtime using `dlopen`.
In the global dynamic model, application code calls the dynamic linker function
`__tls_get_addr` to locate TLS offsets into the dynamic thread vector at runtime.

Variable attribute:: `+__thread int i __attribute__((tls_model("global-dynamic")));+`

Example assembler load and store of a thread local variable `i` using the
`la.tls.gd` pseudoinstruction (the emitted TLS relocations are shown with the
expansion of the pseudoinstruction below):

[,asm]
----
    la.tls.gd a0,i
    call  __tls_get_addr@plt
    mv   a5,a0
    lw   t0,0(a5)
    addi t0,t0,1
    sw   t0,0(a5)
----

The assembler pseudoinstruction:

[,asm]
----
    la.tls.gd a0,symbol
----

expands to the following assembly instructions and relocations:

[,asm]
----
label:
    auipc a0,0                    # R_RISCV_TLS_GD_HI20 (symbol)
    addi  a0,a0,0                 # R_RISCV_PCREL_LO12_I (label)
----

In the Global Dynamic model, the runtime library provides the `__tls_get_addr` function:

[,c]
----
extern void *__tls_get_addr (tls_index *ti);
----

where the type `tls_index` is defined as:

[,c]
----
typedef struct
{
  unsigned long int ti_module;
  unsigned long int ti_offset;
} tls_index;
----

==== TLS Descriptors

TLS Descriptors (TLSDESC) are an alternative implementation of the Global Dynamic model
that allows the dynamic linker to achieve performance close to that
of Initial Exec when the library was not loaded dynamically with `dlopen`.

The linker reserves a consecutive pair of pointer-sized entries in the GOT for each `TLSDESC`
relocation. At runtime, the dynamic linker fills in the TLS descriptor entry as defined below:

[,c]
----
typedef struct tls_descriptor
{
  unsigned long (*entry)(struct tls_descriptor *);
  unsigned long arg;
} tls_descriptor;

Upon accessing the thread local variable, the `entry` function is called with the address
of `tls_descriptor` containing it, returning `<address of thread local variable> - tp`.

The TLS descriptor `entry` is called with a special calling convention, specified as follows:

- `a0` is used to pass the argument and return value.
- `t0` is used as the link register.
- Any other registers are callee-saved. This includes any vector registers when the vector extension is supported.

Example assembler load and store of a thread local variable `i` using the `%tlsdesc_hi`, `%tlsdesc_load_lo`, `%tlsdesc_add_lo` and `%tlsdesc_call`
assembler functions. The emitted relocations are in the comments.

[,asm]
----
label:
    auipc tX, %tlsdesc_hi(symbol)         # R_RISCV_TLSDESC_HI20 (symbol)
    lw    tY, %tlsdesc_load_lo(label)(tX) # R_RISCV_TLSDESC_LOAD_LO12 (label)
    addi  a0, tX, %tlsdesc_add_lo(label)  # R_RISCV_TLSDESC_ADD_LO12 (label)
    jalr  t0, tY, %tlsdesc_call(label)    # R_RISCV_TLSDESC_CALL (label)
----

`tX` and `tY` in the example may be replaced with any combination of two general purpose registers.

The `%tlsdesc_call` assembler function does not return a value and is used purely
to associate the `R_RISCV_TLSDESC_CALL` relocation with the `jalr` instruction.

The linker can use the relocations to recognize the sequence and to perform relaxations. To ensure correctness, only the following changes to the sequence are allowed:

- Instructions outside the sequence that do not clobber the registers used within the sequence may be inserted in-between the instructions of the sequence (known as instruction scheduling).
- Instructions in the sequence with no data dependency may be reordered. In the preceding example, the only instructions that can be reordered are `lw` and `addi`.

=== Sections

==== Section Types

The defined processor-specific section types are listed in <<rv-section-type>>.

[[rv-section-type]]
.RISC-V-specific section types
[cols="3,3,1"]
[width=80%]
|===
| Name                  | Value       | Attributes

| SHT_RISCV_ATTRIBUTES  | 0x70000003  | none
|===

==== Special Sections

<<rv-section>> lists the special sections defined by this ABI.

[[rv-section]]
.RISC-V-specific sections
[cols="3,3,3"]
[width=80%]
|===
| Name                       | Type                 | Attributes

| .riscv.attributes          | SHT_RISCV_ATTRIBUTES | none
| .riscv.jvt                 | SHT_PROGBITS         | SHF_ALLOC + SHF_EXECINSTR
|===

+++.riscv.attributes+++ names a section that contains RISC-V ELF attributes.

+++.riscv.jvt+++ is a linker-created section to store table jump
target addresses. The minimum alignment of this section is 64 bytes.

=== Program Header Table

The defined processor-specific segment types are listed in <<rv-seg-type>>.

[[rv-seg-type]]
.RISC-V-specific segment types
[cols="3,2,3"]
[width=80%]
|===
| Name                 | Value       | Meaning

| PT_RISCV_ATTRIBUTES  | 0x70000003  | RISC-V ELF attribute section.
|===

`PT_RISCV_ATTRIBUTES` describes the location of the RISC-V ELF attribute section.

WARNING: `PT_RISCV_ATTRIBUTES` is deprecated. The build attributes section does
not contain the `SHF_ALLOC` flag. Dynamic loaders cannot assume that the region
described by `PT_RISCV_ATTRIBUTES` is present.

=== Note Sections

There are no RISC-V specific definitions relating to ELF note sections.

=== Dynamic Section

The defined processor-specific dynamic array tags are listed in <<rv-dyn-tag>>.

[[rv-dyn-tag]]
.RISC-V-specific dynamic array tags
[cols="4,2,1,3,3"]
[width=90%]
|===
| Name                | Value      | d_un  | Executable        | Shared Object

| DT_RISCV_VARIANT_CC | 0x70000001 | d_val | Platform specific | Platform specific
|===

An object must have the dynamic tag `DT_RISCV_VARIANT_CC` if it has one or more
`R_RISCV_JUMP_SLOT` relocations against symbols with the `STO_RISCV_VARIANT_CC`
attribute.

`DT_INIT` and `DT_FINI` are not required to be supported and should be avoided
in favour of `DT_PREINIT_ARRAY`, `DT_INIT_ARRAY` and `DT_FINI_ARRAY`.

=== Hash Table

There are no RISC-V specific definitions relating to ELF hash tables.

=== Attributes

Attributes are used to record information about an object file/binary that a
linker or runtime loader needs to check compatibility.

Attributes are encoded in a vendor-specific section of type SHT_RISCV_ATTRIBUTES
and name .riscv.attributes. The value of an attribute can hold an integer
encoded in the uleb128 format or a null-terminated byte string (NTBS).
The tag number is also encoded as uleb128.

To improve compatibility between tools, attributes follow these rules:

- RISC-V attributes have a string value if the tag number is odd and an integer
  value if the tag number is even.

- A tag is mandatory if the tag number modulo 128 is less than 64
  (`(N % 128) < 64`); if the tool does not recognize such an attribute, errors
  should be reported.

- A tag is optional if the tag number modulo 128 is greater than or equal to 64
  (`(N % 128) >= 64`); if the tool does not recognize such an attribute, the tag
  can be ignored.
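
The tag-number rules above can be expressed as a small helper. This is a sketch for illustration, and the name `describe_tag` is hypothetical.

[,python]
----
def describe_tag(tag):
    """Return (value type, is_mandatory) for an attribute tag number."""
    value_type = "NTBS" if tag % 2 == 1 else "uleb128"  # odd tags carry strings
    mandatory = (tag % 128) < 64                        # unknown mandatory tags are errors
    return value_type, mandatory
----

For instance, tag 5 (`Tag_RISCV_arch`) yields `("NTBS", True)`, while an unrecognized tag 64 would be an optional integer attribute that a tool can ignore.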

==== Layout of .riscv.attributes section

The attributes section starts with a format-version (uint8 = 'A') followed by
vendor specific sub-section(s). A sub-section starts with sub-section length
(uint32), vendor name (NTBS) and one or more sub-sub-section(s).

A sub-sub-section consists of a tag (uleb128), sub-sub-section length (uint32)
followed by the actual attribute tag, value pair(s) as specified above.
The sub-sub-section tag `Tag_file` (value 1) specifies that the contained
attributes relate to the whole file.

A sub-section with name "riscv\0" is mandatory. Vendor specific sub-sections
are allowed in the future. Vendor names starting with "[Aa]non" are reserved for
non-standard ABI extensions.
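
The layout can be walked with a short parser. The following is a minimal sketch under stated assumptions that are not spelled out in this section: uint32 fields are little-endian, a sub-section length counts from its own length field, and a sub-sub-section length counts from its tag. The function names are hypothetical, and the sub-sub-section tag is read but otherwise ignored.

[,python]
----
import struct

def read_uleb128(data, pos):
    """Decode one uleb128 value; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7

def read_ntbs(data, pos):
    """Decode one null-terminated byte string; return (text, next position)."""
    end = data.index(b"\0", pos)
    return data[pos:end].decode("ascii"), end + 1

def parse_attributes(section):
    """Return {vendor: {tag: value}} for a .riscv.attributes section image."""
    if section[0:1] != b"A":
        raise ValueError("unsupported format-version")
    pos, result = 1, {}
    while pos < len(section):
        # Assumption: the length includes the 4-byte length field itself.
        sub_end = pos + struct.unpack_from("<I", section, pos)[0]
        vendor, pos = read_ntbs(section, pos + 4)
        attrs = result.setdefault(vendor, {})
        while pos < sub_end:
            start = pos
            _scope, pos = read_uleb128(section, pos)  # e.g. Tag_file = 1
            # Assumption: the length counts from the sub-sub-section tag.
            subsub_end = start + struct.unpack_from("<I", section, pos)[0]
            pos += 4
            while pos < subsub_end:
                tag, pos = read_uleb128(section, pos)
                if tag % 2 == 1:   # odd tag: string value
                    attrs[tag], pos = read_ntbs(section, pos)
                else:              # even tag: integer value
                    attrs[tag], pos = read_uleb128(section, pos)
    return result
----

Fed a section holding `Tag_RISCV_stack_align` = 16 and `Tag_RISCV_arch` = `rv32i2p1` in the "riscv" sub-section, the sketch returns `{"riscv": {4: 16, 5: "rv32i2p1"}}`.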

==== List of attributes

.RISC-V attributes
[cols="4,>2,2,5"]
[width=100%]
|===
| Tag                                 | Value    | Parameter type | Description

| Tag_RISCV_stack_align               |        4 | uleb128        | Indicates the stack alignment requirement in bytes.
| Tag_RISCV_arch                      |        5 | NTBS           | Indicates the target architecture of this object.
| Tag_RISCV_unaligned_access          |        6 | uleb128        | Indicates whether the generated code may perform unaligned memory accesses.
| Tag_RISCV_priv_spec                 |        8 | uleb128        | *Deprecated*, indicates the major version of the privileged specification.
| Tag_RISCV_priv_spec_minor           |       10 | uleb128        | *Deprecated*, indicates the minor version of the privileged specification.
| Tag_RISCV_priv_spec_revision        |       12 | uleb128        | *Deprecated*, indicates the revision version of the privileged specification.
| Tag_RISCV_atomic_abi                |       14 | uleb128        | Indicates which version of the atomics ABI is being used.
| Tag_RISCV_x3_reg_usage              |       16 | uleb128        | Indicates the usage definition of the X3 register.
| Reserved for non-standard attribute | >= 32768 | -              | -
|===

==== Detailed attribute description

===== How does this specification describe public attributes?

Each attribute is described in the following structure:
`<Tag name>, <Value>, <Parameter type 1>=<Parameter name 1>[, <Parameter type 2>=<Parameter name 2>]`

===== Tag_RISCV_stack_align, 4, uleb128=value
Tag_RISCV_stack_align records the N-byte stack alignment for this object. The
default value is 16 for RV32I or RV64I, and 4 for RV32E.

Merge Policy:::
The linker should report errors if object files with different `Tag_RISCV_stack_align` values are linked.

===== Tag_RISCV_arch, 5, NTBS=subarch
Tag_RISCV_arch contains a string for the target architecture taken from
the option `-march`. Different architectures will be integrated into a superset
when object files are merged.

Tag_RISCV_arch should be recorded in lowercase, and all extensions should be
separated by underscores (`_`).

Note that the version information for target architecture must be presented
explicitly in the attribute and abbreviations must be expanded. The version
information, if not given by `-march`, must agree with the default
specified by the tool. For example, the architecture `rv32i` has to be recorded
in the attribute as `rv32i2p1` in which `2p1` stands for the default version of
its base ISA. On the other hand, the architecture `rv32g` has to be presented
as `rv32i2p1_m2p0_a2p1_f2p2_d2p2_zicsr2p0_zifencei2p0` in which the
abbreviation `g` is expanded to the `imafd_zicsr_zifencei` combination with
default versions of the standard extensions.

The toolchain should normalize the architecture string by expanding all
required extensions and placing them in canonical order, which is defined in
_The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Document_ <<riscv-unpriv>>.
Shorthand extensions should be added to the architecture string if all of
their expanded extensions are included in the architecture string.

NOTE: A shorthand extension is an extension that does not define any actual
instructions, registers or behavior, but requires other extensions, such as the
`zks` cryptography extension.
`zks` extension is shorthand for `zbkb`, `zbkc`, `zbkx`, `zksed` and `zksh`, so
the toolchain should normalize `rv32i_zbkb_zbkc_zbkx_zksed_zksh` to
`rv32i_zbkb_zbkc_zbkx_zks_zksed_zksh`; `g` is an exception and does not follow
this rule.

Merge Policy:::
The linker should merge the different architectures into a superset when
object files are merged, and should report errors if the merge result contains
conflicting extensions.
+
This specification does not mandate rules on how to merge ISA strings that
refer to different versions of the same ISA extension.
The suggested merge rules are as follows:
+
* Merge versions into the latest version of all input versions that are
ratified without warning or error.
+
* The linker should emit a warning or error if the inputs contain different
versions of an extension and any of those versions is not ratified.
+
* The linker may report a warning or error if it detects incompatible
versions, even if they are ratified.

NOTE: Example of conflicting merge result: `RV32IF` and `RV32IZfinx` will
be merged into `RV32IFZfinx`, which is an invalid architecture since `F` and
`Zfinx` conflict.
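
The superset merge can be sketched as follows. This is only an illustration of the suggested rules, not a mandated algorithm: it assumes both inputs are already normalized (lowercase, one extension per underscore-separated token, explicit versions), it treats a differing `rv32`/`rv64` prefix as an error (an assumption of this sketch), it ignores ratification status, and its conflict list holds just the `f`/`zfinx` pair from the note above. The names `parse_arch` and `merge_arch` are hypothetical.

[,python]
----
import re

TOKEN = re.compile(r"([a-z][a-z0-9]*?)(\d+)p(\d+)")
CONFLICTS = [("f", "zfinx")]  # illustrative only, not a complete list

def parse_arch(arch):
    """Split e.g. 'rv32i2p1_m2p0' into ('rv32', {'i': (2, 1), 'm': (2, 0)})."""
    base, rest = arch[:4], arch[4:]
    if base not in ("rv32", "rv64"):
        raise ValueError("unsupported base: " + base)
    extensions = {}
    for token in rest.split("_"):
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError("extension without explicit version: " + token)
        extensions[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return base, extensions

def merge_arch(a, b):
    """Merge two arch strings into a superset, keeping the latest versions."""
    base_a, merged = parse_arch(a)
    base_b, other = parse_arch(b)
    if base_a != base_b:
        raise ValueError("base ISA mismatch")  # assumption of this sketch
    for name, version in other.items():
        merged[name] = max(version, merged.get(name, version))
    for x, y in CONFLICTS:
        if x in merged and y in merged:
            raise ValueError("conflicting extensions: %s and %s" % (x, y))
    return base_a, merged
----

The result is returned as a dictionary; emitting it back as a string in canonical order is left out of the sketch.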

===== Tag_RISCV_unaligned_access, 6, uleb128=value
Tag_RISCV_unaligned_access denotes the code generation policy for this object
file. Its values are defined as follows:

[horizontal]
0:: This object does not perform any unaligned memory accesses.
1:: This object may perform unaligned memory accesses.

--

Merge policy:::
Input files may have different values for `Tag_RISCV_unaligned_access`;
the linker should set this field to 1 if any of the input objects has it
set to 1.

--

===== Tag_RISCV_priv_spec, 8, uleb128=version
===== Tag_RISCV_priv_spec_minor, 10, uleb128=version
===== Tag_RISCV_priv_spec_revision, 12, uleb128=version

WARNING: These three attributes are deprecated because RISC-V now uses versioned
extensions rather than a single privileged specification version scheme for the
privileged ISA.

Tag_RISCV_priv_spec contains the major/minor/revision version information of
the privileged specification.

Merge policy:::
The linker should report errors if object files of different privileged
specification versions are merged.

===== Tag_RISCV_atomic_abi, 14, uleb128=version

Tag_RISCV_atomic_abi denotes the atomic ABI used within this object
file. Its values are defined as follows:

[cols="1,2,5"]
[width=100%]
|===
| Value | Symbolic Name | Description

| 0     | UNKNOWN  | This object uses an unknown atomic ABI.
| 1     | A6C      | This object uses the A6 classical atomic ABI, which is defined in table A.6 in <<riscv-unpriv-20191213>>.
| 2     | A6S      | This object uses the strengthened A6 ABI, which uses the atomic mapping defined by <<Mappings from C/C++ primitives to RISC-V primitives>> and does not rely on any note 3 annotated mappings.
| 3     | A7       | This object uses the A7 atomic ABI, which uses the atomic mapping defined by <<Mappings from C/C++ primitives to RISC-V primitives>> and may rely on note 3 annotated mappings.
|===

Merge policy:::
The linker should report errors if object files with incompatible atomics ABIs
are merged; the compatibility rules for atomic ABIs can be found in the
compatibility column in the following table.

[cols="6,2,3"]
[width=100%]
|===
| Input Values        | Compatible? | Output Value

| UNKNOWN and A6C     | Yes         | A6C
| UNKNOWN and A6S     | Yes         | A6S
| UNKNOWN and A7      | Yes         | A7
| A6C and A6S         | Yes         | A6C
| A6C and A7          | No          | -
| A6S and A7          | Yes         | A7
|===

NOTE: Merging object files with the same ABI will result in the same ABI.

NOTE: Programs that implement atomic operations without relying on the
A-extension are classified as UNKNOWN for now. A new value for those
may be defined in the future.
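
The compatibility table can be encoded directly as a lookup. The sketch below is one possible encoding; the constant and function names are hypothetical.

[,python]
----
UNKNOWN, A6C, A6S, A7 = 0, 1, 2, 3

# Unordered input pairs from the compatibility table and their output value.
MERGE_TABLE = {
    frozenset((UNKNOWN, A6C)): A6C,
    frozenset((UNKNOWN, A6S)): A6S,
    frozenset((UNKNOWN, A7)): A7,
    frozenset((A6C, A6S)): A6C,
    frozenset((A6S, A7)): A7,
}

def merge_atomic_abi(a, b):
    """Return the merged Tag_RISCV_atomic_abi value or raise on incompatibility."""
    if a == b:
        return a  # same ABI on both sides stays unchanged
    try:
        return MERGE_TABLE[frozenset((a, b))]
    except KeyError:
        raise ValueError("incompatible atomic ABIs") from None
----

Only the `A6C` and `A7` combination is rejected; every other pair resolves to a single output value.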

===== Tag_RISCV_x3_reg_usage, 16, uleb128=value

Tag_RISCV_x3_reg_usage indicates the usage of `x3`/`gp` register. `x3`/`gp` could be used for
global pointer relaxation, as a reserved platform register, or as a temporary register.

[horizontal]
0:: This object uses `x3` as a fixed register with unknown purpose.
1:: This object uses `x3` as the global pointer, for relaxation purposes.
2:: This object uses `x3` as the shadow stack pointer.
3:: This object uses `x3` as a temporary register.
4~1023:: Reserved for future standard-defined platform registers.
1024~2047:: Reserved for nonstandard platform registers.

--

Merge policy:::
The linker should issue errors when object files with differing `gp` usage are
combined. However, an exception exists: the value `0` can merge with the value
`1` or `2`. After the merge, the resulting value will be the non-zero one.

--
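
The merge policy for `Tag_RISCV_x3_reg_usage` is small enough to sketch in a few lines. The function name `merge_x3_usage` is hypothetical.

[,python]
----
def merge_x3_usage(a, b):
    """Return the merged Tag_RISCV_x3_reg_usage value or raise on a mismatch."""
    if a == b:
        return a
    if a == 0 and b in (1, 2):  # 0 may merge with 1 or 2
        return b
    if b == 0 and a in (1, 2):
        return a
    raise ValueError("object files use x3/gp differently")
----

Merging `0` with `1` gives `1`, whereas merging `1` with `2`, or `0` with `3`, is reported as an error.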

=== Mapping Symbol

A section can contain a mixture of code and data, or code for different ISAs.
A number of symbols, named mapping symbols, describe the boundaries.

[%autowidth]
|===
| Symbol Name | Meaning
| $d       .2+| Start of a sequence of data.
| $d.<any>
| $x       .2+| Start of a sequence of instructions.
| $x.<any>
| $x<ISA>  .2+| Start of a sequence of instructions with <ISA> extension.
| $x<ISA>.<any>
|===

A mapping symbol should have its type set to `STT_NOTYPE`, its binding set to
`STB_LOCAL`, and its size set to zero.

The mapping symbol for data (`$d`) indicates the start of a sequence of data bytes.

The mapping symbol for instructions (`$x`) indicates the start of a sequence of
instructions. It has an optional ISA string, which means that the following code
region uses an ISA different from the one recorded in the arch attribute.
The ISA information is used until the next instruction mapping symbol.
An instruction mapping symbol without an ISA string means the code uses the ISA
configuration from the ELF attribute.

The format and rules of the optional ISA string are the same as for
`Tag_RISCV_arch`, and it must have explicit versions; for the detailed rules,
please refer to <<Attributes>>.

The mapping symbol can be followed by an optional uniquifier, which is prefixed
with a dot (`.`).
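
A tool such as a disassembler could classify symbol names along these lines. The sketch is illustrative: `classify_mapping_symbol` is a hypothetical name, and requiring the ISA string to start with `rv` is an assumption derived from the `Tag_RISCV_arch` format rather than a rule stated here.

[,python]
----
def classify_mapping_symbol(name):
    """Return (kind, isa, uniquifier), or None if name is not a mapping symbol."""
    if name[:2] not in ("$d", "$x"):
        return None
    body, dot, uniquifier = name[2:].partition(".")
    if name[1] == "d":
        if body:  # $d takes no ISA string
            return None
        kind, isa = "data", None
    else:
        if body and not body.startswith("rv"):  # assumed ISA string prefix
            return None
        kind, isa = "instructions", body or None
    return kind, isa, (uniquifier if dot else None)
----

For example, `$xrv64i2p1_v1p0.1` is reported as an instruction mapping symbol with ISA string `rv64i2p1_v1p0` and uniquifier `1`.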

NOTE: The use case for the instruction mapping symbol (`$x`) with ISA information
is ifunc. For example, a library is built with `rv64gc`, but a few functions
like memcpy provide two versions, one built with `rv64gc` and one built with
`rv64gcv`, and one is selected by the ifunc mechanism at run-time. However, the
arch attribute records the minimal execution environment requirements, so the
ISA information from the arch attribute is not enough for the disassembler to
disassemble the `rv64gcv` version correctly.

== Linker Relaxation

At link time, when all the memory objects have been resolved, the code sequence
used to refer to them may be simplified and optimized by the linker by relaxing
some assumptions about the memory layout made at compile time.

Some relocation types, in certain situations, indicate to the linker where this
can happen.  Additionally, some relocation types indicate to the
linker the associated parts of a code sequence that can be thusly simplified,
rather than to instruct the linker how to apply a relocation.

The linker should only perform such relaxations when an `R_RISCV_RELAX` relocation
is at the same position as a candidate relocation.

As this transformation may delete bytes (and thus invalidate references that
are commonly resolved at compile-time, such as intra-function jumps), code
generators must in general ensure that relocations are always emitted when
relaxation is enabled.

Linkers should adjust relocations that refer to symbols whose addresses have
been updated.

A ULEB128 value with a relocation must be padded to the same length even if the
data can be encoded with a shorter byte sequence after linker relaxation. The
linker should report errors if the ULEB128 byte sequence needs to be longer
than the current byte sequence.
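
Padding keeps the field width fixed by setting the continuation bit on every byte except the last. The following sketch shows the idea; `encode_uleb128_padded` is a hypothetical helper, not a function of any particular linker.

[,python]
----
def encode_uleb128_padded(value, length):
    """Encode value as ULEB128 using exactly `length` bytes."""
    if value < 0:
        raise ValueError("ULEB128 encodes unsigned values only")
    out = bytearray()
    for index in range(length):
        byte = value & 0x7F
        value >>= 7
        if index != length - 1:
            byte |= 0x80  # continuation bit keeps the sequence at full length
        out.append(byte)
    if value != 0:
        raise ValueError("value does not fit in the existing byte sequence")
    return bytes(out)
----

The value 1 rewritten into a two-byte field becomes `81 00`, which still decodes to 1, while a value that no longer fits raises an error instead of growing the field.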

=== Linker Relaxation Types

The purpose of this section is to describe all types of linker relaxation.
A linker may implement only a subset of the relaxation types and can skip
any relaxation type it does not support.

Each candidate relocation might fit more than one relaxation type, but the linker
should only apply one relaxation type.

For linker relaxation, we introduce a concept called a relocation
group. A relocation group consists of 1) relocations that are associated with
the same target symbol and can be applied with the same relaxation, or 2)
relocations with a linkage relationship (e.g. `R_RISCV_PCREL_LO12_S` linked with
an `R_RISCV_PCREL_HI20`). All relocations in a single group must be present in
the same section; otherwise they are split into separate relocation groups.


Every relocation group must apply the same relaxation type, and the linker
should not apply linker relaxation to only part of the relocation group.

NOTE: Applying relaxation to only part of a relocation group might result in
wrong execution results. For example, suppose a relocation group consists of
`lui t0, 0 # R_RISCV_HI20 (foo)` and `lw t1, 0(t0) # R_RISCV_LO12_I (foo)`. If we
apply <<gp-relax,global pointer relaxation>> only to the first instruction and
remove it, without relaxing the second instruction, the load instruction ends up
referencing an unspecified address.

==== Function Call Relaxation

  Target Relocation:: R_RISCV_CALL, R_RISCV_CALL_PLT.

  Description:: This relaxation type can relax `AUIPC+JALR` into `JAL`.

  Condition:: The offset between the location of relocation and target symbol or
  the PLT stub of the target symbol is within the `JAL` range (-1MiB to +1MiB-2).

  Relaxation::
  - Instruction sequence associated with `R_RISCV_CALL` or `R_RISCV_CALL_PLT`
  can be rewritten to a single JAL instruction with the offset between the
  location of relocation and target symbol.

  Example::
+
--
Relaxation candidate:
[,asm]
----
    auipc ra, 0           # R_RISCV_CALL_PLT (symbol), R_RISCV_RELAX
    jalr  ra, ra, 0
----

Relaxation result:
[,asm]
----
    jal  ra, 0            # R_RISCV_JAL (symbol)
----
--

NOTE: Whether the address of the PLT stub of the target symbol or the address of
the target symbol itself is used is decided by the linker according to the
visibility of the target symbol.

[[compress-func-call-relax]]
==== Compressed Function Call Relaxation

  Target Relocation:: R_RISCV_CALL, R_RISCV_CALL_PLT.

  Description:: This relaxation type can relax `AUIPC+JALR` into a `C.JAL`
  instruction.

  Condition:: The offset between the location of relocation and target symbol or
  the PLT stub of the target symbol is within +-2KiB, the rd operand of the second
  instruction in the instruction sequence is `X1`/`RA`, and the target is RV32.

  Relaxation::
  - Instruction sequence associated with `R_RISCV_CALL` or `R_RISCV_CALL_PLT`
  can be rewritten to a single `C.JAL` instruction with the offset between the
  location of relocation and target symbol.

  Example::
+
--
Relaxation candidate:
[,asm]
----
    auipc ra, 0           # R_RISCV_CALL_PLT (symbol), R_RISCV_RELAX
    jalr  ra, ra, 0
----

Relaxation result:
[,asm]
----
    c.jal  <offset-between-pc-and-symbol>
----
--

[[compress-tailcall-relax]]
==== Compressed Tail Call Relaxation

  Target Relocation:: R_RISCV_CALL, R_RISCV_CALL_PLT.

  Description:: This relaxation type can relax `AUIPC+JALR` into a `C.J`
  instruction.

  Condition:: The offset between the location of relocation and target symbol or
  the PLT stub of the target symbol is within +-2KiB and the rd operand of the
  second instruction in the instruction sequence is `X0`.

  Relaxation::
  - Instruction sequence associated with `R_RISCV_CALL` or `R_RISCV_CALL_PLT`
  can be rewritten to a single `C.J` instruction with the offset between the
  location of relocation and target symbol.

  Example::
+
--
Relaxation candidate:
[,asm]
----
    auipc ra, 0           # R_RISCV_CALL_PLT (symbol), R_RISCV_RELAX
    jalr  x0, ra, 0
----

Relaxation result:
[,asm]
----
    c.j  <offset-between-pc-and-symbol>
----
--
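
The three call relaxations above differ mainly in the reach of the replacement
instruction. The following Python sketch is illustrative only and is not part
of this specification. The names `fits_signed` and `relax_call` are
hypothetical, and the exact ranges used (a 12-bit signed offset for
`C.J`/`C.JAL` and a 21-bit signed offset for `JAL`, both multiples of 2) are
taken from the instruction encodings rather than from the rounded limits
quoted in the conditions.

```python
def fits_signed(value, bits):
    """Return True if value fits in a two's-complement field of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def relax_call(offset, rd, rvc, xlen):
    """Pick the shortest replacement for an AUIPC+JALR pair.

    offset: distance from the relocation to the target (or its PLT stub)
    rd:     "ra" or "x0", the rd operand of the JALR
    rvc:    True if compressed instructions are permitted
    xlen:   32 or 64
    """
    if offset % 2 != 0:
        return "auipc+jalr"          # jump targets are 2-byte aligned
    if rvc and fits_signed(offset, 12):
        if rd == "x0":
            return "c.j"             # compressed tail call
        if rd == "ra" and xlen == 32:
            return "c.jal"           # C.JAL exists on RV32 only
    if fits_signed(offset, 21):
        return "jal"
    return "auipc+jalr"              # out of range: leave the pair alone
```

A real linker also has to account for offsets changing while other code is
being relaxed, which this sketch ignores.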

[[gp-relax]]
==== Global-pointer Relaxation

  Target Relocation:: R_RISCV_HI20, R_RISCV_LO12_I, R_RISCV_LO12_S,
  R_RISCV_PCREL_HI20, R_RISCV_PCREL_LO12_I, R_RISCV_PCREL_LO12_S

  Description:: This relaxation type can relax a sequence that loads the
  address of a symbol, or a load/store with a symbol reference, into a
  global-pointer-relative instruction.

  Condition:: Global-pointer relaxation requires that Tag_RISCV_x3_reg_usage
  is 0 or 1 and that the offset between the global pointer and the symbol is
  within +-2KiB. `R_RISCV_PCREL_LO12_I` and `R_RISCV_PCREL_LO12_S` are resolved
  as indirect relocation pointers: each always points to another
  `R_RISCV_PCREL_HI20` relocation, and the symbol pointed to by that
  `R_RISCV_PCREL_HI20` is used in the offset calculation.

  Relaxation::
  - Instruction associated with `R_RISCV_HI20` or `R_RISCV_PCREL_HI20` can
  be removed.

  - Instruction associated with `R_RISCV_LO12_I`, `R_RISCV_LO12_S`,
  `R_RISCV_PCREL_LO12_I` or `R_RISCV_PCREL_LO12_S` can be replaced with a
  global-pointer-relative access instruction.

  Example::
+
--
Relaxation candidate (`tX` and `tY` can be any combination of two general purpose registers):
[,asm]
----
    lui tX, 0       # R_RISCV_HI20 (symbol), R_RISCV_RELAX
    lw tY, 0(tX)    # R_RISCV_LO12_I (symbol), R_RISCV_RELAX
----
Relaxation result:
[,asm]
----
    lw tY, <gp-offset-for-symbol>(gp)
----

A symbol can be loaded in multiple fragments using different addends, where
multiple instructions associated with `R_RISCV_LO12_I`/`R_RISCV_LO12_S` share a
single `R_RISCV_HI20`. The HI20 values for the multiple fragments must be
identical and all the relaxed global-pointer offsets must be in range.

Relaxation candidate:
[,asm]
----
    lui tX, 0       # R_RISCV_HI20 (symbol), R_RISCV_RELAX
    lw tY, 0(tX)    # R_RISCV_LO12_I (symbol), R_RISCV_RELAX
    lw tZ, 0(tX)    # R_RISCV_LO12_I (symbol+4), R_RISCV_RELAX
    lw tW, 0(tX)    # R_RISCV_LO12_I (symbol+8), R_RISCV_RELAX
    lw tX, 0(tX)    # R_RISCV_LO12_I (symbol+12), R_RISCV_RELAX
----
Relaxation result:
[,asm]
----
    lw tY, <gp-offset-for-symbol>(gp)
    lw tZ, <gp-offset-for-symbol+4>(gp)
    lw tW, <gp-offset-for-symbol+8>(gp)
    lw tX, <gp-offset-for-symbol+12>(gp)
----
--
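
The shared-`HI20` case can be checked as sketched below. This is a minimal
illustration, not part of this specification: `hi20` and `gp_relax_offsets` are
hypothetical helper names, and the addresses passed in are assumed to be final
link-time addresses.

```python
def hi20(address):
    """Upper 20 bits as computed by %hi, rounded so that %lo is signed."""
    return ((address + 0x800) >> 12) & 0xFFFFF


def gp_relax_offsets(gp, symbol, addends, x3_reg_usage=0):
    """Return one gp-relative offset per fragment, or None to skip relaxation."""
    if x3_reg_usage not in (0, 1):
        return None                       # gp is not usable as a global pointer
    targets = [symbol + addend for addend in addends]
    if len({hi20(t) for t in targets}) != 1:
        return None                       # fragments do not share one HI20 value
    offsets = [t - gp for t in targets]
    if all(-2048 <= off <= 2047 for off in offsets):
        return offsets                    # every access fits a 12-bit immediate
    return None
```

For example, `gp_relax_offsets(0x11800, 0x11000, [0, 4, 8, 12])` yields four
offsets starting at -2048, whereas moving the symbol to `0x11FF8` pushes the
third fragment out of range and the whole group is left unrelaxed.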

NOTE: The global pointer refers to the address of the `__global_pointer$`
symbol, which is the content of the `gp` register.

NOTE: This relaxation requires the program to initialize the `gp` register with
the address of the `__global_pointer$` symbol before accessing any symbol
address. It is strongly recommended to initialize `gp` at the beginning of the
program entry function, such as `_start`. The initialization code must disable
linker relaxation to prevent the initialization instructions from being relaxed
into a NOP-like instruction (e.g. `mv gp, gp`).
[[gp-relax-asm]]
[,asm]
----
    # Recommended way to initialize the gp register.
    .option push
    .option norelax
1:  auipc gp, %pcrel_hi(__global_pointer$)
    addi  gp, gp, %pcrel_lo(1b)
    .option pop
----

NOTE: The global pointer is referred to as the global offset table pointer in
many other targets. However, RISC-V uses PC-relative addressing rather than
accessing the GOT via the global pointer register (`gp`), so the `gp` register
is used to optimize the code size and performance of symbol accesses.

NOTE: Tag_RISCV_x3_reg_usage is treated as 0 if it is not present.

==== GOT Load Relaxation

  Target Relocation:: R_RISCV_GOT_HI20, R_RISCV_PCREL_LO12_I

  Description:: This relaxation can relax a GOT indirection into load
  immediate or PC-relative addressing. This relaxation is intended to
  optimize the `lga` assembly pseudo-instruction (and thus `la` for
  PIC objects), which loads a symbol's address from a GOT entry with
  an `auipc` + `l[w|d]` instruction pair.

  Condition::
  - Both `R_RISCV_GOT_HI20` and `R_RISCV_PCREL_LO12_I` are marked with
  `R_RISCV_RELAX`.

  - The symbol pointed to by `R_RISCV_PCREL_LO12_I` is at the location to
  which `R_RISCV_GOT_HI20` refers.

  - If the symbol is absolute, its address is within `0x0` ~ `0x7ff` or
  `0xfffffffffffff800` ~ `0xffffffffffffffff` for RV64 and
  `0xfffff800` ~ `0xffffffff` for RV32.
  Note that an undefined weak symbol satisfies this condition because
  such a symbol is handled as if it were an absolute symbol at address 0.

  - If the symbol is relative, it's bound at link time to be within the
  object. It should not be of the GNU ifunc type. Additionally, the offset
  between the location to which `R_RISCV_GOT_HI20` refers and the target
  symbol should be within a range of +-2GiB.

  Relaxation::
  - The `auipc` instruction associated with `R_RISCV_GOT_HI20` can be
  removed if the symbol is absolute.

  - The instruction or instructions associated with `R_RISCV_PCREL_LO12_I`
  can be rewritten to either `c.li` or `addi` to materialize the symbol's
  address directly in a register.

  - If this relaxation eliminates all references to the symbol's GOT slot,
  the linker may opt not to create a GOT slot for that symbol.

  Example::
+
--
Relaxation candidate:
[,asm]
----
label:
    auipc   tX, 0      # R_RISCV_GOT_HI20 (symbol), R_RISCV_RELAX
    l[w|d]  tY, 0(tX)  # R_RISCV_PCREL_LO12_I (label), R_RISCV_RELAX
----

Relaxation result (absolute symbol whose address can be represented as
a 6-bit signed integer, and RVC instructions are permitted):

[,asm]
----
    c.li    tY, <symbol-value>
----

Relaxation result (absolute symbol that does not meet the above condition
for using `c.li`):

[,asm]
----
    addi    tY, zero, <symbol-value>
----

Relaxation result (relative symbol):
[,asm]
----
    auipc   tX, <hi>
    addi    tY, tX, <lo>
----
--
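
The conditions and the three possible results can be summarized by the
following sketch. It is an illustration under stated assumptions rather than a
normative algorithm: `to_signed` and `relax_got_load` are hypothetical names,
`local` stands for "bound at link time to be within the object", and the
+-2GiB condition is approximated by requiring that the `auipc` immediate fits
in 20 signed bits. Register restrictions of `c.li` are not checked.

```python
def to_signed(value, xlen):
    """Interpret value as an xlen-bit two's-complement integer."""
    value &= (1 << xlen) - 1
    return value - (1 << xlen) if value >> (xlen - 1) else value


def relax_got_load(value, absolute, pc=0, xlen=64, rvc=True,
                   local=True, ifunc=False):
    """Name the instruction(s) that could replace an auipc + l[w|d] GOT load.

    `pc` is the location to which R_RISCV_GOT_HI20 refers. Returns "keep"
    when none of the conditions is met.
    """
    if absolute:
        imm = to_signed(value, xlen)
        if not -2048 <= imm <= 2047:
            return "keep"
        # c.li carries a 6-bit signed immediate.
        return "c.li" if rvc and -32 <= imm <= 31 else "addi"
    if not local or ifunc:
        return "keep"
    offset = value - pc
    hi = (offset + 0x800) >> 12           # immediate for auipc
    return "auipc+addi" if -(1 << 19) <= hi < (1 << 19) else "keep"
```

An undefined weak symbol corresponds to `relax_got_load(0, True)`, which
selects `c.li` when RVC instructions are permitted.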

==== Zero-page Relaxation

  Target Relocation:: R_RISCV_HI20, R_RISCV_LO12_I, R_RISCV_LO12_S

  Description:: This relaxation type can relax a sequence that loads the
  address of a symbol, or a load/store with a symbol reference, into a shorter
  instruction sequence if possible.

  Condition:: The symbol address is located within `0x0` ~ `0x7ff` or
  `0xfffffffffffff800` ~ `0xffffffffffffffff` for RV64 and
  `0xfffff800` ~ `0xffffffff` for RV32.

  Relaxation::
  - Instruction associated with `R_RISCV_HI20` can be removed if the symbol
  address can be reached by an x0-relative access.

  - Instruction associated with `R_RISCV_LO12_I` or `R_RISCV_LO12_S` can be
  relaxed into x0-relative access.

  Example::
+
--
Relaxation candidate:
[,asm]
----
    lui t0, 0       # R_RISCV_HI20 (symbol), R_RISCV_RELAX
    lw t1, 0(t0)    # R_RISCV_LO12_I (symbol), R_RISCV_RELAX
----
Relaxation result:
[,asm]
----
    lw t1, <address-of-symbol>(x0)
----
--

==== Compressed LUI Relaxation

  Target Relocation:: R_RISCV_HI20, R_RISCV_LO12_I, R_RISCV_LO12_S

  Description:: This relaxation type can relax a sequence that loads the
  address of a symbol, or a load/store with a symbol reference, into a shorter
  instruction sequence if possible.

  Condition:: The symbol address can be represented by a `C.LUI` plus an `ADDI`
   or load / store instruction.

  Relaxation::
  - Instruction associated with `R_RISCV_HI20` can be replaced with `C.LUI`.

  - Instruction associated with `R_RISCV_LO12_I` or `R_RISCV_LO12_S` is kept
  unchanged.

  Example::
+
--
Relaxation candidate:
[,asm]
----
    lui t0, 0       # R_RISCV_HI20 (symbol), R_RISCV_RELAX
    lw t1, 0(t0)    # R_RISCV_LO12_I (symbol), R_RISCV_RELAX
----
Relaxation result:
[,asm]
----
    c.lui t0, <non-zero>  # RVC_LUI (symbol), R_RISCV_RELAX
    lw t1, 0(t0)          # R_RISCV_LO12_I (symbol), R_RISCV_RELAX
----
--
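
Whether an address is representable with `C.LUI` follows from the `C.LUI`
encoding: the immediate is a non-zero 6-bit signed value and the destination
register must not be `x0` or `x2`. The sketch below is illustrative only;
`c_lui_immediate` is a hypothetical name, and the address is assumed to be one
that the original `lui` could already produce, so it is treated modulo 2^32 in
the same way as `%hi`.

```python
def c_lui_immediate(address, rd):
    """Return the C.LUI immediate for `address`, or None if C.LUI cannot be used."""
    if rd in ("x0", "zero", "x2", "sp"):
        return None                              # C.LUI cannot write x0 or x2
    hi = ((address + 0x800) >> 12) & 0xFFFFF     # same rounding as %hi
    if hi >= 0x80000:
        hi -= 0x100000                           # sign-extend the 20-bit field
    if hi == 0 or not -32 <= hi <= 31:
        return None                              # must be non-zero and 6-bit signed
    return hi
```

An address whose `%hi` part is zero is rejected here because it is a candidate
for zero-page relaxation instead.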

==== Thread-pointer Relaxation

  Target Relocation:: R_RISCV_TPREL_HI20, R_RISCV_TPREL_ADD,
  R_RISCV_TPREL_LO12_I, R_RISCV_TPREL_LO12_S.

  Description:: This relaxation type can relax a sequence that loads the
  address of a thread-local symbol, or a load/store with a thread-local symbol
  reference, into a thread-pointer-relative instruction.

  Condition:: Offset between thread-pointer and thread-local symbol is within
   +-2KiB.

  Relaxation::
  - Instruction associated with `R_RISCV_TPREL_HI20` or `R_RISCV_TPREL_ADD` can
  be removed.

  - Instruction associated with `R_RISCV_TPREL_LO12_I` or `R_RISCV_TPREL_LO12_S`
  can be replaced with a thread-pointer-relative access instruction.

  Example::
+
--
Relaxation candidate:
[,asm]
----
    lui t0, 0       # R_RISCV_TPREL_HI20 (symbol), R_RISCV_RELAX
    add t0, t0, tp  # R_RISCV_TPREL_ADD (symbol), R_RISCV_RELAX
    lw t1, 0(t0)    # R_RISCV_TPREL_LO12_I (symbol), R_RISCV_RELAX
----
Relaxation result:
[,asm]
----
    lw t1, <tp-offset-for-symbol>(tp)
----
--

==== TLS Descriptors -> Initial Exec Relaxation

Target Relocation:: R_RISCV_TLSDESC_HI20, R_RISCV_TLSDESC_LOAD_LO12, R_RISCV_TLSDESC_ADD_LO12, R_RISCV_TLSDESC_CALL

Description:: This relaxation can relax a sequence loading the address of a thread-local symbol reference into a GOT load instruction.

Condition::
- Linker output is an executable.

Relaxation::

- Instruction associated with `R_RISCV_TLSDESC_HI20` or `R_RISCV_TLSDESC_LOAD_LO12` can be removed.
- Instruction associated with `R_RISCV_TLSDESC_ADD_LO12` can be replaced with load of the high half of the symbol's GOT address.
- Instruction associated with `R_RISCV_TLSDESC_CALL` can be replaced with load of the low half of the symbol's GOT address.

Example::
+
--
Relaxation candidate (`tX` and `tY` can be any combination of two general purpose registers):

[,asm]
----
label:
	auipc   tX, <hi>      # R_RISCV_TLSDESC_HI20 (symbol), R_RISCV_RELAX
	{ld,lw} tY, <lo>(tX)  # R_RISCV_TLSDESC_LOAD_LO12 (label)
	addi    a0, tX, <lo>  # R_RISCV_TLSDESC_ADD_LO12 (label)
	jalr    t0, tY        # R_RISCV_TLSDESC_CALL (label)
----

Relaxation result:

[,asm]
----
	auipc   a0, <pcrel-got-offset-for-symbol-hi>
	{ld,lw} a0, <pcrel-got-offset-for-symbol-lo>(a0)
----
--

==== TLS Descriptors -> Local Exec Relaxation

Target Relocation:: R_RISCV_TLSDESC_HI20, R_RISCV_TLSDESC_LOAD_LO12, R_RISCV_TLSDESC_ADD_LO12, R_RISCV_TLSDESC_CALL

Description:: This relaxation can relax a sequence loading the address of a thread-local symbol reference into a thread-pointer-relative instruction sequence.

Condition::

- Short form only: Offset between thread-pointer and thread-local symbol is within +-2KiB.
- Linker output is an executable.
- Target symbol is non-preemptible.

Relaxation::

- Instruction associated with `R_RISCV_TLSDESC_HI20` or `R_RISCV_TLSDESC_LOAD_LO12` can be removed.
- Instruction associated with `R_RISCV_TLSDESC_ADD_LO12` can be replaced with the high TP-relative offset of symbol (long form) or be removed (short form).
- Instruction associated with `R_RISCV_TLSDESC_CALL` can be replaced with the low TP-relative offset of symbol.

Example::
+
--
Relaxation candidate (`tX` and `tY` can be any combination of two general purpose registers):

[,asm]
----
label:
	auipc   tX, <hi>      # R_RISCV_TLSDESC_HI20 (symbol), R_RISCV_RELAX
	{ld,lw} tY, <lo>(tX)  # R_RISCV_TLSDESC_LOAD_LO12 (label)
	addi    a0, tX, <lo>  # R_RISCV_TLSDESC_ADD_LO12 (label)
	jalr    t0, tY        # R_RISCV_TLSDESC_CALL (label)
----

Relaxation result (long form):

[,asm]
----
	lui a0, <tp-offset-for-symbol-hi>
	addi a0, a0, <tp-offset-for-symbol-lo>
----

Relaxation result (short form):

[,asm]
----
	addi a0, zero, <tp-offset-for-symbol>
----
--
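
The choice between the long and short forms, and the split of the TP-relative
offset into high and low parts, can be sketched as follows. The function name
`tlsdesc_to_le` is hypothetical and the code is only an illustration; the
split uses the usual `%hi`/`%lo` rounding so that the low part is a signed
12-bit value.

```python
def tlsdesc_to_le(tp_offset):
    """Return the instructions that materialize a TP-relative offset in a0."""
    if -2048 <= tp_offset <= 2047:
        # Short form: the offset fits the ADDI immediate directly.
        return [f"addi a0, zero, {tp_offset}"]
    hi = (tp_offset + 0x800) >> 12        # rounded so that lo is signed
    lo = tp_offset - (hi << 12)
    return [f"lui a0, {hi & 0xFFFFF:#x}", f"addi a0, a0, {lo}"]
```

For an offset of `0x1800` the long form is `lui a0, 0x2` followed by
`addi a0, a0, -2048`, since `0x2000 - 2048 = 0x1800`.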

==== Table Jump Relaxation

  Target Relocation:: R_RISCV_CALL, R_RISCV_CALL_PLT, R_RISCV_JAL.

  Description:: This relaxation type can relax a function call or jump
  instruction into a single table jump instruction with the index of the target
  address in table jump section (<<rv-section>>).
  Before relaxation, the linker scans all relocations and calculates whether
  additional gains can be obtained by using table jump instructions, where
  expected size saving from function-call-related relaxations and the size of jump
  table will be taken into account. If there is no additional gain, then table
  jump relaxation is ignored. Otherwise, this relaxation is switched on.
  <<compress-tailcall-relax, Compressed Tail Call Relaxation>> and
  <<compress-func-call-relax, Compressed Function Call Relaxation>> are
  always preferred during relaxation, since table jump relaxation has no
  extra size saving over these two relaxations and might bring a performance
  overhead.

  Condition:: The `zcmt` extension is required, the linker output is not
  position-independent and the rd operand of a function call or jump instruction
  is `X0` or `RA`.

  Relaxation::
  - Instruction sequence associated with `R_RISCV_CALL` or `R_RISCV_CALL_PLT`
  can be rewritten to a table jump instruction.
  - Instruction associated with `R_RISCV_JAL` can be rewritten to a table
  jump instruction.

  Example::
+
--
Relaxation candidate:
[,asm]
----
    auipc ra, 0           # R_RISCV_CALL (symbol), R_RISCV_RELAX (symbol)
    jalr  ra, ra, 0

    auipc ra, 0           # R_RISCV_CALL_PLT (symbol), R_RISCV_RELAX (symbol)
    jalr  x0, ra, 0

    jal ra, 0             # R_RISCV_JAL (symbol), R_RISCV_RELAX (symbol)

    jal x0, 0             # R_RISCV_JAL (symbol), R_RISCV_RELAX (symbol)
----

Relaxation result:
[,asm]
----
    cm.jalt  <index-for-symbol>

    cm.jt    <index-for-symbol>

    cm.jalt  <index-for-symbol>

    cm.jt    <index-for-symbol>
----
--
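
The gain estimate mentioned in the description could look roughly like the
sketch below. This is a deliberately simplified, hypothetical cost model and
not the algorithm of any particular linker: `table_jump_net_gain` is an
invented name, each table entry is assumed to occupy XLEN/8 bytes, each table
jump instruction 2 bytes, and limits on the number of table entries as well as
any padding or alignment of the table are ignored.

```python
def table_jump_net_gain(sites_per_target, xlen=32):
    """Estimate the bytes saved by table jump relaxation.

    sites_per_target maps a target name to the byte sizes its call or jump
    sites would have after the other call relaxations (2, 4 or 8).
    """
    entry_bytes = xlen // 8               # one table slot holds one address
    net = 0
    for sizes in sites_per_target.values():
        # A 2-byte site is already as small as a table jump instruction.
        saved = sum(size - 2 for size in sizes if size > 2)
        if saved > entry_bytes:           # keep targets that pay for their slot
            net += saved - entry_bytes
    return net
```

A result of 0 corresponds to the case where table jump relaxation brings no
additional gain and is therefore not applied.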

NOTE: The `zcmt` extension cannot be used in position-independent binaries.

NOTE: Jump or call instructions with the rd operand `RA` will be relaxed into
`cm.jalt` and instructions with the rd operand `X0` will be relaxed into
`cm.jt`. The table jump section holds target addresses for these two
instructions separately. More details are available in the _ZC* extension
specification_ <<riscv-zc-extension-group>>.

NOTE: This relaxation requires programs to initialize the `jvt` CSR with the
address of the `__jvt_base$` symbol before executing table jump
instructions. It is recommended to initialize `jvt` CSR immediately after
<<gp-relax-asm,global pointer initialization>>.
[,asm]
----
    # Recommended way to initialize the jvt CSR.
1:  auipc a0, %pcrel_hi(__jvt_base$)
    addi  a0, a0, %pcrel_lo(1b)
    csrw  jvt, a0
----

[bibliography]
== References

* [[[gabi]]] "Generic System V Application Binary Interface"
http://www.sco.com/developers/gabi/latest/contents.html

* [[[itanium-cxx-abi]]] "Itanium {Cpp} ABI"
http://itanium-cxx-abi.github.io/cxx-abi/

* [[[rv-asm]]] "RISC-V Assembly Programmer's Manual"
https://github.com/riscv-non-isa/riscv-asm-manual

* [[[tls]]] "ELF Handling For Thread-Local Storage"
https://www.akkadia.org/drepper/tls.pdf, Ulrich Drepper

* [[[riscv-unpriv]]] "The RISC-V Instruction Set Manual, Volume I: User-Level
ISA, Document", Editors Andrew Waterman and Krste Asanović,
RISC-V International.

* [[[riscv-unpriv-20191213]]] "The RISC-V Instruction Set Manual, Volume I: User-Level
ISA, Document release 20191213", Editors Andrew Waterman and Krste Asanović,
RISC-V International.

* [[[riscv-zc-extension-group]]] "ZC* extension specification"
https://github.com/riscv/riscv-code-size-reduction

* [[[rvv-intrinsic-doc]]] "RISC-V Vector Extension Intrinsic Document"
https://github.com/riscv-non-isa/rvv-intrinsic-doc