Skip to content

Function multi-version proposal #48

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 11, 2024

Conversation

BeMg
Copy link
Contributor

@BeMg BeMg commented Jul 26, 2023

During the Function multi-version dispatch the function, we need a method to retrieve the RISC-V hardware environment to make sure all extension must be available.

The problem is

  • How to implement this function
  • Where to provide this function

From the compiler's view, it will generate the IFUNC resolver when there are more than one implementation with the same symbol name.

Consider following example:

__attribute__((target("default"))) int foo (int index)
{
  return index;
}

__attribute__((target("arch=rv64gc"))) int foo (int index)
{
 return index;
}

void bar() {
  foo(0);
}

The corresponding assembly will look like:

bar() {
(foo.ifunc())(0);
}

.set foo.ifunc, foo.resolver

func_ptr foo.resolver() {
  if (__riscv_ifunc_select("m_a_f_d_c"))
    return ptr foo.m_a_f_d_c;
  return ptr foo.default;
}

int foo.default(int index) {
	...
}

int foo.m_a_f_d_c(int index) {
	...
}

The resolver that the compiler generates query and selects for each candidate function. When fulfilling the requirement, then return the corresponding function ptr for further processing.

In this proposal, the major part of the resolver function is __riscv_ifunc_select. __riscv_ifunc_select must retrieve the hardware information for deciding whether to execute the specific function.

Here we propose that function as the following declaration

bool __riscv_ifunc_select(char *FeatureStr);

Where FeatureString is a string that concatenates all target features belonging to a particular function. The form can be described in the following BNF form.

When hardware fulfills the FeatureStr, then returns true. Otherwise this function returns false.


2023/09/04 Update: The following section take the linux platform as example for __riscv_ifunc_select implementation.

There are two ways to retrieve hardware information.

  • RISC-V Hardware Probing Interface [1]
    • Not fully cover all extensions, and need to sync the all defined symbol from linux kernel source code.
  • /proc/cpuinfo isa string
    • Not every system has the cpuinfo file.

Another problem is where to place the function definition.

The compiler-rt/libgcc is a good place to implement these functions, like other target(x86/aarch64) implementation.

[1] https://docs.kernel.org/riscv/hwprobe.html


2024/07/11 Update

  1. Remain the syntax part, the runtime function move to another RFC.
  2. Remove arch=<full-arch-string> from syntax

@sorear
Copy link

sorear commented Aug 2, 2023

This repository is for specifications of features that are portable between multiple RISC-V toolchains. As such it is inappropriate to specify any behavior exclusively in terms of Linux-only interfaces like hwprobe and cpuinfo.

Providing a string-to-bool or string-to-int (for things like Zicbo* cache block size) lookup interface as a portable frontend to Linux's syscalls, the HWCAP-inspired interfaces on the BSDs, and whatever NT ends up with is a good idea, although it's useful for more than just ifunc; @jrtc27 suggested __riscv_get_extension for essentially this interface.

BeMg added a commit to BeMg/llvm-project that referenced this pull request Sep 6, 2023
Here is proposal riscv-non-isa/riscv-c-api-doc#48.

During the Function multi-version dispatch the function, we need a method to retrieve the RISC-V hardware environment to make sure all extension must be available.

Differential Revision: https://reviews.llvm.org/D155938
@BeMg BeMg force-pushed the ifunc-runtime-resolver branch from 46c4bd7 to 6272992 Compare December 14, 2023 13:06
@BeMg BeMg requested a review from topperc December 26, 2023 08:57
@sorear
Copy link

sorear commented Feb 27, 2024

Can arch= be removed from target_version and target_clones, since nothing else can appear there? Then it becomes just [[gnu::target_clones("+zbb","default")]] or similar.

How are versions and clones prioritized? A big list of every possible exception isn't going to work for us, so it should be something in the source code. Not sure if declaration order would cause problems.

@BeMg
Copy link
Contributor Author

BeMg commented Mar 3, 2024

Hi @sorear, thanks for the comment.

Can arch= be removed from target_version and target_clones, since nothing else can appear there? Then it becomes just [[gnu::target_clones("+zbb","default")]] or similar.

For target_version and target_clones's ATTR-STRING, I tend to reuse the format like target attribute to avoid confusion. Removing mtune and mcpu could reduce the complexity of usage and could be treated as a subset, but using the format like [[gnu::target_clones("+zbb","default")]] is more inconsistent between target attribute and target_clones/target_version.

From the compiler's perspective, mtune, mcpu information will be highly related to the compilation result. If someday, we need to add mtune and mcpu to target_version /target_clones's ATTR-STRING, it will be easier and not break the the existing code.

@kito-cheng any other comments on this topic?

How are versions and clones prioritized? A big list of every possible exception isn't going to work for us, so it should be something in the source code. Not sure if declaration order would cause problems.

Currently, the selection order depend on IFUNC resolver's implementation. We plan to add the one another option inside ATTR-STRING that represents the user's manual priority weight. Like target_clones("default", "arch=rv64gc;prior=5", "arch=rv64g;prior=7").

@BeMg
Copy link
Contributor Author

BeMg commented Mar 21, 2024

I have created a pull request llvm/llvm-project#85786 to LLVM to implement target_clones in the current proposal.

I have also created a draft pull request llvm/llvm-project#85790 to implement __riscv_ifunc_select, which allows target_clones to run in a QEMU environment.

For example

clang -march=rv64g -rtlib=compiler-rt targetclones.c
qemu-riscv64 -cpu rv64,zbb=true,zba=true -B 0x100000 -L /path/to/sysroot a.out

@BeMg
Copy link
Contributor Author

BeMg commented Jul 11, 2024

Rebase to origin/main

@BeMg BeMg force-pushed the ifunc-runtime-resolver branch from a58b156 to 8e75c7b Compare July 11, 2024 09:27
@BeMg
Copy link
Contributor Author

BeMg commented Jul 11, 2024

Update: Remove the arch=<full-arch-string> from syntax

Define the two syntax for function multiversion on RISC-V.

1. target_clones
2. target_version

Both of them are kind of function attribute
@BeMg BeMg force-pushed the ifunc-runtime-resolver branch from 8e75c7b to fabd52a Compare July 11, 2024 09:31
Copy link
Collaborator

@kito-cheng kito-cheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, this has discussed for a while and remove arch=<full-arch-string> to prevent some implementation issue is reasonable to me, anyway we gonna merge this and implement that on GCC and LLVM :)

@kito-cheng kito-cheng merged commit 7722b44 into riscv-non-isa:main Jul 11, 2024
@preames
Copy link

preames commented Sep 12, 2024

For anyone following along, there is a follow on proposal to add syntax to assist with selection between multiple candidates, see #85

BeMg added a commit to llvm/llvm-project that referenced this pull request Sep 13, 2024
This patch enable the function multiversion(FMV) and `target_clones`
attribute for RISC-V target.

The proposal of `target_clones` syntax can be found at the
riscv-non-isa/riscv-c-api-doc#48 (which has
landed), as modified by the proposed
riscv-non-isa/riscv-c-api-doc#85 (which adds the
priority syntax).

It supports the `target_clones` function attribute and function
multiversioning feature for RISC-V target. It will generate the ifunc
resolver function for the function that declared with target_clones
attribute.

The resolver function will check the version support by runtime object
`__riscv_feature_bits`.

For example:

```
__attribute__((target_clones("default", "arch=+ver1", "arch=+ver2"))) int bar() {
    return 1;
}
```

the corresponding resolver will be like:

```
bar.resolver() {
    __init_riscv_feature_bits();
    // Check arch=+ver1
    if ((__riscv_feature_bits.features[0] & BITMASK_OF_VERSION1) == BITMASK_OF_VERSION1) {
        return bar.arch=+ver1;
    } else {
        // Check arch=+ver2
        if ((__riscv_feature_bits.features[0] & BITMASK_OF_VERSION2) == BITMASK_OF_VERSION2) {
            return bar.arch=+ver2;
        } else {
            // Default
            return bar.default;
        }
    }
}
```
BeMg added a commit to llvm/llvm-project that referenced this pull request Oct 4, 2024
This patch enable `target_version` attribute for RISC-V target.

The proposal of `target_version` syntax can be found at the
riscv-non-isa/riscv-c-api-doc#48 (which has
landed), as modified by the proposed
riscv-non-isa/riscv-c-api-doc#85 (which adds the
priority syntax).

`target_version` attribute will trigger the function multi-versioning
feature and act like `target_clones` attribute. See
#85786 for the implementation
of `target_clones`.
BeMg added a commit to llvm/llvm-project that referenced this pull request Oct 8, 2024
Fix the buildbot failure caused by heap use-after-free error.

Origin message:

    This patch enable `target_version` attribute for RISC-V target.

    The proposal of `target_version` syntax can be found at the
    riscv-non-isa/riscv-c-api-doc#48 (which has
    landed), as modified by the proposed
riscv-non-isa/riscv-c-api-doc#85 (which adds the
    priority syntax).

`target_version` attribute will trigger the function multi-versioning
    feature and act like `target_clones` attribute. See
#85786 for the implementation
    of `target_clones`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants