Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for Vector Calling Convention #389

Merged
merged 21 commits into from
Jan 8, 2024
Merged

Conversation

lhtin
Copy link
Collaborator

@lhtin lhtin commented Jun 16, 2023

Update 2023-6-29: Updated to a new proposal based on comments, please view the file changes directly.

This proposal is based on the prior work (#171) which is currently implemented in LLVM. The differences between them are:

  1. v1-v15 and v24-v31 is callee save in this proposal for vector functions, but LLVM's one is all caller-save
  2. No ABI for tuple type ABI implemented in LLVM yet
  3. For variadic functions, the current LLVM does inconsistent passing between caller and callee. (https://godbolt.org/z/9qv9n3zYP)

PoCs:


NOTE: Original post, it doesn't matter if you don't read it.

Hi,

Here's a proposal for vector calling convention. This proposal is based on the prior work (#171) which is currently implemented in LLVM. This proposal extends the range of registers that can be used for vector data types and added more descriptions of the allocation.

Update 2023-6-27: Here (riscv-non-isa/rvv-intrinsic-doc#38) is a proposal that is very similar to the current one.

Reasons for the proposal:

  • Reason 1: The vector register size tends to be very large, so all vector registers used to pass arguments can reduce memory usage.

    In particular, for arguments with LMUL equal to 8, it is possible to support passing in 3 such arguments (i.e. v8-v15, v16-v23, v24-v31).

  • Reason 2: In RVV, the mask operand of an instruction with a mask must be placed in the v0 register. Let the first vector mask type argument pass in v0 can reduce register move.

  • Reason 3: Finding the right register segment each time starting from v1 allows more arguments to be passed through the vector registers.

    For example for the function void foo(vint32m8_t a1, vint32m8_t a2, vint32m8_t a3, vint32m1_t a4 vint32m2_t a5, vint32m4_t a6);. You can pass a1 into v8-v15, a2 into v16-v23, a3 into v24-v31, a4 into v1, a5 into v2-v3, a6 into v4-v7. If not, a4, a5, and a6 need to be passed by reference.

  • Reason 4: Start from v1 to find LMUL-aligned vector registers segment for return value can reduce the overlap of different LMUL return value vector registers.

    (Update 2023-6-28: Kito pointed out that this reason is not true. Proposal for Vector Calling Convention #389 (comment))

    For example, for the following code snippet. You can return x in v1, and y in v2-v3, they do not overlap.

    vint32m1_t x = foo1 ();
    vint32m2_t y = foo2 (c, d);
    vint32m2_t z = __riscv_vwadd_wv (x, y, vl);
    

I have a few points that I am not very sure about and would like your help in looking at.

  1. Do we really need all ilp32v, ilp32fv, ilp32dv?

    Update 2023-6-22: After further understanding, there should be no need to add a new abi, but the function that uses the vector calling convention must be annotated with STO_RISCV_VARIANT_CC.

  2. Will using all registers as argument registers to cause the resolver routine in the dynamic link lazy bind cost too much?

    Update:

    Maybe solved by Calling conventions for the lazily bound functions. #190 ? (point by @sorear ). From bellow two patches, it seems that the STO_RISCV_VARIANT_CC symbol is not yet used inside glibc.

Feel free to ask any questions or suggestions.

Best,
Lehua (RiVAI)

@lhtin lhtin changed the title a vector abi proposal A Vector ABI Proposal Jun 16, 2023
@lhtin lhtin changed the title A Vector ABI Proposal A Vector Calling Convention Proposal Jun 16, 2023
@lhtin lhtin changed the title A Vector Calling Convention Proposal Proposal for Vector Calling Convention Jun 17, 2023
Copy link
Contributor

@nick-knight nick-knight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some prior work: #171

riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
@lhtin
Copy link
Collaborator Author

lhtin commented Jun 17, 2023

Some prior work: #171

Thank you for sending the previous discussion about vector calling convention.

From this issue and related issues(#66), I found that if all vector registers are used as argument registers, the resolver routine in the dynamic link lazy bind needs to save all vector registers. This work will be huge because the vector register itself can be very large. This issue needs to be further resolved.

@sorear
Copy link
Collaborator

sorear commented Jun 20, 2023

Suppose XLEN = VLEN = 64 and we are passing an argument of type struct { int a; int b; vint8m1_t c; }. Since the struct has three fields, this proposal causes the struct to be "passed according to the integer calling convention". But since the aggregate is no larger than 2*XLEN, the struct is passed in a pair of integer registers. If the same call occurs with VLEN = 128, the struct will be passed by reference in the proposal. I consider it undesirable for the registers used to pass an argument to depend on the runtime VLEN, so I propose that the integer calling convention be modified to add "aggregates containing a value of vector type" to the list of aggregates always passed by reference in the integer calling convention.

From this issue and related issues(#66), I found that if all vector registers are used as argument registers, the resolver routine in the dynamic link lazy bind needs to save all vector registers. This work will be huge because the vector register itself can be very large. This issue needs to be further resolved.

Wasn't this already resolved in #190?

Do we really need all ilp32v, ilp32fv, ilp32dv

I don't think we should be defining any new ABIs for this; the vector calling convention is exactly the same as the integer calling convention for functions that don't take or return vector values, supporting scalable vector arguments or return values without the V extension is meaningless, and any proposal which prevents mixing vector and non-vector code and requires to recompile the world at a single time is not viable for major distros.

@lhtin
Copy link
Collaborator Author

lhtin commented Jun 21, 2023

@sorear Thank you for your comments.

Suppose XLEN = VLEN = 64 and we are passing an argument of type struct { int a; int b; vint8m1_t c; }. Since the struct has three fields, this proposal causes the struct to be "passed according to the integer calling convention". But since the aggregate is no larger than 2*XLEN, the struct is passed in a pair of integer registers. If the same call occurs with VLEN = 128, the struct will be passed by reference in the proposal. I consider it undesirable for the registers used to pass an argument to depend on the runtime VLEN, so I propose that the integer calling convention be modified to add "aggregates containing a value of vector type" to the list of aggregates always passed by reference in the integer calling convention.

I think the way to pass parameters does not depend on runtime VLEN, this compiler compile is also not aware of this information. Here when VLEN=64 and 128, the inconsistency is because you specify vl64 and vl128 at compile time and use the fixed-vlmax option. If you don't specify it, the compiler can only assume that it is VLEN agnostic and doesn't know if it is less than or equal to 2*XLEN. For safety reasons, it must be passed through memory. I think this is a bit like the difference between struct { long long a; long long b; } on top of ilp32 and lp64, where on ilp32 it pass in memory but use two registers in lp64.

Wasn't this already resolved in #190?

Yes! Thank you for pointing that out.

According to my superficial understanding, this means that if you use vector registers to pass parameters, you need to set the STO_RISCV_VARIANT_CC attribute. So I understand that the resolver function does not use the vector register if it finds this attribute or saves it before using it and restores it afterward. Is it right?

Update: After looking at the relevant patches again, it should be forced to bind directly instead of lazy bind when the function comes with STO_RISCV_VARIANT_CC.

I don't think we should be defining any new ABIs for this; the vector calling convention is exactly the same as the integer calling convention for functions that don't take or return vector values, supporting scalable vector arguments or return values without the V extension is meaningless, and any proposal which prevents mixing vector and non-vector code and requires to recompile the world at a single time is not viable for major distros.

How do you know if a compiled library uses a vector register as an argument without adding a new abi? If one is used and one is not, then linking them together will cause problems, right?

@sorear
Copy link
Collaborator

sorear commented Jun 21, 2023

I think the way to pass parameters does not depend on runtime VLEN, this compiler compile is also not aware of this information. Here when VLEN=64 and 128, the inconsistency is because you specify vl64 and vl128 at compile time and use the fixed-vlmax option.

To be clear, I'm talking about the text of the proposal, not the current gcc behavior. I believe that the proposal as written could be interpreted as saying that the compiler needs to look at vlenb before deciding which registers to pass the argument in.

For safety reasons, it must be passed through memory.

This is the behavior I want - I propose we make it clear in the ABI specification.

So I understand that the resolver function does not use the vector register if it finds this attribute or saves it before using it and restores it afterward. Is it right?

Update: After looking at the relevant patches again, it should be forced to bind directly instead of lazy bind when the function comes with STO_RISCV_VARIANT_CC.

I would argue that both of those are allowed implementations.

How do you know if a compiled library uses a vector register as an argument without adding a new abi? If one is used and one is not, then linking them together will cause problems, right?

The calling convention is per-function. If the caller of a function passes a vector register and the callee expects a vector argument, it works. If the caller is using the integer convention and the callee expects only integers, it works since both sides are using the integer convention. If the caller and the callee have mismatched prototypes, we are already looking at undefined behavior.

@nick-knight
Copy link
Contributor

nick-knight commented Jun 21, 2023

Regarding the thread:

Do we really need all ilp32v, ilp32fv, ilp32dv

I don't think we should be defining any new ABIs for this;

Our psABI currently says:

Vector registers are not used for passing arguments or return values; we intend to define a new calling convention variant to allow that as a future software optimization.

My understanding is that this PR proposes such a variant. I don't really care what us humans call it. I just need to know what to tell my compiler (e.g., via mabi) to ensure my C program leverages the new calling convention variant. And I (and my linker) need to be able to inspect an ELF to deduce which calling convention variant was used.

@lhtin
Copy link
Collaborator Author

lhtin commented Jun 22, 2023

@sorear @nick-knight For the question of whether new ABIs need to be added. After re-understood the existing ABI specification, functions need to be marked as STO_RISCV_VARIANT_CC for cases that are not compatible with the current ABI specification, so that the compiler, inker, and loader can distinguish them and prevent caller and callee from using different specifications.

Therefore, it should not be necessary to add a new abi. However, the compiler needs to add an option to enable the vector calling convention and annotate the function that uses vector registers to pass arguments or return value with STO_RISCV_VARIANT_CC.

@lhtin
Copy link
Collaborator Author

lhtin commented Jun 22, 2023

This is the behavior I want - I propose we make it clear in the ABI specification.

I think this proposal is reasonable. Currently, neither GCC nor LLVM supports putting vector type fields in struct structures. Adding this rule should not introduce a breaking change. I descript it as below:

The size of the vector type is considered to be unknown. So if structs contain a field of
vector type, they always are passed by reference.

NOTE: The vector type mentioned here refers to the type defined https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#type-system[here]

riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
@kito-cheng
Copy link
Collaborator

Reason 4: Start from v1 to find LMUL-aligned vector registers segment for return value can reduce the overlap of different LMUL return value vector registers.

For example, for the following code snippet. You can return x in v1, and y in v2-v3, they do not overlap.

vint32m1_t x = foo1 ();
vint32m2_t y = foo2 (c, d);
vint32m2_t z = __riscv_vwadd_wv (x, y, vl);

This is not true, because all return value register are all call-clobber, so it not really get benefit on this case.

@lhtin
Copy link
Collaborator Author

lhtin commented Jun 28, 2023

Reason 4: Start from v1 to find LMUL-aligned vector registers segment for return value can reduce the overlap of different LMUL return value vector registers.
For example, for the following code snippet. You can return x in v1, and y in v2-v3, they do not overlap.
vint32m1_t x = foo1 ();
vint32m2_t y = foo2 (c, d);
vint32m2_t z = __riscv_vwadd_wv (x, y, vl);

This is not true, because all return value register are all call-clobber, so it not really get benefit on this case.

Indeed, thank you very much for pointing that out. I've added a delete line to Reason 4.

@kito-cheng
Copy link
Collaborator

kito-cheng commented Jun 28, 2023

|===
| Name    | ABI Mnemonic | Meaning                      | Preserved across calls?

|v0       |              | Argument register for mask*  | No
|v1-v7    |              | Callee-saved registers        | Yes
|v8-v23   |              | Argument registers           | No
|v24-v31  |              | Callee-saved registers       | Yes
|===

I would like add callee-save registers rather than all caller-save, this could benefit those kernel loop contain with vector function, 16+1 argument register can hold 2 LMUL=8 value and one vector mask, which is satisfy almost all math function, the only function take 3 argument in std c math is fma, which can be expanded to instruction.

e.g.

void func (){
  vector_type a, b, x, y, z;
  for (i = 0 ~ n)
  {
     a = foo (b); // vectorized foo
     // `a` can be moved into a callee-save register instead of spill to stack.
     x = bar (y); // vectorized bar
     z = a + x;
  }
}

Just emphasize one point is the outer function func still define all vector register as caller-save register, so no callee-save overhead for func, the rule only applied on foo and bar, SVE also define similar rule for their vector calling convention.

@sorear
Copy link
Collaborator

sorear commented Jun 28, 2023

What's the use case for temporary registers? We have integer temporary registers because they're needed for some kinds of stub code, and I think float temporaries exist only to align the float ABI register names with the integer ABI register names, but with LMUL=8 types it's much easier to write a function that needs 24 vector registers than it is for integer registers. Making v1 - v7 non-arguments wouldn't affect most functions but I'm also not sure what this is accomplishing.

Far more hesitant about a block of call-preserved registers. I remember, years and years ago before there even was a vector spec, a decision being made to never add new call-preserved U-mode state, does anyone remember the complete list of reasons why that was? setjmp shouldn't be an issue because setjmp doesn't take any vector arguments, so the compiler won't try to have vectors live across it. .eh_frame could be a problem will also need careful handing; my first impression is that if vectors are not allowed to be live at landing pad entry, then vector-related entries don't need to appear in the unwind info, but this needs a closer analysis.

@kito-cheng
Copy link
Collaborator

@sorear updated the table, I just changed the column of Preserved across calls? but no update with Meaning column.

What's the use case for temporary registers? We have integer temporary registers because they're needed for some kinds of stub code, and I think float temporaries exist only to align the float ABI register names with the integer ABI register names, but with LMUL=8 types it's much easier to write a function that needs 24 vector registers than it is for integer registers. Making v1 - v7 non-arguments wouldn't affect most functions but I'm also not sure what this is accomplishing.

caller can hold value on v1-v7 and v24-v31, which is one LMUL=8, one LMUL=4, one LMUL=2 and one LMUL=1 values, this might not easy to use when LMUL=8, but would be useful to hold multiple value across the function call with LMUL=4~1.

Making v1-v7 and v24-v31 could reduce the potential spill/reload around the vectorized function call, use same example with more comment to demonstrate the idea.

All callee-save:

void func (){
  vector_type a, b, x, y, z;
  for (i = 0 ~ n)
  {
     b = load
     a = foo (b); // vectorized foo
     // spill a, b
     y = load
     x = bar (y); // vectorized bar
     // reload a, b
     z = a + x + b;
  }
}

So a and b will always spill/reload N times where N is the number of iteration.

All callee-save:

void func (){
  // NO callee-save reg store/restore since func is NOT using vector cc
  vector_type a, b, x, y, z;
  for (i = 0 ~ n)
  {
     b = load
     // Put b into vector arg register
     // Copy b to callee-save register
     a = foo (b); // vectorized foo
     // Move a to callee-save register
     y = load
     x = bar (y); // vectorized bar
     z = a + x + b;
  }
}

So there is no spill/reload around within the loop.

You might ask what about foo and bar? they won't have callee-save reg overhead IF the vector reg pressure is less than 17, but what if the reg pressure larger than 17? save/restore callee-save reg at prologue and epilogue, that would be same situation as all callee-save.

So in this design we can gain some performance IF those vectorized function has lower reg pressure, the worst case is same as the baseline design, overall is net gain.

Far more hesitant about a block of call-preserved registers. I remember, years and years ago before there even was a vector spec, a decision being made to never add new call-preserved U-mode state, does anyone remember the complete list of reasons why that was? setjmp shouldn't be an issue because setjmp doesn't take any vector arguments, so the compiler won't try to have vectors live across it.

Adding call-preserved vector registers into integer or HW floating point cc will cause ABI breakage, it will cause same problem as passing argument in vector reg, and this issue has addressed by STO_RISCV_VARIANT_CC :)

cc @aswaterman might have memory on those early things.

.eh_frame could be a problem will also need careful handing; my first impression is that if vectors are not allowed to be live at landing pad entry, then vector-related entries don't need to appear in the unwind info, but this needs a closer analysis.

Fortunately RVV is not forerunner on scalable vector, unwinding stuffs already resolved by ARM folks and we just follow them (patch for unwinding scalable vector gcc-mirror/gcc@89367e7) :P

@sorear
Copy link
Collaborator

sorear commented Jun 28, 2023

So there is no spill/reload around within the loop.

With the changed proposal to have 17 call-clobbered argument registers, 15 call-saved, and 0 temporary (call-clobbered non-argument) registers I no longer have a question about the purpose of the last category :)

I also see that saving 16+12 registers in any vectorized function that calls non-vectorized functions was implemented and found adequate for SVE, so it should be good for us.

(Feedback from people with real-world experience with SVE calling conventions would be useful - I'm going off first principles here. Did 16 saved vector registers turn out to be a useful number or should a different one be picked?)

Fortunately RVV is not forerunner on scalable vector, unwinding stuffs already resolved by ARM folks and we just follow them (patch for unwinding scalable vector gcc-mirror/gcc@89367e7) :P

I was asking more about the case where a vectorized function contains a try/catch block. Since the exception unwinding could pass through any number of frames with saved vector registers, either the unwinder needs to be able to restore the saved vector registers, or the compiler needs to be aware that saved vector registers are lost on the edge between a throwing call and the landing pad.

Incidentally this does NOT work on aarch64, with the following test program, qemu-aarch64 8.0.0, and gcc 12.2.0 or 13.1.0:

#include <arm_sve.h>
volatile svint8_t *space;
typedef svint8_t (*Ft)(svint8_t);
volatile Ft g;
inline Ft hide(Ft f) { asm("" : "=r"(f) : "0"(f)); return f; }
svint8_t F3(svint8_t arg) {
    throw 5;
}
svint8_t F2(svint8_t arg) {
    svint8_t temp = *space;
    svint8_t ret = hide(F3)(arg);
    *space = temp;
    return ret;
}
svint8_t F1(svint8_t arg) {
    try {
        return hide(F2)(arg);
    } catch (int x) {
        return arg;
    }   
}
svint8_t F0(svint8_t arg) {
    hide(F1)(arg);
    return arg;
}
int main() {
    volatile svint8_t sp = svdup_s8(3);
    space = &sp;
    svint8_t rv = hide(F0)(svdup_s8(2));
    return svclastb_n_s8(svdup_b8(1),6,rv);
}

riscv-cc.adoc Outdated Show resolved Hide resolved
@lhtin
Copy link
Collaborator Author

lhtin commented Jun 29, 2023

|===
| Name    | ABI Mnemonic | Meaning                      | Preserved across calls?

|v0       |              | Argument register for mask*  | No
|v1-v7    |              | Callee-saved registers          | Yes
|v8-v23   |              | Argument registers           | No
|v24-v31  |              | Callee-saved registers         | Yes
|===

I would like add callee-save registers rather than all caller-save, this could benefit those kernel loop contain with vector function, 16+1 argument register can hold 2 LMUL=8 value and one vector mask, which is satisfy almost all math function, the only function take 3 argument in std c math is fma, which can be expanded to instruction.

e.g.

void func (){
  vector_type a, b, x, y, z;
  for (i = 0 ~ n)
  {
     a = foo (b); // vectorized foo
     // `a` can be moved into a callee-save register instead of spill to stack.
     x = bar (y); // vectorized bar
     z = a + x;
  }
}

Just emphasize one point is the outer function func still define all vector register as caller-save register, so no callee-save overhead for func, the rule only applied on foo and bar, SVE also define similar rule for their vector calling convention.

This proposal is more sophisticated and reasonable. I think I should update based on this and wait for more people to review it.

Also passing arguments of vector type to the unprototype function should report an error, similar to ARM SVE's ABI. Is this rule to be stated in this document as well?

@kito-cheng
Copy link
Collaborator

@sorear

With the changed proposal to have 17 call-clobbered argument registers, 15 call-saved, and 0 temporary (call-clobbered non-argument) registers I no longer have a question about the purpose of the last category :)

Cool, I'm glad we have a consensus here!

(Feedback from people with real-world experience with SVE calling conventions would be useful - I'm going off first principles here. Did 16 saved vector registers turn out to be a useful number or should a different one be picked?)

Yeah, let me tag few more people once @lhtin updated :)

I was asking more about the case where a vectorized function contains a try/catch block. Since the exception unwinding could pass through any number of frames with saved vector registers, either the unwinder needs to be able to restore the saved vector registers, or the compiler needs to be aware that saved vector registers are lost on the edge between a throwing call and the landing pad.

Incidentally this does NOT work on aarch64, with the following test program, qemu-aarch64 8.0.0, and gcc 12.2.0 or 13.1.0:

Thanks for providing the case, I can reproduce with linaro aarch64 toolchain, we (SiFive folks) resolved several unwinding issue, so I thought should be not issues around their, but seems like not :(

@kito-cheng
Copy link
Collaborator

@lhtin

This proposal is more sophisticated and reasonable. I think I should update based on this and wait for more people to review it.

Thanks!

Also passing arguments of vector type to the unprototype function should report an error, similar to ARM SVE's ABI. Is this rule to be stated in this document as well?

Yeah, forbid that would be easier, so let us follow SVE here :)

riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
riscv-cc.adoc Outdated Show resolved Hide resolved
@kito-cheng
Copy link
Collaborator

@lhtin Could you add few more info on the post?

  • Link of PoC on clang/LLVM (https://reviews.llvm.org/D154576)
  • Link of PoC on GCC (Add once you have updated one)
  • Comparison with existing experimental vector calling convention on LLVM:
    • v1-v15 and v24-v31 is callee save in this proposal, but LLVM's one is all caller-save
    • No ABI for tuple type ABI implemented in LLVM yet.

Copy link
Collaborator

@kito-cheng kito-cheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @lhtin and everyone!

I believe we gather enough feedback and address all major comments, so I think we are ready to land, however it's vacations season for most American and European, therefore I would like to wait few more time, my plan is merge this and #380 at 2024/1/5 IF no further strong comment/objection.

@kito-cheng
Copy link
Collaborator

Thanks everyone, finally we have an official vector calling convention after few years :)

@kito-cheng kito-cheng merged commit d4c38ee into riscv-non-isa:master Jan 8, 2024
4 checks passed
XYenChi pushed a commit to XYenChi/gcc that referenced this pull request Feb 28, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
XYenChi pushed a commit to XYenChi/gcc that referenced this pull request Mar 8, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
Liaoshihua pushed a commit to Liaoshihua/ruyi-gcc that referenced this pull request Mar 12, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
Liaoshihua pushed a commit to Liaoshihua/ruyi-gcc that referenced this pull request Mar 13, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
Liaoshihua pushed a commit to Liaoshihua/ruyi-gcc that referenced this pull request Mar 13, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
Liaoshihua pushed a commit to Liaoshihua/ruyi-gcc that referenced this pull request Mar 13, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
Liaoshihua pushed a commit to Liaoshihua/ruyi-gcc that referenced this pull request Mar 15, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
yulong18 pushed a commit to yulong18/ruyisdk-gcc that referenced this pull request Mar 17, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 19, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 21, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 25, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
XYenChi pushed a commit to XYenChi/gcc that referenced this pull request Mar 25, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 27, 2024
… and returns

I post the vector register calling convention rules from in the proposal[1]
directly here:

v0 is used to pass the first vector mask argument to a function, and to return
vector mask result from a function. v8-v23 are used to pass vector data
arguments, vector tuple arguments and the rest vector mask arguments to a
function, and to return vector data and vector tuple results from a function.

Each vector data type and vector tuple type has an LMUL attribute that
indicates a vector register group. The value of LMUL indicates the number of
vector registers in the vector register group and requires the first vector
register number in the vector register group must be a multiple of it. For
example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be
allocated to this type, but v9-v16 can not because the v9 register number is
not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a
vector mask type, its LMUL is 1.

Each vector tuple type also has an NFIELDS attribute that indicates how many
vector register groups the type contains. Thus a vector tuple type needs to
take up LMUL×NFIELDS registers.

The rules for passing vector arguments are as follows:

1. For the first vector mask argument, use v0 to pass it. The argument has now
been allocated.

2. For vector data arguments or rest vector mask arguments, starting from the
v8 register, if a vector register group between v8-v23 that has not been
allocated can be found and the first register number is a multiple of LMUL,
then allocate this vector register group to the argument and mark these
registers as allocated. Otherwise, pass it by reference. The argument has now
been allocated.

3. For vector tuple arguments, starting from the v8 register, if NFIELDS
consecutive vector register groups between v8-v23 that have not been allocated
can be found and the first register number is a multiple of LMUL, then allocate
these vector register groups to the argument and mark these registers as
allocated. Otherwise, pass it by reference. The argument has now been allocated.

NOTE: It should be stressed that the search for the appropriate vector register
groups starts at v8 each time and does not start at the next register after the
registers are allocated for the previous vector argument. Therefore, it is
possible that the vector register number allocated to a vector argument can be
less than the vector register number allocated to previous vector arguments.
For example, for the function
`void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules
of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b`
and v9 will be allocated to `c`. This approach allows more vector registers to
be allocated to arguments in some cases.

Vector values are returned in the same manner as the first named argument of
the same type would be passed.

[1] riscv-non-isa/riscv-elf-psabi-doc#389

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type.
	* config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto.
	* config/riscv/riscv.cc (struct riscv_arg_info): New fields.
	(riscv_init_cumulative_args): Setup variant_cc field.
	(riscv_vector_type_p): New function for checking vector type.
	(riscv_hard_regno_nregs): Hoist declare.
	(riscv_get_vector_arg): Subroutine of riscv_get_arg_info.
	(riscv_get_arg_info): Support vector cc.
	(riscv_function_arg_advance): Update cum.
	(riscv_pass_by_reference): Handle vector args.
	(riscv_v_abi): New function return vector abi.
	(riscv_return_value_is_vector_type_p): New function for check vector arguments.
	(riscv_arguments_is_vector_type_p): New function for check vector returns.
	(riscv_fntype_abi): Implement TARGET_FNTYPE_ABI.
	(TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI.
	* config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi.
	(MAX_ARGS_IN_VECTOR_REGISTERS): Ditto.
	(MAX_ARGS_IN_MASK_REGISTERS): Ditto.
	(V_ARG_FIRST): Ditto.
	(V_ARG_LAST): Ditto.
	(enum riscv_cc): Define all RISCV_CC variants.
	* config/riscv/riscv.opt: Add --param=riscv-vector-abi.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-2.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-3.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-args-4.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-error-1.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return-run.c: New test.
	* gcc.target/riscv/rvv/base/abi-call-return.c: New test.
Copy link
Collaborator

@jrtc27 jrtc27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I note that this was merged despite being 21(!!) individual commits with all manner of fixups, making the history of the repository a mess. Can we please be more diligent in future about ensuring we merge a sensible series of commits?


Vector arguments and return values are disallowed to pass to an unprototyped
function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not have been done like this. STO_RISCV_VARIANT_CC is an ELF-specific thing (a flag in ElfXX_Sym's st_other) detailed in riscv-elf.adoc, but riscv-cc.adoc is a separate document that describes the calling convention independently from the underlying file format (and could be used by Windows or macOS if they choose to adopt RISC-V, which use PE/COFF and Mach-O respectively).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.