-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for Vector Calling Convention #389
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some prior work: #171
Thank you for sending the previous discussion about vector calling convention. From this issue and related issues(#66), I found that if all vector registers are used as argument registers, the resolver routine in the dynamic link lazy bind needs to save all vector registers. This work will be huge because the vector register itself can be very large. This issue needs to be further resolved. |
Suppose XLEN = VLEN = 64 and we are passing an argument of type
Wasn't this already resolved in #190?
I don't think we should be defining any new ABIs for this; the vector calling convention is exactly the same as the integer calling convention for functions that don't take or return vector values, supporting scalable vector arguments or return values without the V extension is meaningless, and any proposal which prevents mixing vector and non-vector code and requires to recompile the world at a single time is not viable for major distros. |
@sorear Thank you for your comments.
I think the way to pass parameters does not depend on runtime VLEN, this compiler compile is also not aware of this information. Here when VLEN=64 and 128, the inconsistency is because you specify vl64 and vl128 at compile time and use the fixed-vlmax option. If you don't specify it, the compiler can only assume that it is VLEN agnostic and doesn't know if it is less than or equal to 2*XLEN. For safety reasons, it must be passed through memory. I think this is a bit like the difference between
Yes! Thank you for pointing that out. According to my superficial understanding, this means that if you use vector registers to pass parameters, you need to set the STO_RISCV_VARIANT_CC attribute. So I understand that the resolver function does not use the vector register if it finds this attribute or saves it before using it and restores it afterward. Is it right? Update: After looking at the relevant patches again, it should be forced to bind directly instead of lazy bind when the function comes with STO_RISCV_VARIANT_CC.
How do you know if a compiled library uses a vector register as an argument without adding a new abi? If one is used and one is not, then linking them together will cause problems, right? |
To be clear, I'm talking about the text of the proposal, not the current gcc behavior. I believe that the proposal as written could be interpreted as saying that the compiler needs to look at
This is the behavior I want - I propose we make it clear in the ABI specification.
I would argue that both of those are allowed implementations.
The calling convention is per-function. If the caller of a function passes a vector register and the callee expects a vector argument, it works. If the caller is using the integer convention and the callee expects only integers, it works since both sides are using the integer convention. If the caller and the callee have mismatched prototypes, we are already looking at undefined behavior. |
Regarding the thread:
Our psABI currently says:
My understanding is that this PR proposes such a variant. I don't really care what us humans call it. I just need to know what to tell my compiler (e.g., via |
@sorear @nick-knight For the question of whether new ABIs need to be added. After re-understood the existing ABI specification, functions need to be marked as STO_RISCV_VARIANT_CC for cases that are not compatible with the current ABI specification, so that the compiler, inker, and loader can distinguish them and prevent caller and callee from using different specifications. Therefore, it should not be necessary to add a new abi. However, the compiler needs to add an option to enable the vector calling convention and annotate the function that uses vector registers to pass arguments or return value with STO_RISCV_VARIANT_CC. |
I think this proposal is reasonable. Currently, neither GCC nor LLVM supports putting vector type fields in struct structures. Adding this rule should not introduce a breaking change. I descript it as below: The size of the vector type is considered to be unknown. So if structs contain a field of
vector type, they always are passed by reference.
NOTE: The vector type mentioned here refers to the type defined https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#type-system[here] |
This is not true, because all return value register are all call-clobber, so it not really get benefit on this case. |
Indeed, thank you very much for pointing that out. I've added a delete line to Reason 4. |
I would like add callee-save registers rather than all caller-save, this could benefit those kernel loop contain with vector function, 16+1 argument register can hold 2 LMUL=8 value and one vector mask, which is satisfy almost all math function, the only function take 3 argument in std c math is e.g.
Just emphasize one point is the outer function |
What's the use case for temporary registers? We have integer temporary registers because they're needed for some kinds of stub code, and I think float temporaries exist only to align the float ABI register names with the integer ABI register names, but with LMUL=8 types it's much easier to write a function that needs 24 vector registers than it is for integer registers. Making v1 - v7 non-arguments wouldn't affect most functions but I'm also not sure what this is accomplishing. Far more hesitant about a block of call-preserved registers. I remember, years and years ago before there even was a vector spec, a decision being made to never add new call-preserved U-mode state, does anyone remember the complete list of reasons why that was? |
@sorear updated the table, I just changed the column of
caller can hold value on v1-v7 and v24-v31, which is one LMUL=8, one LMUL=4, one LMUL=2 and one LMUL=1 values, this might not easy to use when LMUL=8, but would be useful to hold multiple value across the function call with LMUL=4~1. Making v1-v7 and v24-v31 could reduce the potential spill/reload around the vectorized function call, use same example with more comment to demonstrate the idea. All callee-save:
So All callee-save:
So there is no spill/reload around within the loop. You might ask what about So in this design we can gain some performance IF those vectorized function has lower reg pressure, the worst case is same as the baseline design, overall is net gain.
Adding call-preserved vector registers into integer or HW floating point cc will cause ABI breakage, it will cause same problem as passing argument in vector reg, and this issue has addressed by cc @aswaterman might have memory on those early things.
Fortunately RVV is not forerunner on scalable vector, unwinding stuffs already resolved by ARM folks and we just follow them (patch for unwinding scalable vector gcc-mirror/gcc@89367e7) :P |
With the changed proposal to have 17 call-clobbered argument registers, 15 call-saved, and 0 temporary (call-clobbered non-argument) registers I no longer have a question about the purpose of the last category :) I also see that saving 16+12 registers in any vectorized function that calls non-vectorized functions was implemented and found adequate for SVE, so it should be good for us. (Feedback from people with real-world experience with SVE calling conventions would be useful - I'm going off first principles here. Did 16 saved vector registers turn out to be a useful number or should a different one be picked?)
I was asking more about the case where a vectorized function contains a try/catch block. Since the exception unwinding could pass through any number of frames with saved vector registers, either the unwinder needs to be able to restore the saved vector registers, or the compiler needs to be aware that saved vector registers are lost on the edge between a throwing call and the landing pad. Incidentally this does NOT work on aarch64, with the following test program, qemu-aarch64 8.0.0, and gcc 12.2.0 or 13.1.0: #include <arm_sve.h>
volatile svint8_t *space;
typedef svint8_t (*Ft)(svint8_t);
volatile Ft g;
inline Ft hide(Ft f) { asm("" : "=r"(f) : "0"(f)); return f; }
svint8_t F3(svint8_t arg) {
throw 5;
}
svint8_t F2(svint8_t arg) {
svint8_t temp = *space;
svint8_t ret = hide(F3)(arg);
*space = temp;
return ret;
}
svint8_t F1(svint8_t arg) {
try {
return hide(F2)(arg);
} catch (int x) {
return arg;
}
}
svint8_t F0(svint8_t arg) {
hide(F1)(arg);
return arg;
}
int main() {
volatile svint8_t sp = svdup_s8(3);
space = &sp;
svint8_t rv = hide(F0)(svdup_s8(2));
return svclastb_n_s8(svdup_b8(1),6,rv);
} |
This proposal is more sophisticated and reasonable. I think I should update based on this and wait for more people to review it. Also passing arguments of vector type to the unprototype function should report an error, similar to ARM SVE's ABI. Is this rule to be stated in this document as well? |
Cool, I'm glad we have a consensus here!
Yeah, let me tag few more people once @lhtin updated :)
Thanks for providing the case, I can reproduce with linaro aarch64 toolchain, we (SiFive folks) resolved several unwinding issue, so I thought should be not issues around their, but seems like not :( |
Thanks!
Yeah, forbid that would be easier, so let us follow SVE here :) |
@lhtin Could you add few more info on the post?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @lhtin and everyone!
I believe we gather enough feedback and address all major comments, so I think we are ready to land, however it's vacations season for most American and European, therefore I would like to wait few more time, my plan is merge this and #380 at 2024/1/5 IF no further strong comment/objection.
Thanks everyone, finally we have an official vector calling convention after few years :) |
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
… and returns I post the vector register calling convention rules from in the proposal[1] directly here: v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, vector tuple arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. The argument has now been allocated. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. 3. For vector tuple arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference. The argument has now been allocated. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. [1] riscv-non-isa/riscv-elf-psabi-doc#389 gcc/ChangeLog: * config/riscv/riscv-protos.h (builtin_type_p): New function for checking vector type. * config/riscv/riscv-vector-builtins.cc (builtin_type_p): Ditto. * config/riscv/riscv.cc (struct riscv_arg_info): New fields. (riscv_init_cumulative_args): Setup variant_cc field. (riscv_vector_type_p): New function for checking vector type. (riscv_hard_regno_nregs): Hoist declare. (riscv_get_vector_arg): Subroutine of riscv_get_arg_info. (riscv_get_arg_info): Support vector cc. (riscv_function_arg_advance): Update cum. (riscv_pass_by_reference): Handle vector args. (riscv_v_abi): New function return vector abi. (riscv_return_value_is_vector_type_p): New function for check vector arguments. (riscv_arguments_is_vector_type_p): New function for check vector returns. (riscv_fntype_abi): Implement TARGET_FNTYPE_ABI. (TARGET_FNTYPE_ABI): Implement TARGET_FNTYPE_ABI. * config/riscv/riscv.h (GCC_RISCV_H): Define macros for vector abi. (MAX_ARGS_IN_VECTOR_REGISTERS): Ditto. (MAX_ARGS_IN_MASK_REGISTERS): Ditto. (V_ARG_FIRST): Ditto. (V_ARG_LAST): Ditto. (enum riscv_cc): Define all RISCV_CC variants. * config/riscv/riscv.opt: Add --param=riscv-vector-abi. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-call-args-1-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-2.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-3.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-args-4.c: New test. * gcc.target/riscv/rvv/base/abi-call-error-1.c: New test. * gcc.target/riscv/rvv/base/abi-call-return-run.c: New test. * gcc.target/riscv/rvv/base/abi-call-return.c: New test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I note that this was merged despite being 21(!!) individual commits with all manner of fixups, making the history of the repository a mess. Can we please be more diligent in future about ensuring we merge a sensible series of commits?
|
||
Vector arguments and return values are disallowed to pass to an unprototyped | ||
function. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not have been done like this. STO_RISCV_VARIANT_CC is an ELF-specific thing (a flag in ElfXX_Sym's st_other) detailed in riscv-elf.adoc, but riscv-cc.adoc is a separate document that describes the calling convention independently from the underlying file format (and could be used by Windows or macOS if they choose to adopt RISC-V, which use PE/COFF and Mach-O respectively).
Update 2023-6-29: Updated to a new proposal based on comments, please view the file changes directly.
This proposal is based on the prior work (#171) which is currently implemented in LLVM. The differences between them are:
PoCs:
NOTE: Original post, it doesn't matter if you don't read it.
Hi,
Here's a proposal for vector calling convention. This proposal is based on the prior work (#171) which is currently implemented in LLVM. This proposal extends the range of registers that can be used for vector data types and added more descriptions of the allocation.
Update 2023-6-27: Here (riscv-non-isa/rvv-intrinsic-doc#38) is a proposal that is very similar to the current one.
Reasons for the proposal:
Reason 1: The vector register size tends to be very large, so all vector registers used to pass arguments can reduce memory usage.
In particular, for arguments with LMUL equal to 8, it is possible to support passing in 3 such arguments (i.e. v8-v15, v16-v23, v24-v31).
Reason 2: In RVV, the mask operand of an instruction with a mask must be placed in the v0 register. Let the first vector mask type argument pass in v0 can reduce register move.
Reason 3: Finding the right register segment each time starting from v1 allows more arguments to be passed through the vector registers.
For example for the function
void foo(vint32m8_t a1, vint32m8_t a2, vint32m8_t a3, vint32m1_t a4 vint32m2_t a5, vint32m4_t a6);
. You can pass a1 into v8-v15, a2 into v16-v23, a3 into v24-v31, a4 into v1, a5 into v2-v3, a6 into v4-v7. If not, a4, a5, and a6 need to be passed by reference.Reason 4: Start from v1 to find LMUL-aligned vector registers segment for return value can reduce the overlap of different LMUL return value vector registers.(Update 2023-6-28: Kito pointed out that this reason is not true. Proposal for Vector Calling Convention #389 (comment))
For example, for the following code snippet. You can return x in v1, and y in v2-v3, they do not overlap.
I have a few points that I am not very sure about and would like your help in looking at.
Do we really need all ilp32v, ilp32fv, ilp32dv?
Update 2023-6-22: After further understanding, there should be no need to add a new abi, but the function that uses the vector calling convention must be annotated with
STO_RISCV_VARIANT_CC
.Will using all registers as argument registers to cause the resolver routine in the dynamic link lazy bind cost too much?
Update:
Maybe solved by Calling conventions for the lazily bound functions. #190 ? (point by @sorear ). From bellow two patches, it seems that the STO_RISCV_VARIANT_CC symbol is not yet used inside glibc.
Feel free to ask any questions or suggestions.
Best,
Lehua (RiVAI)