RISC-V: Add RV32E multilibs #2

stephanosio · 2021-10-01T16:01:37Z

This commit adds a minimum set of multilibs required to build the RV32E
(embedded) targets:

rv32em, ilp32e
rv32emc, ilp32e

Note that we build a separate multilib variant for the C (compressed)
extension instead of mapping to the uncompressed variant because the
RV32E instruction set is intended to be used in highly resource-
constrained devices such as microcontrollers.

Signed-off-by: Stephanos Ioannidis root@stephanos.io

This commit adds a minimum set of multilibs required to build the RV32E (embedded) targets: * rv32em, ilp32e * rv32emc, ilp32e Note that we build a separate multilib variant for the C (compressed) extension instead of mapping to the uncompressed variant because the RV32E instruction set is intended to be used in highly resource- constrained devices such as microcontrollers. Signed-off-by: Stephanos Ioannidis <root@stephanos.io>

stephanosio · 2021-10-28T08:18:05Z

Verified working

Here during cp_parser_single_declaration for #2, we were calling associate_classtype_constraints for TPL<T> (the primary template type) before maybe_process_partial_specialization could get a chance to notice that we're in fact declaring a distinct constrained partial spec and not redeclaring the primary template. This caused us to emit a bogus error about differing constraints b/t the primary template and #2's constraints. This patch fixes this by moving the call to associate_classtype_constraints after the call to shadow_tag (which calls maybe_process_partial_specialization) and adjusting shadow_tag to use the return value of m_p_p_s. Moreover, if we later try to define a constrained partial specialization that's been declared earlier (as in the third testcase), then maybe_new_partial_specialization correctly notices it's a redeclaration and returns NULL_TREE. But in this case we also need to update TYPE to point to the redeclared partial spec (it'll otherwise continue pointing to the primary template type, eventually leading to a bogus error). PR c++/96363 gcc/cp/ChangeLog: * decl.cc (shadow_tag): Use the return value of maybe_process_partial_specialization. * parser.cc (cp_parser_single_declaration): Call shadow_tag before associate_classtype_constraints. * pt.cc (maybe_new_partial_specialization): Change return type to bool. Take 'type' argument by mutable reference. Set 'type' to point to the correct constrained specialization when appropriate. (maybe_process_partial_specialization): Adjust accordingly. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-partial-spec12.C: New test. * g++.dg/cpp2a/concepts-partial-spec12a.C: New test. * g++.dg/cpp2a/concepts-partial-spec13.C: New test. (cherry picked from commit 97dc78d)

…o_debug_section [PR116614] cat abc.C #define A(n) struct T##n {} t##n; #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9) #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9) #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9) E(1) E(2) E(3) int main () { return 0; } ./xg++ -B ./ -o abc{.o,.C} -flto -flto-partition=1to1 -O2 -g -fdebug-types-section -c ./xgcc -B ./ -o abc{,.o} -flto -flto-partition=1to1 -O2 (not included in testsuite as it takes a while to compile) FAILs with lto-wrapper: fatal error: Too many copied sections: Operation not supported compilation terminated. /usr/bin/ld: error: lto-wrapper failed collect2: error: ld returned 1 exit status The following patch fixes that. Most of the 64K+ section support for reading and writing was already there years ago (and especially reading used quite often already) and a further bug fixed in it in the PR104617 fix. Yet, the fix isn't solely about removing the if (new_i - 1 >= SHN_LORESERVE) { *err = ENOTSUP; return "Too many copied sections"; } 5 lines, the missing part was that the function only handled reading of the .symtab_shndx section but not copying/updating of it. If the result has less than 64K-epsilon sections, that actually wasn't needed, but e.g. with -fdebug-types-section one can exceed that pretty easily (reported to us on WebKitGtk build on ppc64le). Updating the section is slightly more complicated, because it basically needs to be done in lock step with updating the .symtab section, if one doesn't need to use SHN_XINDEX in there, the section should (or should be updated to) contain SHN_UNDEF entry, otherwise needs to have whatever would be overwise stored but couldn't fit. But repeating due to that all the symtab decisions what to discard and how to rewrite it would be ugly. So, the patch instead emits the .symtab_shndx section (or sections) last and prepares the content during the .symtab processing and in a second pass when going just through .symtab_shndx sections just uses the saved content. 2024-09-07 Jakub Jelinek <jakub@redhat.com> PR lto/116614 * simple-object-elf.c (SHN_COMMON): Align comment with neighbouring comments. (SHN_HIRESERVE): Use uppercase hex digits instead of lowercase for consistency. (simple_object_elf_find_sections): Formatting fixes. (simple_object_elf_fetch_attributes): Likewise. (simple_object_elf_attributes_merge): Likewise. (simple_object_elf_start_write): Likewise. (simple_object_elf_write_ehdr): Likewise. (simple_object_elf_write_shdr): Likewise. (simple_object_elf_write_to_file): Likewise. (simple_object_elf_copy_lto_debug_section): Likewise. Don't fail for new_i - 1 >= SHN_LORESERVE, instead arrange in that case to copy over .symtab_shndx sections, though emit those last and compute their section content when processing associated .symtab sections. Handle simple_object_internal_read failure even in the .symtab_shndx reading case. (cherry picked from commit bb8dd09)

PR jit/117275 reports various jit test failures seen on powerpc64le-unknown-linux-gnu due to hitting this assertion in varasm.cc on the 2nd compilation in a process: #2 0x00007ffff63e67d0 in assemble_external_libcall (fun=0x7ffff2a4b1d8) at ../../src/gcc/varasm.cc:2650 2650 gcc_assert (!pending_assemble_externals_processed); (gdb) p pending_assemble_externals_processed $1 = true We're not properly resetting state in varasm.cc after a compile for libgccjit. Fixed thusly. gcc/ChangeLog: PR jit/117275 * toplev.cc (toplev::finalize): Call varasm_cc_finalize. * varasm.cc (varasm_cc_finalize): New. * varasm.h (varasm_cc_finalize): New decl. Signed-off-by: David Malcolm <dmalcolm@redhat.com> (cherry picked from commit 779c039) Signed-off-by: David Malcolm <dmalcolm@redhat.com>

This crash started with my r12-7803 but I believe the problem lies elsewhere. build_vec_init has cleanup_flags whose purpose is -- if I grok this correctly -- to avoid destructing an object multiple times. Let's say we are initializing an array of A. Then we might end up in a scenario similar to initlist-eh1.C: try { call A::A in a loop // #0 try { call a fn using the array } finally { // #1 call A::~A in a loop } } catch { // #2 call A::~A in a loop } cleanup_flags makes us emit a statement like D.3048 = 2; at #0 to disable performing the cleanup at #2, since #1 will take care of the destruction of the array. But if we are not emitting the loop because we can use a constant initializer (and use a single { a, b, ...}), we shouldn't generate the statement resetting the iterator to its initial value. Otherwise we crash in gimplify_var_or_parm_decl because it gets the stray decl D.3048. PR c++/117985 gcc/cp/ChangeLog: * init.cc (build_vec_init): Pop CLEANUP_FLAGS if we're not generating the loop. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/initlist-array23.C: New test. * g++.dg/cpp0x/initlist-array24.C: New test. (cherry picked from commit 40e5636)

We've been miscompiling the following since r0-51314-gd6b4ea8592e338 (I did not go compile something that old, and identified this change via git blame, so might be wrong) === cut here === struct Foo { int x; }; Foo& get (Foo &v) { return v; } void bar () { Foo v; v.x = 1; (true ? get (v) : get (v)).*(&Foo::x) = 2; // v.x still equals 1 here... } === cut here === The problem lies in build_m_component_ref, that computes the address of the COND_EXPR using build_address to build the representation of (true ? get (v) : get (v)).*(&Foo::x); and gets something like &(true ? get (v) : get (v)) // #1 instead of (true ? &get (v) : &get (v)) // #2 and the write does not go where want it to, hence the miscompile. This patch replaces the call to build_address by a call to cp_build_addr_expr, which gives #2, that is properly handled. PR c++/114525 gcc/cp/ChangeLog: * typeck2.cc (build_m_component_ref): Call cp_build_addr_expr instead of build_address. gcc/testsuite/ChangeLog: * g++.dg/expr/cond18.C: New test. (cherry picked from commit 35ce9af)

Whenever C1 and C2 are integer constants, X is of a wrapping type, and cmp is a relational operator, the expression X +- C1 cmp C2 can be simplified in the following cases: (a) If cmp is <= and C2 -+ C1 == +INF(1), we can transform the initial comparison in the following way: X +- C1 <= C2 -INF <= X +- C1 <= C2 (add left hand side which holds for any X, C1) -INF -+ C1 <= X <= C2 -+ C1 (add -+C1 to all 3 expressions) -INF -+ C1 <= X <= +INF (due to (1)) -INF -+ C1 <= X (eliminate the right hand side since it holds for any X) (b) By analogy, if cmp if >= and C2 -+ C1 == -INF(1), use the following sequence of transformations: X +- C1 >= C2 +INF >= X +- C1 >= C2 (add left hand side which holds for any X, C1) +INF -+ C1 >= X >= C2 -+ C1 (add -+C1 to all 3 expressions) +INF -+ C1 >= X >= -INF (due to (1)) +INF -+ C1 >= X (eliminate the right hand side since it holds for any X) (c) The > and < cases are negations of (a) and (b), respectively. This transformation allows to occasionally save add / sub instructions, for instance the expression 3 + (uint32_t)f() < 2 compiles to cmn w0, #4 cset w0, ls instead of add w0, w0, 3 cmp w0, 2 cset w0, ls on aarch64. Testcases that go together with this patch have been split into two separate files, one containing testcases for unsigned variables and the other for wrapping signed ones (and thus compiled with -fwrapv). Additionally, one aarch64 test has been adjusted since the patch has caused the generated code to change from cmn w0, #2 csinc w0, w1, wzr, cc (x < -2) to cmn w0, #3 csinc w0, w1, wzr, cs (x <= -3) This patch has been bootstrapped and regtested on aarch64, x86_64, and i386, and additionally regtested on riscv32. gcc/ChangeLog: PR tree-optimization/116024 * match.pd: New transformation around integer comparison. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr116024-2.c: New test. * gcc.dg/tree-ssa/pr116024-2-fwrapv.c: Ditto. * gcc.target/aarch64/gtu_to_ltu_cmp_1.c: Adjust.

stephanosio added the DNM label Oct 1, 2021

stephanosio requested review from carlescufi and galak October 1, 2021 16:02

stephanosio mentioned this pull request Oct 1, 2021

gcc: Update to add RV32E multilibs zephyrproject-rtos/sdk-ng#398

Merged

galak approved these changes Oct 4, 2021

View reviewed changes

stephanosio force-pushed the rv32e-multilib branch 2 times, most recently from 2f11e2c to aed4ea4 Compare October 28, 2021 08:12

stephanosio removed the DNM label Oct 28, 2021

stephanosio merged commit 15e25dd into zephyr-gcc-10.3.0 Oct 28, 2021

stephanosio deleted the rv32e-multilib branch October 28, 2021 08:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

RISC-V: Add RV32E multilibs #2

RISC-V: Add RV32E multilibs #2

Uh oh!

stephanosio commented Oct 1, 2021 •

edited

Loading

Uh oh!

stephanosio commented Oct 28, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

RISC-V: Add RV32E multilibs #2

RISC-V: Add RV32E multilibs #2

Uh oh!

Conversation

stephanosio commented Oct 1, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

stephanosio commented Oct 28, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

stephanosio commented Oct 1, 2021 •

edited

Loading