With stacktrace attribute #6

ecatmur · 2022-07-17T21:17:41Z

Rather than allocate heap space (next to the exception object?), we use a thread_local with_stacktrace object and move-construct the catch object immediately on entry into the catch block.

Builds on recent std::stacktrace support, added in https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588617.html

Pass --enable-libstdcxx-backtrace=yes to configure and link libstdc++_libbacktrace.a

We pass the caught type through the exception machinery in a special WITH_STACKTRACE_TYPE type wrapper. This is mangled as if it were an attribute (we can't use an actual attribute-modified type, because (a) attributes can't variably modify tagged types, and (b) only platform attributes are allowed to affect type identity).

The function that collects the stacktrace at the throw point is not compiled into libsupc++; instead it is grabbed out of namespace std when doing RTTI for the catch block. This is more flexible and in keeping with the zero-overhead principle.
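The storage scheme described above (a per-thread slot filled at the throw point, then moved into the catch object on handler entry) can be imitated at library level. The following is only a rough sketch of that idea, not the compiler patch itself: `fake_trace`, `pending_trace`, `with_trace`, `throw_with_trace`, `adopt` and `demo_frame_count` are all hypothetical names, and `fake_trace` stands in for std::stacktrace so the sketch builds without libstdc++_libbacktrace.a.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for std::stacktrace (assumption: a list of frame names).
struct fake_trace {
    std::vector<std::string> frames;
};

// One slot per thread, filled at the throw point. Nothing extra is
// allocated next to the exception object.
thread_local fake_trace pending_trace;

// Hypothetical wrapper playing the role of the catch object.
template <class E>
struct with_trace {
    E exception;
    fake_trace trace;
};

// Throw helper: record the trace in the thread-local slot, then throw.
template <class E>
[[noreturn]] void throw_with_trace(E e, fake_trace t) {
    pending_trace = std::move(t);
    throw e;
}

// Meant to run first thing in a handler: move the trace out of the slot so
// that a later throw on the same thread cannot overwrite it.
template <class E>
with_trace<E> adopt(const E& e) {
    return with_trace<E>{e, std::move(pending_trace)};
}

// Small driver: throws with a three-frame trace and reports how many
// frames reached the handler.
std::size_t demo_frame_count() {
    std::size_t count = 0;
    fake_trace t;
    t.frames.push_back("f");
    t.frames.push_back("g");
    t.frames.push_back("main");
    try {
        throw_with_trace(std::runtime_error("boom"), std::move(t));
    } catch (const std::runtime_error& e) {
        with_trace<std::runtime_error> w = adopt(e);
        count = w.trace.frames.size();
    }
    return count;
}
```

In the patch this hand-off is done by the compiler and runtime rather than by explicit helper calls; the sketch only shows why moving out of the slot immediately on handler entry keeps the trace safe from subsequent throws.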

per P2370

…tribute

This simple patch implements Richard Biener's suggestion in comment #6 of PR tree-optimization/52171 (from February 2013) that the insn-preds code generated by genpreds can avoid using strncmp when matching constant strings of length one.

The effect of this patch is best explained by the diff of insn-preds.cc:

    <       if (!strncmp (str + 1, "g", 1))
    ---
    >       if (str[1] == 'g')
    3104c3104
    <       if (!strncmp (str + 1, "m", 1))
    ---
    >       if (str[1] == 'm')
    3106c3106
    <       if (!strncmp (str + 1, "c", 1))
    ---
    >       if (str[1] == 'c')
    ...

The equivalent optimization is performed by GCC (but perhaps not by the host compiler), but generating simpler/smaller code may encourage further optimizations (such as use of a switch statement).

2022-05-24  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
    * genpreds.cc (write_lookup_constraint_1): Avoid generating a call to strncmp for strings of length one.
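To make the equivalence in the diff concrete, here is a small sketch of the two forms side by side. The function names `matches_old` and `matches_new` are made up for illustration and do not appear in insn-preds.cc.

```cpp
#include <cstring>

// Form generated before the patch: a library call that compares a single
// character at offset 1.
bool matches_old(const char* str) {
    return !std::strncmp(str + 1, "g", 1);
}

// Form generated after the patch: a direct character comparison.
bool matches_new(const char* str) {
    return str[1] == 'g';
}
```

Both functions agree for any string of length at least one, including the case where `str[1]` is the terminating NUL, since a one-character strncmp against "g" then compares '\0' with 'g' and reports a mismatch.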

This patch fixes both ICE regressions PR middle-end/105853 and PR target/105856 caused by my recent patch to expand small const structs as immediate constants. That patch updated code generation in three places: two in expr.cc that call store_constructor directly, and the third in calls.cc's load_register_parameters that expands its CONSTRUCTOR via expand_expr, as store_constructor is local/static to expr.cc, and the "public" API should usually simply forward the constructor to the appropriate store_constructor function.

Alas, despite the clean regression testing on multiple targets, the above ICEs show that expand_expr isn't a suitable proxy for store_constructor, and things that (I'd assumed) shouldn't affect how/whether a struct is placed in a register [such as whether the struct is considered packed/aligned or not] actually interfere with the optimization that is being attempted.

The (proposed) solution is to export store_constructor (and its helper function int_expr_size) from expr.cc, by removing their static qualifier and prototyping both functions in expr.h, so they can be called directly from load_register_parameters in calls.cc. This cures both ICEs, but almost as importantly improves code generation over GCC 12.

For PR 105853, GCC 12 generates:

    compose_nd_na_ipv6_src:
            movzx   eax, WORD PTR eth_addr_zero[rip+2]
            movzx   edx, WORD PTR eth_addr_zero[rip]
            movzx   edi, WORD PTR eth_addr_zero[rip+4]
            sal     rax, 16
            or      rax, rdx
            sal     rdi, 32
            or      rdi, rax
            xor     eax, eax
            jmp     packet_set_nd
    eth_addr_zero:
            .zero   6

where now (with this fix) GCC 13 generates:

    compose_nd_na_ipv6_src:
            xorl    %edi, %edi
            xorl    %eax, %eax
            jmp     packet_set_nd

Likewise, for PR 105856 on ARM, we'd previously generate:

    g_329_3:
            movw    r3, #:lower16:.LANCHOR0
            movt    r3, #:upper16:.LANCHOR0
            ldr     r0, [r3]
            b       func_19

but with this optimization we now generate:

    g_329_3:
            mov     r0, #6
            b       func_19

2022-06-07  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
    PR middle-end/105853
    PR target/105856
    * calls.cc (load_register_parameters): Call store_constructor and int_expr_size directly instead of expanding via expand_expr.
    * expr.cc (static void store_constructor): Don't prototype here.
    (static HOST_WIDE_INT int_expr_size): Likewise.
    (store_constructor): No longer static.
    (int_expr_size): Likewise, no longer static.
    * expr.h (store_constructor): Prototype here.
    (int_expr_size): Prototype here.

gcc/testsuite/ChangeLog
    PR middle-end/105853
    PR target/105856
    * gcc.dg/pr105853.c: New test case.
    * gcc.dg/pr105856.c: New test case.
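As a rough illustration of what "expanding a small const struct as an immediate constant" amounts to, the sketch below folds the bytes of a small object into a single 64-bit integer, which is the value the argument register ends up holding. The `eth_addr` type is a guess loosely based on the `eth_addr_zero` symbol in the assembly above (the real test case may look different), and `register_image` is a hypothetical helper, not a GCC function.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 6-byte struct (assumption: three 16-bit words, matching the
// three 16-bit loads in the GCC 12 output).
struct eth_addr {
    std::uint16_t words[3];
};

// Copy the object's bytes into a zero-initialised 64-bit integer. For a
// struct that fits in one register this is the constant a compiler could
// load directly instead of reading the object back from memory.
template <class T>
std::uint64_t register_image(const T& value) {
    static_assert(sizeof(T) <= sizeof(std::uint64_t),
                  "struct must fit in one 64-bit register");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &value, sizeof(T));
    return bits;
}
```

For an all-zero `eth_addr` the image is 0, which corresponds to the single `xorl %edi, %edi` in the GCC 13 output; for non-zero contents the exact integer depends on the target's byte order.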

This patch addresses the issue in comment #6 of PR rtl-optimization/7061 (a four digit PR number) from 2006 where on x86_64 complex number arguments are unconditionally spilled to the stack.

For the test cases below:

    float re(float _Complex a) { return __real__ a; }
    float im(float _Complex a) { return __imag__ a; }

GCC with -O2 currently generates:

    re:     movq    %xmm0, -8(%rsp)
            movss   -8(%rsp), %xmm0
            ret
    im:     movq    %xmm0, -8(%rsp)
            movss   -4(%rsp), %xmm0
            ret

with this patch we now generate:

    re:     ret
    im:     movq    %xmm0, %rax
            shrq    $32, %rax
            movd    %eax, %xmm0
            ret

[Technically, this shift can be performed on %xmm0 in a single instruction, but the backend needs to be taught to do that; the important bit is that the SCmode argument isn't written to the stack].

The patch itself is to emit_group_store where, just before RTL expansion commits to writing to the stack, we check if the store group consists of a single scalar integer register that holds a complex mode value; on x86_64 SCmode arguments are passed in DImode registers. If this is the case, we can use a SUBREG to "view_convert" the integer to the equivalent complex mode.

An interesting corner case that showed up during testing is that x86_64 also passes HCmode arguments in DImode registers(!), i.e. using modes of different sizes. This is easily handled/supported by first converting to an integer mode of the correct size, and then generating a complex mode SUBREG of this.

This is similar in concept to the patch I proposed here: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html

2020-06-10  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
    PR rtl-optimization/7061
    * expr.cc (emit_group_store): For groups that consist of a single scalar integer register that holds a complex mode value, use gen_lowpart to generate a SUBREG to "view_convert" to the complex mode. For modes of different sizes, first convert to an integer mode of the appropriate size.

gcc/testsuite/ChangeLog
    PR rtl-optimization/7061
    * gcc.target/i386/pr7061-1.c: New test case.
    * gcc.target/i386/pr7061-2.c: New test case.
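The register-level view in the new `im` sequence (shift the 64-bit value right by 32, then reinterpret the low half as a float) can be mirrored in portable source. This is a sketch only, assuming a little-endian target such as x86_64; `to_bits`, `float_from_low32`, `re_part` and `im_part` are illustrative names, and std::complex<float> is used in place of the GNU `float _Complex` extension.

```cpp
#include <complex>
#include <cstdint>
#include <cstring>

// Reinterpret a complex<float> as the 64-bit integer a DImode register
// would hold. std::complex<float> is stored as two adjacent floats.
std::uint64_t to_bits(std::complex<float> z) {
    static_assert(sizeof(std::complex<float>) == sizeof(std::uint64_t),
                  "complex<float> must occupy exactly 64 bits");
    std::uint64_t bits;
    std::memcpy(&bits, &z, sizeof bits);
    return bits;
}

// Reinterpret the low 32 bits of an integer as a float (the "view_convert").
float float_from_low32(std::uint64_t bits) {
    std::uint32_t low = static_cast<std::uint32_t>(bits);
    float f;
    std::memcpy(&f, &low, sizeof f);
    return f;
}

// Little-endian assumption: the real part sits in the low half, so no
// shift is needed (compare the bare "ret" for re above).
float re_part(std::uint64_t bits) { return float_from_low32(bits); }

// The imaginary part sits in the high half, hence the shift by 32
// (compare "shrq $32, %rax" above).
float im_part(std::uint64_t bits) { return float_from_low32(bits >> 32); }
```

Neither accessor goes through a stack slot, which is the property the patch is after.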

I noticed that for member class templates of a class template we were unnecessarily substituting both the template and its type. Avoiding that duplication speeds compilation of this silly testcase from ~12s to ~9s on my laptop. It's unlikely to make a difference on any real code, but the simplification is also nice.

We still need to clear CLASSTYPE_USE_TEMPLATE on the partial instantiation of the template class, but it makes more sense to do that in tsubst_template_decl anyway.

    #define NC(X)                           \
      template <class U> struct X##1;       \
      template <class U> struct X##2;       \
      template <class U> struct X##3;       \
      template <class U> struct X##4;       \
      template <class U> struct X##5;       \
      template <class U> struct X##6;
    #define NC2(X) NC(X##a) NC(X##b) NC(X##c) NC(X##d) NC(X##e) NC(X##f)
    #define NC3(X) NC2(X##A) NC2(X##B) NC2(X##C) NC2(X##D) NC2(X##E)
    template <int I> struct A
    {
      NC3(am)
    };
    template <class...Ts> void sink(Ts...);
    template <int...Is> void g()
    {
      sink(A<Is>()...);
    }
    template <int I> void f()
    {
      g<__integer_pack(I)...>();
    }
    int main()
    {
      f<1000>();
    }

gcc/cp/ChangeLog:
    * pt.cc (instantiate_class_template): Skip the RECORD_TYPE of a class template.
    (tsubst_template_decl): Clear CLASSTYPE_USE_TEMPLATE.

ecatmur added 3 commits February 13, 2022 23:00

implement static member function

690810a

per P2370

Merge remote-tracking branch 'ecatmur/master' into with-stacktrace-at…

36969be

…tribute

ecatmur added 3 commits July 27, 2022 19:37

Merge branch 'gcc-mirror:master' into with-stacktrace-attribute

a922b3c

Merge branch 'gcc-mirror:master' into with-stacktrace-attribute

a7e9e74

Merge branch 'gcc-mirror:master' into with-stacktrace-attribute

52f3c7d
