Skip to content

Merge pull request #1 from gcc-mirror/master #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Merge pull request #1 from gcc-mirror/master #3

wants to merge 1 commit into from

Conversation

ghost
Copy link

@ghost ghost commented Jul 3, 2015

wtf: just merge

@ghost ghost closed this Jul 3, 2015
marxin pushed a commit to marxin/gccold that referenced this pull request May 2, 2016
	operand gcc-mirror#2 for COMPONENT_REF.
	* gcc-interface/utils2.c (gnat_save_expr): Likewise.
	(gnat_protect_expr): Likewise.
	(gnat_stabilize_reference_1): Likewise.
	(gnat_rewrite_reference): Do not bother about operand gcc-mirror#3 for ARRAY_REF.
	(get_inner_constant_reference): Likewise.
	(gnat_invariant_expr): Likewise.
	* gcc-interface/trans.c (fold_constant_decl_in_expr): Likewise.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@235701 138bc75d-0d04-0410-961f-82ee72b054a4
muffgaga pushed a commit to electronicvisions/gcc that referenced this pull request Mar 1, 2017
removed obsolete S2PP_REG_TYPE -> back to normal, because STD_REG_TYP…
hubot pushed a commit that referenced this pull request May 2, 2017
	* sem_attr.adb (Attribute_Enum_Rep): Disallow T'Enum_Rep.

2017-05-02  Vasiliy Fofanov  <fofanov@adacore.com>

	* s-os_lib.ads: Minor typo fix.

2017-05-02  Vasiliy Fofanov  <fofanov@adacore.com>

	* gnatls.adb: Merge and refactor code from Prj.Env and remove
	this deprecated dependency.

2017-05-02  Ed Schonberg  <schonberg@adacore.com>

	* exp_util.ads: minor comment addition.

2017-05-02  Eric Botcazou  <ebotcazou@adacore.com>

	* sem_ch3.adb (Build_Derived_Record_Type): Fix a few typos and
	pastos in part #3 of the head comment.

2017-05-02  Ed Schonberg  <schonberg@adacore.com>

	* exp_ch3.adb (Freeze_Type): Do not generate an invariant
	procedure body for a local (sub)type declaration within a
	predicate function. Invariant checks do not apply to these, and
	the expansion of the procedure will happen in the wrong scope,
	leading to misplaced freeze nodes.

2017-05-02  Ed Schonberg  <schonberg@adacore.com>

	* exp_util.adb (Insert_Library_Level_Action): Use proper scope
	to analyze generated actions.  If the main unit is a body,
	the required scope is that of the corresponding unit declaration.

2017-05-02  Arnaud Charlet  <charlet@adacore.com>

	* einfo.adb (Declaration_Node): flip branches of
	an IF statement to avoid repeated negations in its condition;
	no change in semantics, only to improve readability.



git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@247480 138bc75d-0d04-0410-961f-82ee72b054a4
marxin added a commit to marxin/gccold that referenced this pull request May 29, 2017
hubot pushed a commit that referenced this pull request May 29, 2017
	* cp-tree.h (build_min_nt_call_vec): Declare.
	* decl.c (build_offset_ref_call_from_tree): Call it.
	* parser.c (cp_parser_postfix_expression): Likewise.
	* pt.c (tsubst_copy_and_build): Likewise.
	* semantics.c (finish_call_expr): Likewise.
	* tree.c (build_min_nt_loc): Keep unresolved lookups.
	(build_min): Likewise.
	(build_min_non_dep): Likewise.
	(build_min_non_dep_call_vec): Likewise.
	(build_min_nt_call_vec): New.

	PR c++/80891 (#3)
	* g++.dg/lookup/pr80891-3.C: New.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@248571 138bc75d-0d04-0410-961f-82ee72b054a4
hubot pushed a commit that referenced this pull request Apr 19, 2018
When -fcf-protection -mcet is used, I got

FAIL: g++.dg/eh/sighandle.C

(gdb) bt
 #0  _Unwind_RaiseException (exc=exc@entry=0x416ed0)
    at /export/gnu/import/git/sources/gcc/libgcc/unwind.inc:140
 #1  0x00007ffff7d9936b in __cxxabiv1::__cxa_throw (obj=<optimized out>,
    tinfo=0x403dd0 <typeinfo for int@@CXXABI_1.3>, dest=0x0)
    at /export/gnu/import/git/sources/gcc/libstdc++-v3/libsupc++/eh_throw.cc:90
 #2  0x0000000000401255 in sighandler (signo=11, si=0x7fffffffd6f8,
    uc=0x7fffffffd5c0)
    at /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/eh/sighandle.C:9
 #3  <signal handler called> <<<< Signal frame which isn't on shadow stack
 #4  dosegv ()
    at /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/eh/sighandle.C:14
 #5  0x00000000004012e3 in main ()
    at /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/eh/sighandle.C:30
(gdb) p frames
$6 = 5
(gdb)

frame count should be 4, not 5.  This patch skips signal frames when
unwinding shadow stack.

gcc/testsuite/

	PR libgcc/85334
	* g++.dg/torture/pr85334.C: New test.

libgcc/

	PR libgcc/85334
	* unwind-generic.h (_Unwind_Frames_Increment): New.
	* config/i386/shadow-stack-unwind.h (_Unwind_Frames_Increment):
	Likewise.
	* unwind.inc (_Unwind_RaiseException_Phase2): Increment frame
	count with _Unwind_Frames_Increment.
	(_Unwind_ForcedUnwind_Phase2): Likewise.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@259502 138bc75d-0d04-0410-961f-82ee72b054a4
kraj pushed a commit to kraj/gcc that referenced this pull request Oct 31, 2019
…piling for Thumb2

Thumb2 code now uses the Arm implementation of legitimize_address.
That code has a case to handle addresses that are absolute CONST_INT
values, which is a common use case in deeply embedded targets (eg:
void *p = (void*)0x12345678).  Since thumb has very limited negative
offsets from a constant, we want to avoid forming a CSE base that will
then be used with a negative value.

This was reported upstream originally in
https://gcc.gnu.org/ml/gcc-help/2019-10/msg00122.html

For example,

void test1(void) {
  volatile uint32_t * const p = (uint32_t *) 0x43fe1800;

  p[3] = 1;
  p[4] = 2;
  p[1] = 3;
  p[7] = 4;
  p[0] = 6;
}

With the new code, instead of 

        ldr     r3, .L2
        subw    r2, r3, #2035
        movs    r1, #1
        str     r1, [r2]
        subw    r2, r3, #2031
        movs    r1, gcc-mirror#2
        str     r1, [r2]
        subw    r2, r3, #2043
        movs    r1, gcc-mirror#3
        str     r1, [r2]
        subw    r2, r3, #2019
        movs    r1, gcc-mirror#4
        subw    r3, r3, #2047
        str     r1, [r2]
        movs    r2, gcc-mirror#6
        str     r2, [r3]
        bx      lr


We now get

        ldr     r3, .L2
        movs    r2, #1
        str     r2, [r3, #2060]
        movs    r2, gcc-mirror#2
        str     r2, [r3, #2064]
        movs    r2, gcc-mirror#3
        str     r2, [r3, #2052]
        movs    r2, gcc-mirror#4
        str     r2, [r3, #2076]
        movs    r2, gcc-mirror#6
        str     r2, [r3, #2048]
        bx      lr


	* config/arm/arm.c (arm_legitimize_address): Don't form negative
	offsets from a CONST_INT address when TARGET_THUMB2.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@277677 138bc75d-0d04-0410-961f-82ee72b054a4
emsr pushed a commit to emsr/gcc that referenced this pull request Nov 2, 2019
…piling for Thumb2

Thumb2 code now uses the Arm implementation of legitimize_address.
That code has a case to handle addresses that are absolute CONST_INT
values, which is a common use case in deeply embedded targets (eg:
void *p = (void*)0x12345678).  Since thumb has very limited negative
offsets from a constant, we want to avoid forming a CSE base that will
then be used with a negative value.

This was reported upstream originally in
https://gcc.gnu.org/ml/gcc-help/2019-10/msg00122.html

For example,

void test1(void) {
  volatile uint32_t * const p = (uint32_t *) 0x43fe1800;

  p[3] = 1;
  p[4] = 2;
  p[1] = 3;
  p[7] = 4;
  p[0] = 6;
}

With the new code, instead of 

        ldr     r3, .L2
        subw    r2, r3, #2035
        movs    r1, gcc-mirror#1
        str     r1, [r2]
        subw    r2, r3, #2031
        movs    r1, gcc-mirror#2
        str     r1, [r2]
        subw    r2, r3, #2043
        movs    r1, gcc-mirror#3
        str     r1, [r2]
        subw    r2, r3, #2019
        movs    r1, gcc-mirror#4
        subw    r3, r3, #2047
        str     r1, [r2]
        movs    r2, gcc-mirror#6
        str     r2, [r3]
        bx      lr


We now get

        ldr     r3, .L2
        movs    r2, gcc-mirror#1
        str     r2, [r3, #2060]
        movs    r2, gcc-mirror#2
        str     r2, [r3, #2064]
        movs    r2, gcc-mirror#3
        str     r2, [r3, #2052]
        movs    r2, gcc-mirror#4
        str     r2, [r3, #2076]
        movs    r2, gcc-mirror#6
        str     r2, [r3, #2048]
        bx      lr


	* config/arm/arm.c (arm_legitimize_address): Don't form negative
	offsets from a CONST_INT address when TARGET_THUMB2.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@277677 138bc75d-0d04-0410-961f-82ee72b054a4
jwakely pushed a commit to jwakely/gcc that referenced this pull request Jan 14, 2020
…piling for Thumb2

Thumb2 code now uses the Arm implementation of legitimize_address.
That code has a case to handle addresses that are absolute CONST_INT
values, which is a common use case in deeply embedded targets (eg:
void *p = (void*)0x12345678).  Since thumb has very limited negative
offsets from a constant, we want to avoid forming a CSE base that will
then be used with a negative value.

This was reported upstream originally in
https://gcc.gnu.org/ml/gcc-help/2019-10/msg00122.html

For example,

void test1(void) {
  volatile uint32_t * const p = (uint32_t *) 0x43fe1800;

  p[3] = 1;
  p[4] = 2;
  p[1] = 3;
  p[7] = 4;
  p[0] = 6;
}

With the new code, instead of 

        ldr     r3, .L2
        subw    r2, r3, #2035
        movs    r1, #1
        str     r1, [r2]
        subw    r2, r3, #2031
        movs    r1, gcc-mirror#2
        str     r1, [r2]
        subw    r2, r3, #2043
        movs    r1, gcc-mirror#3
        str     r1, [r2]
        subw    r2, r3, #2019
        movs    r1, gcc-mirror#4
        subw    r3, r3, #2047
        str     r1, [r2]
        movs    r2, gcc-mirror#6
        str     r2, [r3]
        bx      lr


We now get

        ldr     r3, .L2
        movs    r2, #1
        str     r2, [r3, #2060]
        movs    r2, gcc-mirror#2
        str     r2, [r3, #2064]
        movs    r2, gcc-mirror#3
        str     r2, [r3, #2052]
        movs    r2, gcc-mirror#4
        str     r2, [r3, #2076]
        movs    r2, gcc-mirror#6
        str     r2, [r3, #2048]
        bx      lr


	* config/arm/arm.c (arm_legitimize_address): Don't form negative
	offsets from a CONST_INT address when TARGET_THUMB2.

From-SVN: r277677
maxim-kuvyrkov pushed a commit to maxim-kuvyrkov/gcc-reparent that referenced this pull request Jan 17, 2020
git-svn-id: https://gcc.gnu.org/svn/gcc/branches/ibm/addr-patch@217386 138bc75d-0d04-0410-961f-82ee72b054a4
maxim-kuvyrkov pushed a commit to maxim-kuvyrkov/gcc-reparent that referenced this pull request Jan 17, 2020
git-svn-id: https://gcc.gnu.org/svn/gcc/branches/ibm/addr-patch@217390 138bc75d-0d04-0410-961f-82ee72b054a4
maxim-kuvyrkov pushed a commit to maxim-kuvyrkov/gcc-reparent that referenced this pull request Jan 17, 2020
kraj pushed a commit to kraj/gcc that referenced this pull request Jun 17, 2020
Made apparent by recent commit dc70315
"openmp: Implement discovery of implicit declare target to clauses":

    +FAIL: libgomp.c/target-39.c (internal compiler error)
    +FAIL: libgomp.c/target-39.c (test for excess errors)
    +UNRESOLVED: libgomp.c/target-39.c compilation failed to produce executable

This is in a '--enable-offload-targets=[...],hsa' build, with '-foffload=hsa'
enabled (by default).

    during GIMPLE pass: hsagen
    source-gcc/libgomp/testsuite/libgomp.c/target-39.c: In function ‘main._omp_fn.0.hsa.0’:
    source-gcc/libgomp/testsuite/libgomp.c/target-39.c:23:11: internal compiler error: Segmentation fault
       23 |   #pragma omp target map(from:err)
          |           ^~~
    [...]

GDB:

    Program received signal SIGSEGV, Segmentation fault.
    fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267
    6267      return (fndecl_built_in_p (node, BUILT_IN_NORMAL)
    (gdb) bt
    #0  fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267
    #1  0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5304
    gcc-mirror#2  0x0000000000b1aca7 in gen_hsa_insns_for_gimple_stmt (stmt=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5770
    gcc-mirror#3  0x0000000000b1bd21 in gen_body_from_gimple () at [...]/source-gcc/gcc/hsa-gen.c:5999
    gcc-mirror#4  0x0000000000b1dbd2 in generate_hsa (kernel=<optimized out>) at [...]/source-gcc/gcc/hsa-gen.c:6596
    gcc-mirror#5  0x0000000000b1de66 in (anonymous namespace)::pass_gen_hsail::execute (this=0x2a2aac0) at [...]/source-gcc/gcc/hsa-gen.c:6680
    gcc-mirror#6  0x0000000000d06f90 in execute_one_pass (pass=pass@entry=0x2a2aac0) at [...]/source-gcc/gcc/passes.c:2502
    [...]
    (gdb) up
    #1  0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at /home/thomas/tmp/source/gcc/build/track-slim-omp/source-gcc/gcc/hsa-gen.c:5304
    5304          if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH))
    (gdb) print function_decl
    $1 = (tree) 0x0
    (gdb) list
    5299      if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
    5300        {
    5301          tree function_decl = gimple_call_fndecl (stmt);
    5302          /* Prefetch pass can create type-mismatching prefetch builtin calls which
    5303             fail the gimple_call_builtin_p test above.  Handle them here.  */
    5304          if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH))
    5305            return;
    5306
    5307          if (function_decl == NULL_TREE)
    5308            {

The problem is present already since 2016-11-23 commit
56b1c60 (r242761) "Merge from HSA branch to
trunk", and the fix obvious enough.

	gcc/
	* hsa-gen.c (gen_hsa_insns_for_call): Move 'function_decl ==
	NULL_TREE' check earlier.
	gcc/testsuite/
	* c-c++-common/gomp/hsa-indirect-call-1.c: New file.
kraj pushed a commit to kraj/gcc that referenced this pull request Jun 17, 2020
Made apparent by recent commit dc70315
"openmp: Implement discovery of implicit declare target to clauses":

    +FAIL: libgomp.c/target-39.c (internal compiler error)
    +FAIL: libgomp.c/target-39.c (test for excess errors)
    +UNRESOLVED: libgomp.c/target-39.c compilation failed to produce executable

This is in a '--enable-offload-targets=[...],hsa' build, with '-foffload=hsa'
enabled (by default).

    during GIMPLE pass: hsagen
    source-gcc/libgomp/testsuite/libgomp.c/target-39.c: In function ‘main._omp_fn.0.hsa.0’:
    source-gcc/libgomp/testsuite/libgomp.c/target-39.c:23:11: internal compiler error: Segmentation fault
       23 |   #pragma omp target map(from:err)
          |           ^~~
    [...]

GDB:

    Program received signal SIGSEGV, Segmentation fault.
    fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267
    6267      return (fndecl_built_in_p (node, BUILT_IN_NORMAL)
    (gdb) bt
    #0  fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267
    #1  0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5304
    gcc-mirror#2  0x0000000000b1aca7 in gen_hsa_insns_for_gimple_stmt (stmt=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5770
    gcc-mirror#3  0x0000000000b1bd21 in gen_body_from_gimple () at [...]/source-gcc/gcc/hsa-gen.c:5999
    gcc-mirror#4  0x0000000000b1dbd2 in generate_hsa (kernel=<optimized out>) at [...]/source-gcc/gcc/hsa-gen.c:6596
    gcc-mirror#5  0x0000000000b1de66 in (anonymous namespace)::pass_gen_hsail::execute (this=0x2a2aac0) at [...]/source-gcc/gcc/hsa-gen.c:6680
    gcc-mirror#6  0x0000000000d06f90 in execute_one_pass (pass=pass@entry=0x2a2aac0) at [...]/source-gcc/gcc/passes.c:2502
    [...]
    (gdb) up
    #1  0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at /home/thomas/tmp/source/gcc/build/track-slim-omp/source-gcc/gcc/hsa-gen.c:5304
    5304          if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH))
    (gdb) print function_decl
    $1 = (tree) 0x0
    (gdb) list
    5299      if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
    5300        {
    5301          tree function_decl = gimple_call_fndecl (stmt);
    5302          /* Prefetch pass can create type-mismatching prefetch builtin calls which
    5303             fail the gimple_call_builtin_p test above.  Handle them here.  */
    5304          if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH))
    5305            return;
    5306
    5307          if (function_decl == NULL_TREE)
    5308            {

The problem is present already since 2016-11-23 commit
56b1c60 (r242761) "Merge from HSA branch to
trunk", and the fix obvious enough.

	gcc/
	* hsa-gen.c (gen_hsa_insns_for_call): Move 'function_decl ==
	NULL_TREE' check earlier.
	gcc/testsuite/
	* c-c++-common/gomp/hsa-indirect-call-1.c: New file.

(cherry picked from commit 973bce0)
kraj pushed a commit to kraj/gcc that referenced this pull request Jun 17, 2020
Made apparent by recent commit dc70315
"openmp: Implement discovery of implicit declare target to clauses":

    +FAIL: libgomp.c/target-39.c (internal compiler error)
    +FAIL: libgomp.c/target-39.c (test for excess errors)
    +UNRESOLVED: libgomp.c/target-39.c compilation failed to produce executable

This is in a '--enable-offload-targets=[...],hsa' build, with '-foffload=hsa'
enabled (by default).

    during GIMPLE pass: hsagen
    source-gcc/libgomp/testsuite/libgomp.c/target-39.c: In function ‘main._omp_fn.0.hsa.0’:
    source-gcc/libgomp/testsuite/libgomp.c/target-39.c:23:11: internal compiler error: Segmentation fault
       23 |   #pragma omp target map(from:err)
          |           ^~~
    [...]

GDB:

    Program received signal SIGSEGV, Segmentation fault.
    fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267
    6267      return (fndecl_built_in_p (node, BUILT_IN_NORMAL)
    (gdb) bt
    #0  fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267
    #1  0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5304
    gcc-mirror#2  0x0000000000b1aca7 in gen_hsa_insns_for_gimple_stmt (stmt=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5770
    gcc-mirror#3  0x0000000000b1bd21 in gen_body_from_gimple () at [...]/source-gcc/gcc/hsa-gen.c:5999
    gcc-mirror#4  0x0000000000b1dbd2 in generate_hsa (kernel=<optimized out>) at [...]/source-gcc/gcc/hsa-gen.c:6596
    gcc-mirror#5  0x0000000000b1de66 in (anonymous namespace)::pass_gen_hsail::execute (this=0x2a2aac0) at [...]/source-gcc/gcc/hsa-gen.c:6680
    gcc-mirror#6  0x0000000000d06f90 in execute_one_pass (pass=pass@entry=0x2a2aac0) at [...]/source-gcc/gcc/passes.c:2502
    [...]
    (gdb) up
    #1  0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at /home/thomas/tmp/source/gcc/build/track-slim-omp/source-gcc/gcc/hsa-gen.c:5304
    5304          if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH))
    (gdb) print function_decl
    $1 = (tree) 0x0
    (gdb) list
    5299      if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
    5300        {
    5301          tree function_decl = gimple_call_fndecl (stmt);
    5302          /* Prefetch pass can create type-mismatching prefetch builtin calls which
    5303             fail the gimple_call_builtin_p test above.  Handle them here.  */
    5304          if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH))
    5305            return;
    5306
    5307          if (function_decl == NULL_TREE)
    5308            {

The problem is present already since 2016-11-23 commit
56b1c60 (r242761) "Merge from HSA branch to
trunk", and the fix obvious enough.

	gcc/
	* hsa-gen.c (gen_hsa_insns_for_call): Move 'function_decl ==
	NULL_TREE' check earlier.
	gcc/testsuite/
	* c-c++-common/gomp/hsa-indirect-call-1.c: New file.

(cherry picked from commit 973bce0)
kraj pushed a commit to kraj/gcc that referenced this pull request Jun 17, 2020
Made apparent by recent commit dc70315
"openmp: Implement discovery of implicit declare target to clauses":

    +FAIL: libgomp.c/target-39.c (internal compiler error)
    +FAIL: libgomp.c/target-39.c (test for excess errors)
    +UNRESOLVED: libgomp.c/target-39.c compilation failed to produce executable

This is in a '--enable-offload-targets=[...],hsa' build, with '-foffload=hsa'
enabled (by default).

    during GIMPLE pass: hsagen
    source-gcc/libgomp/testsuite/libgomp.c/target-39.c: In function ‘main._omp_fn.0.hsa.0’:
    source-gcc/libgomp/testsuite/libgomp.c/target-39.c:23:11: internal compiler error: Segmentation fault
       23 |   #pragma omp target map(from:err)
          |           ^~~
    [...]

GDB:

    Program received signal SIGSEGV, Segmentation fault.
    fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267
    6267      return (fndecl_built_in_p (node, BUILT_IN_NORMAL)
    (gdb) bt
    #0  fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267
    #1  0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5304
    gcc-mirror#2  0x0000000000b1aca7 in gen_hsa_insns_for_gimple_stmt (stmt=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5770
    gcc-mirror#3  0x0000000000b1bd21 in gen_body_from_gimple () at [...]/source-gcc/gcc/hsa-gen.c:5999
    gcc-mirror#4  0x0000000000b1dbd2 in generate_hsa (kernel=<optimized out>) at [...]/source-gcc/gcc/hsa-gen.c:6596
    gcc-mirror#5  0x0000000000b1de66 in (anonymous namespace)::pass_gen_hsail::execute (this=0x2a2aac0) at [...]/source-gcc/gcc/hsa-gen.c:6680
    gcc-mirror#6  0x0000000000d06f90 in execute_one_pass (pass=pass@entry=0x2a2aac0) at [...]/source-gcc/gcc/passes.c:2502
    [...]
    (gdb) up
    #1  0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at /home/thomas/tmp/source/gcc/build/track-slim-omp/source-gcc/gcc/hsa-gen.c:5304
    5304          if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH))
    (gdb) print function_decl
    $1 = (tree) 0x0
    (gdb) list
    5299      if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
    5300        {
    5301          tree function_decl = gimple_call_fndecl (stmt);
    5302          /* Prefetch pass can create type-mismatching prefetch builtin calls which
    5303             fail the gimple_call_builtin_p test above.  Handle them here.  */
    5304          if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH))
    5305            return;
    5306
    5307          if (function_decl == NULL_TREE)
    5308            {

The problem is present already since 2016-11-23 commit
56b1c60 (r242761) "Merge from HSA branch to
trunk", and the fix obvious enough.

	gcc/
	* hsa-gen.c (gen_hsa_insns_for_call): Move 'function_decl ==
	NULL_TREE' check earlier.
	gcc/testsuite/
	* c-c++-common/gomp/hsa-indirect-call-1.c: New file.

(cherry picked from commit 973bce0)
kraj pushed a commit to kraj/gcc that referenced this pull request Aug 17, 2020
Since 21cfe72 there's a new
OMP_LIST_NONTEMPORAL value, but it was missing in
resolve_omp_clauses static array that is defined at the function
beginning:

./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gfortran.dg/gomp/nontemporal-1.f90 -fopenmp -c
../../gcc/fortran/openmp.c:4737:28: runtime error: index 21 out of bounds for type 'char *[21]'
    #0 0xbdb956 in resolve_omp_clauses ../../gcc/fortran/openmp.c:4737
    #1 0xbeb076 in resolve_omp_do ../../gcc/fortran/openmp.c:6139
    gcc-mirror#2 0xbf029a in gfc_resolve_omp_directive(gfc_code*, gfc_namespace*) ../../gcc/fortran/openmp.c:6792
    gcc-mirror#3 0xcb6363 in gfc_resolve_code(gfc_code*, gfc_namespace*) ../../gcc/fortran/resolve.c:12185
    gcc-mirror#4 0xcef8cf in resolve_codes ../../gcc/fortran/resolve.c:17303

gcc/fortran/ChangeLog:

	* openmp.c (resolve_omp_clauses): Add NONTEMPORAL to clause
	names.
kraj pushed a commit to kraj/gcc that referenced this pull request Aug 22, 2020
PR analyzer/94851 reports various false "NULL dereference" diagnostics.
The first case (comment #1) affects GCC 10.2 but no longer affects
trunk; I believe it was fixed by the state rewrite of
r11-2694-g808f4dfeb3a95f50f15e71148e5c1067f90a126d.

The patch adds a regression test for this case.

The other cases (comment gcc-mirror#3 and comment gcc-mirror#4) still affect trunk.
In both cases, the && in a conditional is optimized to bitwise &
  _1 = p_4 != 0B;
  _2 = p_4 != q_6(D);
  _3 = _1 & _2;
and the analyzer fails to fold this for the case where one (or both) of
the conditionals is false, and thus erroneously considers the path where
"p" is non-NULL despite being passed a NULL value.

Fix this by implementing folding for this case.

gcc/analyzer/ChangeLog:
	PR analyzer/94851
	* region-model-manager.cc
	(region_model_manager::maybe_fold_binop): Fold bitwise "& 0" to 0.

gcc/testsuite/ChangeLog:
	PR analyzer/94851
	* gcc.dg/analyzer/pr94851-1.c: New test.
	* gcc.dg/analyzer/pr94851-3.c: New test.
	* gcc.dg/analyzer/pr94851-4.c: New test.
kraj pushed a commit to kraj/gcc that referenced this pull request Sep 7, 2020
Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw gcc-mirror#3

This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.

---

gcc/ChangeLog:

	* config/aarch64/aarch64.md
	(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
	agrees with width of extension specifier.
	(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*add_uxt<mode>_shift2): Likewise.
	(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*sub_uxt<mode>_shift2): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
	* gcc.target/aarch64/cmp.c: Likewise.
	* gcc.target/aarch64/subs3.c: Likewise.
	* gcc.target/aarch64/subsp.c: Likewise.
	* gcc.target/aarch64/extend-syntax.c: New test.
kraj pushed a commit to kraj/gcc that referenced this pull request Oct 12, 2020
Prevents the following UBSAN error:

./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/torture/pr49770.C -O2 -c
/home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 2, which is not a valid value for type 'bool'
    #0 0x1fdb4d1 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482
    #1 0x1fcadaa in merge_call_side_effects(modref_summary*, gimple*, modref_summary*, bool) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:511
    gcc-mirror#2 0x1fcbadd in analyze_call /home/marxin/Programming/gcc2/gcc/ipa-modref.c:642
    gcc-mirror#3 0x1fcc061 in analyze_stmt /home/marxin/Programming/gcc2/gcc/ipa-modref.c:732
    gcc-mirror#4 0x1fccf31 in analyze_function /home/marxin/Programming/gcc2/gcc/ipa-modref.c:823
    gcc-mirror#5 0x1fd17e5 in execute /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1441
    gcc-mirror#6 0x25cca6e in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509
    gcc-mirror#7 0x25cd39b in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2597
    gcc-mirror#8 0x25cd450 in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2598
    gcc-mirror#9 0x25cd4ee in execute_pass_list(function*, opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2608
    gcc-mirror#10 0x25c7a5a in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/Programming/gcc2/gcc/passes.c:1726
    gcc-mirror#11 0x25cfa3f in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2941
    gcc-mirror#12 0x173572d in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2642
    gcc-mirror#13 0x17364ee in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777
    gcc-mirror#14 0x17372d9 in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022
    gcc-mirror#15 0x2a1f00a in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485
    gcc-mirror#16 0x2a27dc8 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321
    gcc-mirror#17 0x2a283cc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460
    gcc-mirror#18 0x54f21cd in main /home/marxin/Programming/gcc2/gcc/main.c:39
    gcc-mirror#19 0x7ffff6f0de09 in __libc_start_main ../csu/libc-start.c:314
    gcc-mirror#20 0x9eac09 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1plus+0x9eac09)

gcc/ChangeLog:

	* ipa-modref.c (merge_call_side_effects): Clear modref_parm_map
	fields in the vector.
kraj pushed a commit to kraj/gcc that referenced this pull request Oct 19, 2020
It fixes:

/home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 255, which is not a valid value for type 'bool'
    #0 0x18e5df3 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482
    #1 0x18dc180 in ipa_merge_modref_summary_after_inlining(cgraph_edge*) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1779
    gcc-mirror#2 0x18c1c72 in inline_call(cgraph_edge*, bool, vec<cgraph_edge*, va_heap, vl_ptr>*, int*, bool, bool*) /home/marxin/Programming/gcc2/gcc/ipa-inline-transform.c:492
    gcc-mirror#3 0x4a3589c in inline_small_functions /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2216
    gcc-mirror#4 0x4a3b230 in ipa_inline /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2697
    gcc-mirror#5 0x4a3d902 in execute /home/marxin/Programming/gcc2/gcc/ipa-inline.c:3096
    gcc-mirror#6 0x1edf831 in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509
    gcc-mirror#7 0x1ee26af in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2936
    gcc-mirror#8 0x103f31b in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2700
    gcc-mirror#9 0x103fb40 in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777
    gcc-mirror#10 0x104092b in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022
    gcc-mirror#11 0x235723b in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485
    gcc-mirror#12 0x235fff9 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321
    gcc-mirror#13 0x23605fc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460
    gcc-mirror#14 0x4e2b93b in main /home/marxin/Programming/gcc2/gcc/main.c:39
    gcc-mirror#15 0x7ffff6f0ae09 in __libc_start_main ../csu/libc-start.c:314
    gcc-mirror#16 0x9a0be9 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1+0x9a0be9)

gcc/ChangeLog:

	* ipa-modref.c (compute_parm_map): Clear vector.
kraj pushed a commit to kraj/gcc that referenced this pull request Nov 2, 2020
Enable thumb1_gen_const_int to generate RTL or asm depending on the
context, so that we avoid duplicating code to handle constants in
Thumb-1 with -mpure-code.

Use a template so that the algorithm is effectively shared, and
rely on two classes to handle the actual emission as RTL or asm.

The generated sequence is improved to handle right-shiftable and small
values with less instructions. We now generate:

128:
        movs    r0, r0, #128
264:
        movs    r3, gcc-mirror#33
        lsls    r3, gcc-mirror#3
510:
        movs    r3, #255
        lsls    r3, #1
512:
        movs    r3, #1
        lsls    r3, gcc-mirror#9
764:
        movs    r3, #191
        lsls    r3, gcc-mirror#2
65536:
        movs    r3, #1
        lsls    r3, gcc-mirror#16
0x123456:
        movs    r3, gcc-mirror#18 ;0x12
        lsls    r3, gcc-mirror#8
        adds    r3, gcc-mirror#52 ;0x34
        lsls    r3, gcc-mirror#8
        adds    r3, gcc-mirror#86 ;0x56
0x1123456:
        movs    r3, #137 ;0x89
        lsls    r3, gcc-mirror#8
        adds    r3, gcc-mirror#26 ;0x1a
        lsls    r3, gcc-mirror#8
        adds    r3, gcc-mirror#43 ;0x2b
        lsls    r3, #1
0x1000010:
        movs    r3, gcc-mirror#16
        lsls    r3, gcc-mirror#16
        adds    r3, #1
        lsls    r3, gcc-mirror#4
0x1000011:
        movs    r3, #1
        lsls    r3, gcc-mirror#24
        adds    r3, gcc-mirror#17
-8192:
	movs	r3, #1
	lsls	r3, gcc-mirror#13
	rsbs	r3, #0

The patch adds a testcase which does not fully exercise
thumb1_gen_const_int, as other existing patterns already catch small
constants.  These parts of thumb1_gen_const_int are used by
arm_thumb1_mi_thunk.

2020-11-02  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm.c (thumb1_const_rtl, thumb1_const_print): New
	classes.
	(thumb1_gen_const_int): Rename to ...
	(thumb1_gen_const_int_1): ... New helper function. Add capability
	to emit either RTL or asm, improve generated code.
	(thumb1_gen_const_int_rtl): New function.
	* config/arm/arm-protos.h (thumb1_gen_const_int): Rename to
	thumb1_gen_const_int_rtl.
	* config/arm/thumb1.md: Call thumb1_gen_const_int_rtl instead
	of thumb1_gen_const_int.

	gcc/testsuite/
	* gcc.target/arm/pure-code/no-literal-pool-m0.c: New.
nstester referenced this pull request in nstester/gcc Feb 23, 2021
/home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77:25: runtime error: left shift of 0x0000000000000000fffffffffffffffb by 96 places cannot be represented in type '__int128'
    #0 0x7ffff754edfe in __ubsan::Value::getSIntValue() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77
    #1 0x7ffff7548719 in __ubsan::Value::isNegative() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.h:190
    #2 0x7ffff7542a34 in handleShiftOutOfBoundsImpl /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:338
    #3 0x7ffff75431b7 in __ubsan_handle_shift_out_of_bounds /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:370
    #4 0x40067f in main (/home/marxin/Programming/testcases/a.out+0x40067f)
    gcc-mirror#5 0x7ffff72c8b24 in __libc_start_main (/lib64/libc.so.6+0x27b24)
    gcc-mirror#6 0x4005bd in _start (/home/marxin/Programming/testcases/a.out+0x4005bd)

Differential Revision: https://reviews.llvm.org/D97263

Cherry-pick from 16ede09.
kraj pushed a commit to kraj/gcc that referenced this pull request Apr 27, 2021
The insn has extraneous operand gcc-mirror#3 that is aliased in RTL to operand #0
with a constraint.  The operands specify a single-bit field in memory
that the machine instruction produced boths reads for the purpose of
determining whether to branch or not and either clears or sets according
to the machine operation selected with the `ccss' iterator.  The caller
of the insn is supposed to supply the same rtx for both operands.

This odd arrangement happens to work with old reload, but breaks with
libatomic if LRA is used instead:

.../libatomic/flag.c: In function 'atomic_flag_test_and_set':
.../libatomic/flag.c:36:1: error: unable to generate reloads for:
   36 | }
      | ^
(jump_insn 7 6 19 2 (unspec_volatile [
            (set (pc)
                (if_then_else (eq (zero_extract:SI (mem/v:QI (reg:SI 27) [-1  S1 A8])
                            (const_int 1 [0x1])
                            (const_int 0 [0]))
                        (const_int 1 [0x1]))
                    (label_ref:SI 25)
                    (pc)))
            (set (zero_extract:SI (mem/v:QI (reg:SI 28) [-1  S1 A8])
                    (const_int 1 [0x1])
                    (const_int 0 [0]))
                (const_int 1 [0x1]))
        ] 100) ".../libatomic/flag.c":35:10 669 {jbbssiqi}
     (nil)
 -> 25)
during RTL pass: reload
.../libatomic/flag.c:36:1: internal compiler error: in curr_insn_transform, at lra-constraints.c:4098
0x1112c587 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
	.../gcc/rtl-error.c:108
0x10ee6563 curr_insn_transform
	.../gcc/lra-constraints.c:4098
0x10eeaf87 lra_constraints(bool)
	.../gcc/lra-constraints.c:5133
0x10ec97e3 lra(_IO_FILE*)
	.../gcc/lra.c:2336
0x10e4633f do_reload
	.../gcc/ira.c:5827
0x10e46b27 execute
	.../gcc/ira.c:6013
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

Switch to using `match_dup' as expected then for a machine instruction
that in its encoding only has one actual operand in for the single-bit
field.

	gcc/
	* config/vax/builtins.md (jbb<ccss>i<mode>): Remove operand gcc-mirror#3.
	(sync_lock_test_and_set<mode>): Adjust accordingly.
	(sync_lock_release<mode>): Likewise.
kraj pushed a commit to kraj/gcc that referenced this pull request May 7, 2021
gcc/ada/

	* atree.h (Slots_Ptr): Change pointed-to type to any_slot.
	* fe.h (Get_RT_Exception_Name): Change type of parameter.
	* namet.ads (Name_Entry): Mark non-boolean components as aliased,
	reorder the boolean components and add an explicit Spare component.
	* namet.adb (Name_Enter): Adjust aggregate accordingly.
	(Name_Find): Likewise.
	(Reinitialize): Likewise.
	* namet.h (struct Name_Entry): Adjust accordingly.
	(Names_Ptr): Use correct type.
	(Name_Chars_Ptr): Likewise.
	(Get_Name_String): Fix declaration and adjust to above changes.
	* types.ads (RT_Exception_Code): Add pragma Convention C.
	* types.h (Column_Number_Type): Fix original type.
	(slot): Rename union type to...
	(any_slot): ...this and adjust assertion accordingly.
	(RT_Exception_Code): New enumeration type.
	* uintp.ads (Uint_Entry): Mark components as aliased.
	* uintp.h (Uints_Ptr):  Use correct type.
	(Udigits_Ptr): Likewise.
	* gcc-interface/gigi.h (gigi): Adjust name and type of parameter.
	* gcc-interface/cuintp.c (UI_To_gnu): Adjust references to Uints_Ptr
	and Udigits_Ptr.
	* gcc-interface/trans.c (Slots_Ptr): Adjust pointed-to type.
	(gigi): Adjust type of parameter.
	(build_raise_check): Add cast in call to Get_RT_Exception_Name.
kraj pushed a commit to kraj/gcc that referenced this pull request Jun 14, 2021
The fixed error is:

==21166==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs operator delete) on 0x60300000d900
    #0 0x7367d7 in operator delete(void*, unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:172
    #1 0x3b82e6e in pointer_equiv_analyzer::~pointer_equiv_analyzer() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:161
    gcc-mirror#2 0x3b83387 in hybrid_folder::~hybrid_folder() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:517
    gcc-mirror#3 0x3b83387 in execute_early_vrp /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:686
    gcc-mirror#4 0x1790611 in execute_one_pass(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2567
    gcc-mirror#5 0x1792003 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2656
    gcc-mirror#6 0x1792029 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2657
    gcc-mirror#7 0x179209f in execute_pass_list(function*, opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2667
    gcc-mirror#8 0x178a5f3 in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:1773
    gcc-mirror#9 0x1792fac in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/plugin.h:191
    gcc-mirror#10 0x1792fac in execute_ipa_pass_list(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:3001
    gcc-mirror#11 0xc525fc in ipa_passes /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2154
    gcc-mirror#12 0xc525fc in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2289
    gcc-mirror#13 0xc5a096 in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2269
    gcc-mirror#14 0xc5a096 in symbol_table::finalize_compilation_unit() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2537
    gcc-mirror#15 0x1a7a17c in compile_file /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:482
    gcc-mirror#16 0x69c758 in do_compile /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2210
    gcc-mirror#17 0x69c758 in toplev::main(int, char**) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2349
    gcc-mirror#18 0x6a932a in main /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/main.c:39
    gcc-mirror#19 0x7ffff7820b34 in __libc_start_main ../csu/libc-start.c:332
    gcc-mirror#20 0x6aa5fd in _start (/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/objdir/gcc/cc1+0x6aa5fd)

0x60300000d900 is located 0 bytes inside of 32-byte region [0x60300000d900,0x60300000d920)
allocated by thread T0 here:
    #0 0x735ab7 in operator new[](unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:102
    #1 0x3b82dac in pointer_equiv_analyzer::pointer_equiv_analyzer(gimple_ranger*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:156

gcc/ChangeLog:

	* gimple-ssa-evrp.c (pointer_equiv_analyzer::~pointer_equiv_analyzer): Use delete[].
kraj pushed a commit to kraj/gcc that referenced this pull request Nov 26, 2021
Fixes:

==129444==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000666ca5c at pc 0x000000ef094b bp 0x7fffffff8180 sp 0x7fffffff8178
READ of size 4 at 0x00000666ca5c thread T0
    #0 0xef094a in parse_optimize_options ../../gcc/d/d-attribs.cc:855
    #1 0xef0d36 in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:916
    gcc-mirror#2 0xef107e in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:887
    gcc-mirror#3 0xff85b1 in decl_attributes(tree_node**, tree_node*, int, tree_node*) ../../gcc/attribs.c:829
    gcc-mirror#4 0xef2a91 in apply_user_attributes(Dsymbol*, tree_node*) ../../gcc/d/d-attribs.cc:427
    gcc-mirror#5 0xf7b7f3 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:1346
    gcc-mirror#6 0xf87bc7 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:967
    gcc-mirror#7 0xf87bc7 in DeclVisitor::visit(FuncDeclaration*) ../../gcc/d/decl.cc:808
    gcc-mirror#8 0xf83db5 in DeclVisitor::build_dsymbol(Dsymbol*) ../../gcc/d/decl.cc:146

for the following test-case: gcc/testsuite/gdc.dg/attr_optimize1.d.

gcc/d/ChangeLog:

	* d-attribs.cc (parse_optimize_options): Check index before
	accessing cl_options.
kraj pushed a commit to kraj/gcc that referenced this pull request Nov 26, 2021
Fixes:

==129444==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000666ca5c at pc 0x000000ef094b bp 0x7fffffff8180 sp 0x7fffffff8178
READ of size 4 at 0x00000666ca5c thread T0
    #0 0xef094a in parse_optimize_options ../../gcc/d/d-attribs.cc:855
    #1 0xef0d36 in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:916
    gcc-mirror#2 0xef107e in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:887
    gcc-mirror#3 0xff85b1 in decl_attributes(tree_node**, tree_node*, int, tree_node*) ../../gcc/attribs.c:829
    gcc-mirror#4 0xef2a91 in apply_user_attributes(Dsymbol*, tree_node*) ../../gcc/d/d-attribs.cc:427
    gcc-mirror#5 0xf7b7f3 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:1346
    gcc-mirror#6 0xf87bc7 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:967
    gcc-mirror#7 0xf87bc7 in DeclVisitor::visit(FuncDeclaration*) ../../gcc/d/decl.cc:808
    gcc-mirror#8 0xf83db5 in DeclVisitor::build_dsymbol(Dsymbol*) ../../gcc/d/decl.cc:146

for the following test-case: gcc/testsuite/gdc.dg/attr_optimize1.d.

gcc/d/ChangeLog:

	* d-attribs.cc (parse_optimize_options): Check index before
	accessing cl_options.
xionghul pushed a commit to xionghul/gcc that referenced this pull request Jun 3, 2022
This patch makes us avoid substituting into the TEMPLATE_PARM_CONSTRAINTS
of each template parameter except as necessary for declaration matching,
like we already do for the other constituent constraints of a declaration.

This patch also improves the CA104 implementation of explicit
specialization matching of a constrained function template inside a
class template, by considering the function's combined constraints
instead of just its trailing constraints.  This allows us to correctly
handle the first three explicit specializations in concepts-spec2.C
below, but because we compare the constraints as a whole, it means we
incorrectly accept the fourth explicit specialization which writes gcc-mirror#3's
constraints in a different way.  For complete correctness here,
determine_specialization should use tsubst_each_template_parm_constraints
and template_parameter_heads_equivalent_p.

	PR c++/100374

gcc/cp/ChangeLog:

	* pt.cc (determine_specialization): Compare overall constraints
	not just the trailing constraints.
	(tsubst_each_template_parm_constraints): Define.
	(tsubst_friend_function): Use it.
	(tsubst_friend_class): Use it.
	(tsubst_template_parm): Don't substitute TEMPLATE_PARM_CONSTRAINTS.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/concepts-spec2.C: New test.
	* g++.dg/cpp2a/concepts-template-parm11.C: New test.
kraj pushed a commit to kraj/gcc that referenced this pull request Jul 21, 2022
This patch makes us avoid substituting into the TEMPLATE_PARM_CONSTRAINTS
of each template parameter except as necessary for declaration matching,
like we already do for the other constituent constraints of a declaration.

This patch also improves the CA104 implementation of explicit
specialization matching of a constrained function template inside a
class template, by considering the function's combined constraints
instead of just its trailing constraints.  This allows us to correctly
handle the first three explicit specializations in concepts-spec2.C
below, but because we compare the constraints as a whole, it means we
incorrectly accept the fourth explicit specialization which writes gcc-mirror#3's
constraints in a different way.  For complete correctness here,
determine_specialization should use tsubst_each_template_parm_constraints
and template_parameter_heads_equivalent_p.

	PR c++/100374

gcc/cp/ChangeLog:

	* pt.cc (determine_specialization): Compare overall constraints
	not just the trailing constraints.
	(tsubst_each_template_parm_constraints): Define.
	(tsubst_friend_function): Use it.
	(tsubst_friend_class): Use it.
	(tsubst_template_parm): Don't substitute TEMPLATE_PARM_CONSTRAINTS.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/concepts-spec2.C: New test.
	* g++.dg/cpp2a/concepts-template-parm11.C: New test.

(cherry picked from commit 43c013d)
vathpela pushed a commit to vathpela/gcc that referenced this pull request Mar 28, 2023
This is a regression present on the mainline and 12 branch at -O2, but the
issue is related to vectorization so was present at -O3 in earlier versions.

The vcondu expander that was added for VIS 3 more than a decade ago does not
fully work, because it does not filter out the unsigned condition codes (the
instruction is an UNSPEC that accepts only signed condition codes).

While I was at it, I also added the missing vcond and vcondu expanders for
the new comparison instructions that were added in VIS 4.

gcc/
	PR target/109140
	* config/sparc/sparc.cc (sparc_expand_vcond): Call signed_condition
	on operand gcc-mirror#3 to get the final condition code.  Use std::swap.
	* config/sparc/sparc.md (vcondv8qiv8qi): New VIS 4 expander.
	(fucmp<gcond:code>8<P:mode>_vis): Move around.
	(fpcmpu<gcond:code><GCM:gcm_name><P:mode>_vis): Likewise.
	(vcondu<GCM:mode><GCM:mode>): New VIS 4 expander.

gcc/testsuite/
	* gcc.target/sparc/20230328-1.c: New test.
	* gcc.target/sparc/20230328-2.c: Likewise.
	* gcc.target/sparc/20230328-3.c: Likewise.
	* gcc.target/sparc/20230328-4.c: Likewise.
kraj pushed a commit to kraj/gcc that referenced this pull request Mar 28, 2023
This is a regression present on the mainline and 12 branch at -O2, but the
issue is related to vectorization so was present at -O3 in earlier versions.

The vcondu expander that was added for VIS 3 more than a decade ago does not
fully work, because it does not filter out the unsigned condition codes (the
instruction is an UNSPEC that accepts only signed condition codes).

While I was at it, I also added the missing vcond and vcondu expanders for
the new comparison instructions that were added in VIS 4.

gcc/
	PR target/109140
	* config/sparc/sparc.c (sparc_expand_vcond): Call signed_condition
	on operand gcc-mirror#3 to get the final condition code.  Use std::swap.
	* config/sparc/sparc.md (vcondv8qiv8qi): New VIS 4 expander.
	(fucmp<gcond:code>8<P:mode>_vis): Move around.
	(fpcmpu<gcond:code><GCM:gcm_name><P:mode>_vis): Likewise.
	(vcondu<GCM:mode><GCM:mode>): New VIS 4 expander.

gcc/testsuite/
	* gcc.target/sparc/20230328-1.c: New test.
	* gcc.target/sparc/20230328-2.c: Likewise.
	* gcc.target/sparc/20230328-3.c: Likewise.
	* gcc.target/sparc/20230328-4.c: Likewise.
kraj pushed a commit to kraj/gcc that referenced this pull request Mar 28, 2023
This is a regression present on the mainline and 12 branch at -O2, but the
issue is related to vectorization so was present at -O3 in earlier versions.

The vcondu expander that was added for VIS 3 more than a decade ago does not
fully work, because it does not filter out the unsigned condition codes (the
instruction is an UNSPEC that accepts only signed condition codes).

While I was at it, I also added the missing vcond and vcondu expanders for
the new comparison instructions that were added in VIS 4.

gcc/
	PR target/109140
	* config/sparc/sparc.cc (sparc_expand_vcond): Call signed_condition
	on operand gcc-mirror#3 to get the final condition code.  Use std::swap.
	* config/sparc/sparc.md (vcondv8qiv8qi): New VIS 4 expander.
	(fucmp<gcond:code>8<P:mode>_vis): Move around.
	(fpcmpu<gcond:code><GCM:gcm_name><P:mode>_vis): Likewise.
	(vcondu<GCM:mode><GCM:mode>): New VIS 4 expander.

gcc/testsuite/
	* gcc.target/sparc/20230328-1.c: New test.
	* gcc.target/sparc/20230328-2.c: Likewise.
	* gcc.target/sparc/20230328-3.c: Likewise.
	* gcc.target/sparc/20230328-4.c: Likewise.
kraj pushed a commit to kraj/gcc that referenced this pull request May 2, 2023
I noticed that for member class templates of a class template we were
unnecessarily substituting both the template and its type.  Avoiding that
duplication speeds compilation of this silly testcase from ~12s to ~9s on my
laptop.  It's unlikely to make a difference on any real code, but the
simplification is also nice.

We still need to clear CLASSTYPE_USE_TEMPLATE on the partial instantiation
of the template class, but it makes more sense to do that in
tsubst_template_decl anyway.

  #define NC(X)					\
    template <class U> struct X##1;		\
    template <class U> struct X#gcc-mirror#2;		\
    template <class U> struct X#gcc-mirror#3;		\
    template <class U> struct X#gcc-mirror#4;		\
    template <class U> struct X#gcc-mirror#5;		\
    template <class U> struct X#gcc-mirror#6;
  #define NC2(X) NC(X##a) NC(X##b) NC(X##c) NC(X##d) NC(X##e) NC(X##f)
  #define NC3(X) NC2(X##A) NC2(X##B) NC2(X##C) NC2(X##D) NC2(X##E)
  template <int I> struct A
  {
    NC3(am)
  };
  template <class...Ts> void sink(Ts...);
  template <int...Is> void g()
  {
    sink(A<Is>()...);
  }
  template <int I> void f()
  {
    g<__integer_pack(I)...>();
  }
  int main()
  {
    f<1000>();
  }

gcc/cp/ChangeLog:

	* pt.cc (instantiate_class_template): Skip the RECORD_TYPE
	of a class template.
	(tsubst_template_decl): Clear CLASSTYPE_USE_TEMPLATE.
catap pushed a commit to catap/gcc that referenced this pull request May 3, 2023
Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw gcc-mirror#3

This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.

---

gcc/ChangeLog:

	* config/aarch64/aarch64.md
	(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
	agrees with width of extension specifier.
	(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*add_uxt<mode>_shift2): Likewise.
	(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*sub_uxt<mode>_shift2): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
	* gcc.target/aarch64/cmp.c: Likewise.
	* gcc.target/aarch64/subs3.c: Likewise.
	* gcc.target/aarch64/subsp.c: Likewise.
	* gcc.target/aarch64/extend-syntax.c: New test.

(cherry picked from commit d4febc7)
catap pushed a commit to catap/gcc that referenced this pull request May 3, 2023
Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw gcc-mirror#3

This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.

---

gcc/ChangeLog:

	* config/aarch64/aarch64.md
	(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
	agrees with width of extension specifier.
	(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*add_uxt<mode>_shift2): Likewise.
	(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*sub_uxt<mode>_shift2): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
	* gcc.target/aarch64/cmp.c: Likewise.
	* gcc.target/aarch64/subs3.c: Likewise.
	* gcc.target/aarch64/subsp.c: Likewise.
	* gcc.target/aarch64/extend-syntax.c: New test.

(cherry picked from commit d4febc7)
catap pushed a commit to catap/gcc that referenced this pull request May 3, 2023
Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw gcc-mirror#3

This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.

---

gcc/ChangeLog:

	* config/aarch64/aarch64.md
	(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
	agrees with width of extension specifier.
	(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*add_uxt<mode>_shift2): Likewise.
	(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*sub_uxt<mode>_shift2): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
	* gcc.target/aarch64/cmp.c: Likewise.
	* gcc.target/aarch64/subs3.c: Likewise.
	* gcc.target/aarch64/subsp.c: Likewise.
	* gcc.target/aarch64/extend-syntax.c: New test.

(cherry picked from commit d4febc7)
catap pushed a commit to catap/gcc that referenced this pull request May 3, 2023
Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw gcc-mirror#3

This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.

---

gcc/ChangeLog:

	* config/aarch64/aarch64.md
	(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
	agrees with width of extension specifier.
	(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*add_uxt<mode>_shift2): Likewise.
	(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*sub_uxt<mode>_shift2): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
	* gcc.target/aarch64/cmp.c: Likewise.
	* gcc.target/aarch64/subs3.c: Likewise.
	* gcc.target/aarch64/subsp.c: Likewise.
	* gcc.target/aarch64/extend-syntax.c: New test.

(cherry picked from commit d4febc7)
catap pushed a commit to catap/gcc that referenced this pull request May 3, 2023
Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw gcc-mirror#3

This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.

---

gcc/ChangeLog:

	* config/aarch64/aarch64.md
	(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
	agrees with width of extension specifier.
	(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*add_uxt<mode>_shift2): Likewise.
	(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*sub_uxt<mode>_shift2): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
	* gcc.target/aarch64/cmp.c: Likewise.
	* gcc.target/aarch64/subs3.c: Likewise.
	* gcc.target/aarch64/subsp.c: Likewise.
	* gcc.target/aarch64/extend-syntax.c: New test.

(cherry picked from commit d4febc7)
catap pushed a commit to catap/gcc that referenced this pull request May 4, 2023
Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw gcc-mirror#3

This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.

---

gcc/ChangeLog:

	* config/aarch64/aarch64.md
	(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
	agrees with width of extension specifier.
	(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*add_uxt<mode>_shift2): Likewise.
	(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*sub_uxt<mode>_shift2): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
	* gcc.target/aarch64/cmp.c: Likewise.
	* gcc.target/aarch64/subs3.c: Likewise.
	* gcc.target/aarch64/subsp.c: Likewise.
	* gcc.target/aarch64/extend-syntax.c: New test.

(cherry picked from commit d4febc7)
catap pushed a commit to catap/gcc that referenced this pull request May 4, 2023
Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw gcc-mirror#3

This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.

---

gcc/ChangeLog:

	* config/aarch64/aarch64.md
	(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
	agrees with width of extension specifier.
	(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*add_uxt<mode>_shift2): Likewise.
	(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*sub_uxt<mode>_shift2): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
	* gcc.target/aarch64/cmp.c: Likewise.
	* gcc.target/aarch64/subs3.c: Likewise.
	* gcc.target/aarch64/subsp.c: Likewise.
	* gcc.target/aarch64/extend-syntax.c: New test.

(cherry picked from commit d4febc7)
Signed-off-by: Kirill A. Korinsky <kirill@korins.ky>
kraj pushed a commit to kraj/gcc that referenced this pull request Aug 8, 2023
Hi, Richard and Richi.

Base on the suggestions from Richard:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html

This patch choose (1) approach that Richard provided, meaning:

RVV implements cond_* optabs as expanders.  RVV therefore supports
both IFN_COND_ADD and IFN_COND_LEN_ADD.  No dummy length arguments
are needed at the gimple level.

Such approach can make codes much cleaner and reasonable.

Consider this following case:
void foo (float * __restrict a, float * __restrict b, int * __restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      a[i] = b[i] + a[i];
}

Output of RISC-V (32-bits) gcc (trunk) (Compiler gcc-mirror#3)
<source>:5:21: missed: couldn't vectorize loop
<source>:5:21: missed: not vectorized: control flow in loop.

ARM SVE:

...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vec_mask_and_55 = loop_mask_49 & mask__27.10_51;
...
vect__9.17_62 = .COND_ADD (vec_mask_and_55, vect__6.13_56, vect__8.16_60, vect__6.13_56);

For RVV, we want IR as follows:

...
_68 = .SELECT_VL (ivtmp_66, POLY_INT_CST [4, 4]);
...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vect__9.17_60 = .COND_LEN_ADD (mask__27.10_51, vect__6.13_55, vect__8.16_59, vect__6.13_55, _68, 0);
...

Both len and mask of COND_LEN_ADD are real not dummy.

This patch has been fully tested in RISC-V port with supporting both COND_* and COND_LEN_*.

And also, Bootstrap and Regression on X86 passed.

OK for trunk?

gcc/ChangeLog:

	* internal-fn.cc (get_len_internal_fn): New function.
	(DEF_INTERNAL_COND_FN): Ditto.
	(DEF_INTERNAL_SIGNED_COND_FN): Ditto.
	* internal-fn.h (get_len_internal_fn): Ditto.
	* tree-vect-stmts.cc (vectorizable_call): Add CALL auto-vectorization.
catap pushed a commit to catap/gcc that referenced this pull request Nov 12, 2023
Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw gcc-mirror#3

This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.

---

gcc/ChangeLog:

	* config/aarch64/aarch64.md
	(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
	agrees with width of extension specifier.
	(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*add_uxt<mode>_shift2): Likewise.
	(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*sub_uxt<mode>_shift2): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
	* gcc.target/aarch64/cmp.c: Likewise.
	* gcc.target/aarch64/subs3.c: Likewise.
	* gcc.target/aarch64/subsp.c: Likewise.
	* gcc.target/aarch64/extend-syntax.c: New test.

(cherry picked from commit d4febc7)
Signed-off-by: Kirill A. Korinsky <kirill@korins.ky>
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 19, 2024
I noticed that for member class templates of a class template we were
unnecessarily substituting both the template and its type.  Avoiding that
duplication speeds compilation of this silly testcase from ~12s to ~9s on my
laptop.  It's unlikely to make a difference on any real code, but the
simplification is also nice.

We still need to clear CLASSTYPE_USE_TEMPLATE on the partial instantiation
of the template class, but it makes more sense to do that in
tsubst_template_decl anyway.

  #define NC(X)					\
    template <class U> struct X#gcc-mirror#1;		\
    template <class U> struct X#gcc-mirror#2;		\
    template <class U> struct X#gcc-mirror#3;		\
    template <class U> struct X#gcc-mirror#4;		\
    template <class U> struct X#gcc-mirror#5;		\
    template <class U> struct X#gcc-mirror#6;
  #define NC2(X) NC(X##a) NC(X##b) NC(X##c) NC(X##d) NC(X##e) NC(X##f)
  #define NC3(X) NC2(X##A) NC2(X##B) NC2(X##C) NC2(X##D) NC2(X##E)
  template <int I> struct A
  {
    NC3(am)
  };
  template <class...Ts> void sink(Ts...);
  template <int...Is> void g()
  {
    sink(A<Is>()...);
  }
  template <int I> void f()
  {
    g<__integer_pack(I)...>();
  }
  int main()
  {
    f<1000>();
  }

gcc/cp/ChangeLog:

	* pt.cc (instantiate_class_template): Skip the RECORD_TYPE
	of a class template.
	(tsubst_template_decl): Clear CLASSTYPE_USE_TEMPLATE.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 19, 2024
Hi, Richard and Richi.

Base on the suggestions from Richard:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html

This patch choose (1) approach that Richard provided, meaning:

RVV implements cond_* optabs as expanders.  RVV therefore supports
both IFN_COND_ADD and IFN_COND_LEN_ADD.  No dummy length arguments
are needed at the gimple level.

Such approach can make codes much cleaner and reasonable.

Consider this following case:
void foo (float * __restrict a, float * __restrict b, int * __restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      a[i] = b[i] + a[i];
}

Output of RISC-V (32-bits) gcc (trunk) (Compiler gcc-mirror#3)
<source>:5:21: missed: couldn't vectorize loop
<source>:5:21: missed: not vectorized: control flow in loop.

ARM SVE:

...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vec_mask_and_55 = loop_mask_49 & mask__27.10_51;
...
vect__9.17_62 = .COND_ADD (vec_mask_and_55, vect__6.13_56, vect__8.16_60, vect__6.13_56);

For RVV, we want IR as follows:

...
_68 = .SELECT_VL (ivtmp_66, POLY_INT_CST [4, 4]);
...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vect__9.17_60 = .COND_LEN_ADD (mask__27.10_51, vect__6.13_55, vect__8.16_59, vect__6.13_55, _68, 0);
...

Both len and mask of COND_LEN_ADD are real not dummy.

This patch has been fully tested in RISC-V port with supporting both COND_* and COND_LEN_*.

And also, Bootstrap and Regression on X86 passed.

OK for trunk?

gcc/ChangeLog:

	* internal-fn.cc (get_len_internal_fn): New function.
	(DEF_INTERNAL_COND_FN): Ditto.
	(DEF_INTERNAL_SIGNED_COND_FN): Ditto.
	* internal-fn.h (get_len_internal_fn): Ditto.
	* tree-vect-stmts.cc (vectorizable_call): Add CALL auto-vectorization.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 21, 2024
Hi, Richard and Richi.

Base on the suggestions from Richard:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html

This patch choose (1) approach that Richard provided, meaning:

RVV implements cond_* optabs as expanders.  RVV therefore supports
both IFN_COND_ADD and IFN_COND_LEN_ADD.  No dummy length arguments
are needed at the gimple level.

Such approach can make codes much cleaner and reasonable.

Consider this following case:
void foo (float * __restrict a, float * __restrict b, int * __restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      a[i] = b[i] + a[i];
}

Output of RISC-V (32-bits) gcc (trunk) (Compiler gcc-mirror#3)
<source>:5:21: missed: couldn't vectorize loop
<source>:5:21: missed: not vectorized: control flow in loop.

ARM SVE:

...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vec_mask_and_55 = loop_mask_49 & mask__27.10_51;
...
vect__9.17_62 = .COND_ADD (vec_mask_and_55, vect__6.13_56, vect__8.16_60, vect__6.13_56);

For RVV, we want IR as follows:

...
_68 = .SELECT_VL (ivtmp_66, POLY_INT_CST [4, 4]);
...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vect__9.17_60 = .COND_LEN_ADD (mask__27.10_51, vect__6.13_55, vect__8.16_59, vect__6.13_55, _68, 0);
...

Both len and mask of COND_LEN_ADD are real not dummy.

This patch has been fully tested in RISC-V port with supporting both COND_* and COND_LEN_*.

And also, Bootstrap and Regression on X86 passed.

OK for trunk?

gcc/ChangeLog:

	* internal-fn.cc (get_len_internal_fn): New function.
	(DEF_INTERNAL_COND_FN): Ditto.
	(DEF_INTERNAL_SIGNED_COND_FN): Ditto.
	* internal-fn.h (get_len_internal_fn): Ditto.
	* tree-vect-stmts.cc (vectorizable_call): Add CALL auto-vectorization.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 25, 2024
Hi, Richard and Richi.

Base on the suggestions from Richard:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html

This patch choose (1) approach that Richard provided, meaning:

RVV implements cond_* optabs as expanders.  RVV therefore supports
both IFN_COND_ADD and IFN_COND_LEN_ADD.  No dummy length arguments
are needed at the gimple level.

Such approach can make codes much cleaner and reasonable.

Consider this following case:
void foo (float * __restrict a, float * __restrict b, int * __restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      a[i] = b[i] + a[i];
}

Output of RISC-V (32-bits) gcc (trunk) (Compiler gcc-mirror#3)
<source>:5:21: missed: couldn't vectorize loop
<source>:5:21: missed: not vectorized: control flow in loop.

ARM SVE:

...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vec_mask_and_55 = loop_mask_49 & mask__27.10_51;
...
vect__9.17_62 = .COND_ADD (vec_mask_and_55, vect__6.13_56, vect__8.16_60, vect__6.13_56);

For RVV, we want IR as follows:

...
_68 = .SELECT_VL (ivtmp_66, POLY_INT_CST [4, 4]);
...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vect__9.17_60 = .COND_LEN_ADD (mask__27.10_51, vect__6.13_55, vect__8.16_59, vect__6.13_55, _68, 0);
...

Both len and mask of COND_LEN_ADD are real not dummy.

This patch has been fully tested in RISC-V port with supporting both COND_* and COND_LEN_*.

And also, Bootstrap and Regression on X86 passed.

OK for trunk?

gcc/ChangeLog:

	* internal-fn.cc (get_len_internal_fn): New function.
	(DEF_INTERNAL_COND_FN): Ditto.
	(DEF_INTERNAL_SIGNED_COND_FN): Ditto.
	* internal-fn.h (get_len_internal_fn): Ditto.
	* tree-vect-stmts.cc (vectorizable_call): Add CALL auto-vectorization.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 27, 2024
Hi, Richard and Richi.

Base on the suggestions from Richard:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html

This patch choose (1) approach that Richard provided, meaning:

RVV implements cond_* optabs as expanders.  RVV therefore supports
both IFN_COND_ADD and IFN_COND_LEN_ADD.  No dummy length arguments
are needed at the gimple level.

Such approach can make codes much cleaner and reasonable.

Consider this following case:
void foo (float * __restrict a, float * __restrict b, int * __restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      a[i] = b[i] + a[i];
}

Output of RISC-V (32-bits) gcc (trunk) (Compiler gcc-mirror#3)
<source>:5:21: missed: couldn't vectorize loop
<source>:5:21: missed: not vectorized: control flow in loop.

ARM SVE:

...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vec_mask_and_55 = loop_mask_49 & mask__27.10_51;
...
vect__9.17_62 = .COND_ADD (vec_mask_and_55, vect__6.13_56, vect__8.16_60, vect__6.13_56);

For RVV, we want IR as follows:

...
_68 = .SELECT_VL (ivtmp_66, POLY_INT_CST [4, 4]);
...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vect__9.17_60 = .COND_LEN_ADD (mask__27.10_51, vect__6.13_55, vect__8.16_59, vect__6.13_55, _68, 0);
...

Both len and mask of COND_LEN_ADD are real not dummy.

This patch has been fully tested in RISC-V port with supporting both COND_* and COND_LEN_*.

And also, Bootstrap and Regression on X86 passed.

OK for trunk?

gcc/ChangeLog:

	* internal-fn.cc (get_len_internal_fn): New function.
	(DEF_INTERNAL_COND_FN): Ditto.
	(DEF_INTERNAL_SIGNED_COND_FN): Ditto.
	* internal-fn.h (get_len_internal_fn): Ditto.
	* tree-vect-stmts.cc (vectorizable_call): Add CALL auto-vectorization.
hubot pushed a commit that referenced this pull request Jun 13, 2024
Here during overload resolution we have two strictly viable ambiguous
candidates #1 and #2, and two non-strictly viable candidates #3 and #4
which we hold on to ever since r14-6522.  These latter candidates have
an empty second arg conversion since the first arg conversion was deemed
bad, and this trips up joust when called on #3 and #4 which assumes all
arg conversions are there.

We can fix this by making joust robust to empty arg conversions, but in
this situation we shouldn't need to compare #3 and #4 at all given that
we have a strictly viable candidate.  To that end, this patch makes
tourney shortcut considering non-strictly viable candidates upon
encountering ambiguity between two strictly viable candidates (taking
advantage of the fact that the candidates list is sorted according to
viability via splice_viable).

	PR c++/115239

gcc/cp/ChangeLog:

	* call.cc (tourney): Don't consider a non-strictly viable
	candidate as the champ if there was ambiguity between two
	strictly viable candidates.

gcc/testsuite/ChangeLog:

	* g++.dg/overload/error7.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
hubot pushed a commit that referenced this pull request Jun 17, 2024
Here during overload resolution we have two strictly viable ambiguous
candidates #1 and #2, and two non-strictly viable candidates #3 and #4
which we hold on to ever since r14-6522.  These latter candidates have
an empty second arg conversion since the first arg conversion was deemed
bad, and this trips up joust when called on #3 and #4 which assumes all
arg conversions are there.

We can fix this by making joust robust to empty arg conversions, but in
this situation we shouldn't need to compare #3 and #4 at all given that
we have a strictly viable candidate.  To that end, this patch makes
tourney shortcut considering non-strictly viable candidates upon
encountering ambiguity between two strictly viable candidates (taking
advantage of the fact that the candidates list is sorted according to
viability via splice_viable).

	PR c++/115239

gcc/cp/ChangeLog:

	* call.cc (tourney): Don't consider a non-strictly viable
	candidate as the champ if there was ambiguity between two
	strictly viable candidates.

gcc/testsuite/ChangeLog:

	* g++.dg/overload/error7.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 7fed7e9)
hubot pushed a commit that referenced this pull request Sep 7, 2024
…o_debug_section [PR116614]

cat abc.C
  #define A(n) struct T##n {} t##n;
  #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
  #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
  #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
  #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
  E(1) E(2) E(3)
  int main () { return 0; }
./xg++ -B ./ -o abc{.o,.C} -flto -flto-partition=1to1 -O2 -g -fdebug-types-section -c
./xgcc -B ./ -o abc{,.o} -flto -flto-partition=1to1 -O2
(not included in testsuite as it takes a while to compile) FAILs with
lto-wrapper: fatal error: Too many copied sections: Operation not supported
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

The following patch fixes that.  Most of the 64K+ section support for
reading and writing was already there years ago (and especially reading used
quite often already) and a further bug fixed in it in the PR104617 fix.

Yet, the fix isn't solely about removing the
  if (new_i - 1 >= SHN_LORESERVE)
    {
      *err = ENOTSUP;
      return "Too many copied sections";
    }
5 lines, the missing part was that the function only handled reading of
the .symtab_shndx section but not copying/updating of it.
If the result has less than 64K-epsilon sections, that actually wasn't
needed, but e.g. with -fdebug-types-section one can exceed that pretty
easily (reported to us on WebKitGtk build on ppc64le).
Updating the section is slightly more complicated, because it basically
needs to be done in lock step with updating the .symtab section, if one
doesn't need to use SHN_XINDEX in there, the section should (or should be
updated to) contain SHN_UNDEF entry, otherwise needs to have whatever would
be overwise stored but couldn't fit.  But repeating due to that all the
symtab decisions what to discard and how to rewrite it would be ugly.

So, the patch instead emits the .symtab_shndx section (or sections) last
and prepares the content during the .symtab processing and in a second
pass when going just through .symtab_shndx sections just uses the saved
content.

2024-09-07  Jakub Jelinek  <jakub@redhat.com>

	PR lto/116614
	* simple-object-elf.c (SHN_COMMON): Align comment with neighbouring
	comments.
	(SHN_HIRESERVE): Use uppercase hex digits instead of lowercase for
	consistency.
	(simple_object_elf_find_sections): Formatting fixes.
	(simple_object_elf_fetch_attributes): Likewise.
	(simple_object_elf_attributes_merge): Likewise.
	(simple_object_elf_start_write): Likewise.
	(simple_object_elf_write_ehdr): Likewise.
	(simple_object_elf_write_shdr): Likewise.
	(simple_object_elf_write_to_file): Likewise.
	(simple_object_elf_copy_lto_debug_section): Likewise.  Don't fail for
	new_i - 1 >= SHN_LORESERVE, instead arrange in that case to copy
	over .symtab_shndx sections, though emit those last and compute their
	section content when processing associated .symtab sections.  Handle
	simple_object_internal_read failure even in the .symtab_shndx reading
	case.
mikpe added a commit to mikpe/gcc that referenced this pull request Sep 8, 2024
hubot pushed a commit that referenced this pull request Sep 12, 2024
…o_debug_section [PR116614]

cat abc.C
  #define A(n) struct T##n {} t##n;
  #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
  #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
  #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
  #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
  E(1) E(2) E(3)
  int main () { return 0; }
./xg++ -B ./ -o abc{.o,.C} -flto -flto-partition=1to1 -O2 -g -fdebug-types-section -c
./xgcc -B ./ -o abc{,.o} -flto -flto-partition=1to1 -O2
(not included in testsuite as it takes a while to compile) FAILs with
lto-wrapper: fatal error: Too many copied sections: Operation not supported
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

The following patch fixes that.  Most of the 64K+ section support for
reading and writing was already there years ago (and especially reading used
quite often already) and a further bug fixed in it in the PR104617 fix.

Yet, the fix isn't solely about removing the
  if (new_i - 1 >= SHN_LORESERVE)
    {
      *err = ENOTSUP;
      return "Too many copied sections";
    }
5 lines, the missing part was that the function only handled reading of
the .symtab_shndx section but not copying/updating of it.
If the result has less than 64K-epsilon sections, that actually wasn't
needed, but e.g. with -fdebug-types-section one can exceed that pretty
easily (reported to us on WebKitGtk build on ppc64le).
Updating the section is slightly more complicated, because it basically
needs to be done in lock step with updating the .symtab section, if one
doesn't need to use SHN_XINDEX in there, the section should (or should be
updated to) contain SHN_UNDEF entry, otherwise needs to have whatever would
be overwise stored but couldn't fit.  But repeating due to that all the
symtab decisions what to discard and how to rewrite it would be ugly.

So, the patch instead emits the .symtab_shndx section (or sections) last
and prepares the content during the .symtab processing and in a second
pass when going just through .symtab_shndx sections just uses the saved
content.

2024-09-07  Jakub Jelinek  <jakub@redhat.com>

	PR lto/116614
	* simple-object-elf.c (SHN_COMMON): Align comment with neighbouring
	comments.
	(SHN_HIRESERVE): Use uppercase hex digits instead of lowercase for
	consistency.
	(simple_object_elf_find_sections): Formatting fixes.
	(simple_object_elf_fetch_attributes): Likewise.
	(simple_object_elf_attributes_merge): Likewise.
	(simple_object_elf_start_write): Likewise.
	(simple_object_elf_write_ehdr): Likewise.
	(simple_object_elf_write_shdr): Likewise.
	(simple_object_elf_write_to_file): Likewise.
	(simple_object_elf_copy_lto_debug_section): Likewise.  Don't fail for
	new_i - 1 >= SHN_LORESERVE, instead arrange in that case to copy
	over .symtab_shndx sections, though emit those last and compute their
	section content when processing associated .symtab sections.  Handle
	simple_object_internal_read failure even in the .symtab_shndx reading
	case.

(cherry picked from commit bb8dd09)
hubot pushed a commit that referenced this pull request Sep 13, 2024
…o_debug_section [PR116614]

cat abc.C
  #define A(n) struct T##n {} t##n;
  #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
  #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
  #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
  #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
  E(1) E(2) E(3)
  int main () { return 0; }
./xg++ -B ./ -o abc{.o,.C} -flto -flto-partition=1to1 -O2 -g -fdebug-types-section -c
./xgcc -B ./ -o abc{,.o} -flto -flto-partition=1to1 -O2
(not included in testsuite as it takes a while to compile) FAILs with
lto-wrapper: fatal error: Too many copied sections: Operation not supported
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

The following patch fixes that.  Most of the 64K+ section support for
reading and writing was already there years ago (and especially reading used
quite often already) and a further bug fixed in it in the PR104617 fix.

Yet, the fix isn't solely about removing the
  if (new_i - 1 >= SHN_LORESERVE)
    {
      *err = ENOTSUP;
      return "Too many copied sections";
    }
5 lines, the missing part was that the function only handled reading of
the .symtab_shndx section but not copying/updating of it.
If the result has less than 64K-epsilon sections, that actually wasn't
needed, but e.g. with -fdebug-types-section one can exceed that pretty
easily (reported to us on WebKitGtk build on ppc64le).
Updating the section is slightly more complicated, because it basically
needs to be done in lock step with updating the .symtab section, if one
doesn't need to use SHN_XINDEX in there, the section should (or should be
updated to) contain SHN_UNDEF entry, otherwise needs to have whatever would
be overwise stored but couldn't fit.  But repeating due to that all the
symtab decisions what to discard and how to rewrite it would be ugly.

So, the patch instead emits the .symtab_shndx section (or sections) last
and prepares the content during the .symtab processing and in a second
pass when going just through .symtab_shndx sections just uses the saved
content.

2024-09-07  Jakub Jelinek  <jakub@redhat.com>

	PR lto/116614
	* simple-object-elf.c (SHN_COMMON): Align comment with neighbouring
	comments.
	(SHN_HIRESERVE): Use uppercase hex digits instead of lowercase for
	consistency.
	(simple_object_elf_find_sections): Formatting fixes.
	(simple_object_elf_fetch_attributes): Likewise.
	(simple_object_elf_attributes_merge): Likewise.
	(simple_object_elf_start_write): Likewise.
	(simple_object_elf_write_ehdr): Likewise.
	(simple_object_elf_write_shdr): Likewise.
	(simple_object_elf_write_to_file): Likewise.
	(simple_object_elf_copy_lto_debug_section): Likewise.  Don't fail for
	new_i - 1 >= SHN_LORESERVE, instead arrange in that case to copy
	over .symtab_shndx sections, though emit those last and compute their
	section content when processing associated .symtab sections.  Handle
	simple_object_internal_read failure even in the .symtab_shndx reading
	case.

(cherry picked from commit bb8dd09)
hubot pushed a commit that referenced this pull request Oct 9, 2024
Whenever C1 and C2 are integer constants, X is of a wrapping type, and
cmp is a relational operator, the expression X +- C1 cmp C2 can be
simplified in the following cases:

(a) If cmp is <= and C2 -+ C1 == +INF(1), we can transform the initial
comparison in the following way:
   X +- C1 <= C2
   -INF <= X +- C1 <= C2 (add left hand side which holds for any X, C1)
   -INF -+ C1 <= X <= C2 -+ C1 (add -+C1 to all 3 expressions)
   -INF -+ C1 <= X <= +INF (due to (1))
   -INF -+ C1 <= X (eliminate the right hand side since it holds for any X)

(b) By analogy, if cmp if >= and C2 -+ C1 == -INF(1), use the following
sequence of transformations:

   X +- C1 >= C2
   +INF >= X +- C1 >= C2 (add left hand side which holds for any X, C1)
   +INF -+ C1 >= X >= C2 -+ C1 (add -+C1 to all 3 expressions)
   +INF -+ C1 >= X >= -INF (due to (1))
   +INF -+ C1 >= X (eliminate the right hand side since it holds for any X)

(c) The > and < cases are negations of (a) and (b), respectively.

This transformation allows to occasionally save add / sub instructions,
for instance the expression

3 + (uint32_t)f() < 2

compiles to

cmn     w0, #4
cset    w0, ls

instead of

add     w0, w0, 3
cmp     w0, 2
cset    w0, ls

on aarch64.

Testcases that go together with this patch have been split into two
separate files, one containing testcases for unsigned variables and the
other for wrapping signed ones (and thus compiled with -fwrapv).
Additionally, one aarch64 test has been adjusted since the patch has
caused the generated code to change from

cmn     w0, #2
csinc   w0, w1, wzr, cc   (x < -2)

to

cmn     w0, #3
csinc   w0, w1, wzr, cs   (x <= -3)

This patch has been bootstrapped and regtested on aarch64, x86_64, and
i386, and additionally regtested on riscv32.

gcc/ChangeLog:

	PR tree-optimization/116024
	* match.pd: New transformation around integer comparison.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr116024-2.c: New test.
	* gcc.dg/tree-ssa/pr116024-2-fwrapv.c: Ditto.
	* gcc.target/aarch64/gtu_to_ltu_cmp_1.c: Adjust.
hubot pushed a commit that referenced this pull request Oct 24, 2024
This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,
svuint64_t f1 ()
{
  return svindex_u642 (10, 3);
}
compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple pass lower.
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,
svuint64_t f2 ()
{
    return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}
has previously been compiled to
f2:
        index   z0.d, #10, #3
        mul     z0.d, z0.d, #5
        ret
Now, it is compiled to
f2:
        mov     x0, 50
        index   z0.d, x0, #15
        ret

We added test cases checking
- the application of the transform during gimple for constant arguments,
- the interaction with another gimple-level optimization.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>

gcc/
	* config/aarch64/aarch64-sve-builtins-base.cc
	(svindex_impl::fold): Add constant folding.

gcc/testsuite/
	* gcc.target/aarch64/sve/index_const_fold.c: New test.
hubot pushed a commit that referenced this pull request Mar 31, 2025
Here we instantiate the lambda three times in producing A<0>::f:
1) in tsubst_function_type, substituting the type of A<>::f
2) in tsubst_function_decl, substituting the parameters of A<>::f
3) in regenerate_decl_from_template when instantiating A<>::f

The first one gets thrown away by maybe_rebuild_function_decl_type.  Before
r15-7202, we happily built all of them and mangled the result wrongly as
lambda #3.  After r15-7202, we try to mangle #3 as #1, which breaks because
 #1 is already mangled as #1.

This patch avoids building #3 by suppressing regenerate_decl_from_template
if the template signature includes a lambda, fixing the ICE.

We now mangle the lambda as #2, which is still wrong.  Addressing that
should involve not calling tsubst_function_type from tsubst_function_decl,
and building the type from the parms types in the first place rather than
fixing it up in maybe_rebuild_function_decl_type.

	PR c++/119401

gcc/cp/ChangeLog:

	* pt.cc (regenerate_decl_from_template): Don't regenerate if the
	signature involves a lambda.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/lambda-targ11.C: New test.
Peter0x44 pushed a commit to Peter0x44/gcc that referenced this pull request Apr 6, 2025
Here we instantiate the lambda three times in producing A<0>::f:
1) in tsubst_function_type, substituting the type of A<>::f
2) in tsubst_function_decl, substituting the parameters of A<>::f
3) in regenerate_decl_from_template when instantiating A<>::f

The first one gets thrown away by maybe_rebuild_function_decl_type.  Before
r15-7202, we happily built all of them and mangled the result wrongly as
lambda gcc-mirror#3.  After r15-7202, we try to mangle gcc-mirror#3 as gcc-mirror#1, which breaks because
 gcc-mirror#1 is already mangled as gcc-mirror#1.

This patch avoids building gcc-mirror#3 by suppressing regenerate_decl_from_template
if the template signature includes a lambda, fixing the ICE.

We now mangle the lambda as gcc-mirror#2, which is still wrong.  Addressing that
should involve not calling tsubst_function_type from tsubst_function_decl,
and building the type from the parms types in the first place rather than
fixing it up in maybe_rebuild_function_decl_type.

	PR c++/119401

gcc/cp/ChangeLog:

	* pt.cc (regenerate_decl_from_template): Don't regenerate if the
	signature involves a lambda.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/lambda-targ11.C: New test.
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0 participants