mirrored from git://gcc.gnu.org/git/gcc.git
Releases/gcc 12 #65
Open: jacopobrusini wants to merge 2,886 commits into master from releases/gcc-12
+285,730
−139,638
This is an unofficial mirror that has nothing to do with the GCC project, so submitting pull requests here is a waste of time. Also, I have no idea what this pull request is trying to do, but it would never be accepted even if it was submitted to the right place.
atahanozbayram approved these changes on Apr 2, 2024
NinaRanns pushed a commit to NinaRanns/gcc that referenced this pull request on Jan 28, 2025:
…on-r15-7214-g0710024b5bd861 Contracts nonattr rebase on r15 7214 g0710024b5bd861
x is not a macro argument. It just happens to work because final.cc passes x as the 2nd argument:

    final.cc: ASM_OUTPUT_SYMBOL_REF (file, x);

PR target/118825
* config/i386/i386.h (ASM_OUTPUT_SYMBOL_REF): Replace x with SYM.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 7317fc0)
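The class of bug described above is easy to reproduce in isolation. The following is a minimal sketch, not the i386.h definition: the macro names `OUTPUT_SYMBOL_BUGGY` and `OUTPUT_SYMBOL_FIXED` and both helper functions are hypothetical and exist only for illustration.

```c
#include <string.h>

/* Buggy: the body refers to `x`, which is not a macro parameter.  It
   only compiles when the caller happens to have a variable named `x`
   in scope.  (Hypothetical macro, for illustration only.)  */
#define OUTPUT_SYMBOL_BUGGY(BUF, SYM) strcpy ((BUF), x)

/* Fixed: the body uses only its own parameters.  */
#define OUTPUT_SYMBOL_FIXED(BUF, SYM) strcpy ((BUF), (SYM))

/* A caller whose argument is named `x`: the buggy macro behaves as
   intended here, which is why such a bug can go unnoticed.  */
static void
emit_with_x (char *buf, const char *x)
{
  OUTPUT_SYMBOL_BUGGY (buf, x);
}

/* A caller whose argument has a different name: only the fixed macro
   works.  Using OUTPUT_SYMBOL_BUGGY here would fail to compile because
   `x` is undeclared in this function.  */
static void
emit_with_sym (char *buf, const char *sym)
{
  OUTPUT_SYMBOL_FIXED (buf, sym);
}
```

Both helpers copy the symbol name into the buffer; the difference only shows up once a call site stops naming its variable `x`.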
Add crtbeginT.o to extra_parts on FreeBSD. This ensures we use GCC's crt objects for static linking. Otherwise it could mix crtbeginT.o from the base system with libgcc's crtend.o, possibly leading to segfaults.

libgcc:
PR target/118685
* config.host (*-*-freebsd*): Add crtbeginT.o to extra_parts.

Signed-off-by: Dimitry Andric <dimitry@andric.com>
During combine we may end up with

    (set (reg:DI 66 [ _6 ])
         (ashift:DI (reg:DI 72 [ x ])
                    (subreg:QI (and:TI (reg:TI 67 [ _1 ])
                                       (const_wide_int 0x0aaaaaaaaaaaaaabf))
                               15)))

where the shift count operand does not trivially fit the scheme of address operands. Reject those operands, especially since strip_address_mutations() expects expressions of the form (and ... (const_int ...)) and fails for (and ... (const_wide_int ...)).

Thus, be more strict here and accept only CONST_INT operands. Done by replacing immediate_operand() with const_int_operand() which is enough since the former only additionally checks for LEGITIMATE_PIC_OPERAND_P and targetm.legitimate_constant_p which are always true for CONST_INT operands.

While on it, fix indentation of the if block.

gcc/ChangeLog:
PR target/118835
* config/s390/s390.cc (s390_valid_shift_count): Reject shift count operands which do not trivially fit the scheme of address operands.

gcc/testsuite/ChangeLog:
* gcc.target/s390/pr118835.c: New test.

(cherry picked from commit ac9806d)
Floating-point emulation in the D front-end is done via a type named `struct longdouble`, which in GDC is a small interface around the real_value type. Because the D code cannot include gcc/real.h directly, a big enough buffer is used for the data instead.

On x86_64, this buffer is actually bigger than real_value itself, so when a new longdouble object is created with

    longdouble r;
    real_from_string3 (&r.rv (), buffer, mode);
    return r;

there is uninitialized padding at the end of `r`. This was never a problem when D was implemented in C++ (until GCC 12) as comparing two longdouble objects with `==' would be forwarded to the relevant operator== overload that extracted the underlying real_value.

However when the front-end was translated to D, such conditions were instead rewritten into identity comparisons

    return exp.toReal() is CTFloat.zero

The `is` operator gets lowered as a call to `memcmp() == 0', which is where the read of uninitialized memory occurs, as seen by valgrind.

    ==26778== Conditional jump or move depends on uninitialised value(s)
    ==26778==    at 0x911F41: dmd.dstruct._isZeroInit(dmd.expression.Expression) (dstruct.d:635)
    ==26778==    by 0x9123BE: StructDeclaration::finalizeSize() (dstruct.d:373)
    ==26778==    by 0x86747C: dmd.aggregate.AggregateDeclaration.determineSize(ref const(dmd.location.Loc)) (aggregate.d:226)
    [...]

To avoid accidentally reading uninitialized data, explicitly initialize all `longdouble` variables with an empty constructor on the C++ side of the implementation before initializing the underlying real_value type it holds.

PR d/116961

gcc/d/ChangeLog:
* d-codegen.cc (build_float_cst): Change new_value type from real_t to real_value.
* d-ctfloat.cc (CTFloat::fabs): Default initialize the return value.
(CTFloat::ldexp): Likewise.
(CTFloat::parse): Likewise.
* d-longdouble.cc (longdouble::add): Likewise.
(longdouble::sub): Likewise.
(longdouble::mul): Likewise.
(longdouble::div): Likewise.
(longdouble::mod): Likewise.
(longdouble::neg): Likewise.
* d-port.cc (Port::isFloat32LiteralOutOfRange): Likewise.
(Port::isFloat64LiteralOutOfRange): Likewise.

gcc/testsuite/ChangeLog:
* gdc.dg/pr116961.d: New test.

(cherry picked from commit f7bc17e)
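The interaction between an oversized buffer and a bytewise comparison can be sketched in plain C. This is an illustration under stated assumptions, not the GDC code: `struct box` is a hypothetical stand-in for the `longdouble` buffer, a `double` stands in for real_value, and `make_box` and `box_identical` are made-up names.

```c
#include <string.h>

/* Hypothetical stand-in for the oversized buffer: it is deliberately
   larger than the payload (a double), so the trailing bytes are never
   written when only the payload is stored.  */
struct box
{
  unsigned char bytes[sizeof (double) + 8];
};

/* Zero the whole buffer first, then store the payload, so that every
   byte of the box has a defined value.  Without the memset, the last
   8 bytes would be indeterminate.  */
static struct box
make_box (double value)
{
  struct box b;
  memset (&b, 0, sizeof b);
  memcpy (b.bytes, &value, sizeof value);
  return b;
}

/* Bytewise identity comparison, analogous to an `is` comparison that
   is lowered to memcmp () == 0.  */
static int
box_identical (const struct box *a, const struct box *b)
{
  return memcmp (a, b, sizeof *a) == 0;
}
```

With the buffer fully initialized, two boxes holding the same value compare identical regardless of what was previously on the stack.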
The following testcase is miscompiled due to a bug in optimize_range_tests_to_bit_test. It is trying to optimize check for a in [-34,-34] or [-26,-26] or [-6,-6] or [-4,inf] ranges. Another reassoc optimization folds the test for the first two ranges into (a + 34U) & ~8U in [0U,0U] range, and extract_bit_test_mask actually has code to virtually undo it and treat that again as test for a being -34 or -26.

The problem is that optimize_range_tests_to_bit_test remembers in the type variable TREE_TYPE (ranges[i].exp); from the first range. If extract_bit_test_mask doesn't do that virtual undoing of the BIT_AND_EXPR handling, that is just fine, the returned exp is ranges[i].exp. But if the first range is BIT_AND_EXPR, the type could be different, the BIT_AND_EXPR form has the optional cast to corresponding unsigned type in order to avoid introducing UB. Now, type was used to fill in the max value if ranges[j].high was missing in subsequently tested range, and so in this particular testcase the [-4,inf] range which was signed int and so [-4,INT_MAX] was treated as [-4,UINT_MAX] instead. And we were subtracting values of 2 different types and trying to make sense out of that.

The following patch fixes this by using the type of the low bound (which is always non-NULL) for the max value of the high bound instead.

2025-02-24 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/118915
* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): For highj == NULL_TREE use TYPE_MAX_VALUE (TREE_TYPE (lowj)) rather than TYPE_MAX_VALUE (type).

* gcc.c-torture/execute/pr118915.c: New test.

(cherry picked from commit 5806279)
The following testcase was emitting false positive warning that the rhs of #pragma omp atomic write was stored but not read, when the atomic actually does read it.

The following patch fixes that by calling default_function_array_read_conversion on it, so that it is marked as read as well as converted from lvalue to rvalue.

Furthermore, the code had

    if (code == NOP_EXPR) ... else ... if (code == NOP_EXPR) ...

with none of ... parts changing code, so I've merged the two ifs.

2025-02-25 Jakub Jelinek <jakub@redhat.com>

PR c/119000
* c-parser.cc (c_parser_omp_atomic): For omp write call default_function_array_read_conversion on the rhs expression. Merge the two adjacent if (code == NOP_EXPR) blocks.

* c-c++-common/gomp/pr119000.c: New test.

(cherry picked from commit cdffc76)
…fault_args etc. modify it [PR98533]

The following testcases ICE during type verification, because TYPE_FIELDS of e.g. S RECORD_TYPE in pr119123.C is different from TYPE_FIELDS of const S. Various decls are added to S's TYPE_FIELDS first, then finish_struct indirectly calls fixup_type_variants to sync the variant copies. But later on cp_parser_class_specifier calls cp_parser_late_parsing_default_args and that apparently adds a lambda type (from default argument) to TYPE_FIELDS of S.

Dunno if that is right or not, assuming it is right, the following patch fixes it by updating TYPE_FIELDS of variant types if there were any changes in the various functions cp_parser_class_specifier defers and calls on the outermost enclosing class.

There was quite a lot of code repetition already before, so the patch uses a lambda to avoid the repetitions.

To my surprise, in some of the contract testcases (
    g++.dg/contracts/contracts-friend1.C
    g++.dg/contracts/contracts-nested-class1.C
    g++.dg/contracts/contracts-nested-class2.C
    g++.dg/contracts/contracts-redecl7.C
    g++.dg/contracts/contracts-redecl8.C
) it is actually setting class_type and pushing TRANSLATION_UNIT_DECL rather than some class types in some cases. Or should the lambda pushing into the containing class be somehow avoided?

2025-03-06 Jakub Jelinek <jakub@redhat.com>

PR c++/98533
PR c++/119123
* parser.cc (cp_parser_class_specifier): Update TYPE_FIELDS of variant types in case cp_parser_late_parsing_default_args etc. change TYPE_FIELDS on the main variant. Add switch_to_class lambda and use it to simplify repeated class switching code.

* g++.dg/cpp0x/pr98533.C: New test.
* g++.dg/cpp0x/pr119123.C: New test.

(cherry picked from commit 179e010)
The following testcase takes very long time to compile, because skip_simple_arithmetic decides to first call tree_invariant_p on the second argument (and indirectly recurse there). I think before canonicalization of operands for commutative binary expressions (and for non-commutative ones always) it is pretty common that the first operand is a constant, something which tree_invariant_p handles immediately, so the following patch special cases that; I've added there a tree_invariant_p call too after the checks, while it is not really needed currently, tree_invariant_p has the same checks, I wanted to be prepared in case tree_invariant_p changes. But if you think I should avoid it, I can drop it too.

This is just a partial fix, I think one can certainly construct a testcase which will still have horrible compile time complexity (but I've tried and haven't managed to do so), so perhaps we should just limit the recursion depth through skip_simple_arithmetic/tree_invariant_p with some defaulted argument.

2025-03-11 Jakub Jelinek <jakub@redhat.com>

PR c/119183
* tree.cc (skip_simple_arithmetic): If first operand of binary expr is TREE_CONSTANT or TREE_READONLY with no side-effects, call tree_invariant_p on that operand first instead of on the second.

* gcc.dg/pr119183.c: New test.

(cherry picked from commit 20e5aa9)
Given the recent PR119406 I've tried to grep for concatenated string literals without space at the end of one line and at the start of next line, unless it was obviously intentional. Furthermore, I've then looked through gcc.pot looking for 2 adjacent spaces and looking back if that wasn't the case of "something " " with spaces at both sides".

Here is the result from that. I think just the c.opt change needs an explanation, the "" in the description is simply eaten up somewhere during the option processing and gcc -v --help before this patch was displaying

    -Wdeprecated-literal-operator Warn about deprecated space between and suffix in a user-defined literal operator.

2025-03-22 Jakub Jelinek <jakub@redhat.com>

gcc/
* gimplify.cc (warn_switch_unreachable_and_auto_init_r): Add missing space in the middle of diagnostics.

(cherry picked from commit 20360e4)
… spots [PR119291]

The following testcase is miscompiled on x86_64-linux at -O2 by the combiner. We have from earlier combinations

    (insn 22 21 23 4 (set (reg:SI 104 [ _7 ]) (const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal} (nil))
    (insn 23 22 24 4 (set (reg/v:SI 117 [ e ]) (reg/v:SI 116 [ e ])) 96 {*movsi_internal} (expr_list:REG_DEAD (reg/v:SI 116 [ e ]) (nil)))
    (note 24 23 25 4 NOTE_INSN_DELETED)
    (insn 25 24 26 4 (parallel [ (set (reg:CCZ 17 flags) (compare:CCZ (neg:SI (reg:SI 104 [ _7 ])) (const_int 0 [0]))) (set (reg/v:SI 116 [ e ]) (neg:SI (reg:SI 104 [ _7 ]))) ]) "pr119291.c":26:13 977 {*negsi_2} (expr_list:REG_DEAD (reg:SI 104 [ _7 ]) (nil)))
    (note 26 25 27 4 NOTE_INSN_DELETED)
    (insn 27 26 28 4 (set (reg:DI 128 [ _9 ]) (ne:DI (reg:CCZ 17 flags) (const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1} (expr_list:REG_DEAD (reg:CCZ 17 flags) (nil)))

and try_combine is called on i3 25 and i2 22 (second time) and reach the hunk being patched with simplified i3

    (insn 25 24 26 4 (parallel [ (set (pc) (pc)) (set (reg/v:SI 116 [ e ]) (const_int 0 [0])) ]) "pr119291.c":28:13 977 {*negsi_2} (expr_list:REG_DEAD (reg:SI 104 [ _7 ]) (nil)))

and

    (insn 22 21 23 4 (set (reg:SI 104 [ _7 ]) (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal} (nil))

Now, the try_combine code there attempts to split two independent sets in newpat by moving one of them to i2. And among other tests it checks

    !modified_between_p (SET_DEST (set1), i2, i3)

which is certainly needed, if there would be say

    (set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a]))

in between i2 and i3, we couldn't do that, as that set would overwrite the value set by set1 we want to move to the i2 position. But in this case pseudo 116 isn't set in between i2 and i3, but used (and additionally there is a REG_DEAD note for it).
This is equally bad for the move, because while the i3 insn and later will see the pseudo value that we set, the insn in between which uses the value will see a different value from the one that it should see. As we don't check for that, in the end try_combine succeeds and changes the IL to:

    (insn 22 21 23 4 (set (reg/v:SI 116 [ e ]) (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal} (nil))
    (insn 23 22 24 4 (set (reg/v:SI 117 [ e ]) (reg/v:SI 116 [ e ])) 96 {*movsi_internal} (expr_list:REG_DEAD (reg/v:SI 116 [ e ]) (nil)))
    (note 24 23 25 4 NOTE_INSN_DELETED)
    (insn 25 24 26 4 (set (pc) (pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE} (nil))
    (note 26 25 27 4 NOTE_INSN_DELETED)
    (insn 27 26 28 4 (set (reg:DI 128 [ _9 ]) (const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal} (nil))

(note, the i3 got turned into a nop and try_combine also modified insn 27).

The following patch replaces the modified_between_p tests with reg_used_between_p, my understanding is that modified_between_p is a subset of reg_used_between_p, so one doesn't need both.

Looking at this some more today, I think we should special case set_noop_p because that can be put into i2 (except for the JUMP_P violations), currently both modified_between_p (pc_rtx, i2, i3) and reg_used_between_p (pc_rtx, i2, i3) returns false. I'll post a patch incrementally for that (but that feels like new optimization, so probably not something that should be backported).

On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote:
> Can we constrain SET_DEST (set1/set0) to a REG_P in combine? Why
> does the comment talk about memory?

I was worried about making too risky changes this late in stage4 (and especially also for backports). Most of this code is 1992-ish.
I think many of the functions are just misnamed, the reg_ in there doesn't match what those functions do (bet they initially supported just REGs and later on support for other kinds of expressions was added, but haven't done git archeology to prove that).

What we know for sure is:

    && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
    && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
    && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
    && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART

that is checked earlier in the condition. Then it calls

    && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)), XVECEXP (newpat, 0, 0))
    && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)), XVECEXP (newpat, 0, 1))

While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p which is also misnamed, that function handles just fine all of REG, MEM, SUBREG of REG, (SUBREG of MEM not, see below), ZERO_EXTRACT, STRICT_LOW_PART, PC and even some further cases. So, IMHO SET_DEST (set0) or SET_DEST (set1) can be certainly a REG, SUBREG of REG, PC (at least the REG and PC cases are triggered on the testcase) and quite possibly also MEM (SUBREG of MEM not, see below).

Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that function for constants just returns false, for PC returns true, for REG returns reg_set_between_p, for MEM recurses on the address, for MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code whether the memory could have been modified in between, for all other rtxes recurses on the subrtxes. This part didn't change in my patch.

I've only changed those

    - && !modified_between_p (SET_DEST (set{1,0}), i2, i3)
    + && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3)

where the former has been described above and clearly handles all of REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things.
The replacement reg_used_between_p calls reg_overlap_mentioned_p on each instruction in between i2 and i3. So, there is clearly a difference in behavior if SET_DEST (set{1,0}) is pc_rtx, in that case modified_between_p returns unconditionally true even if there are no instructions in between, but reg_used_between_p if there are no non-debug insns in between returns false. Sorry for missing that, guess I should check for that (with the exception of the noop moves which are often (set (pc) (pc)) and handled by the incremental patch). In fact not just that, reg_used_between_p will only return true for PC if it is mentioned anywhere in the insns in between.

Anyway, except for that, for REG it calls refers_to_regno_p and so should find any occurrences of any of the REG or parts of it for hard registers, for MEM returns true if it sees any MEMs in insns in between (conservatively), for SUBREGs apparently it relies on it being SUBREG of REG (so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the SUBREG_REG, PC I've already described.

Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think already the initial

    && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)), XVECEXP (newpat, 0, 0))
    && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)), XVECEXP (newpat, 0, 1))

calls would have failed --enable-checking=rtl or would have misbehaved, so I think there is no need to check for it further.

To your question why I don't use reg_referenced_p, that is because reg_referenced_p is something to call on one insn pattern, while reg_used_between_p is pretty much that on all insns in between two instructions (excluding the boundaries).

So, I think it would be safer to add && SET_DEST (set{1,0}) != pc_rtx checks to preserve former behavior, like in the following version.
2025-04-01 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/119291
* combine.cc (try_combine): For splitting of PARALLEL with 2 independent SETs into i2 and i3 sets check reg_used_between_p of the SET_DESTs rather than just modified_between_p.

* gcc.c-torture/execute/pr119291.c: New test.

(cherry picked from commit 19ba913)
The following testcase ICEs because c_fully_fold isn't performed on the arguments of __sanitizer_ptr_{sub,cmp} builtins and so e.g. C_MAYBE_CONST_EXPR can leak into the gimplifier where it ICEs.

2025-04-02 Jakub Jelinek <jakub@redhat.com>

PR c/119582
* c-typeck.cc (pointer_diff, build_binary_op): Call c_fully_fold on __sanitizer_ptr_sub or __sanitizer_ptr_cmp arguments.

* gcc.dg/asan/pr119582.c: New test.

(cherry picked from commit 29bc904)
I can reproduce a really weird error in our distro i686 trunk gcc (but haven't managed to reproduce it with vanilla trunk yet).

    echo 'void foo (void) {}' > a.c; gcc -O2 -flto=auto -m32 -march=i686 -ffat-lto-objects -fhardened -o a.o -c a.c; gcc -O2 -flto=auto -m32 -march=i686 -r -o a.lo a.o
    lto1: fatal error: open failed: No such file or directory
    compilation terminated.
    lto-wrapper: fatal error: gcc returned 1 exit status

The error is because

    cat ./a.lo.lto.o-args.0
    ""
    a.o

My suspicion is that this "" in there is caused by weird .gnu.lto_.opts section content during

    gcc -O2 -flto=auto -m32 -march=i686 -ffat-lto-objects -fhardened -S -o a.s -c a.c

compilation (and I can reproduce that one with vanilla trunk). The above results in

    .section .gnu.lto_.opts,"e",@progbits
    .string "'-fno-openmp' '-fno-openacc' '-fPIC' '' '-m32' '-march=i686' '-O2' '-flto=auto' '-ffat-lto-objects'"

There are two weird things, one (IMHO the cause of the "" later on) is the '' part, I think it comes from lto_write_options doing

    append_to_collect_gcc_options (&temporary_obstack, &first_p, "");

IMHO it shouldn't call append_to_collect_gcc_options at all for that case. The -fhardened option causes global_options.x_flag_cf_protection to be set to CF_FULL and later on the backend option processing sets it to CF_FULL | CF_SET (i.e. 7, a value not handled in lto_write_options).

The following patch fixes it by not emitting anything there if flag_cf_protection is one of the unhandled values. Perhaps it could incrementally use

    switch (global_options.x_flag_cf_protection & ~CF_SET)

instead, dunno.

And the other problem is that the -fPIC in there is really weird. Our distro compiler or vanilla configured trunk certainly doesn't default to -fPIC and -fhardened uses -fPIE when -fPIC/-fpic/-fno-pie/-fno-pic is not specified, so I was expecting -fPIE in there.

The thing is that the -fpie option causes setting of both global_options.x_flag_pi{c,e} to 1, -fPIE both to 2:

    /* If -fPIE or -fpie is used, turn on PIC.  */
    if (opts->x_flag_pie)
      opts->x_flag_pic = opts->x_flag_pie;
    else if (opts->x_flag_pic == -1)
      opts->x_flag_pic = 0;
    if (opts->x_flag_pic && !opts->x_flag_pie)
      opts->x_flag_shlib = 1;

so checking first for flag_pic == 2 and then flag_pic == 1 and only afterwards for flag_pie means we never print -fPIE/-fpie.

Or do you want something further (like switch (global_options.x_flag_cf_protection & ~CF_SET) )?

2025-04-04 Jakub Jelinek <jakub@redhat.com>

PR lto/119625
* lto-opts.cc (lto_write_options): If neither flag_pic nor flag_pie are set, check first for flag_pie and only later for flag_pic rather than the other way around, use a temporary variable. If flag_cf_protection is not set, don't append anything if flag_cf_protection is none of CF_{NONE,FULL,BRANCH,RETURN} and use a temporary variable.

(cherry picked from commit d25728c)
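The ordering problem with the PIC/PIE checks can be shown with two small functions. This is only a sketch of the check order described in the message, not the lto_write_options code: the function names are hypothetical, and returning NULL when neither flag is set is a simplification made for this example.

```c
#include <stddef.h>

/* Check order described as problematic: because -fpie/-fPIE also set
   flag_pic to the same value, the flag_pic tests always match first
   and the -fPIE/-fpie branches are unreachable.  */
static const char *
pic_option_pic_first (int flag_pic, int flag_pie)
{
  if (flag_pic == 2) return "-fPIC";
  if (flag_pic == 1) return "-fpic";
  if (flag_pie == 2) return "-fPIE";
  if (flag_pie == 1) return "-fpie";
  return NULL;   /* neither set (simplification for this sketch) */
}

/* Check order after the fix: test flag_pie first, then flag_pic.  */
static const char *
pic_option_pie_first (int flag_pic, int flag_pie)
{
  if (flag_pie == 2) return "-fPIE";
  if (flag_pie == 1) return "-fpie";
  if (flag_pic == 2) return "-fPIC";
  if (flag_pic == 1) return "-fpic";
  return NULL;   /* neither set (simplification for this sketch) */
}
```

For -fPIE, where both flags are 2, the first function reports -fPIC and the second reports -fPIE; plain -fpic (flag_pic 1, flag_pie 0) is reported the same way by both.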
Here is a cherry-pick from glibc [BZ #32411] fix.

As mentioned by the reporter in a pull request against gcc-mirror, the THREEp96 constant in e_expl.c is incorrect, it is actually 0x3.p+94f128 rather than 0x3.p+96f128. The algorithm uses that to compute the t2 integer (tval2), by whose delta it adjusts the x+xl pair and then in the result uses the precomputed exp value for that entry. Using 0x3.p+94f128 rather than 0x3.p+96f128 results in tval2 sometimes being one smaller, sometimes one larger than the desired value, thus can mean the x+xl pair after adjustment will be larger in absolute value than it should be.

DesWurstes created a test program for this, https://github.com/DesWurstes/comparefloats, and his results were

    total: 1135000000 not_equal: 4322 earlier_score: 674 later_score: 3648

I've modified this with https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c3 so that it actually tests pseudo-random _Float128 values with range (-16384.,16384) with strong bias on values larger than 0.0002 in absolute value (so that tval1/tval2 aren't zero most of the time) and that gave

    total: 10000000000 not_equal: 29861 earlier_score: 4606 later_score: 25255

So, in both cases, in most cases the change doesn't result in any differences, and in those rare cases where it does, about 85% have smaller ulp than without the patch.

Additionally I've tried https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c4 and in 2 billion iterations it didn't find any case where x+xl after the adjustments without this change would be smaller in absolute value compared to x+xl after the adjustments with this change.

2025-04-09 Jakub Jelinek <jakub@redhat.com>

* math/expq.c (C): Fix up THREEp96 constant.

(cherry picked from commit e081ced)
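Constants of the form 3 * 2^k are commonly used in an add-then-subtract rounding idiom, where the exponent determines the granularity of the rounding. The message does not show how e_expl.c uses THREEp96, so the following is only a double-precision sketch of that general idiom, assuming the default round-to-nearest mode; `round_with_magic` is a hypothetical name.

```c
/* Round x to a multiple of the unit in the last place of `magic` by
   adding and then subtracting it.  For magic = 0x3p+51 (3 * 2^51) the
   sum lies in [2^52, 2^53), where doubles are spaced 1 apart, so the
   result is x rounded to the nearest integer.  Valid only while |x| is
   small compared to magic.  The volatile qualifier keeps the compiler
   from folding the two operations away.  */
static double
round_with_magic (double x, double magic)
{
  volatile double t = x + magic;
  t -= magic;
  return t;
}
```

With `0x3p+51` the call rounds 2.3 to 2.0, whereas a constant two binary orders of magnitude smaller, `0x3p+49`, rounds it to 2.25: an exponent that is off by two silently changes the rounding granularity by a factor of four.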
With --enable-host-pie -freport-bug almost never prepares preprocessed source and instead emits the

    The bug is not reproducible, so it is likely a hardware or OS problem.

message even for bugs which are 100% reproducible.

The way -freport-bug works is that it reruns it 3 times, capturing stdout and stderr from each and then tries to compare the outputs in between different runs.

The libbacktrace emitted hexadecimal addresses at the start of the lines can differ between runs due to ASLR, either of the PIE executable, or even if not PIE if there is some frame with e.g. libc function (say crash in strlen/memcpy etc.).

The following patch fixes it by ignoring such differences at the start of the lines.

2025-04-12 Jakub Jelinek <jakub@redhat.com>

PR driver/119727
* gcc.cc (files_equal_p): Rewritten using fopen/fgets/fclose instead of open/fstat/read/close. At the start of lines, ignore lowercase hexadecimal addresses followed by space.

(cherry picked from commit 8b2ceb4)
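The idea of comparing lines while ignoring a leading address can be sketched as follows. This is not the files_equal_p implementation: both function names are hypothetical, and the assumption that an address is written as `0x` followed by lowercase hexadecimal digits is mine, since the message only says "lowercase hexadecimal addresses followed by space".

```c
#include <string.h>

/* Return nonzero if c is a lowercase hexadecimal digit.  */
static int
is_lower_hex (char c)
{
  return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
}

/* If the line starts with "0x", one or more lowercase hex digits and a
   space, return a pointer just past that space; otherwise return the
   line unchanged.  (The "0x" prefix is an assumption of this sketch.)  */
static const char *
skip_leading_address (const char *line)
{
  const char *p = line;
  if (p[0] != '0' || p[1] != 'x' || !is_lower_hex (p[2]))
    return line;
  p += 2;
  while (is_lower_hex (*p))
    p++;
  return *p == ' ' ? p + 1 : line;
}

/* Compare two lines, treating differing leading addresses as equal.  */
static int
lines_equal_modulo_address (const char *a, const char *b)
{
  return strcmp (skip_leading_address (a), skip_leading_address (b)) == 0;
}
```

Two backtrace lines that differ only in the load address then compare equal, while lines that differ in the function name still compare unequal.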
Andi had a useful comment that even with the PR119727 workaround to ignore differences in libbacktrace printed addresses, it is still better to turn off ASLR when easily possible, e.g. in case some address leaks in somewhere in the ICE message elsewhere, or to verify the ICE doesn't depend on a particular library/binary load addresses.

The following patch adds a configure check and uses personality syscall to turn off randomization for further -freport-bug subprocesses.

2025-04-14 Jakub Jelinek <jakub@redhat.com>

PR driver/119727
* configure.ac (HOST_HAS_PERSONALITY_ADDR_NO_RANDOMIZE): New check.
* gcc.cc: Include sys/personality.h if HOST_HAS_PERSONALITY_ADDR_NO_RANDOMIZE is defined.
(try_generate_repro): Call personality (personality (0xffffffffU) | ADDR_NO_RANDOMIZE) if HOST_HAS_PERSONALITY_ADDR_NO_RANDOMIZE is defined.
* config.in: Regenerate.
* configure: Regenerate.

(cherry picked from commit 5a32e85)
This is a regression on some targets introduced I believe by r6-2055 which added mode argument to set_src_cost.

The problem here is that in the first iteration, mode is always QImode and we get as -Os zero cost set_src_cost (const0_rtx, QImode, false). But then we use the mode variable for iterating over int, partial int and vector int modes, so for the second iteration we call set_src_cost with mode which is at that time (machine_mode) (MAX_MODE_VECTOR_INT + 1).

In the x86 case that happens to be V2HFmode and we don't crash (and compute the same 0 cost as we would for QImode). But e.g. in the SPARC case (machine_mode) (MAX_MODE_VECTOR_INT + 1) is MAX_MACHINE_MODE and that does all kinds of weird things especially when doing ubsan bootstrap.

Fixed by always using QImode.

2025-04-14 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/119785
* expmed.cc (init_expmed): Always pass QImode rather than mode to set_src_cost passed to set_zero_cost.

(cherry picked from commit f96a543)
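The stale loop variable described above can be reproduced with a toy enumeration. This is a sketch only: the enum, its members and `record_zero_cost_modes` are hypothetical and merely mimic the shape of an outer two-iteration loop whose inner loop advances the same variable past the last valid mode.

```c
/* Hypothetical miniature of a machine-mode enumeration.  */
enum sketch_mode { SK_QI, SK_HI, SK_SI, SK_V8QI, SK_MAX_MODE };

/* Record which mode each outer iteration would pass to the "zero
   cost" query.  With use_fix == 0 the shared `mode` variable is used,
   which the inner loop leaves one past the last valid mode; with
   use_fix != 0 the first mode is always used.  */
static void
record_zero_cost_modes (int use_fix, enum sketch_mode seen[2])
{
  enum sketch_mode mode = SK_QI;
  for (int speed = 0; speed < 2; speed++)
    {
      seen[speed] = use_fix ? SK_QI : mode;
      for (mode = SK_QI; mode < SK_MAX_MODE;
           mode = (enum sketch_mode) (mode + 1))
        {
          /* Per-mode cost tables would be filled in here.  */
        }
    }
}
```

Without the fix the second iteration sees `SK_MAX_MODE`, an out-of-range value; with the fix both iterations see `SK_QI`.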
As mentioned in the PR (and I think in PR101075 too), we can run into deadlock with libat_lock_n calls with larger n. As mentioned in PR66842, we use multiple locks (normally 64 mutexes for each 64 byte cache line in 4KiB page) and currently can lock more than one lock, in particular for n [0, 64] a single lock, for n [65, 128] 2 locks, for n [129, 192] 3 locks etc.

There are two problems with this:

1) we can deadlock if there is some wrap-around, because the locks are acquired always in the order from addr_hash (ptr) up to locks[NLOCKS-1].mutex and then if needed from locks[0].mutex onwards; so if e.g. 2 threads perform libat_lock_n with n = 2048+64, in one case at pointer starting at page boundary and in another case at page boundary + 2048 bytes, the first thread can lock the first 32 mutexes, the second thread can lock the last 32 mutexes and then first thread wait for the lock 32 held by second thread and second thread wait for the lock 0 held by the first thread; fixed below by always locking the locks in order of increasing index, if there is a wrap-around, by locking in 2 loops, first locking some locks at the start of the array and second at the end of it

2) the number of locks seems to be determined solely depending on the n value, I think that is wrong, we don't know the structure alignment on the libatomic side, it could very well be 1 byte aligned struct, and so how many cachelines are actually (partly or fully) occupied by the atomic access depends not just on the size, but also on ptr % WATCH_SIZE, e.g. 2 byte structure at address page_boundary+63 should IMHO lock 2 locks because it occupies the first and second cacheline

Note, before this patch it locked exactly one lock for n = 0, while with this patch it could lock either no locks at all (if it is at cacheline boundary) or 1 (otherwise). Dunno if libatomic APIs can be called for zero sizes and whether we actually care that much how many mutexes are locked in that case, because one can't actually read/write anything into zero sized memory. If you think it is important, I could add

    else if (nlocks == 0) nlocks = 1;

in both spots.

2025-04-16 Jakub Jelinek <jakub@redhat.com>

PR libgcc/101075
PR libgcc/119796
* config/posix/lock.c (libat_lock_n, libat_unlock_n): Start with computing how many locks will be needed and take into account ((uintptr_t)ptr % WATCH_SIZE). If some locks from the end of the locks array and others from the start of it will be needed, first lock the ones from the start followed by ones from the end.

(cherry picked from commit 61dfb07)
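The two fixes, counting cache lines from both the size and the pointer offset, and acquiring locks in increasing index order, can be sketched without any real mutexes. This is an illustration, not the libatomic source: `WATCH_SIZE` and `NLOCKS` take the values stated in the message, while `locks_needed`, `first_lock` and `lock_order` are hypothetical helpers, and the hash used by `first_lock` is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define WATCH_SIZE 64   /* bytes guarded by one lock (one cache line) */
#define NLOCKS     64   /* locks covering one 4 KiB page */

/* Number of cache lines touched by the byte range [ptr, ptr + n),
   capped at NLOCKS.  Depends on ptr % WATCH_SIZE as well as on n.  */
static size_t
locks_needed (uintptr_t ptr, size_t n)
{
  size_t nlocks = (n + ptr % WATCH_SIZE + WATCH_SIZE - 1) / WATCH_SIZE;
  return nlocks > NLOCKS ? NLOCKS : nlocks;
}

/* Index of the first lock for ptr (assumed hash for this sketch).  */
static size_t
first_lock (uintptr_t ptr)
{
  return (ptr / WATCH_SIZE) % NLOCKS;
}

/* Fill order[] with the lock indexes in acquisition order and return
   how many there are.  The indexes are always increasing: on
   wrap-around the low-index locks come first, then the high ones.  */
static size_t
lock_order (uintptr_t ptr, size_t n, size_t order[NLOCKS])
{
  size_t h = first_lock (ptr);
  size_t nlocks = locks_needed (ptr, n);
  size_t k = 0;
  if (h + nlocks > NLOCKS)
    {
      size_t wrapped = h + nlocks - NLOCKS;
      for (size_t i = 0; i < wrapped; i++)
        order[k++] = i;
      nlocks -= wrapped;
    }
  for (size_t i = 0; i < nlocks; i++)
    order[k++] = h + i;
  return k;
}
```

A 2-byte object at offset 63 needs two locks, a zero-sized access at a cache-line boundary needs none, and an access of 2048+64 bytes starting 2048 bytes into a page takes lock 0 before locks 32 through 63, so two such threads can no longer wait on each other in a cycle.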
Here is just a port of the previously posted patch to mingw which clearly has the same problems.

2025-04-16 Jakub Jelinek <jakub@redhat.com>

PR libgcc/101075
PR libgcc/119796
* config/mingw/lock.c (libat_lock_n, libat_unlock_n): Start with computing how many locks will be needed and take into account ((uintptr_t)ptr % WATCH_SIZE). If some locks from the end of the locks array and others from the start of it will be needed, first lock the ones from the start followed by ones from the end.

(cherry picked from commit 34fe8e9)
…decisions [PR119327]

The following testcase FAILs because the always_inline function can't be inlined. The rs6000 backend has similarly to other targets a hook which rejects inlining which would bring in new ISAs which aren't there in the caller. And this hook rejects this because of OPTION_MASK_SAVE_TOC_INDIRECT differences. This flag is set if explicitly requested or by default depending on whether the current function looks hot (or at least not cold):

    if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
        && flag_shrink_wrap_separate
        && optimize_function_for_speed_p (cfun))
      rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;

The target nodes that are being compared here are actually the default target node (which was created when cfun was NULL) vs. one that was created for the always_inline function when it wasn't NULL, so one doesn't have it, the other does.

In any case, this flag feels like a tuning decision rather than hard ISA requirement and I see no problems why we couldn't inline even explicit -msave-toc-indirect function into -mno-save-toc-indirect or vice versa. We already ignore OPTION_MASK_P{8,10}_FUSION which are also more like tuning flags.

2025-04-22 Jakub Jelinek <jakub@redhat.com>

PR target/119327
* config/rs6000/rs6000.cc (rs6000_can_inline_p): Ignore also OPTION_MASK_SAVE_TOC_INDIRECT differences.

* g++.dg/opt/pr119327.C: New test.

(cherry picked from commit 4b62cf5)
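The idea of ignoring tuning-only bits when deciding whether a callee's ISA requirements fit the caller can be sketched with plain bit masks. This is not the rs6000_can_inline_p code: the flag names and values below are hypothetical, and the subset test is an assumed simplification of what such a hook checks.

```c
/* Hypothetical flag bits; the real OPTION_MASK_* values differ.  */
#define SK_ISA_VSX            (1u << 0)
#define SK_ISA_ALTIVEC        (1u << 1)
#define SK_SAVE_TOC_INDIRECT  (1u << 2)   /* tuning decision, not ISA */

/* Bits that should not influence the inlining decision.  */
#define SK_TUNING_ONLY_MASK   (SK_SAVE_TOC_INDIRECT)

/* Allow inlining when the callee's flags, with tuning-only bits
   masked out, are a subset of the caller's flags.  */
static int
sketch_can_inline_p (unsigned caller_flags, unsigned callee_flags)
{
  unsigned callee_isa = callee_flags & ~SK_TUNING_ONLY_MASK;
  unsigned caller_isa = caller_flags & ~SK_TUNING_ONLY_MASK;
  return (caller_isa & callee_isa) == callee_isa;
}
```

A callee that differs from its caller only in the tuning bit is then inlinable, while a callee that requires an ISA bit the caller lacks is still rejected.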
We need to drop the kind argument from what is passed to the library, but need to do it not only when one uses the argument name for it (so kind=4 etc.) but also when one passes all the arguments to the intrinsics.

The following patch uses what gfc_conv_intrinsic_findloc uses, which looks more efficient and cleaner, we already set automatic vars to point to the kind and back actual arguments, so we can just free/clear expr on the former and set name to "%VAL" on the latter.

And similarly clears dim argument for the BT_CHARACTER case when using maxloc2/minloc2, again regardless of whether it was named or not.

2025-05-13 Jakub Jelinek <jakub@redhat.com>
Daniil Kochergin <daniil2472s@gmail.com>
Tobias Burnus <tburnus@baylibre.com>

PR fortran/120191
* trans-intrinsic.cc (strip_kind_from_actual): Remove.
(gfc_conv_intrinsic_minmaxloc): Don't call strip_kind_from_actual. Free and clear kind_arg->expr if non-NULL. Set back_arg->name to "%VAL" instead of a loop looking for last argument. Remove actual variable, use array_arg instead. Free and clear dim_arg->expr if non-NULL for BT_CHARACTER cases instead of using a loop.

* gfortran.dg/pr120191_1.f90: New test.

(cherry picked from commit ec249be)
I've tried to write a testcase for the BT_CHARACTER maxloc/minloc with named or unnamed arguments and indeed the just posted patch fixed the arguments in there in multiple cases to match what the library expects. But the testcase still fails, due to library problems. The ones dealt with in this patch are in the _gfortran_s{max,min}loc2_{4,8,16}_s{1,4} functions. Those are trivial wrappers around _gfortrani_{max,min}loc2_{4,8,16}_s{1,4} which should call those functions if the scalar mask is true and just return 0 otherwise. The first of the two bugs I see there is that the back and len arguments are swapped, which means that it always acts as back=.true. and for len will use character length of 1 or 0 instead of the desired one. The _gfortrani_{max,min}loc2_{4,8,16}_s{1,4} functions have prototypes like GFC_INTEGER_4 maxloc2_4_s1 (gfc_array_s1 * const restrict array, GFC_LOGICAL_4 back, gfc_charlen_type len) so back comes before len, ditto for GFC_INTEGER_4 smaxloc2_4_s1 (gfc_array_s1 * const restrict array, GFC_LOGICAL_4 *mask, GFC_LOGICAL_4 back, gfc_charlen_type len). The other problem is that it was just testing if (mask). In my limited Fortran understanding that means that the optional argument mask was supplied but says nothing about its actual value. Other scalar mask generated routines use if (mask == NULL || *mask) as the condition when to call the non-masked function, i.e. when mask is not supplied (then it should act like .true. mask) or when it is supplied and evaluates to .true. 2025-05-13 Jakub Jelinek <jakub@redhat.com> PR fortran/120191 * m4/maxloc2s.m4: For smaxloc2 call maxloc2 if mask is NULL or *mask. Swap back and len arguments. * m4/minloc2s.m4: Likewise. * generated/maxloc2_4_s1.c: Regenerate. * generated/maxloc2_4_s4.c: Regenerate. * generated/maxloc2_8_s1.c: Regenerate. * generated/maxloc2_8_s4.c: Regenerate. * generated/maxloc2_16_s1.c: Regenerate. * generated/maxloc2_16_s4.c: Regenerate. * generated/minloc2_4_s1.c: Regenerate. * generated/minloc2_4_s4.c: Regenerate. 
* generated/minloc2_8_s1.c: Regenerate. * generated/minloc2_8_s4.c: Regenerate. * generated/minloc2_16_s1.c: Regenerate. * generated/minloc2_16_s4.c: Regenerate. * gfortran.dg/pr120191_2.f90: New test. (cherry picked from commit 482f219)
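The two wrapper fixes described above (test `mask == NULL || *mask`, and pass `back` before `len`) can be sketched roughly as follows. The types and the `*_sketch` functions are simplified, hypothetical stand-ins that operate on a flat buffer of fixed-length strings; they are not the libgfortran descriptors or the generated code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for GFC_LOGICAL_4 and gfc_charlen_type.  */
typedef int logical4_sketch;
typedef size_t charlen_sketch;

/* Unmasked search over N strings of LEN bytes each, stored back to back.
   Returns the 1-based index of the largest string, or 0 if N is 0.
   With BACK set, the last of several equal maxima wins.  */
int
maxloc2_sketch (const char *base, size_t n, logical4_sketch back,
                charlen_sketch len)
{
  size_t best = 0;

  for (size_t i = 1; i <= n; i++)
    {
      if (best == 0)
        {
          best = i;
          continue;
        }
      int c = memcmp (base + (i - 1) * len, base + (best - 1) * len, len);
      if (c > 0 || (back && c == 0))
        best = i;
    }
  return (int) best;
}

/* Scalar-mask wrapper: an absent mask (NULL) acts like .true., and
   BACK is forwarded before LEN, matching the callee's parameter order.  */
int
smaxloc2_sketch (const char *base, size_t n, const logical4_sketch *mask,
                 logical4_sketch back, charlen_sketch len)
{
  if (mask == NULL || *mask)
    return maxloc2_sketch (base, n, back, len);
  return 0;
}
```

Testing only `if (mask)` would call the unmasked routine even for a supplied mask whose value is false, and would return 0 whenever the optional mask is omitted.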
There is a bug in _gfortran_s{max,min}loc1_{4,8,16}_s{1,4} which the following testcase shows. The functions return but then crash in the caller. That seems to be because of a buffer overflow: I believe those functions, when the if (mask == NULL || *mask) condition is false, are supposed to fill in the result array with all zeros (or allocate it and fill it with zeros). My understanding is the result array in that case is integer(kind={4,8,16}) and should have the extents the character input array has. The problem is that it uses * string_len in the extent multiplication: extent[n] = GFC_DESCRIPTOR_EXTENT(array,n) * string_len; and extent[n] = GFC_DESCRIPTOR_EXTENT(array,n + 1) * string_len; which is, I guess, fine and desirable for the extents of the character array, but not for the extents of the destination array. Yet the code uses that extent array for that purpose (and no other purposes). Here it uses it to set the dimensions for the case where it needs to allocate (as well as size): for (n = 0; n < rank; n++) { if (n == 0) str = 1; else str = GFC_DESCRIPTOR_STRIDE(retarray,n-1) * extent[n-1]; GFC_DIMENSION_SET(retarray->dim[n], 0, extent[n] - 1, str); } Here it uses it for bounds checking of the destination: if (unlikely (compile_options.bounds_check)) { for (n=0; n < rank; n++) { index_type ret_extent; ret_extent = GFC_DESCRIPTOR_EXTENT(retarray,n); if (extent[n] != ret_extent) runtime_error ("Incorrect extent in return value of" " MAXLOC intrinsic in dimension %ld:" " is %ld, should be %ld", (long int) n + 1, (long int) ret_extent, (long int) extent[n]); } } and here to find out how many retarray elements to actually fill in each dimension: while(1) { *dest = 0; count[0]++; dest += dstride[0]; n = 0; while (count[n] == extent[n]) { /* When we get to the end of a dimension, reset it and increment the next dimension. */ count[n] = 0; /* We could precalculate these products, but this is a less frequently used path so probably not worth it. 
*/ dest -= dstride[n] * extent[n]; Seems maxloc1s.m4 and minloc1s.m4 are the only users of ifunction-s.m4, so we can change SCALAR_ARRAY_FUNCTION in there without breaking anything else. 2025-05-13 Jakub Jelinek <jakub@redhat.com> PR fortran/120191 * m4/ifunction-s.m4 (SCALAR_ARRAY_FUNCTION): Don't multiply GFC_DESCRIPTOR_EXTENT(array,) by string_len. * generated/maxloc1_4_s1.c: Regenerate. * generated/maxloc1_4_s4.c: Regenerate. * generated/maxloc1_8_s1.c: Regenerate. * generated/maxloc1_8_s4.c: Regenerate. * generated/maxloc1_16_s1.c: Regenerate. * generated/maxloc1_16_s4.c: Regenerate. * generated/minloc1_4_s1.c: Regenerate. * generated/minloc1_4_s4.c: Regenerate. * generated/minloc1_8_s1.c: Regenerate. * generated/minloc1_8_s4.c: Regenerate. * generated/minloc1_16_s1.c: Regenerate. * generated/minloc1_16_s4.c: Regenerate. * gfortran.dg/pr120191_3.f90: New test. (cherry picked from commit 781cfc4)
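As a rough illustration of why the string length must not enter the result extents, here is a sketch that derives the destination extents from the source extents with the reduced dimension dropped. `result_extents_sketch` and its flat-array interface are hypothetical and only mirror the shape of the computation, not the descriptor macros used in the library.

```c
#include <stddef.h>

/* Fill EXTENT with the extents of the integer result array: the source
   extents with dimension DIM (0-based) removed.  Returns the number of
   result elements.  The character length is deliberately not a factor:
   each result element is one integer, whatever the string length.  */
size_t
result_extents_sketch (const size_t *src_extent, size_t src_rank,
                       size_t dim, size_t *extent)
{
  size_t total = 1;
  size_t n = 0;

  for (size_t i = 0; i < src_rank; i++)
    {
      if (i == dim)
        continue;
      extent[n++] = src_extent[i];
      total *= src_extent[i];
    }
  return total;
}

/* Zero exactly TOTAL result elements.  */
void
zero_fill_sketch (int *dest, size_t total)
{
  for (size_t i = 0; i < total; i++)
    dest[i] = 0;
}
```

If each extent were scaled by the string length, the zero-fill loop would write past the end of a result buffer that the caller sized for the unscaled extents.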
As mentioned in the PR, _gfortran_{,m,s}findloc2_s{1,4} iterate too many times in the back case if nothing is found. For !back, the loops are for (i = 1; i <= extent; i++) so i is in the body [1, extent] if nothing is found, but for back it is for (i = extent; i >= 0; i--) so i is in the body [0, extent] and compares one element before the start of the array. Note, findloc1_s{1,4} uses for (n = len; n > 0; n--, src -= delta * len_array) for the back loop and for (n = 1; n <= len; n++, src += delta * len_array) for !back. This patch fixes that. The testcase fails under valgrind without the libgfortran changes and succeeds with those. 2025-05-13 Jakub Jelinek <jakub@redhat.com> PR libfortran/120196 * m4/ifindloc2.m4 (header1, header2): For back use i > 0 rather than i >= 0 as for condition. * generated/findloc2_s1.c: Regenerate. * generated/findloc2_s4.c: Regenerate. * gfortran.dg/pr120196.f90: New test. (cherry picked from commit 748a7bc)
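The off-by-one in the back loop can be illustrated with a small sketch over a plain integer array. `findloc_sketch` is a hypothetical simplification (the real routines compare character strings through array descriptors), but the loop bounds follow the description above.

```c
#include <stddef.h>

/* Return the 1-based position of VALUE in A[0..EXTENT-1], or 0 if it is
   not present.  With BACK set, search from the end.  */
ptrdiff_t
findloc_sketch (const int *a, ptrdiff_t extent, int value, int back)
{
  if (back)
    {
      /* i > 0, not i >= 0: with i >= 0 the body would also run for
         i == 0 and read a[-1], one element before the array.  */
      for (ptrdiff_t i = extent; i > 0; i--)
        if (a[i - 1] == value)
          return i;
    }
  else
    {
      for (ptrdiff_t i = 1; i <= extent; i++)
        if (a[i - 1] == value)
          return i;
    }
  return 0;
}
```

Both directions now visit exactly the positions in [1, extent] when nothing is found.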
The UB on the following testcase isn't diagnosed by -fsanitize=address, because we see that the array has a single element and optimize the strlen to 0. I think it is fine to assume e.g. for range purposes the lower bound for the strlen as long as we don't try to optimize strlen (str) where we know that it returns [26, 42] to 26 + strlen (str + 26), but for the upper bound we really want to punt on optimizing that for -fsanitize=address to read all the bytes of the string and diagnose if we run to object end etc. 2024-02-06 Jakub Jelinek <jakub@redhat.com> PR sanitizer/110676 * gimple-fold.cc (gimple_fold_builtin_strlen): For -fsanitize=address reset maxlen to sizetype maximum. * gcc.dg/asan/pr110676.c: New test. (cherry picked from commit d3eac7d)
mark_vtable_entries already has /* It's OK for the vtable to refer to deprecated virtual functions. */ warning_sentinel w(warn_deprecated_decl); but that doesn't cover __attribute__((unavailable)). We can use the following override to cover both. PR c++/116606 gcc/cp/ChangeLog: * decl2.cc (mark_vtable_entries): Temporarily override deprecated_state to UNAVAILABLE_DEPRECATED_SUPPRESS. Remove a warning_sentinel. gcc/testsuite/ChangeLog: * g++.dg/ext/attr-unavailable-13.C: New test. (cherry picked from commit d9d34f9)
…for methods with such attributes [PR116636] On the following testcase, we emit false-positive warnings/errors about using the deprecated or unavailable methods when creating thunks for them, even when nothing (in the testcase so far) actually used those. The following patch temporarily disables those diagnostics when creating the thunks. 2024-09-12 Jakub Jelinek <jakub@redhat.com> PR c++/116636 * method.cc: Include decl.h. (use_thunk): Temporarily change deprecated_state to UNAVAILABLE_DEPRECATED_SUPPRESS. * g++.dg/warn/deprecated-19.C: New test. (cherry picked from commit 4026d89)
The following testcases are miscompiled on s390x-linux, because the doloop_optimize /* Ensure that the new sequence doesn't clobber a register that is live at the end of the block. */ { bitmap modified = BITMAP_ALLOC (NULL); for (rtx_insn *i = doloop_seq; i != NULL; i = NEXT_INSN (i)) note_stores (i, record_reg_sets, modified); basic_block loop_end = desc->out_edge->src; bool fail = bitmap_intersect_p (df_get_live_out (loop_end), modified); check doesn't work as intended. The problem is that it uses df, but the df analysis was only done using iv_analysis_loop_init (loop); -> df_analyze_loop (loop); which computes df only on the bbs inside the loop. While loop_end bb is inside of the loop, df_get_live_out computed that way includes registers set in the loop and used at the start of the next iteration, but doesn't include registers set in the loop (or before the loop) and used after the loop. The following patch fixes that by doing whole function df_analyze first, changes the loop iteration mode from 0 to LI_ONLY_INNERMOST (on many targets which use can_use_doloop_if_innermost target hook and so are known to only handle innermost loops) or LI_FROM_INNERMOST (I think only bfin actually allows non-innermost loops) and checking not just df_get_live_out (loop_end) (that is needed for something used by the next iteration), but also df_get_live_in (desc->out_edge->dest), i.e. what will be used after the loop. df of such a bb shouldn't be affected by the df_analyze_loop and so should be from df_analyze of the whole function. 2024-12-05 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/113994 PR rtl-optimization/116799 * loop-doloop.cc: Include targhooks.h. (doloop_optimize): Also punt on intersection of modified with df_get_live_in (desc->out_edge->dest). (doloop_optimize_loops): Call df_analyze. Use LI_ONLY_INNERMOST or LI_FROM_INNERMOST instead of 0 as second loops_list argument. * gcc.c-torture/execute/pr116799.c: New test. * g++.dg/torture/pr113994.C: New test. 
(cherry picked from commit 0eed816)
The following testcase is miscompiled because of RTL representation of bt{l,q} insn followed by e.g. j{c,nc} being misleading about what it actually does. Let's look e.g. at (define_insn_and_split "*jcc_bt<mode>" [(set (pc) (if_then_else (match_operator 0 "bt_comparison_operator" [(zero_extract:SWI48 (match_operand:SWI48 1 "nonimmediate_operand") (const_int 1) (match_operand:QI 2 "nonmemory_operand")) (const_int 0)]) (label_ref (match_operand 3)) (pc))) (clobber (reg:CC FLAGS_REG))] "(TARGET_USE_BT || optimize_function_for_size_p (cfun)) && (CONST_INT_P (operands[2]) ? (INTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode) && INTVAL (operands[2]) >= (optimize_function_for_size_p (cfun) ? 8 : 32)) : !memory_operand (operands[1], <MODE>mode)) && ix86_pre_reload_split ()" "#" "&& 1" [(set (reg:CCC FLAGS_REG) (compare:CCC (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)) (const_int 0))) (set (pc) (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)]) (label_ref (match_dup 3)) (pc)))] { operands[0] = shallow_copy_rtx (operands[0]); PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0]))); }) The define_insn part in RTL describes exactly what it does, jumps to op3 if bit op2 in op1 is set (for op0 NE) or not set (for op0 EQ). The problem is with what it splits into. put_condition_code %C1 for CCCmode comparisons emits c for EQ and LTU, nc for NE and GEU and ICEs otherwise. CCCmode is used mainly for carry out of add/adc, borrow out of sub/sbb, in those cases e.g. for add we have (set (reg:CCC flags) (compare:CCC (plus:M x y) x)) and use (ltu (reg:CCC flags) (const_int 0)) for carry set and (geu (reg:CCC flags) (const_int 0)) for carry not set. These cases model in RTL what is actually happening, compare in infinite precision x from the result of finite precision addition in M mode and if it is less than unsigned (i.e. overflow happened), carry is set. 
Another use of CCCmode is in UNSPEC_* patterns, those are used with (eq (reg:CCC flags) (const_int 0)) for carry set and ne for unset, given the UNSPEC no big deal, the middle-end doesn't know what set or unset means. But for the bt{l,q}; j{c,nc} case the above splits it into (set (reg:CCC flags) (compare:CCC (zero_extract) (const_int 0))) for bt and (set (pc) (if_then_else (eq (reg:CCC flags) (const_int 0)) (label_ref) (pc))) for the bit set case (so that the jump expands to jc) and ne for the bit not set case (so that the jump expands to jnc). Similarly for the different splitters for cmov and set{c,nc} etc. The problem is that when the middle-end reads this RTL, it sees the exact opposite. If zero_extract is 1, flags is set to comparison of 1 and 0 and that would mean using ne in the if_then_else, and vice versa. So, in order to better describe in RTL what is actually happening, one possibility would be to swap the behavior of put_condition_code and use NE + LTU -> c and EQ + GEU -> nc rather than the current EQ + LTU -> c and NE + GEU -> nc; and adjust everything. The following patch uses a more limited approach, instead of representing bt{l,q}; j{c,nc} case as written above it uses (set (reg:CCC flags) (compare:CCC (const_int 0) (zero_extract))) and (set (pc) (if_then_else (ltu (reg:CCC flags) (const_int 0)) (label_ref) (pc))) which uses the existing put_condition_code but describes what the insns actually do in RTL clearly. If zero_extract is 1, then flags are LTU, 0U < 1U, if zero_extract is 0, then flags are GEU, 0U >= 0U. The patch adjusts the *bt<mode> define_insn and all the splitters to it and its comparisons/conditional moves/setXX. 2025-02-10 Jakub Jelinek <jakub@redhat.com> PR target/118623 * config/i386/i386.md (*bt<mode>): Represent bt as compare:CCC of const0_rtx and zero_extract rather than zero_extract and const0_rtx. (*jcc_bt<mode>): Likewise. Use LTU and GEU as flags test instead of EQ and NE. (*jcc_bt<mode>_1): Likewise. 
(*jcc_bt<mode>_mask): Likewise. (Help combine recognize bt followed by cmov splitter): Likewise. (*bt<mode>_setcqi): Likewise. (*bt<mode>_setncqi): Likewise. (*bt<mode>_setnc<mode>): Likewise. * gcc.c-torture/execute/pr118623.c: New test. (cherry picked from commit 9214201)
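The difference between the two readings can be modelled in plain C. This is only a sketch of how a generic reader would evaluate the comparisons, under the assumption that "carry set" means "the tested bit is 1"; the function names are hypothetical and nothing here is backend code.

```c
#include <stdbool.h>
#include <stdint.h>

/* The bit that bt would copy into the carry flag.  */
unsigned
bt_bit_sketch (uint64_t value, unsigned pos)
{
  return (unsigned) ((value >> pos) & 1u);
}

/* Old form: (eq (compare (zero_extract) (const_int 0)) (const_int 0)).
   Read literally, this is true when the bit is 0, the opposite of jc.  */
bool
old_eq_reading_sketch (unsigned bit)
{
  return bit == 0u;
}

/* New form: (ltu (compare (const_int 0) (zero_extract)) (const_int 0)).
   Read literally, this is 0U < bit, true exactly when the bit is 1.  */
bool
new_ltu_reading_sketch (unsigned bit)
{
  return 0u < bit;
}
```

For a set bit the literal LTU reading agrees with the carry flag, while the literal EQ reading of the old form says the opposite, which is the mismatch the patch removes.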
…lier implicit instantiation [PR113976] Already previously instantiated const variable templates had cp_apply_type_quals_to_decl called when they were instantiated, but if they need runtime initialization, their TREE_READONLY flag has been subsequently cleared. Explicit variable template instantiation calls grokdeclarator which calls cp_apply_type_quals_to_decl on them again, setting TREE_READONLY flag again, but nothing clears it afterwards, so we emit such instantiations into rodata sections and segfault when the dynamic initialization attempts to initialize them. The following patch fixes that by not calling cp_apply_type_quals_to_decl on already instantiated variable declarations. 2024-02-28 Jakub Jelinek <jakub@redhat.com> Patrick Palka <ppalka@redhat.com> PR c++/113976 * decl.cc (grokdeclarator): Don't call cp_apply_type_quals_to_decl on DECL_TEMPLATE_INSTANTIATED VAR_DECLs. * g++.dg/cpp1y/var-templ87.C: New test. (cherry picked from commit 29ac924)
The following testcase ICEs since r15-1579 (addition of late combiner), because *clrmem_short can't be split. The problem is that the define_insn uses (use (match_operand 1 "nonmemory_operand" "n,a,a,a")) (use (match_operand 2 "immediate_operand" "X,R,X,X")) (clobber (match_scratch:P 3 "=X,X,X,&a")) and define_split assumed that if operands[1] is const_int_operand, match_scratch will be always scratch, and it will be reg only if it was the last alternative where operands[1] is a reg. The pattern doesn't guarantee it though, of course RA will not try to uselessly assign a reg there if it is not needed, but during RA on the testcase below we match the last alternative, but then comes late combiner and propagates const_int 3 into operands[1]. And that matches fine, match_scratch matches either scratch or reg and the constraint in that case is X for the first variant, so still just fine. But we won't split that because the splitters only expect scratch. The following patch fixes it by using match_scratch instead of scratch, so that it accepts either. 2025-04-17 Jakub Jelinek <jakub@redhat.com> PR target/119834 * config/s390/s390.md (define_split after *cpymem_short): Use (clobber (match_scratch N)) instead of (clobber (scratch)). Use (match_dup 4) and operands[4] instead of (match_dup 3) and operands[3] in the last of those. (define_split after *clrmem_short): Use (clobber (match_scratch N)) instead of (clobber (scratch)). (define_split after *cmpmem_short): Likewise. * g++.target/s390/pr119834.C: New test. (cherry picked from commit 22fe83d)
This got broken with r13-9727 and fixed with either of r13-9729 or r13-9728. 2025-05-30 Jakub Jelinek <jakub@redhat.com> PR target/120480 * gcc.dg/pr120480.c: New test. (cherry picked from commit c13d5b9)
Support for Apple Silicon!!!