Conversation

@taskset commented Sep 6, 2018

make gate_tm_init more robust and avoid this bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87162

@jwakely (Contributor) commented Oct 2, 2018

This is an unofficial mirror that nobody from the GCC project is involved with. Sending pull requests here is a waste of time.

Please see https://gcc.gnu.org/contribute.html for how to contribute to GCC, thanks.

@skeetor commented Feb 17, 2019

So what's the point of this mirror?

@jwakely (Contributor) commented Mar 8, 2019

I don't know. Somebody thought it would be useful to have a copy of GCC mirrored on GitHub.

nstester pushed a commit to nstester/gcc that referenced this pull request May 18, 2023
Hi all,

We noticed that calls to the vadcq and vsbcq intrinsics, both of
which use __builtin_arm_set_fpscr_nzcvqc to set the Carry flag in
the FPSCR, would produce the following code:

```
< r2 is the *carry input >
vmrs	r3, FPSCR_nzcvqc
bic	r3, r3, #536870912
orr	r3, r3, r2, lsl #29
vmsr	FPSCR_nzcvqc, r3
```

whereas the MVE ACLE instead specifies a different instruction sequence:
```
< Rt is the *carry input >
VMRS Rs,FPSCR_nzcvqc
BFI Rs,Rt,#29,#1
VMSR FPSCR_nzcvqc,Rs
```

The bic + orr pair is slower, and it is also wrong: if the *carry
input is greater than 1, we risk overwriting the top two bits of the
FPSCR register (the N and Z flags).

This turned out to be a problem in the header file, and the solution was
simply to add a `& 0x1u` mask to the `*carry` input: the compiler then knows
that we only care about the lowest bit and can optimise to a BFI.
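
Roughly speaking (this is only an illustrative sketch in plain C, not the
actual arm_mve.h code, and the function names are made up), the before/after
difference is:

```c
/* Hypothetical sketch: writing a carry value into bit 29 of an
   FPSCR_nzcvqc-like word.  */
static inline unsigned int
set_carry_unmasked (unsigned int fpscr, unsigned int carry)
{
  /* Without a mask the compiler must assume carry may be > 1, so it
     clears bit 29 (bic) and ORs in carry << 29, which can spill into
     bits 30-31 (the Z and N flags).  */
  return (fpscr & ~(1u << 29)) | (carry << 29);
}

static inline unsigned int
set_carry_masked (unsigned int fpscr, unsigned int carry)
{
  /* With the `& 0x1u` mask only the low bit of carry can reach bit 29,
     so the compiler is free to emit a single BFI into bit 29.  */
  return (fpscr & ~(1u << 29)) | ((carry & 0x1u) << 29);
}
```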

Ok for trunk?

Thanks,
Stam Markianos-Wright

gcc/ChangeLog:

	* config/arm/arm_mve.h (__arm_vadcq_s32): Fix arithmetic.
	(__arm_vadcq_u32): Likewise.
	(__arm_vadcq_m_s32): Likewise.
	(__arm_vadcq_m_u32): Likewise.
	(__arm_vsbcq_s32): Likewise.
	(__arm_vsbcq_u32): Likewise.
	(__arm_vsbcq_m_s32): Likewise.
	(__arm_vsbcq_m_u32): Likewise.
	* config/arm/mve.md (get_fpscr_nzcvqc): Make unspec_volatile.

gcc/testsuite/ChangeLog:
	* gcc.target/arm/mve/mve_vadcq_vsbcq_fpscr_overwrite.c: New.
kraj pushed a commit to kraj/gcc that referenced this pull request May 18, 2023

(cherry picked from commit f1417d051be094ffbce228e11951f3e12e8fca1c)
kraj pushed a commit to kraj/gcc that referenced this pull request May 18, 2023
tschwinge added a commit to tschwinge/gcc that referenced this pull request Sep 4, 2023
For nvptx offloading, it'll FAIL its execution test until nvptx-tools is updated
to include commit 1b5946d78ef5dcfb640e9f545a7c791b7f623911
"Merge commit '26095fd01232061de9f79decb3e8222ef7b46191' into HEAD [#29]",
<SourceryTools/nvptx-tools@1b5946d>.

	libgomp/
	* testsuite/libgomp.c-c++-common/pr100059-1.c: New.

Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
cooljeanius referenced this pull request in cooljeanius/gcc Sep 20, 2024
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
cooljeanius referenced this pull request in cooljeanius/gcc Sep 20, 2024
Fix code scanning alert #29: Incorrect conversion between integer types
iains referenced this pull request in NinaRanns/gcc Oct 28, 2024
hubot pushed a commit that referenced this pull request Nov 8, 2024
Update test case for armv8.1-m.main, which supports conditional
arithmetic.

armv7-m:
        push    {r4, lr}
        ldr     r4, .L6
        ldr     r4, [r4]
        lsls    r4, r4, #29
        it      mi
        addmi   r2, r2, #1
        bl      bar
        movs    r0, #0
        pop     {r4, pc}

armv8.1-m.main:
        push    {r3, r4, r5, lr}
        ldr     r4, .L5
        ldr     r5, [r4]
        tst     r5, #4
        csinc   r2, r2, r2, eq
        bl      bar
        movs    r0, #0
        pop     {r3, r4, r5, pc}
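
For context, a C function of roughly this shape would produce the sequences
above (the names glob, bar and foo are hypothetical, not the actual
epilog-1.c contents):

```c
extern int glob;
extern void bar (int, int, int);

int
foo (int a, int b, int c)
{
  if (glob & 4)   /* armv7-m: lsls + it/addmi; armv8.1-m.main: tst + csinc */
    c++;
  bar (a, b, c);
  return 0;
}
```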

gcc/testsuite/ChangeLog:

	* gcc.target/arm/epilog-1.c: Use check-function-bodies.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
hubot pushed a commit that referenced this pull request Nov 8, 2024
(cherry picked from commit ec86e87)
hubot pushed a commit that referenced this pull request Oct 15, 2025
The vadcq and vsbcq patterns had two problems:
- the adc / sbc part of the pattern did not mention the use of vfpcc
- the carry calculation part should use a different unspec code

In addition, the get_fpscr_nzcvqc and set_fpscr_nzcvqc patterns were
over-cautious in using unspec_volatile when unspec is really what they
need.  Making them unspec makes it possible to remove redundant
accesses to FPSCR_nzcvqc.

With unspec_volatile, we used to generate:
test_2:
	@ args = 0, pretend = 0, frame = 8
	@ frame_needed = 0, uses_anonymous_args = 0
	vmov.i32	q0, #0x1  @ v4si
	push	{lr}
	sub	sp, sp, #12
	vmrs	r3, FPSCR_nzcvqc    ;; [1]
	bic	r3, r3, #536870912
	vmsr	FPSCR_nzcvqc, r3
	vadc.i32	q3, q0, q0
	vmrs	r3, FPSCR_nzcvqc     ;; [2]
	vmrs	r3, FPSCR_nzcvqc
	orr	r3, r3, #536870912
	vmsr	FPSCR_nzcvqc, r3
	vadc.i32	q0, q0, q0
	vmrs	r3, FPSCR_nzcvqc
	ldr	r0, .L8
	ubfx	r3, r3, #29, #1
	str	r3, [sp, #4]
	bl	print_uint32x4_t
	add	sp, sp, #12
	@ sp needed
	pop	{pc}
.L9:
	.align	2
.L8:
	.word	.LC1

With unspec, we generate:
test_2:
	@ args = 0, pretend = 0, frame = 8
	@ frame_needed = 0, uses_anonymous_args = 0
	vmrs	r3, FPSCR_nzcvqc     ;; [1]
	bic	r3, r3, #536870912   ;; [3]
	vmov.i32	q0, #0x1  @ v4si
	vmsr	FPSCR_nzcvqc, r3
	vadc.i32	q3, q0, q0
	vmrs	r3, FPSCR_nzcvqc
	orr	r3, r3, #536870912
	vmsr	FPSCR_nzcvqc, r3
	vadc.i32	q0, q0, q0
	vmrs	r3, FPSCR_nzcvqc
	push	{lr}
	ubfx	r3, r3, #29, #1
	sub	sp, sp, #12
	ldr	r0, .L8
	str	r3, [sp, #4]
	bl	print_uint32x4_t
	add	sp, sp, #12
	@ sp needed
	pop	{pc}
.L9:
	.align	2
.L8:
	.word	.LC1

That is, unspec in get_fpscr_nzcvqc makes it possible to:
- move [1] earlier
- delete the redundant [2]

and unspec in set_fpscr_nzcvqc makes it possible to move push {lr} and
the stack manipulation later.
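
The RTL-level reasoning parallels how volatile works in C.  As a hedged
analogy in plain C (this is not the mve.md pattern itself; status and
status_v are made-up names): a volatile read cannot be merged with an
identical neighbouring read or moved freely, while an ordinary read can,
which is the same freedom the unspec_volatile -> unspec change gives the
RTL optimizers for reads of FPSCR_nzcvqc.

```c
extern unsigned int status;             /* ordinary object  */
extern volatile unsigned int status_v;  /* volatile object  */

unsigned int
read_twice (void)
{
  /* The two loads of 'status' may be CSEd into a single load; the two
     loads of 'status_v' must both be performed and stay in place.  */
  return (status + status) + (status_v + status_v);
}
```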

gcc/ChangeLog:

	PR target/122189
	* config/arm/iterators.md (VxCIQ_carry, VxCIQ_M_carry, VxCQ_carry)
	(VxCQ_M_carry): New iterators.
	* config/arm/mve.md (get_fpscr_nzcvqc, set_fpscr_nzcvqc): Use
	unspec instead of unspec_volatile.
	(vadciq, vadciq_m, vadcq, vadcq_m): Use vfpcc in operation.  Use a
	different unspec code for carry calculation.
	* config/arm/unspecs.md (VADCQ_U_carry, VADCQ_M_U_carry)
	(VADCQ_S_carry, VADCQ_M_S_carry, VSBCIQ_U_carry, VSBCIQ_S_carry)
	(VSBCIQ_M_U_carry, VSBCIQ_M_S_carry, VSBCQ_U_carry, VSBCQ_S_carry)
	(VSBCQ_M_U_carry, VSBCQ_M_S_carry, VADCIQ_U_carry)
	(VADCIQ_M_U_carry, VADCIQ_S_carry, VADCIQ_M_S_carry): New unspec
	codes.

gcc/testsuite/ChangeLog:

	PR target/122189
	* gcc.target/arm/mve/intrinsics/vadcq_m_s32.c: Adjust instruction
	order.
	order.
	* gcc.target/arm/mve/intrinsics/vadcq_m_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c: Likewise.
hubot pushed a commit that referenced this pull request Nov 12, 2025

	(cherry picked from commits
	0272058 and
	697ccad)