Work in Progress: edit build system build MPI #24

Closed
wants to merge 61 commits into from

Conversation

rouson
@rouson rouson commented Mar 30, 2018

No description provided.

Alessandro Fanfarillo and others added 30 commits August 2, 2017 13:26
update teams branch from upstream
This eliminates the downloading of the two gfortran prerequisites:
1. MPICH
2. OpenCoarrays
Merge upstream gcc-mirror/gcc into sourceryinstitute/gcc
Merge sourceryinstitute/master into sourceryinstitute/teams
Merge master into download-opencoarrays-mpich
'sourceryinstitute/master' into download-opencoarrays-mpich
'upstream/master' into download-opencoarrays-mpich
This reverts commit 71c41bf, reversing
changes made to 7204ca4.
Revert "Merge pull request #8 from sourceryinstitute/teams"

This reverts commit 71c41bf, reversing
changes made to 7204ca4.
Revert "Merge pull request #9 from sourceryinstitute/download-opencoarrays-mpich"

This reverts commit 458897c, reversing
changes made to 71c41bf.
Change one error message in contrib/download_prerequisites
scrasmussen and others added 24 commits November 11, 2017 18:50
…pAndFormatting

Scrasmussen/teams cleanup and formatting
This reverts commit 96ce792, reversing
changes made to afff591.
Merge upstream GCC gcc-mirror master into sourceryinstitute/gcc master
…n in the source tree if fortran is one of the enabled languages. The default library is MPICH; it should work with other MPI implementations. Untested in this implementation.
* Replace Fortran 2015 -> Fortran 2018
* Update status of teams 
* Fix typos
…y hardcoded). Placeholder locations for its libs and includes are now passed on to the GCC Makefile thanks to modifications in configure.ac, Makefile.def and Makefile.tpl. It is still unclear how these work, i.e. they create folders for some libraries but not others, but MPICH is now a host and build target. I need to make sure I'm doing things pseudo-properly by examining more closely what is done for other libraries. After this I have to find out where MPICH keeps its includes and libs before jumping into the time-consuming task of building gcc.
…lt. It should be made a build target rather than a dependency for gfortran. Let gfortran build, then build MPICH, making sure to send the gfortran flag. There is a way to do that, I just don't know how yet.
…h old code but want to keep this version for comparison.
@rouson rouson closed this Mar 30, 2018
@rouson rouson deleted the build-mpich branch March 30, 2018 18:11
kraj pushed a commit to kraj/gcc that referenced this pull request Nov 2, 2020
Enable thumb1_gen_const_int to generate RTL or asm depending on the
context, so that we avoid duplicating code to handle constants in
Thumb-1 with -mpure-code.

Use a template so that the algorithm is effectively shared, and
rely on two classes to handle the actual emission as RTL or asm.

The generated sequence is improved to handle right-shiftable and small
values with fewer instructions. We now generate:

128:
        movs    r0, r0, #128
264:
        movs    r3, #33
        lsls    r3, #3
510:
        movs    r3, #255
        lsls    r3, #1
512:
        movs    r3, #1
        lsls    r3, #9
764:
        movs    r3, #191
        lsls    r3, #2
65536:
        movs    r3, #1
        lsls    r3, #16
0x123456:
        movs    r3, #18 ;0x12
        lsls    r3, #8
        adds    r3, #52 ;0x34
        lsls    r3, #8
        adds    r3, #86 ;0x56
0x1123456:
        movs    r3, #137 ;0x89
        lsls    r3, #8
        adds    r3, #26 ;0x1a
        lsls    r3, #8
        adds    r3, #43 ;0x2b
        lsls    r3, #1
0x1000010:
        movs    r3, #16
        lsls    r3, #16
        adds    r3, #1
        lsls    r3, #4
0x1000011:
        movs    r3, #1
        lsls    r3, #24
        adds    r3, #17
-8192:
	movs	r3, #1
	lsls	r3, #13
	rsbs	r3, #0

The patch adds a testcase which does not fully exercise
thumb1_gen_const_int, as other existing patterns already catch small
constants.  These parts of thumb1_gen_const_int are used by
arm_thumb1_mi_thunk.
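
As a rough illustration of the "one algorithm, two emission backends" idea described in this commit message, here is a minimal C sketch. It is not GCC's code: the commit uses a C++ template with two classes, whereas this sketch swaps in a table of function pointers, and every name in it (`emitter`, `gen_const_bytewise`, `eval_emitter`, `text_emitter`, `asm_buf`) is hypothetical. The builder is a naive byte-at-a-time scheme, assumed here only for illustration; it does not implement the improved handling of right-shiftable and small values.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical emitter interface: one callback per instruction kind. */
typedef struct {
    void (*mov)(void *ctx, uint32_t imm);  /* movs r3, #imm */
    void (*lsl)(void *ctx, uint32_t imm);  /* lsls r3, #imm */
    void (*add)(void *ctx, uint32_t imm);  /* adds r3, #imm */
} emitter;

/* Naive byte-at-a-time builder (an assumption, not GCC's algorithm).
   Returns the number of instructions emitted. */
int gen_const_bytewise(uint32_t value, const emitter *e, void *ctx)
{
    int count = 0, started = 0;
    for (int shift = 24; shift >= 0; shift -= 8) {
        uint32_t byte = (value >> shift) & 0xffu;
        if (!started) {
            if (byte == 0 && shift != 0)
                continue;              /* skip leading zero bytes */
            e->mov(ctx, byte);
            count++;
            started = 1;
        } else {
            e->lsl(ctx, 8);            /* make room for the next byte */
            count++;
            if (byte != 0) {
                e->add(ctx, byte);
                count++;
            }
        }
    }
    return count;
}

/* Backend 1: evaluate the sequence on a simulated 32-bit register. */
static void eval_mov(void *ctx, uint32_t imm) { *(uint32_t *)ctx = imm; }
static void eval_lsl(void *ctx, uint32_t imm) { *(uint32_t *)ctx <<= imm; }
static void eval_add(void *ctx, uint32_t imm) { *(uint32_t *)ctx += imm; }
const emitter eval_emitter = { eval_mov, eval_lsl, eval_add };

/* Backend 2: append assembly text to a fixed-size, zero-initialised buffer. */
typedef struct { char text[256]; } asm_buf;

static void text_append(asm_buf *b, const char *mnemonic, uint32_t imm)
{
    size_t len = strlen(b->text);
    snprintf(b->text + len, sizeof b->text - len,
             "%s r3, #%u\n", mnemonic, (unsigned)imm);
}
static void text_mov(void *ctx, uint32_t imm) { text_append(ctx, "movs", imm); }
static void text_lsl(void *ctx, uint32_t imm) { text_append(ctx, "lsls", imm); }
static void text_add(void *ctx, uint32_t imm) { text_append(ctx, "adds", imm); }
const emitter text_emitter = { text_mov, text_lsl, text_add };
```

Calling `gen_const_bytewise(0x123456, &text_emitter, &buf)` on a zeroed `asm_buf` yields the same five instructions as the 0x123456 listing above, and running the same builder with `eval_emitter` reconstructs the value. For 0x1000011 the naive scheme needs five instructions where the listing above needs three, which is the kind of case the improved sequence targets.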

2020-11-02  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm.c (thumb1_const_rtl, thumb1_const_print): New
	classes.
	(thumb1_gen_const_int): Rename to ...
	(thumb1_gen_const_int_1): ... New helper function. Add capability
	to emit either RTL or asm, improve generated code.
	(thumb1_gen_const_int_rtl): New function.
	* config/arm/arm-protos.h (thumb1_gen_const_int): Rename to
	thumb1_gen_const_int_rtl.
	* config/arm/thumb1.md: Call thumb1_gen_const_int_rtl instead
	of thumb1_gen_const_int.

	gcc/testsuite/
	* gcc.target/arm/pure-code/no-literal-pool-m0.c: New.
aurxenon pushed a commit to aurxenon/gcc that referenced this pull request Sep 11, 2023
…butes

Implement weak and alias function attributes
hubot pushed a commit that referenced this pull request May 12, 2024
Examining the code generated for the following C snippet on a
raspberry pi:

int popcount_lut8(unsigned *buf, int n)
{
  int cnt=0;
  unsigned int i;
  do {
    i = *buf;
    cnt += lut[i&255];
    cnt += lut[i>>8&255];
    cnt += lut[i>>16&255];
    cnt += lut[i>>24];
    buf++;
  } while(--n);
  return cnt;
}

I was surprised to see the following instruction sequence generated by the
compiler:

  mov    r5, r2, lsr #8
  uxtb   r5, r5

This sequence can be performed by a single ARM instruction:

  uxtb   r5, r2, ror #8

The attached patch allows GCC's combine pass to take advantage of ARM's
uxtb with rotate functionality to implement the above zero_extract, and
likewise to use the sxtb with rotate to implement sign_extract.  ARM's
uxtb and sxtb can only be used with rotates of 0, 8, 16 and 24, and of
these only the 8 and 16 are useful [ror #0 is a nop, and extends with
ror #24 can be implemented using regular shifts],  so the approach here
is to add the six missing but useful instructions as 6 different
define_insn in arm.md, rather than try to be clever with new predicates.
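
The equivalence this patch relies on can be checked in plain C. The sketch below models the extend-with-rotate instructions as ordinary functions; the helper names (`ror32`, `shift_then_uxtb`, `uxtb_ror`, `sxtb_ror`, `uxth_ror`) are made up for illustration and are not GCC or ARM identifiers. It covers only the arithmetic, not the combine pass or the `define_insn` patterns.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n taken modulo 32). */
uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;
}

/* Two-instruction form: mov rD, rS, lsr #n ; uxtb rD, rD */
uint32_t shift_then_uxtb(uint32_t x, unsigned n)
{
    return (x >> n) & 0xffu;
}

/* One-instruction form: uxtb rD, rS, ror #n */
uint32_t uxtb_ror(uint32_t x, unsigned n)
{
    return ror32(x, n) & 0xffu;
}

/* sxtb rD, rS, ror #n: sign-extend the selected byte. */
int32_t sxtb_ror(uint32_t x, unsigned n)
{
    uint32_t b = ror32(x, n) & 0xffu;
    return (int32_t)(b ^ 0x80u) - 0x80;   /* portable 8-bit sign extension */
}

/* uxth rD, rS, ror #n: zero-extend the selected halfword. */
uint32_t uxth_ror(uint32_t x, unsigned n)
{
    return ror32(x, n) & 0xffffu;
}
```

For rotations of 8 and 16 the byte forms agree with shift-then-extend, because the bits that the rotate wraps around land above bit 7 and are masked off. The same holds for the halfword form with a rotation of 8.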

Later ARM hardware has advanced bit field instructions, and earlier
ARM cores didn't support extend-with-rotate, so this appears to only
benefit armv6 era CPUs (e.g. the raspberry pi).

Patch posted:
https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01339.html
Approved by Kyrill Tkachov:
https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01881.html

2024-05-12  Roger Sayle  <roger@nextmovesoftware.com>
	    Kyrill Tkachov  <kyrylo.tkachov@foss.arm.com>

	* config/arm/arm.md (*arm_zeroextractsi2_8_8, *arm_signextractsi2_8_8,
	*arm_zeroextractsi2_8_16, *arm_signextractsi2_8_16,
	*arm_zeroextractsi2_16_8, *arm_signextractsi2_16_8): New.

2024-05-12  Roger Sayle  <roger@nextmovesoftware.com>
	    Kyrill Tkachov  <kyrylo.tkachov@foss.arm.com>

	* gcc.target/arm/extend-ror.c: New test.
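
The `popcount_lut8` snippet in the commit message above uses a `lut` array that the message does not define. A self-contained version might look like the sketch below, assuming `lut` is a 256-entry table of per-byte bit counts and that `unsigned` is 32 bits wide; the `init_lut` helper is hypothetical and not part of the patch.

```c
/* Assumed definition: lut[b] holds the number of set bits in byte b. */
static unsigned char lut[256];

/* Hypothetical helper: fill the table using popcount(i) = popcount(i/2) + (i & 1). */
void init_lut(void)
{
    lut[0] = 0;
    for (int i = 1; i < 256; i++)
        lut[i] = (unsigned char)(lut[i >> 1] + (i & 1));
}

/* Same loop as in the commit message; requires n >= 1 and init_lut() first. */
int popcount_lut8(unsigned *buf, int n)
{
    int cnt = 0;
    unsigned int i;
    do {
        i = *buf;
        cnt += lut[i & 255];
        cnt += lut[i >> 8 & 255];    /* byte extracts the patch targets */
        cnt += lut[i >> 16 & 255];
        cnt += lut[i >> 24];
        buf++;
    } while (--n);
    return cnt;
}
```

The two middle lookups, `i >> 8 & 255` and `i >> 16 & 255`, are the byte extracts that the commit message says can become a single `uxtb` with `ror #8` or `ror #16`.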
hubot pushed a commit that referenced this pull request Mar 3, 2025
The code in gcc.target/arm/unsigned-extend-1.c really should not need any
unsigned extension operations when the optimizers are used.  For Arm
and thumb2 that is indeed the case, but for thumb1 code it gets more
complicated as there are too many instructions for combine to look at.
For thumb1 we end up with two redundant zero_extend patterns which are
not removed: the first after the subtract instruction and the second of
the final boolean result.

We can partially fix this (for the second case above) by adding a new
split pattern for LEU and GEU patterns which work because the two
instructions for the [LG]EU pattern plus the redundant extension
instruction are combined into a single insn, which we can then split
using the 3->2 method back into the two insns of the [LG]EU sequence.

Because we're missing the optimization for all thumb1 cases (not just
those architectures with UXTB), I've adjusted the testcase to detect all
the idioms that we might use for zero-extending a value, namely:

       UXTB
       AND ...#255 (in thumb1 this would require a register to hold 255)
       LSL ... #24; LSR ... #24

but I've also marked this test as XFAIL for thumb1 because we can't yet
eliminate the first of the two extend instructions.
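
The three idioms listed above all compute the same 8-bit zero-extension, which is why the testcase has to scan for each of them. A minimal C sketch of the equivalence follows; the function names are hypothetical and the mapping to instructions in the comments is only indicative of what a compiler might choose.

```c
#include <stdint.h>

/* UXTB: zero-extend the low byte. */
uint32_t zext_uxtb(uint32_t x)
{
    return (uint8_t)x;
}

/* AND ...#255: mask with 255 (thumb1 needs the constant in a register). */
uint32_t zext_and(uint32_t x)
{
    return x & 255u;
}

/* LSL ... #24; LSR ... #24: shift the low byte up, then back down. */
uint32_t zext_shifts(uint32_t x)
{
    return (uint32_t)(x << 24) >> 24;
}
```

All three return 0x78 for an input of 0x12345678, so a test that looks for only one of these forms could miss a redundant extension expressed as another.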

gcc/
	* config/arm/thumb1.md (split patterns for GEU and LEU): New.

gcc/testsuite:
	* gcc.target/arm/unsigned-extend-1.c: Expand check for any
	insn suggesting a zero-extend.  XFAIL for thumb1 code.
4 participants