mirrored from git://gcc.gnu.org/git/gcc.git
-
Notifications
You must be signed in to change notification settings - Fork 4.6k
[HSABE] indexing of program scope functions #15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
pjaaskel
wants to merge
6
commits into
gcc-mirror:master
Choose a base branch
from
parmance:hsa_function_index
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Previously the binary image index contained only kernels while program scope functions are needed for indirect call implementations: The HSA Runtime API can be used to query their addresses. This patch mainly renames the corresponding structs and APIs that have 'kernel' in their name to 'function' and adds a separate flag to the function_index (ex. kernel_index) for separating host-callable kernels from program scope functions (of which addresses can be queried by the HSA runtime).
|
@pjaaskel please see - #5 (comment) |
Contributor
Author
|
@agauniyal yep, I know, thanks. Just wanted to open a PR to get the Github's review interface available. |
kraj
pushed a commit
to kraj/gcc
that referenced
this pull request
Oct 12, 2020
Prevents the following UBSAN error:
./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/torture/pr49770.C -O2 -c
/home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 2, which is not a valid value for type 'bool'
#0 0x1fdb4d1 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482
#1 0x1fcadaa in merge_call_side_effects(modref_summary*, gimple*, modref_summary*, bool) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:511
gcc-mirror#2 0x1fcbadd in analyze_call /home/marxin/Programming/gcc2/gcc/ipa-modref.c:642
gcc-mirror#3 0x1fcc061 in analyze_stmt /home/marxin/Programming/gcc2/gcc/ipa-modref.c:732
gcc-mirror#4 0x1fccf31 in analyze_function /home/marxin/Programming/gcc2/gcc/ipa-modref.c:823
gcc-mirror#5 0x1fd17e5 in execute /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1441
gcc-mirror#6 0x25cca6e in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509
gcc-mirror#7 0x25cd39b in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2597
gcc-mirror#8 0x25cd450 in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2598
gcc-mirror#9 0x25cd4ee in execute_pass_list(function*, opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2608
gcc-mirror#10 0x25c7a5a in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/Programming/gcc2/gcc/passes.c:1726
gcc-mirror#11 0x25cfa3f in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2941
gcc-mirror#12 0x173572d in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2642
gcc-mirror#13 0x17364ee in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777
gcc-mirror#14 0x17372d9 in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022
gcc-mirror#15 0x2a1f00a in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485
gcc-mirror#16 0x2a27dc8 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321
gcc-mirror#17 0x2a283cc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460
gcc-mirror#18 0x54f21cd in main /home/marxin/Programming/gcc2/gcc/main.c:39
gcc-mirror#19 0x7ffff6f0de09 in __libc_start_main ../csu/libc-start.c:314
gcc-mirror#20 0x9eac09 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1plus+0x9eac09)
gcc/ChangeLog:
* ipa-modref.c (merge_call_side_effects): Clear modref_parm_map
fields in the vector.
kraj
pushed a commit
to kraj/gcc
that referenced
this pull request
Oct 19, 2020
It fixes:
/home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 255, which is not a valid value for type 'bool'
#0 0x18e5df3 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482
#1 0x18dc180 in ipa_merge_modref_summary_after_inlining(cgraph_edge*) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1779
gcc-mirror#2 0x18c1c72 in inline_call(cgraph_edge*, bool, vec<cgraph_edge*, va_heap, vl_ptr>*, int*, bool, bool*) /home/marxin/Programming/gcc2/gcc/ipa-inline-transform.c:492
gcc-mirror#3 0x4a3589c in inline_small_functions /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2216
gcc-mirror#4 0x4a3b230 in ipa_inline /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2697
gcc-mirror#5 0x4a3d902 in execute /home/marxin/Programming/gcc2/gcc/ipa-inline.c:3096
gcc-mirror#6 0x1edf831 in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509
gcc-mirror#7 0x1ee26af in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2936
gcc-mirror#8 0x103f31b in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2700
gcc-mirror#9 0x103fb40 in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777
gcc-mirror#10 0x104092b in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022
gcc-mirror#11 0x235723b in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485
gcc-mirror#12 0x235fff9 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321
gcc-mirror#13 0x23605fc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460
gcc-mirror#14 0x4e2b93b in main /home/marxin/Programming/gcc2/gcc/main.c:39
gcc-mirror#15 0x7ffff6f0ae09 in __libc_start_main ../csu/libc-start.c:314
gcc-mirror#16 0x9a0be9 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1+0x9a0be9)
gcc/ChangeLog:
* ipa-modref.c (compute_parm_map): Clear vector.
nstester
pushed a commit
to nstester/gcc
that referenced
this pull request
Jun 14, 2021
The fixed error is:
==21166==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs operator delete) on 0x60300000d900
#0 0x7367d7 in operator delete(void*, unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:172
#1 0x3b82e6e in pointer_equiv_analyzer::~pointer_equiv_analyzer() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:161
#2 0x3b83387 in hybrid_folder::~hybrid_folder() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:517
#3 0x3b83387 in execute_early_vrp /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:686
#4 0x1790611 in execute_one_pass(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2567
gcc-mirror#5 0x1792003 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2656
gcc-mirror#6 0x1792029 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2657
gcc-mirror#7 0x179209f in execute_pass_list(function*, opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2667
gcc-mirror#8 0x178a5f3 in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:1773
gcc-mirror#9 0x1792fac in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/plugin.h:191
gcc-mirror#10 0x1792fac in execute_ipa_pass_list(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:3001
gcc-mirror#11 0xc525fc in ipa_passes /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2154
gcc-mirror#12 0xc525fc in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2289
gcc-mirror#13 0xc5a096 in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2269
gcc-mirror#14 0xc5a096 in symbol_table::finalize_compilation_unit() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2537
gcc-mirror#15 0x1a7a17c in compile_file /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:482
gcc-mirror#16 0x69c758 in do_compile /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2210
gcc-mirror#17 0x69c758 in toplev::main(int, char**) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2349
gcc-mirror#18 0x6a932a in main /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/main.c:39
gcc-mirror#19 0x7ffff7820b34 in __libc_start_main ../csu/libc-start.c:332
gcc-mirror#20 0x6aa5fd in _start (/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/objdir/gcc/cc1+0x6aa5fd)
0x60300000d900 is located 0 bytes inside of 32-byte region [0x60300000d900,0x60300000d920)
allocated by thread T0 here:
#0 0x735ab7 in operator new[](unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:102
#1 0x3b82dac in pointer_equiv_analyzer::pointer_equiv_analyzer(gimple_ranger*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:156
gcc/ChangeLog:
* gimple-ssa-evrp.c (pointer_equiv_analyzer::~pointer_equiv_analyzer): Use delete[].
kraj
pushed a commit
to kraj/gcc
that referenced
this pull request
Jun 8, 2022
It will crash when compiling but its the code setup i want to get in sooner than later. Addresses gcc-mirror#15
nstester
pushed a commit
to nstester/gcc
that referenced
this pull request
Apr 29, 2023
This patch adds support for xstormy16's swap nibbles instruction (swpn).
For the test case:
short foo(short x) {
return (x&0xff00) | ((x<<4)&0xf0) | ((x>>4)&0x0f);
}
GCC with -O2 currently generates the nine instruction sequence:
foo: mov r7,r2
asr r2,#4
and r2,gcc-mirror#15
mov.w r6,#-256
and r6,r7
or r2,r6
shl r7,#4
and r7,#255
or r2,r7
ret
with this patch, we now generate:
foo: swpn r2
ret
To achieve this using combine's four instruction "combinations" requires
a little wizardry. Firstly, define_insn_and_split are introduced to
treat logical shifts followed by bitwise-AND as macro instructions that
are split after reload. This is sufficient to recognize a QImode
nibble swap, which can be implemented by swpn followed by either a
zero-extension or a sign-extension from QImode to HImode. Then finally,
in the correct context, a QImode swap-nibbles pattern can be combined to
preserve the high-byte of a HImode word, matching the xstormy16's swpn
semantics. The naming of the new code iterators is taken from i386.md.
2023-04-29 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/stormy16/stormy16.md (any_lshift): New code iterator.
(any_or_plus): Likewise.
(any_rotate): Likewise.
(*<any_lshift>_and_internal): New define_insn_and_split to
recognize a logical shift followed by an AND, and split it
again after reload.
(*swpn): New define_insn matching xstormy16's swpn.
(*swpn_zext): New define_insn recognizing swpn followed by
zero_extendqihi2, i.e. with the high byte set to zero.
(*swpn_sext): Likewise, for swpn followed by cbw.
(*swpn_sext_2): Likewise, for an alternate RTL form.
(*swpn_zext_ior): A pre-reload splitter so that an swpn+zext+ior
sequence is split in the correct place to recognize the *swpn_zext
followed by any_or_plus (ior, xor or plus) instruction.
gcc/testsuite/ChangeLog
* gcc.target/xstormy16/swpn-1.c: New QImode test case.
* gcc.target/xstormy16/swpn-2.c: New zero_extend test case.
* gcc.target/xstormy16/swpn-3.c: New sign_extend test case.
* gcc.target/xstormy16/swpn-4.c: New HImode test case.
nstester
pushed a commit
to nstester/gcc
that referenced
this pull request
Apr 29, 2023
This patch contains some minor tweak to xstormy16's machine description
most significantly providing a pattern for HImode rotate left by a single
bit that requires only two instructions.
unsigned short foo(unsigned short x)
{
return (x << 1) | (x >> 15);
}
currently with -O2 generates:
foo: mov r7,r2
shr r7,gcc-mirror#15
shl r2,#1
or r2,r7
ret
with this patch, GCC now generates:
foo: shl r2,#1 | adc r2,#0
ret
Additionally neghi2 is converted to a define_insn (so that the RTL
optimizers see the negation semantics), and HImode rotations by
8-bits can now be recognized and implemented using swpb.
2023-04-29 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/stormy16/stormy16.md (neghi2): Convert from a define_expand
to a define_insn.
(*rotatehi_1): New define_insn for efficient 2 insn sequence.
(*rotatehi_8, *rotaterthi_8): New define_insn to emit a swpb.
gcc/testsuite/ChangeLog
* gcc.target/xstormy16/neghi2.c: New test case.
* gcc.target/xstormy16/rotatehi-1.c: Likewise.
hubot
pushed a commit
that referenced
this pull request
Oct 24, 2024
This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,
svuint64_t f1 ()
{
return svindex_u642 (10, 3);
}
compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple pass lower.
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,
svuint64_t f2 ()
{
return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}
has previously been compiled to
f2:
index z0.d, #10, #3
mul z0.d, z0.d, #5
ret
Now, it is compiled to
f2:
mov x0, 50
index z0.d, x0, #15
ret
We added test cases checking
- the application of the transform during gimple for constant arguments,
- the interaction with another gimple-level optimization.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve-builtins-base.cc
(svindex_impl::fold): Add constant folding.
gcc/testsuite/
* gcc.target/aarch64/sve/index_const_fold.c: New test.
hubot
pushed a commit
that referenced
this pull request
Apr 25, 2025
We now have some script which transforms e.g. <h2 id="15.1">GCC 15.1</h2> line in gcc-15/changes.html to <h2 id="15.1"><a href="#15.1">GCC 15.1</a></h2> This unfortunately breaks the gcc_release script, which looks for GCC 15.1 appearing in gennews after optional blanks from the start of the line in the NEWS file, which is no longer the case, there is [129]GCC 15.1 or something like that with an URL later on 129. https://gcc.gnu.org/gcc-15/changes.html#15.1 The following patch handles this. 2025-04-25 Jakub Jelinek <jakub@redhat.com> * gcc_release: Allow optional \[[0-9]+\] before GCC major.minor in the NEWS file.
hubot
pushed a commit
that referenced
this pull request
Apr 25, 2025
We now have some script which transforms e.g. <h2 id="15.1">GCC 15.1</h2> line in gcc-15/changes.html to <h2 id="15.1"><a href="#15.1">GCC 15.1</a></h2> This unfortunately breaks the gcc_release script, which looks for GCC 15.1 appearing in gennews after optional blanks from the start of the line in the NEWS file, which is no longer the case, there is [129]GCC 15.1 or something like that with an URL later on 129. https://gcc.gnu.org/gcc-15/changes.html#15.1 The following patch handles this. 2025-04-25 Jakub Jelinek <jakub@redhat.com> * gcc_release: Allow optional \[[0-9]+\] before GCC major.minor in the NEWS file. (cherry picked from commit fef3a3c)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Previously the binary image index contained only kernels
while program scope functions are needed for indirect call
implementations: The HSA Runtime API can be used to query
their addresses.
This patch mainly renames the corresponding structs and APIs that
have 'kernel' in their name to 'function' and adds a separate
flag to the function_index (ex. kernel_index) for separating
host-callable kernels from program scope functions (of which
addresses can be queried by the HSA runtime).