Skip to content

[HSABE] indexing of program scope functions #15

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

pjaaskel
Copy link
Contributor

Previously the binary image index contained only kernels
while program scope functions are needed for indirect call
implementations: The HSA Runtime API can be used to query
their addresses.

This patch mainly renames the corresponding structs and APIs that
have 'kernel' in their name to 'function' and adds a separate
flag to the function_index (ex. kernel_index) for separating
host-callable kernels from program scope functions (of which
addresses can be queried by the HSA runtime).

pjaaskel and others added 6 commits October 23, 2017 15:18
Previously the binary image index contained only kernels
while program scope functions are needed for indirect call
implementations: The HSA Runtime API can be used to query
their addresses.

This patch mainly renames the corresponding structs and APIs that
have 'kernel' in their name to 'function' and adds a separate
flag to the function_index (ex. kernel_index) for separating
host-callable kernels from program scope functions (of which
addresses can be queried by the HSA runtime).
@agauniyal
Copy link

@pjaaskel please see - #5 (comment)

@pjaaskel
Copy link
Contributor Author

@agauniyal yep, I know, thanks. Just wanted to open a PR to get the Github's review interface available.

kraj pushed a commit to kraj/gcc that referenced this pull request Oct 12, 2020
Prevents the following UBSAN error:

./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/torture/pr49770.C -O2 -c
/home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 2, which is not a valid value for type 'bool'
    #0 0x1fdb4d1 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482
    #1 0x1fcadaa in merge_call_side_effects(modref_summary*, gimple*, modref_summary*, bool) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:511
    gcc-mirror#2 0x1fcbadd in analyze_call /home/marxin/Programming/gcc2/gcc/ipa-modref.c:642
    gcc-mirror#3 0x1fcc061 in analyze_stmt /home/marxin/Programming/gcc2/gcc/ipa-modref.c:732
    gcc-mirror#4 0x1fccf31 in analyze_function /home/marxin/Programming/gcc2/gcc/ipa-modref.c:823
    gcc-mirror#5 0x1fd17e5 in execute /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1441
    gcc-mirror#6 0x25cca6e in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509
    gcc-mirror#7 0x25cd39b in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2597
    gcc-mirror#8 0x25cd450 in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2598
    gcc-mirror#9 0x25cd4ee in execute_pass_list(function*, opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2608
    gcc-mirror#10 0x25c7a5a in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/Programming/gcc2/gcc/passes.c:1726
    gcc-mirror#11 0x25cfa3f in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2941
    gcc-mirror#12 0x173572d in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2642
    gcc-mirror#13 0x17364ee in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777
    gcc-mirror#14 0x17372d9 in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022
    gcc-mirror#15 0x2a1f00a in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485
    gcc-mirror#16 0x2a27dc8 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321
    gcc-mirror#17 0x2a283cc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460
    gcc-mirror#18 0x54f21cd in main /home/marxin/Programming/gcc2/gcc/main.c:39
    gcc-mirror#19 0x7ffff6f0de09 in __libc_start_main ../csu/libc-start.c:314
    gcc-mirror#20 0x9eac09 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1plus+0x9eac09)

gcc/ChangeLog:

	* ipa-modref.c (merge_call_side_effects): Clear modref_parm_map
	fields in the vector.
kraj pushed a commit to kraj/gcc that referenced this pull request Oct 19, 2020
It fixes:

/home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 255, which is not a valid value for type 'bool'
    #0 0x18e5df3 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482
    #1 0x18dc180 in ipa_merge_modref_summary_after_inlining(cgraph_edge*) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1779
    gcc-mirror#2 0x18c1c72 in inline_call(cgraph_edge*, bool, vec<cgraph_edge*, va_heap, vl_ptr>*, int*, bool, bool*) /home/marxin/Programming/gcc2/gcc/ipa-inline-transform.c:492
    gcc-mirror#3 0x4a3589c in inline_small_functions /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2216
    gcc-mirror#4 0x4a3b230 in ipa_inline /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2697
    gcc-mirror#5 0x4a3d902 in execute /home/marxin/Programming/gcc2/gcc/ipa-inline.c:3096
    gcc-mirror#6 0x1edf831 in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509
    gcc-mirror#7 0x1ee26af in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2936
    gcc-mirror#8 0x103f31b in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2700
    gcc-mirror#9 0x103fb40 in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777
    gcc-mirror#10 0x104092b in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022
    gcc-mirror#11 0x235723b in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485
    gcc-mirror#12 0x235fff9 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321
    gcc-mirror#13 0x23605fc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460
    gcc-mirror#14 0x4e2b93b in main /home/marxin/Programming/gcc2/gcc/main.c:39
    gcc-mirror#15 0x7ffff6f0ae09 in __libc_start_main ../csu/libc-start.c:314
    gcc-mirror#16 0x9a0be9 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1+0x9a0be9)

gcc/ChangeLog:

	* ipa-modref.c (compute_parm_map): Clear vector.
nstester pushed a commit to nstester/gcc that referenced this pull request Jun 14, 2021
The fixed error is:

==21166==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs operator delete) on 0x60300000d900
    #0 0x7367d7 in operator delete(void*, unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:172
    #1 0x3b82e6e in pointer_equiv_analyzer::~pointer_equiv_analyzer() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:161
    #2 0x3b83387 in hybrid_folder::~hybrid_folder() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:517
    #3 0x3b83387 in execute_early_vrp /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:686
    #4 0x1790611 in execute_one_pass(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2567
    gcc-mirror#5 0x1792003 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2656
    gcc-mirror#6 0x1792029 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2657
    gcc-mirror#7 0x179209f in execute_pass_list(function*, opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2667
    gcc-mirror#8 0x178a5f3 in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:1773
    gcc-mirror#9 0x1792fac in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/plugin.h:191
    gcc-mirror#10 0x1792fac in execute_ipa_pass_list(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:3001
    gcc-mirror#11 0xc525fc in ipa_passes /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2154
    gcc-mirror#12 0xc525fc in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2289
    gcc-mirror#13 0xc5a096 in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2269
    gcc-mirror#14 0xc5a096 in symbol_table::finalize_compilation_unit() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2537
    gcc-mirror#15 0x1a7a17c in compile_file /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:482
    gcc-mirror#16 0x69c758 in do_compile /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2210
    gcc-mirror#17 0x69c758 in toplev::main(int, char**) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2349
    gcc-mirror#18 0x6a932a in main /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/main.c:39
    gcc-mirror#19 0x7ffff7820b34 in __libc_start_main ../csu/libc-start.c:332
    gcc-mirror#20 0x6aa5fd in _start (/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/objdir/gcc/cc1+0x6aa5fd)

0x60300000d900 is located 0 bytes inside of 32-byte region [0x60300000d900,0x60300000d920)
allocated by thread T0 here:
    #0 0x735ab7 in operator new[](unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:102
    #1 0x3b82dac in pointer_equiv_analyzer::pointer_equiv_analyzer(gimple_ranger*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:156

gcc/ChangeLog:

	* gimple-ssa-evrp.c (pointer_equiv_analyzer::~pointer_equiv_analyzer): Use delete[].
kraj pushed a commit to kraj/gcc that referenced this pull request Jun 8, 2022
It will crash when compiling but its the code setup i want to get in sooner
than later.

Addresses gcc-mirror#15
nstester pushed a commit to nstester/gcc that referenced this pull request Apr 29, 2023
This patch adds support for xstormy16's swap nibbles instruction (swpn).
For the test case:

short foo(short x) {
  return (x&0xff00) | ((x<<4)&0xf0) | ((x>>4)&0x0f);
}

GCC with -O2 currently generates the nine instruction sequence:
foo:    mov r7,r2
        asr r2,#4
        and r2,gcc-mirror#15
        mov.w r6,#-256
        and r6,r7
        or r2,r6
        shl r7,#4
        and r7,#255
        or r2,r7
        ret

with this patch, we now generate:
foo:	swpn r2
	ret

To achieve this using combine's four instruction "combinations" requires
a little wizardry.  Firstly, define_insn_and_split are introduced to
treat logical shifts followed by bitwise-AND as macro instructions that
are split after reload.  This is sufficient to recognize a QImode
nibble swap, which can be implemented by swpn followed by either a
zero-extension or a sign-extension from QImode to HImode.  Then finally,
in the correct context, a QImode swap-nibbles pattern can be combined to
preserve the high-byte of a HImode word, matching the xstormy16's swpn
semantics.  The naming of the new code iterators is taken from i386.md.

2023-04-29  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/stormy16/stormy16.md (any_lshift): New code iterator.
	(any_or_plus): Likewise.
	(any_rotate): Likewise.
	(*<any_lshift>_and_internal): New define_insn_and_split to
	recognize a logical shift followed by an AND, and split it
	again after reload.
	(*swpn): New define_insn matching xstormy16's swpn.
	(*swpn_zext): New define_insn recognizing swpn followed by
	zero_extendqihi2, i.e. with the high byte set to zero.
	(*swpn_sext): Likewise, for swpn followed by cbw.
	(*swpn_sext_2): Likewise, for an alternate RTL form.
	(*swpn_zext_ior): A pre-reload splitter so that an swpn+zext+ior
	sequence is split in the correct place to recognize the *swpn_zext
	followed by any_or_plus (ior, xor or plus) instruction.

gcc/testsuite/ChangeLog
	* gcc.target/xstormy16/swpn-1.c: New QImode test case.
	* gcc.target/xstormy16/swpn-2.c: New zero_extend test case.
	* gcc.target/xstormy16/swpn-3.c: New sign_extend test case.
	* gcc.target/xstormy16/swpn-4.c: New HImode test case.
nstester pushed a commit to nstester/gcc that referenced this pull request Apr 29, 2023
This patch contains some minor tweak to xstormy16's machine description
most significantly providing a pattern for HImode rotate left by a single
bit that requires only two instructions.

unsigned short foo(unsigned short x)
{
  return (x << 1) | (x >> 15);
}

currently with -O2 generates:
foo:    mov r7,r2
        shr r7,gcc-mirror#15
        shl r2,#1
        or r2,r7
        ret

with this patch, GCC now generates:
foo:	shl r2,#1 | adc r2,#0
        ret

Additionally neghi2 is converted to a define_insn (so that the RTL
optimizers see the negation semantics), and HImode rotations by
8-bits can now be recognized and implemented using swpb.

2023-04-29  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/stormy16/stormy16.md (neghi2): Convert from a define_expand
	to a define_insn.
	(*rotatehi_1): New define_insn for efficient 2 insn sequence.
	(*rotatehi_8, *rotaterthi_8): New define_insn to emit a swpb.

gcc/testsuite/ChangeLog
	* gcc.target/xstormy16/neghi2.c: New test case.
	* gcc.target/xstormy16/rotatehi-1.c: Likewise.
NinaRanns referenced this pull request in NinaRanns/gcc Jul 9, 2024
hubot pushed a commit that referenced this pull request Oct 24, 2024
This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,
svuint64_t f1 ()
{
  return svindex_u642 (10, 3);
}
compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple pass lower.
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,
svuint64_t f2 ()
{
    return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}
has previously been compiled to
f2:
        index   z0.d, #10, #3
        mul     z0.d, z0.d, #5
        ret
Now, it is compiled to
f2:
        mov     x0, 50
        index   z0.d, x0, #15
        ret

We added test cases checking
- the application of the transform during gimple for constant arguments,
- the interaction with another gimple-level optimization.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>

gcc/
	* config/aarch64/aarch64-sve-builtins-base.cc
	(svindex_impl::fold): Add constant folding.

gcc/testsuite/
	* gcc.target/aarch64/sve/index_const_fold.c: New test.
hubot pushed a commit that referenced this pull request Apr 25, 2025
We now have some script which transforms e.g.
<h2 id="15.1">GCC 15.1</h2>
line in gcc-15/changes.html to
<h2 id="15.1"><a href="#15.1">GCC 15.1</a></h2>

This unfortunately breaks the gcc_release script, which looks for
GCC 15.1 appearing in gennews after optional blanks from the start of
the line in the NEWS file, which is no longer the case, there is
[129]GCC 15.1
or something like that with an URL later on
 129. https://gcc.gnu.org/gcc-15/changes.html#15.1

The following patch handles this.

2025-04-25  Jakub Jelinek  <jakub@redhat.com>

	* gcc_release: Allow optional \[[0-9]+\] before GCC major.minor
	in the NEWS file.
hubot pushed a commit that referenced this pull request Apr 25, 2025
We now have some script which transforms e.g.
<h2 id="15.1">GCC 15.1</h2>
line in gcc-15/changes.html to
<h2 id="15.1"><a href="#15.1">GCC 15.1</a></h2>

This unfortunately breaks the gcc_release script, which looks for
GCC 15.1 appearing in gennews after optional blanks from the start of
the line in the NEWS file, which is no longer the case, there is
[129]GCC 15.1
or something like that with an URL later on
 129. https://gcc.gnu.org/gcc-15/changes.html#15.1

The following patch handles this.

2025-04-25  Jakub Jelinek  <jakub@redhat.com>

	* gcc_release: Allow optional \[[0-9]+\] before GCC major.minor
	in the NEWS file.

(cherry picked from commit fef3a3c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants