[WIP][SYCL] Add support for union types kernel arguments #2206

smanna12 · 2020-07-29T14:48:35Z

This patch adds support for union types as kernel parameter using existing codes of
"handleScalarType" and renames "handleScalarType" to "handleSimpleType" to
be more generic for scalars, vectors and unions types.

-Add kernel generation, integration header test, and runtime tests.

Signed-off-by: Soumi Manna soumi.manna@intel.com

smanna12 · 2020-07-29T14:51:27Z

Patch is incomplete. Work is in progress.

Fznamznon

Please add handling in SyclKernelIntHeaderCreator class.
I would also expect some sema test on kernel generation, integration header test and runtime test.

Fznamznon · 2020-07-29T15:08:41Z

clang/lib/Sema/SemaSYCL.cpp

+      } else if (FieldTy->isUnionType()) {
+        KF_FOR_EACH(handleUnionType, Field, FieldTy);
+      } else


If code of "handleScalarType" in "IntHeaderCreator" also suits for unions, we could unify it across scalars, unions and vectors, i.e. here you will have something like

} else if (FieldTy->isScalarType() || FieldTy->isVectorType() || FieldTy->isUnionType()) { KF_FOR_EACH(handleSimpleType, Field, FieldTy); }

i.e. handleScalarType can be renamed to something more generic (handleSimpleType in my example) and used for scalars, vectors and unions..

@Fznamznon, It seems like"handleScalarType" suits for unions. So this is renamed to "handleSimpleType".

…led when they're being used This patch fixed the issue that target memory might be deallocated when they're still being used or before they're used. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D84996

Add the optimizations we have in the SelectionDAG version. Known non-negative copies all known bits. Any known one other than the sign bit makes result non-negative. Differential Revision: https://reviews.llvm.org/D85000

Differential revision: https://reviews.llvm.org/D80802

Differential revision: https://reviews.llvm.org/D81213

A recent change broke `ninja check-asan` on Darwin by causing an error during linking of ASan unit tests [1]. Move the addition of `-ObjC` compiler flag outside of the new `if(COMPILER_RT_STANDALONE_BUILD)` block. It doesn't add any global flags (e.g, `${CMAKE_CXX_FLAGS}`) and the decision to add is based solely on source paths (`${source_rpath}`). [1] 8b2fcc4, https://reviews.llvm.org/D84466 Differential Revision: https://reviews.llvm.org/D85057

actually verified.

comparison.

Rather than hardcoding immediate values for 12 different combinations in a nested pair of switches, we can perform the matched logic operation on 3 magic constants to calculate the immediate. Special thanks to this tweet https://twitter.com/rygorous/status/1187034321992871936 for making me realize I could do this.

contains no packs. Fixes a regression from 740a164.

…/store. Refer to LangRef http://llvm.org/docs/LangRef.html#llvm-masked-load-intrinsics 'llvm.masked.load/store.*’ intrinsics are overloaded intrinsic, which allow the load/store data to be a vector of any integer, floating-point or pointer data type. Therefore, allow pointer data type when checking 'isLegalMaskedLoadStore()'. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D85045

These methods abstract away Error handling when trying to read options that can't be parsed by logging the error automatically and returning None. Reviewed By: gribozavr2 Differential Revision: https://reviews.llvm.org/D84812

Change to expand all arguments and return values to i64 to follow ABI. Update regression tests also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D84581

The patterns were incorrect copies from the FPU code, and are unnecessary, since there's no extended load for SPE. Just let LLVM itself do the work by marking it expand. Reviewed By: #powerpc, lkail Differential Revision: https://reviews.llvm.org/D78670

SPE doesn't have a fsel instruction, so don't try to lower to it. This fixes a "Cannot select: tN: f64 = PPCISD::FSEL tX, tY, tZ" error. Reviewed By: #powerpc, lkail Differential Revision: https://reviews.llvm.org/D77773

…VPTESTM. NFCI

querying getSCEV() for incomplete phis leads to wrong cache value in `ExprToIVMap`, because incomplete phis may be simplified to same value before get SCEV expression. Reviewed By: lebedev.ri, mkazantsev Differential Revision: https://reviews.llvm.org/D77560

… some code. Now we try to load and broadcast together for operand 1. Followed by load and broadcast for operand 1. Previously we tried load operand 1, load operand 1, broadcast operand 0, broadcast operand 1. Now we have a single helper that tries load and broadcast for one operand that we can just call twice.

…d files When checking for the style of a decl that isn't in the main file, the check will now search for the configuration that the included files uses to gather the style for its decls. This can be useful to silence warnings in header files that follow a different naming convention without using header-filter to silence all warnings(even from other checks) in the header file. Reviewed By: aaron.ballman, gribozavr2 Differential Revision: https://reviews.llvm.org/D84814

Patch improves performance of verify-machineinstrs pass up to 10x. Differential revision: https://reviews.llvm.org/D84105

…f32/v4i32 Minor precursor fix for D66004, but helps the SSE41 tests as well as they run with -disable-peephole

abs() should be rare enough that using value tracking is not going to be a compile-time cost burden, so use it to reduce a variety of potential patterns. We do this in DAGCombiner too. Differential Revision: https://reviews.llvm.org/D85043

…macro definition" The /Zc:__cplusplus option fixes GTEST_LANG_CXX11 value but not GTEST_HAS_TR1_TUPLE, so we still need to force the latter off. Still pass the option since it is required by https://reviews.llvm.org/D78186 too. Differential Revision: https://reviews.llvm.org/D84023

This patch moves a few spread out smin/smax tests to smin-smax-folds.ll and adds additional test cases that expose further potential for folds.

Currently, instruction level fast math flags are not considered when generating patterns for the machine combiner. This currently leads to some missed opportunities to generate FMAs in combination with `#pragma clang fp contract (fast)`. For example, when building the example below with -O3 for AArch64, no FMADD is generated. If built with -O2 and the DAGCombiner is used instead of the MachineCombiner for FMAs, an FMADD is generated. With this patch, the same code is generated in both cases. float madd_contract(float a, float b, float c) { #pragma clang fp contract (fast) return (a * b) + c; } Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D84930

…OR(X,Y) if the relevant elements are known zero."" [X86][SSE] Shuffle combine blends to OR(X,Y) if the relevant elements are known zero (REAPPLIED) This allows us to remove the (depth violating) code in getFauxShuffleMask where we were combining the OR(SHUFFLE,SHUFFLE) shuffle inputs as well, and not just the OR(). This is a minor step toward being able to shuffle combine from/to SELECT/BLENDV as a faux shuffle. Reapplied with fixed signed/unsigned comparisons.

Added patterns so that both SSAT and USAT instructions are generated with shifts. Added corresponding regression tests. Differential Review: https://reviews.llvm.org/D85120

This option was added a while back, to help improve AA around pointer phi loops. It looks for phi(gep(phi, const), x) loops, checking if x can then prove more precise aliasing info. Differential Revision: https://reviews.llvm.org/D82998

http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/15718/steps/build%20stage%201/logs/stdio: FAILED: /usr/bin/c++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/llvm-readobj -I/home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux/llvm/llvm/tools/llvm-readobj -Iinclude -I/home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux/llvm/llvm/include -march=broadwell -fPIC -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT tools/llvm-readobj/CMakeFiles/llvm-readobj.dir/ELFDumper.cpp.o -MF tools/llvm-readobj/CMakeFiles/llvm-readobj.dir/ELFDumper.cpp.o.d -o tools/llvm-readobj/CMakeFiles/llvm-readobj.dir/ELFDumper.cpp.o -c /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux/llvm/llvm/tools/llvm-readobj/ELFDumper.cpp /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux/llvm/llvm/tools/llvm-readobj/ELFDumper.cpp: In function ‘llvm::Expected<const llvm::object::Elf_Mips_Options<ELFT>*> readMipsOptions(const uint8_t*, llvm::ArrayRef<unsigned char>&, bool&)’: /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux/llvm/llvm/tools/llvm-readobj/ELFDumper.cpp:3374:12: error: parse error in template argument list if (O->size < ExpectedSize) Note: I played with godbolt.org and was able to catch the similar "error in template argument list" error when used gcc 4.9.0 with this code. Fix: try to introduce a variable to store `O->size`, it helped to me in godbolt.

As discussed in D84949, this removes the constraint to cast since it does not cause compile time degradation. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D85188

This is the final bit of work to relax the register allocation requirements when code generating normal LLVM IR, which rarely care about the result of inactive lanes. By using _PRED nodes we can make better use of SVE's reversed instructions. Also removes a redundant parameter from the min/max tests. Differential Revision: https://reviews.llvm.org/D85142

When mapping an optional value, if the value is <none> and followed by comments, there will be a parsing error. This patch helps fix this issue. e.g., When mapping the following YAML, ``` Sections: - Name: blah Type: SHT_foo Flags: [[FLAGS=<none>]] ## some comments. ``` the raw value of `ScalarNode` is "<none> " rather than "<none>". We need to remove the spaces. Differential Revision: https://reviews.llvm.org/D85180

The CFA is calculated as (SP/FP + offset), but when there are SVE objects on the stack the SP offset is partly scalable and should instead be expressed as the DWARF expression: SP + offset + scalable_offset * VG where VG is the Vector Granule register, containing the number of 64bits 'granules' in a scalable vector. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84043

This patch adds a CFI entry for each SVE callee saved register that needs unwind info at an offset from the CFA. The offset is a DWARF expression because the offset is partly scalable. The CFI entries only cover a subset of the SVE callee-saves and only encodes the lower 64-bits, thus implementing the lowest common denominator ABI. Existing unwinders may support VG but only restore the lower 64-bits. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84044

This is a first patch that sweeps over tests to fix indentation (tabs to spaces). It also adds label checks and removes redundant matching of `%{{.*}} = `. The following tests have been fixed: - arithmetic-ops-to-llvm - bitwise-ops-to-llvm - cast-ops-to-llvm - comparison-ops-to-llvm - logical-ops-to-llvm (renamed to match the rest) Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D85181

The bug was not noticed because we didn't have a lot of custom type conversions directly to LLVM dialect. Differential Revision: https://reviews.llvm.org/D85192

With new LLVM dialect type modeling, the dialect types no longer wrap LLVM IR types. Therefore, they need to be translated to and from LLVM IR during export and import. Introduce the relevant functionality for translating types. It is currently exercised by an ad-hoc type translation roundtripping test that will be subsumed by the actual translation test when the type system transition is complete. Depends On D84339 Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D85019

…deling These are intended to smoothen the transition and may be removed in the future in favor of more MLIR-compatible APIs. They intentionally have the same semantics as the existing functions, which must remain stable until the transition is complete. Depends On D85019 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D85020

This should probably be moved up to some common area eventually when there's another user.

A new first-party modeling for LLVM IR types in the LLVM dialect has been developed in parallel to the existing modeling based on wrapping LLVM `Type *` instances. It resolves the long-standing problem of modeling identified structure types, including recursive structures, and enables future removal of LLVMContext and related locking mechanisms from LLVMDialect. This commit only switches the modeling by (a) renaming LLVMTypeNew to LLVMType, (b) removing the old implementaiton of LLVMType, and (c) updating the tests. It is intentionally minimal. Separate commits will remove the infrastructure built for the transition and update API uses where appropriate. Depends On D85020 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D85021

GCC5 seems to dislike generic lambdas calling a method of the class containing the lambda without explicit `this`.

@src

This is based on the existing code for the non-intrinsic idioms in InstCombine. The vector constant constraint is non-obvious: undefs should be ok in the outer call, but they can't propagate safely from the inner call in all cases. Example: https://alive2.llvm.org/ce/z/-2bVbM define <2 x i8> @src(<2 x i8> %x) { %0: %m = umin <2 x i8> %x, { 7, undef } %m2 = umin <2 x i8> { 9, 9 }, %m ret <2 x i8> %m2 } => define <2 x i8> @tgt(<2 x i8> %x) { %0: %m = umin <2 x i8> %x, { 7, undef } ret <2 x i8> %m } Transformation doesn't verify! ERROR: Value mismatch Example: <2 x i8> %x = < undef, undef > Source: <2 x i8> %m = < #x00 (0) [based on undef value], #x00 (0) > <2 x i8> %m2 = < #x00 (0), #x00 (0) > Target: <2 x i8> %m = < #x07 (7), #x10 (16) > Source value: < #x00 (0), #x00 (0) > Target value: < #x07 (7), #x10 (16) >

…nsfer_read into full and partial copies. This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad `into a pattern resembling: ``` %1:3 = scf.if (%inBounds) { scf.yield %view : memref<A...>, index, index } else { %2 = linalg.fill(%extra_alloc, %pad) %3 = subview %view [...][...][...] linalg.copy(%3, %alloc) memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 : memref<A...>, index, index } %res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]} ``` where `extra_alloc` is a top of the function alloca'ed buffer of one vector. This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer. The extra work only occurs on the boundary tiles.

Signed-off-by: Soumi Manna <soumi.manna@intel.com>

smanna12 · 2020-08-04T16:23:48Z

Please do not review. Sorry wrong merge conflicts happen.

smanna12 · 2020-08-04T17:22:18Z

I have created separate PR here: #2255

smanna12 changed the title ~~[WIP]{SYCL] Add support for union types kernel arguments~~ [WIP][SYCL] Add support for union types kernel arguments Jul 29, 2020

smanna12 changed the title ~~[WIP][SYCL] Add support for union types kernel arguments~~ [WIP][SYCL] [Not Review]Add support for union types kernel arguments Jul 29, 2020

smanna12 changed the title ~~[WIP][SYCL] [Not Review]Add support for union types kernel arguments~~ [WIP][SYCL] Add support for union types kernel arguments Jul 29, 2020

Fznamznon reviewed Jul 29, 2020

View reviewed changes

andreil99 and others added 25 commits July 31, 2020 15:39

Updated the -I option description.

731292e

[InstSimplify] add tests for abs intrinsic; NFC

77a0252

[ValueTracking] Improve llvm.abs handling in computeKnownBits.

86dea1f

Add the optimizations we have in the SelectionDAG version. Known non-negative copies all known bits. Any known one other than the sign bit makes result non-negative. Differential Revision: https://reviews.llvm.org/D85000

Upgrade MC to v0.9.

47a4a27

Differential revision: https://reviews.llvm.org/D80802

Support experimental v extension v0.9.

721d93f

Differential revision: https://reviews.llvm.org/D81213

[libc] [obvious] Fix strchr and strrchr tests so that constness is

5954755

actually verified.

[libc] [obvious] In strrchr, remove cast to unsigned char before

1b35c4f

comparison.

Don't crash if we deserialize a pack expansion type whose pattern

234f51a

contains no packs. Fixes a regression from 740a164.

[VE] Change calling convention to follow ABI

605fd4d

Change to expand all arguments and return values to i64 to follow ABI. Update regression tests also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D84581

PowerPC: Don't lower SELECT_CC to PPCISD::FSEL on SPE

7e9153e

SPE doesn't have a fsel instruction, so don't try to lower to it. This fixes a "Cannot select: tN: f64 = PPCISD::FSEL tX, tY, tZ" error. Reviewed By: #powerpc, lkail Differential Revision: https://reviews.llvm.org/D77773

[X86] Use TargetLowering::getRegClassFor to simplify some code in try…

1bd7046

…VPTESTM. NFCI

[MachineVerifier] Refactor calcRegsPassed. NFC

e73f5d8

Patch improves performance of verify-machineinstrs pass up to 10x. Differential revision: https://reviews.llvm.org/D84105

[clang-tidy][NFC] Small refactor

9f21947

[X86][AVX] Extend v2f64 BROADCAST(LOAD) -> BROADCAST_LOAD to v2i64/v4…

1b19015

…f32/v4i32 Minor precursor fix for D66004, but helps the SSE41 tests as well as they run with -disable-peephole

[InstSimplify] add abs test with assume; NFC

1aa52d6

tkrasnukha and others added 27 commits August 4, 2020 12:22

[SCEV] Consolidate some smin/smax folding tests into single test file.

b7856f9

This patch moves a few spread out smin/smax tests to smin-smax-folds.ll and adds additional test cases that expose further potential for folds.

[ARM] Generated SSAT and USAT instructions with shift

20283ff

Added patterns so that both SSAT and USAT instructions are generated with shifts. Added corresponding regression tests. Differential Review: https://reviews.llvm.org/D85120

[BasicAA] Enable -basic-aa-recphi by default

3c7e7d4

This option was added a while back, to help improve AA around pointer phi loops. It looks for phi(gep(phi, const), x) loops, checking if x can then prove more precise aliasing info. Differential Revision: https://reviews.llvm.org/D82998

[lldb] fix typo

daa1c6d

[JumpThreading] Add a test for simplification of cast of any op; NFC

e218da7

[JumpThreading] Remove cast's constraint

e734e82

As discussed in D84949, this removes the constraint to cast since it does not cause compile time degradation. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D85188

[JumpThreading] Update test freeze.ll; NFC

998c0ef

[mlir] Fix adding wrong operand value in promoteMemRefDescriptors.

8979a9c

The bug was not noticed because we didn't have a lot of custom type conversions directly to LLVM dialect. Differential Revision: https://reviews.llvm.org/D85192

[InstSimplify] reduce code for min/max analysis; NFC

20c71e5

This should probably be moved up to some common area eventually when there's another user.

[InstSimplify] add tests for min/max with constants; NFC

011e15b

[mlir] Fix GCC5 compilation problem in MLIR->LLVM type translation

cb9f9df

GCC5 seems to dislike generic lambdas calling a method of the class containing the lambda without explicit `this`.

Fix test failure

586f020

Signed-off-by: Soumi Manna <soumi.manna@intel.com>

Resolve comflicts

7119ff7

smanna12 closed this Aug 4, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[WIP][SYCL] Add support for union types kernel arguments #2206

[WIP][SYCL] Add support for union types kernel arguments #2206

Uh oh!

smanna12 commented Jul 29, 2020 •

edited

Loading

Uh oh!

smanna12 commented Jul 29, 2020 •

edited

Loading

Uh oh!

Fznamznon left a comment

Uh oh!

Fznamznon Jul 29, 2020

Uh oh!

smanna12 Aug 4, 2020

Uh oh!

smanna12 commented Aug 4, 2020

Uh oh!

smanna12 commented Aug 4, 2020

Uh oh!

Uh oh!

[WIP][SYCL] Add support for union types kernel arguments #2206

[WIP][SYCL] Add support for union types kernel arguments #2206

Uh oh!

Conversation

smanna12 commented Jul 29, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

smanna12 commented Jul 29, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Fznamznon left a comment

Choose a reason for hiding this comment

Uh oh!

Fznamznon Jul 29, 2020

Choose a reason for hiding this comment

Uh oh!

smanna12 Aug 4, 2020

Choose a reason for hiding this comment

Uh oh!

smanna12 commented Aug 4, 2020

Uh oh!

smanna12 commented Aug 4, 2020

Uh oh!

Uh oh!

smanna12 commented Jul 29, 2020 •

edited

Loading

smanna12 commented Jul 29, 2020 •

edited

Loading