LLVM and SPIRV-LLVM-Translator pulldown (WW50) #7640

Merged: 727 commits merged into intel:sycl on Dec 6, 2022

Fznamznon

jansvoboda11 and others added 30 commits December 1, 2022 20:16
Currently, the algorithm for gathering affecting module maps includes only those defining modules that include some headers. This is not entirely correct, though. Some module maps might be "importing" module maps for `extern` submodules. Such parent module maps are affecting: they do change the semantics of the compilation. This patch adds parent module maps to the set of affecting module maps.

Depends on D137197.

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D137198
With this patch, we mark module maps that include an affecting `extern` module map as also affecting. This is a generalization of D137197: we no longer require the importing module map to describe the parent of the extern module.
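
The marking rule described above can be pictured as a fixed-point computation. The following is only a sketch under stated assumptions: the `ExternIncludes` map and the `markAffecting` function are hypothetical names invented for illustration and do not reflect Clang's actual data structures or the code in this patch.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: for each module map file, the `extern` module maps it
// references. This is not Clang's representation, only an illustration.
using ExternIncludes = std::map<std::string, std::vector<std::string>>;

// Returns the closure of `affecting` under the rule "a module map that
// includes an affecting extern module map is itself affecting".
std::set<std::string> markAffecting(const ExternIncludes &includes,
                                    std::set<std::string> affecting) {
  bool changed = true;
  while (changed) { // iterate until no new module map gets marked
    changed = false;
    for (const auto &entry : includes) {
      const std::string &importer = entry.first;
      if (affecting.count(importer))
        continue; // already marked
      for (const std::string &ext : entry.second) {
        if (affecting.count(ext)) {
          affecting.insert(importer);
          changed = true;
          break;
        }
      }
    }
  }
  return affecting;
}
```

Because the loop runs to a fixed point, a chain of importing module maps is marked transitively, and the importer does not have to describe the parent of the extern module.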

Depends on D137198.

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D137206
This is an attempt to fix a Windows bot failure. In the test introduced in 83973cf, file dependencies were printed out of order (after replacing backslashes with slashes). This may have been caused by some paths using a different path style.
…rloop pass"

This reverts commit df9d60a.

D138265 moves the CTR loops pass in front of the tail duplication pass.
Tail duplication may modify the loop into a "non-canonical" form
that the CTR loops pass cannot recognize. We fixed one such issue in D135846,
but we found that in some other cases the loop is changed to an irreducible form.
That case is hard to fix in the CTR loops pass, so instead we move the
CTR loops pass before the tail duplication pass, just after the finalize-isel
pass, to avoid any unexpected change to the loop form.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D138265
This patch is the Part-2 (BE LLVM) implementation of HW exception handling.
Part-1 (FE Clang) was committed in 797ad70.

This new feature adds support for hardware exceptions to Microsoft Windows
SEH (Structured Exception Handling).

Compiler options:
  For clang-cl.exe, the option is -EHa, the same as MSVC.
  For clang.exe, the extra option is -fasync-exceptions,
  plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.

NOTE: Without -EHa or -fasync-exceptions, this patch is a NO-DIFF change.

The rules for C code:
For C code, one way (the MSVC approach) to achieve SEH -EHa semantics is to follow three rules:
  First, no exception can move into or out of a _try region, i.e., no "potentially faulty
    instruction" can be moved across a _try boundary.
  Second, the order of exceptions for instructions 'directly' under a _try must be preserved
    (this does not apply to instructions in callees).
  Finally, global state (local/global/heap variables) that can be read outside of the _try region
    must be updated in memory (not just in a register) before the subsequent exception occurs.

The impact on C++ code:
  Although SEH is a feature for C code, -EHa does have a profound effect on the C++
  side. When a C++ function (in the same compilation unit, compiled with -EHa) is
  called by an SEH C function, a hardware exception that occurs in the C++ code can also
  be handled properly by an upstream SEH _try handler or a C++ catch(...).
  As such, when that happens in the middle of an object's lifetime scope, the dtor
  must be invoked the same way as for a C++ synchronous exception during unwinding.

Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge
to be added on memory/computation instructions (the previous iload/istore idea) so that
the exception path is modeled precisely in the flow graph. However, tracking every
single memory instruction and potentially faulty instruction can create many
Invokes, complicate the flow graph, and possibly have a negative performance
impact on downstream optimization and code generation. Making all
optimizations aware of the new semantics is also a substantial effort.

This design does not intend to model the exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK level to
reduce the complexity of the flow graph and minimize the performance impact on C++
code under the -EHa option.
One key element of this design is the ability to compute the State number at
block level. Our algorithm is based on the following rationale:

A _try scope is always a SEME (Single Entry Multiple Exits) region, as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control flow, the state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to the parent state based on the existing SEHUnwindMap[].
Note that side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block inherits different states from its predecessors, the lowest
State wins. If some exits flow to unreachable, propagation on those
paths terminates without affecting the remaining blocks.
For C++ code, an object lifetime region is usually a SEME, like an SEH _try.
However, there is one rare exception: jumping into a lifetime that has a dtor but
no ctor produces a warning but is allowed:

Warning: jump bypasses variable with a non-trivial destructor

In that case, the region is actually a MEME (Multiple Entry Multiple Exits) region.
Our solution is to inject an eha_scope_begin() invoke in the side entry block to
ensure a correct State.
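
The "lowest State wins" propagation described above can be sketched as a small worklist algorithm. This is a toy model under stated assumptions, not the WinEHPrepare code: the `Block` struct, the `isTryEnd` flag, and `propagateStates` are hypothetical names, and `unwindMap[s]` is assumed to hold the parent state of state `s` (with -1 meaning "no enclosing scope").

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <limits>
#include <vector>

// Hypothetical block model, for illustration only.
struct Block {
  std::vector<int> succs; // indices of successor blocks
  bool isTryEnd = false;  // block ends with a seh_try_end()-style side exit
};

// Propagates state numbers from the single entry through control flow.
// A side exit hands its successors the parent state taken from unwindMap.
// When a block is reached with several states, the lowest one is kept.
std::vector<int> propagateStates(const std::vector<Block> &blocks,
                                 const std::vector<int> &unwindMap,
                                 int entry, int entryState) {
  assert(entry >= 0 && static_cast<std::size_t>(entry) < blocks.size());
  const int Unset = std::numeric_limits<int>::max();
  std::vector<int> state(blocks.size(), Unset);
  std::deque<int> worklist{entry};
  state[entry] = entryState;
  while (!worklist.empty()) {
    int b = worklist.front();
    worklist.pop_front();
    int out = state[b];
    if (blocks[b].isTryEnd && out >= 0) {
      assert(static_cast<std::size_t>(out) < unwindMap.size());
      out = unwindMap[out]; // leave the scope: unwind to the parent state
    }
    for (int s : blocks[b].succs) {
      if (out < state[s]) { // lowest state wins; revisit only on a decrease
        state[s] = out;
        worklist.push_back(s);
      }
    }
  }
  return state; // blocks never reached keep the Unset marker
}
```

States only ever decrease and are bounded below by -1, so the worklist drains. Blocks that flow to unreachable simply have no successors, so propagation stops there without touching other blocks.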
Implementation:
Part-1: Clang implementation (already in):
Please see commit 797ad70.

Part-2: LLVM implementation, described below.

For both C++ and C code, the state of each block is computed at the same place in
the BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
relies on the existing State tracking code (UnwindMap and InvokeStateMap).

For both C++ and C code, the state of each block with a potential trap instruction
is marked and reported in the DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is handled.
If the first instruction in a reported block scope can trap, a nop is injected
before this instruction. This nop is needed to accommodate the LLVM Windows EH
implementation, in which the address in the IPToState table is offset by +1.
(Note that the purpose of the offset is to ensure that the return address of a call is in the
same scope as the call address.)
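
To see why the +1 offset makes the nop necessary, here is a toy model of an IP-to-state lookup. It is an illustration under stated assumptions only: the `IPToState` alias, `recordState`, and `lookupState` are hypothetical names, the addresses are made up, and the real table layout emitted by LLVM is not reproduced here.

```cpp
#include <cstdint>
#include <iterator>
#include <map>

// Toy table: key = recorded address, value = EH state in effect from that
// address onward. Not the real on-disk format.
using IPToState = std::map<std::uint64_t, int>;

// Records that `state` begins at `labelAddr`, applying the +1 offset that
// the text above describes.
void recordState(IPToState &table, std::uint64_t labelAddr, int state) {
  table[labelAddr + 1] = state;
}

// Returns the state of the last entry whose recorded address is <= ip,
// or -1 when no entry covers ip.
int lookupState(const IPToState &table, std::uint64_t ip) {
  auto it = table.upper_bound(ip);
  if (it == table.begin())
    return -1;
  return std::prev(it)->second;
}
```

Assume state 0 starts at 0x100 and state 1 at 0x120. A call that ends exactly at 0x120 has return address 0x120, which this model still resolves to state 0, the scope of the call. The flip side is that an instruction trapping at 0x120, the very first address of the new scope, would also resolve to state 0. Placing a one-byte nop at 0x120 moves the trapping instruction to 0x121, which resolves to state 1.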

The handler for catch(...) under -EHa must handle HW exceptions, so its
'adjective' flag is reset (it cannot be IsStdDotDot (0x40), which only catches
C++ exceptions).
Suppress the push/popTerminate() scope (from noexcept/nothrow) so that HW
exceptions can be passed through.

The original llvm-dev [RFC] discussions can be found in the two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

Differential Revision: https://reviews.llvm.org/D102817/new/
The function tests liveness of the entire live register set for every instruction it passes.
This becomes very slow in high register pressure (RP) regions, such as ASAN-enabled code.

Instead, only the uses of the last tracked instruction need to be tested, which greatly improves compilation time.

This patch revealed a few bugs in SIFormMemoryClauses and PreRARematStage::sinkTriviallyRematInsts, which should
be fixed first.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D136267
Update BinaryOp<T>::gen so that a const T& is threaded through, which makes
operation knowledge that is not encoded by the type T or the arguments
available. For Extremum, it is the order (Greater or Lesser) that is
required, but this will also be required for evaluate::Relational.
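
The idea of threading the operation object into `gen` can be sketched as follows. This is a simplified stand-in, not the code from the patch: the `Ordering` enum, the `Add` and `Extremum` structs, and the integer-returning `gen` signatures are assumptions made for illustration, and the sketch computes plain integers instead of generating IR.

```cpp
#include <algorithm>

// Hypothetical stand-ins for operation node types.
enum class Ordering { Lesser, Greater };
struct Add {};
struct Extremum {
  Ordering ordering; // information carried by the value, not by the type
};

template <typename T> struct BinaryOp;

// Add needs nothing beyond its type, so the op parameter goes unused.
template <> struct BinaryOp<Add> {
  static int gen(const Add &, int lhs, int rhs) { return lhs + rhs; }
};

// Extremum must inspect the op instance to know whether to pick the
// maximum or the minimum.
template <> struct BinaryOp<Extremum> {
  static int gen(const Extremum &op, int lhs, int rhs) {
    return op.ordering == Ordering::Greater ? std::max(lhs, rhs)
                                            : std::min(lhs, rhs);
  }
};
```

With this shape, `BinaryOp<Extremum>::gen(Extremum{Ordering::Greater}, 3, 7)` selects the maximum while the same specialization handles the minimum case, which would not be possible if `gen` only received the operands.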

Differential Revision: https://reviews.llvm.org/D139124
This helps override LLVM libc functions for experimental purposes.

Differential Revision: https://reviews.llvm.org/D138999
This allows for more interesting manipulation of an
inflight diagnostic.
This allows for properly supporting TypeSwitch on reference
types which do not support copying/do not want copying.
Explanations of the floating-point options are updated to match
the `RenderFloatingPointOptions` function in
`clang/lib/Driver/ToolChains/Clang.cpp`.

Missing explanations are also added.

Differential Revision: https://reviews.llvm.org/D138117
SELECT TYPE lowering and conversion were not handling the
`character` type guard. This adds support for it.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D139106
ConstantOp uses `%idx<value>` and BoolConstantOp uses true/false, which
is similar to the printing for arith::ConstantOp.

Differential Revision: https://reviews.llvm.org/D139175
Subviews are supposed to be expanded before we hit the lowering
code.
The expansion is done by the pass called
expand-strided-metadata.

Add a test that demonstrates how these passes can be linked up to achieve
the desired lowering.

This patch is NFC in spirit but not in practice because `subview` gets
lowered into `reinterpret_cast(extract_strided_metadata, <some math>)`,
which lowers into two memref descriptors (one for `reinterpret_cast` and
one for `extract_strided_metadata`). This creates some noise of the
form `extractvalue(unrealized_cast(extractvalue[0]))[0]`, which is
currently not simplified within MLIR but is really just a no-op in
that case.

Differential Revision: https://reviews.llvm.org/D136377
Restricts FMV usage for subtargets without 'f' extension.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D139105
d0k and others added 14 commits December 4, 2022 18:36
  CONFLICT (content): Merge conflict in clang/lib/AST/Expr.cpp
  CONFLICT (content): Merge conflict in clang/lib/CodeGen/CGDebugInfo.cpp
  CONFLICT (content): Merge conflict in clang/lib/Basic/Targets/SPIR.h
Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@a3fc26f
The extension was added in KhronosGroup/SPIRV-Registry#148

Starting with this PR, SPV_INTEL_non_constant_addrspace_printf will
be deprecated step by step.

Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@a9391f7
@Fznamznon Fznamznon added the disable-lint Skip linter check step and proceed with build jobs label Dec 5, 2022
@Fznamznon Fznamznon marked this pull request as ready for review December 5, 2022 15:03
@Fznamznon Fznamznon requested review from a team and pvchupin as code owners December 5, 2022 15:03
@Fznamznon

SYCL / Linux / HIP AMDGPU LLVM Test Suite (pull_request_target) fails for other PRs as well.
@intel/llvm-gatekeepers please review and comment "/merge" to merge.

@pvchupin

pvchupin commented Dec 6, 2022

/merge

@bb-sycl

bb-sycl commented Dec 6, 2022

Tue 06 Dec 2022 05:06:27 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

@bb-sycl

bb-sycl commented Dec 6, 2022

Tue 06 Dec 2022 05:11:21 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.

@bb-sycl bb-sycl merged commit 3cf2bfb into intel:sycl Dec 6, 2022