[SYCL][DOC] Fix warnings after upgrading sphinx

Newer sphinx/myst versions report more invalid cross-reference targets, producing warnings such as: 'myst' cross-reference target not found: 'prog-scope-var-decl' [myst.xref_missing]
mmoadeli · Oct 27, 2023 · edccb9b
1 parent e7c0b89
commit edccb9b

Showing 17 changed files with 47 additions and 37 deletions.
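
Several hunks below lowercase in-page anchors (for example `#Requirements` becomes `#requirements`), and `conf.py` raises `myst_heading_anchors` from 4 to 5. The following is a minimal sketch of why both matter, assuming heading anchors are GitHub-style slugs (lowercased, punctuation dropped, spaces turned into hyphens). The helpers `github_style_slug` and `broken_anchor_links` are hypothetical names written for illustration; they are not part of MyST or Sphinx, and the real slug rules may differ in edge cases.

```python
import re


def github_style_slug(heading: str) -> str:
    """Approximate a GitHub-style heading slug (illustrative, not MyST's API)."""
    text = heading.strip().lower()
    # Keep only word characters, spaces, and hyphens.
    text = re.sub(r"[^\w\- ]", "", text)
    return text.replace(" ", "-")


def broken_anchor_links(markdown: str, depth: int = 5) -> list[str]:
    """Return in-page link targets that match no heading slug.

    `depth` plays the role of `myst_heading_anchors`: only headings with
    at most that many leading '#' characters are treated as anchors.
    """
    heading_re = re.compile(rf"^#{{1,{depth}}}[ \t]+(.+)$", re.MULTILINE)
    slugs = {github_style_slug(m.group(1)) for m in heading_re.finditer(markdown)}
    targets = re.findall(r"\]\(#([^)]+)\)", markdown)
    return [t for t in targets if t not in slugs]


sample = """## Requirements

See [Requirements](#Requirements) and [Requirements](#requirements).

##### Specialization constants lowering

See [lowering](#specialization-constants-lowering).
"""

# With depth 5, only the mixed-case target is unresolved.
print(broken_anchor_links(sample, depth=5))
# With depth 4, the link to the level-5 heading is unresolved as well.
print(broken_anchor_links(sample, depth=4))
```

Under these assumptions, a mixed-case target such as `#Requirements` never matches the generated slug, and a link to a level-5 heading only resolves once the anchor depth covers five levels.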
diff --git a/sycl/doc/GetStartedGuide.md b/sycl/doc/GetStartedGuide.md
@@ -45,7 +45,7 @@ and a wide range of compute accelerators such as GPU and FPGA.
 * `ninja` -
 [Download](https://github.com/ninja-build/ninja/wiki/Pre-built-Ninja-packages)
 * C++ compiler
-  * See LLVM's [host compiler toolchain requirements](../../llvm/docs/GettingStarted.rst#host-c-toolchain-both-compiler-and-standard-library)
+  * See LLVM's [host compiler toolchain requirements](https://github.com/intel/llvm/blob/sycl/llvm/docs/GettingStarted.rst#host-c-toolchain-both-compiler-and-standard-library)
 
 Alternatively, you can use a Docker image that has everything you need for
 building pre-installed:
@@ -543,7 +543,7 @@ AOT compiler for each device type:
 #### CPU
 
 * CPU AOT compiler `opencl-aot` is enabled by default. For more, see
-[opencl-aot documentation](../../opencl/opencl-aot/README.md).
+[opencl-aot documentation](https://github.com/intel/llvm/blob/sycl/opencl/opencl-aot/README.md).
 
 #### Accelerator
 
@@ -709,7 +709,7 @@ ONEAPI_DEVICE_SELECTOR=cuda:* ./simple-sycl-app-cuda.exe
 
 **NOTE**: oneAPI DPC++/SYCL developers can specify SYCL device for execution
 using device selectors (e.g. `sycl::cpu_selector_v`, `sycl::gpu_selector_v`,
-[Intel FPGA selector(s)](extensions/supported/sycl_ext_intel_fpga_device_selector.md))
+[Intel FPGA selector(s)](extensions/supported/sycl_ext_intel_fpga_device_selector.asciidoc))
 as explained in following section
 [Code the program for a specific GPU](#code-the-program-for-a-specific-gpu).
 

diff --git a/sycl/doc/conf.py b/sycl/doc/conf.py
@@ -37,7 +37,7 @@
 ]
 
 # Implicit targets for cross reference
-myst_heading_anchors = 4
+myst_heading_anchors = 5
 
 # The name of the Pygments (syntax highlighting) style to use.
 pygments_style = 'friendly'

diff --git a/sycl/doc/cuda/opencl-subgroup-vs-cuda-crosslane-op.md b/sycl/doc/cuda/opencl-subgroup-vs-cuda-crosslane-op.md
@@ -1,7 +1,7 @@
 # CUDA crosslane vs OpenCL sub-groups
 
 ## Sub-group function mapping
-This document describes the mapping of the SYCL subgroup operations (based on the proposal [SYCL subgroup proposal](../extensions/sub_group_ndrange/sub_group_ndrange.md)) to CUDA (queries responses and PTX instruction mapping)
+This document describes the mapping of the SYCL subgroup operations (based on the proposal SYCL subgroup proposal) to CUDA (queries responses and PTX instruction mapping)
 
 ### Sub-group device Queries
 

diff --git a/sycl/doc/design/Assert.md b/sycl/doc/design/Assert.md
@@ -41,7 +41,7 @@ int main() {
 In this use-case every work-item with even index along 0 dimension will trigger
 assertion failure. Assertion failure should trigger a call to `std::abort()` at
 host as described in
-[extension](../extensions/supported/SYCL_EXT_ONEAPI_ASSERT.asciidoc).
+[extension](../extensions/supported/sycl_ext_oneapi_assert.asciidoc).
 Even though multiple failures of the same or different assertions can happen in
 multiple work-items, implementation is required to deliver at least one
 assertion. The assertion failure message is printed to `stderr` by DPCPP
@@ -81,7 +81,7 @@ practical cases.
 ## How it works?
 
 `assert(expr)` macro ends up in call to `__devicelib_assert_fail`. This function
-is part of [Device library extension](DeviceLibExtensions.rst#cl_intel_devicelib_cassert).
+is part of [Device library extension](https://github.com/intel/llvm/blob/sycl/doc/design/DeviceLibExtensions.rst#cl_intel_devicelib_cassert).
 
 The format of the assert message is unspecified, but it will always include the
 text of the failing expression, the values of the standard macros `__FILE__` and
@@ -168,6 +168,7 @@ image. All of them should have `extern` declaration of program scope variable
 available. Definition of the variable is only available within devicelib in the
 same binary image where fallback `__devicelib_assert_fail` resides.
 
+(prog-scope-var-decl)=
 <a name="prog-scope-var-decl">The variable has the following structure and
 declaration:</a>
 

diff --git a/sycl/doc/design/CommandGraph.md b/sycl/doc/design/CommandGraph.md
@@ -1,7 +1,7 @@
 # Command-Graph Extension
 
 This document describes the implementation design of the
-[SYCL Graph Extension](../extensions/proposed/sycl_ext_oneapi_graph.asciidoc).
+[SYCL Graph Extension](../extensions/experimental/sycl_ext_oneapi_graph.asciidoc).
 
 A related presentation can be found
 [here](https://www.youtube.com/watch?v=aOTAmyr04rM).
@@ -121,14 +121,14 @@ proposal. Memory operations will be supported subsequently by the current
 implementation starting with `memcpy`.
 
 Buffers and accessors are supported in a command-graph. There are
-[spec restrictions](../extensions/proposed/sycl_ext_oneapi_graph.asciidoc#storage-lifetimes)
+[spec restrictions](../extensions/experimental/sycl_ext_oneapi_graph.asciidoc#storage-lifetimes)
 on buffer usage in a graph so that their lifetime semantics are compatible with
 a lazy work execution model. However these changes to storage lifetimes have not
 yet been implemented.
 
 ## Backend Implementation
 
-Implementation of [UR command-buffers](#UR-command-buffer-experimental-feature)
+Implementation of UR command-buffers
 for each of the supported SYCL 2020 backends.
 
 This is currently only Level Zero but more sub-sections will be added here as

diff --git a/sycl/doc/design/CompileTimeProperties.md b/sycl/doc/design/CompileTimeProperties.md
@@ -40,7 +40,7 @@ One use for compile-time properties is with types that are used exclusively
 for declaring global variables.  One such example is the
 [sycl\_ext\_oneapi\_device\_global][2] extension:
 
-[2]: <../extensions/proposed/sycl_ext_oneapi_device_global.asciidoc>
+[2]: <../extensions/experimental/sycl_ext_oneapi_device_global.asciidoc>
 
 ```
 namespace sycl::ext::oneapi {
@@ -271,7 +271,7 @@ proposed in the [sycl\_ext\_oneapi\_kernel\_properties][8] extension.  There
 are two ways the application can specify these properties.  The first is by
 passing a `properties` parameter to the function that submits the kernel:
 
-[8]: <../extensions/proposed/sycl_ext_oneapi_kernel_properties.asciidoc>
+[8]: <../extensions/experimental/sycl_ext_oneapi_kernel_properties.asciidoc>
 
 ```
 namespace sycl {

diff --git a/sycl/doc/design/CompilerAndRuntimeDesign.md b/sycl/doc/design/CompilerAndRuntimeDesign.md
@@ -484,7 +484,7 @@ list coming either from `llvm-spirv` or from the AOT backend.
 Targeting PTX currently only accepts a single input file for processing, so
 `file-table-tform` is used to extract the code file from the file table, which
 is then processed by the
-["PTX target processing" step](#device-code-post-link-step-for-CUDA).
+["PTX target processing" step](#device-code-post-link-step-for-cuda).
 The resulting device binary is inserted back into the file table in place of the
 extracted code file using `file-table-tform`. If `-fno-sycl-rdc` is specified,
 all shown tools are invoked multiple times, once per translation unit rather than
@@ -556,7 +556,7 @@ TBD
 
 ##### Specialization constants lowering
 
-See [corresponding documentation](SpecializationConstants.md)
+See corresponding documentation
 
 #### CUDA support
 
@@ -1011,4 +1011,4 @@ with any other address space (including default).
 
 ## DPC++ Language extensions to SYCL
 
-List of language extensions can be found at [extensions](../extensions)
+List of language extensions can be found at [extensions](https://github.com/intel/llvm/blob/sycl/doc/extensions/)
diff --git a/sycl/doc/design/DeviceAspectTraitDesign.md b/sycl/doc/design/DeviceAspectTraitDesign.md
@@ -125,6 +125,6 @@ This relies on the fact that unspecialized variants of `any_device_has` and
 
 [1]: <https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:device-aspects>
 [2]: <../extensions/proposed/sycl_ext_oneapi_device_if.asciidoc>
-[3]: <../extensions/proposed/sycl_ext_oneapi_device_architecture.asciidoc>
+[3]: <../extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc>
 [4]: <DeviceIf.md>
 [5]: <OptionalDeviceFeatures.md>
diff --git a/sycl/doc/design/DeviceConfigFile.md b/sycl/doc/design/DeviceConfigFile.md
@@ -274,7 +274,7 @@ in more detail.
 
 ### Changes to Build Infrastructure
 We need the information about the targets in multiple tools and compiler
-modules listed in [Requirements](#Requirements).  Thus, we need to make sure
+modules listed in [Requirements](#requirements).  Thus, we need to make sure
 that the generation of the `.inc` file out of the `.td` file is done in time
 for all the consumers. The command we need to run for TableGen is `llvm-tblgen
 -gen-dynamic-tables -I /llvm-root/llvm/include/ input.td -o output.inc`.
@@ -302,7 +302,7 @@ the Device Configuration File (e.g. `sycl-post-link`) so that each of the
 tools can modify the map according to the user extensions described in the 
 `.yaml` file. 
 
-As mentioned in [Requirements](#Requirements), there is an auto-detection
+As mentioned in [Requirements](#requirements), there is an auto-detection
 mechanism for `aot-toolchain` and `aot-toolchain-options` that is able to
 infer these from the target name. In the `.yaml` example shown above the target
 name is `intel_gpu_skl`. From that name, we can infer that `aot-toolchain` is

diff --git a/sycl/doc/design/DeviceGlobal.md b/sycl/doc/design/DeviceGlobal.md
@@ -4,7 +4,7 @@ This document describes the implementation design for the DPC++ extension
 [sycl\_ext\_oneapi\_device\_global][1], which allows applications to declare
 global variables in device code.
 
-[1]: <../extensions/proposed/sycl_ext_oneapi_device_global.asciidoc>
+[1]: <../extensions/experimental/sycl_ext_oneapi_device_global.asciidoc>
 
 
 ## Requirements

diff --git a/sycl/doc/design/DeviceIf.md b/sycl/doc/design/DeviceIf.md
@@ -5,7 +5,7 @@ This document describes the design for the DPC++ implementation of the
 [sycl\_ext\_oneapi\_device\_architecture][2] extensions.
 
 [1]: <../extensions/proposed/sycl_ext_oneapi_device_if.asciidoc>
-[2]: <../extensions/proposed/sycl_ext_oneapi_device_architecture.asciidoc>
+[2]: <../extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc>
 
 
 ## Phased implementation

diff --git a/sycl/doc/design/KernelProgramCache.md b/sycl/doc/design/KernelProgramCache.md
@@ -81,6 +81,7 @@ predefined HW configuration(s). As a general solution it is reasonable to have
 program persistent cache which works between application restarts (e.g. cache
 on disk for device code built for specific HW/SW configuration).
 
+(what-is-program)=
 <a name="what-is-program">1</a>: Here "program" means an internal SYCL runtime
 object corresponding to a device code module or native binary defining a set of
 SYCL kernels and/or device functions.
@@ -112,9 +113,11 @@ The kernels map's key consists of two components:
 - the program the kernel belongs to,
 - kernel name<sup>[3](#what-is-kname)</sup>.
 
+(what-is-ksid)=
 <a name="what-is-ksid">1</a>: Kernel set id is an ordinal number of the device
 binary image the kernel is contained in.
 
+(what-is-bopts)=
 <a name="what-is-bopts">2</a>: The concatenation of build options (both compile
 and link options) set in application or environment variables. There are three
 sources of build options that the cache is aware of:
@@ -131,6 +134,7 @@ values (e.g. IGC has
 which affect JIT process). Changing such configuration will invalidate cache and
 manual cache cleanup should be done.
 
+(what-is-kname)=
 <a name="what-is-kname">3</a>: Kernel name is a kernel ID mangled class' name
 which is provided to methods of `sycl::handler` (e.g. `parallel_for` or
 `single_task`).
@@ -162,9 +166,11 @@ stored on disk (in every <n>.src file located in the cache item directory):
   containing 2 files: <max_n+1>.src for key values and <max_n+1>.bin for
   built image.
 
+(what-is-diid)=
 <a name="what-is-diid">1</a>: Hash out of the device code image used as input
 for the build.
 
+(what-is-did)=
 <a name="what-is-did">2</a>: Hash out of the string which is concatenation of
 values for `info::platform::name`, `info::device::name`,
 `info::device::version`, `info::device::driver_version` parameters to
@@ -321,9 +327,11 @@ condition variable. We employ them to signal waiting threads that the build
 process for this kernel/program is finished (either successfully or with a
 failure).
 
+(remove-pointer)=
 <a name="remove-pointer">1</a>: The use of `std::remove_pointer` was omitted for
 the sake of simplicity here.
 
+(exception-data)=
 <a name="exception-data">2</a>: Actually, we store contents of the exception:
 its message and error code.
 
@@ -387,6 +395,7 @@ in a directory, the directory should be locked until file creation is done.
 Advisory locking <sup>[1](#advisory-lock)</sup> is used to ensure that the
 user/OS tools are able to manage files.
 
+(advisory-lock)=
 <a name="advisory-lock">1.</a> Advisory locks work only when a process
 explicitly acquires and releases locks, and are ignored if a process is not
 aware of locks.

diff --git a/sycl/doc/design/OffloadDesign.md b/sycl/doc/design/OffloadDesign.md
@@ -7,7 +7,7 @@ the DPC++ Compiler.  This leverages the existing community Offloading
 design [OffloadingDesign][1] which covers the Clang driver and code generation
 steps for creating offloading applications.
 
-[1]: <../../../clang/docs/OffloadingDesign.rst>
+[1]: <https://github.com/intel/llvm/blob/clang/docs/OffloadingDesign.rst>
 
 The current offloading model is completely encapsulated within the Clang
 Compiler Driver requiring the driver to perform all of the additional steps

diff --git a/sycl/doc/design/OptionalDeviceFeatures.md b/sycl/doc/design/OptionalDeviceFeatures.md
@@ -266,7 +266,7 @@ non-FPGA users may want to use the `device_global` property
 [`device_image_scope`][5], which requires even non-FPGA users to have precise
 control over the way kernels are bundled into device images.
 
-[5]: <../extensions/proposed/sycl_ext_oneapi_device_global.asciidoc#properties-for-device-global-variables>
+[5]: <../extensions/experimental/sycl_ext_oneapi_device_global.asciidoc#properties-for-device-global-variables>
 
 The new definition of `-fsycl-device-code-split` is as follows:
 
@@ -1091,10 +1091,10 @@ The "name" column in this table lists the possible target names.  Since not all
 targets have a corresponding enumerator in the `architecture` enumeration, the
 second column tells when there is such an enumerator.  The last row in this
 table corresponds to all of the architecture names listed in the
-[sycl\_ext\_intel\_device\_architecture][8] extension whose name starts with
+[sycl\_ext\_oneapi\_device\_architecture][8] extension whose name starts with
 `intel_gpu_`.
 
-[8]: <../extensions/proposed/sycl_ext_intel_device_architecture.asciidoc>
+[8]: <../extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc>
 
 TODO: This table needs to be filled out for the CPU variants supported by the
 `opencl-aot` tool (avx512, avx2, avx, sse4.2) and for the FPGA targets.  We

diff --git a/sycl/doc/design/SYCLNativeCPU.md b/sycl/doc/design/SYCLNativeCPU.md
@@ -31,7 +31,7 @@ In order to execute kernels compiled for `native-cpu`, we provide a PI Plugin. T
 
 # Supported features and current limitations
 
-The SYCL Native CPU flow is still WIP, not optimized and several core SYCL features are currently unsupported. Currently `barrier` and several math builtins are not supported, and attempting to use those will most likely fail with an `undefined reference` error at link time. Examples of supported applications can be found in the [runtime tests](sycl/test/native_cpu).
+The SYCL Native CPU flow is still WIP, not optimized and several core SYCL features are currently unsupported. Currently `barrier` and several math builtins are not supported, and attempting to use those will most likely fail with an `undefined reference` error at link time. Examples of supported applications can be found in the [runtime tests](https://github.com/intel/llvm/blob/sycl/sycl/test/native_cpu).
 
 
 To execute the `e2e` tests on the Native CPU, configure the test suite with:
@@ -93,13 +93,13 @@ entry:
 }
 ```
 
-For the Native CPU target, the device compiler is in charge of materializing the SPIRV builtins (such as `@__spirv_BuiltInGlobalInvocationId`), so that they can be correctly updated by the runtime when executing the kernel. This is performed by the [PrepareSYCLNativeCPU pass](llvm/lib/SYCLLowerIR/PrepareSYCLNativeCPU.cpp).
+For the Native CPU target, the device compiler is in charge of materializing the SPIRV builtins (such as `@__spirv_BuiltInGlobalInvocationId`), so that they can be correctly updated by the runtime when executing the kernel. This is performed by the [PrepareSYCLNativeCPU pass](https://github.com/intel/llvm/blob/sycl/llvm/lib/SYCLLowerIR/PrepareSYCLNativeCPU.cpp).
 The PrepareSYCLNativeCPUPass also emits a `subhandler` function, which receives the kernel arguments from the SYCL runtime (packed in a vector), unpacks them, and forwards only the used ones to the actual kernel. 
 
 
 ## PrepareSYCLNativeCPU Pass
 
-This pass will add a pointer to a `nativecpu_state` struct as kernel argument to all the kernel functions, and it will replace all the uses of SPIRV builtins with the return value of appropriately defined functions, which will read the requested information from the `__nativecpu_state` struct. The `__nativecpu_state` struct and the builtin functions are defined in [native_cpu.hpp](sycl/include/sycl/detail/native_cpu.hpp).
+This pass will add a pointer to a `nativecpu_state` struct as kernel argument to all the kernel functions, and it will replace all the uses of SPIRV builtins with the return value of appropriately defined functions, which will read the requested information from the `__nativecpu_state` struct. The `__nativecpu_state` struct and the builtin functions are defined in [native_cpu.hpp](https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/detail/native_cpu.hpp).
 
 
 The resulting IR is:
@@ -160,7 +160,7 @@ Each entry in the array contains the kernel name as a string, and a pointer to t
 
 ## Kernel lowering and execution
 
-The information produced by the device compiler is then employed to correctly lower the kernel LLVM-IR module to the target ISA (this is performed by the driver when `-fsycl-targets=native_cpu` is set). The object file containing the kernel code is linked with the host object file (and libsycl and any other needed library) and the final executable is ran using the Native CPU PI Plug-in, defined in [pi_native_cpu.cpp](sycl/plugins/native_cpu/pi_native_cpu.cpp).
+The information produced by the device compiler is then employed to correctly lower the kernel LLVM-IR module to the target ISA (this is performed by the driver when `-fsycl-targets=native_cpu` is set). The object file containing the kernel code is linked with the host object file (and libsycl and any other needed library) and the final executable is ran using the Native CPU PI Plug-in, defined in [pi_native_cpu.cpp](https://github.com/intel/llvm/blob/sycl/sycl/plugins/native_cpu/pi_native_cpu.cpp).
 
 ## Ongoing work
 

diff --git a/sycl/doc/design/SharedLibraries.md b/sycl/doc/design/SharedLibraries.md
@@ -351,7 +351,7 @@ of defined symbols. If this assumption is not correct, there can be two cases:
   device image is taken to use duplicated symbol
 - Same symbols have different definitions. In this case ODR violation takes
   place, such situation leads to undefined behaviour. For more details refer
-  to [ODR violations](#ODR-violations) section.
+  to [ODR violations](#odr-violations) section.
   - The situation when two device images of different formats define the same
     symbols with two different definitions is not considered as ODR violation.
     In this case the suitable device image will be picked.