Adds a user manual to dpctl documentation #712

Merged
merged 12 commits into from
Dec 23, 2021
2 changes: 1 addition & 1 deletion .github/workflows/generate-docs.yml
@@ -49,7 +49,7 @@ jobs:
if: ${{ !github.event.pull_request || github.event.action != 'closed' }}
shell: bash -l {0}
run: |
pip install numpy cython setuptools sphinx sphinx_rtd_theme pydot graphviz
pip install numpy cython setuptools sphinx sphinx_rtd_theme pydot graphviz sphinxcontrib-programoutput
- name: Checkout repo
uses: actions/checkout@v2
with:
42 changes: 34 additions & 8 deletions docs/conf.in
@@ -1,23 +1,43 @@
# Data Parallel Control (dpctl)
#
# Copyright 2020-2021 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
import os
import sys

from docutils.parsers.rst import directives
from sphinx.ext.autosummary import Autosummary, get_documenter
from sphinx.util.inspect import safe_getattr

import dpctl

sys.path.insert(0, os.path.abspath("."))

import extlinks_gen as urlgen

# -- Project information -----------------------------------------------------

project = "Data-parallel Control (dpctl)"
copyright = "2020, Intel Corp."
copyright = "2020-21, Intel Corp."
author = "Intel Corp."

version = dpctl.__version__.strip(".dirty")
@@ -31,13 +51,15 @@ release = dpctl.__version__.strip(".dirty")
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
"sphinx.ext.todo",
"sphinx.ext.coverage",
"sphinx.ext.viewcode",
"sphinx.ext.githubpages",
"sphinx.ext.autodoc",
"sphinx.ext.autosummary",
"sphinx.ext.coverage",
"sphinx.ext.extlinks",
"sphinx.ext.githubpages",
"sphinx.ext.napoleon",
"sphinx.ext.todo",
"sphinx.ext.viewcode",
"sphinxcontrib.programoutput",
]

todo_include_todos = True
@@ -209,3 +231,7 @@ class AutoAutoSummary(Autosummary):

def setup(app):
app.add_directive("autoautosummary", AutoAutoSummary)


# A dictionary of urls
extlinks = urlgen.create_extlinks()
7 changes: 3 additions & 4 deletions docs/docfiles/intro.rst
@@ -2,10 +2,9 @@ Welcome to Data-parallel Control (dpctl)'s documentation!
=========================================================

The data-parallel control (dpctl) library provides C and Python bindings for
`SYCL 2020 <https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html>`_.
The SYCL 2020 features supported by dpctl are limited to those included by
Intel's DPCPP compiler and specifically cover the SYCL runtime classes described
in `Section 4.6 <https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_sycl_runtime_classes>`_
:sycl_spec_2020:`SYCL 2020 <>`. The SYCL 2020 features supported by dpctl are
limited to those included by Intel's DPCPP compiler and specifically cover the
SYCL runtime classes described in :sycl_runtime_classes:`Section 4.6 <>`
of the SYCL 2020 specification. Apart from the bindings for these runtime
classes, dpctl includes bindings for SYCL USM memory allocators and
deallocators. Dpctl's Python API provides classes that implement
16 changes: 16 additions & 0 deletions docs/docfiles/urls.json
@@ -0,0 +1,16 @@
{
"dpcpp_envar": "https://github.com/intel/llvm/blob/sycl/sycl/doc/EnvironmentVariables.md",
"numa_domain": "https://en.wikipedia.org/wiki/Non-uniform_memory_access",
"oneapi": "https://www.oneapi.io/",
"oneapi_filter_selection": "https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/FilterSelector/FilterSelector.adoc",
"sycl_aspects": "https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#table.device.aspect",
"sycl_context": "https://sycl.readthedocs.io/en/latest/iface/context.html",
"sycl_device": "https://sycl.readthedocs.io/en/latest/iface/device.html",
"sycl_device_info": "https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_device_information_descriptors",
"sycl_device_selector": "https://sycl.readthedocs.io/en/latest/iface/device-selector.html",
"sycl_event": "https://sycl.readthedocs.io/en/latest/iface/event.html",
"sycl_platform": "https://sycl.readthedocs.io/en/latest/iface/platform.html",
"sycl_queue": "https://sycl.readthedocs.io/en/latest/iface/queue.html",
"sycl_runtime_classes": "https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_sycl_runtime_classes",
"sycl_spec_2020": "https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html"
}
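The `extlinks_gen` module that `conf.in` imports is not part of this diff, so the following is only a sketch of how such a helper could turn `urls.json` into the mapping that `sphinx.ext.extlinks` expects. The function names, the `json_path` parameter, and the choice of `None` as the caption are assumptions made for illustration, not the project's actual code.

```python
import json
from pathlib import Path


def extlinks_from_mapping(urls):
    """Convert {role_name: url} into the {role_name: (template, caption)} form.

    Appending "%s" to each URL means an empty target, as in
    :oneapi:`oneAPI <>`, resolves to the bare URL.
    """
    return {name: (url + "%s", None) for name, url in urls.items()}


def create_extlinks(json_path):
    """Hypothetical stand-in for extlinks_gen.create_extlinks().

    Reads a JSON object of role names and URLs from json_path.
    """
    urls = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return extlinks_from_mapping(urls)
```

With a mapping built this way, a role such as `:sycl_spec_2020:` becomes available in every reStructuredText file, which is why the hard-coded links in `intro.rst` could be replaced. URLs that themselves contain a literal `%` would need escaping before the template is formed.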
11 changes: 0 additions & 11 deletions docs/docfiles/urls.rst

This file was deleted.

47 changes: 22 additions & 25 deletions docs/docfiles/user_guides/QuickStart.rst
@@ -4,14 +4,8 @@
Quick Start Guide
#################


.. contents:: Table of contents
:local:
:backlinks: none
:depth: 3

Installing from oneAPI
----------------------
======================

Dpctl is available as part of the oneAPI Intel Distribution of Python (IDP).
Please follow `oneAPI installation guide`_ to install oneAPI. In this quick
@@ -50,7 +44,7 @@ On Windows
`GPU driver installation guide`_.

Install Wheel package from Pypi
-------------------------------
===============================

Dpctl can also be installed from PyPI.

@@ -79,20 +73,21 @@ On Windows
set PATH=<path_to_your_env>\bin;<path_to_your_env>\Library\bin;%PATH%

Building from source
--------------------
====================

To build dpctl from source, we need dpcpp and GPU drivers (and optionally CPU
OpenCL drivers). It is preferable to use the dpcpp compiler packaged as part of
oneAPI. However, it is possible to use a custom build of dpcpp to build dpctl,
especially if you want to enable CUDA support.

Building using oneAPI dpcpp
~~~~~~~~~~~~~~~~~~~~~~~~~~~
---------------------------

As before, oneAPI and graphics drivers should be installed on the system prior
to proceeding further.

**Activate oneAPI as follows**
Activate oneAPI as follows
~~~~~~~~~~~~~~~~~~~~~~~~~~

On Linux

@@ -106,7 +101,8 @@ On Windows

call "%ONEAPI_ROOT%\setvars.bat"

**Build and install using conda-build**
Build and install using conda-build
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The conda-recipe included with the sources can be used to build the dpctl
package. The advantage of this approach is that all dependencies are pulled in
@@ -136,7 +132,9 @@ After building the conda package you may install it by executing:
You could face issues with conda-build version 3.20. Use conda-build
3.18 instead.

**Build and Install with setuptools**

Build and install with setuptools
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To build using Python ``setuptools``, the following packages should be
installed:
@@ -164,13 +162,13 @@ to build and install
python setup.py develop

Building using custom dpcpp
~~~~~~~~~~~~~~~~~~~~~~~~~~~
---------------------------

It is possible to build dpctl from source using a custom
`DPC++ toolchain <https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md>`_
instead of the DPC++ compiler that comes with oneAPI. One reason for doing this
may be to enable support for CUDA devices.

Following steps in :ref:`Build and Install with setuptools` use command line
Following steps in `Build and install with setuptools`_ use command line
option :code:`--sycl-compiler-prefix`, for example:

.. code-block:: bash
@@ -181,7 +179,7 @@ Available options and their descriptions can be retrieved using option
:code:`--help`.

Using dpctl
-----------
===========

Dpctl requires a DPC++ runtime. When dpctl is installed via conda then it uses
the DPC++ runtime from ``dpcpp_cpp_rt`` package that is part of IDP. When using
@@ -190,10 +188,10 @@ the system. The easiest way to setup a DPC++ runtime will be by activating
oneAPI.

Running examples and tests
--------------------------
==========================

Running the examples
~~~~~~~~~~~~~~~~~~~~
--------------------

After setting up dpctl you can try out the Python examples as follows:

@@ -213,7 +211,7 @@ located under *examples/cython*. Each example in the folder can be built using
examples.

Running the Python tests
~~~~~~~~~~~~~~~~~~~~~~~~
------------------------

The dpctl Python test suite can be executed as follows:

@@ -222,14 +220,13 @@ The dpctl Python test suite can be executed as follows:
pytest --pyargs dpctl


Building the C API shared library
---------------------------------
Building the DPCTLSyclInterface library
=======================================

The dpctl C API is a shared library called libDPCTLSyclInterface and is built
together when build the Python package. However, it is possible to only build
the C API as a standalone library. To do so, you will need ``cmake``,
The libDPCTLSyclInterface is a shared library used by the Python package.
To build the library you will need ``DPC++`` toolchain, ``cmake``,
``ninja`` or ``make``, and optionally ``gtest 1.10`` if you wish to run the
C API test suite.
test suite.

For example, on Linux the following script can be used to build the C oneAPI
library.
10 changes: 10 additions & 0 deletions docs/docfiles/user_guides/UserManual.rst
@@ -0,0 +1,10 @@
.. _user_manual:

###########
User Manual
###########

.. toctree::
:maxdepth: 3

manual/dpctl/intro
75 changes: 75 additions & 0 deletions docs/docfiles/user_guides/manual/dpctl/basic_concepts.rst
@@ -0,0 +1,75 @@
.. _basic_concepts:

Basic Concepts
==============

This section introduces the basic concepts for XPU management used by dpctl.
As dpctl is based on SYCL, the concepts should be familiar to users with prior
experience with SYCL. However, users of dpctl need not have any prior experience
with SYCL, and the concepts presented here should be self-sufficient. We do not
go into all the SYCL-level details here; if needed, readers should refer to a
more topical SYCL reference such as the :sycl_spec_2020:`SYCL 2020 spec <>`.

* **Heterogeneous computing**
Refers to using multiple devices in a program.

* **Host**
Every program starts by running on a host, and most of the lines of code in
a program, in particular lines of code implementing the Python interpreter
itself, are usually for the host. Hosts are customarily CPUs.

* **Device**
A device is an XPU connected to a host that is programmable with a specific
device driver. Different types of devices can have different architectures
(CPUs, GPUs, FPGAs, ASICs, DSPs), but are programmable using the same
:oneapi:`oneAPI <>` programming model.

* **Platform**
A device driver installed on the system is termed a platform. As multiple
devices of the same type can share the same device driver, a platform may
contain multiple devices. Note that the same physical hardware (say, a GPU)
may be reflected as two separate devices if it can be programmed by more
than one platform, *e.g.*, the same GPU hardware can be listed as an
OpenCL GPU device and a Level-Zero GPU device.

* **Context**
A context holds the run-time information needed to operate on a device or a
group of devices from the same platform. Contexts are relatively expensive
to create and should be reused as much as possible.

* **Queue**
A queue is needed to schedule the execution of any computation or data
copying on a device. Queue construction requires specifying a device
and a context targeting that device, as well as additional properties,
such as whether profiling information should be collected or whether submitted
tasks are executed in the order in which they were submitted.

* **Event**
An event holds information related to a computation or data movement operation
scheduled for execution on a queue, such as its execution status, as well
as profiling information if the queue the task was submitted to allowed
for collection of such information. Events can be used to specify task
dependencies as well as to synchronize host and devices.

* **USM**
Unified Shared Memory (USM) refers to pointer-based device memory management.
USM allocations are bound to a context. In other words, a pointer representing
a USM allocation can be unambiguously mapped to the data it represents only
if the associated context is known. USM allocations are accessible by
computational kernels that are executed on a device, provided that the
allocation is bound to the same context that was used to construct the queue
where the kernel was scheduled for execution.

Depending on the capability of the device, a USM allocation can be a "device"
allocation, a "shared" allocation, or a "host" allocation. A "device"
allocation is not accessible from the host, while "shared" and "host"
allocations are. A "host" allocation is an allocation in host memory that is
accessible from a device.

"Shared" allocations are accessible by both host and device. The runtime
manages synchronization of the host's and device's views of shared allocations.
The initial placement of shared allocations is not defined.

* **Backend**
Refers to an implementation of the :oneapi:`oneAPI <>` programming model
exposed by the underlying runtime.
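
The concepts above can be sketched with dpctl's Python API. This is a minimal illustration, not part of the manual added by this change. It assumes that dpctl and a DPC++ runtime are installed and that the default-selected device supports "shared" USM allocations. The helper names `summarize_devices` and `shared_usm_roundtrip` are made up for this example, and both helpers return empty results when dpctl cannot be imported.

```python
try:
    import dpctl
    import dpctl.memory as dpmem
except ImportError:  # dpctl or its DPC++ runtime is not installed
    dpctl = None


def summarize_devices():
    """Return a (backend, device type, name) tuple for each visible device."""
    if dpctl is None:
        return []
    return [
        (str(dev.backend), str(dev.device_type), dev.name)
        for dev in dpctl.get_devices()
    ]


def shared_usm_roundtrip(payload):
    """Copy bytes through a "shared" USM allocation and back to the host.

    Returns None when dpctl is unavailable or no device is visible.
    """
    if dpctl is None or not dpctl.get_devices():
        return None
    # A queue built without arguments targets the default-selected device
    # and carries a context for that device.
    queue = dpctl.SyclQueue()
    # The allocation is bound to the context of the queue passed here.
    usm = dpmem.MemoryUSMShared(len(payload), queue=queue)
    usm.copy_from_host(bytearray(payload))
    same_context = usm.sycl_context == queue.sycl_context
    return bytes(usm.copy_to_host()), same_context
```

The same physical GPU may appear twice in the output of `summarize_devices()`, once per backend, which mirrors the platform description above.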