This repository was archived by the owner on Jan 25, 2023. It is now read-only.

Move README.rst and HowTo.rst to numba-dppy #107

Merged
merged 5 commits into from Nov 17, 2020
141 changes: 70 additions & 71 deletions README.rst
@@ -1,97 +1,96 @@
Numba with PyDPPL
=================
*********************************
Numba with patches for numba-dppy
*********************************

========
1. What?
========
.. image:: https://badges.gitter.im/numba/numba.svg
:target: https://gitter.im/numba/numba?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge
:alt: Gitter

DPPL proof-of-concept backend for NUMBA to support compilation for Intel CPU and
GPU architectures. The present implementation of DPPL is based on OpenCL 2.1,
but is likely to change in the future to rely on Sycl/DPC++ or Intel Level-0
driver API.
.. image:: https://img.shields.io/badge/discuss-on%20discourse-blue
:target: https://numba.discourse.group/
:alt: Discourse

=================
2. Prerequisites?
=================
Patches for numba-dppy
######################

- Bash : Available on the system (need not be the default shell)
- Tar : To extract files
- Git : To fetch required dependencies listed below
- C/C++ compiler : To build the dependencies
- Cmake : For managing build process of dependencies
- Python3 : Version 3 is required
- Conda or miniconda : Can be found at https://docs.conda.io/en/latest/miniconda.html
- OpenCL 2.1 driver : DPPL currently works for both Intel GPUs and CPUs if a correct OpenCL driver version is found on the system.
  Note: to use the GPU, users should be added to the "video" user group on Linux systems.
See https://github.com/IntelPython/numba-dppy.
If the `numba-dppy` package is installed, this version of Numba provides
additional features.
Without the `numba-dppy` package, this version of Numba works like the original Numba.

A Just-In-Time Compiler for Numerical Functions in Python
#########################################################

The following prerequisites need to be present on the system. Refer to the next section for more details.
*******************************************************************************************************
Numba is an open source, NumPy-aware optimizing compiler for Python sponsored
by Anaconda, Inc. It uses the LLVM compiler project to generate machine code
from Python syntax.

- NUMBA v0.51 : The DPPL backend has only been tested for NUMBA v0.51.
The included install script downloads and applies
the DPPy patch to the correct NUMBA version.
Numba can compile a large subset of numerically-focused Python, including many
NumPy functions. Additionally, Numba has support for automatic
parallelization of loops, generation of GPU-accelerated code, and creation of
ufuncs and C callbacks.

- LLVM-SPIRV translator: Used for SPIRV generation from LLVM IR.
For more information about Numba, see the Numba homepage:
http://numba.pydata.org

- LLVMDEV : To support LLVM IR generation.
Supported Platforms
===================

- Others : All existing dependencies for NUMBA, such as llvmlite, also apply to DPPL.
* Operating systems and CPU:

==================
3. How to install?
==================
Install Prerequisites
*********************
Make sure the following dependencies of NUMBA-PyDPPL are installed
in your conda environment:
- Linux: x86 (32-bit), x86_64, ppc64le (POWER8 and 9), ARMv7 (32-bit),
ARMv8 (64-bit)
- Windows: x86, x86_64
- macOS: x86_64

- llvmlite =0.33
- spirv-tools
- llvm-spirv
- llvmdev
- dpCtl =0.3
* (Optional) Accelerators and GPUs:

Make sure the dependencies are installed with a consistent version of LLVM 10.
* NVIDIA GPUs (Kepler architecture or later) via CUDA driver on Linux, Windows,
macOS (< 10.14)
* AMD GPUs via ROCm driver on Linux

Install dpCtl backend
*********************
NUMBA-PyDPPL also depends on the dpCtl backend, which can be found `here <https://github.com/IntelPython/dpCtl>`_.
Please install dpCtl from the package.
Dependencies
============

Install NUMBA-PyDPPL
********************
After all the dependencies are installed, please run ``build_for_develop.sh``
to get a local installation of NUMBA-PyDPPL.
* Python versions: 3.6-3.8
* llvmlite 0.34.*
* NumPy >=1.15 (can build with 1.11 for ABI compatibility)

================
4. Running tests
================
Optionally:

To make sure the installation was successful, try running the examples and the
test suite:
* Scipy >=1.0.0 (for ``numpy.linalg`` support)

$PATH_TO_NUMBA-PyDPPL/numba/dppl/examples/

To run the test suite execute the following:
Installing
==========

.. code-block:: bash
The easiest way to install Numba and get updates is by using the Anaconda
Distribution: https://www.anaconda.com/download

python -m numba.runtests numba.dppl.tests
::

===========================
5. How Tos and Known Issues
===========================
$ conda install numba

Refer to the HowTo.rst guide for an overview of the programming semantics,
examples, supported functionalities, and known issues.
For more options, see the Installation Guide: http://numba.pydata.org/numba-doc/latest/user/installing.html

Note: installing while the Intel oneAPI Base Toolkit is activated has been shown to throw an error
during installation of NUMBA-PyDPPL because of an incompatible TBB interface;
one way around that is to temporarily point the env variable TBBROOT somewhere else.
Documentation
=============

===================
6. Reporting issues
===================
http://numba.pydata.org/numba-doc/latest/index.html


Mailing Lists
=============

Join the Numba mailing list numba-users@continuum.io:
https://groups.google.com/a/continuum.io/d/forum/numba-users

Some old archives are at: http://librelist.com/browser/numba/


Continuous Integration
======================

Please use https://github.com/IntelPython/numba/issues to report issues and bugs.
.. image:: https://dev.azure.com/numba/numba/_apis/build/status/numba.numba?branchName=master
:target: https://dev.azure.com/numba/numba/_build/latest?definitionId=1?branchName=master
:alt: Azure Pipelines
42 changes: 21 additions & 21 deletions HowTo.rst → numba-dppy/HowTo.rst
@@ -2,28 +2,28 @@
Features
========

DPPL is currently implemented using OpenCL 2.1. The features currently available
DPPY is currently implemented using OpenCL 2.1. The features currently available
are listed below with the help of sample code snippets. In this release we have
the implementation of the OAK approach described in MS138 in section 4.3.2. The
new decorator is described below.

To access these features, the driver module has to be imported from numba.dppl.dppl_driver
To access these features, the driver module has to be imported from numba_dppy.dppl_driver

New Decorator
=============

The new decorator included in this release is *dppl.kernel*. Currently this decorator
The new decorator included in this release is *numba_dppy.kernel*. Currently this decorator
takes only one option *access_types* which is explained below with the help of an example.
Users can write OpenCL-style kernels where they can identify the global id of the work item
being executed. The supported methods inside a decorated function are:

- dppl.get_global_id(dimidx)
- dppl.get_local_id(dimidx)
- dppl.get_group_num(dimidx)
- dppl.get_num_groups(dimidx)
- dppl.get_work_dim()
- dppl.get_global_size(dimidx)
- dppl.get_local_size(dimidx)
- numba_dppy.get_global_id(dimidx)
- numba_dppy.get_local_id(dimidx)
- numba_dppy.get_group_num(dimidx)
- numba_dppy.get_num_groups(dimidx)
- numba_dppy.get_work_dim()
- numba_dppy.get_global_size(dimidx)
- numba_dppy.get_local_size(dimidx)
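These indexing intrinsics address a flat iteration space of ``global_size`` work items. As a device-free illustration (plain Python, not the numba-dppy API; the helper names below are hypothetical), a kernel body that uses ``get_global_id(0)`` behaves as if it were run once per index:

```python
import numpy as np

def emulate_kernel(kernel_body, global_size, *args):
    # Each work item executes the kernel body once; its global id in
    # dimension 0 is just its index in [0, global_size).
    for global_id in range(global_size):
        kernel_body(global_id, *args)

def data_parallel_sum_body(i, a, b, c):
    # Stands in for a kernel body where i = get_global_id(0)
    c[i] = a[i] + b[i]

a = np.arange(10.0)
b = 2.0 * np.arange(10.0)
c = np.zeros(10)
emulate_kernel(data_parallel_sum_body, 10, a, b, c)
```

On a real device the iterations run concurrently, which is why the kernel body may only use the global id to pick its own element.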

Currently no support is provided for local memory in the device and everything is in the
global memory. Barrier and other memory fences will be provided once support for local
@@ -61,7 +61,7 @@
Primitive types are passed by value to the kernel, currently supported are int,
Math Kernels
============

This release has support for math kernels. See numba/dppl/tests/dppl/test_math_functions.py
This release has support for math kernels. See numba_dppy/tests/dppl/test_math_functions.py
for more details.


@@ -72,7 +72,7 @@
Examples
Sum of two 1d arrays
====================

Full example can be found at numba/dppl/examples/sum.py.
Full example can be found at numba_dppy/examples/sum.py.

To write a program that sums two 1d arrays, we first need an OpenCL device environment.
We can get the environment by using *ocldrv.runtime.get_gpu_device()* for getting the
@@ -82,7 +82,7 @@
where *device_env.copy_array_to_device(data)* will read the ndarray and copy that
and *ocldrv.DeviceArray(device_env.get_env_ptr(), data)* will create a buffer in the device
that has the same memory size as the ndarray being passed. The OpenCL Kernel in the
following example is *data_parallel_sum*. To get the id of the work item we are currently
executing we need to use *dppl.get_global_id(0)*; since this example uses only 1 dimension,
executing we need to use *numba_dppy.get_global_id(0)*; since this example uses only 1 dimension,
we only need to get the id in dimension 0.

While invoking the kernel we need to pass the device environment and the global work size.
@@ -91,9 +91,9 @@
back to the host and we can use *device_env.copy_array_from_device(ddata)*.

.. code-block:: python

@dppl.kernel
@numba_dppy.kernel
def data_parallel_sum(a, b, c):
i = dppl.get_global_id(0)
i = numba_dppy.get_global_id(0)
c[i] = a[i] + b[i]

global_size = 10
@@ -126,7 +126,7 @@
ndArray Support

Passing an ndarray directly to kernels is also supported.

Full example can be found at numba/dppl/examples/sum_ndarray.py
Full example can be found at numba_dppy/examples/sum_ndarray.py

To avail of this feature, instead of creating device buffers explicitly as in the previous
example, users can directly pass the ndarray to the kernel. Internally it will result in
@@ -148,7 +148,7 @@
Reduction

This example will demonstrate a sum reduction of 1d array.

Full example can be found at numba/dppl/examples/sum_reduction.py.
Full example can be found at numba_dppy/examples/sum_reduction.py.

In this example to sum the 1d array we invoke the Kernel multiple times.
This can be implemented by invoking the kernel once, but that requires
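The repeated-invocation idea — each pass sums pairs of elements so the active range halves — can be sketched device-free in NumPy (an illustration only, not the numba-dppy API; a power-of-two length is assumed for simplicity):

```python
import numpy as np

def reduction_pass(a, n):
    # One "kernel launch": work item i computes a[i] += a[i + n // 2],
    # leaving the partial sums in the first half of the active range.
    half = n // 2
    a[:half] += a[half:n]
    return half

a = np.arange(16.0)          # power-of-two length for simplicity
expected = float(a.sum())
n = a.size
while n > 1:                 # repeated "kernel invocations"
    n = reduction_pass(a, n)
# a[0] now holds the total sum
```

Each pass maps onto one kernel invocation over ``n // 2`` work items; the loop on the host replaces a device-side barrier between passes.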
@@ -161,15 +161,15 @@
ParFor Support

*Parallel For* is supported in this release for up to 3 dimensions.

Full examples can be found in numba/dppl/examples/pa_examples/
Full examples can be found in numba_dppy/examples/pa_examples/
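Conceptually, a *Parallel For* treats every index tuple as an independent work item. A device-free sketch of the 2-dimensional case (plain NumPy; the helper names are hypothetical, not the numba-dppy API):

```python
import numpy as np

def parfor_2d(body, shape, *args):
    # Every (i, j) index pair is an independent work item, so a runtime
    # is free to run them in parallel; here they run sequentially.
    for i in range(shape[0]):
        for j in range(shape[1]):
            body(i, j, *args)

def scaled_add_body(i, j, a, b, c):
    c[i, j] = a[i, j] + 2.0 * b[i, j]

a = np.ones((4, 3))
b = np.ones((4, 3))
c = np.zeros((4, 3))
parfor_2d(scaled_add_body, c.shape, a, b, c)
```

The independence of iterations is what lets the compiler distribute the loop nest over up to 3 dimensions of work items.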


=======
Testing
=======

All examples can be found in numba/dppl/examples/
All examples can be found in numba_dppy/examples/

All tests can be found in numba/dppl/tests/dppl and can be triggered by the following command:
All tests can be found in numba_dppy/tests/dppl and can be triggered by the following command:

``python -m numba.runtests numba.dppl.tests``
``python -m numba.runtests numba_dppy.tests``
54 changes: 49 additions & 5 deletions numba-dppy/README.md
@@ -1,12 +1,15 @@
# numba-dppy

## Numba + dpCtl + dpNP = numba-dppy
## Numba + dpPy + dpCtl + dpNP = numba-dppy

`numba-dppy` extends Numba with a new backend to support compilation
for Intel CPU and GPU architectures.

For more information about Numba, see the Numba homepage:
http://numba.pydata.org
http://numba.pydata.org

Note: `numba-dppy` requires a patched version of Numba.
See https://github.com/IntelPython/numba.

For more information about dpCtl, see the dpCtl homepage:
https://intelpython.github.io/dpctl/
@@ -16,6 +19,47 @@
https://intelpython.github.io/dpnp/

## Dependencies

* numba
* dpCtl
* dpNP (optional)
* numba >=0.51 (IntelPython/numba)
* dpCtl >=0.3.8
* dpNP >=0.3 (optional)
* llvm-spirv (SPIRV generation from LLVM IR)
* llvmdev (LLVM IR generation)
* spirv-tools

## dpPy

dpPy is a proof-of-concept backend for NUMBA to support compilation for
Intel CPU and GPU architectures.
The present implementation of dpPy is based on OpenCL 2.1, but is likely
to change in the future to rely on Sycl/DPC++ or Intel Level-0 driver API.

## Installation

Use `setup.py` or conda (see the conda-recipe).

## Testing

See folder `numba_dppy/tests`.

Run tests:
```bash
python -m numba.runtests numba_dppy.tests
```

## Examples

See folder `numba_dppy/examples`.

Run examples:
```bash
python numba_dppy/examples/sum.py
```

## How Tos

Refer to the HowTo.rst guide for an overview of the programming semantics,
examples, supported functionalities, and known issues.

## Reporting issues

Please use https://github.com/IntelPython/numba-dppy/issues to report issues and bugs.