This repository was archived by the owner on Jan 25, 2023. It is now read-only.

Move README.rst and HowTo.rst to numba-dppy #107

Merged
merged 5 commits into from Nov 17, 2020
141 changes: 70 additions & 71 deletions README.rst
@@ -1,97 +1,96 @@
Numba with PyDPPL
=================
*********************************
Numba with patches for numba-dppy
*********************************

========
1. What?
========
.. image:: https://badges.gitter.im/numba/numba.svg
:target: https://gitter.im/numba/numba?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge
:alt: Gitter

DPPL proof-of-concept backend for NUMBA to support compilation for Intel CPU and
GPU architectures. The present implementation of DPPL is based on OpenCL 2.1,
but is likely to change in the future to rely on Sycl/DPC++ or Intel Level-0
driver API.
.. image:: https://img.shields.io/badge/discuss-on%20discourse-blue
:target: https://numba.discourse.group/
:alt: Discourse

=================
2. Prerequisites?
=================
Patches for numba-dppy
######################

- Bash : Available on the system (need not be the default shell)
- Tar : To extract files
- Git : To fetch required dependencies listed below
- C/C++ compiler : To build the dependencies
- Cmake : For managing build process of dependencies
- Python3 : Version 3 is required
- Conda or miniconda : Can be found at https://docs.conda.io/en/latest/miniconda.html
- OpenCL 2.1 driver : DPPL currently works for both Intel GPUs and CPUs if a correct OpenCL driver version is found on the system.
  Note: to use the GPU, users should be added to the "video" user group on Linux systems.
See https://github.com/IntelPython/numba-dppy.
If the `numba-dppy` package is installed, this version of Numba provides
additional features.
Without the `numba-dppy` package, this version of Numba works like the original Numba.

A Just-In-Time Compiler for Numerical Functions in Python
#########################################################

The following prerequisites need to be present on the system. Refer to the next section for more details.
*******************************************************************************************************
Numba is an open source, NumPy-aware optimizing compiler for Python sponsored
by Anaconda, Inc. It uses the LLVM compiler project to generate machine code
from Python syntax.

- NUMBA v0.51 : The DPPL backend has only been tested for NUMBA v0.51.
The included install script downloads and applies
the DPPy patch to the correct NUMBA version.
Numba can compile a large subset of numerically-focused Python, including many
NumPy functions. Additionally, Numba has support for automatic
parallelization of loops, generation of GPU-accelerated code, and creation of
ufuncs and C callbacks.

- LLVM-SPIRV translator: Used for SPIRV generation from LLVM IR.
For more information about Numba, see the Numba homepage:
http://numba.pydata.org

- LLVMDEV : To support LLVM IR generation.
Supported Platforms
===================

- Others : All existing dependencies for NUMBA, such as llvmlite, also apply to DPPL.
* Operating systems and CPU:

==================
3. How to install?
==================
Install Prerequisites
*********************
Make sure the following dependencies of NUMBA-PyDPPL are installed
in your conda environment:
- Linux: x86 (32-bit), x86_64, ppc64le (POWER8 and 9), ARMv7 (32-bit),
ARMv8 (64-bit)
- Windows: x86, x86_64
- macOS: x86_64

- llvmlite =0.33
- spirv-tools
- llvm-spirv
- llvmdev
- dpCtl =0.3
* (Optional) Accelerators and GPUs:

Make sure the dependencies are installed with a consistent version of LLVM 10.
* NVIDIA GPUs (Kepler architecture or later) via CUDA driver on Linux, Windows,
macOS (< 10.14)
* AMD GPUs via ROCm driver on Linux

Install dpCtl backend
*********************
NUMBA-PyDPPL also depends on the dpCtl backend, which can be found `here <https://github.com/IntelPython/dpCtl>`_.
Please install dpCtl from the package.
Dependencies
============

Install NUMBA-PyDPPL
********************
After all the dependencies are installed, please run ``build_for_develop.sh``
to get a local installation of NUMBA-PyDPPL.
* Python versions: 3.6-3.8
* llvmlite 0.34.*
* NumPy >=1.15 (can build with 1.11 for ABI compatibility)

================
4. Running tests
================
Optionally:

To make sure the installation was successful, try running the examples and the
test suite:
* Scipy >=1.0.0 (for ``numpy.linalg`` support)

$PATH_TO_NUMBA-PyDPPL/numba/dppl/examples/

To run the test suite execute the following:
Installing
==========

.. code-block:: bash
The easiest way to install Numba and get updates is by using the Anaconda
Distribution: https://www.anaconda.com/download

python -m numba.runtests numba.dppl.tests
::

===========================
5. How Tos and Known Issues
===========================
$ conda install numba

Refer to the HowTo.rst guide for an overview of the programming semantics,
examples, supported functionalities, and known issues.
For more options, see the Installation Guide: http://numba.pydata.org/numba-doc/latest/user/installing.html

Note: installing while the Intel oneAPI Base Toolkit is activated has been shown to throw an error
during installation of NUMBA-PyDPPL because of an incompatible TBB interface;
one way around that is to temporarily point the env variable TBBROOT somewhere else.
Documentation
=============

===================
6. Reporting issues
===================
http://numba.pydata.org/numba-doc/latest/index.html


Mailing Lists
=============

Join the Numba mailing list numba-users@continuum.io:
https://groups.google.com/a/continuum.io/d/forum/numba-users

Some old archives are at: http://librelist.com/browser/numba/


Continuous Integration
======================

Please use https://github.com/IntelPython/numba/issues to report issues and bugs.
.. image:: https://dev.azure.com/numba/numba/_apis/build/status/numba.numba?branchName=master
:target: https://dev.azure.com/numba/numba/_build/latest?definitionId=1?branchName=master
:alt: Azure Pipelines
42 changes: 21 additions & 21 deletions HowTo.rst → numba-dppy/HowTo.rst
@@ -2,28 +2,28 @@
Features
========

DPPL is currently implemented using OpenCL 2.1. The features currently available
DPPY is currently implemented using OpenCL 2.1. The features currently available
are listed below with the help of sample code snippets. In this release we have
the implementation of the OAK approach described in MS138 in section 4.3.2. The
new decorator is described below.

To access these features, the driver module has to be imported from numba.dppl.dppl_driver
To access these features, the driver module has to be imported from numba_dppy.dppl_driver

New Decorator
=============

The new decorator included in this release is *dppl.kernel*. Currently this decorator
The new decorator included in this release is *numba_dppy.kernel*. Currently this decorator
takes only one option *access_types* which is explained below with the help of an example.
Users can write OpenCL-style kernels where they can identify the global id of the work item
being executed. The supported methods inside a decorated function are:

- dppl.get_global_id(dimidx)
- dppl.get_local_id(dimidx)
- dppl.get_group_num(dimidx)
- dppl.get_num_groups(dimidx)
- dppl.get_work_dim()
- dppl.get_global_size(dimidx)
- dppl.get_local_size(dimidx)
- numba_dppy.get_global_id(dimidx)
- numba_dppy.get_local_id(dimidx)
- numba_dppy.get_group_num(dimidx)
- numba_dppy.get_num_groups(dimidx)
- numba_dppy.get_work_dim()
- numba_dppy.get_global_size(dimidx)
- numba_dppy.get_local_size(dimidx)
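These indexing intrinsics address a flat iteration space of ``global_size`` work items. As a device-free illustration (plain Python, not the numba-dppy API; the helper names below are hypothetical), a kernel body that uses ``get_global_id(0)`` behaves as if it were run once per index:

```python
import numpy as np

def emulate_kernel(kernel_body, global_size, *args):
    # Each work item executes the kernel body once; its global id in
    # dimension 0 is just its index in [0, global_size).
    for global_id in range(global_size):
        kernel_body(global_id, *args)

def data_parallel_sum_body(i, a, b, c):
    # Stands in for a kernel body where i = get_global_id(0)
    c[i] = a[i] + b[i]

a = np.arange(10.0)
b = 2.0 * np.arange(10.0)
c = np.zeros(10)
emulate_kernel(data_parallel_sum_body, 10, a, b, c)
```

On a real device the iterations run concurrently, which is why the kernel body may only use the global id to pick its own element.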

Currently no support is provided for local memory in the device and everything is in the
global memory. Barrier and other memory fences will be provided once support for local
@@ -61,7 +61,7 @@
Primitive types are passed by value to the kernel, currently supported are int,
Math Kernels
============

This release has support for math kernels. See numba/dppl/tests/dppl/test_math_functions.py
This release has support for math kernels. See numba_dppy/tests/dppl/test_math_functions.py
for more details.


@@ -72,7 +72,7 @@
Examples
Sum of two 1d arrays
====================

Full example can be found at numba/dppl/examples/sum.py.
Full example can be found at numba_dppy/examples/sum.py.

To write a program that sums two 1d arrays, we first need an OpenCL device environment.
We can get the environment by using *ocldrv.runtime.get_gpu_device()* for getting the
@@ -82,7 +82,7 @@
where *device_env.copy_array_to_device(data)* will read the ndarray and copy that
and *ocldrv.DeviceArray(device_env.get_env_ptr(), data)* will create a buffer in the device
that has the same memory size as the ndarray being passed. The OpenCL Kernel in the
following example is *data_parallel_sum*. To get the id of the work item we are currently
executing we need to use *dppl.get_global_id(0)*; since this example uses only 1 dimension,
executing we need to use *numba_dppy.get_global_id(0)*; since this example uses only 1 dimension,
we only need to get the id in dimension 0.

While invoking the kernel we need to pass the device environment and the global work size.
@@ -91,9 +91,9 @@
back to the host and we can use *device_env.copy_array_from_device(ddata)*.

.. code-block:: python

@dppl.kernel
@numba_dppy.kernel
def data_parallel_sum(a, b, c):
i = dppl.get_global_id(0)
i = numba_dppy.get_global_id(0)
c[i] = a[i] + b[i]

global_size = 10
@@ -126,7 +126,7 @@
ndArray Support

Passing an ndarray directly to kernels is also supported.

Full example can be found at numba/dppl/examples/sum_ndarray.py
Full example can be found at numba_dppy/examples/sum_ndarray.py

To avail of this feature, instead of creating device buffers explicitly as in the previous
example, users can directly pass the ndarray to the kernel. Internally it will result in
@@ -148,7 +148,7 @@
Reduction

This example will demonstrate a sum reduction of 1d array.

Full example can be found at numba/dppl/examples/sum_reduction.py.
Full example can be found at numba_dppy/examples/sum_reduction.py.

In this example to sum the 1d array we invoke the Kernel multiple times.
This can be implemented by invoking the kernel once, but that requires
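The repeated-invocation idea — each pass sums pairs of elements so the active range halves — can be sketched device-free in NumPy (an illustration only, not the numba-dppy API; a power-of-two length is assumed for simplicity):

```python
import numpy as np

def reduction_pass(a, n):
    # One "kernel launch": work item i computes a[i] += a[i + n // 2],
    # leaving the partial sums in the first half of the active range.
    half = n // 2
    a[:half] += a[half:n]
    return half

a = np.arange(16.0)          # power-of-two length for simplicity
expected = float(a.sum())
n = a.size
while n > 1:                 # repeated "kernel invocations"
    n = reduction_pass(a, n)
# a[0] now holds the total sum
```

Each pass maps onto one kernel invocation over ``n // 2`` work items; the loop on the host replaces a device-side barrier between passes.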
@@ -161,15 +161,15 @@
ParFor Support

*Parallel For* is supported in this release for up to 3 dimensions.

Full examples can be found in numba/dppl/examples/pa_examples/
Full examples can be found in numba_dppy/examples/pa_examples/
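Conceptually, a *Parallel For* treats every index tuple as an independent work item. A device-free sketch of the 2-dimensional case (plain NumPy; the helper names are hypothetical, not the numba-dppy API):

```python
import numpy as np

def parfor_2d(body, shape, *args):
    # Every (i, j) index pair is an independent work item, so a runtime
    # is free to run them in parallel; here they run sequentially.
    for i in range(shape[0]):
        for j in range(shape[1]):
            body(i, j, *args)

def scaled_add_body(i, j, a, b, c):
    c[i, j] = a[i, j] + 2.0 * b[i, j]

a = np.ones((4, 3))
b = np.ones((4, 3))
c = np.zeros((4, 3))
parfor_2d(scaled_add_body, c.shape, a, b, c)
```

The independence of iterations is what lets the compiler distribute the loop nest over up to 3 dimensions of work items.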


=======
Testing
=======

All examples can be found in numba/dppl/examples/
All examples can be found in numba_dppy/examples/

All tests can be found in numba/dppl/tests/dppl and can be triggered by the following command:
All tests can be found in numba_dppy/tests/dppl and can be triggered by the following command:

``python -m numba.runtests numba.dppl.tests``
``python -m numba.runtests numba_dppy.tests``
54 changes: 49 additions & 5 deletions numba-dppy/README.md
@@ -1,12 +1,15 @@
# numba-dppy

## Numba + dpCtl + dpNP = numba-dppy
## Numba + dpPy + dpCtl + dpNP = numba-dppy

`numba-dppy` extends Numba with a new backend to support compilation
for Intel CPU and GPU architectures.

For more information about Numba, see the Numba homepage:
http://numba.pydata.org
http://numba.pydata.org

Note: `numba-dppy` requires a patched version of Numba.
See https://github.com/IntelPython/numba.

For more information about dpCtl, see the dpCtl homepage:
https://intelpython.github.io/dpctl/
@@ -16,6 +19,47 @@
https://intelpython.github.io/dpnp/

## Dependencies

* numba
* dpCtl
* dpNP (optional)
* numba >=0.51 (IntelPython/numba)
* dpCtl >=0.3.8
* dpNP >=0.3 (optional)
* llvm-spirv (SPIRV generation from LLVM IR)
* llvmdev (LLVM IR generation)
* spirv-tools

## dpPy

dpPy is a proof-of-concept backend for NUMBA to support compilation for
Intel CPU and GPU architectures.
The present implementation of dpPy is based on OpenCL 2.1, but is likely
to change in the future to rely on Sycl/DPC++ or Intel Level-0 driver API.

## Installation

Use `setup.py` or conda (see the conda-recipe).

## Testing

See folder `numba_dppy/tests`.

Run tests:
```bash
python -m numba.runtests numba_dppy.tests
```

## Examples

See folder `numba_dppy/examples`.

Run examples:
```bash
python numba_dppy/examples/sum.py
```

## How Tos

Refer to the HowTo.rst guide for an overview of the programming semantics,
examples, supported functionalities, and known issues.

## Reporting issues

Please use https://github.com/IntelPython/numba-dppy/issues to report issues and bugs.