0.5.1rc1 #236

Merged
merged 3 commits into from
Dec 22, 2020
1 change: 1 addition & 0 deletions docs/dpCtl.dptensor_api.rst
@@ -6,3 +6,4 @@ dpCtl dptensor Python API

.. automodule:: dpctl.dptensor
:members:
:undoc-members:
20 changes: 18 additions & 2 deletions docs/dpCtl.memory_api.rst
@@ -5,11 +5,27 @@ dpCtl Memory Python API
#######################

.. automodule:: dpctl.memory

Classes
-------

.. autoclass:: dpctl.memory.MemoryUSMDevice
:members:
:inherited-members:
:undoc-members:

....
.. autoclass:: dpctl.memory.MemoryUSMHost
:members:
:inherited-members:
:undoc-members:

.. autoclass:: dpctl.memory.MemoryUSMShared
:members:
:inherited-members:
:undoc-members:

**Comparing dpctl.memory to Rapids Memory Manager (RMM)**
Comparison with Rapids Memory Manager (RMM)
-------------------------------------------

RMM implements DeviceBuffer, which is a Cython native class wrapping something similar to ``std::vector<unsigned char, custom_cuda_allocator (calls resource manager)>``, called device_buffer.

21 changes: 21 additions & 0 deletions docs/dpCtl.program_api.rst
@@ -5,4 +5,25 @@ dpCtl Program Python API
########################

.. automodule:: dpctl.program

Classes
-------

.. autoclass:: dpctl.program.SyclKernel
:members:
:undoc-members:

.. autoclass:: dpctl.program.SyclProgram
:members:
:undoc-members:

Exceptions
----------

.. autoexception:: dpctl.program.SyclProgramCompilationError

Functions
---------

.. autofunction:: create_program_from_source
.. autofunction:: create_program_from_spirv
56 changes: 55 additions & 1 deletion docs/dpCtl_api.rst
@@ -5,4 +5,58 @@ dpCtl Python API
################

.. automodule:: dpctl
:members:

Classes
-------

.. autoclass:: dpctl.SyclContext
:members:
:undoc-members:

.. autoclass:: dpctl.SyclDevice
:members:
:undoc-members:

.. autoclass:: dpctl.SyclEvent
:members:
:undoc-members:

.. autoclass:: dpctl.SyclQueue
:members:
:undoc-members:

Enumerations
------------

.. autoclass:: dpctl.backend_type
:members:

.. autoclass:: dpctl.device_type
:members:

Exceptions
----------

.. autoexception:: dpctl.SyclKernelInvalidRangeError
.. autoexception:: dpctl.SyclKernelSubmitError
.. autoexception:: dpctl.SyclQueueCreationError
.. autoexception:: dpctl.UnsupportedBackendError
.. autoexception:: dpctl.UnsupportedDeviceError

Functions
---------

.. autofunction:: device_context
.. autofunction:: dump
.. autofunction:: get_current_backend
.. autofunction:: get_current_device_type
.. autofunction:: get_current_queue
.. autofunction:: get_include
.. autofunction:: get_num_activated_queues
.. autofunction:: get_num_platforms
.. autofunction:: get_num_queues
.. autofunction:: has_cpu_queues
.. autofunction:: has_gpu_queues
.. autofunction:: has_sycl_platforms
.. autofunction:: is_in_device_context
.. autofunction:: set_default_queue
1 change: 0 additions & 1 deletion docs/index.rst
@@ -22,6 +22,5 @@ Indices and tables
:maxdepth: 3
:caption: Contents:

self
toc_pyapi
api/dpCtl-CAPI_root
2 changes: 1 addition & 1 deletion docs/toc_pyapi.rst
@@ -5,6 +5,6 @@ Python API
:maxdepth: 1

dpctl - SYCL runtime wrapper classes and queue manager <dpCtl_api>
dpctl.memory - USM memory manager <dpCtl.memory_api>
dpctl.dptensor - Data-parallel tensor containers <dpCtl.dptensor_api>
dpctl.memory - USM memory manager <dpCtl.memory_api>
dpctl.program - Program manager <dpCtl.program_api>
36 changes: 21 additions & 15 deletions dpctl-capi/source/dpctl_sycl_queue_manager.cpp
@@ -236,7 +236,8 @@ QMgrHelper::getQueue (DPCTLSyclBackendType BETy,
QRef = new queue(gpuQs[DNum]);
break;
}
case DPCTLSyclBackendType::DPCTL_LEVEL_ZERO | DPCTLSyclDeviceType::DPCTL_GPU:
case DPCTLSyclBackendType::DPCTL_LEVEL_ZERO |
DPCTLSyclDeviceType::DPCTL_GPU:
{
auto l0GpuQs = get_level0_gpu_queues();
if (DNum >= l0GpuQs.size()) {
@@ -316,7 +317,8 @@ QMgrHelper::setAsDefaultQueue (DPCTLSyclBackendType BETy,
activeQ[0] = oclgpu_q[DNum];
break;
}
case DPCTLSyclBackendType::DPCTL_LEVEL_ZERO | DPCTLSyclDeviceType::DPCTL_GPU:
case DPCTLSyclBackendType::DPCTL_LEVEL_ZERO |
DPCTLSyclDeviceType::DPCTL_GPU:
{
auto l0gpu_q = get_level0_gpu_queues();
if (DNum >= l0gpu_q.size()) {
@@ -342,8 +344,8 @@ QMgrHelper::setAsDefaultQueue (DPCTLSyclBackendType BETy,
/*!
* Allocates a new sycl::queue by copying from the cached {cpu|gpu}_queues
* vector. The pointer returned is now owned by the caller and must be properly
* cleaned up. The helper function DPCTLDeleteSyclQueue() can be used for that
* purpose.
* cleaned up. The helper function DPCTLDeleteSyclQueue() can be used for
* that purpose.
*/
__dpctl_give DPCTLSyclQueueRef
QMgrHelper::pushSyclQueue (DPCTLSyclBackendType BETy,
@@ -383,7 +385,8 @@ QMgrHelper::pushSyclQueue (DPCTLSyclBackendType BETy,
QRef = new queue(activeQ[get_active_queues().size()-1]);
break;
}
case DPCTLSyclBackendType::DPCTL_LEVEL_ZERO | DPCTLSyclDeviceType::DPCTL_GPU:
case DPCTLSyclBackendType::DPCTL_LEVEL_ZERO |
DPCTLSyclDeviceType::DPCTL_GPU:
{
if (DNum >= get_level0_gpu_queues().size()) {
// \todo handle error
@@ -447,7 +450,7 @@ size_t DPCTLQueueMgr_GetNumActivatedQueues ()
* type combination.
*/
size_t DPCTLQueueMgr_GetNumQueues (DPCTLSyclBackendType BETy,
DPCTLSyclDeviceType DeviceTy)
DPCTLSyclDeviceType DeviceTy)
{
switch (BETy|DeviceTy)
{
@@ -459,7 +462,8 @@ size_t DPCTLQueueMgr_GetNumQueues (DPCTLSyclBackendType BETy,
{
return QMgrHelper::get_opencl_gpu_queues().size();
}
case DPCTLSyclBackendType::DPCTL_LEVEL_ZERO | DPCTLSyclDeviceType::DPCTL_GPU:
case DPCTLSyclBackendType::DPCTL_LEVEL_ZERO |
DPCTLSyclDeviceType::DPCTL_GPU:
{
return QMgrHelper::get_level0_gpu_queues().size();
}
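The `switch (BETy|DeviceTy)` dispatch used throughout these hunks only works if every backend/device-type pair ORs to a distinct integer, for example when the two enumerations occupy disjoint bit ranges. A minimal Python sketch of that idea follows. The `Backend` and `DeviceType` names, their numeric values, the `num_queues` helper, and the `0` returned for unsupported pairs are all hypothetical and are not taken from the dpctl headers.

```python
from enum import IntEnum


class Backend(IntEnum):
    # Hypothetical values: backends occupy the upper bits.
    OPENCL = 1 << 16
    LEVEL_ZERO = 1 << 17


class DeviceType(IntEnum):
    # Hypothetical values: device types occupy the lower bits.
    CPU = 1 << 0
    GPU = 1 << 1


def num_queues(backend, device_type, queues):
    """Return how many cached queues exist for a backend/device pair."""
    key = backend | device_type  # unique because the bit ranges are disjoint
    if key == Backend.OPENCL | DeviceType.CPU:
        return len(queues["opencl_cpu"])
    if key == Backend.OPENCL | DeviceType.GPU:
        return len(queues["opencl_gpu"])
    if key == Backend.LEVEL_ZERO | DeviceType.GPU:
        return len(queues["level0_gpu"])
    # Assumption for this sketch: unsupported pairs report zero queues.
    return 0


# Stand-in for the cached queue vectors; strings replace real queue objects.
cached = {"opencl_cpu": ["q0"], "opencl_gpu": ["q1", "q2"], "level0_gpu": []}
```

If two pairs ever ORed to the same value, the C++ `switch` would fail to compile because of duplicate case labels, so the disjoint layout also acts as a compile-time check.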
@@ -485,8 +489,8 @@ DPCTLSyclQueueRef DPCTLQueueMgr_GetCurrentQueue ()
* and device number. A runtime_error gets thrown if no such device exists.
*/
DPCTLSyclQueueRef DPCTLQueueMgr_GetQueue (DPCTLSyclBackendType BETy,
DPCTLSyclDeviceType DeviceTy,
size_t DNum)
DPCTLSyclDeviceType DeviceTy,
size_t DNum)
{
return QMgrHelper::getQueue(BETy, DeviceTy, DNum);
}
@@ -506,8 +510,8 @@ bool DPCTLQueueMgr_IsCurrentQueue (__dpctl_keep const DPCTLSyclQueueRef QRef)
*/
__dpctl_give DPCTLSyclQueueRef
DPCTLQueueMgr_SetAsDefaultQueue (DPCTLSyclBackendType BETy,
DPCTLSyclDeviceType DeviceTy,
size_t DNum)
DPCTLSyclDeviceType DeviceTy,
size_t DNum)
{
return QMgrHelper::setAsDefaultQueue(BETy, DeviceTy, DNum);
}
@@ -517,8 +521,8 @@ DPCTLQueueMgr_SetAsDefaultQueue (DPCTLSyclBackendType BETy,
*/
__dpctl_give DPCTLSyclQueueRef
DPCTLQueueMgr_PushQueue (DPCTLSyclBackendType BETy,
DPCTLSyclDeviceType DeviceTy,
size_t DNum)
DPCTLSyclDeviceType DeviceTy,
size_t DNum)
{
return QMgrHelper::pushSyclQueue(BETy, DeviceTy, DNum);
}
@@ -536,8 +540,10 @@ void DPCTLQueueMgr_PopQueue ()
* SYCL device.
*/
DPCTLSyclQueueRef
DPCTLQueueMgr_GetQueueFromContextAndDevice (__dpctl_keep DPCTLSyclContextRef CRef,
__dpctl_keep DPCTLSyclDeviceRef DRef)
DPCTLQueueMgr_GetQueueFromContextAndDevice (
__dpctl_keep DPCTLSyclContextRef CRef,
__dpctl_keep DPCTLSyclDeviceRef DRef
)
{
auto dev = unwrap(DRef);
auto ctx = unwrap(CRef);
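The hunks above show the queue manager keeping a vector of active queues: `setAsDefaultQueue` overwrites slot 0, and `pushSyclQueue` returns a copy of the last element. The following is a toy Python model of that stack behaviour, not the dpctl implementation. The `QueueStack` class and its method names are hypothetical, strings stand in for queue objects, and treating "in a device context" as "more than one entry on the stack" is an assumption.

```python
from contextlib import contextmanager


class QueueStack:
    """Toy model of a stack of active queues (hypothetical, not dpctl)."""

    def __init__(self, default_queue):
        # Slot 0 plays the role of the default queue.
        self._active = [default_queue]

    def set_default(self, queue):
        # Mirrors `activeQ[0] = ...` in setAsDefaultQueue.
        self._active[0] = queue

    def current(self):
        # The most recently pushed queue is the current one.
        return self._active[-1]

    def in_device_context(self):
        # Assumption: any entry beyond the default means a context is active.
        return len(self._active) > 1

    @contextmanager
    def device_context(self, queue):
        # Push on entry, pop on exit, even if the body raises.
        self._active.append(queue)
        try:
            yield queue
        finally:
            self._active.pop()


stack = QueueStack("opencl_cpu_queue")
with stack.device_context("level0_gpu_queue"):
    inside = stack.current()
outside = stack.current()
```

Pairing every push with a pop in a context manager is the Python-side analogue of the C API requirement that a queue reference handed to the caller be released with `DPCTLDeleteSyclQueue()`.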
6 changes: 2 additions & 4 deletions dpctl/__init__.py
@@ -30,10 +30,8 @@

* A SYCL queue manager exposed directly inside the top-level `dpctl`
module.
* A USM memory manager (`dpctl.memory`) that provides Python objects
implementing the Python buffer protocol using USM shared and USM host
allocators. The memory manager also exposes various utility functions
to wrap SYCL's USM allocators, deallocators, `memcpy` functions, *etc.*
* Python wrapper classes for the main SYCL runtime classes mentioned in
Section 4.6 of SYCL provisional 2020 spec (https://bit.ly/3asQx07).
"""
__author__ = "Intel Corp."
