Description
Implementing an EMM Plugin like in CUDA:
An EMM Plugin is implemented by deriving from BaseCUDAMemoryManager. A summary of considerations for the implementation follows:
Numba instantiates one instance of the EMM Plugin class per context. The context that owns an EMM Plugin object is accessible through self.context, if required.
The EMM Plugin is transparent to any code that uses Numba: all its methods are invoked by Numba, and never need to be called by code that uses Numba.
The allocation methods memalloc, memhostalloc, and mempin should use the underlying library to allocate and/or pin device or host memory, and construct an instance of a memory pointer representing the memory to return to Numba. These methods are always called when the current CUDA context is the context that owns the EMM Plugin instance.
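The shape of an allocation method can be sketched roughly as follows. To keep the sketch runnable without a GPU, FakeLibrary and FakePointer are hypothetical stand-ins for the underlying allocator library and for the memory pointer object handed to Numba; a real plugin would derive from BaseCUDAMemoryManager instead. The finalizer callback is an assumption of this sketch about how the memory is later released.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class FakeLibrary:
    """Hypothetical stand-in for the underlying allocator library."""

    def __init__(self):
        self.live = {}            # address -> size of each live allocation
        self._next_addr = 0x1000  # made-up starting address

    def alloc(self, size):
        addr = self._next_addr
        self._next_addr += size
        self.live[addr] = size
        return addr

    def free(self, addr):
        del self.live[addr]


@dataclass
class FakePointer:
    """Hypothetical stand-in for the memory pointer returned to Numba."""
    context: object
    address: int
    size: int
    finalizer: Optional[Callable[[], None]] = None


class SketchPlugin:
    """Not a real EMM Plugin: it only mirrors the shape of memalloc."""

    def __init__(self, context, library):
        self.context = context
        self._lib = library

    def memalloc(self, size):
        # 1. Ask the underlying library for the memory.
        addr = self._lib.alloc(size)

        # 2. Build a finalizer that gives the memory back to the library.
        def finalizer():
            self._lib.free(addr)

        # 3. Wrap the result in a pointer object and return it.
        return FakePointer(self.context, addr, size, finalizer)
```

In this sketch the pointer carries its own finalizer, so the caller never needs to know which library produced the memory.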
The initialize method is called by Numba prior to the first use of the EMM Plugin object for a context. This method should do anything required to prepare the underlying library for allocations in the current context. This method may be called multiple times, and must not invalidate previous state when it is called.
The reset method is called when all allocations in the context are to be cleaned up. It may be called even prior to initialize, and an EMM Plugin implementation needs to guard against this.
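The two lifecycle rules above (initialize may run repeatedly, reset may run before initialize) could be handled with a simple flag. This is a minimal sketch in plain Python rather than a working plugin; the _initialized flag, the _allocations record, and the setup_calls counter are hypothetical names introduced only for illustration.

```python
class LifecycleSketch:
    """Not a real EMM Plugin: it only mirrors initialize() and reset()."""

    def __init__(self):
        self._initialized = False
        self._allocations = {}  # hypothetical record of live allocations
        self.setup_calls = 0    # counts one-time setup, for illustration

    def initialize(self):
        # Repeated calls must leave existing state untouched.
        if self._initialized:
            return
        self.setup_calls += 1   # stand-in for preparing the library
        self._initialized = True

    def reset(self):
        # May run before initialize(); then there is nothing to clean up.
        if not self._initialized:
            return
        self._allocations.clear()
```

Calling reset first is a no-op here, and a second initialize call neither repeats the setup nor discards recorded allocations.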
To support inter-GPU communication, the get_ipc_handle method should provide an IpcHandle for a given MemoryPointer instance. This method is part of the EMM interface (rather than being handled within Numba) because the base address of the allocation is only known by the underlying library. Closing an IPC handle is handled internally within Numba.
Providing memory info from the get_memory_info method is optional. The method provides a count of the total and free memory on the device for the context. It is preferable to implement the method, but this may not be practical for all allocators. If memory info is not provided, this method should raise a RuntimeError.
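Both outcomes of get_memory_info can be sketched side by side. The MemoryInfo tuple below is a local stand-in with assumed field names, and the two classes are hypothetical; neither derives from the real base class, so the sketch runs without a GPU.

```python
from collections import namedtuple

# Local stand-in for the memory info result; field names are an assumption.
MemoryInfo = namedtuple("MemoryInfo", ["free", "total"])


class ReportingSketch:
    """An allocator that can track its own usage."""

    def __init__(self, total):
        self._total = total
        self._used = 0

    def get_memory_info(self):
        return MemoryInfo(free=self._total - self._used, total=self._total)


class NonReportingSketch:
    """An allocator for which usage figures are not practical to obtain."""

    def get_memory_info(self):
        raise RuntimeError("memory info is not available for this allocator")
```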
The defer_cleanup method should return a context manager that ensures that expensive cleanup operations are avoided whilst it is active. The nuances of this will vary between plugins, so the plugin documentation should include an explanation of how deferring cleanup affects deallocations, and performance in general.
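One possible way to defer cleanup is to queue deallocations while the context manager is active and flush them when the outermost one exits. This is a sketch under that assumption, not how any particular plugin behaves; deallocate, _free, and the freed list are hypothetical names used only to make the effect observable.

```python
from contextlib import contextmanager


class DeferSketch:
    """Not a real EMM Plugin: it only mirrors defer_cleanup()."""

    def __init__(self):
        self._defer_depth = 0  # how many defer_cleanup blocks are active
        self._pending = []     # deallocations queued while deferred
        self.freed = []        # records what was actually released

    def _free(self, addr):
        self.freed.append(addr)  # stand-in for an expensive release

    def deallocate(self, addr):
        if self._defer_depth > 0:
            self._pending.append(addr)
        else:
            self._free(addr)

    @contextmanager
    def defer_cleanup(self):
        self._defer_depth += 1
        try:
            yield
        finally:
            self._defer_depth -= 1
            # Flush only when the outermost block exits.
            if self._defer_depth == 0:
                pending, self._pending = self._pending, []
                for addr in pending:
                    self._free(addr)
```

The depth counter lets nested uses behave sensibly: nothing is released until the outermost block ends, after which deallocations take effect immediately again.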
The interface_version property is used to ensure that the plugin version matches the interface provided by the version of Numba. At present, this should always be 1.