This repository has been archived by the owner on Mar 12, 2021. It is now read-only.
467: implement `CuIterator` for batching arrays to the GPU r=jrevels a=jrevels

I'm hitting what I believe is a common use case for CuArrays. My workload is essentially a map-reduce of the form:

```julia
some_reduction(f(batch::Tuple{Vararg{Array}})::Number for batch in batches)
```

Assuming `f` plays nicely with CuArrays, one way to get this running on a GPU is:

```julia
cubatches = (map(x -> adapt(CuArray, x), batch) for batch in batches)
some_reduction(f(cubatch) for cubatch in cubatches)
```

Unfortunately, this approach is a poor one when `batches` doesn't fit entirely in GPU memory. As the caller, I can assert that previous iterations' batches don't need to be kept around, so ideally I'd have a mechanism that simply reuses their memory instead of allocating more.

This PR implements such a mechanism: a `CuIterator` that maintains a memory pool and exploits the assumption that previous iterations' memory can be reused. This is a rough proof-of-concept sketch intended to express what I'm aiming for; it's probably flawed in ways I can't yet see. AFAICT, I'd like to do the following before merging:

- [x] instead of the current shape/eltype-matching approach, just allocate a device buffer (~`CUDAdrv.UnifiedBuffer`? I have no idea what I'm doing 😅 EDIT: oh, is this how CUDAdrv exposes UVM, or is that separate?~), growing the buffer as larger batches are encountered
- [x] ~leverage `CUDAdrv.prefetch` to asynchronously move values to GPU memory so as to not artificially block the iterator's consumer~ EDIT: I'm not sure this is necessary anymore, assuming the `copyto!` call we're using is asynchronous?
- [x] make sure to free the pool once iteration is finished
- [x] docs
- [x] tests

In a future PR, we could add a feature for `CuIterator` to utilize UVM when the caller's environment supports it.

cc @vchuravy (thanks for discussing this with me earlier!)
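The idea above can be sketched as a small wrapper iterator. This is a hedged, minimal sketch (not necessarily the code merged in this PR): it assumes CuArrays' `adapt`/`CuArray` conversion and an `unsafe_free!` function for eagerly releasing device memory, and it frees the previous iteration's device arrays before uploading the next batch rather than managing a resizable buffer.

```julia
using Adapt: adapt

# Wraps a CPU-side iterator of array batches; each `iterate` call uploads
# the next batch to the GPU and frees the previous iteration's device memory,
# under the assumption that the consumer no longer needs it.
mutable struct CuIterator{B}
    batches::B
    previous::Any  # device arrays from the last iteration, if any
    CuIterator(batches) = new{typeof(batches)}(batches)
end

function Base.iterate(c::CuIterator, state...)
    item = iterate(c.batches, state...)
    # Eagerly release the previous batch's device memory back to the pool.
    isdefined(c, :previous) && foreach(unsafe_free!, c.previous)
    item === nothing && return nothing
    batch, next_state = item
    cubatch = map(x -> adapt(CuArray, x), batch)
    c.previous = cubatch
    return cubatch, next_state
end
```

With this, the original workload becomes `some_reduction(f(cubatch) for cubatch in CuIterator(batches))`, and at most one batch's worth of device memory is live per iteration.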
Co-authored-by: Jarrett Revels <jarrettrevels@gmail.com>