
Add caching allocator for pinned (page-locked) memory #618

Merged
soumith merged 2 commits into torch:master from colesbury:cached_pinned_memory
Dec 1, 2016
@colesbury
Contributor

Adds a caching allocator for CUDA pinned (page-locked) memory. This avoids the synchronization caused by cudaFreeHost (or cudaHostUnregister) calls.

To ensure read-after-write and write-after-read consistency, a CUDA event is recorded after every cudaMemcpyAsync between host and device involving pinned memory created by this allocator. Memory allocations are only re-used after they're freed and all associated CUDA events have completed.
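The reuse rule above can be sketched with a small CPU-only model. This is not the code from this PR: `FakeEvent`, `Block`, `PinnedCache`, and `record_copy` are hypothetical names, and a boolean `done` flag stands in for querying a real CUDA event. It only illustrates that a block must be both freed and have no pending events before it is handed out again.

```python
from dataclasses import dataclass, field


@dataclass
class FakeEvent:
    """Stand-in for a CUDA event; `done` models a completed event query."""
    done: bool = False


@dataclass
class Block:
    size: int
    allocated: bool = True
    events: list = field(default_factory=list)  # events recorded after copies

    def reusable(self) -> bool:
        # Both conditions are required: the caller freed the block AND
        # every event recorded against it has completed.
        return not self.allocated and all(e.done for e in self.events)


class PinnedCache:
    def __init__(self):
        self.blocks = []

    def malloc(self, size: int) -> Block:
        for block in self.blocks:
            if block.reusable() and block.size >= size:
                block.allocated = True
                block.events.clear()
                return block
        block = Block(size)  # models a fresh pinned allocation (assumption)
        self.blocks.append(block)
        return block

    def free(self, block: Block) -> None:
        block.allocated = False  # kept in the cache, not released

    def record_copy(self, block: Block) -> FakeEvent:
        """Model recording an event right after an async copy using `block`."""
        event = FakeEvent()
        block.events.append(event)
        return event


cache = PinnedCache()
a = cache.malloc(1024)
copy_event = cache.record_copy(a)  # async copy still in flight
cache.free(a)
b = cache.malloc(1024)             # cannot reuse `a` yet, so a new block
copy_event.done = True             # the copy finishes
c = cache.malloc(1024)             # `a` is now safe to hand out again
```

In this model `b` is a different block from `a`, while `c` is the same object as `a`, which mirrors the read-after-write and write-after-read guarantee described above.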

Unlike the caching device allocator, allocations are never split. This means that requests for small allocations may be filled by much larger cached buffers. I think this should be OK in practice.
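To make the cost of never splitting concrete, here is a minimal sketch of a free list that always hands out whole buffers. `NoSplitFreeList` is a hypothetical name, and picking the smallest cached buffer that is large enough is an assumption for illustration; the description does not state the lookup policy.

```python
import bisect


class NoSplitFreeList:
    """Cached (free) pinned buffers, tracked by size; never split."""

    def __init__(self):
        self._sizes = []  # sorted sizes of cached buffers

    def release(self, size: int) -> None:
        bisect.insort(self._sizes, size)

    def acquire(self, size: int) -> int:
        """Return the size of the buffer that serves a request for `size` bytes."""
        i = bisect.bisect_left(self._sizes, size)
        if i == len(self._sizes):
            return size  # cache miss: model a fresh buffer of exactly `size`
        return self._sizes.pop(i)  # the whole buffer is handed out


free_list = NoSplitFreeList()
free_list.release(64 * 1024 * 1024)  # one large cached buffer
served = free_list.acquire(4096)     # small request takes the 64 MiB buffer
unused = served - 4096               # bytes pinned but not needed by the caller
```

Here the 4 KiB request consumes the entire 64 MiB buffer, which is the trade-off accepted above in exchange for a simpler allocator.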

Also, CUDA events are processed in the order in which they're recorded, even though events may occur out-of-order between devices or streams. This does not affect correctness, but means that cached allocations may not be considered "ready" for re-use until a little later. In practice, I don't think this should matter.
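The in-order processing can be illustrated with a queue that is only drained from the front. This is a sketch under assumptions: `FakeEvent` and `process_events` are hypothetical, and the `done` flag stands in for a real event query.

```python
from collections import deque


class FakeEvent:
    def __init__(self, name):
        self.name = name
        self.done = False


def process_events(queue: deque) -> list:
    """Pop completed events from the front; stop at the first pending one."""
    ready = []
    while queue and queue[0].done:
        ready.append(queue.popleft().name)
    return ready


queue = deque([FakeEvent("stream0-copy"), FakeEvent("stream1-copy")])
queue[1].done = True                 # the later-recorded event finishes first
first_pass = process_events(queue)   # nothing drained: front event is pending
queue[0].done = True
second_pass = process_events(queue)  # both drained, in recording order
```

The buffer tied to `stream1-copy` was safe to reuse after the first pass but is only recognized as ready in the second, which is the small delay described above.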

To enable both the caching pinned memory allocator and the caching device allocator, set the environment variable THC_CACHING_ALLOCATOR=1.

Adds a CUDA "sleep" kernel which spins for the given number of
iterations. This is useful for testing correct synchronization with
streams.

Adds a caching allocator for CUDA pinned (page-locked) memory. This
avoids synchronization due to cudaFreeHost or cudaHostUnregister at the
expense of potentially higher host memory usage.

Correctness is preserved by recording CUDA events after each
cudaMemcpyAsync involving the pinned memory. A pinned memory
allocation is not reused until all events associated with it have
completed.
@soumith soumith merged commit 0267dae into torch:master Dec 1, 2016
@colesbury colesbury deleted the cached_pinned_memory branch December 2, 2016 03:06