Dev remove cuda tex refs v2 #460

Closed
griwodz wants to merge 15 commits

griwodz (Member) commented Aug 2, 2018

Description

CUDA texture references have been removed and replaced with texture objects. Instead of binding a texture to memory before each kernel call and unbinding it afterwards, textures are now created when the memory is allocated and destroyed when the memory is released.

The maintenance of textured CUDA memory is still global, but collected in a single object named "global_data". Untextured memory has not been touched.

Changes affect only depthMap/cuda. This PR is a prerequisite for understanding the side effects of slow kernels such as refine_compUpdateYKNCCSimMapPatch_kernel.

Features list

Implementation remarks

The changes remove global CUDA texture references, which are a limited resource and must be bound to and unbound from CUDA memory on demand. CUDA texture objects, by contrast, are not limited in number in the same way. Using texture objects has some advantages:

  • slightly faster because bind/unbind is removed
  • texture objects are passed as parameters to CUDA kernels, which makes dependencies explicitly visible throughout the call chain
  • a piece of memory can be allocated with a texture object, keeping it throughout the memory's lifetime

CUDA memory is frequently allocated like a local variable and deallocated at function exit. For memory with textures, this wasteful behavior is now avoided: memory is drawn from a pool in global_data and returned to the pool when it is no longer needed. For low-end GPUs, a compile flag or an explicit release may be desirable to keep the old semantics.
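
To illustrate the pooling idea, here is a minimal host-only sketch; it is not the code in this PR. All names (`TexturedBuffer`, `TexturedBufferPool`, `acquire`, `release`, `clear`) are hypothetical, a `std::vector` stands in for device memory, and a plain integer stands in for the `cudaTextureObject_t` handle. It assumes buffers are pooled by their dimensions, which may differ from how global_data actually organizes them.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CUDA allocation that owns its texture object.
// Real code would allocate device memory and create the texture object in the
// constructor, and destroy the texture object and free the memory in the
// destructor. Here host memory and an integer handle are used instead.
struct TexturedBuffer {
    TexturedBuffer(std::size_t w, std::size_t h, unsigned long long tex)
        : width(w), height(h), storage(w * h), texture(tex) {}

    std::size_t width;
    std::size_t height;
    std::vector<float> storage;   // assumption: stands in for device memory
    unsigned long long texture;   // assumption: stands in for cudaTextureObject_t
};

// Hypothetical pool: buffers are handed out and taken back instead of being
// allocated and freed in every function that needs textured memory.
class TexturedBufferPool {
public:
    using Key = std::pair<std::size_t, std::size_t>;

    // Reuse an idle buffer of the same size if one exists, else allocate.
    std::unique_ptr<TexturedBuffer> acquire(std::size_t w, std::size_t h) {
        auto& bucket = free_[Key{w, h}];
        if (!bucket.empty()) {
            std::unique_ptr<TexturedBuffer> buf = std::move(bucket.back());
            bucket.pop_back();
            return buf;
        }
        ++allocations_;
        return std::make_unique<TexturedBuffer>(w, h, next_handle_++);
    }

    // Return a buffer to the pool; its texture handle stays alive for reuse.
    void release(std::unique_ptr<TexturedBuffer> buf) {
        const Key key{buf->width, buf->height};
        free_[key].push_back(std::move(buf));
    }

    // Explicit release of all idle buffers, e.g. on GPUs with little memory.
    void clear() { free_.clear(); }

    // Number of real allocations performed so far.
    std::size_t allocations() const { return allocations_; }

    // Number of buffers currently sitting idle in the pool.
    std::size_t idle() const {
        std::size_t n = 0;
        for (const auto& kv : free_) n += kv.second.size();
        return n;
    }

private:
    std::map<Key, std::vector<std::unique_ptr<TexturedBuffer>>> free_;
    std::size_t allocations_ = 0;
    unsigned long long next_handle_ = 1;
};
```

In this sketch, a function that needs a temporary textured buffer would call `acquire` on entry and `release` on exit, so a second request for the same size reuses both the memory and its texture handle. The `clear` method corresponds to the explicit release mentioned above for low-end GPUs.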

A lot of dead code was removed in the process.

griwodz (Member, Author) commented Aug 2, 2018

PR 420 is still unresolved, but it has too many commits to rebase onto develop in acceptable time. This PR compresses the work done so far. Problems with cards of CC 3.0 and below are still not understood.

griwodz added the review label Aug 2, 2018
griwodz self-assigned this Aug 2, 2018
griwodz force-pushed the dev_removeCudaTexRefs_v2 branch from 3a4e3eb to ef8f724, August 3, 2018 07:05
griwodz force-pushed the dev_removeCudaTexRefs_v2 branch from ec1d667 to 386dbba, August 9, 2018 09:11

2 participants