Compile CUDA directly to cubin instead of ptx #587

kris-rowe · 2022-05-20T22:14:30Z

Description

Since OCCA handles JIT compilation and hashes based on the device architecture, we can compile CUDA code directly to cubin instead of PTX.
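
As a rough illustration of the difference, the sketch below builds an `nvcc` command line for either output format. It is not OCCA's actual implementation: `buildNvccCommand` and `CudaBinaryFormat` are hypothetical names introduced here, and the only real pieces are the `nvcc` options `-ptx`, `-cubin`, `-arch`, `-x cu`, and `-o`. The point is that a cubin is tied to one concrete `sm_XY` target, which is presumably acceptable here because the cached binary is already keyed on the device architecture.

```cpp
#include <string>

// Hypothetical types and helper for illustration only; not part of OCCA.
enum class CudaBinaryFormat { Ptx, Cubin };

// Builds an nvcc command line that compiles `source` into `output`
// for the device architecture sm_<archMajor><archMinor>.
std::string buildNvccCommand(const std::string &source,
                             const std::string &output,
                             int archMajor,
                             int archMinor,
                             CudaBinaryFormat format) {
  // A cubin holds machine code for exactly one architecture,
  // so the target must match the device the kernel will run on.
  const std::string arch =
      "sm_" + std::to_string(archMajor) + std::to_string(archMinor);

  // -ptx stops at the intermediate representation, which the driver
  // compiles again when the module is loaded; -cubin emits the final
  // device binary.
  const std::string formatFlag =
      (format == CudaBinaryFormat::Cubin) ? "-cubin" : "-ptx";

  return "nvcc " + formatFlag + " -arch=" + arch + " -x cu " + source +
         " -o " + output;
}
```

For a compute capability 7.0 device, calling the helper with `CudaBinaryFormat::Cubin` yields `nvcc -cubin -arch=sm_70 -x cu kernel.cu -o kernel.cubin`, while `CudaBinaryFormat::Ptx` swaps in `-ptx`.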

codecov · 2022-05-20T22:25:31Z

Codecov Report

Merging #587 (e9d4f68) into development (70aa7df) will increase coverage by 0.00%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##           development     #587   +/-   ##
============================================
  Coverage        77.31%   77.32%           
============================================
  Files              264      264           
  Lines            19539    19539           
============================================
+ Hits             15107    15108    +1     
+ Misses            4432     4431    -1

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/occa/internal/lang/specialMacros.cpp | `60.26% <0.00%> (+0.66%)` | ⬆️ |

Compile directly to cubin instead of ptx since OCCA handles jitting.

e9d4f68

kris-rowe marked this pull request as ready for review May 20, 2022 22:14

kris-rowe merged commit c8a6b7d into libocca:development May 23, 2022

SFrijters mentioned this pull request Jun 9, 2022

"Value of threads per SM for entry <x> is out of range" ptxas errors since v1.3 #595

Closed

kris-rowe deleted the cuda-jit-to-cubin branch July 12, 2022 14:42
