Closed
Description
The current algorithm for finding kernels that call __devicelib_assert_fail (see the function getKernelNamesUsingAssert(const Module &M)) is very inefficient. The current implementation walks all functions in the module, from each kernel down to the device functions it calls, even if __devicelib_assert_fail is not declared.
A bottom-up approach is far more efficient: start from the __devicelib_assert_fail function declaration and traverse its users, gathering the kernels reached along the way.
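A minimal sketch of the bottom-up idea, written against a toy call graph rather than the real LLVM data structures. The `ToyModule` type, its `callers` and `kernels` fields, and the `kernelsUsingAssert` helper are all hypothetical names introduced for illustration; they are not part of the existing pass.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a module (not an LLVM type).
// callers[f] lists the functions that call f, i.e. the "users" of f.
// A function with no entry in `callers` is treated as not declared.
struct ToyModule {
    std::map<std::string, std::vector<std::string>> callers;
    std::set<std::string> kernels; // names of functions that are kernels
};

// Bottom-up search: start at the assert function and walk callers upward,
// collecting every kernel reached along the way.
std::set<std::string> kernelsUsingAssert(
    const ToyModule &M,
    const std::string &assertFn = "__devicelib_assert_fail") {
    std::set<std::string> result;

    // Early exit: if the assert function is not declared, nothing uses it.
    if (M.callers.find(assertFn) == M.callers.end())
        return result;

    std::set<std::string> visited{assertFn};   // guards against call cycles
    std::vector<std::string> worklist{assertFn};

    while (!worklist.empty()) {
        std::string fn = worklist.back();
        worklist.pop_back();

        if (M.kernels.count(fn))
            result.insert(fn);

        auto it = M.callers.find(fn);
        if (it == M.callers.end())
            continue; // fn has no recorded callers

        for (const std::string &caller : it->second)
            if (visited.insert(caller).second) // true only on first visit
                worklist.push_back(caller);
    }
    return result;
}
```

With this shape, the work done is proportional to the part of the call graph that can actually reach __devicelib_assert_fail, and it is zero when the function is not declared. In an actual LLVM pass, the same walk would presumably start from `M.getFunction("__devicelib_assert_fail")` and follow `users()` of each function instead of a precomputed map; how kernels are recognized is left to whatever check the existing code already uses.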