maleadt (Member) commented Jun 3, 2021

No description provided.

maleadt (Member, Author) commented Jun 4, 2021

Huh, looks like the compute-sanitizer from CUDA 11.3 doesn't get through our tests...

@maleadt maleadt force-pushed the master branch 27 times, most recently from a1af3c1 to 46f6109 Compare July 29, 2021 13:14
@maleadt maleadt force-pushed the tb/sanitize branch 4 times, most recently from f1e96a2 to 2777fca Compare September 8, 2021 06:50
@maleadt maleadt force-pushed the tb/sanitize branch 3 times, most recently from e7c6c9c to 56849d5 Compare September 13, 2021 09:35
codecov bot commented Sep 13, 2021

Codecov Report

Merging #950 (df08dd5) into master (2c40cb4) will increase coverage by 1.31%.
The diff coverage is 33.33%.

@@            Coverage Diff             @@
##           master     #950      +/-   ##
==========================================
+ Coverage   79.20%   80.51%   +1.31%     
==========================================
  Files         118      118              
  Lines        8002     8017      +15     
==========================================
+ Hits         6338     6455     +117     
+ Misses       1664     1562     -102     
Impacted Files               Coverage Δ
lib/cudnn/util.jl            42.85% <0.00%> (+1.94%) ⬆️
src/compiler/execution.jl    83.97% <0.00%> (-0.64%) ⬇️
lib/cudadrv/execution.jl     96.55% <50.00%> (-3.45%) ⬇️
lib/cudadrv/module/jit.jl    68.96% <0.00%> (-1.21%) ⬇️
lib/cudadrv/context.jl       73.10% <0.00%> (-0.40%) ⬇️
src/state.jl                 77.90% <0.00%> (-0.07%) ⬇️
lib/curand/random.jl         93.25% <0.00%> (+0.07%) ⬆️
src/pool.jl                  75.73% <0.00%> (+0.10%) ⬆️
lib/cudadrv/error.jl         83.72% <0.00%> (+0.38%) ⬆️
lib/utils/cache.jl           88.88% <0.00%> (+0.42%) ⬆️
... and 35 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2c40cb4...df08dd5.

@maleadt maleadt force-pushed the tb/sanitize branch 2 times, most recently from aee14d7 to 56b4951 Compare September 13, 2021 12:56
@maleadt maleadt changed the title from "Remove compute-sanitizer workaround." to "CI fixes" Sep 14, 2021
@maleadt maleadt force-pushed the tb/sanitize branch 3 times, most recently from 91a5e1d to f120607 Compare September 15, 2021 09:05
maleadt (Member, Author) commented Sep 16, 2021

Interestingly, the Julia 1.7 CI failures only occur on older devices, and seem to have been introduced by 1.7-rc1 (they don't occur on 1.7-beta4). They aren't limited to the changes in this PR though; they also occur on CUDA.jl#master. MWE: permutedims(CuArray(rand(Float32, 1, 1, 1)), (2, 1, 3))

maleadt (Member, Author) commented Sep 16, 2021

Turns out JuliaLang/julia#42119 triggers a miscompilation on older GPUs... 😭

maleadt (Member, Author) commented Sep 17, 2021

Another case of divergent control flow due to unreachable instructions causing ptxas to emit bad code, filed with NVIDIA as #3382020. I've reverted the upstream change (yay for contextual dispatch), because we badly need to get CI working again.

For those interested, a C++-based MWE is available at https://gist.github.com/maleadt/dc59ee944952593e8d9967ad2c3543da

@maleadt maleadt merged commit 3497077 into master Sep 17, 2021
@maleadt maleadt deleted the tb/sanitize branch September 17, 2021 09:10
Labels
ci Everything related to continuous integration.