Commit 043e483

Ellipse0934 authored and committed
Add inbounds
1 parent 635b06c commit 043e483

1 file changed, +4 -3 lines changed

src/accumulate.jl

@@ -7,8 +7,9 @@
 #
 #
 # TODOs:
-# - multiple elements per thread (performance)
-# - custom launch config (performance)
+# - move to GPUArrays once syncwarp is available
+# - group all allocations and deallocations in recursive
+# case
 
 # Scan entire warp using shfl intrinsics, unrolled for warpsize() = 32
 @inline function scan_warp(op, val, lane)
@@ -167,7 +168,7 @@ function aggregate_partial_scan!(op::Function, output, aggregates,
     j = (blockIdx().z-1) * gridDim().y + blockIdx().y
 
 
-    if j <= length(Rother) && i <= length(Rdim)
+    @inbounds if j <= length(Rother) && i <= length(Rdim)
         I = Rother[j]
         Ipre = Rpre[I[1]]
         Ipost = Rpost[I[2]]
