regent: Emit a faster atomic instruction for doubles when sm >= 60
magnatelee committed Dec 6, 2018
1 parent ce40a0f commit 9dde639
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions language/src/regent/cudahelper.t
@@ -312,6 +312,9 @@ local function generate_atomic(op, typ)
   if op == "+" and typ == float then
     return terralib.intrinsic("llvm.nvvm.atomic.load.add.f32.p0f32",
                               {&float,float} -> {float})
+  elseif op == "+" and typ == double and get_cuda_version() >= 60 then
+    return terralib.intrinsic("llvm.nvvm.atomic.load.add.f64.p0f64",
+                              {&double,double} -> {double})
   end
 
   local cas_type
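
For context: the new branch applies only to "+" on double when get_cuda_version() reports at least 60. Every other case still falls through to the code that begins at `local cas_type`, which this hunk does not show. The sketch below assumes that fall-through path is a compare-and-swap loop over the value's bit pattern, the usual way to emulate a floating-point atomic add without a native instruction. It is host-side C++ rather than Terra, so it can run without a GPU, and `atomic_add_double_cas` is a hypothetical name, not something from cudahelper.t.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the commit): atomically add `value` to the
// double whose bit pattern is stored in `cell`, using only a 64-bit
// compare-and-swap. Returns the value held before the add, which matches the
// {&double,double} -> {double} shape of the intrinsic in the diff.
double atomic_add_double_cas(std::atomic<std::uint64_t>& cell, double value) {
  std::uint64_t old_bits = cell.load();
  for (;;) {
    // Reinterpret the observed bits as a double and compute the new sum.
    double old_value;
    std::memcpy(&old_value, &old_bits, sizeof old_value);
    double new_value = old_value + value;

    std::uint64_t new_bits;
    std::memcpy(&new_bits, &new_value, sizeof new_bits);

    // Publish only if nobody changed the cell in the meantime. On failure,
    // compare_exchange_weak refreshes old_bits and the loop retries.
    if (cell.compare_exchange_weak(old_bits, new_bits)) {
      return old_value;
    }
  }
}
```

A loop like this can retry many times under contention, whereas a native atomic add completes in one instruction. That is presumably the speedup the commit title refers to.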
