Commit 3fd0eaa

[libclc][amdgpu] Implement native_exp2 via AMD builtin (#133696)
This change came up during a discussion on #129679 and has been split out as a preparatory commit.

An example of the AMDGPU codegen is:

    define <2 x float> @_Z10native_expDv2_f(<2 x float> %val) {
      %mul = fmul afn <2 x float> %val, splat (float 0x3FF7154760000000)
      %0 = extractelement <2 x float> %mul, i64 0
      %1 = tail call float @llvm.amdgcn.exp2.f32(float %0)
      %vecinit.i = insertelement <2 x float> poison, float %1, i64 0
      %2 = extractelement <2 x float> %mul, i64 1
      %3 = tail call float @llvm.amdgcn.exp2.f32(float %2)
      %vecinit2.i = insertelement <2 x float> %vecinit.i, float %3, i64 1
      ret <2 x float> %vecinit2.i
    }

    define <2 x float> @_Z11native_exp2Dv2_f(<2 x float> %x) {
      %0 = extractelement <2 x float> %x, i64 0
      %1 = tail call float @llvm.amdgcn.exp2.f32(float %0)
      %vecinit = insertelement <2 x float> poison, float %1, i64 0
      %2 = extractelement <2 x float> %x, i64 1
      %3 = tail call float @llvm.amdgcn.exp2.f32(float %2)
      %vecinit2 = insertelement <2 x float> %vecinit, float %3, i64 1
      ret <2 x float> %vecinit2
    }
1 parent ea06f7f commit 3fd0eaa

2 files changed, +17 -0 lines changed

libclc/amdgpu/lib/SOURCES

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 math/native_exp.cl
+math/native_exp2.cl
 math/native_log.cl
 math/native_log10.cl
 math/half_exp.cl

libclc/amdgpu/lib/math/native_exp2.cl

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include <clc/clc.h>
+#include <clc/clcmacro.h>
+
+_CLC_OVERLOAD _CLC_DEF float native_exp2(float val) {
+  return __builtin_amdgcn_exp2f(val);
+}
+
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, float, native_exp2, float)
