Commit 5c367b8

[libclc] Optimize generic CLC fmin/fmax
The CLC fmin/fmax builtins now use clang's __builtin_elementwise_(min|max), which lets us generate llvm.(min|max)num intrinsics directly. These intrinsics select the non-NAN input over the NAN input, which adheres to the OpenCL specification. Note that the OpenCL specification doesn't require support for sNAN, so returning qNAN over sNAN is acceptable. Note also that the intrinsics don't differentiate between -0.0 and +0.0; this does not appear to be required, going by the OpenCL CTS at least.

These intrinsics maintain the vector types, whereas the previous implementation scalarized them. This commit therefore helps to optimize codegen for targets that use the generic implementation.
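
The NaN rule described in the message can be sketched in plain C. This is a minimal illustration rather than libclc code: `ref_fmax`, `ref_fmin`, and `check_nan_rule` are hypothetical names, and the logic mirrors the scalar `half` fallback that this commit removes, rewritten for `float` so that it compiles without OpenCL.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical reference helper (not part of libclc): when exactly one
 * operand is NaN, return the other operand. */
float ref_fmax(float x, float y) {
  if (isnan(x))
    return y;
  if (isnan(y))
    return x;
  return (x < y) ? y : x;
}

/* Same rule for the minimum. */
float ref_fmin(float x, float y) {
  if (isnan(x))
    return y;
  if (isnan(y))
    return x;
  return (y < x) ? y : x;
}

/* Spot-checks the behaviour the commit message describes. */
void check_nan_rule(void) {
  assert(ref_fmax(NAN, 1.0f) == 1.0f);
  assert(ref_fmax(1.0f, NAN) == 1.0f);
  assert(ref_fmin(NAN, 1.0f) == 1.0f);
  assert(ref_fmin(1.0f, NAN) == 1.0f);
  /* Both operands NaN: the result is NaN. */
  assert(isnan(ref_fmax(NAN, NAN)));
  /* -0.0f == 0.0f compares equal, so this holds whichever zero is
   * returned; the sign of the result is deliberately not checked. */
  assert(ref_fmax(-0.0f, 0.0f) == 0.0f);
}
```

Calling `check_nan_rule()` from a test driver exercises each case. The sign of a zero result is left unconstrained here, matching the note that the intrinsics need not distinguish -0.0 from +0.0.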
1 parent 4609b6a commit 5c367b8

2 files changed: +8 -50 lines changed


libclc/clc/lib/generic/math/clc_fmax.cl

Lines changed: 4 additions & 25 deletions
@@ -6,31 +6,10 @@
 //
 //===----------------------------------------------------------------------===//
 
-#include <clc/clcmacro.h>
 #include <clc/internal/clc.h>
-#include <clc/relational/clc_isnan.h>
 
-_CLC_DEFINE_BINARY_BUILTIN(float, __clc_fmax, __builtin_fmaxf, float, float);
+#define FUNCTION __clc_fmax
+#define __CLC_FUNCTION(x) __builtin_elementwise_max
+#define __CLC_BODY <clc/shared/binary_def.inc>
 
-#ifdef cl_khr_fp64
-
-#pragma OPENCL EXTENSION cl_khr_fp64 : enable
-
-_CLC_DEFINE_BINARY_BUILTIN(double, __clc_fmax, __builtin_fmax, double, double);
-
-#endif
-
-#ifdef cl_khr_fp16
-
-#pragma OPENCL EXTENSION cl_khr_fp16 : enable
-
-_CLC_DEF _CLC_OVERLOAD half __clc_fmax(half x, half y) {
-  if (__clc_isnan(x))
-    return y;
-  if (__clc_isnan(y))
-    return x;
-  return (x < y) ? y : x;
-}
-_CLC_BINARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, half, __clc_fmax, half, half)
-
-#endif
+#include <clc/math/gentype.inc>
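
To illustrate why the new definition keeps vector types intact, here is a hedged sketch in plain C using the GCC/Clang `vector_size` extension. The names `float4`, `vec_fmax`, and `check_vec_fmax` are assumptions made for this example and do not appear in libclc. On compilers that provide `__builtin_elementwise_max`, the whole vector is passed to the builtin in one call; otherwise the sketch falls back to a per-lane loop, which is roughly the scalarized shape of the previous code.

```c
#include <assert.h>
#include <math.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* compilers without __has_builtin take the fallback */
#endif

/* Hypothetical 4-lane float vector (GCC/Clang vector extension). */
typedef float float4 __attribute__((vector_size(16)));

float4 vec_fmax(float4 a, float4 b) {
#if __has_builtin(__builtin_elementwise_max)
  /* One call on the whole vector; no per-lane code is written here. */
  return __builtin_elementwise_max(a, b);
#else
  /* Fallback: explicit per-lane loop with the same NaN rule. */
  float4 r;
  for (int i = 0; i < 4; ++i) {
    float x = a[i], y = b[i];
    r[i] = isnan(x) ? y : (isnan(y) ? x : ((x < y) ? y : x));
  }
  return r;
#endif
}

/* Spot-check: ordinary lanes and lanes where one input is a quiet NaN. */
void check_vec_fmax(void) {
  float4 a = {1.0f, NAN, 3.0f, -4.0f};
  float4 b = {2.0f, 5.0f, NAN, -8.0f};
  float4 r = vec_fmax(a, b);
  assert(r[0] == 2.0f);
  assert(r[1] == 5.0f);
  assert(r[2] == 3.0f);
  assert(r[3] == -4.0f);
}
```

Both paths are expected to agree on these inputs. Lanes that mix -0.0 and +0.0 are left out on purpose, since the sign of the result is not guaranteed there.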

libclc/clc/lib/generic/math/clc_fmin.cl

Lines changed: 4 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -6,31 +6,10 @@
66
//
77
//===----------------------------------------------------------------------===//
88

9-
#include <clc/clcmacro.h>
109
#include <clc/internal/clc.h>
11-
#include <clc/relational/clc_isnan.h>
1210

13-
_CLC_DEFINE_BINARY_BUILTIN(float, __clc_fmin, __builtin_fminf, float, float);
11+
#define FUNCTION __clc_fmin
12+
#define __CLC_FUNCTION(x) __builtin_elementwise_min
13+
#define __CLC_BODY <clc/shared/binary_def.inc>
1414

15-
#ifdef cl_khr_fp64
16-
17-
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
18-
19-
_CLC_DEFINE_BINARY_BUILTIN(double, __clc_fmin, __builtin_fmin, double, double);
20-
21-
#endif
22-
23-
#ifdef cl_khr_fp16
24-
25-
#pragma OPENCL EXTENSION cl_khr_fp16 : enable
26-
27-
_CLC_DEF _CLC_OVERLOAD half __clc_fmin(half x, half y) {
28-
if (__clc_isnan(x))
29-
return y;
30-
if (__clc_isnan(y))
31-
return x;
32-
return (y < x) ? y : x;
33-
}
34-
_CLC_BINARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, half, __clc_fmin, half, half)
35-
36-
#endif
15+
#include <clc/math/gentype.inc>
