Description
The device libraries include this pattern to check if all workitem IDs are 0.
// RUN: clang -target amdgcn-amd-amdhsa -S -O3 -mcpu=gfx900 -nogpulib < %s
bool
choose_one_workgroup_workitem(void)
{
  return (__builtin_amdgcn_workitem_id_x() | __builtin_amdgcn_workitem_id_y() | __builtin_amdgcn_workitem_id_z()) == 0;
}
This is equivalent to checking x == 0 && y == 0 && z == 0. The current codegen for it is:
v_and_b32_e32 v0, 0x3ff, v31
v_bfe_u32 v1, v31, 20, 10
v_bfe_u32 v2, v31, 10, 10
v_or3_b32 v0, v0, v2, v1
v_cmp_eq_u32_e32 vcc, 0, v0
v_cndmask_b32_e64 v0, 0, 1, vcc
s_setpc_b64 s[30:31]
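In the function ABI, the workitem IDs are packed into v31; going by the extracts above, x is in bits 0-9, y in bits 10-19, and z in bits 20-29. Assuming the remaining bits 30-31 are always zero, we should be able to just check v31 == 0, so this would shrink to: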
v_cmp_eq_u32_e32 vcc, 0, v31
v_cndmask_b32_e64 v0, 0, 1, vcc
s_setpc_b64 s[30:31]
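
To illustrate why the fold should be safe, here is a minimal host-side C model of the two checks. It is only a sketch: the function names (pack_ids, all_zero_unpacked, all_zero_packed, checks_agree) are hypothetical, the bit layout is read off the v_and_b32/v_bfe_u32 operands above, and it assumes the upper two bits of the packed register are zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, inferred from the extracts in the current codegen:
   x in bits 0-9, y in bits 10-19, z in bits 20-29, bits 30-31 zero. */
static uint32_t pack_ids(uint32_t x, uint32_t y, uint32_t z) {
    assert(x < 1024 && y < 1024 && z < 1024);
    return x | (y << 10) | (z << 20);
}

/* Models the current codegen: extract three fields, OR them, compare. */
static bool all_zero_unpacked(uint32_t packed) {
    uint32_t x = packed & 0x3ff;          /* v_and_b32 v0, 0x3ff, v31   */
    uint32_t y = (packed >> 10) & 0x3ff;  /* v_bfe_u32 v2, v31, 10, 10  */
    uint32_t z = (packed >> 20) & 0x3ff;  /* v_bfe_u32 v1, v31, 20, 10  */
    return (x | y | z) == 0;              /* v_or3_b32 + v_cmp_eq_u32   */
}

/* Models the proposed codegen: compare the whole register against 0. */
static bool all_zero_packed(uint32_t packed) {
    return packed == 0;                   /* v_cmp_eq_u32 vcc, 0, v31   */
}

/* Checks that both forms agree on a handful of ID combinations. */
static bool checks_agree(void) {
    static const uint32_t samples[] = {0, 1, 2, 511, 1023};
    const int n = (int)(sizeof samples / sizeof samples[0]);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            for (int k = 0; k < n; ++k) {
                uint32_t p = pack_ids(samples[i], samples[j], samples[k]);
                if (all_zero_unpacked(p) != all_zero_packed(p))
                    return false;
            }
    return true;
}
```

The two forms only diverge when bits 30-31 are set: for a value such as 1u << 30, all_zero_unpacked returns true while all_zero_packed returns false. The fold is therefore only valid if those bits are known to be zero.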