Unit tests and benchmark for subgroup2 and workgroup2 stuff #192
Open: keptsecret wants to merge 76 commits into master from new_wg_scan_test
Commits
8090a2d  initial benchmark example copy (keptsecret)
3a2ff14  test subgroup2 funcs correct (keptsecret)
dd021a0  fix test (keptsecret)
ca21941  benchmarking shader + pipeline working (keptsecret)
0bb41db  begin adding fake frames for nsight profiler (keptsecret)
24a93bb  merge master, fix conflicts (keptsecret)
17dda8e  re-numbered example to avoid duplicate (keptsecret)
3d4e0f2  fake frames for nsight (keptsecret)
0192999  use correct shader, spirv line dbinfo for nsight (keptsecret)
8c9d55e  support for 1 item per invoc (keptsecret)
07d6980  handle when items per invoc =1 (keptsecret)
be756d5  minor fixes (keptsecret)
1963b51  changes in Param, Config usage (keptsecret)
99cf5d8  coalesced load/store data (keptsecret)
1d5e433  Merge branch 'master' into scan_perf_bench (keptsecret)
a3bb526  fixed some bugs (keptsecret)
355c605  disable test by default (keptsecret)
6b57674  refactor to load data as vectors, consecutive uints (keptsecret)
7da1bec  initial wg scan test (keptsecret)
750b3d2  working? test for workgroup2 reduce (keptsecret)
f11b3df  fixes to test (keptsecret)
9f690ee  tests with multiple items per invoc (keptsecret)
755f89a  inclusive scan test (keptsecret)
b8415ad  exclusive scan test, remove comments (keptsecret)
474281d  benchmark shader, new common header (keptsecret)
7d06332  test smaller workgroup sizes (keptsecret)
874557c  expanded scratch proxy funcs (keptsecret)
28ea75f  simplify scratch,proxy to just scalar types (keptsecret)
e8c2831  move all tests into new example (keptsecret)
93b4d0b  Merge branch 'master' into new_wg_scan_test (keptsecret)
2ba2b82  workgroup scan benchmark, renamed examples (keptsecret)
d567e71  removed obsolete files (keptsecret)
54acf2a  replaced old ex 23 unit test with new tests (keptsecret)
030d622  minor fixes (keptsecret)
ca71a39  minor fixes to workgroup benchmark (keptsecret)
6018e9a  more minor fixes (keptsecret)
3a9758c  some fixes to using config vars (keptsecret)
e496e98  fixes to test mem errors (keptsecret)
20011f5  config struct changes (keptsecret)
4a951b3  more test case coverage (keptsecret)
a42a742  Merge branch 'master' into new_wg_scan_test (keptsecret)
908abd1  refactor name changes (keptsecret)
81238ad  minor refactor (keptsecret)
749658f  manage workgroup in example (keptsecret)
1de31dd  moved benchmark to ex 29 (keptsecret)
e828dc4  fit accessors to concept (keptsecret)
086c21e  use bda in unit test (keptsecret)
f4af3ed  benchmarks use bda (keptsecret)
a394f22  use data accessor with preload data in reg (keptsecret)
44c34a8  use store with data type because it works now (keptsecret)
0ccd26f  save reduction returns to storage (keptsecret)
2a991a9  combined headers between subgroup, workgroup stuff, restored spirv ca… (keptsecret)
e4735a4  simplified test,benchmark function template params (keptsecret)
13ae89f  revert test to default params (keptsecret)
a8774db  use preloaded data in benchmark (keptsecret)
bb3a901  Merge branch 'master' into new_wg_scan_test (keptsecret)
2a85f4e  refactor config member name (keptsecret)
99f6dfe  fit new accessor concepts (keptsecret)
3d89894  fix template accessors (keptsecret)
3d63ed7  add accessor index template type (keptsecret)
1100876  limit workgroup count (keptsecret)
f202ef5  utility func to get items per wg (keptsecret)
93b7810  added check for vk spec requirement (keptsecret)
3a3aaa9  removed maxComputeWorkgroupSubgroups*subgroupsize check (keptsecret)
6581ed4  Merge branch 'master' into new_wg_scan_test (keptsecret)
90ba926  various minor adjustments to unit tests (keptsecret)
19d7fe0  simplified data accessors (keptsecret)
fdace31  tests for native and emulated subgroup op (keptsecret)
d6680f2  removed redundant stuff (keptsecret)
bafad3e  bind swapchain image directly, explicit surface format swapchain (keptsecret)
32dc78f  shared data accessor header between test and bench, same shader adjus… (keptsecret)
2aef6d3  generate benchmark inputs with xoroshiro (keptsecret)
149a237  only have to benchmark plus op (keptsecret)
00ed9be  benchmark all reduce/scan in one run (lots of shaders) (keptsecret)
a5a21fd  minor changes to passing subgroup size and items per wg (keptsecret)
1710b69  push constant stores array of output addresses directly because stati… (keptsecret)
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
#include "common.hlsl" | ||
|
||
using namespace nbl; | ||
using namespace hlsl; | ||
|
||
// https://github.com/microsoft/DirectXShaderCompiler/issues/6144 | ||
uint32_t3 nbl::hlsl::glsl::gl_WorkGroupSize() {return uint32_t3(WORKGROUP_SIZE,1,1);} | ||
|
||
#ifndef ITEMS_PER_INVOCATION | ||
#error "Define ITEMS_PER_INVOCATION!" | ||
#endif | ||
|
||
[[vk::push_constant]] PushConstantData pc; | ||
|
||
struct device_capabilities | ||
{ | ||
#ifdef TEST_NATIVE | ||
NBL_CONSTEXPR_STATIC_INLINE bool shaderSubgroupArithmetic = true; | ||
#else | ||
NBL_CONSTEXPR_STATIC_INLINE bool shaderSubgroupArithmetic = false; | ||
#endif | ||
}; | ||
|
||
#ifndef OPERATION | ||
#error "Define OPERATION!" | ||
#endif | ||
|
||
#ifndef SUBGROUP_SIZE_LOG2 | ||
#error "Define SUBGROUP_SIZE_LOG2!" | ||
#endif |
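The `device_capabilities` struct above flips `shaderSubgroupArithmetic` on `TEST_NATIVE`, so the same test can be compiled against a native or an emulated subgroup path. As a rough host-side illustration of that compile-time switch, here is a minimal C++ sketch. Everything in it is an assumption made for illustration: the names `NativeCaps`, `EmulatedCaps`, `Lanes`, `kSubgroupSize` and `inclusiveAdd` are hypothetical, and the Hillis-Steele loop is just one common way to emulate a scan with shuffle-like reads, not necessarily what the library does.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for the two device_capabilities variants (TEST_NATIVE on/off).
struct NativeCaps   { static constexpr bool shaderSubgroupArithmetic = true;  };
struct EmulatedCaps { static constexpr bool shaderSubgroupArithmetic = false; };

constexpr std::size_t kSubgroupSize = 8; // assumed lane count, for illustration only
using Lanes = std::array<std::uint32_t, kSubgroupSize>;

// Stand-in for the native path: a single serial pass over the lanes.
inline Lanes inclusiveAddImpl(const Lanes& in, std::true_type)
{
    Lanes out = in;
    for (std::size_t lane = 1; lane < kSubgroupSize; ++lane)
        out[lane] += out[lane - 1];
    return out;
}

// Stand-in for the emulated path: Hillis-Steele steps, where every lane
// reads the value held `stride` lanes below it (similar to a shuffle-up).
inline Lanes inclusiveAddImpl(const Lanes& in, std::false_type)
{
    Lanes cur = in;
    for (std::size_t stride = 1; stride < kSubgroupSize; stride <<= 1)
    {
        const Lanes prev = cur; // snapshot so all lanes read pre-step values
        for (std::size_t lane = stride; lane < kSubgroupSize; ++lane)
            cur[lane] = prev[lane] + prev[lane - stride];
    }
    return cur;
}

// The capability flag picks the implementation at compile time.
template<class Caps>
Lanes inclusiveAdd(const Lanes& in)
{
    return inclusiveAddImpl(in, std::integral_constant<bool, Caps::shaderSubgroupArithmetic>{});
}
```

Calling `inclusiveAdd<NativeCaps>(x)` and `inclusiveAdd<EmulatedCaps>(x)` on the same input should give identical results, which is the property the native/emulated test variants are checking on the GPU.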
23_Arithmetic2UnitTest/app_resources/testSubgroup.comp.hlsl (55 changes: 55 additions & 0 deletions)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
#pragma shader_stage(compute) | ||
|
||
#define operation_t nbl::hlsl::OPERATION | ||
|
||
#include "nbl/builtin/hlsl/glsl_compat/core.hlsl" | ||
#include "nbl/builtin/hlsl/glsl_compat/subgroup_basic.hlsl" | ||
#include "nbl/builtin/hlsl/subgroup2/arithmetic_portability.hlsl" | ||
|
||
#include "shaderCommon.hlsl" | ||
#include "nbl/builtin/hlsl/workgroup/basic.hlsl" | ||
|
||
typedef vector<uint32_t, ITEMS_PER_INVOCATION> type_t; | ||
|
||
uint32_t globalIndex() | ||
{ | ||
return glsl::gl_WorkGroupID().x*WORKGROUP_SIZE+workgroup::SubgroupContiguousIndex(); | ||
} | ||
|
||
template<class Binop, uint32_t N> | ||
static void subtest(NBL_CONST_REF_ARG(type_t) sourceVal) | ||
{ | ||
using config_t = subgroup2::Configuration<SUBGROUP_SIZE_LOG2>; | ||
using params_t = subgroup2::ArithmeticParams<config_t, typename Binop::base_t, N, device_capabilities>; | ||
|
||
const uint64_t outputBufAddr = pc.pOutputBuf[Binop::BindingIndex]; | ||
|
||
if (glsl::gl_SubgroupSize()!=1u<<SUBGROUP_SIZE_LOG2) | ||
vk::RawBufferStore<uint32_t>(outputBufAddr, glsl::gl_SubgroupSize()); | ||
|
||
operation_t<params_t> func; | ||
type_t val = func(sourceVal); | ||
|
||
vk::RawBufferStore<type_t>(outputBufAddr + sizeof(uint32_t) + sizeof(type_t) * globalIndex(), val, sizeof(uint32_t)); | ||
} | ||
|
||
type_t test() | ||
{ | ||
const uint32_t idx = globalIndex(); | ||
type_t sourceVal = vk::RawBufferLoad<type_t>(pc.pInputBuf + idx * sizeof(type_t)); | ||
|
||
subtest<arithmetic::bit_and<uint32_t>, ITEMS_PER_INVOCATION>(sourceVal); | ||
subtest<arithmetic::bit_xor<uint32_t>, ITEMS_PER_INVOCATION>(sourceVal); | ||
subtest<arithmetic::bit_or<uint32_t>, ITEMS_PER_INVOCATION>(sourceVal); | ||
subtest<arithmetic::plus<uint32_t>, ITEMS_PER_INVOCATION>(sourceVal); | ||
subtest<arithmetic::multiplies<uint32_t>, ITEMS_PER_INVOCATION>(sourceVal); | ||
subtest<arithmetic::minimum<uint32_t>, ITEMS_PER_INVOCATION>(sourceVal); | ||
subtest<arithmetic::maximum<uint32_t>, ITEMS_PER_INVOCATION>(sourceVal); | ||
return sourceVal; | ||
} | ||
|
||
[numthreads(WORKGROUP_SIZE,1,1)] | ||
void main() | ||
{ | ||
test(); | ||
} |
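To check the buffers written by `subtest()` on the host, a CPU reference is needed. The C++ sketch below is one possible shape for it, not the PR's actual verification code. `outputByteOffset` mirrors the address arithmetic of the `vk::RawBufferStore` call above (one leading `uint32_t` slot, then one `type_t` per invocation). The `chunked*` helpers are hypothetical names and assume that one subgroup covers a contiguous run of `chunk` items, for example subgroup size times `ITEMS_PER_INVOCATION`; that layout is an assumption on my part.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using u32 = std::uint32_t;

// Byte offset of one invocation's result: a leading uint32_t slot, then
// itemsPerInvocation uint32_t values per invocation, indexed by globalIndex.
inline std::size_t outputByteOffset(std::size_t globalIndex, std::size_t itemsPerInvocation)
{
    return sizeof(u32) + sizeof(u32) * itemsPerInvocation * globalIndex;
}

// Inclusive scan restarted at every chunk boundary.
template<class Op>
std::vector<u32> chunkedInclusiveScan(const std::vector<u32>& in, std::size_t chunk, Op op)
{
    std::vector<u32> out(in.size());
    for (std::size_t base = 0; base < in.size(); base += chunk)
    {
        const std::size_t end = std::min(base + chunk, in.size());
        u32 acc = in[base];
        out[base] = acc;
        for (std::size_t i = base + 1; i < end; ++i)
        {
            acc = op(acc, in[i]);
            out[i] = acc;
        }
    }
    return out;
}

// Exclusive scan restarted at every chunk boundary; needs the operator's identity.
template<class Op>
std::vector<u32> chunkedExclusiveScan(const std::vector<u32>& in, std::size_t chunk, Op op, u32 identity)
{
    std::vector<u32> out(in.size());
    for (std::size_t base = 0; base < in.size(); base += chunk)
    {
        const std::size_t end = std::min(base + chunk, in.size());
        u32 acc = identity;
        for (std::size_t i = base; i < end; ++i)
        {
            out[i] = acc;          // value before including element i
            acc = op(acc, in[i]);
        }
    }
    return out;
}

// Reduction per chunk, with the result broadcast to every element of the chunk.
template<class Op>
std::vector<u32> chunkedReduction(const std::vector<u32>& in, std::size_t chunk, Op op)
{
    std::vector<u32> out(in.size());
    for (std::size_t base = 0; base < in.size(); base += chunk)
    {
        const std::size_t end = std::min(base + chunk, in.size());
        u32 acc = in[base];
        for (std::size_t i = base + 1; i < end; ++i)
            acc = op(acc, in[i]);
        std::fill(out.begin() + base, out.begin() + end, acc);
    }
    return out;
}
```

Passing the operator as a callable keeps one reference per operation kind usable for all seven binops the shader exercises; only the exclusive scan needs the identity element (0 for plus, all ones for bitwise and, and so on).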
23_Arithmetic2UnitTest/app_resources/testWorkgroup.comp.hlsl (76 changes: 76 additions & 0 deletions)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
#pragma shader_stage(compute) | ||
|
||
#include "nbl/builtin/hlsl/glsl_compat/core.hlsl" | ||
#include "nbl/builtin/hlsl/glsl_compat/subgroup_basic.hlsl" | ||
#include "nbl/builtin/hlsl/subgroup2/arithmetic_portability.hlsl" | ||
#include "nbl/builtin/hlsl/workgroup2/arithmetic.hlsl" | ||
|
||
static const uint32_t WORKGROUP_SIZE = 1u << WORKGROUP_SIZE_LOG2; | ||
|
||
#include "shaderCommon.hlsl" | ||
|
||
using config_t = workgroup2::ArithmeticConfiguration<WORKGROUP_SIZE_LOG2, SUBGROUP_SIZE_LOG2, ITEMS_PER_INVOCATION>; | ||
|
||
typedef vector<uint32_t, config_t::ItemsPerInvocation_0> type_t; | ||
|
||
// final (level 1/2) scan needs to fit in one subgroup exactly | ||
groupshared uint32_t scratch[mpl::max_v<int16_t,config_t::SharedScratchElementCount,1>]; | ||
|
||
#include "../../common/include/WorkgroupDataAccessors.hlsl" | ||
|
||
static ScratchProxy arithmeticAccessor; | ||
|
||
template<class Binop, class device_capabilities> | ||
struct operation_t | ||
{ | ||
using binop_base_t = typename Binop::base_t; | ||
using otype_t = typename Binop::type_t; | ||
|
||
// workgroup reduction returns the value of the reduction | ||
// workgroup scans do not return anything, but use the data accessor to do the storing directly | ||
void operator()() | ||
{ | ||
PreloadedDataProxy<config_t,Binop> dataAccessor = PreloadedDataProxy<config_t,Binop>::create(); | ||
dataAccessor.preload(); | ||
#if IS_REDUCTION | ||
otype_t value = | ||
#endif | ||
OPERATION<config_t,binop_base_t,device_capabilities>::template __call<PreloadedDataProxy<config_t,Binop>, ScratchProxy>(dataAccessor,arithmeticAccessor); | ||
// we barrier before because we alias the accessors for Binop | ||
arithmeticAccessor.workgroupExecutionAndMemoryBarrier(); | ||
#if IS_REDUCTION | ||
[unroll] | ||
for (uint32_t i = 0; i < PreloadedDataProxy<config_t,Binop>::PreloadedDataCount; i++) | ||
dataAccessor.preloaded[i] = value; | ||
(Review thread on the lines above: devshgraphicsprogramming marked this conversation as resolved.)
|
||
#endif | ||
dataAccessor.unload(); | ||
} | ||
}; | ||
|
||
|
||
template<class Binop> | ||
static void subtest() | ||
{ | ||
if (glsl::gl_SubgroupSize()!=1u<<SUBGROUP_SIZE_LOG2) | ||
vk::RawBufferStore<uint32_t>(pc.pOutputBuf[Binop::BindingIndex], glsl::gl_SubgroupSize()); | ||
Review comment on lines +54 to +55: just assert, have you seen our
|
||
|
||
operation_t<Binop,device_capabilities> func; | ||
func(); | ||
} | ||
|
||
void test() | ||
{ | ||
subtest<arithmetic::bit_and<uint32_t> >(); | ||
subtest<arithmetic::bit_xor<uint32_t> >(); | ||
subtest<arithmetic::bit_or<uint32_t> >(); | ||
subtest<arithmetic::plus<uint32_t> >(); | ||
subtest<arithmetic::multiplies<uint32_t> >(); | ||
subtest<arithmetic::minimum<uint32_t> >(); | ||
subtest<arithmetic::maximum<uint32_t> >(); | ||
} | ||
|
||
[numthreads(WORKGROUP_SIZE,1,1)] | ||
void main() | ||
{ | ||
test(); | ||
} |
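The scratch comment in the shader above ("final (level 1/2) scan needs to fit in one subgroup exactly") points at a hierarchical scan: per-subgroup scans first, then a scan over the per-subgroup totals held in shared scratch. The C++ sketch below shows that general two-level idea on the CPU for a prefix sum. It is an assumption-laden illustration, not the `workgroup2` implementation: `twoLevelInclusiveSum`, `flatInclusiveSum` and the `subgroupItems` parameter are hypothetical names, and the real code may use more levels and a different data layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using u32 = std::uint32_t;

// Plain single-pass inclusive prefix sum, used as ground truth.
inline std::vector<u32> flatInclusiveSum(const std::vector<u32>& in)
{
    std::vector<u32> out(in.size());
    u32 acc = 0;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        acc += in[i];
        out[i] = acc;
    }
    return out;
}

// Two-level inclusive prefix sum over one workgroup's items.
// subgroupItems: how many consecutive items one subgroup covers (must be > 0).
inline std::vector<u32> twoLevelInclusiveSum(const std::vector<u32>& in, std::size_t subgroupItems)
{
    std::vector<u32> out(in.size());
    std::vector<u32> scratch; // one total per subgroup, standing in for groupshared scratch

    // Level 0: each subgroup scans its own items and publishes its total.
    for (std::size_t base = 0; base < in.size(); base += subgroupItems)
    {
        u32 acc = 0;
        for (std::size_t i = base; i < base + subgroupItems && i < in.size(); ++i)
        {
            acc += in[i];
            out[i] = acc;
        }
        scratch.push_back(acc);
    }

    // Level 1: exclusive scan of the totals gives each subgroup its starting offset.
    u32 running = 0;
    for (std::size_t s = 0; s < scratch.size(); ++s)
    {
        const u32 total = scratch[s];
        scratch[s] = running;
        running += total;
    }

    // Combine: every item adds the offset of the subgroup it belongs to.
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] += scratch[i / subgroupItems];

    return out;
}
```

Comparing `twoLevelInclusiveSum` against `flatInclusiveSum` for several `subgroupItems` values is a cheap way to convince yourself that the combine step is right before debugging the same structure on the GPU.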
File renamed without changes.
again