Refactor norm_linear into struct functor #425
Open
+509
−372
Description of changes:
Refactor the norm_linear kernel into a struct-functor style:
- NormLinearKernelSpec: Maintains compile-time configuration and constants.
- ProcessAtomFunctor: Core computation logic for processing a single OUTPUT_ATOM_SIZE output tile.
- NormLinearHandler: Top-level control flow and memory management. It can be split into more fine-grained functions later.

The norm_linear kernel is now run in the following style:
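The snippet from the PR itself is not reproduced in this text. As an illustration only, the three-struct split described above might look roughly like the host-side C++ sketch below. Everything beyond the three struct names and OUTPUT_ATOM_SIZE is an assumption made for the example: the REDUCTION_SIZE constant, the `run` and `operator()` signatures, the row-major weight layout, the `example_norm_linear` helper, and the reading of norm_linear as an RMS normalization (without a learned scale) followed by a matrix-vector product. It is not the PR's device code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Compile-time configuration and constants (parameter names are hypothetical).
template <int OutputAtomSize, int ReductionSize>
struct NormLinearKernelSpec {
  static_assert(OutputAtomSize > 0 && ReductionSize > 0, "sizes must be positive");
  static constexpr int OUTPUT_ATOM_SIZE = OutputAtomSize;
  static constexpr int REDUCTION_SIZE = ReductionSize;
};

// Core computation for one tile of OUTPUT_ATOM_SIZE outputs.
// Assumed layout: weight_rows holds OUTPUT_ATOM_SIZE rows of REDUCTION_SIZE floats.
template <typename Spec>
struct ProcessAtomFunctor {
  void operator()(const float* normed, const float* weight_rows, float* out) const {
    for (int j = 0; j < Spec::OUTPUT_ATOM_SIZE; ++j) {
      float acc = 0.0f;
      for (int k = 0; k < Spec::REDUCTION_SIZE; ++k) {
        acc += normed[k] * weight_rows[j * Spec::REDUCTION_SIZE + k];
      }
      out[j] = acc;
    }
  }
};

// Top-level control flow and memory management.
template <typename Spec>
struct NormLinearHandler {
  static void run(const float* x, const float* weight, float* out,
                  int output_size, float eps) {
    assert(output_size % Spec::OUTPUT_ATOM_SIZE == 0);

    // Assumed normalization: divide x by its root mean square.
    float sum_sq = 0.0f;
    for (int k = 0; k < Spec::REDUCTION_SIZE; ++k) {
      sum_sq += x[k] * x[k];
    }
    const float inv_rms = 1.0f / std::sqrt(sum_sq / Spec::REDUCTION_SIZE + eps);

    std::vector<float> normed(Spec::REDUCTION_SIZE);
    for (int k = 0; k < Spec::REDUCTION_SIZE; ++k) {
      normed[k] = x[k] * inv_rms;
    }

    // Walk the output one atom at a time and delegate to the functor.
    ProcessAtomFunctor<Spec> process_atom;
    for (int base = 0; base < output_size; base += Spec::OUTPUT_ATOM_SIZE) {
      process_atom(normed.data(), weight + base * Spec::REDUCTION_SIZE, out + base);
    }
  }
};

// Hypothetical usage: 2 inputs, 4 outputs, processed as two atoms of size 2.
inline std::vector<float> example_norm_linear() {
  using Spec = NormLinearKernelSpec<2, 2>;
  const std::vector<float> x = {3.0f, 4.0f};
  const std::vector<float> weight = {1.0f, 0.0f,   // row 0
                                     0.0f, 1.0f,   // row 1
                                     1.0f, 1.0f,   // row 2
                                     2.0f, 0.0f};  // row 3
  std::vector<float> out(4, 0.0f);
  NormLinearHandler<Spec>::run(x.data(), weight.data(), out.data(), 4, 0.0f);
  return out;
}
```

The point of the split is that the spec carries no runtime state, the functor sees only one tile, and the handler owns the loop and the buffers, so each piece can be changed or specialized without touching the others.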
The refactored code uses exactly the same number of registers as the old one (123), and no performance difference has been observed:
ptxas info : Used 123 registers, used 1 barriers, 392 bytes cmem[0]
A sketch of what we could do in the future:
Related Issues:
Linked Issues:
Issues closed by this PR: