Commit 5afcdf0

q10 authored and facebook-github-bot committed
Clean up WeightRow in preparation for optimizer state offloading
Summary:
X-link: facebookresearch/FBGEMM#1109

- Clean up `WeightRow` implementation in preparation for optimizer state offloading
- Add documentation for the class

Differential Revision: D73473546
1 parent c54cead commit 5afcdf0

6 files changed, +203 -145 lines changed

fbgemm_gpu/codegen/training/forward/embedding_forward_split_meta_template.cpp

+1 -1
@@ -33,7 +33,7 @@
 using namespace fbgemm_gpu;
 using Tensor = at::Tensor;
 
-[[maybe_unused]] static constexpr float kINT8QparamsBytes = 8;
+[[maybe_unused]] static constexpr int32_t kINT8QparamsBytes = 8;
 
 ////////////////////////////////////////////////////////////////////////////////
 // Kernel Definitions

fbgemm_gpu/include/fbgemm_gpu/utils/cuda_prelude.cuh

+1 -1
@@ -85,7 +85,7 @@ static constexpr float kQParamEps = 1e-8f;
 will be stored at the end of each row in FP32 formats, appending a total of
 8 bytes to each row.
 */
-static constexpr float kINT8QparamsBytes = 8;
+static constexpr int32_t kINT8QparamsBytes = 8;
 
 template <typename T>
 DEVICE_INLINE T shfl_xor(
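
Note on the type change: both hunks above switch `kINT8QparamsBytes` from `float` to `int32_t`. The sketch below is one possible illustration of why an integral type suits a byte count. Only the constant comes from the diff; the helper `int8_row_bytes` is hypothetical and not part of FBGEMM. It assumes an INT8 row stores one byte per element followed by the 8 trailing bytes described in the comment.

```cpp
#include <cstdint>
#include <type_traits>

// Constant as it appears after this commit: 8 trailing bytes per INT8 row.
static constexpr int32_t kINT8QparamsBytes = 8;

// Hypothetical helper (an assumption for illustration, not FBGEMM API):
// total width in bytes of an INT8 row with `dim` one-byte elements.
constexpr int32_t int8_row_bytes(int32_t dim) {
  return dim + kINT8QparamsBytes;
}

// With the integral constant, size arithmetic stays in integers.
static_assert(
    std::is_integral<decltype(int32_t{128} + kINT8QparamsBytes)>::value,
    "int32_t + int32_t yields an integral type");

// With the previous float constant, the same sum was promoted to float,
// so using it as an offset or size needed a conversion back to an integer.
static_assert(
    std::is_floating_point<decltype(int32_t{128} + 8.0f)>::value,
    "int32_t + float yields a floating-point type");

static_assert(int8_row_bytes(128) == 136, "128 data bytes + 8 qparam bytes");
```

With the integral constant, expressions such as `dim + kINT8QparamsBytes` can be used directly as sizes or pointer offsets without an implicit float-to-integer conversion.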
