[NFC] improve doc: fix typo in mma doc (NVIDIA#1417)
ThomsonTan authored Mar 27, 2024
1 parent c4e3e12 commit 8f7d278
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion media/docs/cute/0t_mma_atom.md
@@ -433,7 +433,7 @@ where we see 16 copies of the 64x8 tile.

### A and B Layout Mapping

-GMMA atoms that consume A and B sources directly from shared memory are a bit interesting. The GMMA Descriptor is constructed on an entore tile of A and/or B data in shared memory rather than being partitioned by threads. That is, every thread sees the entire tile of data and the tile is not reordered so that the descriptor can be constructed on it. In `ALayout` form, this can be expressed
+GMMA atoms that consume A and B sources directly from shared memory are a bit interesting. The GMMA Descriptor is constructed on an entire tile of A and/or B data in shared memory rather than being partitioned by threads. That is, every thread sees the entire tile of data and the tile is not reordered so that the descriptor can be constructed on it. In `ALayout` form, this can be expressed

```cpp
// (T128,V64x8) -> (M64,K16)
