Release notes from CUTLASS

v2.5.0

2021-03-03T19:17:40Z

Create PUBLICATIONS.md (NVIDIA#189)

2020-11-23T12:59:45Z

CUTLASS 2.4 documentation-only update

2020-09-25T18:25:26Z

CUTLASS 2.3.0

2020-06-15T17:47:01Z

Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. (…

2020-04-08T17:54:36Z

update tools/library/CMakeLists to require python 3.6 according to NVIDIA#7…

2019-11-22T17:39:12Z

Need Python 3.6 to use enum.auto() (NVIDIA#70)
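
For context on the note above: enum.auto() was added to the Python standard library in version 3.6, so any script that uses it fails on older interpreters. The following is a minimal sketch of the feature only. The enum name and members are hypothetical and are not taken from the CUTLASS sources.

```python
from enum import Enum, auto


class LayoutKind(Enum):
    """Hypothetical enum for illustration; not an actual CUTLASS type."""

    # auto() assigns increasing integer values starting at 1,
    # so members do not need hand-numbered constants.
    COLUMN_MAJOR = auto()
    ROW_MAJOR = auto()


def describe(kind: LayoutKind) -> str:
    """Return a 'NAME=value' string for an enum member."""
    return f"{kind.name}={kind.value}"


if __name__ == "__main__":
    for kind in LayoutKind:
        print(describe(kind))
```

On Python 3.5 and earlier, the import of auto fails with ImportError, which is consistent with the build requiring Python 3.6.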

2019-07-10T17:54:12Z

Fixed performance defect with indirect access to pointer array for Volta TensorCores TN arrangement.
Updated patch version and changelog.
Added link to changelog in readme.
Fixed markdown link

2019-03-20T17:49:17Z

CUTLASS 1.3 Release

Efficient GEMM kernel targeting Volta Tensor Cores via mma.sync instruction added in CUDA 10.1.

2018-10-26T21:59:50Z

CUTLASS 1.2