tag:github.com,2008:https://github.com/mengchihe/cutlass/releases Release notes from cutlass 2021-03-03T19:17:40Z
tag:github.com,2008:Repository/359345995/v2.5.0 2021-03-03T19:17:40Z v2.5.0 <p>Create PUBLICATIONS.md (<a href="https://github.com/NVIDIA/cutlass/pull/189">NVIDIA#189</a>)</p> hwu36
tag:github.com,2008:Repository/359345995/v2.4.0 2020-11-23T12:59:45Z v2.4.0 <p>cutlass 2.4 documentation only update</p> manishucsd
tag:github.com,2008:Repository/359345995/v2.3.0 2020-09-25T18:25:26Z v2.3.0: Merge pull request #135 from NVIDIA/cutlass_2.3_final <p>CUTLASS 2.3.0</p> d-k-b
tag:github.com,2008:Repository/359345995/v2.2.0 2020-06-15T17:47:01Z v2.2.0 <p>Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast&lt;&gt;. (…</p> kerrmudgeon
tag:github.com,2008:Repository/359345995/v2.1.0 2020-04-08T17:54:36Z v2.1.0 <p>update tools/library/CMakeLists to require python 3.6 according to <a href="https://github.com/NVIDIA/cutlass/pull/7">NVIDIA#7</a>…</p> thakkarV
tag:github.com,2008:Repository/359345995/v2.0.0 2019-11-22T17:39:12Z v2.0.0 <p>Need Python 3.6 to use enum.auto() (<a href="https://github.com/NVIDIA/cutlass/pull/70">NVIDIA#70</a>)</p> kerrmudgeon
tag:github.com,2008:Repository/359345995/v1.3.3 2019-07-10T17:54:12Z v1.3.3: Performance enhancement for Volta Tensor Cores TN layout (#53) <ul> <li> <p>Fixed performance defect with indirect access to pointer array for Volta TensorCores TN arrangement.</p> </li> <li> <p>Updated patch version and changelog.</p> </li> <li> <p>Added link to changelog in readme.</p> </li> <li> <p>Fixed markdown link</p> </li> </ul> kerrmudgeon
tag:github.com,2008:Repository/359345995/v1.3.2 2019-07-10T17:54:12Z v1.3.2: Performance enhancement for Volta Tensor Cores TN layout (#53) <ul> <li> <p>Fixed performance defect with indirect access to pointer array for Volta TensorCores TN arrangement.</p> </li> <li> <p>Updated patch version and changelog.</p> </li> <li> <p>Added link to changelog in readme.</p> </li> <li> <p>Fixed markdown link</p> </li> </ul> kerrmudgeon
tag:github.com,2008:Repository/359345995/v1.3.0 2019-03-20T17:49:17Z v1.3.0: Cutlass 1.3 Release (#42) <p>CUTLASS 1.3 Release</p> <ul> <li>Efficient GEMM kernel targeting Volta Tensor Cores via mma.sync instruction added in CUDA 10.1.</li> </ul> kerrmudgeon
tag:github.com,2008:Repository/359345995/v1.2.0 2018-10-26T21:59:50Z v1.2.0: Merge pull request #33 from NVIDIA/cutlass_1.2 <p>CUTLASS 1.2</p> kerrmudgeon