tag:github.com,2008:https://github.com/mengchihe/cutlass/releasesRelease notes from cutlass2021-03-03T19:17:40Ztag:github.com,2008:Repository/359345995/v2.5.02021-03-03T19:17:40Zv2.5.0<p>Create PUBLICATIONS.md (<a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="819291046" data-permission-text="Title is private" data-url="https://github.com/NVIDIA/cutlass/issues/189" data-hovercard-type="pull_request" data-hovercard-url="/NVIDIA/cutlass/pull/189/hovercard" href="https://github.com/NVIDIA/cutlass/pull/189">NVIDIA#189</a>)</p>hwu36tag:github.com,2008:Repository/359345995/v2.4.02020-11-23T12:59:45Zv2.4.0<p>cutlass 2.4 documentation only update</p>manishucsdtag:github.com,2008:Repository/359345995/v2.3.02020-09-25T18:25:26Zv2.3.0: Merge pull request #135 from NVIDIA/cutlass_2.3_final<p>CUTLASS 2.3.0</p>d-k-btag:github.com,2008:Repository/359345995/v2.2.02020-06-15T17:47:01Zv2.2.0<p>Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. (…</p>kerrmudgeontag:github.com,2008:Repository/359345995/v2.1.02020-04-08T17:54:36Zv2.1.0<p>update tools/library/CMakeLists to require python 3.6 according to <a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="323756596" data-permission-text="Title is private" data-url="https://github.com/NVIDIA/cutlass/issues/7" data-hovercard-type="pull_request" data-hovercard-url="/NVIDIA/cutlass/pull/7/hovercard" href="https://github.com/NVIDIA/cutlass/pull/7">NVIDIA#7</a>…</p>thakkarVtag:github.com,2008:Repository/359345995/v2.0.02019-11-22T17:39:12Zv2.0.0<p>Need Python 3.6 to use enum.auto() (<a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="527326278" data-permission-text="Title is private" data-url="https://github.com/NVIDIA/cutlass/issues/70" data-hovercard-type="pull_request" data-hovercard-url="/NVIDIA/cutlass/pull/70/hovercard" 
href="https://github.com/NVIDIA/cutlass/pull/70">NVIDIA#70</a>)</p>kerrmudgeontag:github.com,2008:Repository/359345995/v1.3.32019-07-10T17:54:12Zv1.3.3: Performance enhancement for Volta Tensor Cores TN layout (#53)<ul>
<li>
<p>Fixed a performance defect with indirect access to a pointer array for the Volta Tensor Cores TN layout.</p>
</li>
<li>
<p>Updated patch version and changelog.</p>
</li>
<li>
<p>Added link to changelog in readme.</p>
</li>
<li>
<p>Fixed markdown link.</p>
</li>
</ul>kerrmudgeontag:github.com,2008:Repository/359345995/v1.3.22019-07-10T17:54:12Zv1.3.2: Performance enhancement for Volta Tensor Cores TN layout (#53)<ul>
<li>
<p>Fixed a performance defect with indirect access to a pointer array for the Volta Tensor Cores TN layout.</p>
</li>
<li>
<p>Updated patch version and changelog.</p>
</li>
<li>
<p>Added link to changelog in readme.</p>
</li>
<li>
<p>Fixed markdown link.</p>
</li>
</ul>kerrmudgeontag:github.com,2008:Repository/359345995/v1.3.02019-03-20T17:49:17Zv1.3.0: Cutlass 1.3 Release (#42)<p>CUTLASS 1.3 Release</p>
<ul>
<li>Efficient GEMM kernel targeting Volta Tensor Cores via the <code>mma.sync</code> instruction added in CUDA 10.1.</li>
</ul>kerrmudgeontag:github.com,2008:Repository/359345995/v1.2.02018-10-26T21:59:50Zv1.2.0: Merge pull request #33 from NVIDIA/cutlass_1.2<p>CUTLASS 1.2</p>kerrmudgeon