Release notes from llm-analysis (https://github.com/cli99/llm-analysis/releases)

v0.2.2 (2023-11-13) — Bug fixes
No content. — cli99

v0.2.1 (2023-11-02) — v0.2.1
No content. — cli99

v0.2.0 (2023-10-31) — Bug fixes and MoE training analysis support
This release fixes a few bugs in memory usage calculations (e.g. activation memory, optimizer states) and adds support for analyzing MoE training. — cli99

v0.1.1 (2023-08-18) — Bug fixes and Llama 2 inference support
This release:
- adds grouped-query attention (GQA) support
- changes the inference activation memory calculation to assume the maximum tensor buffer
- fixes the KV cache size calculation
- adds a GPU cost analysis to inference
- adds a Llama 2 inference case study
— cli99

v0.1.0 (2023-05-02) — v0.1.0
No content. — cli99