</div>
## Latest News
* [2024/05] [Large AI Models Inference Speed Doubled, Colossal-Inference Open Source Release](https://hpc-ai.com/blog/colossal-inference)
* [2024/04] [Open-Sora Unveils Major Upgrade: Embracing Open Source with Single-Shot 16-Second Video Generation and 720p Resolution](https://hpc-ai.com/blog/open-soras-comprehensive-upgrade-unveiled-embracing-16-second-video-generation-and-720p-resolution-in-open-source)
* [2024/04] [Most cost-effective solutions for inference, fine-tuning and pretraining, tailored to LLaMA3 series](https://hpc-ai.com/blog/most-cost-effective-solutions-for-inference-fine-tuning-and-pretraining-tailored-to-llama3-series)
* [2024/03] [314 Billion Parameter Grok-1 Inference Accelerated by 3.8x, Efficient and Easy-to-Use PyTorch+HuggingFace Version is Here](https://hpc-ai.com/blog/314-billion-parameter-grok-1-inference-accelerated-by-3.8x-efficient-and-easy-to-use-pytorchhuggingface-version-is-here)
<li>
<a href="#Inference">Inference</a>
<ul>
<li><a href="#Colossal-Inference">Colossal-Inference: Large AI Models Inference Speed Doubled</a></li>
<li><a href="#Grok-1">Grok-1: 314B model of PyTorch + HuggingFace Inference</a></li>
<li><a href="#SwiftInfer">SwiftInfer: Breaks the Length Limit of LLM for Multi-Round Conversations with 46% Acceleration</a></li>
<li><a href="#GPT-3-Inference">GPT-3</a></li>
<li><a href="#OPT-Serving">OPT-175B Online Serving for Text Generation</a></li>
- [SwiftInfer](https://github.com/hpcaitech/SwiftInfer): Inference performance improved by 46%; an open-source solution that breaks the length limit of LLMs for multi-round conversations
- [BLOOM](https://github.com/hpcaitech/EnergonAI/tree/main/examples/bloom): Reduces hardware deployment costs of the 176-billion-parameter BLOOM model by more than 10 times
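SwiftInfer builds on the StreamingLLM attention-sink idea: the KV cache keeps the first few "sink" tokens plus a sliding window of recent tokens, so memory stays bounded no matter how long a multi-round conversation grows. The sketch below illustrates that eviction policy in plain Python; it is not SwiftInfer's actual API, and all names and budget values here are illustrative.

```python
# Illustrative sketch of attention-sink cache eviction (not SwiftInfer's API).
# The cache keeps the first n_sink entries (attention sinks) plus the most
# recent `window` entries, bounding memory for arbitrarily long conversations.

def evict_kv_cache(cache, n_sink=4, window=1020):
    """Return a bounded cache: sink entries + the most recent window entries."""
    if len(cache) <= n_sink + window:
        return cache  # still under budget, nothing to evict
    return cache[:n_sink] + cache[-window:]

# Usage: integers stand in for per-token key/value entries.
cache = list(range(2000))      # 2000 cached entries
cache = evict_kv_cache(cache)  # -> 4 sinks + last 1020 entries = 1024
```

The design point is that evicting the earliest tokens entirely degrades generation quality; retaining a handful of initial sink tokens preserves it while the window caps cost.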
<p align="right">(<a href="#top">back to top</a>)</p>