UPDATE (08/09/2023):
We have done a major performance overhaul over the past few months, and I'm happy to share the latest results:
- SOTA performance on CUDA: https://github.com/mlc-ai/llm-perf-bench
- SOTA performance on ROCm: https://blog.mlc.ai/2023/08/09/Making-AMD-GPUs-competitive-for-LLM-inference
============================================================
Hi everyone,
We are looking to gather data points on running MLC-LLM on different hardware and platforms. Our goal is to create a comprehensive reference for new users. Please share your own experiences in this thread! Thank you for your help!
NOTE: for benchmarking, we highly recommend a device with at least 6GB of memory, because the model itself already takes 2.9G. For this reason, the iOS app is known to crash on a 4GB iPhone.
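As a rough illustration of the memory note above, here is a minimal sketch that classifies a device by its total memory. The helper name `memory_verdict` and the three labels are hypothetical and not part of MLC-LLM; only the 2.9G model size and the 6GB recommendation come from the note.

```python
MODEL_SIZE_GB = 2.9    # model footprint quoted in the note above
RECOMMENDED_GB = 6.0   # recommended minimum device memory from the note above


def memory_verdict(device_gb: float) -> str:
    """Classify a device by total memory (hypothetical helper, not an MLC-LLM API)."""
    if device_gb >= RECOMMENDED_GB:
        return "ok"
    if device_gb > MODEL_SIZE_GB:
        # The weights fit, but little headroom is left for everything else.
        return "risky"
    return "too small"


if __name__ == "__main__":
    for name, gb in [("4GB iPhone", 4.0), ("RX 6600XT (8G)", 8.0)]:
        headroom = gb - MODEL_SIZE_GB
        print(f"{name}: {memory_verdict(gb)} (headroom {headroom:.1f} GB)")
```

A device in the "risky" band may still run on some platforms, so treat the result as a hint rather than a guarantee.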
AMD GPUs
Hardware/GPU | OS | Tokens/sec | Source | Notes |
---|---|---|---|---|
RX 6600XT (8G) | N/A | 28.3 | GitHub | |
RX 6750XT | openSUSE Tumbleweed | 8.9 - 154.3 | GitHub | |
RX 6700XT | Windows 11 | 33.7 | GitHub | |
APU 5800H | Windows 11 | 8.5 | GitHub | |
Radeon RX 470 (4G) | AlmaLinux 9.1 | 9.4 | GitHub | |
Radeon Pro 5300M | macOS Ventura | 12.6 | @junrushao | Intel MBP 16" (late 2019) |
AMD GPU on Steam Deck | Steam Deck's Linux | TBD | ||
RX6800 16G VRAM | macOS Ventura | 22.5 | GitHub | Intel MBP 13'' (2020) |
Radeon RX 6600 (8GB) | Ubuntu 22.04 | 7.0 | ||
RX 7900 xtx |
MacBook
Hardware/GPU | OS | Tokens/sec | Source | Notes |
---|---|---|---|---|
2020 MacBook Pro M1 (8G) | macOS | 11.4 | GitHub | |
2021 MacBook Pro M1Pro (16G) | macOS Ventura | 17.1 | GitHub | |
M1 Max Mac Studio (64G) | N/A | 18.6 | GitHub | |
2021 MacBook Pro M1 Max (32G) | macOS Monterey | 21.0 | GitHub | |
MacBook Pro M2 (16G) | macOS Ventura | 22.5 | GitHub | |
2021 MacBook M1Pro (32G) | macOS Ventura | 19.3 | GitHub |
Intel GPUs
Hardware/GPU | OS | Tokens/sec | Source | Notes |
---|---|---|---|---|
Arc A770 | N/A | 3.1 - 118.6 | GitHub | perf issues in decoding need investigation |
UHD Graphics (Comet Lake-U GT2) 1G | Windows 10 | 2.2 | GitHub | |
UHD Graphics 630 | macOS Ventura | 2.3 | @junrushao | Integrated GPU. Intel MBP 16" (late 2019) |
Iris Plus Graphics 1536 MB | macOS Ventura | 2.6 | GitHub | Integrated GPU on MBP |
Iris Plus Graphics 645 1536 MB | macOS Ventura | 2.9 | GitHub | Integrated GPU on MBP |
NVIDIA GPUs
Hardware/GPU | OS | Tokens/sec | Source | Notes |
---|---|---|---|---|
GTX 1650 ti (4GB) | Fedora | 15.6 | GitHub | |
GTX 1060 (6GB) | Windows 10 | 16.7 | GitHub | |
RTX 3080 | Windows 11 | 26.0 | GitHub | |
RTX 3060 | Debian bookworm | 21.3 | GitHub | |
RTX 2080Ti | Windows 10 | 24.5 | GitHub | |
RTX 3090 | N/A | 25.7 | GitHub | |
GTX 1660ti | N/A | 23.9 | GitHub | |
RTX 3070 | N/A | 23.3 | GitHub |
iOS
Hardware/GPU | OS | Tokens/sec | Source | Notes |
---|---|---|---|---|
iPhone 14 Pro | iOS 16.4.1 | 7.2 | @junrushao | |
iPad Pro 11" with M1 | iPadOS 16.1 | 10.6 | GitHub | |
iPad Pro 11" A12Z | N/A | 4.1 | GitHub | |
iPad Pro 11" with M2 (4th gen) | iPadOS 16.5 | 14.1 | GitHub |
Android
Hardware/GPU | OS | Tokens/sec | Link | Notes |
---|---|---|---|---|