[Survey] Supported Hardwares and Speed

UPDATE (08/09/2023):

We have done a major performance overhaul over the past few months, and I'm happy to share the latest results:
- SOTA performance on CUDA: https://github.com/mlc-ai/llm-perf-bench
- SOTA performance on ROCm: https://blog.mlc.ai/2023/08/09/Making-AMD-GPUs-competitive-for-LLM-inference


============================================================


Hi everyone,

We are looking to gather data points on running MLC-LLM on different hardware and platforms. Our goal is to create a comprehensive reference for new users. Please share your own experiences in this thread! Thank you for your help!

**NOTE**: for benchmarking, we highly recommend a device with at least 6GB of memory, because the model itself already takes 2.9GB. For this reason, the iOS app is known to crash on a 4GB iPhone.
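If you are timing a run by hand rather than reading the speed from the app, tokens/sec is simply the number of generated tokens divided by the wall-clock time. Below is a minimal sketch of that arithmetic. Note that `generate` is a hypothetical callable standing in for whatever produces your tokens; it is not part of MLC-LLM's API, and the `clock` parameter exists only to make the helper easy to test.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(
    generate: Callable[[], Sequence],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one call to `generate` and return generated tokens per second.

    `generate` is a hypothetical stand-in: any zero-argument callable that
    returns the sequence of tokens it produced.
    """
    start = clock()
    tokens = generate()
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return len(tokens) / elapsed
```

When reporting a number, it helps to say which phase was timed and how many tokens were generated, since some rows in the tables report a wide range rather than a single value.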

## AMD GPUs

| Hardware/GPU  | OS         | Tokens/sec |  Source             | Notes |
|---------------|------------|------------|---------------------|------|
| RX 6600XT (8G) | N/A |   28.3 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1537158689)  |  |
| RX 6750XT | openSUSE Tumbleweed |   8.9 - 154.3 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1537170150)  |  |
| RX 6700XT | Windows 11 |   33.7 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1537524744)  |  |
| APU 5800H | Windows 11 |   8.5 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1532764716)  |  |
| Radeon RX 470 (4G) | AlmaLinux 9.1 |   9.4 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1531689416)  |  |
| Radeon Pro 5300M | macOS Ventura |   12.6 |  @junrushao  | Intel MBP 16" (late 2019) |
| AMD GPU on Steam Deck |  Steam Deck's Linux | TBD | [Reddit](https://www.reddit.com/r/LocalLLaMA/comments/132igcy/project_mlc_llm_universal_llm_deployment_with_gpu/jia8ux6)                    |      |
| RX6800 16G VRAM |  macOS Ventura | 22.5 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1529753829) | Intel MBP 13'' (2020) |
| Radeon RX 6600 (8GB) | Ubuntu 22.04 | 7.0 |  [Reddit](https://www.reddit.com/r/LocalLLaMA/comments/132igcy/project_mlc_llm_universal_llm_deployment_with_gpu/jih091c/)                   |      |
| RX 7900 XTX              |            |            |  [Reddit](https://www.reddit.com/r/LocalLLaMA/comments/132igcy/project_mlc_llm_universal_llm_deployment_with_gpu/jia691u)                   |      |

## MacBook

| Hardware/GPU  | OS         | Tokens/sec |  Source             | Notes |
|---------------|------------|------------|---------------------|------|
| 2020 MacBook Pro M1 (8G)              |  macOS |  11.4           | [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1529148903)                    |      |
| 2021 MacBook Pro M1Pro (16G) |  macOS Ventura |  17.1 | [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1529434801)                    |      |
| M1 Max Mac Studio (64G) |  N/A |  18.6 | [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1529714864)                    |      |
| 2021 MacBook Pro M1 Max (32G) |  macOS Monterey |  21.0 | [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1530436512)                    |      |
| MacBook Pro M2 (16G) |  macOS Ventura |  22.5 | [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1530991414)                    |      |
| 2021 MacBook M1Pro (32G) |  macOS Ventura |  19.3 | [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1532387161)                    |      |

## Intel GPUs

| Hardware/GPU  | OS         | Tokens/sec |  Source             | Notes |
|---------------|------------|------------|---------------------|------|
| Arc A770 | N/A | 3.1 - 118.6           |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1533637131)  | perf issues in decoding needs investigation |
| UHD Graphics (Comet Lake-U GT2) 1G | Windows 10 | 2.2           |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1534332939)  | |
| UHD Graphics 630 | macOS Ventura | 2.3           |  @junrushao  | Integrated GPU. Intel MBP 16" (late 2019)  |
| Iris Plus Graphics 1536 MB | macOS Ventura | 2.6           |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1532166342)  | Integrated GPU on MBP  |
| Iris Plus Graphics 645 1536 MB | macOS Ventura | 2.9           |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1533951156)  | Integrated GPU on MBP  |

## NVIDIA GPUs

| Hardware/GPU  | OS         | Tokens/sec |  Source             | Notes |
|---------------|------------|------------|---------------------|------|
|  GTX 1650 Ti (4GB) | Fedora  |  15.6  | [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1533244953)                    |       |
|  GTX 1060 (6GB) | Windows 10  |  16.7  | [GitHub](https://github.com/mlc-ai/mlc-llm/issues/13#issue-1689858446)                    |       |
|  RTX 3080 | Windows 11 | 26.0 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1529434801) |      |
|  RTX 3060 | Debian bookworm | 21.3 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1529572646) |      |
|  RTX 2080Ti | Windows 10 | 24.5 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1530146568) |   |
|  RTX 3090 | N/A | 25.7 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1530378843) |   |
|  GTX 1660 Ti | N/A | 23.9 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1530965727) |   |
|  RTX 3070 | N/A | 23.3 |  [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1537292571) |   |

## iOS

| Hardware/GPU  | OS         | Tokens/sec |  Source             | Notes |
|---------------|------------|------------|---------------------|------|
| iPhone 14 Pro | iOS 16.4.1 | 7.2        | @junrushao |      |
| iPad Pro 11" with M1 | iPadOS 16.1  | 10.6           | [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1529377124)                   |      |
| iPad Pro 11" A12Z | N/A  | 4.1           | [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1530307743)                   |      |
| iPad Pro 11" with M2 (4th gen) | iPadOS 16.5  | 14.1           | [GitHub](https://github.com/mlc-ai/mlc-llm/issues/15#issuecomment-1532470561)                   |      |

## Android

| Hardware/GPU  | OS         | Tokens/sec |  Source             | Notes |
|---------------|------------|------------|---------------------|------|
|               |            |            |                     |      |
|               |            |            |                     |      |
|               |            |            |                     |      |

