Deploying LLMs offline on the NVIDIA Jetson platform enables embodied intelligence on devices that run independently, without continuous internet access.
This project focuses on adapting LMDeploy for use with NVIDIA Jetson series edge computing cards, facilitating the implementation of InternLM series LLMs for Offline Embodied Intelligence (OEI).
Demo: [Bilibili]
- [2024/2/26] This project has been included in the LMDeploy community.
- [2024/2/25] Updated support for LMDeploy-v0.2.4.
- ✅: Verified and runnable
- ❌: Verified but not runnable
- ⭕️: Pending verification
| Models | InternLM-7B | InternLM-20B | InternLM2-1.8B | InternLM2-7B | InternLM2-20B |
|---|---|---|---|---|---|
| Orin AGX (32G), Jetpack 5.1 | ✅ Mem: ??/??, 14.68 token/s | ✅ Mem: ??/??, 5.82 token/s | ✅ Mem: ??/??, 56.57 token/s | ✅ Mem: ??/??, 14.56 token/s | ✅ Mem: ??/??, 6.16 token/s |
| Orin NX (16G), Jetpack 5.1 | ✅ Mem: 8.6G/16G, 7.39 token/s | ✅ Mem: 14.7G/16G, 3.08 token/s | ✅ Mem: 5.6G/16G, 22.96 token/s | ✅ Mem: 9.2G/16G, 7.48 token/s | ✅ Mem: 14.8G/16G, 3.19 token/s |
| Xavier NX (8G), Jetpack 5.1 | ❌ | ❌ | ✅ Mem: 4.35G/8G, 28.36 token/s | ❌ | ❌ |
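
As a rough sanity check on the memory column in the table above, the sketch below estimates the storage needed for model weights alone. It assumes the benchmarked models use 4-bit weights (W4A16) and takes nominal parameter counts from the model names; both are assumptions, and the helper `weight_gb` is hypothetical, not part of LMDeploy. Quantization scales, the KV cache, and runtime overhead are ignored, so real usage will be higher.

```python
# Lower-bound estimate of weight storage for a quantized model.
# Assumption: parameter counts are the nominal sizes in the model names.
NOMINAL_PARAMS_BILLIONS = {
    "InternLM-7B": 7.0,
    "InternLM-20B": 20.0,
    "InternLM2-1.8B": 1.8,
    "InternLM2-7B": 7.0,
    "InternLM2-20B": 20.0,
}


def weight_gb(params_billions: float, bits_per_weight: int = 4) -> float:
    """Return weight storage in GB (1 GB = 1e9 bytes).

    Ignores quantization scales/zero points, KV cache, and activations.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


for name, params in NOMINAL_PARAMS_BILLIONS.items():
    four_bit = weight_gb(params, bits_per_weight=4)
    fp16 = weight_gb(params, bits_per_weight=16)
    print(f"{name}: ~{four_bit:.1f} GB at 4-bit vs ~{fp16:.1f} GB at FP16")
```

Under these assumptions a 20B model needs about 10 GB for its weights alone, which is consistent with the 20B models fitting on the 16G Orin NX but not on the 8G Xavier NX. The reported Mem figures are larger than these estimates, presumably because they also include the KV cache and other runtime memory.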
If you have other Jetson series boards, feel free to run the benchmarks and submit your results via a Pull Request (PR) to become a community contributor!
- Testing on more Jetson boards such as Nano and AGX.
- ……
- S1. Quantize the model on a server with W4A16
- S2. Install Miniconda on Jetson
- S3. Install CMake-3.29.0 on Jetson
- S4. Install RapidJSON on Jetson
- S5. Install PyTorch-2.1.0 on Jetson
- S6. Port LMDeploy-0.2.4 to Jetson
- S7. Run InternLM offline on Jetson
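
Once the steps above are complete, the final step (running a quantized model offline) might look roughly like the following sketch. It uses LMDeploy's Python `pipeline` API with a TurboMind engine configured for AWQ-format (W4A16) weights. The model directory `./internlm2-chat-1_8b-4bit` and the helper names `build_engine_kwargs` and `chat_once` are hypothetical, and this is not necessarily how this project's own tutorial runs the model.

```python
def build_engine_kwargs() -> dict:
    """Engine options for a W4A16 model; 'awq' tells TurboMind the weights are 4-bit AWQ."""
    return {"model_format": "awq"}


def chat_once(model_dir: str, prompt: str):
    """Load a locally stored quantized model and answer a single prompt."""
    # Imported lazily so this file can be read or imported on machines without LMDeploy.
    from lmdeploy import pipeline, TurbomindEngineConfig

    engine_config = TurbomindEngineConfig(**build_engine_kwargs())
    pipe = pipeline(model_dir, backend_config=engine_config)
    # The pipeline takes a batch of prompts and returns one response per prompt.
    return pipe([prompt])[0]


# Example call on the Jetson (hypothetical local path to the quantized model from S1):
# print(chat_once("./internlm2-chat-1_8b-4bit", "Hi, please introduce yourself."))
```

Because the model is loaded from a local directory, no network access should be needed at inference time, provided the quantized weights were copied to the board beforehand.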
If this project is helpful to your work, please cite it using the following format:
@misc{hongjun2024lmdeployjetson,
title={LMDeploy-Jetson:Opening a new era of Offline Embodied Intelligence},
author={LMDeploy-Jetson Community},
url={https://github.com/BestAnHongjun/LMDeploy-Jetson},
year={2024}
}