This repository contains a code example inspired by the LinkedIn post *On-Device LLMs: The Future of Private, Personalized AI*. It demonstrates the use of on-device LLMs with Google's MediaPipe.
To run this project, you need to serve it using a static file server. Below are several options, depending on your environment:
For Python 3:

```bash
python3 -m http.server 8000
```

For Python 2:

```bash
python -m SimpleHTTPServer 8000
```

For Node.js, install the http-server package globally:

```bash
npm install -g http-server
```

Then run the server:

```bash
http-server
```

For PHP:

```bash
php -S localhost:8000
```

For Ruby:

```bash
ruby -run -e httpd . -p 8000
```

For Docker, run the following command to serve the project via nginx:

```bash
docker run --name static-server -v $(pwd):/usr/share/nginx/html:ro -p 8080:80 nginx
```

You can find the list of supported models here. These models are compatible with the MediaPipe Tasks GenAI API.
To ensure compatibility with MediaPipe, use pre-converted models. Below are some examples available for download on Kaggle:
- Gemma 2 2B (LiteRT 2b-it-gpu-int8)
- Gemma 1.1 2B (LiteRT 2b-it-gpu-int4)
- Gemma 1.1 7B (LiteRT 7b-it-gpu-int8)
Ensure you select the model variation optimized for your hardware, whether GPU or CPU, based on your system's capabilities.
See the complete list of pre-converted models here.
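
For reference, here is a minimal sketch of how a downloaded model could be wired into MediaPipe's LLM Inference task in the browser. It assumes the `@mediapipe/tasks-genai` web package, an example model path of `./models/gemma2-2b-it-gpu-int8.bin`, and illustrative generation settings; the actual code in this repository may differ:

```ts
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

async function initLlm(): Promise<LlmInference> {
  // Load the WASM assets that back the GenAI tasks.
  const genai = await FilesetResolver.forGenAiTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
  );

  // Point the task at a pre-converted model served alongside the page.
  // The file name below is an assumed example; use the variant you downloaded.
  return LlmInference.createFromOptions(genai, {
    baseOptions: { modelAssetPath: './models/gemma2-2b-it-gpu-int8.bin' },
    maxTokens: 1000, // illustrative generation settings
    topK: 40,
    temperature: 0.8,
  });
}
```

Note that the model file referenced by `modelAssetPath` is fetched over HTTP, so it needs to live under the directory you are serving; this is part of why the static-server step above matters.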
All example prompts can be found in the prompts directory.
To achieve the best results, structure your input using Gemma's prompt format, which wraps each turn in tokens like `<start_of_turn>` and `<end_of_turn>`. Proper formatting helps the model distinguish user and model turns and noticeably improves the quality of its responses.
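
As a rough illustration of that formatting, the snippet below wraps a plain user message in Gemma's turn markers before passing it to the LLM Inference task. The `toGemmaPrompt` and `ask` helpers are hypothetical, and the `llm` instance is assumed to come from a setup like the sketch above:

```ts
import { LlmInference } from '@mediapipe/tasks-genai';

// Wrap a plain user message in Gemma's chat template so the model sees
// an explicit user turn followed by an open model turn to complete.
function toGemmaPrompt(userMessage: string): string {
  return (
    '<start_of_turn>user\n' +
    userMessage +
    '<end_of_turn>\n' +
    '<start_of_turn>model\n'
  );
}

// Example usage with an already-initialized LlmInference instance.
async function ask(llm: LlmInference, question: string): Promise<string> {
  return llm.generateResponse(toGemmaPrompt(question));
}
```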
Watch this YouTube video for a hands-on demonstration of the project in action.
- Make sure to serve the project from the root directory to correctly load assets and dependencies.
- Ensure you select the model variation optimized for your hardware, whether GPU or CPU, based on your system's capabilities.
- Only use models that have been pre-converted for compatibility with MediaPipe and your hardware.