
On-Device LLM Demo

This repository contains a code example inspired by the LinkedIn post "On-Device LLMs: The Future of Private, Personalized AI". It demonstrates running an LLM entirely on-device in the browser with Google's MediaPipe.


Table of Contents

  1. Introduction
  2. Setup
  3. Supported Models
  4. Pre-Converted Models
  5. Prompts
  6. See It in Action
  7. Notes

Setup

To run this project, you need to serve it using a static server. Below are several options based on your environment:

Using Python

For Python 3:

python3 -m http.server 8000

For Python 2:

python -m SimpleHTTPServer 8000

Using Node.js

Install the http-server package globally:

npm install -g http-server

Run the server:

http-server

Using PHP

php -S localhost:8000

Using Ruby

ruby -run -e httpd . -p 8000

Using Docker

Run the following command to serve the project via Docker:

docker run --name static-server -v $(pwd):/usr/share/nginx/html:ro -p 8080:80 nginx

Supported Models

You can find the list of supported models here. These models are compatible with the MediaPipe Tasks GenAI (LLM Inference) API.
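
As a rough sketch of how a page can drive one of these models, the snippet below (TypeScript) loads a pre-converted model with the LLM Inference API from the @mediapipe/tasks-genai package. The package, classes, and generateResponse call are the published API; the model path and generation parameters are illustrative assumptions, not values from this repository.

// Minimal sketch, assuming @mediapipe/tasks-genai is installed and a
// pre-converted model file is served alongside the page.
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

async function runDemo(): Promise<void> {
  // Resolve the WASM assets that back the GenAI tasks.
  const genai = await FilesetResolver.forGenAiTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
  );

  // Load a pre-converted model; the filename below is a placeholder.
  const llm = await LlmInference.createFromOptions(genai, {
    baseOptions: { modelAssetPath: '/models/gemma-2b-it-gpu-int4.bin' },
    maxTokens: 512,   // assumed generation budget
    topK: 40,
    temperature: 0.8,
  });

  // Inference runs entirely in the browser; no request leaves the device.
  const answer = await llm.generateResponse('Why is the sky blue?');
  console.log(answer);
}

runDemo();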


Pre-Converted Models

To ensure compatibility with MediaPipe, use pre-converted models; pre-converted variants are available for download on Kaggle.

Ensure you select the model variation optimized for your hardware, whether GPU or CPU, based on your system's capabilities; a sketch of this selection follows below.

See the complete list of pre-converted models here.
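
As a rough illustration of that selection, the TypeScript sketch below picks a GPU-converted model when the browser exposes WebGPU and falls back to a CPU-converted one otherwise. Both filenames are hypothetical placeholders, not actual Kaggle asset names.

// Hedged sketch: pick a model variant based on hardware support.
// Substitute the filenames of the pre-converted models you downloaded.
const hasWebGpu = typeof navigator !== 'undefined' && 'gpu' in navigator;
const modelAssetPath = hasWebGpu
  ? '/models/gemma-2b-it-gpu-int4.bin'  // GPU-optimized conversion
  : '/models/gemma-2b-it-cpu-int4.bin'; // CPU-optimized conversion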


Prompts

All example prompts can be found in the prompts directory.
To achieve the best results, structure your input using Gemma's prompt format, which wraps each turn in <start_of_turn> and <end_of_turn> tokens. Proper formatting noticeably improves the quality of the model's responses.
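
For reference, a single-turn prompt in Gemma's format wraps the user message and then opens the model turn so generation continues as the reply. The helper below is a minimal sketch, not code from this repository:

// Wrap a user message in Gemma control tokens and open the model turn.
function toGemmaPrompt(userText: string): string {
  return `<start_of_turn>user\n${userText}<end_of_turn>\n<start_of_turn>model\n`;
}

// toGemmaPrompt('Why is the sky blue?') yields:
// <start_of_turn>user
// Why is the sky blue?<end_of_turn>
// <start_of_turn>model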


See It in Action

Watch this YouTube video for a hands-on demonstration of the project.

Notes

  • Make sure to serve the project from the root directory to correctly load assets and dependencies.
  • Select the model variation (GPU or CPU) that matches your hardware's capabilities.
  • Use only models that have been pre-converted for compatibility with MediaPipe and your hardware.
