on-device-inference
Here are 28 public repositories matching this topic...
TinyML & Edge AI: on-device inference, model quantization, embedded ML, and ultra-low-power AI for microcontrollers and IoT devices (a quantization sketch follows below).
Updated Nov 10, 2025 - Python
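The quantization that entry refers to is most commonly post-training integer quantization. A minimal sketch using the TensorFlow Lite converter, one of several toolchains a TinyML project might use; the saved-model path and file names are placeholders:

import tensorflow as tf

# Post-training dynamic-range quantization: weights are stored as int8,
# shrinking the model for flash-constrained microcontrollers.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# On MCUs the flatbuffer is typically embedded in firmware as a C array.
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)

Quantizing activations to int8 as well would additionally require a calibration set supplied through converter.representative_dataset.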
Auditable offline edge intelligence for low-cost devices, with benchmark evidence and public proof-of-run on an ESP32-C3 board.
Updated Mar 23, 2026 - Python
Flutter starter app for NobodyWho, a library for running LLMs locally and efficiently on any device.
Updated Apr 9, 2026 - Dart
Custom llama.cpp fork with a character-intelligence engine: control vectors, attention bias, head rescaling, attention temperature, and fast-weight memory (see the steering sketch below).
Updated Apr 4, 2026 - C++
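Control vectors steer generation by adding a learned direction to a layer's residual-stream activations on every forward pass. A toy numpy illustration of the idea only, not this fork's C++ implementation; the hidden size, direction, and strength are all made up:

import numpy as np

d_model = 8                                # toy hidden size
hidden = np.random.randn(d_model)          # residual-stream activation at one layer
direction = np.random.randn(d_model)       # learned control vector, e.g. "formal - casual"
direction /= np.linalg.norm(direction)

strength = 1.5                             # tuning knob, often set per layer
steered = hidden + strength * direction    # applied at inference time, no retraining

print(steered.shape)                       # (8,) -- same shape, shifted along the direction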
iOS and Android app that runs local LLMs on-device, plus routstr cloud LLMs for anonymous inference.
Updated Sep 18, 2025 - TypeScript
Production Android AI with ExecuTorch 1.0: deploy PyTorch models to mobile with NPU acceleration and a 50 KB footprint (export sketch below).
Updated Nov 14, 2025 - Python
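The ExecuTorch deployment path is roughly: capture the PyTorch graph, lower it to the edge dialect, and serialize a .pte program that the on-device runtime loads. A minimal CPU-only sketch following the publicly documented flow; the tiny model is an assumption, and NPU delegation via backend partitioners is omitted:

import torch
from torch.export import export
from executorch.exir import to_edge

class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(16, 4)

    def forward(self, x):
        return torch.relu(self.fc(x))

# Capture -> edge dialect -> ExecuTorch program.
exported = export(TinyNet(), (torch.randn(1, 16),))
program = to_edge(exported).to_executorch()

# The .pte file is what the Android runtime loads.
with open("tinynet.pte", "wb") as f:
    f.write(program.buffer)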
Mobile AI: iOS CoreML, Android TFLite, on-device inference, ONNX, TensorRT, and ML deployment for smartphones (runtime sketch below).
Updated Nov 10, 2025 - Python
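Across these stacks the shape of on-device inference is the same: load a compiled graph into a session, feed tensors, read outputs. A minimal sketch with ONNX Runtime's Python API; the model file and input shape are placeholders:

import numpy as np
import onnxruntime as ort

# CPU session; mobile builds expose the same API with NNAPI/Core ML providers.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

input_name = session.get_inputs()[0].name
x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder image batch

outputs = session.run(None, {input_name: x})  # None = return all outputs
print(outputs[0].shape)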
High-performance Android SDK for on-device LLM inference (GGUF). Privacy-focused, offline-first, and powered by llama.cpp with a clean Kotlin Coroutines API (the GGUF flow is sketched below).
Updated Mar 27, 2026 - Kotlin
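The SDK's surface is Kotlin, but the underlying llama.cpp GGUF workflow looks the same from any binding. As a stand-in, a minimal llama-cpp-python sketch; the model filename is a placeholder and this SDK's actual Kotlin API will differ:

from llama_cpp import Llama

# Load a quantized GGUF model entirely on-device; no network access required.
llm = Llama(model_path="model-q4_k_m.gguf", n_ctx=2048)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain on-device inference in one sentence."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])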
The Private Agent OS: search files, run AI agents, and connect to 10,000+ tools via the complete protocol stack (MCP, AG-UI, A2UI, A2A; wire-format sketch below). Zero cloud. Zero telemetry. On-device inference.
Updated Apr 11, 2026 - Rust
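Of that protocol stack, MCP is the most standardized piece: JSON-RPC 2.0 with methods such as tools/call. A sketch of what one tool invocation looks like on the wire, per the public MCP spec; the id, tool name, and arguments are placeholders:

import json

# One MCP tool invocation as a JSON-RPC 2.0 request.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_files",                      # hypothetical tool
        "arguments": {"query": "quarterly report"},  # tool-specific args
    },
}
print(json.dumps(request, indent=2))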
Neural acoustic echo cancellation for Apple platforms using CoreML — Swift package with 128/256/512-unit DTLN-aec models
Updated Mar 9, 2026 - Swift
React Native SDK for local LLM inference and on-device AI on iOS and Android.
Updated Mar 14, 2026 - TypeScript
Ad generation via offline LLMs with on-device inference, optionally managed by a self-hosted CMS.
Updated Feb 28, 2026
Real-time SAM2 segmentation on edge devices - 40x faster C++ inference with ONNX Runtime for iOS/Android deployment
Updated Nov 14, 2025 - C++
Run small LLMs directly in your web browser, no cloud computing needed.
Updated Mar 24, 2026 - TypeScript
Swift wrapper for Apple's BNNS graph API — run compiled CoreML models (.mlmodelc) on CPU with zero-copy buffer management
Updated Mar 9, 2026 - Swift
Open source Node.js runtime for local LLM inference, on-device AI, and private model execution.
Updated Mar 24, 2026 - TypeScript
Privacy-first age check that keeps face processing on the device but still gives the server a claim it can verify (one possible pattern is sketched below).
Updated Apr 10, 2026 - JavaScript
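This repo's actual protocol isn't shown here, but one simple pattern for "on-device processing, server-verifiable result" is a MAC over the device's claim using a key provisioned at enrollment. A generic Python sketch of that idea only; key distribution and device attestation are simplified away:

import hashlib
import hmac
import json

# Hypothetical: a per-device key shared/attested at enrollment time.
device_key = b"provisioned-at-enrollment"

# Device side: face processing stays local; only the claim and tag leave the device.
claim = json.dumps({"over_18": True, "nonce": "a1b2c3"}).encode()
tag = hmac.new(device_key, claim, hashlib.sha256).hexdigest()

# Server side: recompute the tag without ever seeing the face data.
expected = hmac.new(device_key, claim, hashlib.sha256).hexdigest()
assert hmac.compare_digest(tag, expected)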
Deep technical writing on edge AI, on-device inference, llama.cpp, GGML, and mobile AI engineering
Updated Mar 14, 2026 - Svelte