Run local LLMs like Gemma, Qwen, and LLaMA on Android for offline, private, real-time chat and question answering with LiteRT and ONNX Runtime.
Run AI without internet, on the web and desktop.
A cloud-to-edge MLOps pipeline for offline industrial diagnostics. Fine-tunes Phi-3-mini (3.8B) on cloud GPUs via QLoRA, quantizes to INT4, and deploys as a CPU-optimized ONNX microservice for industry-standard sensor logs.
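As a rough illustration of why the INT4 step above matters for edge deployment, here is a back-of-envelope sketch of the weight-memory arithmetic. The 3.8B parameter count comes from the description; the calculation ignores quantization scales, zero-points, and activation memory, so the real footprint is somewhat larger.

```python
# Back-of-envelope weight-memory math for quantizing a 3.8B-parameter model.
PARAMS = 3.8e9  # Phi-3-mini parameter count (from the description above)

def weight_bytes(params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in bytes, ignoring scales/zero-points."""
    return params * bits_per_weight / 8

fp16_gb = weight_bytes(PARAMS, 16) / 1e9  # half-precision baseline
int4_gb = weight_bytes(PARAMS, 4) / 1e9   # after INT4 quantization
print(f"FP16: {fp16_gb:.1f} GB, INT4: {int4_gb:.1f} GB "
      f"({fp16_gb / int4_gb:.0f}x smaller)")
```

The 4x reduction (roughly 7.6 GB down to 1.9 GB of weights) is what makes CPU-only serving of a 3.8B model plausible on modest industrial hardware.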
Edge-AI powered Data-to-Text system that analyzes global fishery trends (Capture, Aquaculture, Stocks) and generates automated status reports offline using LSTM & TensorFlow Lite.
A comprehensive toolkit for streamlining and simplifying the offline inference process for LLMs across various models and libraries.
A multimodal offline counterfeit-detection system (text + image + table).
GPT-OSS 20B local execution: a lightweight local environment for running the model with Python 3.12 and CUDA acceleration. Run GPT-OSS 20B entirely offline, optimize text generation with the GPU, and enable fast, secure inference on consumer hardware.
Inclusive hand gesture recognition system for assistive human–computer interaction, based on classical machine learning and MediaPipe Hands.
Offline CrowdAware system for Raspberry Pi 4B and Heltec LoRa V3 using Raspberry Pi Camera Module 3 and MLX90640 Thermal Camera.
Real-time semantic audio codec achieving 300 bps bandwidth via generative-AI reconstruction.
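For context on how a semantic codec could fit speech into a 300 bps budget, here is a hypothetical bitrate sketch. The codebook size and token rate below are illustrative assumptions, not figures taken from the repository; the point is only that a small stream of discrete semantic tokens, reconstructed into audio by a generative model, needs orders of magnitude less bandwidth than waveform coding.

```python
import math

# Illustrative assumptions (not from the repository):
VOCAB_SIZE = 4096       # assumed semantic-token codebook size
TOKENS_PER_SECOND = 25  # assumed token rate of the encoder

bits_per_token = math.log2(VOCAB_SIZE)            # 12 bits per token
bitrate_bps = bits_per_token * TOKENS_PER_SECOND  # 12 * 25 = 300 bps
print(f"{bits_per_token:.0f} bits/token x {TOKENS_PER_SECOND} tokens/s "
      f"= {bitrate_bps:.0f} bps")
```

Under these assumptions the token stream alone accounts for the full 300 bps; all perceptual detail is then hallucinated back by the generative decoder rather than transmitted.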