Stars
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
SGLang is a fast serving framework for large language models and vision language models.
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by the OpenAI Solutions team.
[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).
LiteRT is the new name for TensorFlow Lite (TFLite). While the name is new, it's still the same trusted, high-performance runtime for on-device AI, now with an expanded vision.
Official inference repo for FLUX.1 models
Development repository for the Triton language and compiler
The easiest way to use Agentic RAG in any enterprise
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
Supporting PyTorch models with the Google AI Edge TFLite runtime.
[EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models
On-device AI across mobile, embedded and edge for PyTorch
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
A vector search SQLite extension that runs anywhere!
An open-source framework for machine learning and other computations on decentralized data.
A framework for Privacy Preserving Machine Learning
Federated Learning Simulator (FLSim) is a flexible, standalone core library that simulates FL settings with a minimal, easy-to-use API. FLSim is domain-agnostic and accommodates many use cases such…
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
Meaningful control of data in distributed systems.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.