Starred repositories
Optimized primitives for collective multi-GPU communication
Universal LLM Deployment Engine with ML Compilation
Tile primitives for speedy kernels
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
A list of papers about distributed consensus.
Making Long-Context LLM Inference 10x Faster and 10x Cheaper
A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
An open-source RAG-based tool for chatting with your documents.
Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS
Real time transcription with OpenAI Whisper.
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
Whisper realtime streaming for long speech-to-text transcription and translation
OpenAI Assistants API quickstart with Next.js.
Real-time face swap and one-click video deepfake with only a single image
Minimap2onGPU / mm2-gb
Forked from lh3/minimap2. A versatile pairwise aligner for genomic and spliced nucleotide sequences
A large scale non-linear optimization library
A massively parallel, high-level programming language
An optimization-based multi-sensor state estimator
📐 Jekyll theme for building a personal site, blog, project documentation, or portfolio.
Context aware, pluggable and customizable data protection and de-identification SDK for text and images
[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering
Library for faster pinned CPU <-> GPU transfer in PyTorch