Starred repositories
Official implementation of the paper "Watermark Anything with Localized Messages"
A modern, C++-native, test framework for unit-tests, TDD and BDD - using C++14, C++17 and later (C++11 support is in v2.x branch, and C++03 on the Catch1.x branch)
A fast JSON serializing & deserializing library, accelerated by SIMD.
A blazingly fast JSON serializing & deserializing library
For optimization algorithm research and development.
LLM knowledge sharing that anyone can understand; a must-read before autumn-recruitment LLM interviews, so you can talk confidently with interviewers
A safetensors extension to efficiently store sparse quantized tensors on disk
✨✨Latest Advances on Multimodal Large Language Models
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
PipeFusion / PipeFusion
Forked from xdit-project/xDiT: A Suite for Parallel Inference of Diffusion Transformers (DiTs) on multi-GPU Clusters
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Collection of kernels written in Triton language
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable (a sketch of that single line appears after this list).
This project aims to automatically translate and summarize Huggingface's daily papers into Korean using ChatGPT.
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Langchain-Chatchat (formerly Langchain-ChatGLM): local-knowledge-based RAG and Agent applications built on Langchain with language models such as ChatGLM, Qwen, and Llama
Official implementation of Diffusion Policy Policy Optimization, arXiv 2024
Efficient Triton Kernels for LLM Training
A throughput-oriented high-performance serving framework for LLMs
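
The Kernl entry above advertises speeding up PyTorch transformer inference with a single line of code. Below is a minimal sketch of what that call typically looks like, assuming the optimize_model entry point described in Kernl's README; the exact import path and the fp16/CUDA requirements are assumptions and may vary by version.

```python
# Minimal sketch of Kernl's advertised one-line optimization (assumed API).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from kernl.model_optimization import optimize_model  # assumed import path

model_name = "bert-base-uncased"
model = AutoModelForSequenceClassification.from_pretrained(model_name).eval().cuda()

optimize_model(model)  # the "single line": swaps eligible ops for fused Triton kernels

tokenizer = AutoTokenizer.from_pretrained(model_name)
inputs = tokenizer("a short test sentence", return_tensors="pt").to("cuda")

# Kernl targets fp16 inference on CUDA, so the forward pass runs under autocast.
with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=torch.float16):
    logits = model(**inputs).logits
```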