Starred repositories
Official implementation of the paper "Watermark Anything with Localized Messages"
A modern, C++-native, test framework for unit-tests, TDD and BDD - using C++14, C++17 and later (C++11 support is in v2.x branch, and C++03 on the Catch1.x branch)
A fast JSON serializing & deserializing library, accelerated by SIMD.
A blazingly fast JSON serializing & deserializing library
For optimization algorithm research and development.
LLM knowledge sharing that anyone can understand; a must-read before autumn-recruitment LLM interviews, so you can talk confidently with interviewers
A safetensors extension to efficiently store sparse quantized tensors on disk
✨✨Latest Advances on Multimodal Large Language Models
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
PipeFusion / PipeFusion
Forked from xdit-project/xDiT: A Suite for Parallel Inference of Diffusion Transformers (DiTs) on multi-GPU Clusters
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Collection of kernels written in Triton language
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable (a sketch of that single line appears after this list).
This project aims to automatically translate and summarize Huggingface's daily papers into Korean using ChatGPT.
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Langchain-Chatchat (formerly Langchain-ChatGLM): local-knowledge-based RAG and Agent applications built on Langchain with language models such as ChatGLM, Qwen, and Llama
Official implementation of Diffusion Policy Policy Optimization, arXiv 2024
Efficient Triton Kernels for LLM Training
A throughput-oriented high-performance serving framework for LLMs
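
The Kernl entry above advertises speeding up PyTorch transformer inference with a single line of code. Below is a minimal sketch of what that call typically looks like, assuming the optimize_model entry point described in Kernl's README; the exact import path and the fp16/CUDA requirements are assumptions and may vary by version.

```python
# Minimal sketch of Kernl's advertised one-line optimization (assumed API).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from kernl.model_optimization import optimize_model  # assumed import path

model_name = "bert-base-uncased"
model = AutoModelForSequenceClassification.from_pretrained(model_name).eval().cuda()

optimize_model(model)  # the "single line": swaps eligible ops for fused Triton kernels

tokenizer = AutoTokenizer.from_pretrained(model_name)
inputs = tokenizer("a short test sentence", return_tensors="pt").to("cuda")

# Kernl targets fp16 inference on CUDA, so the forward pass runs under autocast.
with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=torch.float16):
    logits = model(**inputs).logits
```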