Stars
A framework for few-shot evaluation of language models.
Stretching GPU performance for GEMMs and tensor contractions.
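Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments.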
A validation and profiling tool for AI infrastructure
Dissecting NVIDIA GPU Architecture
IREE plugin repository for the AMD AIE accelerator
A cheatsheet of modern C++ language and library features.
Graph Neural Network Library for PyTorch
Library for specialized dense and sparse matrix operations, and deep learning primitives.
An MLIR-based toolchain for AMD AI Engine-enabled devices.
METIS - Serial Graph Partitioning and Fill-reducing Matrix Ordering
A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support
A JIT assembler for x86/x64 architectures supporting MMX, SSE (1-4), AVX (1-2, 512), FPU, APX, and AVX10.2
Python package built to ease deep learning on graphs, on top of existing DL frameworks.
Resources on the GraphBLAS standard for graph algorithms in the language of linear algebra
The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear algebra primitives specifically targeting graph analytics.
ParMETIS - Parallel Graph Partitioning and Fill-reducing Matrix Ordering
PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity
Machine learning compiler based on MLIR for Sophgo TPU.
🌟 Wiki of OI / ICPC for everyone. (An online strategy guide for a certain large-scale game, featuring dazzling arithmetic magic)
PyTorch Extension Library of Optimized Autograd Sparse Matrix Operations
A list of awesome compiler projects and papers for tensor computation and deep learning.