charlifu
  • AMD


A framework for few-shot evaluation of language models.

Python 7,798 2,096 Updated Feb 14, 2025

Stretching GPU performance for GEMMs and tensor contractions.

Python 233 156 Updated Feb 14, 2025

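Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments.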

Python 19,833 5,909 Updated Feb 12, 2025

A validation and profiling tool for AI infrastructure

Python 290 62 Updated Feb 15, 2025

Dissecting NVIDIA GPU Architecture

Cuda 88 27 Updated Jul 11, 2022

ROB size testing utility

C++ 142 13 Updated Dec 19, 2021

IREE plugin repository for the AMD AIE accelerator

MLIR 78 31 Updated Feb 15, 2025

A cheatsheet of modern C++ language and library features.

20,089 2,130 Updated Oct 15, 2024

Graph Neural Network Library for PyTorch

Python 21,881 3,759 Updated Feb 15, 2025

Library for specialized dense and sparse matrix operations, and deep learning primitives.

C 861 188 Updated Feb 12, 2025

An MLIR-based toolchain for AMD AI Engine-enabled devices.

MLIR 334 102 Updated Feb 15, 2025

METIS - Serial Graph Partitioning and Fill-reducing Matrix Ordering

C 770 150 Updated Oct 27, 2023

A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support

Python 15,405 528 Updated Feb 15, 2025

A JIT assembler for x86/x64 architectures supporting MMX, SSE (1-4), AVX (1-2, 512), FPU, APX, and AVX10.2

C++ 2,080 277 Updated Feb 13, 2025

A list of awesome GNN systems.

Python 300 27 Updated Feb 15, 2025

Python package built to ease deep learning on graph, on top of existing DL frameworks.

Python 13,699 3,032 Updated Feb 11, 2025

Resources on the GraphBLAS standard for graph algorithms in the language of linear algebra

191 11 Updated Oct 22, 2024

The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear algebra primitives specifically targeting graph analytics.

C++ 70 24 Updated Dec 9, 2024

ParMETIS - Parallel Graph Partitioning and Fill-reducing Matrix Ordering

C 127 46 Updated Dec 8, 2023

PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity

Cuda 102 27 Updated Feb 3, 2025

Machine learning compiler based on MLIR for Sophgo TPU.

C++ 661 166 Updated Feb 8, 2025

🌟 Wiki of OI / ICPC for everyone. (An online strategy guide for a certain large-scale game, featuring cool arithmetic magic)

TypeScript 22,119 4,097 Updated Feb 14, 2025

PyTorch Extension Library of Optimized Autograd Sparse Matrix Operations

Python 1,036 150 Updated Jan 10, 2025

Stack trace visualizer

Perl 17,789 1,995 Updated Oct 20, 2024
C++ 2 Updated Jun 11, 2022

A microbenchmark support library

C++ 9,250 1,652 Updated Feb 13, 2025

Low-latency machine code generation

C++ 4,043 512 Updated Feb 12, 2025

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,483 307 Updated Oct 19, 2024

Conversions to MLIR EmitC

C++ 126 23 Updated Dec 12, 2024