
[NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models

Python 156 12 Updated Jan 1, 2025
Python 137 12 Updated Jul 22, 2024

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 12,656 845 Updated Mar 11, 2025

✔ (Completed) The most comprehensive deep learning notes [Tudui PyTorch] [Mu Li, Dive into Deep Learning] [Andrew Ng, Deep Learning]

Jupyter Notebook 8,028 1,013 Updated Dec 26, 2024

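An outer-circle disciple of Ni Haixia, carrying on the lost teachings of past sages. Passes on traditional Chinese medicine, based mainly on Master Ni Haixia's Renji and Tianji series, with a focus on classical formulas (jingfang); later additions include lectures by veteran TCM masters such as Li Ke and Hu Xishu. Contains Master Ni's Renji series (not speech-to-text transcripts of the videos; all are textbooks and lecture notes compiled by Master Ni himself). Self-study order: acupuncture, Huangdi Neijing, Shennong Bencao Jing, Shanghan Lun, Jingui Yaolue, as PDFs of the five course textbooks (study alongside the Bilibili videos; just search for 倪海夏). Self-study notes are being typed up by hand...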

HTML 48 23 Updated Mar 10, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 4,917 488 Updated Mar 11, 2025

Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloaded from https://developer.nvidia.com/nvcomp.

C++ 574 80 Updated Sep 11, 2024

FlashMLA: Efficient MLA decoding kernels

C++ 11,268 790 Updated Mar 1, 2025

Run generative AI models in sophgo BM1684X

Python 183 31 Updated Mar 12, 2025

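DeepSeek | Chinese official site, DeepSeek web version, API usage, and local deployment tutorial | The most complete usage guide [Updated March 2025]. Use the DeepSeek web version with ease: fast, stable, and lag-free, with support for DeepSeek R1 and V3 as well as ChatGPT 4o, o1, and o3. This guide provides comprehensive DeepSeek usage instructions, including alternatives to the official DeepSeek site, the DeepSeek web version, API usage, DeepSee…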

101 18 Updated Mar 11, 2025

CUDA/Metal accelerated language model inference

C 523 23 Updated Mar 9, 2025

LLM inference in C/C++

C++ 76,328 11,046 Updated Mar 12, 2025

Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O

C++ 270 27 Updated Jan 15, 2025

CPU inference for the DeepSeek family of large language models in pure C++

C++ 272 26 Updated Feb 11, 2025

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C 7,093 1,957 Updated Mar 10, 2025

LLM training in simple, raw C/CUDA

Cuda 26,001 2,980 Updated Oct 2, 2024

Fully open reproduction of DeepSeek-R1

Python 22,647 2,034 Updated Mar 11, 2025

LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale

Python 92 12 Updated Feb 24, 2025

Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.

228 14 Updated Mar 3, 2025

📰 Must-read papers on KV Cache Compression (constantly updating 🤗).

332 7 Updated Mar 6, 2025

STP Toolbox for Python

Python 3 1 Updated Jun 7, 2023

This is the accompanying code for our arXiv preprint: "BoolNet: Minimizing the Energy Consumption of Binary Neural Networks"

Python 10 5 Updated Sep 17, 2021

Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket

Python 66 5 Updated Feb 12, 2023

A 2D Unity simulation in which cars learn to navigate themselves through different courses. The cars are steered by a feedforward neural network. The weights of the network are trained using a modi…

ASP 1,500 363 Updated Sep 7, 2022

LightSeq: A High Performance Library for Sequence Processing and Generation

C++ 3,257 331 Updated May 16, 2023

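A collection of commonly used computer science e-books with download links, covering Java, Python, Linux, Go, C, C++, data structures and algorithms, artificial intelligence, computer fundamentals, interviews, design patterns, databases, front-end development, and more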

4,413 961 Updated Jul 9, 2022

A customized Hexo blog theme

EJS 501 330 Updated Jan 7, 2023

Several simple examples for popular neural network toolkits calling custom CUDA operators.

Python 1,412 196 Updated Apr 29, 2021