[NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
✔ (Complete) The most comprehensive deep learning notes: [Tudui PyTorch], [Mu Li's Dive into Deep Learning], [Andrew Ng's Deep Learning]
An outer-school disciple of Ni Haixia, carrying forward the learning of past sages. Preserving traditional Chinese medicine, centered on Ni Haixia's Renji and Tianji lecture series, with a focus on classical formulas (jingfang); later additions include course material from senior physicians such as Li Ke and Hu Xishu. Contains Ni's Renji series (not speech-to-text transcripts of videos — all are textbooks and lecture notes compiled by Ni himself). Suggested self-study order: Acupuncture, Huangdi Neijing, Shennong Bencao Jing, Shanghan Lun, and Jingui Yaolue — five teaching textbooks in PDF (study alongside the Bilibili videos; search for "倪海夏"). Self-study notes are currently being typed up...
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloaded from https://developer.nvidia.com/nvcomp.
DeepSeek | Chinese official site, DeepSeek web version, API access, and local deployment tutorial | The most complete usage guide (updated March 2025). Use the DeepSeek web version easily — fast, stable, and lag-free — with support for DeepSeek R1, V3, as well as ChatGPT 4o, o1, and o3. This guide provides comprehensive DeepSeek usage instructions, including alternatives to the official site, the web version, API usage, DeepSee…
Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O
CPU inference for the DeepSeek family of large language models in pure C++
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
Fully open reproduction of DeepSeek-R1
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
This is the accompanying code for our arxiv pre-print: "BoolNet: Minimizing the Energy Consumption of Binary Neural Networks"
Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
A 2D Unity simulation in which cars learn to navigate themselves through different courses. The cars are steered by a feedforward neural network. The weights of the network are trained using a modi…
LightSeq: A High Performance Library for Sequence Processing and Generation
A curated collection of commonly used computer-science e-books with download links, covering Java, Python, Linux, Go, C, C++, data structures and algorithms, artificial intelligence, computer fundamentals, interviews, design patterns, databases, front-end development, and more
Several simple examples of calling custom CUDA operators from popular neural network toolkits.