Stars
SGLang is a fast serving framework for large language models and vision language models.
VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues
Introduction to Machine Learning Systems
sandeep06011991 / tvm
Forked from apache/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
PyTorch Library for Fast and Easy Distributed Graph Learning
sandeep06011991 / dgl_groot
Forked from dmlc/dgl
Python package built to ease deep learning on graph, on top of existing DL frameworks.
A collection of AWESOME things about Graph-Related LLMs.
Graph Neural Network-based Surrogate Models for Finite Element Analysis
A Python package that extends the official PyTorch to easily obtain performance on Intel platforms
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
RelBench: Relational Deep Learning Benchmark
A concise but complete full-attention transformer with a set of promising experimental features from various papers
A minimum demo for PyTorch c10d extension APIs
An Industrial Graph Neural Network Framework
A high-throughput and memory-efficient inference and serving engine for LLMs
The project uses a Xilinx Artix-7 FPGA on a Digilent Basys 3 board to design a clock whose seconds, minutes, & hours are displayed on a Quad 7-segment display & can also be displayed on a vga displ…
How to write firmware for AVRs and other embedded design practices.
The re-implementation of <End-to-End Lane Marker Detection via Row-wise Classification>
Making large AI models cheaper, faster and more accessible
SoCC'20 and TPDS'21: Scaling GNN Training on Large Graphs via Computation-aware Caching and Partitioning.
A list of awesome compiler projects and papers for tensor computation and deep learning.
Things to learn for new students in the Lab for AI chips and systems of BJTU.