ZJY0516

Reverse Engineering: Decompiling Binary Code with Large Language Models

Python 5,087 341 Updated Oct 28, 2024

CPU inference for the DeepSeek family of large language models in pure C++

C++ 251 23 Updated Feb 11, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 2,113 219 Updated Feb 20, 2025

Curated collection of papers in MoE model inference

69 4 Updated Feb 19, 2025

JAX bindings for the flash-attention3 kernels

C++ 11 1 Updated Aug 6, 2024

Kernel-Bypass LibOS Architecture

Rust 1,092 125 Updated Feb 21, 2025

Fast and memory-efficient exact attention

Python 15,601 1,479 Updated Feb 19, 2025

Custom Linux scheduler for concurrency fuzzing written in Java with hello-ebpf

Java 22 1 Updated Feb 13, 2025

FlagGems is an operator library for large language models implemented in Triton Language.

Python 421 65 Updated Feb 21, 2025

Fused SwiGLU Triton kernels

Python 4 Updated Jan 25, 2024

My learning notes and code for ML systems.

Python 868 42 Updated Feb 21, 2025

Perceptual video quality assessment based on multi-method fusion.

Python 4,799 767 Updated Feb 12, 2025

FFMPEG Assembly Language Lessons

1,370 39 Updated Jan 27, 2025

How to become a self-consistent programmer

Shell 1,933 90 Updated Feb 20, 2025

Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA

C++ 746 44 Updated Feb 21, 2025

🎓 Automatically update MLSys papers daily using GitHub Actions (updates every 12 hours)

Python 2 Updated Feb 21, 2025

Low-bit LLM inference on CPU with lookup table

C++ 687 53 Updated Jan 9, 2025

Implementation of AlphaFold 3 from Google DeepMind in PyTorch

Python 1,365 169 Updated Jan 22, 2025

Course materials for MIT6.5940: TinyML and Efficient Deep Learning Computing

Jupyter Notebook 29 2 Updated Jan 8, 2025

GitHub page for "Large Language Model-Brained GUI Agents: A Survey"

CSS 114 6 Updated Feb 2, 2025
Python 13 2 Updated Jan 13, 2025

Include binary files in C/C++

C 1,021 95 Updated Jul 12, 2024

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 40,301 5,389 Updated Feb 20, 2025

Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O

C++ 247 25 Updated Jan 15, 2025

Port of OpenAI's Whisper model in C/C++

C++ 37,916 3,925 Updated Feb 19, 2025

AlphaFold 3 inference pipeline.

Python 6,080 747 Updated Feb 14, 2025

Building blocks for foundation models.

449 19 Updated Jan 3, 2024

A visualized debugging framework to aid in understanding the Linux kernel.

C 104 7 Updated Feb 20, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,603 159 Updated Feb 20, 2025

Beamer template for Shanghai Jiao Tong University

TeX 619 62 Updated Jan 19, 2025