Organizations

@utcs-scea @Rust-sys

Starred repositories

a (nearly) no-CSS, fast, minimalist Hugo theme ported from riggraz/no-style-please.

HTML 315 113 Updated Jan 20, 2025

Open Agentic Schema Framework

Elixir 138 14 Updated Apr 17, 2025

The slightly more awesome standard unix password manager for teams

Go 6,132 508 Updated Apr 22, 2025

Read-only demo server for larger datasets

5 1 Updated Mar 5, 2024

📚A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism etc.

Python 3,871 275 Updated Apr 18, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 2,711 285 Updated Apr 22, 2025

Build computation graphs from python functions

Python 2 Updated May 20, 2023

A computation graph micro-framework providing seamless lazy and concurrent evaluation

Python 18 Updated Feb 14, 2020

Documentation for Google's Gen AI site - including the Gemini API and Gemma

Jupyter Notebook 1,972 692 Updated Apr 22, 2025

Dynamic Memory Management for Serving LLMs without PagedAttention

C 351 27 Updated Apr 18, 2025

O1 Replication Journey

1,985 66 Updated Jan 14, 2025

CUDA on non-NVIDIA GPUs

Rust 11,212 716 Updated Apr 22, 2025

NumPy & SciPy for GPU

Python 10,135 903 Updated Apr 20, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,300 1,374 Updated Apr 23, 2025

MSCCL++: A GPU-driven communication stack for scalable AI applications

C++ 342 48 Updated Apr 23, 2025

AICI: Prompts as (Wasm) Programs

Rust 2,019 83 Updated Jan 22, 2025

Read-only mirror of https://git.zx2c4.com/cgit/about . Pull requests and issues on GitHub cannot be accepted and will be automatically closed. The proper way to submit changes is via the mailing li…

C 178 25 Updated Mar 15, 2025

Cuda 1 Updated Dec 28, 2023

Inference Llama 2 in one file of pure C

C 18,305 2,241 Updated Aug 6, 2024

TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.

Python 934 305 Updated Apr 22, 2025

Altis-SYCL: a SYCL-based implementation of the Altis GPGPU benchmark suite for CPUs, GPUs, and FPGAs.

C++ 1 1 Updated Dec 22, 2023

A High contrast, text oriented, performant and Javascript-free theme for Hugo.

HTML 196 96 Updated May 10, 2024

A simple, responsive writing (and reading) theme for Hugo.

CSS 2 Updated Aug 18, 2023

PyTorch Extension Library of Optimized Autograd Sparse Matrix Operations

Python 1,056 154 Updated Apr 10, 2025

system paper reading notes

243 13 Updated Mar 3, 2022

📖 A curated list of resources dedicated to Machine Learning for Systems research

11 Updated Jun 29, 2020

JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

Python 24,110 2,015 Updated Sep 26, 2024

This repository contains demos I made with the Transformers library by HuggingFace.

Jupyter Notebook 10,767 1,595 Updated Apr 21, 2025