  • Meta
  • San Francisco, CA

AllenJWZhu/README.md

To the things that change in an instant · and the things that never change

ML System Protection Association: save your GPU, save lives.


About Me

I’m Allen Zhu. I build high-performance machine-learning systems.

  • 🔬 Now: SWE Intern @ Meta (Facebook) • AI Infra
  • ⚙️ Before: Research Intern @ SenseTime • AI Lab; SWE Intern @ iFLYTEK • Speech AI
  • ✍️ Writing: large language models, machine learning systems, high-performance computing (Medium / Zhihu)
  • 🧐 Seeking: full-time ML Systems / Infra roles
  • 🎯 Side quests: photography, strategy games, Pokémon

🗂️ Timeline

gantt
    dateFormat  YYYY-MM
    title       Work Experience
    section Experience
    Meta – ML Infra Intern (C++/CUDA)     :a1, 2025-05, 3m
    SenseTime – Research Intern (LLM Quantization) :a2, 2024-12, 2m
    iFLYTEK – Software Engineer Intern (Speech AI) :a3, 2024-08, 3m

🧰 Tech Stack

Languages • ML Systems / HPC • Cloud & DevOps

☕ Reach Out

Pinned

  1. LlamaInfer (Public)

    LLM Inference Engine: a high-performance, CUDA-accelerated framework for large language model inference. A cutting-edge, open-source implementation of a large language model (LLM) inference engine, opt…

    C++ 7 1

  2. sglang (Public)

    Forked from sgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Python

  3. verl (Public)

    Forked from volcengine/verl

    verl: Volcano Engine Reinforcement Learning for LLMs

    Python

  4. BERT_TensorRT_Inference_Optimization (Public)

    Inference optimization of the BERT model using TensorRT, NVIDIA's high-performance deep learning inference platform. TensorRT is designed to maximize the efficiency of deep learning models during i…

    5

  5. Distributed-PoW-based-Fault-Tolerant-Blockchain-System (Public)

    Proof-of-Work Blockchain Consensus System: a high-performance, modular implementation of a decentralized blockchain network. This project is a robust and scalable implementation of a Proof-of-Work (…

    Go 1

  6. CMU_Course_Notes (Public)

    This is the combined collection of the course notes for some of the computer science classes at CMU released online.

    63 16