- 👋 Hi, I’m Lingling.
- 👀 I’m interested in LLM inference/serving and photonic compilers.
- 🌱 I’m currently focused on model quantization and parallelism.
- 📫 How to reach me: linglingfan@stanford.edu
- 😄 Pronouns: She/Her
- ⚡ Fun fact: I am a pianist.
LF started this repo solely for open-source exploration. It does not represent her affiliations with any company or school.
Pinned
- SageAttention (CUDA, forked from [thu-ml/SageAttention](https://github.com/thu-ml/SageAttention)): Quantized attention that achieves 2-3x and 3-5x speedups over FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.