In the age of GPT, I'm going to hand-curate the best links I've used to learn LLMs.
Welcome.
PS: This is for people trying to go deeper. If you want something kind of basic, look elsewhere.
Start by going through the Table of Contents and note what you've already read and what you haven't. Then start with the Easy links in each section. Each area has several subtopics, each of which goes progressively deeper. If a section has no articles yet, feel free to email suggestions or raise a PR.
- 🟩 Model Architecture
- 🟩 Agentic LLMs
- 🟩 Methodology
- 🟩 Datasets
- 🟩 Pipeline
- 🟩 FineTuning
- 🟩 Quantization
- 🟩 RL in LLM
- 🟩 Coding
- 🟩 Deployment
- 🟩 Engineering
- 🟩 Benchmarks
- 🟩 Modifications
- 🟩 Misc Algorithms
- 🟩 Explainability
- 🟩 MultiModal Transformers
- 🟩 Adversarial methods
- 🟩 Misc
- 🟩 Add to the guide
This section talks about the key aspects of LLM architecture.
📝 Try to cover the basics of Transformers, then understand the GPT architecture before diving deeper into the other concepts.
- Rotary Positional Encoding Explained (a quick sketch of the idea follows this list)
- Jay Alammar: The Illustrated GPT-2
- Reproducing GPT-2 (124M) in llm.c in 90 minutes for $20
- Umar Jamil: Llama Explained
- Umar Jamil: Llama 2 from Scratch
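If you want a quick taste of what the rotary positional encoding link above covers, here is a minimal NumPy sketch of the idea. The function and variable names are my own and purely illustrative, not from any particular library, and it uses the interleaved-pair convention from the original RoFormer paper:

```python
import numpy as np

def rope(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary positional encoding to x of shape (seq_len, dim), with dim even."""
    seq_len, dim = x.shape
    # One rotation frequency per pair of channels: theta_i = base^(-2i/dim)
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)            # (dim/2,)
    angles = np.arange(seq_len)[:, None] * inv_freq[None, :]    # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                              # split channels into pairs
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin   # rotate each pair by a position-dependent angle
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Toy usage: queries (or keys) for an 8-token sequence with 16-dim heads
q = rope(np.random.randn(8, 16))
```

Because the encoding is a pure rotation, the dot product between a rotated query and a rotated key depends only on their relative positions, which is the property the linked article explains.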
This section talks about various aspects of agentic LLMs.
- Agentic LLMs Deep Dive
This section tries to cover various methodologies used in LLMs.
- How continuous batching enables 23x throughput in LLM inference while reducing p50 latency
- LLM Inference Optimizations — Continuous Batching and Selective Batching, Orca
- [vLLM] LLM Inference Optimizations: Chunked Prefill and Decode-Maximal Batching
- LLM Inference Series: 2. The two-phase process behind LLMs’ responses
- LLM Inference Series: 4. KV caching, a deeper look
- An Introduction to Model Merging for LLMs
- Merging tokens to accelerate LLM inference with SLERP (a quick sketch of SLERP follows this list)
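Spherical linear interpolation (SLERP) comes up in the merging links above; here is a rough, hedged sketch of the operation itself (the names are illustrative and not taken from any merging library):

```python
import numpy as np

def slerp(v0: np.ndarray, v1: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Interpolate between v0 and v1 along the arc between their directions."""
    # Angle between the two directions, computed on normalized copies
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    omega = np.arccos(np.clip(np.dot(v0n, v1n), -1.0, 1.0))
    if omega < eps:  # nearly parallel: plain linear interpolation is fine
        return (1 - t) * v0 + t * v1
    return (np.sin((1 - t) * omega) * v0 + np.sin(t * omega) * v1) / np.sin(omega)

# Toy usage: blend two flattened weight (or token-embedding) vectors halfway
merged = slerp(np.random.randn(1024), np.random.randn(1024), t=0.5)
```

Unlike plain averaging, SLERP moves along the arc between the two directions rather than cutting through the chord, which avoids the norm shrinkage you get when averaging nearly orthogonal high-dimensional vectors.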
Add links you find useful through pull requests.
