Exorust/LLM-Deep-Dive

The LLM Deep Dive

🐦 Follow me on Twitter • 📧 Contact on Email


In an age of GPT, I'm going to handwrite the best links I've used to learn LLMs.

Welcome.

PS: This is for people trying to go deeper. If you want something more introductory, look elsewhere.

◻️How to use this guide?

Start by going through the table of contents and noting what you've already read and what you haven't. Then begin with the easier links in each section; each area has multiple subtopics that go progressively deeper. If a section has no articles yet, feel free to email me additions or raise a PR.

◻️Table of contents

🟩 Model Architecture

This section talks about the key aspects of LLM architecture.

📝 Try to cover the basics of Transformers first, then understand the GPT architecture before diving deeper into other concepts.

◻️Transformer Architecture

Tokenization
Positional Encoding
Rotary Positional Encoding

- Rotary Positional Encoding Explained
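Alongside the links above, a minimal plain-Python sketch of the core rotary idea may help: each pair of dimensions is rotated by an angle proportional to the token's position, so dot products between rotated queries and keys depend only on relative position. This is my own toy illustration (the base of 10000 follows the usual RoPE convention), not code from any linked article:

```python
import math

def rope(x, pos, base=10000.0):
    """Apply rotary position embedding to one vector x at position pos.

    Each pair (x[2i], x[2i+1]) is rotated by angle pos * base**(-2i/d),
    where d = len(x). Rotation preserves the norm of each pair.
    """
    d = len(x)
    out = []
    for i in range(0, d, 2):
        theta = pos * (base ** (-i / d))
        c, s = math.cos(theta), math.sin(theta)
        out.append(x[i] * c - x[i + 1] * s)
        out.append(x[i] * s + x[i + 1] * c)
    return out
```

The key property: `dot(rope(q, m), rope(k, n))` depends only on the offset `n - m`, which is what lets attention scores encode relative position.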

◻️GPT Architecture

◻️Attention

◻️Loss

Cross-Entropy Loss
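Since cross-entropy is the training loss behind next-token prediction, a minimal plain-Python version (numerically stabilized with the log-sum-exp trick) may anchor the reading; this is my own sketch, not taken from a linked resource:

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy loss for one token: -log softmax(logits)[target]."""
    m = max(logits)  # subtract the max so exp() can't overflow
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]
```

Sanity check: with uniform logits over n classes the loss is log(n), and a confident correct prediction drives it toward zero.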

🟩 Agentic LLMs

This section talks about various aspects of agentic LLMs.

- Agentic LLMs Deep Dive


🟩 Methodology

This section tries to cover various methodologies used in LLMs.

◻️Distillation


🟩 Datasets


🟩 Pipeline

◻️Training

◻️Inference

RAG

◻️Prompting


🟩 FineTuning

◻️Quantized FineTuning

◻️LoRA

◻️DPO

◻️ORPO

◻️RLHF


🟩 Quantization

◻️Post Training Quantization

Static/Dynamic Quantization
GPTQ
GGUF
LLM.int8()
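As a rough sketch of the symmetric "absmax" int8 scheme these methods build on: scale weights so the largest magnitude maps to 127, round, and dequantize by multiplying the scale back. The helper names here are mine, and real implementations (e.g. LLM.int8()'s vector-wise path) quantize per-row or per-block tensors rather than Python lists:

```python
def quantize_absmax(weights):
    """Toy symmetric absmax int8 quantization of a list of floats."""
    absmax = max(abs(w) for w in weights)
    scale = absmax / 127.0 if absmax > 0 else 1.0  # guard all-zero input
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate floats from int8 codes and the stored scale."""
    return [qi * scale for qi in q]
```

The round-trip error per weight is at most half the scale, which is why outlier values (which inflate the scale) hurt plain absmax quantization.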

◻️Quantization-Aware Training → 1-bit LLMs


🟩 RL in LLM


🟩 Coding

◻️Torch Fundamentals


🟩 Deployment


🟩 Engineering

◻️Flash Attention 2

◻️KV Cache
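The idea behind KV caching: during autoregressive decoding, append each new token's key and value instead of recomputing the whole prefix, then attend over the stored lists. A toy plain-Python sketch (class and function names invented for illustration):

```python
import math

class KVCache:
    """Toy per-layer key/value cache for autoregressive decoding."""
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k, v):
        # O(1) per decode step, vs. recomputing K/V for the full prefix
        self.keys.append(k)
        self.values.append(v)

def attend(q, keys, values):
    """Single-query scaled dot-product attention over cached keys/values."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
              for k in keys]
    m = max(scores)
    w = [math.exp(s - m) for s in scores]  # stable softmax
    z = sum(w)
    w = [x / z for x in w]
    return [sum(wi * v[j] for wi, v in zip(w, values))
            for j in range(len(values[0]))]
```

The memory cost is the flip side: the cache grows linearly with sequence length, which motivates batched inference tricks and paged/quantized caches.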

◻️Batched Inference

◻️Python Advanced

Decorators
Context Managers
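Both patterns show up constantly in training and inference code: decorators to wrap functions with timing or logging, context managers to set up and tear down state. A minimal sketch of each (`timed` and `inference_mode` are names I made up for illustration):

```python
import functools
import time
from contextlib import contextmanager

def timed(fn):
    """Decorator: record how long each call to fn takes."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        wrapper.elapsed = time.perf_counter() - start
        return result
    return wrapper

@contextmanager
def inference_mode(state):
    """Context manager: switch a 'training' flag off, restore it on exit."""
    prev = state.get("training", True)
    state["training"] = False
    try:
        yield state
    finally:
        state["training"] = prev  # restored even if the body raises
```

This is the same shape as `torch.no_grad()` or a `model.eval()`/`model.train()` pair: the `finally` guarantees cleanup even on exceptions.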

◻️Triton Kernels

◻️CUDA

◻️JAX / XLA JIT compilers

◻️Model Exporting (vLLM, Llama.cpp, QLoRA)

◻️ML Debugging


🟩 Benchmarks


🟩 Modifications

◻️Model Merging

- An Introduction to Model Merging for LLMs (Medium)

Linear Mapping
SLERP

- Merging tokens to accelerate LLM inference with SLERP (Medium)

TIES
DARE

◻️MoE


🟩 Misc Algorithms

◻️Chained Matrix Unit

◻️Gradient Checkpointing

◻️Chunked Cross Entropy

◻️BPE


🟩 Explainability

◻️Sparse Autoencoders

◻️Task Vectors

◻️Counterfactuals


🟩 MultiModal Transformers

◻️Audio

Whisper Models
Diarization

🟩 Adversarial methods


🟩 Misc


🟩 Add to the guide:

Add links you find useful through pull requests.
