This repository contains my solutions to problems on Deep-ML, a site offering LeetCode-style questions for machine learning and data science. Each solution uses either numpy or pure Python, depending on the type signature of the method: for example, if the method takes two np.arrays, I use numpy; otherwise, I use pure Python.
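As a rough illustration of that convention, the sketch below shows one function in each style. The function names and signatures are hypothetical and are not taken from the repository or from Deep-ML; they only show how the type hints would drive the choice between pure Python and numpy.

```python
import numpy as np


def scalar_multiply(matrix: list[list[float]], scalar: float) -> list[list[float]]:
    """Pure-Python style: the (hypothetical) signature uses plain lists."""
    # Multiply every entry of every row by the scalar.
    return [[scalar * value for value in row] for row in matrix]


def matrix_dot(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """numpy style: the (hypothetical) signature takes two np.ndarray arguments."""
    # The @ operator performs matrix multiplication on ndarrays.
    return a @ b
```

Calling `scalar_multiply([[1, 2], [3, 4]], 2)` works on nested lists without importing anything, while `matrix_dot` expects its inputs to already be numpy arrays.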
Note: the same question can appear in more than one collection, so the lists below contain duplicates.
- AlexNet
- Implement ReLU Activation Function
- Simple Convolutional 2D Layer
- Overlapping Max Pooling
- PCA Colour Augmentation
- Dropout Layer
- Attention is All You Need
- Implement Self-Attention Mechanism
- Implement Multi-Head Attention
- Implement Masked Self-Attention
- Implement Layer Normalization for Sequence Data
- Positional Encoding Calculator
- Deep Learning
- Linear Algebra
- Probability and Statistics
- Optimization Techniques
- Fundamentals of Neural Networks
- Softmax Activation Function Implementation
- Implementation of Log Softmax Function
- Sigmoid Activation Function
- Implement ReLU Activation Function
- Leaky ReLU Activation Function
- Implement the PReLU Activation Function
- Single Neuron
- Implementing a Simple RNN
- Implement a Long Short-Term Memory (LSTM) Network
- Simple Convolutional 2D Layer
- GPT-2 Text Generation
- Backpropagation
- Single Neuron with Backpropagation
- Implementing Basic Autograd Operations
- Implement a Simple RNN with Backpropagation Through Time (BPTT)
- LLM
- Implement Self-Attention Mechanism
- The Pattern Weaver's Code
- Positional Encoding Calculator
- Implement Multi-Head Attention
- GPT-2 Text Generation
- DeepSeek R1
- Implement the GRPO Objective Function
- Group Relative Advantage for GRPO
- KL Divergence Estimator for GRPO
- Pass@k and Majority Voting Evaluation Metrics
- Knowledge Distillation Loss
- DenseNet
- Single Neuron with Backpropagation
- Simple Convolutional 2D Layer
- Implement ReLU Activation Function
- Implement a Simple Residual Block with Shortcut Connection
- Implement Global Average Pooling
- Implement Batch Normalization for BCHW Input
- Implement a Batch Dense Block with 2D Convolutions
- GPT 243
- Autograd Engine (Value Class & Backpropagation)
- Lab: Autograd
- Tokenization & Embeddings
- Character-Level Tokenizer (stoi/itos/BOS)
- Learned Positional Embeddings
- Lab: Tokenization
- Build a Tokenizer for Language Modeling
- Core Building Blocks (Linear, Softmax, RMSNorm)
- Lab: Build a Neural Network from Scratch
- MNIST: Build Neural Network from Scratch (numpy Only)
- Multi-Head Attention & KV Cache
- Implement Self-Attention Mechanism
- Implement Masked Self-Attention
- Implement Multi-Head Attention
- KV Cache for Efficient Autoregressive Attention
- Lab: Attention
- Design Your Own Attention Mechanism
- Transformer Block (Residuals, MLP, Activations)
- Implement a Simple Residual Block with Shortcut Connection
- Implement Position-wise Feed-Forward Block with Residual and Dropout
- Implement the Square ReLU Activation Function
- Lab: Activation Function
- Loss Functions & Cross-Entropy
- Compute Multi-class Cross-Entropy Loss
- Implementation of Log Softmax Function
- Adam Optimizer & Learning Rate Schedule
- Implement Adam Optimization Algorithm
- Linear Learning Rate Decay
- Lab: Optimizer
- Design Your Own Optimizer (numpy)
- Training Loop (Putting It All Together)
- Calculate Number of Parameters in Neural Network
- Lab: Full Training Loop
- Build a Digit Classifier from Scratch
- Inference & Text Generation
- Temperature Sampling
- LLM Evaluation Methods
- Multiple Choice Benchmarks
- MMLU Letter-Matching Evaluation
- MMLU Log-Probability Scoring
- Verifier-Based Evaluation
- Boxed Answer Extraction for Math Benchmarks
- Math Answer Verification with Equivalence Checking
- Code Execution Verifier for Programming Benchmarks
- Preference Leaderboards
- Elo Rating System for Model Comparison
- Bradley-Terry Model for Pairwise Rankings
- LLM-as-a-Judge
- Rubric-Based LLM Judge Evaluation
- Pairwise Preference Judge for LLM Comparison
- Other Measures Mentioned in the Post
- BLEU Score for Text Generation
- Calculate Perplexity for Language Models
- Compute Multi-class Cross-Entropy Loss
- Linear Algebra
- Vector Spaces
- Matrix Operations
- Reshape Matrix
- Scalar Multiplication of a Matrix
- Implement Compressed Row Sparse Matrix (CSR) Format Conversion
- Implement Orthogonal Projection of a Vector onto a Line
- Implement Compressed Column Sparse Matrix Format (CSC)
- Transformation Matrix from Basis B to C
- Matrix Transformation
- Calculate 2x2 Matrix Inverse
- Matrix times Matrix
- Implement Reduced Row Echelon Form (RREF) Function
- Eigenvalues and Eigenvectors
- Matrix Factorization and Decomposition
- Machine Learning
- Linear Algebra
- Probability and Statistics
- Optimization
- Model Evaluation
- Generate a Confusion Matrix for Binary Classification
- Calculate Accuracy Score
- Implement Precision Metric
- Implement Recall Metric in Binary Classification
- Implement F-Score Calculation for Binary Classification
- Calculate R-squared for Regression Analysis
- Calculate Mean Absolute Error (MAE)
- Calculate Root Mean Square Error (RMSE)
- Implement K-Fold Cross-Validation
- Calculate Performance Metrics for a Classification Model
- Implementation of Log Softmax Function
- Implement ReLU Activation Function
- Classification & Regression Techniques
- Linear Regression Using Normal Equation
- Linear Regression Using Gradient Descent
- Binary Classification with Logistic Regression
- Calculate Jaccard Index for Binary Classification
- Pegasos Kernel SVM Implementation
- Implement AdaBoost Fit Method
- Softmax Activation Function Implementation
- Unsupervised Learning
- Deep Learning
- Metadata Normalization (MDN)
- Mathematical Prerequisites
- Linear Regression Using Normal Equation
- Implement Orthogonal Projection of a Vector onto a Line
- Normalization Baselines (What MDN Improves Upon)
- Implement Batch Normalization for BCHW Input
- Implement Group Normalization
- Core MDN Concepts
- Implement Code MDN Residualization
- Distance Correlation for Measuring Metadata Dependence
- Advanced MDN (Handling Confounding)
- MDN with Label Collinearity Control
- Lab
- Feature Deconfounder for Biased Image Data
- ResNet
- Single Neuron with Backpropagation
- Simple Convolutional 2D Layer
- Implement ReLU Activation Function
- Implement a Simple Residual Block with Shortcut Connection
- Implement Global Average Pooling
- Implement Batch Normalization for BCHW Input
- Sparsely Gated MoE
- Softmax Activation Function Implementation
- Single Neuron
- Calculate Computational Efficiency of MoE
- Implement Noisy Top-K Gating Function
- Implement a Sparse Mixture of Experts Layer
- Data Science I Interview Prep
- Core Machine Learning Concepts
- Data Processing
- Deep Learning
- Model Evaluation & Metrics
- Essence of Linear Algebra
- Vectors
- Linear Combinations
- Linear Transformations
- Matrix Multiplication
- Determinant
- Inverse Matrices
- Cross Product
- Cramer's Rule
- Change of Basis
- Eigenvector and Eigenvalues
- Micrograd Builder
- Optimizers
- MNIST: PyTorch DataLoader
- MNIST: Design-Your-Own tiny PyTorch Model
- MNIST: Design Your Own PyTorch Optimizer
- MNIST: Classification Loss (with Gradient)
- MNIST: Adversarial Example Generation
- MNIST: Build Neural Network from Scratch (numpy Only)
- MNIST: Fix Very Deep Network Training
- Design Your Own Optimizer (numpy)
- Design Your Own Activation Function
- Design Your Own Attention Mechanism
- Data Preprocessing: Handling Missing Values
- PyTorch: Implement Your Own Gradient Descent Training Step
- PyTorch: Build a Complete Training Loop
- Numpy: Design Your Own Dimensionality Reduction
- Dimensionality Reduction with Sklearn
- Feature Deconfounder for Biased Image Data
- Train a Linear Regression Model
- Build a Tokenizer for Language Modeling
- Build a Digit Classifier from Scratch
- Fix Overfitting with Regularization (numpy)
- Fix Overfitting with Regularization (Sklearn)
- Train a Binary Classifier
- Design Your Own Normalization Layer
- Design Your Own MoE Router
- Calculus for Machine Learning
- Derivatives and Gradients
- Multivariate Calculus
- Neural Network Derivatives
- Backpropagation
- Gradient Descent
- Optimization
- Calculus Lab
- PyTorch: Calculus Lab 1
- PyTorch: Calculus Lab 2
- Linear Algebra for Machine Learning
- Vector Operations
- Vector Norms and Independence
- Matrix Basics
- Matrix Multiplication
- Matrix Properties I
- Matrix Properties II
- Solving Linear Systems
- Orthogonality and Projections
- Matrix Decompositions I
- Matrix Decompositions II
- Covariance and Correlation
- Linear Algebra Lab I
- Linear Algebra Lab II
- Probability and Statistics for Machine Learning
- Descriptive Statistics
- Probability Fundamentals
- Bayes' Theorem
- Common Distributions I
- Common Distributions II
- Law of Large Numbers and CLT
- Information Theory
- KL Divergence
- Maximum Likelihood and MAP
- Statistical Inference
- Bayesian Methods
- Probabilistic Models