Author: Sarang Pramode
Each week in the table below is broken out into three rows: Part 1 (Monday), Part 2 (Wednesday), and a Self-Work block (Thu–Sat). The weekly rhythm is:
- Monday (Part 1): Deepen foundational understanding of the topic with essential math, keywords, and conceptual clarity.
- Between Monday and Wednesday: 2–3 hours of self-work to practice coding, read recommended papers/chapters, and close any gaps.
- Wednesday (Part 2): Dive into advanced usage and application in PyTorch, building on the Monday foundation.
- Between Wednesday and the weekend: Another 2–3 hours of self-work to reinforce new material, experiment, or prepare for the next topic.
- Sunday: Rest or do light review.
| WEEK | SESSION (Part 1 / Part 2 / Self-Work) | KEY TOPICS & DESCRIPTION | STUDY RESOURCES | ACTION ITEMS | DATE |
|---|---|---|---|---|---|
| 1 | Part 1 (Monday) | Foundations of Linear/Logistic Regression & Gradient Descent. Detailed Focus: - Why regression matters (predictive modeling basics). - Concepts: cost/loss functions (MSE, cross-entropy), partial derivatives, learning rates. - Keywords/Phrases: “Gradient Descent,” “Loss Landscape,” “Overfitting,” “Regularization.” | - Deep Learning (Goodfellow et al.), Ch. 5 (basic ML/optimization). - Andrew Ng’s ML Coursera (Weeks 1–2) for the classical ML approach. - Khan Academy / 3Blue1Brown videos on gradient descent (for math intuition). | - Math Prep: Practice manual gradient calculation on simple functions. - Simple Code: Attempt a minimal NumPy-based linear regression to see how gradient descent is coded (see the sketch after the table). | Mon, 5 Jan |
| 1 | Part 2 (Wed) | PyTorch Basics & Autograd. Detailed Focus: - Translating the “manual gradient calculation” to PyTorch’s autograd system. - Tensors, computational graphs, and environment setup (conda/venv). - Keywords/Phrases: “Computational Graph,” “Tensor Operations,” “Backpropagation in PyTorch.” | - Deep Learning with PyTorch, Ch. 1–2. - PyTorch Tutorials on Tensors & Autograd. - Official PyTorch docs for environment setup. | - Write a minimal PyTorch script using torch.tensor for a simple polynomial function and see how .backward() works (see the sketch after the table). - Confirm the environment is correctly installed (PyTorch + CUDA if available). | Wed, 7 Jan |
| 1 | Self-Work | - Reinforce fundamental ML concepts (Week 1 Part 1). - Implement a simple PyTorch-based linear regression using MSE. - Explore the differences between manual gradients and autograd. | - Code snippet references from Deep Learning with PyTorch, Ch. 2 examples. - Andrew Ng’s ML Coursera “Practice Labs.” | - Document any confusion or errors and clarify them next session. - Build a mental map of the entire data → model → optimization pipeline. | Thu–Sat |
| 2 | Part 1 (Monday) | Fully Connected Networks (MLPs) – Prerequisites. Detailed Focus: - Recap linear algebra for matrix multiplications in MLPs. - Activation functions (ReLU, Sigmoid, Tanh). - Keywords/Phrases: “Neural Activation,” “Vanishing Gradient,” “Parameter Initialization.” | - Deep Learning with PyTorch, Ch. 3 (introduction to feedforward nets). - MIT 6.S191 Lecture 1 for fundamentals of deep networks. - Hands-On ML (Géron), Ch. 10 (basics of neural nets). | - Brush up on matrix multiplication and dimension alignment for MLP layers. - Sketch out how an MLP transforms inputs step by step. | Mon, 12 Jan |
| 2 | Part 2 (Wed) | Implementing MLPs & Training Loops. Detailed Focus: - Using torch.nn and torch.optim for parameter updates. - Differences between SGD and Adam (Adam paper). - Keywords/Phrases: “Training Loop,” “Epoch,” “Optimizer Step.” | - PyTorch CIFAR-10 tutorial (basic classifier structure). - Deep Learning, Ch. 8.1–8.2 (neural net optimization). - Example code from the PyTorch docs on building an MLP with nn.Module. | - Compare training an MLP with Adam vs. pure SGD on a small dataset (e.g., MNIST or Iris); see the sketch after the table. - Log results in a spreadsheet or experiment tracker. | Wed, 14 Jan |
| 2 | Self-Work | - Reinforce MLP knowledge with a custom dataset classification task (e.g., Iris). - Experiment with different optimizers, learning rates, and mini-batch sizes. | - Andrew Ng’s ML labs (deep learning section) for more advanced assignment ideas. - (Optional) Hands-On ML, Ch. 10 for alternative MLP viewpoints. | - Track training/validation metrics and any divergences. - Summarize findings in a short note. | Thu–Sat |
| 3 | Part 1 (Monday) | CNNs – Convolution Math & Intuition. Detailed Focus: - Understanding the convolution operation (filters, kernels). - Key CNN concepts: stride, padding, channels. - Keywords/Phrases: “Local Receptive Field,” “Feature Map,” “Pooling,” “Parameter Sharing.” | - Stanford CS231n (lecture on convolutions). - Deep Learning, Ch. 9 (basics of CNNs). - 3Blue1Brown’s video on convolutions (optional, for visual math). | - Manually compute a small 2D convolution on paper to grasp the math, then check it in code (see the sketch after the table). - Understand how a 3x3 filter slides across an image. | Mon, 19 Jan |
| 3 | Part 2 (Wed) | CNN Implementation in PyTorch. Detailed Focus: - Building a CNN for CIFAR-10 with torchvision.transforms (data augmentation). - Adding multiple conv layers, ReLU, pooling. - Keywords/Phrases: “Feature Extractor,” “Data Augmentation,” “Generalization.” | - PyTorch CIFAR-10 tutorial (step-by-step code). - Deep Learning with PyTorch, Ch. 4 (CNN basics). | - Build a CNN with 2–3 convolutional layers and measure performance (see the sketch after the table). - Incorporate data augmentation strategies (random crop, flip, etc.). | Wed, 21 Jan |
| 3 | Self-Work | - Experiment with BatchNorm and Dropout to improve your CNN. - Compare “with BN” vs. “without BN” in terms of training stability. | - BatchNorm paper (2015) to understand the theory. - PyTorch docs for nn.BatchNorm2d usage. | - Track model accuracy and training speed differences. - Visualize training/validation curves if possible. | Thu–Sat |
| 4 | Part 1 (Monday) | Advanced Regularization Concepts. Detailed Focus: - The math behind BatchNorm (mean & variance in mini-batches). - Why Dropout helps reduce overfitting. - Keywords/Phrases: “Internal Covariate Shift,” “Normalization,” “Weight Decay.” | - Hands-On ML (Géron), Ch. 11 (training deep networks). - Deep Learning, Ch. 7.5 (regularization & capacity). | - Revisit your CNN code and identify where BN and Dropout could be placed. - Understand the typical parameterization for BN (momentum, eps, etc.). | Mon, 26 Jan |
| 4 | Part 2 (Wed) | Optimization Techniques & LR Scheduling. Detailed Focus: - Using torch.optim.lr_scheduler for StepLR, ReduceLROnPlateau, etc. - Monitoring validation loss for early stopping. - Keywords/Phrases: “Learning Rate Decay,” “Early Stopping,” “Overfit Check.” | - PyTorch docs: LR scheduling examples. - Deep Learning, Ch. 7.8 (early stopping). | - Implement a scheduler in your CNN training loop (see the sketch after the table). - Evaluate improvements in final accuracy or convergence speed. | Wed, 28 Jan |
| 4 | Self-Work | - Finalize your best CNN on CIFAR-10 with BN, Dropout, and LR scheduling. - Track experiment results systematically. | - Keep an experiment log (spreadsheet or MLflow). - Review Deep Learning with PyTorch, Ch. 6 for more tips on regularization. | - Aim for a reproducible script with well-documented hyperparameters. - Summarize key results (accuracy, training time) in a short write-up. | Thu–Sat |
| 5 | Part 1 (Monday) | Transfer Learning Fundamentals. Detailed Focus: - The concept of domain adaptation and why pretrained models help. - Feature extraction vs. fine-tuning the entire network. - Keywords/Phrases: “Domain Shift,” “Feature Extraction Layer,” “Overfitting with Small Data.” | - Official PyTorch Transfer Learning tutorial. - Deep Learning with PyTorch, Ch. 7 (intro to transfer learning). | - Understand the difference between using a pretrained backbone for feature extraction vs. unfreezing all layers. - Identify potential use cases (e.g., medical imaging). | Mon, 2 Feb |
| 5 | Part 2 (Wed) | Hands-On Pretrained Models. Detailed Focus: - Implementing and fine-tuning ResNet, VGG, or EfficientNet. - Practical tips for small datasets (data augmentation, freeze/unfreeze patterns). - Keywords/Phrases: “Pretrained Weights,” “Last Layer Customization.” | - FastAI transfer learning lessons (for hands-on examples). - Deep Learning, Ch. 9.3 (convolutional networks for feature learning). | - Fine-tune ResNet-50 on a custom dataset (e.g., flowers); see the sketch after the table. - Compare partial fine-tuning vs. full fine-tuning. | Wed, 4 Feb |
| 5 | Self-Work | - Collect or create a small dataset (e.g., 100–500 images from Kaggle or personal data). - Document results and highlight improvements from transfer learning. | - Kaggle dataset samples (Flowers, Dogs vs. Cats, etc.). - (Optional) Hands-On ML, Ch. 13 for more insight on CNN-based transfer learning. | - Summarize fine-tuning strategies: which layers to freeze, for how long, etc. - Present accuracy improvements vs. a scratch-trained model. | Thu–Sat |
| 6 | Part 1 (Monday) | NLP Basics & Tokenization. Detailed Focus: - Text as sequences of tokens; word embedding concepts (Word2Vec, GloVe). - RNN background: unrolling sequences, hidden states, gating mechanisms (LSTM/GRU). - Keywords/Phrases: “Embedding Matrix,” “Sequence Modeling,” “Vanishing Gradient.” | - Deep Learning with PyTorch, Ch. 8 (intro to NLP). - Stanford CS224n (lecture on RNNs and embeddings). - Word2Vec paper (2013) for historical context. | - Understand the difference between static embeddings (GloVe) and context-based ones (BERT). - Quick math check on how word embeddings are multiplied or added in RNN cells. | Mon, 9 Feb |
| 6 | Part 2 (Wed) | Implementing RNNs for Text Classification. Detailed Focus: - Using torch.nn.LSTM or torch.nn.GRU for simple sentiment analysis (IMDB). - Packing sequences and dealing with variable-length inputs. - Keywords/Phrases: “PackedSequence,” “Hidden State,” “Bidirectional RNN.” | - PyTorch docs: torch.nn.LSTM and torch.nn.GRU usage. - Example code from the official PyTorch NLP tutorials. | - Build an IMDB (or similar) sentiment classifier with an LSTM/GRU (see the sketch after the table). - Evaluate accuracy and the confusion matrix, and handle any vanishing/exploding gradient issues. | Wed, 11 Feb |
| 6 | Self-Work | - Explore GloVe embeddings vs. random initialization in your sentiment model. - Look at sequence padding/truncation and see how it affects performance. | - Pretrained GloVe embeddings (Stanford site). - (Optional) Hands-On ML, Ch. 16 for additional NLP insights. | - Document how pretrained embeddings change training speed or final accuracy. - Save the best model checkpoints for reference. | Thu–Sat |
| 7 | Part 1 (Monday) | Transformers – Self-Attention Theory. Detailed Focus: - The core idea of attention (query, key, value). - Reading the Attention Is All You Need (2017) abstract and main figures. - Keywords/Phrases: “Scaled Dot-Product,” “Multi-Head Attention,” “Positional Encoding.” | - Jay Alammar’s “Illustrated Transformer” blog (great for visuals). - Stanford CS224n lecture on Transformers (attention mechanisms). | - Sketch how Q, K, and V are computed and see how attention is aggregated. - Understand the role of positional encoding in sequences. | Mon, 16 Feb |
| 7 | Part 2 (Wed) | PyTorch Transformer Implementation. Detailed Focus: - Using nn.Transformer for classification or translation. - Key hyperparameters (nhead, num_encoder_layers, etc.). - Keywords/Phrases: “TransformerEncoder,” “PositionalEncoding,” “Attention Mask.” | - PyTorch docs: nn.Transformer modules. - Example code from PyTorch tutorials on building a simple Transformer. | - Build a small Transformer model for text classification (AG News or a short translation task); see the sketch after the table. - Compare training speed vs. an RNN-based approach. | Wed, 18 Feb |
| 7 | Self-Work | - Tweak hyperparameters to handle overfitting or slow convergence in Transformers. - Document memory usage and performance vs. the RNN. | - Paper highlights from “Attention Is All You Need” (Sections 3–5). - (Optional) Test on a slightly bigger dataset if GPU resources allow. | - Summarize the difference in training curves, final accuracy, and overall complexity. | Thu–Sat |
| 8 | Part 1 (Monday) | Unsupervised Learning & Dimensionality Reduction. Detailed Focus: - K-means and PCA fundamentals (eigenvalues, eigenvectors). - Why reduce dimensions? Data visualization, feature engineering. - Keywords/Phrases: “Variance Explained,” “Latent Representation,” “Clustering.” | - Hands-On ML, Ch. 8 (dimensionality reduction). - A classic K-means explanation from any ML reference (Andrew Ng’s or Géron’s). - (Optional) 3Blue1Brown’s linear algebra videos for the eigenvector intuition behind PCA. | - Implement basic PCA manually or with PyTorch to confirm your understanding of matrix decomposition (see the sketch after the table). - Explore K-means clustering on a small dataset. | Mon, 23 Feb |
| 8 | Part 2 (Wed) | Autoencoders in PyTorch. Detailed Focus: - Encoder–decoder design and reconstruction loss (MSE). - The autoencoder paper (Hinton & Salakhutdinov, 2006). - Keywords/Phrases: “Latent Space,” “Bottleneck,” “Reconstruction Loss.” | - Deep Learning with PyTorch (autoencoder examples, if available). - PyTorch tutorials on building a simple autoencoder (if any). | - Build an autoencoder for MNIST or CIFAR-10 and visualize reconstructed images (see the sketch after the table). - Experiment with different bottleneck sizes. | Wed, 25 Feb |
| 8 | Self-Work | - Visualize the latent space via t-SNE or PCA on the autoencoder’s bottleneck layer. - Observe how reconstruction quality changes with different dimension sizes. | - Deep Learning, Ch. 15.1 (brief intro to representation learning). - Possibly check Google’s PAIR (People + AI Research) for AE examples. | - Document the changes in reconstruction loss vs. latent dimension size. - Keep the final autoencoder code for your portfolio. | Thu–Sat |
| 9 | Part 1 (Monday) | Generative Models – GAN Theory. Detailed Focus: - Generator vs. discriminator roles. - Main ideas of the GAN paper (2014): the minimax game. - Keywords/Phrases: “Jensen–Shannon Divergence,” “Adversarial Training,” “Mode Collapse.” | - Deep Learning, sections on generative modeling. - Deep Learning with PyTorch, Ch. 14 (if available). - Any blog post explaining fundamental GAN math (e.g., Lil’Log). | - Understand how the generator’s objective differs from typical supervised learning. - Sketch the flow of real/fake examples in the training loop. | Mon, 2 Mar |
| 9 | Part 2 (Wed) | DCGAN Implementation. Detailed Focus: - Using a conv-based GAN (DCGAN) for image generation. - Stabilization tricks: label smoothing, batchnorm in G/D, learning rate selection. - Keywords/Phrases: “DCGAN Architecture,” “Discriminator Loss,” “Generator Loss.” | - PyTorch DCGAN tutorial. - Deep Learning, Ch. 20.4 (if available, on generative models). | - Implement a DCGAN for MNIST or CIFAR-10 (see the training-step sketch after the table). - Save generated samples across epochs and note improvements or mode collapse. | Wed, 4 Mar |
| 9 | Self-Work | - Tweak hyperparameters (learning rate, beta1 in Adam, etc.) to stabilize training. - Keep a log of discriminator vs. generator loss. | - TensorBoard or Matplotlib for plotting losses and generated images per epoch. - Research label smoothing ideas for further stability. | - Generate a short report with images showing training progression. - Possibly share the best results in a personal blog or on GitHub. | Thu–Sat |
| 10 | Part 1 (Monday) | Model Saving/Loading & TorchScript/ONNX. Detailed Focus: - Serializing PyTorch models and pitfalls with dynamic graphs. - Converting to TorchScript or ONNX for deployment. - Keywords/Phrases: “State Dict,” “Model Serialization,” “Cross-Platform Inference.” | - Official PyTorch docs: Saving and Loading Models. - TorchScript & ONNX export tutorials. | - Export your best CNN or RNN model to TorchScript/ONNX (see the sketch after the table). - Check that the model loads correctly in a separate environment. | Mon, 9 Mar |
| 10 | Part 2 (Wed) | Deployment with Flask/FastAPI/Gradio. Detailed Focus: - Creating a simple REST API or web interface for inference. - Handling concurrency and potential GPU inference. - Keywords/Phrases: “API Endpoint,” “JSON Input/Output,” “Real-Time Inference.” | - Tutorials on Flask, FastAPI, or Gradio for ML deployment. - Hands-On ML, Ch. 19 (deployment strategies). | - Implement a minimal web service that loads a PyTorch model and serves predictions (see the sketch after the table). - Document how to install and run the service. | Wed, 11 Mar |
| 10 | Self-Work | - Dockerize or containerize your service if possible. - Test inference speed locally or on a small cloud instance. | - Docker docs for Python apps. - (Optional) AWS/GCP free tier to test deployment performance. | - Write a detailed README with steps to build/run the container or script. - Try sending sample requests to confirm correct model outputs. | Thu–Sat |
| 11 | Part 1 (Monday) | Scaling ML Training – Distributed Basics. Detailed Focus: - DataParallel vs. DistributedDataParallel and the difference in approach. - Networking basics: how nodes communicate gradients. - Keywords/Phrases: “Parameter Server,” “MPI,” “All-Reduce,” “Synchronization.” | - PyTorch docs: torch.distributed overview. - Deep Learning with PyTorch, advanced chapters on distributed training (if available). - (Optional) HPC/cluster computing references for background. | - Understand the difference between single-machine multi-GPU and multi-node distributed training. - Outline the steps needed for a distributed setup (launch scripts, environment variables). | Mon, 16 Mar |
| 11 | Part 2 (Wed) | Performance Profiling & Debugging. Detailed Focus: - Using the PyTorch Profiler or TensorBoard to identify bottlenecks. - Optimizing batch size, GPU utilization, and data loading workers. - Keywords/Phrases: “Profiler Trace,” “Bottleneck,” “Throughput,” “Latency.” | - PyTorch Profiler docs and tutorials. - Deep Learning, Ch. 12.1 (large-scale deep learning). | - Attempt multi-GPU or distributed training (even on a single machine with DataParallel). - Log performance stats with the profiler (see the sketch after the table): GPU usage, step time, etc. | Wed, 18 Mar |
| 11 | Self-Work | - Fine-tune an existing model in distributed mode (if resources allow). - Tweak the number of data loader workers, batch size, etc. | - Official PyTorch distributed examples on GitHub (PyTorch Examples repo). - Use nvidia-smi or top to monitor resource usage. | - Maintain a record of throughput changes with different configurations. - Summarize best practices for stable, high-performance training. | Thu–Sat |
| 12 | Part 1 (Monday) | Capstone Planning & Research Focus. Detailed Focus: - Decide on an ambitious end-to-end project (e.g., advanced NLP, a multi-modal model, or a large-scale CNN/Transformer). - Revisit relevant research papers and the best practices learned so far. - Keywords/Phrases: “Project Scope,” “MVP,” “Evaluation Metrics.” | - Revisit the top papers for your chosen domain (CV, NLP, or generative). - Summaries from prior sessions (GANs, CNNs, RNNs, Transformers). - Deep Learning with PyTorch for references on combining multiple approaches. | - Define the dataset, model architecture, and success metrics (accuracy, BLEU, FID, etc.). - Plan out training time and resource requirements. | Mon, 23 Mar |
| 12 | Part 2 (Wed) | Capstone Execution & Finalization. Detailed Focus: - Building, training, and debugging the final model pipeline. - Setting up the inference pipeline and final evaluation. - Keywords/Phrases: “Full Pipeline,” “Evaluation Metric,” “Deployment Ready.” | - Any relevant advanced PyTorch tutorials (depending on project type). - Deep Learning sections on integrating multiple components (GAN + classifier, or CNN + Transformer). | - Implement the complete project codebase, from data loading → model → training → evaluation. - Gather metrics and final results, and track improvements. | Wed, 25 Mar |
| 12 | Self-Work | - Wrap up your capstone: finalize training, gather the best model checkpoints, and measure performance thoroughly. - Prepare documentation and (optionally) a short demo video. | - Use your experiment logs and runs from previous weeks as references. - (Optional) Publish results on GitHub or a personal website for feedback. | - Present or record a walk-through of your solution. - Outline possible future improvements or next steps. | Thu–Sat |
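
The code sketches referenced in the ACTION ITEMS above follow, in week order. Each is a minimal illustration under stated assumptions, not a prescribed solution.

Week 1, Part 1 — a minimal NumPy sketch of gradient descent on a synthetic linear-regression problem; the data, learning rate, and step count are arbitrary choices for illustration:

```python
import numpy as np

# Synthetic data for y = 3x + 2 with a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + 0.1 * rng.normal(size=100)

w, b, lr = 0.0, 0.0, 0.1
for step in range(200):
    y_hat = w * X[:, 0] + b
    err = y_hat - y
    loss = np.mean(err ** 2)              # MSE
    grad_w = 2 * np.mean(err * X[:, 0])   # dL/dw, derived by hand
    grad_b = 2 * np.mean(err)             # dL/db
    w -= lr * grad_w                      # gradient descent update
    b -= lr * grad_b

print(f"w ~ {w:.2f}, b ~ {b:.2f}, final MSE ~ {loss:.4f}")
```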
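Week 1, Part 2 — a minimal autograd sketch: a scalar polynomial built from a torch.tensor with requires_grad=True, where .backward() fills in the analytic gradient:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = 3 * x**2 + 4 * x + 1   # simple polynomial, traced as a computational graph
y.backward()               # populates x.grad with dy/dx
print(x.grad)              # analytic gradient 6x + 4 = 16 at x = 2
```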
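Week 2, Part 2 — a sketch of an nn.Module MLP and a bare-bones training loop used to compare SGD and Adam; the random tensors stand in for a real dataset such as Iris, and the hyperparameters are illustrative only:

```python
import torch
from torch import nn

class MLP(nn.Module):
    def __init__(self, in_dim=4, hidden=16, classes=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, classes))

    def forward(self, x):
        return self.net(x)

# Toy stand-in for a real dataset (e.g., Iris): random features and labels.
X = torch.randn(120, 4)
y = torch.randint(0, 3, (120,))

def train(optimizer_cls, **opt_kwargs):
    model = MLP()
    opt = optimizer_cls(model.parameters(), **opt_kwargs)
    loss_fn = nn.CrossEntropyLoss()
    for epoch in range(50):       # full-batch updates to keep the loop minimal
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return loss.item()

print("SGD :", train(torch.optim.SGD, lr=0.1))
print("Adam:", train(torch.optim.Adam, lr=1e-3))
```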
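Week 3, Part 1 — a sketch that checks a hand-rolled 2D convolution (really cross-correlation, as PyTorch computes it) against F.conv2d on a 5x5 input with a 3x3 kernel:

```python
import torch
import torch.nn.functional as F

img = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)  # 1x1x5x5 "image"
kernel = torch.tensor([[[[0., 1., 0.],
                         [1., -4., 1.],
                         [0., 1., 0.]]]])                        # one 3x3 filter

# Manual sliding window: no padding, stride 1 -> 3x3 output.
out = torch.zeros(3, 3)
for i in range(3):
    for j in range(3):
        out[i, j] = (img[0, 0, i:i + 3, j:j + 3] * kernel[0, 0]).sum()

print(torch.allclose(out, F.conv2d(img, kernel)[0, 0]))  # True if the math matches
```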
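Week 3, Part 2 — a sketch of a small CNN plus a typical CIFAR-10 augmentation pipeline; the layer sizes and normalization statistics are common choices rather than requirements, and the random batch only verifies output shapes:

```python
import torch
from torch import nn
from torchvision import transforms

# Typical CIFAR-10 training transforms (stats are the commonly quoted CIFAR-10 values).
train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])

class SmallCNN(nn.Module):
    def __init__(self, classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 32x32 -> 16x16
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(64 * 8 * 8, classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Shape check only; plug torchvision.datasets.CIFAR10(transform=train_tf) into a DataLoader for real training.
print(SmallCNN()(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 10])
```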
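Week 4, Part 2 — a sketch of ReduceLROnPlateau plus simple early stopping wrapped around a placeholder training loop; the stand-in model and fake validation loss only show where the calls go:

```python
import torch

model = torch.nn.Linear(10, 2)                     # stand-in for your Week 3/4 CNN
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5, patience=2)

best_val, patience, bad_epochs = float("inf"), 5, 0
for epoch in range(100):
    # train_one_epoch(model, optimizer)            # your real training step goes here
    val_loss = 1.0 / (epoch + 1)                   # placeholder validation loss
    scheduler.step(val_loss)                       # ReduceLROnPlateau needs the monitored metric
    if val_loss < best_val - 1e-4:
        best_val, bad_epochs = val_loss, 0
        torch.save(model.state_dict(), "best.pt")  # keep the best checkpoint
    else:
        bad_epochs += 1
        if bad_epochs >= patience:                 # simple early stopping
            break
```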
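Week 5, Part 2 — a transfer-learning sketch that loads a pretrained torchvision ResNet-50, freezes the backbone, and replaces the final fully connected layer; the class count and learning rates are placeholders:

```python
import torch
from torch import nn
from torchvision import models

# Recent torchvision versions use the weights= API; older ones use pretrained=True.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

for p in model.parameters():           # freeze the backbone (feature-extraction mode)
    p.requires_grad = False

num_classes = 5                        # e.g., a small flowers dataset
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new head trains from scratch

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# For full fine-tuning, unfreeze everything and use a smaller learning rate:
# for p in model.parameters():
#     p.requires_grad = True
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```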
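Week 6, Part 2 — a sketch of an LSTM text classifier that packs variable-length sequences; the vocabulary size, dimensions, and fake padded batch are illustrative:

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size=10000, emb=100, hidden=128, classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb, padding_idx=0)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, classes)

    def forward(self, tokens, lengths):
        emb = self.embedding(tokens)                          # (B, T, emb)
        packed = pack_padded_sequence(emb, lengths.cpu(),
                                      batch_first=True, enforce_sorted=False)
        _, (h_n, _) = self.lstm(packed)                       # h_n: (1, B, hidden)
        return self.fc(h_n[-1])                               # logits per sequence

# Fake padded batch: three sequences of true lengths 5, 3, and 2 (0 is the pad id).
tokens = torch.randint(1, 10000, (3, 5))
tokens[1, 3:] = 0
tokens[2, 2:] = 0
lengths = torch.tensor([5, 3, 2])
print(LSTMClassifier()(tokens, lengths).shape)  # torch.Size([3, 2])
```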
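Week 7, Part 2 — a sketch of a small nn.TransformerEncoder classifier with a learned positional embedding and a padding mask (batch_first=True assumes a reasonably recent PyTorch); the fake token batch only checks shapes:

```python
import torch
from torch import nn

class TextTransformer(nn.Module):
    def __init__(self, vocab_size=20000, d_model=128, nhead=4, layers=2,
                 classes=4, max_len=512):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, d_model, padding_idx=0)
        self.pos = nn.Embedding(max_len, d_model)     # simple learned positional encoding
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead,
                                               dim_feedforward=256, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.fc = nn.Linear(d_model, classes)

    def forward(self, tokens):
        pad_mask = tokens.eq(0)                       # True at padding -> ignored by attention
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.embedding(tokens) + self.pos(positions)
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        keep = (~pad_mask).unsqueeze(-1).float()      # mean-pool over real tokens only
        x = (x * keep).sum(1) / keep.sum(1).clamp(min=1)
        return self.fc(x)

tokens = torch.randint(1, 20000, (8, 32))             # fake AG News-style batch of token ids
print(TextTransformer()(tokens).shape)                 # torch.Size([8, 4])
```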
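Week 8, Part 1 — a PCA sketch via SVD in PyTorch: center the data, decompose, project onto the top components, and read off the variance explained; the random correlated data is a placeholder:

```python
import torch

X = torch.randn(200, 5) @ torch.randn(5, 5)        # random correlated data (placeholder)
Xc = X - X.mean(dim=0)                              # center each feature
U, S, Vt = torch.linalg.svd(Xc, full_matrices=False)

k = 2
components = Vt[:k]                                 # top-k principal directions, shape (k, 5)
projected = Xc @ components.T                       # low-dimensional representation, shape (200, k)
explained = (S**2 / (S**2).sum())[:k]               # fraction of variance explained per component
print(projected.shape, explained)
```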
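Week 8, Part 2 — a sketch of a fully connected MNIST autoencoder with an adjustable bottleneck and MSE reconstruction loss; the random batch stands in for real MNIST images:

```python
import torch
from torch import nn

class Autoencoder(nn.Module):
    def __init__(self, bottleneck=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(),
            nn.Linear(256, bottleneck))                       # latent code
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck, 256), nn.ReLU(),
            nn.Linear(256, 28 * 28), nn.Sigmoid())            # pixels back in [0, 1]

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z).view(-1, 1, 28, 28), z

model = Autoencoder(bottleneck=32)                            # vary bottleneck to compare reconstructions
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(16, 1, 28, 28)                                 # stand-in for an MNIST batch
recon, z = model(x)
loss = nn.functional.mse_loss(recon, x)                       # reconstruction loss
opt.zero_grad()
loss.backward()
opt.step()
print(z.shape, loss.item())
```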
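Week 9, Part 2 — a sketch of one GAN training iteration (discriminator step, then generator step) with the non-saturating BCE losses; the tiny linear G and D are stand-ins for the DCGAN convolutional architectures:

```python
import torch
from torch import nn

latent_dim = 64
# Stand-ins for DCGAN nets (a real DCGAN uses ConvTranspose2d / Conv2d stacks).
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 28 * 28), nn.Tanh())
D = nn.Sequential(nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))  # beta1=0.5 as in the DCGAN paper
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCEWithLogitsLoss()

real = torch.rand(32, 28 * 28) * 2 - 1          # stand-in for a batch of real images in [-1, 1]
ones, zeros = torch.ones(32, 1), torch.zeros(32, 1)

# Discriminator step: push real -> 1, fake -> 0.
fake = G(torch.randn(32, latent_dim)).detach()  # detach so G is not updated here
loss_d = bce(D(real), ones) + bce(D(fake), zeros)
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: try to make D label fakes as real.
fake = G(torch.randn(32, latent_dim))
loss_g = bce(D(fake), ones)                     # non-saturating generator loss
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
print(loss_d.item(), loss_g.item())
```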
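Week 10, Part 1 — a sketch of three export paths: a state_dict checkpoint, TorchScript via tracing, and ONNX export; the model and file names are placeholders:

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))  # stand-in for your trained model

# 1) Plain PyTorch checkpoint: save/load the state_dict rather than the pickled module.
torch.save(model.state_dict(), "model.pt")
model.load_state_dict(torch.load("model.pt"))
model.eval()

example = torch.randn(1, 10)

# 2) TorchScript via tracing (torch.jit.script is the alternative for models with control flow).
scripted = torch.jit.trace(model, example)
scripted.save("model_ts.pt")

# 3) ONNX export; the onnx package is useful afterwards for checking the exported file.
torch.onnx.export(model, example, "model.onnx",
                  input_names=["input"], output_names=["logits"],
                  dynamic_axes={"input": {0: "batch"}})
```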
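Week 10, Part 2 — a minimal FastAPI inference sketch (assumes fastapi and uvicorn are installed); the model, checkpoint name, and input schema are placeholders and should match whatever you exported in Part 1:

```python
# Run with: uvicorn serve:app --reload   (assuming this file is saved as serve.py)
from typing import List

import torch
from torch import nn
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))  # must match the saved architecture
model.load_state_dict(torch.load("model.pt", map_location="cpu"))
model.eval()

class Features(BaseModel):
    values: List[float]          # expects 10 floats, matching the model's input size

@app.post("/predict")
def predict(payload: Features):
    x = torch.tensor(payload.values, dtype=torch.float32).unsqueeze(0)
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)[0]
    return {"probabilities": probs.tolist()}
```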
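Week 11, Part 2 — a sketch of the PyTorch Profiler wrapped around a few forward/backward passes, printing the top operators by CPU time; the model and batch are stand-ins for your real training step:

```python
import torch
from torch import nn
from torch.profiler import profile, ProfilerActivity

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
x = torch.randn(64, 512)

activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)
    model, x = model.cuda(), x.cuda()

with profile(activities=activities, record_shapes=True) as prof:
    for _ in range(10):                 # a handful of forward/backward passes to profile
        loss = model(x).sum()
        loss.backward()

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
# prof.export_chrome_trace("trace.json")  # optional: visual timeline in a Chrome-style trace viewer
```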