Zero to mastery in data science.
- Score top 20% in Kaggle competitions
- Expert with different data types (text, image, audio, video)
- Expert with different techniques (regression, SVM, deep learning, genetic algorithms, etc.)
- Familiar with modern tooling (Python, pandas, scikit-learn, R, TensorFlow, Apache Spark, etc.)
- Expert with various problems (classification, search, clustering, prediction, recommendation, etc)
- Strong fundamentals (able to read and implement technical papers)
- Able to build pipelines and architectures at scale
- Module 0 - High School Math
- Module 1 - College Math I (Calculus)
- Module 2 - College Math II (Linear Algebra)
- Module 3 - College Math III (Discrete Math)
- Module 4 - College Math IV (Probability and Statistics)
- Module 5 - Computation and Algorithms
- Module 6 - Artificial Intelligence and Machine Learning
- Module 7 - Deep Learning
- Module 8 - Data Mining and Recommenders
- Module 9 - NLP and Computer Vision
- Module 10 - Cloud Computing Architectures / Data Center Engineering
Looking ahead is encouraged, as long as you generally finish earlier modules before later ones.
Not everyone was lucky enough to have a good start with math growing up. The goal is to level the playing field: by the end of module 0 you should feel like you went to a high school with world-class teachers and finished top of your math class.
Required Reading
- 📖 The Joy of X
- Khan - Differential Calculus - 8%
- Khan - Integral Calculus - 7%
- Khan - AP Calculus AB - 2%
- Khan - AP Calculus BC - 2%
- MIT - Single Variable Calculus
Supplementary Material
- Calculus, Better Explained
- Essence of Calculus
- Engineering Math
- Intro to Calculus with Derivatives
- Coursera - Introduction to Complex Analysis
- Coursera - Mathematics for Machine Learning: Multivariate Calculus
- Prof. Leonard - Calculus I
- MIT - Differential Equations
- Khan - Differential Equations
- Khan - Linear Algebra
- Coursera - Mathematics for Machine Learning: Linear Algebra
- Coursera - Mathematics for Machine Learning: Principal Component Analysis
- Fast AI - Computational Linear Algebra
- MIT - Linear Algebra
Required Reading
Supplementary Material
- Matrix Calculus for Deep Learning
- Graphical linear algebra
- Essence of Linear Algebra
- Brown University - Coding the Matrix
- Udacity - Linear Algebra Refresher Course
Proofs, set theory, propositional logic, induction, invariants, state machines
- Coursera - What is a Proof?
- MIT - Mathematics for Computer Science (2015): Unit 1
- MIT - Mathematics for Computer Science (2010): Weeks 1,2,3
- 📖 How to Prove It
- 📖 Book of Proof
Number theory is fundamental to reasoning about numbers as discrete mathematical structures, with applications in cryptography and efficient numerical computation.
By the end of this sub-module you should be very confident proving and reasoning about concepts including: divisibility, Bézout's identity, modular arithmetic, Euler's totient theorem, Fermat's little theorem, integer factorization, Diophantine equations, the fundamental theorem of arithmetic, the Chinese remainder theorem, RSA and the discrete logarithm problem.
- Coursera - Number Theory and Cryptography
- MIT - Mathematics for Computer Science (2010) - Number Theory I and II
- MIT - Mathematics for Computer Science (2015) - GCDs, Congruences, Euler's Theorem, and RSA
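To make a few of these concepts concrete, here is a minimal Python sketch that ties together the extended Euclidean algorithm (Bézout's identity), modular inverses, Euler's totient and a toy RSA round trip. The function names, primes and exponent are illustrative assumptions rather than anything taken from the courses above, and the key is far too small to be secure.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that g = gcd(a, b) and a*x + b*y == g."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    # Back-substitute the Bezout coefficients from the recursive call.
    return g, y, x - (a // b) * y


def mod_inverse(a, m):
    """Return the inverse of a modulo m; only exists when gcd(a, m) == 1."""
    g, x, _ = extended_gcd(a, m)
    if g != 1:
        raise ValueError("a and m must be coprime")
    return x % m


# Toy RSA key (illustrative values only).
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)      # Euler's totient of n for distinct primes p, q
e = 17                       # public exponent, chosen coprime to phi
d = mod_inverse(e, phi)      # private exponent: e * d == 1 (mod phi)

message = 65
ciphertext = pow(message, e, n)    # encrypt with modular exponentiation
recovered = pow(ciphertext, d, n)  # decrypt; Euler's theorem guarantees this
```

Decryption works because `e * d` is congruent to 1 modulo `phi`, so raising to the power `e * d` is the identity on values coprime to `n`.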
Problem Sets
- MIT - Mathematics for Computer Science (2010): Recitation 4
- MIT - Mathematics for Computer Science (2010): Recitation 5
- MIT - Mathematics for Computer Science (2010): Assignment 3
Worked solutions to problem sets here
Optional Supplementary Material
- Coursera - Classical Cryptosystems and Core Concepts
- Coursera - Mathematical Foundations for Cryptography
Combinatorics is a vital skill in reasoning about the size of finite sets.
- Coursera - Combinatorics and Probability
- Coursera - Introduction to Enumerative Combinatorics
- MIT - Mathematics for Computer Science (2010) - Counting Rules I and II
- MIT - Mathematics for Computer Science (2015) - Counting
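As a small illustration of counting the size of a finite set, the sketch below uses inclusion-exclusion to count the integers from 1 to 100 divisible by 2, 3 or 5, then checks the answer by brute force. The function name and the chosen numbers are illustrative assumptions, and the shortcut of multiplying divisors only works because they are pairwise coprime.

```python
from itertools import combinations
from math import prod


def count_divisible_by_any(limit, divisors):
    """Count integers in 1..limit divisible by at least one divisor.

    Assumes the divisors are pairwise coprime, so the least common
    multiple of any subset is simply its product.
    """
    total = 0
    for size in range(1, len(divisors) + 1):
        # Inclusion-exclusion: add odd-sized overlaps, subtract even-sized ones.
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(divisors, size):
            total += sign * (limit // prod(subset))
    return total


by_formula = count_divisible_by_any(100, [2, 3, 5])
by_brute_force = sum(
    1 for k in range(1, 101) if k % 2 == 0 or k % 3 == 0 or k % 5 == 0
)
print(by_formula, by_brute_force)  # both are 74
```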
Problem Sets
- MIT - Mathematics for Computer Science (2010): Recitation 15
- MIT - Mathematics for Computer Science (2010): Recitation 16
- MIT - Mathematics for Computer Science (2010): Assignment 9
- Coursera - Introduction to Graph Theory
- Coursera - Solving the Delivery Problem
- 📖 Introduction to Graph Theory
- Sarada Herke - Graph Theory Course
- http://courses.csail.mit.edu/6.889/fall11/lectures/
todo
todo
- Visual Group Theory
- https://www.youtube.com/playlist?list=PLZzHxk_TPOStgPtqRZ6KzmkUQBQ8TSWVX
- 📖 Concrete Mathematics
- Discrete Stochastic Processes
- Khan - AP Statistics
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-041sc-probabilistic-systems-analysis-and-applied-probability-fall-2013/unit-i/
- Coursera/Duke Uni - Introduction to Probability and Data
- Coursera/Duke Uni - Inferential Statistics
- Coursera/Duke Uni - Linear Regression and Modeling
- Coursera/Duke Uni - Bayesian Statistics
- Coursera/Duke Uni - Statistics with R Capstone
- EdX/Uni Texas - Foundations of Data Analysis - Part 1: Statistics Using R
- EdX/Uni Texas - Foundations of Data Analysis - Part 2: Inferential Statistics
- EdX/MIT - Introduction to Probability - The Science of Uncertainty
- EdX/MIT - Introduction to Probability Part II - Inferences and Processes
- MIT Probabilistic Systems
- MIT Intro to Probability (2018)
- Coursera - Divide and Conquer, Sorting and Searching, and Randomized Algorithms
- Coursera - Graph Search, Shortest Path, and Data Structures
- Coursera - Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming
- Coursera - Shortest Paths Revisited, NP-Complete Problems
Resources
- 📖 Grokking Algorithms
- 📖 Algorithms to Live By
- 📖 Introduction to Algorithms (CLRS)
- 📖 The Algorithm Design Manual
- 📖 Algorithms (Dasgupta)
- 📖 Algorithm Design (Tardos and Kleinberg)
- 📖 Algorithms (Sedgewick)
- Khan Algorithms
- MIT 6.006 - Introduction to Algorithms
- Intro to Algorithms
- Algorithmic Thinking I
- Algorithmic Thinking II
- https://www.youtube.com/watch?v=T_WffoMAaMA
- https://www.coursera.org/specializations/data-structures-algorithms
- https://www.youtube.com/user/mycodeschool
- http://www3.cs.stonybrook.edu/~algorith/
- https://www.youtube.com/watch?v=ufj5_bppBsA&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&index=7
- https://www.youtube.com/user/mikeysambol/playlists
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/
- Programming Conversations
- Efficient Programming with Components
- Four Algorithmic Journeys
- Computer Science: Algorithms, Theory, and Machines
- Data Structures & Algorithms Specialization
- Approximation Algorithms I & II
- http://jeffe.cs.illinois.edu/teaching/algorithms/?#book
- https://www.edx.org/course/introduction-computer-science-mitx-6-00-1x-10
- https://www.edx.org/course/introduction-computational-thinking-data-mitx-6-00-2x-5
- Coursera - Data Systems Specialization
- Coursera - Data Visualization Specialization
- Coursera - Computer Architecture
- MIT Computer System Engineering
- MIT Information and Entropy
- Coursera - Computer Science Algs, Theory, Machines
Supplementary
- https://www.coursera.org/learn/introduction-mongodb
- https://university.mongodb.com/
- https://www.khanacademy.org/computing/computer-science/informationtheory
- https://www.youtube.com/playlist?list=PLSE8ODhjZXjbisIGOepfnlbfxeH7TW-8O
- https://www.brianstorti.com/replication/
https://www.coursera.org/specializations/aml
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-868j-the-society-of-mind-fall-2011/video-lectures/
- https://www.youtube.com/watch?feature=player_embedded&v=J6PBD-wNEDs
- http://ai.berkeley.edu/lecture_videos.html
- https://www.udacity.com/course/artificial-intelligence-for-robotics--cs373
- http://aiplaybook.a16z.com/
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-034-artificial-intelligence-fall-2010/lecture-videos/
- http://rll.berkeley.edu/deeprlcourse/
Machine Learning Specialization by University of Washington on Coursera
- Machine Learning Foundations: A Case Study Approach
- Machine Learning: Regression
- Machine Learning: Classification
- Machine Learning: Clustering & Retrieval
- https://www.analyticsvidhya.com/blog/2015/07/top-youtube-videos-machine-learning-neural-network-deep-learning/
- Statistical Machine Learning 10-702/36-702
- https://www.udacity.com/ai
- https://www.udacity.com/drive
- https://www.udacity.com/course/machine-learning-engineer-nanodegree--nd009
- https://www.edx.org/xseries/data-science-engineering-apacher-sparktm
- https://www.coursera.org/specializations/data-mining
- https://www.coursera.org/specializations/machine-learning
- http://web.stanford.edu/class/cs20si/syllabus.html
- https://work.caltech.edu/telecourse.html
- https://www.youtube.com/watch?v=bxe2T-V8XRs
- https://www.youtube.com/watch?v=UVwwYZMFocg&list=PLiaHhY2iBX9ihLasvE8BKnS2Xg8AhY6iV&index=8
- https://www.coursera.org/specializations/gcp-data-machine-learning
- Neural Networks and Deep Learning
- Improving Deep Neural Networks: Hyperparameter Tuning, Regularization, and Optimization
- Structuring Machine Learning Projects
- Convolutional Neural Networks
- Sequence Models
Goals:
- different activation functions (sigmoid/tanh/relu)
- different cost functions
- with and without bias units
- classification and regression problems
- text / binary / image / recommenders
- batch vs stochastic
- JS, Python, PHP, MATLAB, TensorFlow, scikit-learn
- create visualizations and blog explanations
- Audit best courses / books
- http://explained.ai/matrix-calculus/index.html
- Practical Deep Learning For Coders
- https://classroom.udacity.com/courses/ud730
- http://neuralnetworksanddeeplearning.com/
- http://course.fast.ai/
- http://www.deeplearningbook.org/
- http://cs231n.github.io/ + https://www.youtube.com/playlist?list=PLlJy-eBtNFt6EuMxFYRiNRS07MCWN5UIA
- https://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH
- http://rll.berkeley.edu/deeprlcourse/
- http://rll.berkeley.edu/deeprlcourse/#lecture-videos
- http://introtodeeplearning.com/index.html
- https://www.youtube.com/watch?v=21EiKfQYZXc&app=desktop
- https://courses.csail.mit.edu/6.042/spring17/mcs.pdf
- http://yerevann.com/a-guide-to-deep-learning/
- https://www.coursera.org/learn/neural-networks
- https://www.youtube.com/playlist?list=PLE6Wd9FR--EfW8dtjAuPoTuPcqmOV53Fu
- https://cloud.google.com/blog/big-data/2017/01/learn-tensorflow-and-deep-learning-without-a-phd
- https://www.udacity.com/course/deep-learning--ud730
- http://nbviewer.jupyter.org/github/domluna/labs/blob/master/Build%20Your%20Own%20TensorFlow.ipynb
- https://goc.vivint.com/problems/mlc
- http://blog.floydhub.com/coding-the-history-of-deep-learning/
- https://stats385.github.io/
- https://p.migdal.pl/interactive-machine-learning-list/
- https://scrimba.com/g/gneuralnetworks
- https://www.coursera.org/specializations/recommender-systems
- https://nlp.stanford.edu/IR-book/information-retrieval-book.html
- https://www.coursera.org/specializations/gcp-data-machine-learning
- https://github.com/oxford-cs-deepnlp-2017/lectures
- https://www.youtube.com/watch?v=OQQ-W_63UgQ&list=PL3FW7Lu3i5Jsnh1rnUwq_TcylNr7EkRe6
- https://www.coursera.org/learn/digital/home/welcome
- http://cs231n.stanford.edu/syllabus.html
- https://www.udacity.com/course/interactive-3d-graphics--cs291
- https://www.youtube.com/watch?v=01YSK5gIEYQ&list=PL_w_qWAQZtAZhtzPI5pkAtcUVgmzdAP8g
- https://www.coursera.org/specializations/data-warehousing
- https://www.coursera.org/specializations/big-data-engineering
- https://www.coursera.org/specializations/gcp-architecture
- https://www.coursera.org/specializations/gcp-data-machine-learning
- https://www.coursera.org/specializations/cloud-computing
- https://www.coursera.org/specializations/data-science
- https://www.coursera.org/specializations/big-data
- https://www.coursera.org/specializations/scala
- https://www.coursera.org/learn/hadoop
- http://cagd.cs.byu.edu/~557/text/ch1.pdf
- https://www.coursera.org/learn/data-driven-astronomy
- https://www.coursera.org/specializations/genomic-data-science
- https://www.coursera.org/learn/data-genes-medicine
- https://www.coursera.org/specializations/systems-biology
- https://www.coursera.org/specializations/networking-basics
- https://www.coursera.org/learn/neurohacking
- https://www.youtube.com/playlist?list=PLUl4u3cNGP62K2DjQLRxDNRi0z2IRWnNh
- https://www.youtube.com/playlist?list=PLoROMvodv4rMWw6rRoeSpkiseTHzWj6vu&disable_polymer=true
- https://online-learning.harvard.edu/series/professional-certificate-data-science
- Computational geometry https://www.youtube.com/watch?v=rho8QqiHOe4
- Kaggle school https://www.kaggle.com/learn/overview
- MIT self driving https://selfdrivingcars.mit.edu/
- MIT AGI https://agi.mit.edu/
- https://github.com/lexfridman/mit-deep-learning/blob/master/README.md#mit-deep-learning
- https://www.jgoertler.com/visual-exploration-gaussian-processes/
- http://webdam.inria.fr/Alice/ [databases]
- The Art of Unix Programming
- The C Programming Language
- Gödel, Escher, Bach: An Eternal Golden Braid
- Deep Learning (Goodfellow, Bengio, Courville)
- Grokking Deep Learning
- Grokking Deep Reinforcement Learning
- Compilers: Principles, Techniques, and Tools (Dragon book)
- Code
- The Elements of Statistical Learning
- Structure and Interpretation of Computer Programs
- Hacker's Delight
- Concrete Mathematics
- The Art of Computer Programming
- Artificial Intelligence: A Modern Approach
- https://blog.ycombinator.com/learning-math-for-machine-learning/