Learn how to use Spark with Python, including Spark Streaming, machine learning, Spark 2.0 DataFrames, and more!
- Databricks Setup
- Local VirtualBox with Ubuntu Setup
- AWS EC2 PySpark Setup
- AWS EMR Cluster Setup
- Spark DataFrame Basics (GroupBy, Aggregate, Missing Data, Dates and Timestamps)
- DataFrame Project Exercise
- Linear Regression
- Linear Regression Consulting Project
- Logistic Regression
- Logistic Regression Consulting Project
- Decision Trees and Random Forests
- Random Forest Classification Consulting Project
- K-means Clustering
- Clustering Consulting Project
- Collaborative Filtering for Recommender Systems
- Recommender System Project
- NLP Tools
- Natural Language Processing Project
- Spark Streaming Documentation
- Spark Streaming Twitter Project