fischcheng/datacamp_notebooks

DataCamp notes

This repo collects the notes I've taken while working through DataCamp courses. I'll also jot down some random project ideas.

Completed courses:

  • Parallel Computing with Dask
  • Supervised Learning with Scikit-learn
  • Unsupervised Learning in Python
  • Machine Learning with the Experts: School Budgets
  • Introduction to PySpark
  • Deep Learning in Python
  • Introduction to Time Series Analysis in Python

Scikit-learn supervised learning (Classification, Regression) basic workflow

  • Preprocess the data: fill, drop, or impute missing values
  • Standardize, normalize, or scale the features
  • Split the dataset with train_test_split (test_size, random_state)
  • Cross-validation (CV)
  • Hyperparameter tuning (GridSearchCV, RandomizedSearchCV)
  • model.fit(X_train, y_train)
  • model.predict(X_test)
  • Evaluate performance (R², F1, model.score, etc.)
  • Put everything together using Pipeline.
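
The steps above can be sketched end to end roughly as follows. This is a minimal illustration, not the course's own code: the dataset (scikit-learn's built-in breast cancer data), the artificially inserted missing values, the choice of LogisticRegression, and the small grid over C are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Example data (assumption): a built-in binary classification dataset.
X, y = load_breast_cancer(return_X_y=True)

# Blank out roughly 2% of the values so the imputation step has work to do.
rng = np.random.default_rng(0)
X = X.copy()
X[rng.random(X.shape) < 0.02] = np.nan

# Hold out a test set; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Pipeline: impute -> scale -> classify. Every step is fit on training
# folds only, so no information leaks from the validation data.
pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hyperparameter tuning with 5-fold cross-validation.
# "clf__C" addresses parameter C of the step named "clf".
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# Predict with the best pipeline (refit on the full training set) and evaluate.
y_pred = search.predict(X_test)
test_f1 = f1_score(y_test, y_pred)
test_accuracy = search.best_estimator_.score(X_test, y_test)

print("best params:", search.best_params_)
print("test F1:", test_f1)
print("test accuracy:", test_accuracy)
```

For a regression problem the same structure should carry over by swapping in a regressor (for example Ridge) and a regression metric such as R²; RandomizedSearchCV can replace GridSearchCV when the grid gets large.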
