luckamolkova/data_science

Data Science Toolbox

This repo contains code snippets, each with a little theory and explanation, that can be handy for beginning data scientists. I created it while attending the Galvanize Data Science Immersive program in Seattle. The code is in Python, in IPython notebooks, because they are easy to view on GitHub.

Pull requests with updates are welcome!

Theory

Machine Learning

  • ML algorithms - linear regression, logistic regression, decision trees, random forests, gradient boosting, PCA, SVD, NMF and more. ml_algorithms.ipynb
  • Recommenders - several GraphLab recommender models. recommenders.ipynb
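
As a taste of the simplest algorithm in the list above, here is a minimal sketch of one-variable linear regression fitted by ordinary least squares. It is not taken from ml_algorithms.ipynb; the function name `fit_line` and the sample data are made up for illustration, and it uses only the standard library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Hypothetical helper for illustration; assumes xs has at least
    two distinct values so the denominator is non-zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Made-up points that lie exactly on y = 2x + 1.
    slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
    print(slope, intercept)
```

For real work a library such as scikit-learn is the more practical choice; the closed form here is only meant to show what the fit computes.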

NLP

  • Doc2Vec - document similarity search using gensim. doc2vec.ipynb
  • Text summarization - summarization and keyword extraction using gensim. text_summarization.ipynb
  • NearPy - locality-sensitive hashing (LSH) for approximate nearest neighbor search. nearpy.ipynb
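
To give a rough idea of what locality-sensitive hashing does, the sketch below hashes vectors with random hyperplanes so that vectors pointing in similar directions tend to land in the same bucket. This is not NearPy's API and is not taken from nearpy.ipynb; all function names (`make_hyperplanes`, `hash_vector`, `build_index`, `query`) are hypothetical, and only the standard library is used.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (as normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_vector(vec, hyperplanes):
    """Return a bit string: one bit per hyperplane, set by the side vec falls on."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


def build_index(vectors, hyperplanes):
    """Group vector keys into buckets keyed by their hash."""
    index = {}
    for key, vec in vectors.items():
        index.setdefault(hash_vector(vec, hyperplanes), []).append(key)
    return index


def query(vec, index, hyperplanes):
    """Return candidate neighbors: the keys stored in vec's bucket."""
    return index.get(hash_vector(vec, hyperplanes), [])


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=8, dim=3, seed=42)
    # Made-up vectors: "b" points in the opposite direction to "a".
    data = {"a": [1.0, 2.0, -0.5], "b": [-1.0, -2.0, 0.5]}
    index = build_index(data, planes)
    print(query([2.0, 4.0, -1.0], index, planes))
```

The lookup is approximate: only vectors in the same bucket are returned as candidates, so a true nearest neighbor can be missed. More bits give smaller, purer buckets at the cost of more misses, which is why practical setups usually combine several hash tables.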

Development
