This repo lists packages and repos that I have used and found useful over the years, as well as those that I want to try.
Scikit-Learn : An all-round machine learning library, from data preparation to model selection
TensorFlow : Machine learning/deep learning framework, awesome to use in production as well (notably on GCP)
Yellowbrick : Machine Learning visualisation
skope-rules : machine learning with logical rules in Python
MLflow : Open source platform for the machine learning lifecycle
ktrain : fast and easy deep learning wrapper for TensorFlow/Keras (text, image)
scikit-lego : Missing building blocks for sklearn pipelines; the fairness classifier is notably one to try
interpret : Fit interpretable models. Explain blackbox machine learning
spacy : Industrial-Strength Natural Language Processing
nltk : NLTK is a leading platform for building Python programs to work with human language data.
gensim : topic modelling
doccano : Open source text annotation tool for machine learning practitioners. https://doccano.herokuapp.com
tsfresh : Automatic time series feature generation
statsmodels : Statistics in Python
geopandas : GeoDataFrames (awesome spatial joins)
osmnx : OSMnx: Python for street networks. Retrieve, model, analyze, and visualize street networks and other spatial data from OpenStreetMap. https://geoffboeing.com/publications/…
pandas : DataFrames
missingno : missing data visualisation for pandas
pandas_profiling : Create HTML profiling reports from pandas DataFrame objects
pyspark : Distributed DataFrames
ray : A fast and simple framework for building and running distributed applications.
modin : distributed pandas
dask : Dask provides advanced parallelism for analytics
papermill : Papermill is a tool for parameterizing and executing Jupyter Notebooks.
jupytext : Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
tdda : Test Driven Data Analysis
kedro : A Python library that implements software engineering best-practice for data and ML pipelines.
scrapy : Scrapy, a fast high-level web crawling & scraping framework for Python. https://scrapy.org
dash : Easy dashboarding
streamlit : Streamlit — The fastest way to build custom ML tools https://streamlit.io
fastapi : fast (high-performance) web framework for building APIs with Python 3.6+
darts : easy manipulation and forecasting of time series.
eo-learn : satellite imagery data with preprocessing tools
PyCaret : Low-code machine learning (cf. R's caret)
BentoML : Easy ML Model serving
Manifold : A model-agnostic visual debugging tool for machine learning http://manifold.mlvis.io/
behave : behave is behaviour-driven development, Python style.
vaex : Out-of-Core DataFrames for Python, visualize and explore big tabular data at a billion rows per second. https://vaex.io
movingpandas : Implementation of Trajectory classes and functions built on top of GeoPandas
parallax : Tool for interactive embeddings visualization
faker : Faker is a Python package that generates fake data for you
cortex : Deploy ML models in production on AWS
TileDB : A data science oriented database