Crazy Awesome Python

A selection of 56 curated ML Python libraries and frameworks, ordered by stars.

Check out the interactive version, which you can filter and sort: https://www.awesomepython.org/

scikit-learn: machine learning in Python
https://scikit-learn.org
https://github.com/scikit-learn/scikit-learn
81 stars per week over 596 weeks
48,636 stars, 22,533 forks, 2,211 watches
created 2010-08-17, last commit 2022-01-22, main language Python
data-analysis, data-science, machine-learning, python, statistics

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.ai/
https://github.com/dmlc/xgboost
53 stars per week over 415 weeks
22,128 stars, 8,287 forks, 941 watches
created 2014-02-06, last commit 2022-01-22, main language C++
distributed-systems, gbdt, gbm, gbrt, machine-learning, xgboost

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
https://github.com/TencentARC/GFPGAN
383 stars per week over 44 weeks
16,991 stars, 2,518 forks, 296 watches
created 2021-03-19, last commit 2022-01-08, main language Python
deep-learning, face-restoration, gan, gfpgan, image-restoration, pytorch, super-resolution

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
https://github.com/google/jax
94 stars per week over 169 weeks
16,040 stars, 1,469 forks, 252 watches
created 2018-10-25, last commit 2022-01-21, main language Python
jax

LaTeX code for making neural network diagrams
https://github.com/HarisIqbal88/PlotNeuralNet
86 stars per week over 182 weeks
15,859 stars, 2,192 forks, 206 watches
created 2018-07-24, last commit 2020-11-06, main language TeX
deep-neural-networks, latex

Cross-platform, customizable ML solutions for live and streaming media.
https://mediapipe.dev
https://github.com/google/mediapipe
116 stars per week over 136 weeks
15,848 stars, 3,256 forks, 462 watches
created 2019-06-13, last commit 2021-12-13, main language C++
android, audio-processing, c-plus-plus, calculator, computer-vision, deep-learning, framework, graph-based, graph-framework, inference, machine-learning, mediapipe, mobile-development, perception, pipeline-framework, stream-processing, video-processing

Cloud-native neural search framework for 𝙖𝙣𝙮 kind of data
https://docs.jina.ai
https://github.com/jina-ai/jina
130 stars per week over 101 weeks
13,236 stars, 1,767 forks, 170 watches
created 2020-02-13, last commit 2022-01-22, main language Python
cloud-native, computer-vision, deep-learning, framework, hacktoberfest, image-search, jina, machine-learning, microservice, multimodal-search, neural-search, nlp, python, search, search-as-a-service, semantic-search, video-search, zmq

Open standard for machine learning interoperability
https://onnx.ai/
https://github.com/onnx/onnx
52 stars per week over 228 weeks
11,963 stars, 2,355 forks, 430 watches
created 2017-09-07, last commit 2022-01-22, main language C++
deep-learning, deep-neural-networks, dnn, keras, machine-learning, ml, mxnet, neural-network, onnx, pytorch, scikit-learn, tensorflow

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
https://nni.readthedocs.io
https://github.com/microsoft/nni
57 stars per week over 190 weeks
10,886 stars, 1,539 forks, 269 watches
created 2018-06-01, last commit 2022-01-21, main language Python
automated-machine-learning, automl, bayesian-optimization, data-science, deep-learning, deep-neural-network, distributed, feature-engineering, feature-extraction, hyperparameter-optimization, machine-learning, machine-learning-algorithms, model-compression, nas, neural-architecture-search, neural-network, python, pytorch, tensorflow

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
https://github.com/google/dopamine
53 stars per week over 182 weeks
9,717 stars, 1,307 forks, 446 watches
created 2018-07-26, last commit 2021-12-14, main language Jupyter Notebook
ai, google, ml, rl, tensorflow

This repository contains implementations and illustrative code to accompany DeepMind publications
https://github.com/deepmind/deepmind-research
59 stars per week over 157 weeks
9,454 stars, 1,915 forks, 302 watches
created 2019-01-15, last commit 2021-12-10, main language Jupyter Notebook

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
https://github.com/spotify/annoy
20 stars per week over 459 weeks
9,374 stars, 981 forks, 318 watches
created 2013-04-01, last commit 2022-01-03, main language C++
approximate-nearest-neighbor-search, c-plus-plus, golang, locality-sensitive-hashing, lua, nearest-neighbor-search, python

SciPy library main repository
https://scipy.org
https://github.com/scipy/scipy
16 stars per week over 567 weeks
9,128 stars, 4,067 forks, 334 watches
created 2011-03-09, last commit 2022-01-23, main language Python
algorithms, closember, python, scientific-computing, scipy

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
http://epistasislab.github.io/tpot/
https://github.com/EpistasisLab/tpot
25 stars per week over 324 weeks
8,422 stars, 1,452 forks, 294 watches
created 2015-11-03, last commit 2021-01-06, main language Python
automated-machine-learning, automation, automl, data-science, feature-engineering, gradient-boosting, hyperparameter-optimization, machine-learning, model-selection, parameter-tuning, python, random-forest, scikit-learn, xgboost

Statsmodels: statistical modeling and econometrics in Python
http://www.statsmodels.org/devel/
https://github.com/statsmodels/statsmodels
12 stars per week over 554 weeks
7,033 stars, 2,409 forks, 258 watches
created 2011-06-12, last commit 2022-01-22, main language Python
data-analysis, econometrics, generalized-linear-models, python, regression-models, statistics, timeseries-analysis

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
https://catboost.ai
https://github.com/catboost/catboost
26 stars per week over 235 weeks
6,311 stars, 954 forks, 191 watches
created 2017-07-18, last commit 2022-01-23, main language C
big-data, catboost, categorical-features, coreml, cuda, data-mining, data-science, decision-trees, gbdt, gbm, gpu, gpu-computing, gradient-boosting, kaggle, machine-learning, python, r, tutorial

Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Aesara
https://docs.pymc.io/
https://github.com/pymc-devs/pymc3
9.48 stars per week over 663 weeks
6,292 stars, 1,510 forks, 230 watches
created 2009-05-05, last commit 2022-01-21, main language Python
aesara, bayesian-inference, hacktoberfest, mcmc, probabilistic-programming, python, statistical-analysis, variational-inference

Distributed Asynchronous Hyperparameter Optimization in Python
http://hyperopt.github.io/hyperopt
https://github.com/hyperopt/hyperopt
11 stars per week over 541 weeks
6,043 stars, 940 forks, 129 watches
created 2011-09-06, last commit 2021-11-29, main language Python

Automated Machine Learning with scikit-learn
https://automl.github.io/auto-sklearn
https://github.com/automl/auto-sklearn
17 stars per week over 342 weeks
5,982 stars, 1,114 forks, 217 watches
created 2015-07-02, last commit 2021-12-24, main language Python
automated-machine-learning, automl, bayesian-optimization, hyperparameter-optimization, hyperparameter-search, hyperparameter-tuning, meta-learning, metalearning, scikit-learn, smac

An open source Python library for automated feature engineering
https://www.featuretools.com
https://github.com/FeatureLabs/featuretools
26 stars per week over 228 weeks
5,951 stars, 782 forks, 157 watches
created 2017-09-08, last commit 2022-01-21, main language Python
automated-feature-engineering, automated-machine-learning, automl, data-science, feature-engineering, machine-learning, python, scikit-learn

YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
https://github.com/Megvii-BaseDetection/YOLOX
203 stars per week over 27 weeks
5,518 stars, 1,164 forks, 63 watches
created 2021-07-17, last commit 2022-01-18, main language Python
deep-learning, megengine, ncnn, object-detection, onnx, openvino, pytorch, tensorrt, yolo, yolov3, yolox

Uniform Manifold Approximation and Projection
https://github.com/lmcinnes/umap
22 stars per week over 238 weeks
5,349 stars, 607 forks, 121 watches
created 2017-07-02, last commit 2022-01-17, main language Python
dimensionality-reduction, machine-learning, topological-data-analysis, umap, visualization

A Python scikit for building and analyzing recommender systems
http://surpriselib.com
https://github.com/NicolasHug/Surprise
18 stars per week over 274 weeks
5,198 stars, 911 forks, 150 watches
created 2016-10-23, last commit 2020-08-05, main language Python
factorization, matrix, recommendation, recommender, svd, systems

An open-source, low-code machine learning library in Python
https://www.pycaret.org
https://github.com/pycaret/pycaret
44 stars per week over 113 weeks
4,983 stars, 1,129 forks, 107 watches
created 2019-11-23, last commit 2022-01-19, main language Jupyter Notebook
anomaly-detection, citizen-data-scientists, classification, clustering, data-science, gpu, machine-learning, ml, nlp, pycaret, python, regression, time-series

AutoGluon: AutoML for Text, Image, and Tabular Data
https://auto.gluon.ai/
https://github.com/awslabs/autogluon
31 stars per week over 129 weeks
4,071 stars, 545 forks, 80 watches
created 2019-07-29, last commit 2022-01-22, main language Python
autogluon, automated-machine-learning, automl, computer-vision, data-science, deep-learning, ensemble-learning, gluon, hyperparameter-optimization, image-classification, machine-learning, mxnet, natural-language-processing, neural-architecture-search, object-detection, pytorch, scikit-learn, structured-data, tabular-data, transfer-learning

Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun
https://github.com/lucidrains/deep-daze
75 stars per week over 53 weeks
4,022 stars, 289 forks, 73 watches
created 2021-01-17, last commit 2021-10-19, main language Python
artificial-intelligence, deep-learning, implicit-neural-representation, multi-modality, siren, text-to-image, transformers

A library of extension and helper modules for Python's data analysis and machine learning libraries.
http://rasbt.github.io/mlxtend/
https://github.com/rasbt/mlxtend
9.71 stars per week over 388 weeks
3,773 stars, 743 forks, 121 watches
created 2014-08-14, last commit 2022-01-19, main language Python
association-rules, data-mining, data-science, machine-learning, python, supervised-learning, unsupervised-learning

🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
http://wandb.ai
https://github.com/wandb/client
14 stars per week over 252 weeks
3,655 stars, 289 forks, 35 watches
created 2017-03-24, last commit 2022-01-20, main language Python
collaboration, data-science, data-versioning, deep-learning, experiment-track, hyperparameter-optimization, hyperparameter-search, hyperparameter-tuning, keras, machine-learning, ml-platform, mlops, model-versioning, pytorch, reinforcement-learning, reproducibility, tensorflow

Visual analysis and diagnostic tools to facilitate machine learning model selection.
http://www.scikit-yb.org/
https://github.com/DistrictDataLabs/yellowbrick
11 stars per week over 296 weeks
3,478 stars, 512 forks, 106 watches
created 2016-05-18, last commit 2022-01-05, main language Python
anaconda, estimator, machine-learning, matplotlib, model-selection, python, scikit-learn, visual-analysis, visualization, visualizer

A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)
https://reagent.ai
https://github.com/facebookresearch/ReAgent
13 stars per week over 234 weeks
3,098 stars, 449 forks, 145 watches
created 2017-07-27, last commit 2022-01-21, main language Python

🌊 Online machine learning in Python
https://riverml.xyz
https://github.com/online-ml/river
19 stars per week over 156 weeks
3,045 stars, 330 forks, 78 watches
created 2019-01-24, last commit 2022-01-14, main language Python
concept-drift, data-science, incremental-learning, machine-learning, online-learning, online-machine-learning, online-statistics, python, streaming, streaming-data

Uplift modeling and causal inference with machine learning algorithms
https://github.com/uber/causalml
20 stars per week over 132 weeks
2,725 stars, 413 forks, 65 watches
created 2019-07-09, last commit 2022-01-22, main language Python
causal-inference, incubation, machine-learning, uplift-modeling

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
https://coremltools.readme.io
https://github.com/apple/coremltools
10 stars per week over 238 weeks
2,524 stars, 395 forks, 101 watches
created 2017-06-30, last commit 2022-01-21, main language Python
coreml, coremltools, machine-learning, model-conversion, model-converter, pytorch, tensorflow

A library for debugging/inspecting machine learning classifiers and explaining their predictions
http://eli5.readthedocs.io
https://github.com/TeamHG-Memex/eli5
8.91 stars per week over 279 weeks
2,491 stars, 316 forks, 70 watches
created 2016-09-15, last commit 2020-01-22, main language Jupyter Notebook
crfsuite, data-science, explanation, inspection, lightgbm, machine-learning, nlp, python, scikit-learn, xgboost

Automated CI toolchain to produce precompiled opencv-python, opencv-python-headless, opencv-contrib-python and opencv-contrib-python-headless packages.
https://pypi.org/project/opencv-python/
https://github.com/skvark/opencv-python
8.22 stars per week over 302 weeks
2,485 stars, 492 forks, 78 watches
created 2016-04-08, last commit 2021-12-27, main language Shell
manylinux, opencv, opencv-contrib-python, opencv-python, precompiled, pypi, python, python-3, wheel

VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
https://vissl.ai
https://github.com/facebookresearch/vissl
24 stars per week over 93 weeks
2,316 stars, 228 forks, 52 watches
created 2020-04-09, last commit 2022-01-10, main language Jupyter Notebook

🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision
https://huggingface.co/docs/accelerate
https://github.com/huggingface/accelerate
32 stars per week over 64 weeks
2,115 stars, 127 forks, 43 watches
created 2020-10-30, last commit 2022-01-11, main language Python

NeuralProphet: A simple forecasting package
https://neuralprophet.com
https://github.com/ourownstory/neural_prophet
22 stars per week over 89 weeks
2,015 stars, 249 forks, 51 watches
created 2020-05-04, last commit 2022-01-15, main language Python
artificial-intelligence, autoregression, deep-learning, fbprophet, forecast, forecasting, forecasting-algorithm, forecasting-model, machine-learning, neural, neural-network, neuralprophet, prediction, prophet, python, pytorch, seasonality, time-series, timeseries, trend

Header-only C++/python library for fast approximate nearest neighbors
https://github.com/nmslib/hnswlib
7.82 stars per week over 237 weeks
1,857 stars, 352 forks, 59 watches
created 2017-07-06, last commit 2021-12-09, main language C++

A fast library for AutoML and tuning.
https://microsoft.github.io/FLAML/
https://github.com/microsoft/FLAML
22 stars per week over 74 weeks
1,709 stars, 222 forks, 39 watches
created 2020-08-20, last commit 2022-01-23, main language Python
automated-machine-learning, automl, classification, data-science, deep-learning, finetuning, hyperparam, hyperparameter-optimization, jupyter-notebook, machine-learning, natural-language-generation, natural-language-processing, python, random-forest, regression, scikit-learn, tabular-data, timeseries-forecasting, tuning

A Python toolbox for gaining geometric insights into high-dimensional data
http://hypertools.readthedocs.io/en/latest/
https://github.com/ContextLab/hypertools
6.07 stars per week over 277 weeks
1,685 stars, 158 forks, 58 watches
created 2016-09-27, last commit 2021-07-19, main language Python
data-visualization, data-wrangling, high-dimensional-data, python, text-vectorization, time-series, topic-modeling, visualization

Python library for interactive topic model visualization. Port of the R LDAvis package.
https://github.com/bmabey/pyLDAvis
4.41 stars per week over 354 weeks
1,562 stars, 326 forks, 56 watches
created 2015-04-09, last commit 2021-03-24, main language Jupyter Notebook

Prevent PyTorch's CUDA error: out of memory in just 1 line of code.
https://koila.pages.dev
https://github.com/rentruewang/koila
162 stars per week over 9 weeks
1,556 stars, 53 forks, 10 watches
created 2021-11-17, last commit 2022-01-19, main language Python
deep-learning, lazy-evaluation, machine-learning, memory-management, numpy, out-of-memory, python, pytorch, tensor, torch

🔅 Shapash makes Machine Learning models transparent and understandable by everyone
https://maif.github.io/shapash/
https://github.com/MAIF/shapash
16 stars per week over 90 weeks
1,535 stars, 207 forks, 32 watches
created 2020-04-29, last commit 2022-01-14, main language Jupyter Notebook
ethical-artificial-intelligence, explainability, explainable-ml, interpretability, lime, machine-learning, python, shap, transparency

GLIDE: a diffusion-based text-conditional image synthesis model
https://github.com/openai/glide-text2im
239 stars per week over 6 weeks
1,506 stars, 171 forks, 75 watches
created 2021-12-10, last commit 2021-12-22, main language Python

A flexible, intuitive and fast forecasting library
https://github.com/linkedin/greykite
37 stars per week over 38 weeks
1,443 stars, 65 forks, 35 watches
created 2021-04-27, last commit 2021-12-15, main language Python

High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.
https://dglke.dgl.ai/doc/
https://github.com/awslabs/dgl-ke
8.29 stars per week over 98 weeks
818 stars, 141 forks, 26 watches
created 2020-03-03, last commit 2021-12-29, main language Python
dgl, graph-learning, knowledge-graph, knowledge-graphs-embeddings, machine-learning

LAMA - automatic model creation framework
https://github.com/sberbank-ai-lab/LightAutoML
6.68 stars per week over 101 weeks
678 stars, 83 forks, 26 watches
created 2020-02-13, last commit 2022-01-19, main language Python
automated-machine-learning, automl, blackbox, classification, data-science, ensembling, feature-engineering, gradient-boosting, kaggle, lama, linear-model, model-selection, multiclass, nlp, parameter-tuning, pipeline, pytorch, regression, stacking, whitebox

Finetuning any DNN for better embedding on neural search tasks
https://finetuner.jina.ai
https://github.com/jina-ai/finetuner
17 stars per week over 23 weeks
406 stars, 22 forks, 19 watches
created 2021-08-11, last commit 2022-01-18, main language Python
few-shot-learning, fine-tuning, finetuning, jina, keras, labeling-tool, metric-learning, negative-sampling, neural-search, paddlepaddle, pretrained-models, pytorch, siamese-network, tensorflow, transfer-learning, triplet-loss

Official code for our NeurIPS 2021 Spotlight "Focal Self-attention for Local-Global Interactions in Vision Transformers"
https://github.com/microsoft/Focal-Transformer
13 stars per week over 28 weeks
377 stars, 43 forks, 17 watches
created 2021-07-10, last commit 2021-12-07, main language Python

PECOS - Prediction for Enormous and Correlated Spaces
https://github.com/amzn/pecos
3.27 stars per week over 75 weeks
247 stars, 59 forks, 16 watches
created 2020-08-12, last commit 2022-01-21, main language Python
extreme-multi-label-classification, extreme-multi-label-ranking, machine-learning-algorithms, transformers

Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning Framework on a GPU
https://github.com/salesforce/warp-drive
11 stars per week over 21 weeks
240 stars, 39 forks, 11 watches
created 2021-08-25, last commit 2022-01-10, main language Python
cuda, deep-learning, gpu, high-throughput, multiagent-reinforcement-learning, reinforcement-learning

PyStan, a Python interface to Stan, a platform for statistical modeling. Documentation: https://pystan.readthedocs.io
https://github.com/stan-dev/pystan
0.68 stars per week over 227 weeks
154 stars, 36 forks, 10 watches
created 2017-09-17, last commit 2021-10-21, main language Python

CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms
https://github.com/carla-recourse/CARLA
2.49 stars per week over 58 weeks
146 stars, 16 forks, 4 watches
created 2020-12-09, last commit 2022-01-20, main language Python
artificial-intelligence, benchmark, benchmarking, counterfactual, counterfactual-explanations, counterfactuals, explainability, explainable-ai, explainable-ml, machine-learning, python, pytorch, recourse, tensorflow, tensorflow2

Behavioral "black-box" testing for recommender systems
https://github.com/jacopotagliabue/reclist
12 stars per week over 10 weeks
133 stars, 4 forks, 5 watches
created 2021-11-08, last commit 2022-01-12, main language Python
machine-learning, qa-automation, recommender-system

Code for testing various M1 Chip benchmarks with TensorFlow.
https://github.com/mrdbourke/m1-machine-learning-test
12 stars per week over 10 weeks
129 stars, 24 forks, 4 watches
created 2021-11-14, last commit 2021-12-06, main language Jupyter Notebook
machine-learning, metal, tensorflow, tensorflow-macos

This file was automatically generated on 2022-01-23.

To curate your own GitHub list, simply clone the repository and change the input CSV file.
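
Each entry reports a "stars per week over N weeks" figure. The generator's exact formula and rounding are not stated in this file; the following is a minimal sketch that assumes the rate is total stars divided by the repository's age in weeks, measured from the creation date to the generation date. The function name `stars_per_week` is hypothetical and chosen only for illustration.

```python
from datetime import date


def stars_per_week(stars: int, created: date, as_of: date) -> float:
    """Average stars gained per week between creation and the snapshot date."""
    # Assumption: repository age is counted in days and converted to weeks.
    weeks = (as_of - created).days / 7
    if weeks <= 0:
        raise ValueError("as_of must be later than created")
    return stars / weeks


# Figures copied from the GFPGAN entry; the snapshot date is the generation
# date stated in this file (assumed to be the reference date).
gfpgan_rate = stars_per_week(16_991, date(2021, 3, 19), date(2022, 1, 23))
print(f"{gfpgan_rate:.2f} stars per week")
```

Under these assumptions the result lands close to the value listed for that entry; small differences would be expected if the generator rounds differently or collected the star count at a slightly different time.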

Inspired by:
https://github.com/vinta/awesome-python
https://github.com/trananhkma/fucking-awesome-python