awesome-pandas

A collection of resources for pandas (Python) and related subjects. Pull requests are very welcome!

Contents: This is an unofficial collection of resources for learning pandas, an open source Python library for data analysis. Here you will find videos, cheat-sheets, tutorials and books / papers. The curated list is divided into four parts:

  1. pandas resources - A collection of videos, cheat-sheets, tutorials and books directly related to pandas.
  2. Data analysis with Python resources - Material related to adjacent Python libraries and software such as NumPy, scipy, matplotlib, seaborn, statsmodels and Jupyter.
  3. Miscellaneous related resources - Resources related to general data analysis, Python programming, algorithms, computer science, machine learning, statistics, etc.
  4. Packages - Python packages that help when working with pandas.

(1) 🐼 pandas resources

(1.1) 📺 Videos

The videos below were collected in July of 2018. They are all directly related to pandas, and the Level of a video is quantified roughly as follows:

  • 😃 : Beginner - requires little knowledge to jump into, elementary topics.
  • 😅 : Intermediate - some prior knowledge needed, more technical.
  • 😱 : Advanced - very technical, or discusses advanced topics.
  • ⭐ : Recommended video - high quality video and audio, great presentation.
| Title | Speaker | Uploader | Time | Views | Year | Level |
| --- | --- | --- | --- | --- | --- | --- |
| Pandas tutorial for Data Science | Bikram Kundu | - | 01:20 | 2K+ | 2022 | 😃 |
| Python for Data Analysis using Pandas part 1 & part 2 [repo] | tommyod | na | 2:19 | 100 | 2019 | 😃 |
| Data Science Best Practices with pandas [repo] | Kevin Markham | PyCon | 3:23 | 1000 | 2019 | 😃 |
| Thinking like a Panda | Hannah Stepanek | PyCon | 0:36 | 700 | 2019 | 😃 |
| Analyzing Census Data with Pandas [repo] | Sergio Sánchez | PyCon | 3:15 | 600 | 2019 | 😃 |
| Pandas is for Everyone [repo] | Daniel Chen | PyCon | 3:18 | 600 | 2019 | 😃 |
| Pandas From The Ground Up [repo] | Brandon Rhodes | PyCon 2015 | 2:24 | 91000 | 2015 | 😃 |
| Introduction Into Pandas [repo] | Daniel Chen | Python Tutorial | 1:28 | 46000 | 2017 | 😃 |
| Introduction To Data Analytics With Pandas [repo] | Quentin Caudron | Python Tutorial | 1:51 | 25000 | 2017 | 😃 |
| Pandas for Data Analysis [repo] | Daniel Chen | Enthought | 3:45 | 13000 | 2017 | 😅 |
| Optimizing Pandas Code [repo] | Sofia Heisler | PyCon 2017 | 0:29 | 12000 | 2017 | 😅 |
| A Visual Guide To Pandas | Jason Wirth | Next Day Video | 0:26 | 49000 | 2015 | 😃 |
| Analyzing and Manipulating Data with Pandas [repo] | Jonathan Rocher | Enthought | 3:33 | 22000 | 2016 | 😃 |
| Time Series Analysis [repo] | Aileen Nielsen | PyCon 2017 | 3:11 | 9000 | 2017 | 😅 |
| Predicting sports winners with pandas | Robert Layton | PyCon Australia | 0:38 | 13000 | 2015 | 😅 |
| Pandas from the Inside [repo] [2016 talk] | Stephen Simmons | PyData | 1:17 | 3000 | 2017 | 😱 |
| Pandas part 1 & part 2 [repo] | Joris Van den Bossche | EuroSciPy | 3:03 | 1000 | 2017 | 😃 |
| Pandas: .head() to .tail() [repo] | Tom Augspurger | PyData | 1:26 | 3000 | 2016 | 😅 |
| Performance Pandas (London) [repo] | Jeff Reback | PyData | 0:43 | 2000 | 2015 | 😅 |
| Performance Pandas (NYC) [repo] | Jeff Reback | PyData | 1:26 | 3000 | 2015 | 😅 |
| Python Data Science with pandas [repo] | Matt Harrison | JetBrainsTV | 1:09 | 2000 | 2018 | 😃 |
| What is the Future of Pandas [slides] | Jeff Reback | PyData | 0:31 | 4000 | 2017 | 😃 |
| Introduction to Python for Data Science [repo] | Skipper Seabold | PyData | 3:18 | 300 | 2018 | 😃 |
| Pandas for Better (and Worse) Data Science [repo] | Kevin Markham | PyCon 2018 | 3:21 | 3000 | 2018 | 😃 |

Know of a recent, good video? Send a pull request! 👍

(1.2) ❗ Cheat-sheets

(1.3) 🎓 Tutorials

(1.4) 📘 Books / papers

  • [amazon] McKinney, Wes. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. 2nd edition. O’Reilly Media, 2017.
  • [amazon] VanderPlas, Jake. Python Data Science Handbook: Essential Tools for Working with Data. 1st edition. O’Reilly Media, 2016.
  • [manning] Lerner, Reuven. 50 exercises that will strengthen your pandas skills to a level of automatic fluency. 1st edition. Manning Publications, 2021.
  • [manning] Paskhaver, Boris. This friendly and hands-on guide shows you how to start mastering pandas with skills you already know from spreadsheet software. 1st edition. Manning Publications, 2021.

(2) Data analysis with Python resources

(2.1) 📺 Videos

| Title | Speaker | Uploader | Time | Views | Keyword | Year | Level |
| --- | --- | --- | --- | --- | --- | --- | --- |
| NumPy Beginner [repo] | Alexandre Chabot LeClerc | Enthought | 2:47 | 56000 | NumPy | 2016 | 😅 |
| Machine Learning | Andreas Mueller & Sebastian Raschka | Enthought | 3:03 | 47000 | sklearn | 2016 | 😅 |
| The Python Visualization Landscape | Jake VanderPlas | PyCon 2017 | 0:33 | 21000 | python | 2017 | 😃 |
| JupyterLab: Building Blocks for Interactive Computing | Brian Granger | Enthought | 0:29 | 28000 | jupyter | 2016 | 😃 |
| Machine Learning with Scikit Learn [repo] | Andreas Mueller & Kyle Kastner | Enthought | 3:22 | 48000 | sklearn | 2015 | 😅 |
| Machine Learning for Time Series Data in Python | Brett Naul | Enthought | 0:24 | 24000 | cesium | 2016 | 😃 |
| Computational Statistics [repo] | Allen Downey | Enthought | 2:05 | 10000 | scipy | 2017 | 😅 |
| Time Series Analysis [repo] | Aileen Nielsen | PyCon 2017 | 3:11 | 9000 | pandas | 2017 | 😅 |
| Learning TensorFlow | Robert Layton | PyCon Australia | 0:40 | 18000 | tensorflow | 2016 | 😅 |
| JupyterHub: Deploying Jupyter Notebooks | Min Ragan Kelley & Thomas Kluyver | PyData | 1:36 | 17000 | jupyter | 2016 | 😃 |
| Applied Time Series Econometrics | Jeffrey Yau | PyData | 1:39 | 17000 | statsmodels | 2016 | 😅 |
| Machine Learning with scikit learn [repo] | Andreas Mueller & Alexandre Gram | Enthought | 3:10 | 8000 | sklearn | 2017 | 😅 |
| Introduction to Numerical Computing with NumPy | Dillon Niederhut | Enthought | 2:27 | 8000 | NumPy | 2017 | 😃 |
| Dask - A Pythonic Distributed Data Science Framework | Matthew Rocklin | PyCon 2017 | 0:46 | 7000 | dask | 2017 | 😅 |
| Introduction to Statistical Modeling with Python [repo] | Christopher Fonnesbeck | PyCon 2017 | 3:19 | 7000 | scipy | 2017 | 😅 |
| Fully Convolutional Networks for Image Segmentation | Daniil Pakhomov | Enthought | 0:20 | 7000 | scipy | 2017 | 😃 |
| Exploratory data analysis in python [repo] | Chloe Mawer & Jonathan Whitmore | PyCon 2017 | 2:54 | 7000 | scipy | 2017 | 😃 |
| Libraries for Deep Learning with Sequences | Alex Rubinsteyn | PyData | 0:44 | 23000 | scipy | 2015 | 😅 |
| Numba - Tell Those C++ Bullies to Get Lost [repo] | Gil Forsyth & Lorena Barba | Enthought | 2:25 | 5000 | numba | 2017 | 😅 |
| Deploying Interactive Jupyter Dashboards | Philipp Rudiger | Enthought | 0:18 | 5000 | jupyter | 2017 | 😅 |
| Data Science Using Functional Python | Joel Grus | PyData | 0:44 | 18000 | python | 2015 | 😅 |
| Anatomy of matplotlib [repo] | Benjamin Root & Joe Kington | Enthought | 3:18 | 18000 | matplotlib | 2015 | 😅 |
| Anatomy of matplotlib [repo] | Benjamin Root | Enthought | 3:02 | 4000 | matplotlib | 2017 | 😅 |
| Data Science is Software [repo] | Peter Bull & Isaac Slavitt | Enthought | 2:12 | 9000 | jupyter | 2016 | 😃 |
| Machine Learning with Scikit Learn [repo] | Jake VanderPlas | PyData | 1:34 | 16000 | sklearn | 2015 | 😅 |
| Using Jupyter notebooks [repo] | Ioanna Ioannou | PyCon Australia | 0:28 | 8000 | jupyter | 2016 | 😅 |
| Parallel Python: Analyzing Large Datasets [repo] | Matthew Rocklin | Enthought | 3:05 | 7000 | scipy | 2016 | 😱 |
| Keynote: Project Jupyter | Brian Granger | Enthought | 0:48 | 7000 | jupyter | 2016 | 😅 |
| matplotlib beginner tutorial [repo] | Nicolas Rougier | Enthought | 2:59 | 6000 | matplotlib | 2016 | 😅 |
| Awesome Big Data Algorithms | Titus Brown | Next Day Video | 0:39 | 41000 | python | 2013 | 😱 |
| All About Jupyter | Brian Granger | PyData | 0:39 | 11000 | jupyter | 2015 | 😅 |
| PyMC: Markov Chain Monte Carlo | Chris Fonnesbeck | Enthought | 0:20 | 9000 | pyMC | 2014 | 😅 |
| Jupyter Advanced Topics Tutorial [repo] | Jonathan Frederic & Matthias Bussonier | Enthought | 2:48 | 4000 | jupyter | 2015 | 😱 |
| Using randomness to make code much faster | Rachel Thomas | SF Python | 0:54 | 1000 | scipy | 2017 | 😅 |
| Python Profiling & Performance | Mahmoud Hashemi | SF Python | 0:28 | 1000 | python | 2016 | 😅 |
| Using List Comprehensions and Generator Expressions | Trey Hunner | PyCon 2018 | 3:21 | 3000 | python | 2018 | 😅 |
| Foundations of Numerical Computing | Scott Sanderson | PyCon 2018 | 3:22 | 1000 | python | 2018 | 😅 |

(2.2) ❗ Cheat-sheets

(2.3) 🎓 Tutorials

(2.4) 📘 Books / papers

  • Varoquaux, Gael, Valentin Haenel, Emmanuelle Gouillart, Zbigniew Jędrzejewski-Szmek, Ralf Gommers, Fabian Pedregosa, Olav Vahtras, et al. Scipy Lecture Notes. Zenodo, September 28, 2015. https://doi.org/10.5281/zenodo.31521.
  • [amazon] Nunez-Iglesias, Juan, Stéfan van der Walt, and Harriet Dashnow. Elegant SciPy: The Art of Scientific Python. 1st edition. O’Reilly Media, 2017.
  • Rougier, Nicolas P. From Python to Numpy. Zenodo, December 31, 2016. https://doi.org/10.5281/zenodo.225783.
  • [amazon] Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. 1st edition. O’Reilly Media, 2017.

(3) Miscellaneous related resources

(3.1) 📺 Videos

| Title | Speaker | Uploader | Time | Views | Keyword | Year | Level |
| --- | --- | --- | --- | --- | --- | --- | --- |
| So you want to be a Python expert? | James Powell | PyData | 1:54 | 28000 | python | 2017 | 😱 |
| Transforming Code into Beautiful, Idiomatic Python | Raymond Hettinger | Next Day Video | 0:48 | 340000 | python | 2013 | 😃 |
| Builtin Superheroes | David Beazley | David Beazley | 0:44 | 12000 | python | 2016 | 😅 |
| How to become a Data Scientist in 6 months | Tetiana Ivanova | PyData | 0:56 | 148000 | misc | 2016 | 😃 |
| Modern Dictionaries | Raymond Hettinger | SF Python | 1:07 | 44000 | python | 2016 | 😅 |
| Keynote on Concurrency | Raymond Hettinger | SF Python | 1:13 | 15000 | python | 2017 | 😅 |
| The Fun of Reinvention | David Beazley | David Beazley | 0:52 | 11000 | python | 2017 | 😱 |
| Being a Core Developer in Python | Raymond Hettinger | SF Python | 1:02 | 19000 | python | 2016 | 😃 |
| Visualizing Geographic Data | Christopher Roach | PyData | 0:31 | 14000 | python | 2016 | 😃 |
| Python's Class Development Toolkit | Raymond Hettinger | Next Day Video | 0:45 | 80000 | python | 2013 | 😅 |
| The Other Async (Threads + Async = ❤️) | David Beazley | David Beazley | 0:47 | 5000 | python | 2017 | 😱 |
| Functional Programming with Python | Mike Müller | Next Day Video | 0:27 | 44000 | python | 2013 | Novice |
| Building a Recommendation Engine using Python | Anusua Trivedi | PyData | 0:37 | 11000 | python | 2015 | Novice |
| Iterations of Evolution | David Beazley | David Beazley | 0:34 | 2000 | python | 2017 | Novice |
| "Good Enough" IS Good Enough! | Alex Martelli | SF Python | 0:53 | 4000 | python | 2016 | Novice |
| Automating Code Quality | Kyle Knapp | PyCon 2018 | 0:30 | 3000 | python | 2018 | 😅 |

(3.2) ❗ Cheat-sheets

(3.3) 🎓 Tutorials

(3.4) 📘 Books / papers

  • [amazon] Slatkin, Brett. Effective Python: 59 Specific Ways to Write Better Python. 1st edition. Addison-Wesley Professional, 2015.
  • [amazon] Ramalho, Luciano. Fluent Python. 1st edition. O’Reilly, 2015.
  • [pdf] Rougier, Nicolas P., Michael Droettboom, and Philip Bourne. "Ten Simple Rules for Better Figures." PLoS Computational Biology 10 (September 1, 2014): e1003833. https://doi.org/10.1371/journal.pcbi.1003833.
  • [pdf] Wickham, Hadley. "Tidy Data." Journal of Statistical Software 59, no. 10 (2014). Accessed December 31, 2017. https://doi.org/10.18637/jss.v059.i10.
  • [amazon] [online] Chacon, Scott, and Ben Straub. Pro Git. 2nd edition. New York, NY: Apress, 2014.

The books below are perhaps of an even more general nature.

  • [amazon] Dasgupta, Sanjoy, Christos H. Papadimitriou, and Umesh Virkumar Vazirani. Algorithms. Boston, Mass: McGraw Hill, 2008.
  • [amazon] Trefethen, Lloyd N. Numerical Linear Algebra. Society for Industrial and Applied Mathematics, 1997.
  • [amazon] Golub, Gene H. Matrix Computations. 4th edition. Johns Hopkins Studies in the Mathematical Sciences. Baltimore: Johns Hopkins University Press, 2013.

Every video is listed below.

| Title | Speaker | Uploader | Time | Views | Keyword | Year | Level |
| --- | --- | --- | --- | --- | --- | --- | --- |
| How to become a Data Scientist in 6 months | Tetiana Ivanova | PyData | 0:56 | 148000 | misc | 2016 | 🐍 |
| Introduction Into Pandas | Daniel Chen | Python Tutorial | 1:28 | 46000 | pandas | 2017 | 🐍 |
| So you want to be a Python expert? | James Powell | PyData | 1:54 | 28000 | python | 2017 | 🐍🐍🐍 |
| NumPy Beginner [repo] | Alexandre Chabot LeClerc | Enthought | 2:47 | 56000 | NumPy | 2016 | 🐍🐍 |
| Introduction To Data Analytics With Pandas | Quentin Caudron | Python Tutorial | 1:51 | 25000 | pandas | 2017 | 🐍 |
| Transforming Code into Beautiful, Idiomatic Python | Raymond Hettinger | Next Day Video | 0:48 | 340000 | python | 2013 | 🐍 |
| Machine Learning | Andreas Mueller & Sebastian Raschka | Enthought | 3:03 | 47000 | sklearn | 2016 | 🐍🐍 |
| Pandas From The Ground Up [repo] | Brandon Rhodes | PyCon 2015 | 2:24 | 91000 | pandas | 2015 | 🐍🐍 |
| Modern Dictionaries | Raymond Hettinger | SF Python | 1:07 | 44000 | python | 2016 | 🐍🐍 |
| The Python Visualization Landscape | Jake VanderPlas | PyCon 2017 | 0:33 | 21000 | python | 2017 | 🐍 |
| Keynote on Concurrency | Raymond Hettinger | SF Python | 1:13 | 15000 | python | 2017 | 🐍🐍 |
| Pandas for Data Analysis [repo] | Daniel Chen | Enthought | 3:45 | 13000 | pandas | 2017 | 🐍🐍 |
| JupyterLab: Building Blocks for Interactive Computing | Brian Granger | Enthought | 0:29 | 28000 | jupyter | 2016 | 🐍 |
| Optimizing Pandas Code for Speed and Efficiency | Sofia Heisler | PyCon 2017 | 0:29 | 12000 | pandas | 2017 | 🐍🐍 |
| A Visual Guide To Pandas | Jason Wirth | Next Day Video | 0:26 | 49000 | pandas | 2015 | 🐍 |
| Machine Learning with Scikit Learn [repo] | Andreas Mueller & Kyle Kastner | Enthought | 3:22 | 48000 | sklearn | 2015 | 🐍🐍 |
| Machine Learning for Time Series Data in Python | Brett Naul | Enthought | 0:24 | 24000 | cesium | 2016 | 🐍 |
| The Fun of Reinvention | David Beazley | David Beazley | 0:52 | 11000 | python | 2017 | 🐍🐍🐍 |
| Analyzing and Manipulating Data with Pandas [repo] | Jonathan Rocher | Enthought | 3:33 | 22000 | pandas | 2016 | 🐍 |
| Computational Statistics [repo] | Allen Downey | Enthought | 2:05 | 10000 | scipy | 2017 | 🐍🐍 |
| Being a Core Developer in Python | Raymond Hettinger | SF Python | 1:02 | 19000 | python | 2016 | 🐍 |
| Time Series Analysis [repo] | Aileen Nielsen | PyCon 2017 | 3:11 | 9000 | pandas | 2017 | 🐍🐍 |
| Learning TensorFlow | Robert Layton | PyCon Australia | 0:40 | 18000 | tensorflow | 2016 | 🐍🐍 |
| JupyterHub: Deploying Jupyter Notebooks | Min Ragan Kelley & Thomas Kluyver | PyData | 1:36 | 17000 | jupyter | 2016 | 🐍 |
| Applied Time Series Econometrics | Jeffrey Yau | PyData | 1:39 | 17000 | statsmodels | 2016 | 🐍🐍 |
| Machine Learning with scikit learn [repo] | Andreas Mueller & Alexandre Gram | Enthought | 3:10 | 8000 | sklearn | 2017 | 🐍🐍 |
| Introduction to Numerical Computing with NumPy | Dillon Niederhut | Enthought | 2:27 | 8000 | NumPy | 2017 | 🐍 |
| Dask - A Pythonic Distributed Data Science Framework | Matthew Rocklin | PyCon 2017 | 0:46 | 7000 | dask | 2017 | 🐍🐍 |
| Introduction to Statistical Modeling with Python [repo] | Christopher Fonnesbeck | PyCon 2017 | 3:19 | 7000 | scipy | 2017 | 🐍🐍 |
| Fully Convolutional Networks for Image Segmentation | Daniil Pakhomov | Enthought | 0:20 | 7000 | scipy | 2017 | 🐍 |
| Exploratory data analysis in python [repo] | Chloe Mawer & Jonathan Whitmore | PyCon 2017 | 2:54 | 7000 | scipy | 2017 | 🐍 |
| Visualizing Geographic Data | Christopher Roach | PyData | 0:31 | 14000 | python | 2016 | 🐍 |
| Builtin Superheroes | David Beazley | David Beazley | 0:44 | 12000 | python | 2016 | 🐍🐍 |
| Python's Class Development Toolkit | Raymond Hettinger | Next Day Video | 0:45 | 80000 | python | 2013 | 🐍🐍 |
| Libraries for Deep Learning with Sequences | Alex Rubinsteyn | PyData | 0:44 | 23000 | scipy | 2015 | 🐍🐍 |
| The Other Async (Threads + Async = ❤️) | David Beazley | David Beazley | 0:47 | 5000 | python | 2017 | 🐍🐍🐍 |
| Numba - Tell Those C++ Bullies to Get Lost [repo] | Gil Forsyth & Lorena Barba | Enthought | 2:25 | 5000 | numba | 2017 | 🐍🐍 |
| Deploying Interactive Jupyter Dashboards | Philipp Rudiger | Enthought | 0:18 | 5000 | jupyter | 2017 | 🐍🐍 |
| Eyal Trabelsi - Practical Optimisations for Pandas | Eyal Trabelsi | Europython | 0:45 | 5000 | jupyter | 2020 | 🐍🐍 |
| Data Science Using Functional Python | Joel Grus | PyData | 0:44 | 18000 | python | 2015 | 🐍🐍 |
| Pandas from the Inside | Stephen Simmons | PyData | 1:20 | 9000 | pandas | 2016 | 🐍🐍🐍 |
| Anatomy of matplotlib [repo] | Benjamin Root & Joe Kington | Enthought | 3:18 | 18000 | matplotlib | 2015 | 🐍🐍 |
| Anatomy of matplotlib [repo] | Benjamin Root | Enthought | 3:02 | 4000 | matplotlib | 2017 | 🐍🐍 |
| Data Science is Software [repo] | Peter Bull & Isaac Slavitt | Enthought | 2:12 | 9000 | jupyter | 2016 | 🐍 |
| Machine Learning with Scikit Learn [repo] | Jake VanderPlas | PyData | 1:34 | 16000 | sklearn | 2015 | Novice |
| Using Jupyter notebooks | Ioanna Ioannou | PyCon Australia | 0:28 | 8000 | jupyter | 2016 | Novice |
| Parallel Python: Analyzing Large Datasets [repo] | Matthew Rocklin | Enthought | 3:05 | 7000 | scipy | 2016 | Novice |
| Functional Programming with Python | Mike Müller | Next Day Video | 0:27 | 44000 | python | 2013 | Novice |
| Predicting sports winners with pandas and scikit-learn | Robert Layton | PyCon Australia | 0:38 | 13000 | pandas | 2015 | Novice |
| Keynote: Project Jupyter | Brian Granger | Enthought | 0:48 | 7000 | jupyter | 2016 | Novice |
| matplotlib beginner tutorial [repo] | Nicolas Rougier | Enthought | 2:59 | 6000 | matplotlib | 2016 | Novice |
| Awesome Big Data Algorithms | Titus Brown | Next Day Video | 0:39 | 41000 | python | 2013 | Novice |
| Pandas from the Inside | Stephen Simmons | PyData | 1:17 | 3000 | pandas | 2017 | Novice |
| All About Jupyter | Brian Granger | PyData | 0:39 | 11000 | jupyter | 2015 | Novice |
| Building a Recommendation Engine using Python | Anusua Trivedi | PyData | 0:37 | 11000 | python | 2015 | Novice |
| Iterations of Evolution | David Beazley | David Beazley | 0:34 | 2000 | python | 2017 | Novice |
| "Good Enough" IS Good Enough! | Alex Martelli | SF Python | 0:53 | 4000 | python | 2016 | Novice |
| PyMC: Markov Chain Monte Carlo | Chris Fonnesbeck | Enthought | 0:20 | 9000 | pyMC | 2014 | Novice |
| Jupyter Advanced Topics Tutorial [repo] | Jonathan Frederic & Matthias Bussonier | Enthought | 2:48 | 4000 | jupyter | 2015 | Novice |
| Using randomness to make code much faster | Rachel Thomas | SF Python | 0:54 | 1000 | scipy | 2017 | Novice |
| Python Profiling & Performance | Mahmoud Hashemi | SF Python | 0:28 | 1000 | python | 2016 | Novice |

(4) Packages

  • datatest - Tools for test driven data-wrangling and data validation (DataFrame, Series, Index, MultiIndex).
  • pandera - A light-weight, flexible, and expressive data validation library for dataframes.
  • pandas-vet - A plugin for Flake8 that checks pandas code.