Presentation: Lightweight Analytics at Scale
In this workshop, you’ll learn how to build powerful yet lightweight data workflows using Python, DuckDB, and Smallpond. We’ll explore how DuckDB enables fast, in-process SQL analytics on massive datasets without heavy infrastructure, and how Smallpond extends those capabilities into distributed, collaborative, or cloud-friendly environments.
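To make that concrete, here is a minimal sketch of both tools in action. It assumes duckdb and smallpond are installed and that a Parquet file exists at the illustrative path data/events.parquet; the smallpond calls follow its published quick-start API (init, read_parquet, repartition, partial_sql, write_parquet) and may differ from what we use in the workshop notebooks.

import duckdb

# DuckDB is in-process: importing the library is all the "infrastructure" you need.
duckdb.sql("""
    SELECT event_type, count(*) AS n
    FROM 'data/events.parquet'   -- illustrative path, not part of this repo
    GROUP BY event_type
    ORDER BY n DESC
""").show()

import smallpond

# smallpond runs the same kind of SQL, but distributed across partitions.
sp = smallpond.init()
df = sp.read_parquet("data/events.parquet")
df = df.repartition(4, hash_by="event_type")  # spread the work over 4 partitions
df = sp.partial_sql("SELECT event_type, count(*) AS n FROM {0} GROUP BY event_type", df)
df.write_parquet("out/")  # writes one output file per partition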
- PyLadies Amsterdam uses uv for dependency management
- Google account if you want to use Google Colab
- Docker Desktop or Docker + Docker Compose
Run the following commands:
git clone https://github.com/pyladiesams/lightweight-analytics-duckdb-smallpond-oct2025.git
cd lightweight-analytics-duckdb-smallpond-oct2025
# create and activate venv, install dependencies
uv sync

Alternatively, to run the notebooks in Google Colab:
- Visit Google Colab
- In the top left corner select "File" → "Open Notebook"
- Under "GitHub", enter the URL of the repo of this workshop
- Select one of the notebooks within the repo.
- At the top of the notebook, add a Code cell and run the following code:
!git clone https://github.com/pyladiesams/lightweight-analytics-duckdb-smallpond-oct2025.git
%cd lightweight-analytics-duckdb-smallpond-oct2025
!pip install -r requirements.txt

To get started, open the pyproject.toml file and set the required Python version. Note that the pre-selected version 3.8 reached end of life in October 2024, so a newer release (e.g. 3.12) is generally the safer choice.
After you have specified the Python version, you can create a virtual environment with uv venv and add packages with uv add <package>. Before the workshop, you can generate a requirements.txt file, which is needed e.g. for running code in Google Colab, by running uv export > requirements.txt.
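For reference, a trimmed pyproject.toml for this kind of setup might look like the sketch below. The project name, version pin, and dependency list are illustrative, not this repo's actual file:

[project]
name = "lightweight-analytics-workshop"  # placeholder name
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "duckdb",
    "smallpond",
]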
Re-watch this YouTube stream
This workshop was set up by @pyladiesams and @valerybriz
To ensure our code looks beautiful, PyLadies uses pre-commit hooks. You can enable them by running pre-commit install. You may have to install pre-commit first, using uv sync, uv pip install pre-commit or pip install pre-commit.
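If you are curious what those hooks look like, a minimal .pre-commit-config.yaml follows the shape below; the specific hooks and rev pins are illustrative, and this repo's actual config may differ:

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9
    hooks:
      - id: ruff
      - id: ruff-format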
Happy Coding :)