Data ingestion is the cornerstone of Data Engineering — it’s where every data journey begins. In this hands-on workshop, you’ll learn how to move data from anywhere to anywhere using the open-source modern data stack.
We’ll focus on practical skills, using the Python library dlt (data load tool) to ingest data from a REST API and load it into DuckDB, a fast and lightweight in-process database. Whether you're just getting started with data pipelines or looking to modernize your current stack, this session will give you a solid foundation for building reliable, open-source ingestion workflows.
Come ready to write some code, get your hands dirty, and walk away with real-world ingestion superpowers.
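To give a flavour of what you will build, here is a minimal sketch of a dlt pipeline that pulls JSON from a public REST API and loads it into a local DuckDB file. The PokeAPI endpoint, the pipeline/dataset/table names, and the page size are illustrative placeholders, not the workshop's exact code:

```python
# Minimal sketch: REST API -> dlt -> DuckDB (endpoint and names are placeholders)
import dlt
import requests

# PokeAPI is used here only as a stand-in for the API covered in the workshop.
response = requests.get("https://pokeapi.co/api/v2/pokemon", params={"limit": 50})
response.raise_for_status()
records = response.json()["results"]  # list of dicts like {"name": ..., "url": ...}

pipeline = dlt.pipeline(
    pipeline_name="rest_api_demo",
    destination="duckdb",      # writes a local .duckdb file in the working directory
    dataset_name="poke_data",
)

load_info = pipeline.run(records, table_name="pokemon")
print(load_info)  # summary of what was loaded and where
```

By default the duckdb destination writes a `rest_api_demo.duckdb` file next to the script, which you can then inspect with any DuckDB client.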
Prerequisites:
- PyLadies Amsterdam uses uv for dependency management
- A Google account, if you want to use Google Colab
There are two ways of running this workshop:
- With Google Colab (recommended)
- On a local instance of Jupyter Notebook
You can open direct links to Colab:
Or you can add the .ipynb notebooks from this repository to your Google Drive in the following way:
- Visit Google Colab
- In the top left corner select "File" → "Open Notebook"
- Under "GitHub", enter the URL of the repo of this workshop
- Select one of the notebooks within the repo.
- At the top of the notebook, add a Code cell and run the following code:
```
!git clone <github-url-of-workshop-repo>
%cd <name-of-repo>
!pip install -r requirements.txt
```
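If you want to confirm the install worked, you can run a quick sanity check in another cell. This assumes dlt and duckdb are pulled in via requirements.txt; if they are not, the imports below will fail:

```python
# Quick sanity check that the key workshop dependencies are importable
import dlt
import duckdb

print("dlt", dlt.__version__)
print("duckdb", duckdb.__version__)
```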
To run the workshop on a local Jupyter Notebook instance, run the following commands:
```bash
git clone <github-url-of-workshop-repo>
cd <name-of-repo>
# create the venv and install dependencies
uv sync
```
And start Jupyter Notebook:

```bash
uv run jupyter notebook
```
Re-watch this YouTube stream
This workshop was set up by @pyladiesams and @VioletM
To ensure our code looks beautiful, PyLadies uses pre-commit hooks. You can enable them by running `pre-commit install`. You may have to install pre-commit first, using `uv sync`, `uv pip install pre-commit`, or `pip install pre-commit`.
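For example, one possible flow with uv (adjust the install step to whichever option above you prefer) looks like this:

```bash
uv sync                             # or: uv pip install pre-commit
uv run pre-commit install           # register the git hooks in this clone
uv run pre-commit run --all-files   # optionally check every file once
```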
Happy Coding :)