Tutorial at DSC DACH 2025 - https://dscdach.com/
Modern data teams need an orchestrator that scales from a developer’s laptop to production workloads. In this hands-on session you’ll learn how Dagster’s asset-based approach lets Magenta Telekom move from siloed, ad-hoc jobs to governed, reusable data products, and how you can replicate the pattern locally in minutes. We first explain the core Dagster concepts that underpin every Dagster project. We then walk through and extend a template implementation (local data stack) that showcases these concepts, and finally discuss how Magenta uses them to build its data platform. By the end you’ll have a runnable project and the understanding needed to replicate Magenta’s scalable data platform in your own environment.
We are always interested in exchanging thoughts about tough data challenges!
Aleks is a senior data engineer at Magenta Telekom, where he builds and optimises an enterprise data platform.
- https://www.linkedin.com/in/milicevica23 (feel free to connect)
Georg is a senior data expert at Magenta and an MLOps engineer at ASCII. He solves challenges with data; his interests include geospatial graphs and time series. Georg is transitioning Magenta’s data platform to the cloud and handles large-scale, multi-modal MLOps challenges at ASCII.
- https://www.linkedin.com/in/geoheil/ (feel free to connect)
- https://geoheil.com (find more interesting stuff here)
Time (min) | Topic |
---|---|
0–5 | Welcome & goals – why Dagster, why asset-based orchestration |
5–30 | Crash course on Dagster concepts (slides): • What does asset-based mean? • Metadata-created pipelines • Resources and IO managers |
30–70 | Hands-on lab (local or GitHub Codespace): • Spin up Dagster • Tour the Dagster UI and run a first asset • Understand, run, and extend examples of the Dagster concepts above |
70–80 | Discussion: • Dagster at Magenta • Open-source implementation with local-data-stack |
80–90 | Wrap-up & Q&A – key takeaways, further resources |
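
To make the asset-based idea from the crash course more concrete, here is a toy sketch in plain Python (standard library only). It is not Dagster's API: the `asset` decorator, the `materialize_all` helper, and the asset names `raw_numbers`, `doubled`, and `total` are hypothetical and exist only for illustration. The sketch shows the general pattern in which each asset is a function, its upstream dependencies are read from its parameter names, and the execution order is derived from that graph instead of being scripted by hand.

```python
import inspect
from graphlib import TopologicalSorter

# Hypothetical registry: asset name -> function that computes it.
ASSETS = {}


def asset(fn):
    """Register a function as an asset; its parameter names are its upstream assets."""
    ASSETS[fn.__name__] = fn
    return fn


@asset
def raw_numbers():
    return [1, 2, 3]


@asset
def doubled(raw_numbers):
    return [2 * n for n in raw_numbers]


@asset
def total(doubled):
    return sum(doubled)


def materialize_all(assets):
    """Compute every asset once, upstream assets first."""
    # Map each asset to the set of assets it depends on.
    graph = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in assets.items()
    }
    results = {}
    # static_order() yields each node only after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        upstream = {dep: results[dep] for dep in graph[name]}
        results[name] = assets[name](**upstream)
    return results


results = materialize_all(ASSETS)
print(results["total"])  # 12
```

Nothing in the sketch states that `doubled` must run before `total`; the order follows from the declared inputs. That is the property the hands-on lab explores with real Dagster assets.
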
Tool | Why you need it | Quick install |
---|---|---|
Python package manager (pixi) | To create the environment needed to run the code | curl -fsSL https://pixi.sh/install.sh \| sh |
Git | To clone the prepared example repository | pixi global install git |
Optional: VS Code | For code edits (or any other text editor) | https://code.visualstudio.com/ |
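
If you want to confirm that the required tools from the table are on your `PATH` before the session, a small check like the following may help. It is only a sketch and assumes a Python interpreter is already available on your machine; the helper name `missing_tools` and the list `REQUIRED_TOOLS` are hypothetical.

```python
import shutil

# Command names taken from the table above (VS Code is optional and not checked).
REQUIRED_TOOLS = ["pixi", "git"]


def missing_tools(tools):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All required tools found.")
```
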
You can run the tutorial locally on your laptop or in a GitHub Codespace.
Instructions for a local setup follow below.
If you prefer a Codespace:
# pre-requisites
## pixi install
curl -fsSL https://pixi.sh/install.sh | sh
pixi global install git
git clone https://github.com/l-mds/dsc-dach-tutorial-dagster.git
cd dsc-dach-tutorial-dagster
Then run one of the following commands to start the tutorial:
# start the example
pixi run -e dev --frozen start
# activate the python environment with all the required dependencies for the tutorial
pixi shell -e dev
# move into the tutorial folder
cd src/tutorial
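# list the definitions (assets, jobs, ...) found in the project
dg list defs
# launch a run that materializes the asset named hello
dg launch --assets hello
# check that all definitions load without errors
dg check defs
# serve the component documentation locally
dg docs serve
# start a local Dagster instance with the web UI
dg dev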
- https://georgheiler.com/event/magenta-data-architecture-25/
- https://georgheiler.com/post/learning-data-engineering/
- https://georgheiler.com/post/dbt-duckdb-production/
- https://georgheiler.com/post/lmds-template/
- https://docs.dagster.io/guides/build/assets/asset-versioning-and-caching
- https://gafni.dev/projects/sanas-ai-dagster-ray/
- https://www.youtube.com/watch?v=HPqQSR0BoUQ
- https://www.samsara.com/blog/building-a-modern-machine-learning-platform-with-ray
- https://metaops.solutions/blog/dagster-monitoring-prometheus-system-metrics-custom-assets-part-1
- https://metaops.solutions/blog/dagster-monitoring-prometheus-dbt-assets-sql-transformations-part-2
Feel free to raise an issue or send a PR to improve the tutorial! Before pushing any commits, please run:
pixi run -e opstooling pre-commit-install
pixi run -e opstooling pre-commit-run