During this presentation I talked about a few resources to help you get started with and configure your Apache Airflow environments. Here are links to those resources.
During the session I walked through using Amazon Managed Workflows for Apache Airflow (MWAA). To reproduce everything I demoed, you can follow this series of deep-dive posts; each one provides all the resources you need.
- Part 1 - Installation and configuration of Managed Workflows for Apache Airflow
- Part 2 - Working with Permissions
- Part 3 - Accessing Amazon Managed Workflows for Apache Airflow
- Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line (see the sketch after this list)
- Part 5 - A simple CI/CD system for your development workflow
- Part 6 - Monitoring and logging
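To give you a flavour of what Part 4 covers, here is a minimal sketch, not the exact script from the session, of running an Airflow CLI command against an MWAA environment using the documented CLI token endpoint. The environment name is a placeholder.

```python
# Minimal sketch: run an Airflow CLI command against an MWAA environment.
# Uses the documented create-cli-token API and the /aws_mwaa/cli endpoint.
# "MyAirflowEnvironment" is a placeholder - substitute your own environment.
import base64

import boto3
import requests

mwaa = boto3.client("mwaa")
token = mwaa.create_cli_token(Name="MyAirflowEnvironment")

# The MWAA web server expects the Airflow CLI command as plain text.
response = requests.post(
    f"https://{token['WebServerHostname']}/aws_mwaa/cli",
    headers={
        "Authorization": f"Bearer {token['CliToken']}",
        "Content-Type": "text/plain",
    },
    data="dags list",
)
result = response.json()

# stdout and stderr come back base64 encoded, so decode before printing.
print(base64.b64decode(result["stdout"]).decode())
```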
Integration with Amazon EMR (Elastic MapReduce) for big data workloads
- Running PySpark Applications on Amazon EMR
- Running Spark Jobs on Amazon EMR with Apache Airflow
- Building complex workflows with Amazon MWAA, AWS Step Functions, AWS Glue, and Amazon EMR
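As a taste of what those posts cover, here is a minimal, hypothetical DAG sketch that adds a Spark step to an existing EMR cluster and waits for it to finish, assuming a recent version of the Amazon provider package is installed. The cluster ID and S3 script path are placeholders.

```python
# Hypothetical DAG sketch: submit a Spark step to an existing EMR cluster
# and wait for it to complete. The cluster ID (j-XXXXXXXXXXXXX) and the
# S3 script path are placeholders - substitute your own values.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.emr import EmrAddStepsOperator
from airflow.providers.amazon.aws.sensors.emr import EmrStepSensor

SPARK_STEP = [
    {
        "Name": "run-pyspark-app",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "s3://my-bucket/scripts/app.py"],
        },
    }
]

with DAG(
    dag_id="emr_spark_example",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    add_step = EmrAddStepsOperator(
        task_id="add_spark_step",
        job_flow_id="j-XXXXXXXXXXXXX",
        steps=SPARK_STEP,
    )

    # EmrAddStepsOperator pushes the list of new step IDs to XCom;
    # the sensor polls the first one until the step completes.
    watch_step = EmrStepSensor(
        task_id="watch_spark_step",
        job_flow_id="j-XXXXXXXXXXXXX",
        step_id="{{ task_instance.xcom_pull(task_ids='add_spark_step', "
                "key='return_value')[0] }}",
    )

    add_step >> watch_step
```

Because the sensor blocks until the step finishes, any downstream tasks only run once the Spark job has completed.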
Integration with Amazon SageMaker for training and hyperparameter tuning
- Amazon SageMaker Workshop / Airflow Integration workshop - [https://www.sagemakerworkshop.com/airflow/](https://aws-oss.beachgeek.co.uk/2h) - reproduce the Amazon SageMaker DAG yourself.
Integration with Amazon Personalize
- Managed Workflows for Apache Airflow and Amazon Personalize - a great blog post and source code from AWS Community Builder Yi Ai.
The Apache Airflow community is amazing, and if what you have seen so far has made you want to explore more, I urge you to check out the official documentation and join the Apache Airflow Slack channel.
Apache Airflow Official Documentation and Blog
Astronomer
I also mentioned Astronomer, one of the key contributors within the Apache Airflow community, which also provides managed Apache Airflow services. They are well worth getting in touch with if you want to try out their own managed version of Apache Airflow.
Qubole
Whilst I didn't talk about Qubole during my session, they also provide Apache Airflow expertise, so check them out too. You can find them at https://www.qubole.com/#
There are some great third-party resources that I found whilst learning and researching how customers use Apache Airflow. Here are some of my favourites; each worked exactly as described and is well worth checking out.
- Airflow workshop from Delivery Hero - a great introductory workshop put together by Delivery Hero at PyConDE 2019.
- Running Apache Airflow locally via Docker - this is the Docker Compose file I showed and started. It mounts a local volume (dags) that you can use to upload DAGs for development (see the sketch after this list).
- Apache Airflow main page
- Machine Learning in Production using Apache Airflow
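If you want something to start from, here is a minimal Docker Compose sketch in the spirit of that file. It is an illustrative assumption rather than the exact file from the session: it runs the official apache/airflow image in standalone mode (development only) and mounts a local dags directory.

```yaml
# Minimal sketch of running Apache Airflow locally with Docker Compose.
# Assumption: this is not the exact file from the session. It uses the
# official apache/airflow image in standalone mode (SQLite-backed, for
# development only) and mounts ./dags so you can drop DAGs in locally.
version: "3.8"
services:
  airflow:
    image: apache/airflow:2.9.3
    command: standalone
    ports:
      - "8080:8080"
    environment:
      - AIRFLOW__CORE__LOAD_EXAMPLES=False
    volumes:
      - ./dags:/opt/airflow/dags
```

Run `docker compose up`, and the standalone command initialises the metadata database, creates an admin user (the password is printed in the logs), and serves the UI on http://localhost:8080.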
Feel free to get in touch if you want to know more or are having issues with any of these resources. The quickest way is to create an issue, which will notify me.