- Building and operating highly reliable Apache Airflow platforms at scale on Astronomer (Astro Cloud & Private Cloud)
- Monitoring and supporting 500+ Kubernetes clusters across AWS, GCP, and Azure
- Designing proactive monitoring DAGs and reliability automation using Airflow, Kubernetes, and cloud-native services
- Leading and mentoring a 10-member Airflow Reliability Engineering team
- Apache Airflow reliability, scalability, and performance tuning
- Kubernetes-based data platforms and cloud-native DataOps solutions
- Airflow DAG design best practices, observability, and event-driven workflows
- Open-source contributions around Airflow, Kubernetes, and data orchestration
- Advanced Airflow 3.x adoption patterns and event-based architectures
- Large-scale multi-tenant Airflow optimizations
- Improving cost efficiency and resource isolation in Kubernetes-based data platforms
- Event-driven Airflow architectures and asset-based workflows
- Advanced Kubernetes networking, autoscaling, and observability
- Data reliability engineering patterns using Snowflake, dbt, and modern warehouses
- Apache Airflow (1.10.x to 3.x), MWAA, and Astronomer
- Airflow upgrades, migrations, and executor selection
- Kubernetes (EKS), Docker, and cloud-native monitoring
- Debugging DAG latency, zombie tasks, scheduler issues, and DB bottlenecks
- Designing production-grade data orchestration platforms
- I've helped hundreds of teams worldwide keep their Airflow pipelines running reliably, often before they even knew something was about to break



