@uplsh580 uplsh580 commented Jan 25, 2026

Related Issue

Issue: #61037

Description

This PR adds per-bundle DAG processor deployments, allowing users to create a separate Kubernetes Deployment for each DAG bundle defined in dagBundleConfigList. This enables independent resource isolation, scaling, and configuration per bundle.

Motivation

Currently, when multiple DAG bundles are configured, all bundles are processed by a single DAG processor deployment. This limits the ability to:

  • Scale bundles independently based on their workload
  • Apply bundle-specific resource configurations
  • Isolate bundle processing failures
  • Use bundle-specific node selectors, affinities, or tolerations

Changes

Configuration (values.yaml)

Added a new dagProcessor.deployPerBundle section:

dagProcessor:
  deployPerBundle:
    enabled: false  # Enable per-bundle deployments
    args: ["bash", "-c", "exec airflow dag-processor --bundle-name {{ bundleName }}"]
    bundleOverrides: {}  # Per-bundle configuration overrides
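
As a rough illustration of the default args template above, the fragment below shows what the container args would presumably look like for a bundle named bundle1 (the name is borrowed from the usage example later in this description). It assumes the {{ bundleName }} placeholder is substituted literally and nothing else in the list changes; the actual rendered manifest may differ.

```yaml
# Sketch only: effective args for a bundle named "bundle1",
# assuming {{ bundleName }} is replaced verbatim by the bundle name.
args: ["bash", "-c", "exec airflow dag-processor --bundle-name bundle1"]
```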

Features

  1. Per-bundle Deployments: When deployPerBundle.enabled is true, creates a separate Deployment for each bundle in dagBundleConfigList
  2. Bundle-specific Args: Supports templated args with a {{ bundleName }} placeholder, which is replaced with the actual bundle name in each deployment
  3. Bundle Overrides: Allows per-bundle configuration overrides via bundleOverrides map:
    • Resources (CPU, memory)
    • Replicas
    • Node selectors, affinities, tolerations
    • Environment variables
    • Pod disruption budgets
    • And other deployment settings
  4. Per-bundle PodDisruptionBudget: Creates separate PDBs for each bundle when enabled
  5. Backward Compatibility: When deployPerBundle.enabled is false, maintains existing single deployment behavior
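
To make the scheduling-related overrides more concrete, here is a minimal sketch of pinning one bundle to dedicated nodes. The key names nodeSelector and tolerations are assumed to mirror the chart's existing dagProcessor settings, and the label and taint values are hypothetical; check values.schema.json in this PR for the keys that are actually accepted.

```yaml
dagProcessor:
  deployPerBundle:
    enabled: true
    bundleOverrides:
      bundle1:
        # Assumed key names, mirroring the existing dagProcessor settings
        nodeSelector:
          workload: dag-parsing        # hypothetical node label
        tolerations:
          - key: "dedicated"           # hypothetical taint key
            operator: "Equal"
            value: "dag-parsing"       # hypothetical taint value
            effect: "NoSchedule"
```

Bundles without an entry in bundleOverrides would presumably fall back to the shared dagProcessor settings.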

Implementation Details

  • Refactored deployment logic into a reusable helper template dag-processor.deployment
  • Refactored PDB logic into a reusable helper template dag-processor.poddisruptionbudget
  • Supports per-bundle enable/disable via bundleOverrides[bundleName].enabled
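
The per-bundle enable/disable switch could be used roughly as follows. This is a sketch that assumes bundle1 and bundle2 are both defined in dagBundleConfigList, and that a disabled bundle simply gets no Deployment (and no PodDisruptionBudget) rendered for it.

```yaml
dagProcessor:
  deployPerBundle:
    enabled: true
    bundleOverrides:
      bundle2:
        enabled: false   # skip the per-bundle Deployment for bundle2
```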

Files Changed

  • chart/templates/dag-processor/dag-processor-deployment.yaml: Added per-bundle deployment logic
  • chart/templates/dag-processor/dag-processor-poddisruptionbudget.yaml: Added per-bundle PDB logic
  • chart/values.yaml: Added deployPerBundle configuration section
  • chart/values.schema.json: Added schema validation for deployPerBundle
  • helm-tests/tests/helm_tests/airflow_core/test_dag_processor_per_bundle.py: Added comprehensive test cases

Usage Example

dagProcessor:
  enabled: true

  dagBundleConfigList:
    - name: bundle1
      classpath: "airflow.providers.git.bundles.git.GitDagBundle"
      kwargs:
        git_conn_id: "GITHUB__SAMPLE"
        subdir: "dags"
        tracking_ref: "main"
        refresh_interval: 60
    - name: bundle2
      classpath: "airflow.providers.git.bundles.git.GitDagBundle"
      kwargs:
        git_conn_id: "GITHUB__SAMPLE2"
        subdir: "dags"
        tracking_ref: "main"
        refresh_interval: 60

  deployPerBundle:
    enabled: true
    bundleOverrides:
      bundle1:
        replicas: 3
        resources:
          requests:
            memory: "2Gi"
            cpu: "1000m"
          limits:
            memory: "4Gi"
            cpu: "2000m"
        podDisruptionBudget:
          config:
            minAvailable: 2
            maxUnavailable: ~
      bundle2:
        replicas: 1
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
        podDisruptionBudget:
          config:
            minAvailable: 1
            maxUnavailable: ~

This will create:

  • {release-name}-dag-processor-bundle1 deployment with 3 replicas, requesting 2Gi memory and 1000m CPU, limited to 4Gi memory and 2000m CPU
  • {release-name}-dag-processor-bundle2 deployment with 1 replica, requesting 1Gi memory and 500m CPU, with no limits set in the override
  • Separate PodDisruptionBudgets for each bundle (if enabled)

Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)
  • Cursor


@boring-cyborg boring-cyborg bot added the area:helm-chart Airflow Helm Chart label Jan 25, 2026

uplsh580 commented Jan 25, 2026

Validation - deployPerBundle

[image]

bundle1

[image]

bundle2

[image]

PodDisruptionBudgets

[image]


uplsh580 commented Jan 25, 2026

Validation - As-Is (Single Deployment)

[image]

DAG processor log

[image]

PodDisruptionBudgets

[image]

@uplsh580 uplsh580 force-pushed the chart/deployPerBundle branch 2 times, most recently from bc31c60 to 2531585 Compare January 25, 2026 16:13