
YAML first discovery for connection form metadata #60410

Merged
amoghrajesh merged 24 commits into apache:main from astronomer:use-yaml-to-load-ui-conn-form
Feb 13, 2026

Conversation

@amoghrajesh
Contributor

@amoghrajesh amoghrajesh commented Jan 12, 2026

closes: #60404

Problem

Currently, the connection form contains custom fields, dropdowns, and validation rules. To render them properly in the Airflow UI, providers must define Python methods (get_connection_form_widgets(), get_ui_field_behaviour()) in their hook classes. This approach has a few issues:

  1. Performance: API server must import all hook classes during startup just to retrieve UI metadata, even though most hooks may not be used at once
  2. Dependencies: Requires importing heavy modules like wtforms, flask_appbuilder, and flask_babel in the API server for simple metadata
  3. Maintenance: UI metadata is scattered across Python code instead of being centralised

Goals

  • Enable providers to define connection UI metadata declaratively in provider.yaml instead of hooks
  • Allow lazy loading of UI metadata without importing hook classes
  • As an added benefit, reduce API server startup time

Approach

  1. Add a conn-fields section to each connection-types entry in provider.yaml
  2. Use standard JSON Schema for field validation
  3. Fall back to the existing Python methods in hooks when YAML metadata is missing
  4. Add ui-field-behaviour for customizing standard connection fields (host, port, etc.)

Why this approach was chosen

We adopted standard JSON Schema instead of a custom validation format for several reasons:

  1. React UI alignment: our React UI already expects JSON Schema format for field rendering
  2. Validation can leverage the existing jsonschema Python package
  3. additionalProperties: true allows future JSON Schema features without breaking changes
  4. UI metadata (label, description, sensitive) is kept separate from validation logic (schema)

Structure

conn-fields:
  field_name:
    label: "Display Label"           # Required
    description: "Help text"         # Optional
    sensitive: false                 # Optional
    schema:                          # Required
      type: string                   # Required - string|integer|boolean|number
      default: "value"               # Optional
      format: password               # Optional - password|email|url|json|date|multiline
      enum: ["val1", "val2"]         # Optional - creates dropdown
      minimum: 1                     # Optional - for numbers
      maximum: 100                   # Optional - for numbers
      pattern: "^[a-z]+$"            # Optional - regex validation
      minLength: 5                   # Optional - string length
      maxLength: 255                 # Optional - string length
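
To make the schema keywords above concrete, here is a minimal sketch of how a submitted value might be checked against a field's schema. It hand-rolls a small subset of JSON Schema using only the standard library; the PR itself points at the jsonschema package for real validation, and check_value is a hypothetical helper, not something this PR adds.

```python
import re

# JSON Schema type names mapped to Python types (only the four listed above).
_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def check_value(value, schema):
    """Return a list of problems; an empty list means the value is accepted.

    Hypothetical helper covering only a few JSON Schema keywords.
    """
    expected = _TYPES.get(schema.get("type"))
    # bool is a subclass of int in Python, so reject it for non-boolean types.
    is_bool = isinstance(value, bool)
    if expected is not None:
        if (is_bool and schema["type"] != "boolean") or not isinstance(value, expected):
            return [f"expected {schema['type']}"]
    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append("not one of the allowed values")
    if isinstance(value, (int, float)) and not is_bool:
        if "minimum" in schema and value < schema["minimum"]:
            errors.append("below minimum")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append("above maximum")
    if isinstance(value, str):
        if "minLength" in schema and len(value) < schema["minLength"]:
            errors.append("too short")
        if "maxLength" in schema and len(value) > schema["maxLength"]:
            errors.append("too long")
        # JSON Schema patterns are unanchored searches, hence re.search.
        if "pattern" in schema and re.search(schema["pattern"], value) is None:
            errors.append("does not match pattern")
    return errors


print(check_value(30, {"type": "integer", "minimum": 1, "maximum": 300}))
print(check_value("abc", {"type": "string", "pattern": "^[a-z]+$", "minLength": 5}))
```

The first call returns an empty list and the second reports that the string is too short.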

Examples

Simple example

connection-types:
  - connection-type: http
    hook-class-name: airflow.providers.http.hooks.http.HttpHook
    ui-field-behaviour:
      hidden-fields: []
      relabeling: {}
      placeholders: {}
    conn-fields:
      api_key:
        label: "API Key"
        sensitive: true
        schema:
          type: string
          format: password

Example with enum

conn-fields:
  region:
    label: "Region"
    description: "AWS region for the connection"
    schema:
      type: string
      enum: ["us-east-1", "us-west-2", "eu-west-1"]
      default: "us-east-1"

Example with integer field and range validation

conn-fields:
  timeout:
    label: "Connection Timeout"
    description: "Timeout in seconds"
    schema:
      type: integer
      minimum: 1
      maximum: 300
      default: 30

Example with boolean field

conn-fields:
  use_ssl:
    label: "Use SSL"
    description: "Enable SSL for secure connection"
    schema:
      type: boolean
      default: true
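
Once loaded, the examples above are plain nested dicts. The sketch below shows one way a consumer could read them: collecting schema defaults for pre-filling the form and listing the fields flagged as sensitive. The helper names are hypothetical, and treating a missing sensitive key as false is an assumption.

```python
# The example fields above, written as the dicts a YAML loader would produce.
conn_fields = {
    "api_key": {
        "label": "API Key",
        "sensitive": True,
        "schema": {"type": "string", "format": "password"},
    },
    "region": {
        "label": "Region",
        "description": "AWS region for the connection",
        "schema": {
            "type": "string",
            "enum": ["us-east-1", "us-west-2", "eu-west-1"],
            "default": "us-east-1",
        },
    },
    "timeout": {
        "label": "Connection Timeout",
        "description": "Timeout in seconds",
        "schema": {"type": "integer", "minimum": 1, "maximum": 300, "default": 30},
    },
    "use_ssl": {
        "label": "Use SSL",
        "description": "Enable SSL for secure connection",
        "schema": {"type": "boolean", "default": True},
    },
}


def initial_form_values(fields):
    """Collect the schema default of every field that declares one (hypothetical helper)."""
    return {
        name: spec["schema"]["default"]
        for name, spec in fields.items()
        if "default" in spec["schema"]
    }


def sensitive_field_names(fields):
    """Names of fields flagged sensitive; a missing flag is assumed to mean False."""
    return sorted(name for name, spec in fields.items() if spec.get("sensitive", False))


print(initial_form_values(conn_fields))
print(sensitive_field_names(conn_fields))
```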

Testing

I tried a couple of connections, changing provider hook names, and they all loaded fine. Example with Google Cloud Platform:

image

Backward Compatibility

  • Existing providers work unchanged - Python methods (get_connection_form_widgets(), get_ui_field_behaviour()) continue to work
  • ProvidersManager tries YAML first and falls back to the Python methods if the YAML keys are not found
  • No breaking API changes
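
The YAML-first lookup with fallback can be sketched roughly as follows. This is not the ProvidersManager code from the PR: resolve_ui_field_behaviour, the import_hook callable, and the example entries are all assumptions made for illustration. The point is only that the hook class is imported when, and only when, the YAML key is absent.

```python
def resolve_ui_field_behaviour(entry, import_hook):
    """Return (behaviour, source) for one connection-types entry.

    `entry` is one connection-types item as a dict; `import_hook` is a callable
    that imports a hook class by dotted path. Both are illustrative assumptions.
    """
    behaviour = entry.get("ui-field-behaviour")
    if behaviour is not None:
        # YAML metadata is present, so no hook import is needed.
        return behaviour, "yaml"
    hook_class = import_hook(entry["hook-class-name"])
    # The legacy dict spells its keys with underscores (e.g. hidden_fields);
    # normalising them is left out of this sketch.
    return hook_class.get_ui_field_behaviour(), "hook"


class _LegacyHook:
    """Stand-in for a provider hook that only has the Python method."""

    @classmethod
    def get_ui_field_behaviour(cls):
        return {"hidden_fields": ["schema"], "relabeling": {}, "placeholders": {}}


imported = []  # records which hook paths had to be imported


def fake_import(path):
    imported.append(path)
    return _LegacyHook


yaml_entry = {
    "connection-type": "http",
    "hook-class-name": "airflow.providers.http.hooks.http.HttpHook",
    "ui-field-behaviour": {"hidden-fields": [], "relabeling": {}, "placeholders": {}},
}
legacy_entry = {"connection-type": "legacy", "hook-class-name": "example.hooks.LegacyHook"}

print(resolve_ui_field_behaviour(yaml_entry, fake_import))
print(resolve_ui_field_behaviour(legacy_entry, fake_import))
print(imported)  # only the legacy hook path was imported
```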

Migration Path

  1. Add ui-field-behaviour and conn-fields to provider.yaml for a provider
  2. Test that UI renders correctly
  3. Keep Python methods for backward compatibility with older Airflow versions
  4. Eventually deprecate Python methods in a future major version

Was generative AI tooling used to co-author this PR?

  • Yes (Cursor IDE with Claude Sonnet 4.5 & composer-1)

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

@amoghrajesh
Contributor Author

Not quite there, but this is the direction I am heading in

@amoghrajesh amoghrajesh force-pushed the use-yaml-to-load-ui-conn-form branch from cb2041f to dc320b6 Compare January 13, 2026 10:42
@amoghrajesh amoghrajesh self-assigned this Jan 15, 2026
@amoghrajesh amoghrajesh added this to the Airflow 3.2.0 milestone Jan 15, 2026
@amoghrajesh amoghrajesh force-pushed the use-yaml-to-load-ui-conn-form branch from e8326cc to e5b4b0f Compare January 16, 2026 09:37
@amoghrajesh amoghrajesh marked this pull request as ready for review January 16, 2026 10:05
Member

@potiuk potiuk left a comment


This is pretty much how I imagined it to be.

@potiuk
Member

potiuk commented Jan 20, 2026

Copied from slack:

One thing we might consider is creating a very simple tool that produces the necessary YAML/dictionary/JSON configuration output based on what we discover by running "get_providers_info" for a specified provider.

That would be rather easy to do (reversing what we currently do when generating mock classes), and we could make it a sub-command of the airflow connections command. With such a tool implemented now, we could give it to contributors to convert all the remaining providers; they would test the tool and see if we missed anything. We could also offer it as an easy-to-use tool for authors of 3rd-party providers and add it to the instructions on how to convert your providers, linked from the deprecation warnings raised when we detect a non-converted provider.

@jscheffl
Contributor

Copied from Slack:

+1 - we can also leverage the existing code that "mocks" the classes and just "dump" the form definition from the ParamsDict into YAML...
This would also ease the "batching" for existing providers... we could even make a pre-commit hook out of this to keep it in sync as long as the old form definition is still around for compatibility
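
A very rough sketch of the conversion idea discussed above, limited to the ui-field-behaviour part: take the dict a hook's get_ui_field_behaviour() returns and reshape it into the provider.yaml section. The function name is hypothetical, no such tool exists in this PR, and the output is printed as JSON (which YAML parsers generally accept) to avoid assuming a YAML library.

```python
import json


def legacy_behaviour_to_yaml_section(legacy):
    """Reshape a get_ui_field_behaviour() dict into a ui-field-behaviour section.

    Hypothetical helper: only the key spelling changes (hidden_fields becomes
    hidden-fields); missing keys are filled with empty values.
    """
    return {
        "ui-field-behaviour": {
            "hidden-fields": list(legacy.get("hidden_fields", [])),
            "relabeling": dict(legacy.get("relabeling", {})),
            "placeholders": dict(legacy.get("placeholders", {})),
        }
    }


# Example legacy dict, made up for illustration.
legacy = {
    "hidden_fields": ["schema", "extra"],
    "relabeling": {"login": "Client ID"},
    "placeholders": {},
}

print(json.dumps(legacy_behaviour_to_yaml_section(legacy), indent=2))
```

A pre-commit check along the lines suggested above could compare this output with what is already in provider.yaml and fail when the two drift apart.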

Contributor

@jscheffl jscheffl left a comment


Some comments. And thanks for the UI screenshot posted as well. I assume the UI needs some fixes (not in this PR though), as the long labels seem to break the alignment between label and input box... it all looks very misaligned.

@jscheffl
Contributor

jscheffl commented Feb 9, 2026

LGTM, like the direction. Just the issue mentioned with TP during the last meeting: how will the API server know which provider versions are installed and fetch the appropriate YAML definition for each provider version? (This is not necessarily the latest one, and core doesn't know that if providers aren't installed.) -> diagnostic tasks to collect runtime information?

Just a clarification: this will not help with 'API server startup time'; it's more the 'first request' on the hook metadata endpoint that actually does the work (and is cached).

The list is also lazy loaded on the API server today, so the first time you open the connection UI it is loaded on the API server. This "lazy long loading" will be optimized in the future.

API server start time might be improved if we do not install all providers in the API server except plugins, but that is a larger, separate piece of work.
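
The "first request does the work, later requests hit the cache" behaviour described above can be illustrated with a generic memoised loader. This is only a sketch using functools.lru_cache; it is not how Airflow's endpoint is implemented, and hook_metadata and its return value are made up.

```python
from functools import lru_cache

load_count = 0  # counts how often the expensive discovery actually runs


@lru_cache(maxsize=None)
def hook_metadata():
    """Stand-in for the expensive provider discovery behind the endpoint."""
    global load_count
    load_count += 1
    return ("http", "postgres")  # placeholder result


# Nothing is loaded at import ("startup") time; the first call pays the cost.
first = hook_metadata()
second = hook_metadata()
print(load_count, first is second)
```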

Contributor

@jscheffl jscheffl left a comment


Re-approval - overall LGTM - just a few nits.

@amoghrajesh
Contributor Author

> LGTM, like the direction. Just the issue mentioned with TP during the last meeting: how will the API server know which provider versions are installed and fetch the appropriate YAML definition for each provider version? (This is not necessarily the latest one, and core doesn't know that if providers aren't installed.) -> diagnostic tasks to collect runtime information?
>
> Just a clarification: this will not help with 'API server startup time'; it's more the 'first request' on the hook metadata endpoint that actually does the work (and is cached).

Thanks for the clarification, very helpful. I think a better thing to say would be that it would help with the "dependency surface" over time, when providers needn't be installed on the API server.

@amoghrajesh
Contributor Author

> LGTM, like the direction. Just the issue mentioned with TP during the last meeting: how will the API server know which provider versions are installed and fetch the appropriate YAML definition for each provider version? (This is not necessarily the latest one, and core doesn't know that if providers aren't installed.) -> diagnostic tasks to collect runtime information?
> Just a clarification: this will not help with 'API server startup time'; it's more the 'first request' on the hook metadata endpoint that actually does the work (and is cached).
>
> The list is also lazy loaded on the API server today, so the first time you open the connection UI it is loaded on the API server. This "lazy long loading" will be optimized in the future.
>
> API server start time might be improved if we do not install all providers in the API server except plugins, but that is a larger, separate piece of work.

Yeah, agreed. This is a follow-up that I plan to handle.

@amoghrajesh
Contributor Author

CC: @potiuk would you want to take a look at this one? I know you were interested :)

@amoghrajesh
Contributor Author

Thanks for your thoughtful reviews, @jscheffl @jason810496 @kaxil @potiuk @pierrejeambrun! Merging this one now.

I will follow up with a couple of things:

  1. Migrate all providers to have their provider.yaml populated
  2. Come up with a solution for not having providers on the API server at all.

@amoghrajesh amoghrajesh merged commit c4a10e7 into apache:main Feb 13, 2026
129 checks passed
@amoghrajesh amoghrajesh deleted the use-yaml-to-load-ui-conn-form branch February 13, 2026 06:31
@shahar1
Contributor

shahar1 commented Feb 13, 2026

I started getting these failures in CI when running Non-DB tests: core / Non-DB-core::3.10:Always. Could it be related to the changes in this PR?

For example:
https://github.com/apache/airflow/actions/runs/21993456975/job/63548111350?pr=61848

FAILED airflow-core/tests/unit/always/test_providers_manager.py::TestWithoutCheckProviderManager::test_executors_without_check_property_should_not_called_import_string - AssertionError: assert equals failed
  set()                                                                                     set([                                                                                    
                                                                                              (                                                                                      
                                                                                                'airflow.providers.amazon.aws.auth_manager.aws_auth_manager.AwsAuthManager',         
                                                                                                'apache-airflow-providers-amazon',                                                   
                                                                                              ),                                                                                     
                                                                                              (                                                                                      
                                                                                                'airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager',                
                                                                                                'apache-airflow-providers-fab',                                                      
                                                                                              ),                                                                                     
                                                                                              (                                                                                      
                                                                                                'airflow.providers.keycloak.auth_manager.keycloak_auth_manager.KeycloakAuthManager', 
                                                                                                'apache-airflow-providers-keycloak',                                                 
                                                                                              ),                                                                                     
                                                                                            ])
================================================================== 1 failed, 213 passed, 1512 skipped, 1 xfailed, 5 warnings in 59.03s ===================================================================

@amoghrajesh
Contributor Author

Let me take a look

@jscheffl
Contributor

Besides the potential glitch ^^^ - cool! Looking forward to follow-ups!

One thing that came to my mind: do you have an idea for a version marker, so that once we drop support in providers for Airflow <3.2 we can clean up the legacy field definitions and leave only YAML behind? Something like # TODO AIRFLOW_V_3_2_PLUS - we need to clean here ?

#protm

Ratasa143 pushed a commit to Ratasa143/airflow that referenced this pull request Feb 15, 2026
Add declarative UI metadata for connection forms in provider.yaml

Load conn-fields and ui-field-behaviour from provider info instead of
importing hook classes.
@amoghrajesh
Contributor Author

> Besides the potential glitch ^^^ - cool! Looking forward to follow-ups!
>
> One thing that came to my mind: do you have an idea for a version marker, so that once we drop support in providers for Airflow <3.2 we can clean up the legacy field definitions and leave only YAML behind? Something like # TODO AIRFLOW_V_3_2_PLUS - we need to clean here ?
>
> #protm

Thanks @jscheffl, you mean clean up get_ui_field_behaviour and the other function for providers? Maybe we should add that in the hook code files?

@jscheffl
Contributor

> > Besides the potential glitch ^^^ - cool! Looking forward to follow-ups!
> > One thing that came to my mind: do you have an idea for a version marker, so that once we drop support in providers for Airflow <3.2 we can clean up the legacy field definitions and leave only YAML behind? Something like # TODO AIRFLOW_V_3_2_PLUS - we need to clean here ?
> > #protm
>
> Thanks @jscheffl, you mean clean up get_ui_field_behaviour and the other function for providers? Maybe we should add that in the hook code files?

I wanted to say:

  • We should put a marker in the code at the existing get_ui_field_behaviour() and get_connection_form_widgets() so that we do not forget to clean up once we drop support for Airflow < 3.2. But of course not before.
  • I also found airflow-core/docs/howto/connection.rst, which describes the fields, so that description should be marked as deprecated and the YAML structure should be documented there.
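
As an illustration of the marker idea, a legacy method in a hook file might carry a comment like the one below. ExampleHook and its return value are made up for this sketch, and the exact marker wording is only the suggestion from this thread, not an established convention.

```python
class ExampleHook:
    """Hypothetical provider hook, used only to show where the marker would sit."""

    # TODO AIRFLOW_V_3_2_PLUS: remove this method once the provider's minimum
    # supported Airflow version is 3.2; the same metadata is then read from the
    # ui-field-behaviour section of provider.yaml.
    @classmethod
    def get_ui_field_behaviour(cls):
        return {"hidden_fields": ["schema"], "relabeling": {}, "placeholders": {}}


print(ExampleHook.get_ui_field_behaviour())
```

A uniform comment string keeps the legacy definitions easy to find with a plain text search when the cleanup becomes due.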

choo121600 pushed a commit to choo121600/airflow that referenced this pull request Feb 22, 2026
Add declarative UI metadata for connection forms in provider.yaml

Load conn-fields and ui-field-behaviour from provider info instead of
importing hook classes.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[SPIKE] Work out a way for API server to load providers declaratively

8 participants