Skip to content

Migrate Amazon Provider Package's SecretsManagerBackend's full_url_mode=False implementation. #26571

@dwreeves

Description

@dwreeves

Objective

I am following up on all the changes I've made in PR #25432 and which were originally discussed in issue #25104.

The objective of the deprecations introduced in #25432 is to flag and remove "odd" behaviors in the SecretsManagerBackend.

The objective of this issue being opened is to discuss them, and hopefully reach a consensus on how to move forward implementing the changes.

I realize that a lot of the changes I made and their philosophy were under-discussed, so I will place the discussion here.

What does it mean for a behavior to be "odd"?

You can think of the behaviors of SecretsManagerBackend, and which secret encodings it supports, as a Venn diagram.

Ideally, SecretsManagerBackend supports at least everything the base API supports. This is a pretty straightforward "principle of least astonishment" requirement.

For example, it would be "astonishing" if copy+pasting a secret that works with the base API did not work in the SecretsManagerBackend.

To be clear, it would also be "astonishing" if the reverse were not true-- i.e. copy+pasting a valid secret from SecretsManagerBackend doesn't work with, say, environment variables. That said, adding new functionality is less astonishing than when a subclass doesn't inherit 100% of the supported behaviors of what it is subclassing. So although adding support for new secret encodings is arguably not desirable (all else equal), I think we can all agree it's not as bad as the reverse.

Examples

I will cover two examples where we can see the "Venn diagram" nature of the secrets backend, and how some behaviors that are supported in one implementation are not supported in another:

Example 1

Imagine the following environment variable secret:

export AIRFLOW_CONN_POSTGRES_DEFAULT='{
  "conn_type": "postgres",
  "login": "usr",
  "password": "not%url@encoded",
  "host": "myhost"
}'

Prior to #25432, this was not a secret that worked with the SecretsManagerBackend, even though it did work with base Airflow's EnvironmentVariablesBackend(as of 2.3.0) due to the values not being URL-encoded.

Additionally, the EnvironmentVariablesBackend is smart enough to choose whether a secret should be treated as a JSON or a URI without having to be explicitly told. In a sense, this is also an incompatibility-- why should the EnvironmentVariablesBackend be "smarter" than the SecretsManagerBackend when it comes to discerning JSONs from URIs, and supporting both at the same time rather than needing secrets to be always one type of serialization?

Example 2

Imagine the following environment variable secret:

export AIRFLOW_CONN_POSTGRES_DEFAULT="{
  'conn_type': 'postgres',
  'user': 'usr',
  'pass': 'is%20url%20encoded',
  'host': 'myhost'
}"

This is not a valid secret in Airflow's base EnvironmentVariablesBackend implementation, although it is a valid secret in SecretsManagerBackend.

There are two things that make it invalid in the EnvironmentVariablesBackend but valid in SecretsManagerBackend:

  • ast.litera_eval in SecretsManagerBackend means that a Python dict repr is allowed to be read in.
  • user and pass are invalid field names in the base API; these should be login and password, respectively. But the _standardize_secret_keys() method in the SecretsManagerBackend implementation makes it valid.

Additionally, note that this secret also parses differently in the SecretsManagerBackend than the EnvironmentVariablesBackend: the password "is%20url%20encoded" renders as "is url encoded" in the SecretsManagerBackend, but is left untouched by the base API when using a JSON.

List of odd behaviors

Prior to #25432, the following behaviors were a part of the SecretsManagerBackend specification that I would consider "odd" because they are not part of the base API implementation:

  1. full_url_mode=False still required URL-encoded parameters for JSON values.
  2. ast.literal_eval was used instead of json.dumps, which means that the SecretsManagerBackend on full_url_mode=False was supporting Python dict reprs and other non-JSONs.
  3. The Airflow config required setting full_url_mode=False to determine what is a JSON or URI.
  4. get_conn_value() always must return a URI.
  5. The SecretsManagerBackend allowed for atypical / flexible field names (such as user instead of login) via the _standardize_secret_keys() method.

We also introduced a new odd behavior in order to assist with the migration effort, which is:

  1. New kwarg called are_secret_values_urlencoded to support secrets whose encodings are "non-idempotent".

In the below sections, I discuss each behavior in more detail, and I've added my own opinions about how odd these behaiors are and the estimated adverse impact of removing the behaviors.

Behavior 1: URL-encoding JSON values

Current Status Oddness Estimated Adverse Impact of Removal
Deprecated High High

This was the original behavior that frustrated me and motivated me to open issues + submit PRs.

With the "idempotency" check, I've done my best to smooth out the transition away from URL-encoded JSON values.

The requirement is now mostly removed, to the extent that the removal of this behavior can be backwards compatible and as smooth as possible:

  • Users whose secrets do not contain special characters will not have even noticed a change took place.
  • Users who do have secrets with special characters hopefully are checking their logs and will have seen a deprecation warning telling them to remove the URL-encoding.
  • In a select few rare cases where a user has a secret with a "non-idempotent" encoding, the user will have to reconfigure their backend_kwargs to have are_secret_values_urlencoded set.

I will admit that I was surprised at how smooth we could make the developer experience around migrating this behavior for the majority of use cases.

When you consider...

  • How smooth migrating is (just remove the URL-encoding! In most cases you don't need to do anything else!), and
  • How disruptive full removal of URL-encoding is (to people who have not migrated yet),

.. it makes me almost want to hold off on fully removing this behavior for a little while longer, just to make sure developers read their logs and see the deprecation warning.

Behavior 2: ast.literal_eval for deserializing JSON secrets

Current Status Oddness Estimated Adverse Impact of Removal
Deprecated High Low

It is hard to feel bad for anyone who is adversely impacted by this removal:

  • This behavior should never have been introduced
  • This behavior was never a documented behavior
  • A reasonable and educated user will have known better than to rely on non-JSONs.

Providing a DeprecationWarning for this behavior was already going above and beyond, and we can definitely remove this soon.

Behavior 3: full_url_mode=False is required for JSON secrets

Current Status Oddness Estimated Adverse Impact of Removal
Active Medium Low

This behavior is odd because the base API does not require such a thing-- whether it is a JSON or a URI can be inferred by checking whether the first character of the string is {.

Because it is possible to modify this behavior without introducing breaking changes, moving from lack of optionality for the full_url_mode kwarg can be considered a feature addition.

Ultimately, we would want to switch from full_url_mode: bool = True to full_url_mode: bool | None = None.

In the proposed implementation, when full_url_mode is None, we just use whether the value starts with { to check if it is a JSON. Only when full_url_mode is a bool would we explicitly raise errors e.g. if a JSON was given when full_url_mode=True, or a URI was given when full_url_mode=False.

Behavior 4: get_conn_value() must return URI

Current Status Oddness Estimated Adverse Impact of Removal
Deprecated + Active (until at least October 11th) Low Medium

The idea that the callback invoked by get_connection() (now called get_conn_value(), but previously called get_conn_uri()) can return a JSON is a new Airflow 2.3.0 behavior.

This behavior cannot change until at least October 11th, because it is required for 2.2.0 backwards compatibility. Via Airflow's README.md:

[...] by default we upgrade the minimum version of Airflow supported by providers to 2.3.0 in the first Provider's release after 11th of October 2022 (11th of October 2021 is the date when the first PATCHLEVEL of 2.2 (2.2.0) has been released.

Changing this behavior after October 11th is just a matter of whether maintainers are OK with introduce a breaking change to the 2.2.x folks who are relying on JSON secrets.

Note that right now, get_conn_value() is avoided entirely when full_url_mode=False and get_connection() is called. So although there is a deprecation warning, it is almost never hit.

if self.full_url_mode:
    return self._get_secret(self.connections_prefix, conn_id)
else:
    warnings.warn(
        f'In future versions, `{type(self).__name__}.get_conn_value` will return a JSON string when'
        ' full_url_mode is False, not a URI.',
        DeprecationWarning,
    )

Behavior 5: Flexible field names via _standardize_secret_keys()

Current Status Oddness Estimated Adverse Impact of Removal
Active Medium High

This is one of those things that is very hard to remove. Removing it can be quite disruptive!

It is also a low priority to remove because unlike some other behaviors, it does not detract from SecretsManagerBackend being a "strict superset" with the base API.

Maybe it will just be the case that SecretsManagerBackend has this incompatibility with the base API going forward indefinitely?

Even still, we should consider the two following proposals:

  1. The default field name of user should probably be switched to login, both in the dict[str, list[str]] used to implement the find+replace, and also in the documentation. I do not forsee any issues with doing this.
  2. Remove documentation for this feature if we think it is "odd" enough to warrant discouraging users from seeking it out.

I think # 1 should be uncontroversial, but # 2 may be more controversial. I do not want to detract from my other points by staking out too firm an opinion on this one, so the best solution may simply be to not touch this for now. In fact, not touching this is exactly what I did with the original PR.

Behavior 6: are_secret_values_urlencoded kwarg

Current Status Oddness Estimated Adverse Impact of Removal
Pending Deprecation Medium Medium

In the original discussion #25104, @potiuk told me to add something like this. I tried my best to avoid users needing to do this, hence the "idempotency" check. So only a few users actually need to specify this to assist in the migration of their secrets.

This was introduced as a "pending" deprecation because frankly, it is an odd behavior to have ever been URL-encoding these JSON values, and it only exists as a necessity to aid in migrating secrets. In our ideal end state, this doesn't exist.

Eventually when it comes time, removing this will not be all that disruptive:

  • This only impacts users who have full_url_mode=False
  • This only impacts users with secrets that have non-idempotent encodings.
  • are_secret_values_urlencoded should be set to False. Users should never be manually setting to True!

So we're talking about a small percent of a small minority of users who will ever see or need to set this are_secret_values_urlencoded kwarg. And even then, they should have set are_secret_values_urlencoded to False to assist in migrating.

Proposal for Next Steps

All three steps require breaking changes.

Proposed Step 1

  • Remove: Behavior 2: ast.literal_eval for deserializing JSON secrets
  • Remove: Behavior 3: full_url_mode=False is required for JSON secrets
  • Remove: Behavior 4: get_conn_value() must return URI
  • Note: Must wait until at least October 11th!

Right now the code is frankly a mess. I take some blame for that, as I did introduce the mess. But the mess is all inservice of backwards compatibility.

Removing Behavior 4 vastly simplifies the code, and means we don't need to continue overriding the get_connection() method.

Removing Behavior 2 also simplifies the code, and is a fairly straightforward change.

Removing Behavior 3 is fully backwards compatible (so no deprecation warnings required) and provides a much nicer user experience overall.

The main thing blocking "Proposed Step 1" is the requirement that 2.2.x be supported until at least October 11th.

Alternative to Proposed Step 1

It is possible to remove Behavior 2 and Behavior 3 without removing Behavior 4, and do so in a way that keeps 2.2.x backwards compatibility.

It will still however leave a mess of the code.

I am not sure how eager the Amazon Provider Package maintainers are to keep backwards compatibility here. Between the 1 year window, plus the fact that this can only possibly impact people using both the SecretsManagerBackend and who have full_url_mode=False turned on, it seems like not an incredibly huge deal to just scrap support for 2.2.x here when the time comes. But it is not appropriate for me to decide this, so I must be clear and say that we can start trimming away some of the odd behaviors without breaking Airflow 2.2.x backwards compatibility, and that the main benefit of breaking backwards compatibility is the source code becoming way simpler.

Proposed Step 2

  • Remove: Behavior 1: URL-encoding JSON values
  • Switch status from Pending Deprecation to Deprecation: Behavior 6: are_secret_values_urlencoded kwarg

Personally, I don't think we should rush on this. The reason I think we should take our time here is because the current way this works is surprisingly non-disruptive (i.e. no config changes required to migrate for most users), but fully removing the behavior may be pretty disruptive, especially to people who don't read their logs carefully.

Alternative to Proposed Step 2

The alternative to this step is to combine this step with step 1, instead of holding off for a future date.

The main arguments in favor of combining with step 1 are:

  • Reducing the number of version bumps that introduce breaking changes by simply combining all breaking changes into one step. It's unclear how many users even use full_url_mode and it is arguable that we're being too delicate with what was arguably a semi-experimental and odd feature to begin with; it's only become less experimental by the stroke of luck that Airflow 2.3.0 supports JSON-encoded secrets in the base API.
  • A sort of "rip the BandAid" ethos, or a "get it done and over with" ethos. I don't think this is very nice to users, but I see the appeal of not keeping odd code around for a while.

Proposed Step 3

  • Remove: Behavior 6: are_secret_values_urlencoded kwarg

Once URL-encoding is no longer happening for JSON secrets, and all non-idempotent secrets have been cast or explicitly handled, and we've deprecated everything appropriately, we can finally remove are_secret_values_urlencoded.

Conclusion

The original deprecations introduced were under-discussed, but hopefully now you both know where I was coming from, and also agree with the changes I made.

If you disagree with the deprecations that I introduced, I would also like to hear about that and why, and we can see about rolling any of them back.

Please let me know what you think about the proposed steps for changes to the code base.

Please also let me know what you think an appropriate schedule is for introducing the changes, and whether you think I should consider one of the alternatives (both considered and otherwise) to the steps I outlined in the penultimate section.

Other stuff

Use case/motivation

(See above)

Related issues

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions