Skip to content

Conversation

@vandonr-amz
Copy link
Contributor

Boto expects key/value lists in a weird format that's not super user friendly. I think we should offer a pleasant interface to the users, so we should accept regular python dictionaries and do the transformation ourselves when possible. I added a method to this effect, and plugged it in where I could see this pattern.

@vandonr-amz vandonr-amz requested a review from eladkal as a code owner January 9, 2023 21:43
@boring-cyborg boring-cyborg bot added area:providers provider:amazon AWS/Amazon - related issues labels Jan 9, 2023
@Taragolis
Copy link
Contributor

@vandonr-amz One possible side effect of this change - I think Airflow still can't render dictionaries keys if it templated value: #18938

@vandonr-amz
Copy link
Contributor Author

vandonr-amz commented Jan 10, 2023

@vandonr-amz One possible side effect of this change - I think Airflow still can't render dictionaries keys if it templated value: #18938

wow that PR is an adventure... sad to see it doesn't have a happy ending

@vandonr-amz
Copy link
Contributor Author

@vandonr-amz One possible side effect of this change - I think Airflow still can't render dictionaries keys if it templated value: #18938

However, I don't really see why you'd want to template dictionary keys. Templating values makes sense, but keys should be pretty much constant

@Taragolis
Copy link
Contributor

However, I don't really see why you'd want to template dictionary keys. Templating values makes sense, but keys should be pretty much constant

I don't. My opinion that AWS resource Tag name (key) should be consistent or all idea would be broken and turned into the mess.

However we don't know user cases, so at least we should add info that behaviour with dict-style could be different rather then use native AWS API style (this is not boto3 behaviour this how what AWS API expected this parameters)

@Taragolis
Copy link
Contributor

Also I think we need to keep compatibility with native API parts. I mean it would be nice if function check (btw i think you forget to add it into the PR) if value are dictionary then convert it to appropriate format, if not.. well then return as is or check if this has appropriate format.

Actually I also work on the same issue in the past, I don't remember when I didn't finish it, may be because I could't cover all cases or something like that. I've prepare GIST based on some past findings, it is not concern that we should do by this way, just an example. It use additional ability of Airflow render particular attributes from inner object

@vandonr-amz
Copy link
Contributor Author

oh so you mean we should always accept either preformated or as a dict. Yeah why not, I can do that.

@vandonr-amz
Copy link
Contributor Author

ok, made it accept a list or dict where meaningful, and added the missing file.
It's also shipping a clandestine commit for now because it's stacked on top of #28837 which has a higher priority.
It should look cleaner once the other commit passes and I can update the base.

ashb
ashb previously requested changes Jan 11, 2023
Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

  • Please separate out the SageMakerCreateExperimentOperator changes Edit: ah right, okay.

  • Can this be more-universal across the provider? There are likely other operators that have the same thing, ISTR there are some EC2 operators.

  • Do we show a concrete example of the two styles anywhere in the docs?

    I'm not sure where the best place might be? It almost feels like it should be in a "top level" section like https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/index.html ? WDYT?

@vandonr-amz
Copy link
Contributor Author

vandonr-amz commented Jan 13, 2023

  • Can this be more-universal across the provider? There are likely other operators that have the same thing, ISTR there are some EC2 operators.

I didn't find other examples of this pattern in the AWS package, RDS, Sagemaker and S3 are the only 3 that popped up, but also it's a bit hard to grep for this.
EC2 apparently only has 2 operators, to start/stop an instance ? And no tag formatting there.

No I don't think we do. It's true that it might be good to have it somewhere, but also, I'd hope for new users to use the dictionary approach, which is a lot more intuitive.
Passing an already formatted list is not marked as so but is almost obsolete, I'm not sure we want to slap something on the front page advertising the complex approach when there is a simple one.

I'd rather copy-paste an example in every docstring where necessary.

@Taragolis
Copy link
Contributor

I didn't find other examples of this pattern in the AWS package, RDS, Sagemaker and S3 are the only 3 that popped up, but also it's a bit hard to grep for this.

I know a lot of samples where same approaches uses... but all of them inside of huge object structure.

For example in Batch client:

SubmitJob

  1. Environment variables in containerOverrides, EksContainerOverride expected list of dict with "name" and "value" keys
  2. resourceRequirements expected list of dict with "type" and "value" keys

RegisterJobDefinition

  1. environment expected list of dict with "name" and "value" keys
  2. secrets expected list of dict with "name" and "valueFrom" keys

So end users might also use this function for produce this objects (I don't think we should help them with this) by just provide dict and appropriate mapping for key_label and value_label.

def format_tags(source: Any, *, key_label: str = "Key", value_label: str = "Value"):

Some almost same function I use in the past in Airflow 1.10.x:

def aws_batch_env(d):
    return [{"name": k, "value": v} for k, v in d.items()]

...
    batch_env = {
        "VAR_1": "{{ ti.xcom_pull(task_ids='foo') }}",
        "VAR_2": "{{ ti.xcom_pull(task_ids='bar') }}",
        ...
        "VAR_n": "{{ ti.xcom_pull(task_ids='spam') }}",
    }

    batch_task = BatchOperator(
        ...
        overrides={
            "environment": aws_batch_env(batch_env)
        },
    )

@ashb ashb dismissed their stale review January 15, 2023 14:53

Fixed

@vandonr-amz
Copy link
Contributor Author

@eladkal @uranusjr anything else you'd like changed with that PR ? Otherwise we could merge it 😇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:providers provider:amazon AWS/Amazon - related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants