Cannot use templated to/from datetime fields in S3DeleteObjectsOperator #42363

@mgorsk1

Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==8.28.0

Apache Airflow version

2.9.1

Operating System

Debian GNU/Linux 11 (bullseye)

Deployment

Other 3rd-party Helm chart

Deployment details

No response

What happened

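S3DeleteObjectsOperator fails when to_datetime or from_datetime is set to a templated string (for example, an Airflow macro). The rendered value stays a string and is compared directly with each object's LastModified datetime: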

  File "/opt/venv/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 465, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 432, in _execute_callable
    return execute_callable(context=context, **execute_callable_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/airflow/models/baseoperator.py", line 400, in wrapper
    return func(self, *args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/operators/s3.py", line 535, in execute
    keys = self.keys or s3_hook.list_keys(
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 132, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 869, in list_keys
    return self._list_key_object_filter(keys, from_datetime, to_datetime)
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 670, in _list_key_object_filter
    return [k["Key"] for k in keys if _is_in_period(k["LastModified"])]
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 670, in <listcomp>
    return [k["Key"] for k in keys if _is_in_period(k["LastModified"])]
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 666, in _is_in_period
    if to_datetime is not None and input_date > to_datetime:
TypeError: '>' not supported between instances of 'datetime.datetime' and 'str'
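
The failure can be reproduced without Airflow or S3. The sketch below is a simplified stand-in for the comparison in the last traceback frame, not the provider's actual code: the function name `is_after` and the sample values are assumptions chosen for illustration. It shows that a rendered template is a plain string, and that Python refuses to order a datetime against a string.

```python
from datetime import datetime, timezone

# Stand-in for an S3 object's LastModified value (a timezone-aware datetime).
last_modified = datetime(2024, 9, 1, 12, 0, tzinfo=timezone.utc)

# Stand-in for what "{{ macros.ds_add(ds, -30) }}" looks like after rendering:
# a plain "YYYY-MM-DD" string, not a datetime.
rendered_to_datetime = "2024-08-21"


def is_after(input_date, to_datetime):
    """Hypothetical, simplified version of the comparison in the traceback."""
    return to_datetime is not None and input_date > to_datetime


try:
    is_after(last_modified, rendered_to_datetime)
    error_message = None
except TypeError as exc:
    # Same message as in the traceback above.
    error_message = str(exc)

print(error_message)
```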

What you think should happen instead

The to_datetime and from_datetime fields should accept templated values, with the rendered strings converted to datetime objects before they are compared.
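
One possible shape for that conversion is sketched below. This is only an illustration under stated assumptions, not the provider's implementation: the helper name `coerce_to_datetime` is hypothetical, ISO 8601 input is assumed, and naive values are assumed to be UTC so they can be compared with timezone-aware LastModified values.

```python
from datetime import datetime, timezone
from typing import Optional, Union


def coerce_to_datetime(value: Union[str, datetime, None]) -> Optional[datetime]:
    """Hypothetical helper: accept None, a datetime, or an ISO 8601 string."""
    if value is None:
        return None
    if isinstance(value, str):
        # Handles "YYYY-MM-DD" as well as "YYYY-MM-DD HH:MM:SS" style strings.
        value = datetime.fromisoformat(value)
    if value.tzinfo is None:
        # Assumption: treat naive values as UTC.
        value = value.replace(tzinfo=timezone.utc)
    return value


# A rendered macro value can now be compared with an aware datetime.
last_modified = datetime(2024, 9, 1, 12, 0, tzinfo=timezone.utc)
to_datetime = coerce_to_datetime("2024-08-21")
is_newer_than_cutoff = last_modified > to_datetime
```

Applying such a conversion after template rendering and before the filter runs would let both plain datetime objects and templated strings work.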

How to reproduce

Create a DAG with the following task:

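from airflow.providers.amazon.aws.operators.s3 import S3DeleteObjectsOperator

# Rendered at runtime to a date string (the logical date minus 30 days).
to_datetime = "{{ macros.ds_add(ds, -30) }}"

task = S3DeleteObjectsOperator(
    task_id='delete_old_logs',
    bucket='mybucket',
    prefix='logs/',
    to_datetime=to_datetime,
    aws_conn_id=aws_conn_id,  # any valid AWS connection ID
)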

Anything else

It works if I pass to_datetime (or from_datetime) as a datetime object, e.g. datetime.now() + timedelta(days=-30). It is therefore confusing that these fields are listed in template_fields.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
