Add GlacierUploadArchiveOperator #26652
Conversation
Force-pushed from 62cad95 to a205040
Review comment on airflow/providers/amazon/aws/example_dags/example_glacier_to_gcs.py (outdated, resolved)
Force-pushed from 30f5a91 to a2564ff
Force-pushed from a2564ff to a2e3956
Force-pushed from a2e3956 to ab438c4
Force-pushed from 96abca8 to 2ced174
Force-pushed from 2ced174 to bdadcbf
    in which case Amazon S3 Glacier uses the AWS account ID associated with
    the credentials used to sign the request
:param vault_name: The name of the vault
:param body: A bytes or seekable file-like object. The data to upload.
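For context, the parameters quoted above map onto boto3's Glacier `upload_archive` call; below is a minimal sketch of that underlying call, assuming a hypothetical vault name and file path (passing `accountId="-"` tells Glacier to use the account whose credentials sign the request):

```python
import boto3

# Minimal sketch of the underlying Glacier API call the operator wraps;
# the vault name and file path are hypothetical placeholders.
client = boto3.client("glacier")

with open("/tmp/backup.tar.gz", "rb") as archive:
    response = client.upload_archive(
        accountId="-",  # "-" means the account that signed the request
        vaultName="my-vault",
        archiveDescription="nightly backup",
        body=archive,  # bytes or a seekable file-like object
    )

print(response["archiveId"])
```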
Noting that my concern from #22758 (comment) also applies here. The question of where the bytes come from is very important. If they come from XCom, then body probably needs to be templated (but I guess we don't really want users to push it to XCom?); if they come from somewhere else, then a transfer operator is probably more suitable.
To me it seems like the only use case for this operator is when the body is hard-coded, and I'm not sure how useful that is?
That said... if people find this useful as is, I'm OK with it.
If the source (where the data is coming from) is known, then it makes sense to have a transfer operator, but my purpose here is to turn a frequently used Python function in DAGs into a reusable operator. It would be useful when I want to upload local data to Glacier, or data from any source storage for which we do not have a native API (download the data, then upload it).
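For illustration, a hedged sketch of the usage the author describes, written as an example DAG; the operator's import path and parameters are inferred from the PR title and the docstring excerpt above, and the DAG id, vault name, and payload are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glacier import GlacierUploadArchiveOperator

# Hypothetical example DAG; the import path assumes the operator lands
# alongside the existing Glacier operators in the Amazon provider.
with DAG(
    dag_id="example_glacier_upload_archive",
    start_date=datetime(2022, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    upload_archive = GlacierUploadArchiveOperator(
        task_id="upload_archive_to_glacier",
        vault_name="my-vault",
        body=b"hard-coded payload bytes",  # the hard-coded-body case discussed above
    )
```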
That is still a transfer operator: LocalTo*; we have several of those in the project.
The issue is that operators don't share a disk, so the only valid case for uploading from disk without a transfer operator is when the task you are running does source-to-destination in a single task (usually a PythonOperator).
As I said, I'm OK with adding this, but I haven't seen any real production use case for this kind of operator. If this is one and it can be shown in an example DAG, I'd be happy to learn something new :)
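To make the single-task pattern the reviewer describes concrete, here is a hedged sketch in which one PythonOperator both fetches the data and uploads it to Glacier, so no disk needs to be shared between tasks; the source URL, vault name, and callable are hypothetical:

```python
import urllib.request

import boto3
from airflow.operators.python import PythonOperator


def fetch_and_upload_to_glacier():
    # Hypothetical source: pull the bytes from storage that has no native
    # Airflow transfer operator, then upload them within the same task.
    data = urllib.request.urlopen("https://example.com/export.tar.gz").read()
    boto3.client("glacier").upload_archive(
        accountId="-", vaultName="my-vault", body=data
    )


# (inside a DAG definition)
fetch_and_upload = PythonOperator(
    task_id="fetch_and_upload",
    python_callable=fetch_and_upload_to_glacier,
)
```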
ferruzzi left a comment:
Sorry for the late reply, just left a couple of suggestions.
Add GlacierUploadArchiveOperator
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in newsfragments.