Description
Apache Airflow version
2.3.3
What happened
There is a bug when using the S3Hook to download a file from S3 with extra security parameters such as an SSECustomerKey.
The download_file function fetches extra_args from self, where the encryption parameters can be specified as a dict.
But download_file calls get_key(), which does not pass these extra_args when calling the load() method; this results in a botocore.exceptions.ClientError: An error occurred (400) when calling the HeadObject operation: Bad Request. error.
This could be fixed as follows: according to the boto3 documentation, load() calls S3.Client.head_object(), which can handle **kwargs and accepts all of the arguments below:
```python
response = client.head_object(
    Bucket='string',
    IfMatch='string',
    IfModifiedSince=datetime(2015, 1, 1),
    IfNoneMatch='string',
    IfUnmodifiedSince=datetime(2015, 1, 1),
    Key='string',
    Range='string',
    VersionId='string',
    SSECustomerAlgorithm='string',
    SSECustomerKey='string',
    RequestPayer='requester',
    PartNumber=123,
    ExpectedBucketOwner='string',
    ChecksumMode='ENABLED'
)
```
An easy fix would be to pass the extra_args through to get_key(), and then call load(**self.extra_args).
What you think should happen instead
The extra_args should be used in get_key() and therefore in obj.load().
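A minimal sketch of the proposed change, using a stub in place of the boto3 Object so it runs without AWS credentials (the stub and the PatchedHook class are hypothetical; only the pattern of forwarding self.extra_args into load() reflects the fix proposed above):

```python
class StubObject:
    """Stands in for a boto3 s3.Object; records what load() receives."""
    def __init__(self):
        self.load_kwargs = None

    def load(self, **kwargs):
        self.load_kwargs = kwargs


class PatchedHook:
    """Sketch of an S3Hook whose get_key() forwards extra_args to load()."""
    def __init__(self, extra_args=None):
        self.extra_args = extra_args or {}

    def get_key(self, obj):
        # Proposed fix: forward the security parameters to load(), which
        # boto3 would in turn pass on to S3.Client.head_object().
        obj.load(**self.extra_args)
        return obj


extra_args = {
    'SSECustomerAlgorithm': 'AES256',
    'SSECustomerKey': 'YOUR_SSE_C_KEY',
}
hook = PatchedHook(extra_args=extra_args)
obj = hook.get_key(StubObject())
```

With this change, the HeadObject request issued by load() would carry the same SSE-C parameters as the subsequent download, avoiding the 400 Bad Request.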
How to reproduce
Try to use the S3Hook as below to download an encrypted file:
```python
from airflow.providers.amazon.aws.hooks.s3 import S3Hook

extra_args = {
    'SSECustomerAlgorithm': 'YOUR_ALGO',
    'SSECustomerKey': YOUR_SSE_C_KEY,
}

hook = S3Hook(aws_conn_id=YOUR_S3_CONNECTION, extra_args=extra_args)
hook.download_file(
    key=key, bucket_name=bucket_name, local_path=local_path,
)
```
Operating System
any
Versions of Apache Airflow Providers
No response
Deployment
Docker-Compose
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct