Closed
Labels: area:providers, kind:meta (High-level information important to the community), provider:amazon (AWS/Amazon - related issues)
Original stack trace from Slack:
Error:
File "/usr/local/airflow/plugins/plugins/others/data_source_monitor.py", line 53, in retrieve_data
get_time_query = s3_hook.read_key(
File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 64, in wrapper
return func(*bound_args.args, **bound_args.kwargs)
File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 92, in wrapper
return func(*bound_args.args, **bound_args.kwargs)
File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 514, in read_key
obj = self.get_key(key, bucket_name)
File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 64, in wrapper
return func(*bound_args.args, **bound_args.kwargs)
File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 92, in wrapper
return func(*bound_args.args, **bound_args.kwargs)
File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 493, in get_key
s3_resource = self.get_session().resource(
File "/usr/local/airflow/.local/lib/python3.10/site-packages/boto3/session.py", line 446, in resource
client = self.client(
File "/usr/local/airflow/.local/lib/python3.10/site-packages/boto3/session.py", line 299, in client
return self._session.create_client(
File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/session.py", line 976, in create_client
client = client_creator.create_client(
File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/client.py", line 116, in create_client
endpoints_ruleset_data = self._load_service_endpoints_ruleset(
File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/client.py", line 220, in _load_service_endpoints_ruleset
return self._loader.load_service_model(
File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 142, in _wrapper
data = func(self, *args, **kwargs)
File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 406, in load_service_model
known_services = self.list_available_services(type_name)
File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 142, in _wrapper
data = func(self, *args, **kwargs)
File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 311, in list_available_services
api_versions = os.listdir(full_dirname)
OSError: [Errno 12] Cannot allocate memory: '/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/data/efs'

The reason for this error is simple: for some operations S3Hook creates a resource (the high-level client) in addition to S3.Client, and this resource is created again every time one of those S3Hook methods is called, so each call requires additional memory. For example, calling S3Hook.download_file in a loop might be the cause of this error.
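The per-call construction pattern can be illustrated without AWS at all. This is a minimal sketch, not the actual S3Hook code: `ExpensiveResource` and `PerCallHook` are hypothetical stand-ins, and a simple counter takes the place of the real memory cost of a boto3 resource.

```python
class ExpensiveResource:
    """Hypothetical stand-in for a boto3 S3 resource; counts constructions."""

    created = 0

    def __init__(self) -> None:
        ExpensiveResource.created += 1


class PerCallHook:
    """Hypothetical hook that builds a fresh resource on every method call."""

    def download_file(self, key: str) -> str:
        resource = ExpensiveResource()  # constructed again on each call
        return f"downloaded {key} with {type(resource).__name__}"


hook = PerCallHook()
for i in range(100):
    hook.download_file(f"data/part-{i}.csv")

# One construction per call: 100 calls produce 100 resource objects.
print(ExpensiveResource.created)
```

If each construction costs tens of megabytes before it is garbage collected, a tight loop like this can exhaust memory on a small worker.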
As usual there are at least two solutions:
Option 1: Use caching in the internal methods of S3Hook.
Option 2: Get rid of resource usage in S3Hook and replace it with S3.Client methods. This might be the better solution:
- Resources do not seem to be actively maintained in boto3.
- Creating a new resource object requires about 30-40 MB of memory, yet everything it does (and even more) can be done with S3.Client.
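Both options can be sketched in one place. The helper below is hypothetical and is not the S3Hook implementation: `ClientOnlyS3Reader` and `client_factory` are assumed names. The client is created once and cached (Option 1), and the read goes through `get_object` on the low-level client instead of a resource (Option 2). It assumes the client behaves like a boto3 S3 client, whose `get_object` response carries a readable `Body`.

```python
from functools import cached_property
from typing import Any, Callable


class ClientOnlyS3Reader:
    """Hypothetical reader that uses only a cached low-level S3 client."""

    def __init__(self, client_factory: Callable[[], Any]) -> None:
        # client_factory is an assumption for this sketch, for example
        # lambda: boto3.client("s3")
        self._client_factory = client_factory

    @cached_property
    def client(self) -> Any:
        # Built on first access, then reused for every later call.
        return self._client_factory()

    def read_key(self, key: str, bucket_name: str) -> str:
        # Fetch the object with the low-level client and decode it as UTF-8.
        response = self.client.get_object(Bucket=bucket_name, Key=key)
        return response["Body"].read().decode("utf-8")
```

With boto3 installed, this could be used as `ClientOnlyS3Reader(lambda: boto3.client("s3")).read_key("path/to/key", "my-bucket")`, where the key and bucket names are placeholders. Passing a factory rather than a client keeps construction lazy, so nothing is allocated until the first read.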
Committer
- I acknowledge that I am a maintainer/committer of the Apache Airflow project.