fix(hive): Workaround for Python 3.9 s3 transfer issue #19887
Codecov Report
@@ Coverage Diff @@
## master #19887 +/- ##
=======================================
Coverage 66.52% 66.52%
=======================================
Files 1714 1714
Lines 65032 65033 +1
Branches 6717 6717
=======================================
+ Hits 43260 43261 +1
Misses 20065 20065
Partials 1707 1707
one suggestion, lgtm otherwise
superset/db_engine_specs/hive.py
Outdated
@@ -80,6 +81,7 @@ def upload_to_s3(filename: str, upload_prefix: str, table: Table) -> str:
         filename,
         bucket_path,
         os.path.join(upload_prefix, table.table, os.path.basename(filename)),
+        Config=TransferConfig(use_threads=False),
-        Config=TransferConfig(use_threads=False),
+        # Disabling threading because it breaks python 3.9
+        Config=TransferConfig(use_threads=False),
@etr2460 I inlined the comment.
* fix(hive): Workaround for Python 3.9 s3 transfer issue
* Update hive.py
SUMMARY
This PR remedies an issue we (Airbnb) faced after upgrading to Python 3.9. Per boto/s3transfer#197 (comment), there appears to be an S3 threading issue with Python 3.9+ that causes the transfer to fail. Disabling threading in the transfer configuration (use_threads=False) seems to mitigate it. Note that the potential performance impact is unclear, especially when uploading large files.
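For readers who want to try the workaround outside Superset, here is a minimal sketch of the same call in isolation. It assumes boto3 is installed and AWS credentials are configured. The helper names `build_s3_key` and `upload_single_threaded` are hypothetical and do not exist in Superset; only `upload_file`, its `Config` argument, and `TransferConfig(use_threads=False)` come from the diff in this PR.

```python
import os
import posixpath


def build_s3_key(upload_prefix: str, table_name: str, filename: str) -> str:
    """Build the object key as <prefix>/<table>/<basename of the local file>.

    Hypothetical helper. The PR's diff uses os.path.join; posixpath.join is
    used here so the key always contains forward slashes.
    """
    return posixpath.join(upload_prefix, table_name, os.path.basename(filename))


def upload_single_threaded(
    filename: str, bucket: str, upload_prefix: str, table_name: str
) -> str:
    """Upload a local file to S3 with transfer threading disabled.

    Hypothetical helper mirroring the workaround in this PR.
    """
    # Imported here so the key helper above stays usable without boto3.
    import boto3
    from boto3.s3.transfer import TransferConfig

    key = build_s3_key(upload_prefix, table_name, filename)
    s3 = boto3.client("s3")
    s3.upload_file(
        filename,
        bucket,
        key,
        # Disabling threading because it breaks Python 3.9 (per this PR).
        Config=TransferConfig(use_threads=False),
    )
    return key
```

With `use_threads=False` the transfer runs in the calling thread, so any slowdown would most likely show up on large multipart uploads. As the summary notes, the size of that effect has not been measured.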
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
Covered by CI and verified within an Airbnb environment.
ADDITIONAL INFORMATION