HADOOP-16202. Enhance openFile() for better read performance against object stores #2584
Commits on Apr 1, 2022
HADOOP-16202. Enhance openFile()
Roll-up of the previous PR.
Change-Id: Ib0aec173afcd8aae33f52da3f99ac813bd38c32f

Squashed commits:
* HADOOP-16202. javadocs and style.
  Change-Id: Id4294ac7034155a10be22fb4631edf43cbadc22b
* HADOOP-16202. openFile: read policies. Rename "fs.option.openfile.fadvise" to "fs.option.openfile.read.policy" and expand it with "vectored", "parquet" and "orc", all of which map in s3a to random. The concept is that choosing a read policy can do more than change the seek policy: it could switch buffering, caching, etc.
  Change-Id: I2147840f58fb54853c797d2cab5d668c3d1d2541
* HADOOP-16202. Documentation changes and IOStatistics of open operations: apply Thomas Marquardt's suggestions on the docs; add a standard action name for "file opened"; S3AInputStream measures the count and duration of this and reports it.
  Change-Id: I7feacf4eb4d6494bb93b3dfc05b060ad75e52c18
* HADOOP-16202. Rebase to trunk; add a "whole-file" option. Also slacken checks on the Open contract tests so that runs against an external connector are less likely to fail. TODO: make that a compliance switch.
  Change-Id: I9a4535d785949822752571f82f9448b9aac66aad
* HADOOP-16202. Remove the orc, parquet and vectored options from the read policy, going through Thomas's feedback.
  Change-Id: Ibdf2c4ec64c54704f8631d5775d83444660c923a
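The read-policy renaming described above lends itself to a small illustration. The sketch below is hypothetical and not the actual s3a code: it shows how a connector might normalize the value of "fs.option.openfile.read.policy", mapping the columnar-format names ("parquet", "orc") and "vectored" to "random" as the commit message describes. The class name, the fallback behavior, and the exact policy set are assumptions for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/**
 * Illustrative sketch (not the actual s3a code): normalizing a
 * "fs.option.openfile.read.policy" value to a seek policy the
 * store understands, per the commit message above.
 */
public class ReadPolicy {

  /** Policies which, per the commit, all map to "random" in s3a. */
  private static final Set<String> RANDOM_ALIASES =
      new HashSet<>(Arrays.asList("vectored", "parquet", "orc", "random"));

  /**
   * Resolve a read policy name to an underlying seek policy.
   * Unknown names fall back to "sequential" here; the real
   * connector's fallback may differ.
   */
  public static String resolve(String policy) {
    String p = policy.toLowerCase(Locale.ROOT).trim();
    if (RANDOM_ALIASES.contains(p)) {
      return "random";
    }
    if (p.equals("whole-file") || p.equals("sequential")) {
      return p;
    }
    return "sequential";
  }

  public static void main(String[] args) {
    System.out.println(resolve("parquet"));    // random
    System.out.println(resolve("whole-file")); // whole-file
  }
}
```

The point of the indirection is that format-specific names can later be bound to more than seek behavior (buffering, caching), without callers changing their option strings.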
Commit: c5f543c
HADOOP-16202. enhance-openfile
* Reinstate "vector", as Mukund is about to merge that patch into a feature branch.
* Remove references to orc/parquet in the docs.
* Fix all style, deprecation, spotbugs and EOL complaints.
Change-Id: Id44617916e60688bdce2e1f107082704567e3515
Commit: 068eb16
HADOOP-16202. enhance-openfile
* Review the docs, including hrefs.
* s3a maps "vector" to "random".
* FileUtil uses "whole-file" in its copy; this matters when using the CLI on a system where the s3a policy is set to "random".

Excluding complaints from Yetus, I think this is ready to go in.
Change-Id: I54d43c5b4947e9e7ee91fa9c3feb0a075b4b4527
Commit: c0dfe72
Change-Id: Ibe564e22d336a5ff8a85e1fc678dcd06fa99bb9d
Commit: 97ebbef
HADOOP-16202. resync with trunk
The imports were invalid.
Change-Id: Ie80fe283a88e390c968d8136f45cb6ce41f29143
Commit: 756935f
HADOOP-16202. s3a openfile to support a new drain policy: fs.s3a.input.async.drain.threshold
If the number of bytes remaining is above this threshold, the stream is drained in a separate thread; the threshold exists because, for small amounts, scheduling the work appears to be slower than doing it inline. Draining is also done via a 16K buffer, which greatly reduces the number of OS API calls.
Change-Id: I1b2a71a05c37ada289cf23e128da0a6b01452ee6
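The drain policy described above can be sketched as follows. This is a self-contained illustration, not the actual S3AInputStream code: the class and method names are invented, and only the two ideas from the commit message are modeled — draining through a 16K buffer, and moving the work to another thread only when the remaining byte count exceeds the threshold.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.CompletableFuture;

/**
 * Illustrative sketch (not the actual s3a code) of a threshold-based
 * drain: small remainders are drained inline, since scheduling the
 * work can cost more than doing it; large remainders are drained
 * asynchronously.
 */
public class DrainSketch {

  /** Buffer size used while discarding bytes; the commit cites 16K. */
  static final int DRAIN_BUFFER_SIZE = 16 * 1024;

  /**
   * Discard everything left in the stream, one buffer at a time,
   * returning the number of bytes drained. Reading in 16K chunks
   * keeps the number of underlying read calls low.
   */
  static long drain(InputStream in) throws IOException {
    byte[] buffer = new byte[DRAIN_BUFFER_SIZE];
    long drained = 0;
    int read;
    while ((read = in.read(buffer)) != -1) {
      drained += read;
    }
    return drained;
  }

  /** Drain inline below the threshold, asynchronously above it. */
  static CompletableFuture<Long> maybeAsyncDrain(
      InputStream in, long remaining, long threshold) {
    if (remaining <= threshold) {
      try {
        return CompletableFuture.completedFuture(drain(in));
      } catch (IOException e) {
        throw new RuntimeException(e);
      }
    }
    return CompletableFuture.supplyAsync(() -> {
      try {
        return drain(in);
      } catch (IOException e) {
        throw new RuntimeException(e);
      }
    });
  }

  public static void main(String[] args) {
    byte[] data = new byte[100_000];
    long drained = maybeAsyncDrain(
        new ByteArrayInputStream(data), data.length, 64 * 1024).join();
    System.out.println(drained); // 100000
  }
}
```

Either way the caller gets a future, so the "drain before close/unbuffer" step has one shape regardless of which path was taken.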
Commit: 1bc73e6
HADOOP-16202. One of Thomas's comments I'd missed
ChecksumFileSystem simplification.
Change-Id: I9637d386c5e9aea95f568d3f80104bdff99ffc31
Commit: cf13d12
Commits on Apr 5, 2022
HADOOP-16202. Fix build warnings (not yet the spotbugs ones) and
clean up the draining code.
Change-Id: I602e0414004dd2806e6e942b553c069770bd1250
Commit: f1a68eb
Commits on Apr 6, 2022
HADOOP-16202. Improve s3a opening; fix tests
* Fix the build.
* Get the mock unbuffer test working by mocking all read() calls and turning off async drain.
* Move all file open options into OpenFileHelper; these can directly update the S3AReadOpContext builder. This makes the s3a filesystem class slightly smaller, as a few fields have been cut.
* Fix style and javadoc complaints.
Change-Id: I36d888fba40152328eeeb6d17ceb192530ef76e3
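The OpenFileHelper refactoring described above follows a common pattern: a helper parses the open-file options and applies them straight to a context builder, so the filesystem class itself stays small. The sketch below is hypothetical, not the actual s3a code; the option keys follow the `fs.option.openfile.*` naming this PR uses, but the classes, defaults, and method names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of the pattern the commit describes: a helper
 * applies recognized open-file options directly to a read-context
 * builder. Classes and defaults are invented; only the option-key
 * naming follows the PR.
 */
public class OpenFileHelperSketch {

  /** Immutable read context produced by the builder. */
  static final class ReadContext {
    final String readPolicy;
    final long fileLength;

    ReadContext(String readPolicy, long fileLength) {
      this.readPolicy = readPolicy;
      this.fileLength = fileLength;
    }
  }

  /** Builder which the helper updates directly. */
  static final class Builder {
    private String readPolicy = "sequential";
    private long fileLength = -1; // -1: unknown, probe the store

    Builder withReadPolicy(String policy) {
      this.readPolicy = policy;
      return this;
    }

    Builder withFileLength(long length) {
      this.fileLength = length;
      return this;
    }

    ReadContext build() {
      return new ReadContext(readPolicy, fileLength);
    }
  }

  /** Apply any recognized open-file options to the builder. */
  static Builder applyOptions(Map<String, String> options, Builder builder) {
    String policy = options.get("fs.option.openfile.read.policy");
    if (policy != null) {
      builder.withReadPolicy(policy);
    }
    String length = options.get("fs.option.openfile.length");
    if (length != null) {
      builder.withFileLength(Long.parseLong(length));
    }
    return builder;
  }

  public static void main(String[] args) {
    Map<String, String> opts = new HashMap<>();
    opts.put("fs.option.openfile.read.policy", "random");
    opts.put("fs.option.openfile.length", "4096");
    ReadContext ctx = applyOptions(opts, new Builder()).build();
    System.out.println(ctx.readPolicy + " " + ctx.fileLength); // random 4096
  }
}
```

Keeping the parsing in one helper means new options only touch the helper and the builder, not the filesystem class.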
Commit: 2843391
Commits on Apr 7, 2022
Revert "HADOOP-16202. one of Thomas's comments i'd missed"
This reverts commit cf13d12.
Change-Id: I56d69c1a6ae2692607009e8da089b11315ef08ae
Commit: c9c989e
Change-Id: I56185c5f866269dc2e253570a6564a3dad074666
Commit: 8cc26f9
Commits on Apr 12, 2022
Change-Id: I4d2fe8936e066970c2c54a4a053b06025845646c
Commit: e7b29ef
Commits on Apr 19, 2022
HADOOP-16202. enhance-openfile: review feedback
Feedback from dannycjones.
Change-Id: I546f28411c2475e1254b259c7e0734cc868ea9f0
Commit: 98ebf76
Merge branch 'trunk' into s3/HADOOP-16202-enhance-openfile
Change-Id: If30684e9b4d39e9d1ba9cfdf50963b655c20144f
Commit: bf8e1d4
Commits on Apr 22, 2022
HADOOP-16202. checkstyle; unused import
Change-Id: I64ac45369e4a6e1e9cd651b01acd380d258782fb
Commit: 60cb6b5