-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-17833. Improve Magic Committer performance #3289
HADOOP-17833. Improve Magic Committer performance #3289
Conversation
TESTING IN PROGRESS |
208929e
to
a2166e1
Compare
+plan to lift some of the statistic names from the manifest committer and do the same reporting as in manifest committer. will also include list costs in results. (side issue, thinking of whether the json deserializer could build stats on reading costs, which can then be collected too to measure cost of ser/deser and, by collecting stream read/write costs, those steps |
d6a0dcf
to
f850021
Compare
/** | ||
* Create a stub commit context for tests. | ||
* There's no job context and the thread pool is | ||
* not set up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it is if a count is passed in.
@@ -86,6 +86,6 @@ log4j.logger.org.apache.hadoop.fs.s3a.S3AStorageStatistics=INFO | |||
|
|||
# Auditing operations in all detail | |||
# Log before a request is made to S3 | |||
#log4j.logger.org.apache.hadoop.fs.s3a.audit=DEBUG | |||
log4j.logger.org.apache.hadoop.fs.s3a.audit=DEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
think I'll pull this out. Handy for me, but still a bit noisy
This is at a point where it's ready for some review and any benchmarking people can do. I've cut out a lot of HTTP IO per file create and load @mukund-thakur @mehakmeet @dongjoon-hyun @bogthe The only other big thing to consider is could we route the parallel POST calls in job commit through a fork/join thread pool and does that deliver better throughput due to the fact there's no need to yield to the OS scheduler to pick up the next bit of work. I am happy to do a live shared screen review of this PR next week, if people want do discuss things that way |
Thank you, @steveloughran ! |
|
|
9d37211
to
30748c9
Compare
|
dd5ef82
to
87ae7e5
Compare
87ae7e5
to
3b13315
Compare
Change-Id: I95412b3fe13a54389521423a4a8f5a9d6e1209da
Added enforcer rules to restrict use of mapred imports in production code to selected packages and classes under oah.fs.s3a.commit replaced an import in CommitterConstants with the string moved some private committer implementation classes into a new package oah.fs.s3a.commit.impl, a package which is allowed to use mr classes. This locks down use of mapred code and reduces the risk that there may be transitive dependency on the libraries in the filesystem class itself. Change-Id: Idc0120d3903b9d7e88267384a5380884f1b4e374
Change-Id: Ibda8ca966b6973c4c5a398bf1691325eff7b4388
as part of this change, reviewing HADOOP-17584 problem here is that magic committer MR recovery would use same job ID for path, and so if a task attempt ta0 from job attempt1 were to commit, it would scan its directory tree and find any files created by attempt 0. - "Job path" is the path with job id/uuid only. - Job attempt path is used for a subdirectory - magic committer works with a job attempt path, so is unique on a second attempt - job abort/cleanup cleans up the job path, not just the single event - and for magic committer, everything in _temporary Before anyone complains that the magic committer will be deleting more stuff in job cleanup, it was already stopping uploads and deleting the __magic dir. Change-Id: I69e65059517c1e4ca087b58606d381c79ada7505
Wrapping up createFile() with the ability to set headers on the file builder.must("fs.s3a.create.header.my-header","my-value"); this will set a user metadata entry "my-header" which xattr will return as "header.my-header". This forced me to make PutOptions something passed around more rigorously and used as the parameter class to innerCreateFile. This is ultimately good as it lets us set new options later..encryption, storage class etc. if we so choose. also: s3a path capabilities let you probe for this prefix, as well as the performance one...think this should be the standard behaviour for custom options. Change-Id: Icb5a1e923a3bc50f42e825d34b16da0cf622c289
Change-Id: If4dbd151c4099618f5e1a26f0229d019893d367e
Change-Id: I8a3272b4653fedd023d2490a77ca91d8baef1129
Change-Id: Ie944c2fc1cb7f1634076a7df37148b51d7f5a741
rebase onto trunk with HADOOP-12020/storage class; use PutObjectOptions as the place to pass the option from createFile() to the requests. The feature itself is not implemented, just prepared for. Change-Id: I5d2e57e672f49021c3cb8bfa3b1a3daf09c61a38
73aaec9
to
69eedd9
Compare
Change-Id: Ia2fe020bb3672fefd42f1ceec8af0337c4f00b73
🎊 +1 overall
This message was automatically generated. |
testing: s3 london |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Ran the IT's as well. No failures.
conscious decision to choose speed over safety and | ||
that the outcome was their own fault. | ||
|
||
Accordingly: *Use if and only if you are confident that the conditions are met.* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially I was worried about inconsistencies leading to escalations but by the end I think we are clear enough. Nice doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we aren't actually that more vulnerable than when someone creates a file under a file, which they can do today.
@@ -592,13 +634,41 @@ public void jobCompleted(boolean success) { | |||
} | |||
|
|||
/** | |||
* Begin the final commit. | |||
* Crate a commit context for a job or task. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: create
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/commit/CommitterTestHelper.java
Outdated
Show resolved
Hide resolved
Change-Id: Iade3656fdf8c57284e7360c317e629b1cac8b109
LGTM +1 |
thanks |
Change-Id: I2dedac520cd5d22317f4fc7170ca25786c647004
Change-Id: I359af3578fc6dd7d669a0762a8869e4cd3c1ea7c
Speed up the magic committer with key changes being * Writes under __magic always retain directory markers * File creation under __magic skips all overwrite checks, including the LIST call intended to stop files being created over dirs. * mkdirs under __magic probes the path for existence but does not look any further. Extra parallelism in task and job commit directory scanning Use of createFile and openFile with parameters which all for HEAD checks to be skipped. The committer can write the summary _SUCCESS file to the path `fs.s3a.committer.summary.report.directory`, which can be in a different file system/bucket if desired, using the job id as the filename. Also: HADOOP-15460. S3A FS to add `fs.s3a.create.performance` Application code can set the createFile() option fs.s3a.create.performance to true to disable the same safety checks when writing under magic directories. Use with care. The createFile option prefix `fs.s3a.create.header.` can be used to add custom headers to S3 objects when created. Contributed by Steve Loughran. Change-Id: I9e086423f02eb25b6e70fc1c12a13e0a5afe9cb9
Speed up the magic committer with key changes being * Writes under __magic always retain directory markers * File creation under __magic skips all overwrite checks, including the LIST call intended to stop files being created over dirs. * mkdirs under __magic probes the path for existence but does not look any further. Extra parallelism in task and job commit directory scanning Use of createFile and openFile with parameters which all for HEAD checks to be skipped. The committer can write the summary _SUCCESS file to the path `fs.s3a.committer.summary.report.directory`, which can be in a different file system/bucket if desired, using the job id as the filename. Also: HADOOP-15460. S3A FS to add `fs.s3a.create.performance` Application code can set the createFile() option fs.s3a.create.performance to true to disable the same safety checks when writing under magic directories. Use with care. The createFile option prefix `fs.s3a.create.header.` can be used to add custom headers to S3 objects when created. Contributed by Steve Loughran.
Speed up the magic committer with key changes being * Writes under __magic always retain directory markers * File creation under __magic skips all overwrite checks, including the LIST call intended to stop files being created over dirs. * mkdirs under __magic probes the path for existence but does not look any further. Extra parallelism in task and job commit directory scanning Use of createFile and openFile with parameters which all for HEAD checks to be skipped. The committer can write the summary _SUCCESS file to the path `fs.s3a.committer.summary.report.directory`, which can be in a different file system/bucket if desired, using the job id as the filename. Also: HADOOP-15460. S3A FS to add `fs.s3a.create.performance` Application code can set the createFile() option fs.s3a.create.performance to true to disable the same safety checks when writing under magic directories. Use with care. The createFile option prefix `fs.s3a.create.header.` can be used to add custom headers to S3 objects when created. Contributed by Steve Loughran.
Speeding up the magic committer with key changes being
(no DELETEs after file/dir creation)
the LIST call intended to stop files being created over dirs
This is still WiP as it needs
Lots of changes in the tests because the committer has added
a CommitContext class which manages the lifecycle of
the thread pool and a set of thread local JSON serializers;
this is now what is passed around in internal committer methods,
so breaking tests calling in to them.
It is a better design (one we should have done from the start);
manifest committer is even better as all its operations "stages"
are modular. Just means that a lot of tests stopped compiling.
And as usual, mock tests played up.
Removed the injection/handling of inconsistent S3
from the committer tests. Not needed, and simply complicating
the code needlessly.
The write optimisations should provide significant benefits when writing files: at least one LIST per creation, for parquet a HEAD too, and O(depth) delete calls which just generates write load, risk of throttling etc. There is still more IO taking place for each magic file than writing a small simple file (it is always multipart and we have to add a marker file too). Spark also has to add an extra getXAttrs call for each file when building its intermediate stats.
Also of note