HADOOP-17922. move to fs.s3a.encryption.algorithm - JCEKS integration (#3466) #3508
apache#2706)

This (big!) patch adds support for client side encryption in AWS S3, with keys managed by AWS-KMS.

Read the documentation in encryption.md very, very carefully before use and consider it unstable.

S3-CSE is enabled in the existing configuration option "fs.s3a.server-side-encryption-algorithm":

fs.s3a.server-side-encryption-algorithm=CSE-KMS
fs.s3a.server-side-encryption.key=<KMS_KEY_ID>

You cannot enable CSE and SSE in the same client, although you can still enable a default SSE option in the S3 console.

* Filesystem list/get status operations subtract 16 bytes from the length of all files >= 16 bytes long to compensate for the padding which CSE adds.
* The SDK always warns about the specific algorithm chosen being deprecated. It is critical to use this algorithm for ranged GET requests to work (i.e. random IO). Ignore.
* Unencrypted files CANNOT BE READ. The entire bucket SHOULD be encrypted with S3-CSE.
* Uploading files may be a bit slower as blocks are now written sequentially.
* The Multipart Upload API is disabled when S3-CSE is active.

Contributed by Mehakmeet Singh
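The length adjustment described in this commit message can be sketched roughly as follows. This is only an illustration of the stated rule (subtract 16 bytes from files at least 16 bytes long); the class and method names are hypothetical and are not taken from the patch.

```java
/** Hypothetical helper, not part of Hadoop: models the length rule stated above. */
final class CseLengthSketch {

    /** Padding which S3-CSE adds to each object, per the commit message. */
    static final long CSE_PADDING_LENGTH = 16;

    /**
     * Length to report in list/get status calls for an object whose
     * stored (encrypted) length is {@code storedLength}.
     */
    static long reportedLength(long storedLength) {
        // Only files at least 16 bytes long are adjusted;
        // shorter ones are reported with their stored length.
        return storedLength >= CSE_PADDING_LENGTH
            ? storedLength - CSE_PADDING_LENGTH
            : storedLength;
    }

    public static void main(String[] args) {
        System.out.println(reportedLength(1040)); // 1024
        System.out.println(reportedLength(16));   // 0
        System.out.println(reportedLength(10));   // 10
    }
}
```

One consequence of this rule is that an unencrypted file of 16 bytes or more would also have its length reduced, which fits the recommendation that the entire bucket be encrypted with S3-CSE.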
…d enabled (apache#3239)

S3A S3Guard tests to skip if S3-CSE are enabled (apache#3263)

Follow on to
* HADOOP-13887. Encrypt S3A data client-side with AWS SDK (S3-CSE)

If the S3A bucket is set up to use S3-CSE encryption, all tests which turn on S3Guard are skipped, so they don't raise any exceptions about incompatible configurations.

Contributed by Mehakmeet Singh
This migrates the fs.s3a server-side encryption configuration options to names which cover client-side encryption too.

fs.s3a.server-side-encryption-algorithm becomes fs.s3a.encryption.algorithm
fs.s3a.server-side-encryption.key becomes fs.s3a.encryption.key

The existing keys remain valid; they are simply deprecated and remapped to the new names. If you want server-side encryption options to be picked up regardless of Hadoop version, use the old keys.

(The old keys also work for CSE, though as no version of Hadoop with CSE support has shipped without this remapping, that is less relevant.)

Contributed by: Mehakmeet Singh
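As a rough model of what "deprecated and remapped" means for callers, the sketch below stores every value under the new key name, whichever name was used to set or read it. It is a plain-Java stand-in written for illustration only; the class `DeprecatedKeySketch` is hypothetical and does not reproduce how Hadoop's own configuration classes implement deprecation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of old-to-new key remapping; not Hadoop code. */
final class DeprecatedKeySketch {

    /** Old option name -> new option name, as listed in the commit message. */
    static final Map<String, String> DEPRECATIONS = new HashMap<>();
    static {
        DEPRECATIONS.put("fs.s3a.server-side-encryption-algorithm",
            "fs.s3a.encryption.algorithm");
        DEPRECATIONS.put("fs.s3a.server-side-encryption.key",
            "fs.s3a.encryption.key");
    }

    private final Map<String, String> values = new HashMap<>();

    /** Translate a possibly deprecated key to its current name. */
    private static String canonical(String key) {
        return DEPRECATIONS.getOrDefault(key, key);
    }

    void set(String key, String value) {
        values.put(canonical(key), value);
    }

    String get(String key) {
        return values.get(canonical(key));
    }

    public static void main(String[] args) {
        DeprecatedKeySketch conf = new DeprecatedKeySketch();
        // Set through the old name, read back through the new one.
        conf.set("fs.s3a.server-side-encryption-algorithm", "CSE-KMS");
        System.out.println(conf.get("fs.s3a.encryption.algorithm"));
    }
}
```

In this model a configuration written with the old names reads back identically through the new ones. Releases that predate the rename only know the old names, which is why the commit message suggests the old keys when a setting must work across versions.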
…apache#3466)

The ordering of the resolution of new and deprecated s3a encryption options & secrets is the same when JCEKS and other Hadoop credential stores are used to store them as when they are in XML files: per-bucket settings always take priority over global values, even when the bucket-level options use the old option names.

Contributed by Mehakmeet Singh and Steve Loughran
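The precedence rule can be illustrated with a small lookup sketch. It assumes the usual S3A per-bucket naming pattern `fs.s3a.bucket.<bucket>.<option>` and, within each level, tries the new name before the old one; that inner ordering is an assumption made for the example, not something the commit message states. The class name and the bucket name `example-bucket` are hypothetical, and a plain map stands in for both XML files and credential stores.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-bucket-over-global resolution; not Hadoop code. */
final class BucketEncryptionLookupSketch {

    /** Return the first value found, searching bucket-level keys before global ones. */
    static String lookup(Map<String, String> conf, String bucket,
                         String newSuffix, String oldSuffix) {
        String bucketPrefix = "fs.s3a.bucket." + bucket + ".";
        String[] candidates = {
            bucketPrefix + newSuffix,  // per-bucket, new name
            bucketPrefix + oldSuffix,  // per-bucket, old name
            "fs.s3a." + newSuffix,     // global, new name
            "fs.s3a." + oldSuffix      // global, old name
        };
        for (String key : candidates) {
            String value = conf.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    static String algorithm(Map<String, String> conf, String bucket) {
        return lookup(conf, bucket,
            "encryption.algorithm", "server-side-encryption-algorithm");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.s3a.encryption.algorithm", "SSE-KMS");
        conf.put("fs.s3a.bucket.example-bucket.server-side-encryption-algorithm",
            "CSE-KMS");
        // The bucket-level value wins even though it uses the old name.
        System.out.println(algorithm(conf, "example-bucket"));
        // A bucket with no override falls back to the global value.
        System.out.println(algorithm(conf, "other-bucket"));
    }
}
```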
Tested on tip of the chain of commits:
CSE: timeout due to setup/bandwidth, happens on trunk as well for me.
non-CSE
CSE-S3Guard
non-CSE-S3Guard
CC: @steveloughran
So this is the full chain of commits? And there have been no changes other than cherry-picking onto branch-3.3? If so, +1 pending Yetus. I can check out then commit the sequence locally, without having to merge the commits.
Yes, this is the tip branch; the commits in order are #3292, #3506, #3507, and then this one. I am not sure how the merge works in this case: would it merge all PRs after the tip is merged? #3506 was two Jiras made into one commit; as it's just the CSE-S3Guard related IOE and then the skipped tests, I thought it was better to make that one commit?
💔 -1 overall
This message was automatically generated.
The merge button on the GitHub UI is "squash and merge", but if I check out your branch I can just cherry-pick the chain of commits on top of branch-3.3.
@steveloughran, did you mean a chain of commits in a single PR? I thought we were gonna do a chain of PRs with single commits.
Merged in branch-3.3.