Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

basic SSE-KMS support #611

Merged
merged 1 commit into from
Sep 17, 2015
Merged

basic SSE-KMS support #611

merged 1 commit into from
Sep 17, 2015

Conversation

bsapiro
Copy link

@bsapiro bsapiro commented Aug 30, 2015

Can use KMS to manage encryption keys for S3

Can use KMS to manage encryption keys for S3
@fviard
Copy link
Contributor

fviard commented Sep 4, 2015

Good feature!
This should fixes:
#498

@fviard
Copy link
Contributor

fviard commented Sep 4, 2015

ok.
I did test it and it works well.

There is just a limitation that as the Etag is wrong and doesn't have a '-' char inside,
When the sync compare local and remote, it always thinks that the file is different as the md5 doesn't match and so re-upload the file. (in the case of not multipart files)

In FileList.py, there is the following code that selects to use the Etag or the md5 in metadata for the comparison:
"""
if '-' in rem_list[key]['md5']: # always get it for multipart uploads
_get_remote_attribs(S3Uri(object_uri_str), rem_list[key])
md5 = rem_list[key]['md5']
rem_list.record_md5(key, md5)
if break_now:
break
"""
I think that a way to fix this issue is to also always take the md5 of metadata when "kms_key" config is set.
Because, in such a case you are sure that there will be a head request for each file and that it is the "md5" that we have stored that will be used to compare the local an the remote file.

So, in exchange to not re-uploading any small kms encrypted file at each sync, we will perform an individual HEAD request for all the files of your folder/bucket.
I don't know if it is better.
@mdomsch Have you an opinion on that subject?

@mdomsch
Copy link
Contributor

mdomsch commented Sep 9, 2015

Ouch. I was willing to do HEADs for the (few?) multi-part files needed,
while getting MD5s from the directory listing (1000 in a single call) for
all non-multipart files. This would force us to issue HEADs for all files
in a bucket on each run (could we augment --cache-file somehow so at least
subsequent runs wouldn't have to do it? Keep the directory listing ETag,
HEAD s3cmd-attrs md5 both there for each object, and only do HEADs if the
objects have changed?) Otherwise, I'm not sure how much we care about
remote-copying based on source MD5 when KMS is in use, and I would just
disable it.

We likewise don't deal well with locally GPG-encrypted content and
de-duping against non-GPG-encrypted (local) content. I'm OK with that. If
someone really wanted to tackle that cleanly, it would probably apply to
KMS encryption too, and we could then re-enable the remote copy
optimization.

Thanks,
Matt

On Fri, Sep 4, 2015 at 5:44 PM, Florent Viard notifications@github.com
wrote:

ok.
I did test it and it works well.

There is just a limitation that as the Etag is wrong and doesn't have a
'-' char inside,
When the sync compare local and remote, it always thinks that the file is
different as the md5 doesn't match and so re-upload the file. (in the case
of not multipart files)

In FileList.py, there is the following code that selects to use the Etag
or the md5 in metadata for the comparison:
"""
if '-' in rem_list[key]['md5']: # always get it for multipart uploads
_get_remote_attribs(S3Uri(object_uri_str), rem_list[key])
md5 = rem_list[key]['md5']
rem_list.record_md5(key, md5)
if break_now:
break
"""
I think that a way to fix this issue is to also always take the md5 of
metadata when "kms_key" config is set.
Because, in such a case you are sure that there will be a head request for
each file and that it is the "md5" that we have stored that will be used to
compare the local an the remote file.

So, in exchange to not re-uploading any small kms encrypted file at each
sync, we will perform an individual HEAD request for all the files of your
folder/bucket.
I don't know if it is better.
@mdomsch https://github.com/mdomsch Have you an opinion on that subject?


Reply to this email directly or view it on GitHub
#611 (comment).

@fviard
Copy link
Contributor

fviard commented Sep 17, 2015

@bsapiro I will merge your PR and add myself the cmd line option.
Thanks for your contribution.

We will not do anything particular for the moment for the etags that are wrong but something should be done for that in the future :-s

fviard added a commit that referenced this pull request Sep 17, 2015
@fviard fviard merged commit e65af68 into s3tools:master Sep 17, 2015
fviard added a commit to fviard/s3cmd that referenced this pull request Sep 17, 2015
fviard added a commit to fviard/s3cmd that referenced this pull request Sep 17, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants