Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v4 signing #419

Merged
merged 13 commits into from
Dec 15, 2014
Merged

v4 signing #419

merged 13 commits into from
Dec 15, 2014

Conversation

vamitrou
Copy link
Contributor

Please feel free to try it out and help me debug it if needed. If you need any explanation please be my guest.

It is based on @mludvig 's branch. Thanks a lot for kick-starting it.

@vamitrou
Copy link
Contributor Author

  • default_region is added in config file. In case the requested bucket is not in our region, a redirect will be made behind the curtain.

@mdomsch
Copy link
Contributor

mdomsch commented Nov 16, 2014

Thank you! We need to cache bucket locations during an invocation, so you don't have to do the redirect on every operation, only on the first (failing) operation. For example, invoking the info operation causes the bucket location to be looked up 4 times, rather than the once I would expect.

$ ./s3cmd info s3://eu-central-1.domsch.com/
WARNING: Forwarding request to eu-central-1
s3://eu-central-1.domsch.com/ (bucket):
Location: eu-central-1
WARNING: Forwarding request to eu-central-1
Expiration Rule: none
WARNING: Forwarding request to eu-central-1
WARNING: Forwarding request to eu-central-1
policy: none
ACL: aeae39dcf6ebdee938eddfc9f15e450c1268e90bfe572482a08d720452ad38bc: FULL_CONTROL

@vamitrou
Copy link
Contributor Author

Thanks for the suggestion, I added it.

@mdomsch
Copy link
Contributor

mdomsch commented Nov 16, 2014

Thanks. That did resolve re-issuing the "get bucket location" call
repeatedly.

Sync into eu-central-1 did work for me.

Try issuing an 'info' command to a file in such a bucket. I get a failure.

DEBUG: Response: {'status': 400, 'headers': {'x-amz-id-2':
'u/RTtL+oslCek5t8vsNEL/cCvkdxnSmHkTjFenWj9V43/VQtPM/ArAYz1+NTOiMfVFaCUsnQTRY=',
'server': 'AmazonS3', 'transfer-encoding': 'chunked', 'connection':
'close', 'x-amz-request-id': '394953C0DCA53EDC', 'date': 'Sun, 16 Nov 2014
20:52:22 GMT', 'content-type': 'application/xml'}, 'reason': 'Bad Request',
'data': ''}
DEBUG: ConnMan.put(): connection put back to pool (
https://eu-central-1.domsch.com.s3.amazonaws.com#1)
ERROR: no element found: line 1, column 0

On Sun, Nov 16, 2014 at 2:22 PM, Vasileios Mitrousis <
notifications@github.com> wrote:

Thanks for the suggestion, I added it.


Reply to this email directly or view it on GitHub
#419 (comment).

@vamitrou
Copy link
Contributor Author

That's weird, the 400 response doesn't give me any error data. Expected API output should contain the correct region among the 400 reply, but in this case data is empty.

I will check it out. Thanks for crashing it.

DEBUG: Response: {'status': 400, 'headers': {'x-amz-id-2': '+h64zorKtSNUlEJcVlMZ3Cs/1t+duPn0AYLb9USVIOJM7o/rTKgWX2oJrdlTm33jbwhazVBsGac=', 'server': 'AmazonS3', 'transfer-encoding': 'chunked', 'connection': 'close', 'x-amz-request-id': '27494001090D4876', 'date': 'Mon, 17 Nov 2014 09:29:08 GMT', 'content-type': 'application/xml'}, 'reason': 'Bad Request', 'data': ''}

@vamitrou
Copy link
Contributor Author

fixed it now.

$ s3cmd info s3://eu-central-bucket-test/s3cmd
WARNING: Forwarding request to eu-central-1
WARNING: Forwarding request to eu-central-1
s3://eu-central-bucket-test/s3cmd (object):
   File size: 118815
   Last mod:  Sun, 16 Nov 2014 20:23:40 GMT
   MIME type: binary/octet-stream
   MD5 sum:   849d444394494b6cc2afcc73f818a557
   SSE:       NONE
   policy: none
   ACL:       dc42e84a409dcb65b7bdb02616704045fb06e3653198e1b39defe146634a5fcd: FULL_CONTROL

It seems that it is an API problem. "Invalid region" response is the one expected but not returned. What I am doing in this case is to call get_bucket_location to get the correct region. That leads to a second redirect.

It redirects first the get_bucket_location call and then redirects the previous (in that case info) call. I could do some trick to avoid running the get_bucket_location for the second time since we get the region from the erroneous response, but this might increase a bit the complexity of the handling.

Please check my last commit and let me know what do you think regarding my comments above.

Thanks!

vamitrou and others added 4 commits November 18, 2014 11:48
Some S3 API calls return HTTP 400 responses due to incorrectly-signed
headers.  These should also include 'data' in the response, but alas,
some do not.

Don't crash just because S3 didn't give us what we expected.
so we can see what is actually being sent.
@vamitrou
Copy link
Contributor Author

All tests run fine now. Still have an issue with multi-part upload.

@vamitrou
Copy link
Contributor Author

vamitrou commented Dec 9, 2014

@mdomsch file put and multi-part upload is fixed.

@mdomsch
Copy link
Contributor

mdomsch commented Dec 9, 2014

Excellent, thank you. For grins, can you remove (or rename so it doesn't
get called) the v2 signing stuff, to easily confirm it's not used anymore?

On Tue, Dec 9, 2014 at 6:46 AM, Vasileios Mitrousis <
notifications@github.com> wrote:

@mdomsch https://github.com/mdomsch file put and multi-part upload is
fixed.


Reply to this email directly or view it on GitHub
#419 (comment).

@vamitrou
Copy link
Contributor Author

vamitrou commented Dec 9, 2014

It cannot be removed since it's used in cases where the bucket name doesn't conform with DNS naming conventions.

The cause is that old datacenters still support names like xxx-Autotest-3 (capital chars) while new ones not. Signing v4 doesn't support these bucket names so in these edge cases I am still using v2 signing. (S3/S3.py:161)

I renamed the sign_string_v2() method and all the tests passed. But I found a call in S3/CloudFront.py which is not covered by tests. Is it still supported and/or used? If yes, v4 support should be added for CloudFront for the new regions.

@mdomsch
Copy link
Contributor

mdomsch commented Dec 9, 2014

Yes, CloudFront is still used, though I myself haven't done any significant
work with CloudFront.

On Tue, Dec 9, 2014 at 7:52 AM, Vasileios Mitrousis <
notifications@github.com> wrote:

It cannot be removed since it's used in cases where the bucket name
doesn't conform with DNS naming conventions.

The cause is that old datacenters still support names like xxx-Autotest-3
(capital chars) while new ones not. Signing v4 doesn't support these bucket
names so in these edge cases I am still using v2 signing. (S3/S3.py:161)

I renamed the sign_string_v2() method and all the tests passed. But I
found a call in S3/CloudFront.py which is not covered by tests. Is it still
supported and/or used? If yes, v4 support should be added for CloudFront
for the new regions.


Reply to this email directly or view it on GitHub
#419 (comment).

@vamitrou
Copy link
Contributor Author

vamitrou commented Dec 9, 2014

Do you think is a better idea to roll out these changes first? People can help on debugging until the release.

On the other hand I have the impression that CloudFront is not heavily used with s3cmd so it could be a different PR with not as high priority as this.

@mdomsch
Copy link
Contributor

mdomsch commented Dec 9, 2014

Lets merge the V4 work as is now, and then fix up CF's usage with a new PR.

On Tue, Dec 9, 2014 at 9:24 AM, Vasileios Mitrousis <
notifications@github.com> wrote:

Do you think is a better idea to roll out these changes first? People can
help on debugging until the release.

On the other hand I have the impression that CloudFront is not heavily
used with s3cmd so it could be a different PR with not as high priority as
this.


Reply to this email directly or view it on GitHub
#419 (comment).

@mdomsch
Copy link
Contributor

mdomsch commented Dec 9, 2014

We have a --default-location (now aliased to --region) that I'd like to use
instead of creating a new default_region config option.

On Tue, Dec 9, 2014 at 11:57 AM, Matt Domsch matt@domsch.com wrote:

Lets merge the V4 work as is now, and then fix up CF's usage with a new PR.

On Tue, Dec 9, 2014 at 9:24 AM, Vasileios Mitrousis <
notifications@github.com> wrote:

Do you think is a better idea to roll out these changes first? People can
help on debugging until the release.

On the other hand I have the impression that CloudFront is not heavily
used with s3cmd so it could be a different PR with not as high priority as
this.


Reply to this email directly or view it on GitHub
#419 (comment).

@mdomsch
Copy link
Contributor

mdomsch commented Dec 9, 2014

it's actually "bucket_location", not default_location.

On Tue, Dec 9, 2014 at 1:10 PM, Matt Domsch matt@domsch.com wrote:

We have a --default-location (now aliased to --region) that I'd like to
use instead of creating a new default_region config option.

On Tue, Dec 9, 2014 at 11:57 AM, Matt Domsch matt@domsch.com wrote:

Lets merge the V4 work as is now, and then fix up CF's usage with a new
PR.

On Tue, Dec 9, 2014 at 9:24 AM, Vasileios Mitrousis <
notifications@github.com> wrote:

Do you think is a better idea to roll out these changes first? People
can help on debugging until the release.

On the other hand I have the impression that CloudFront is not heavily
used with s3cmd so it could be a different PR with not as high priority as
this.


Reply to this email directly or view it on GitHub
#419 (comment).

@mdomsch mdomsch merged commit 0c29d7b into s3tools:master Dec 15, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants