Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert Thumbor URLs to S3 paths consistently. #83

Merged
merged 2 commits into from
Jan 31, 2023
Merged

Convert Thumbor URLs to S3 paths consistently. #83

merged 2 commits into from
Jan 31, 2023

Conversation

aaugustin
Copy link
Contributor

Fix #66.

@aaugustin
Copy link
Contributor Author

This is an alternative to #80.

Like #80, it inserts an unquote in loader.py, and it factors common logic out on top of that. This factorization fixes other bugs e.g. an empty root_path is now supported in storage.py and result_storage.py.

@aaugustin
Copy link
Contributor Author

I wasn't able to run the tests. I'm sure I broke them: they're calling a method that I removed.

@aaugustin
Copy link
Contributor Author

If you agree this is the right approach, maybe I can fix them (if I haven't given up on the idea of using thumbor-aws by then).

@Tenzer
Copy link
Contributor

Tenzer commented Dec 16, 2022

Here's a patch that fixes the tests and some minor linting checks:

diff --git a/tests/test_storage.py b/tests/test_storage.py
index 3b8965b..aec646d 100644
--- a/tests/test_storage.py
+++ b/tests/test_storage.py
@@ -18,6 +18,7 @@ from tornado.testing import gen_test
 
 from tests import BaseS3TestCase
 from thumbor_aws.storage import Storage
+from thumbor_aws.utils import normalize_path
 
 
 @pytest.mark.usefixtures("test_images")
@@ -44,7 +45,7 @@ class StorageTestCase(BaseS3TestCase):
             f"{self._prefix}{filepath}",
         )
         status, data, _ = await storage.get_data(
-            self.bucket_name, storage.normalize_path(filepath)
+            self.bucket_name, normalize_path(storage.root_path, filepath)
         )
         expect(status).to_equal(200)
         expect(data).to_equal(expected)
@@ -66,7 +67,8 @@ class StorageTestCase(BaseS3TestCase):
             f"{self._prefix}{filepath}.txt",
         )
         status, data, _ = await storage.get_data(
-            self.bucket_name, storage.normalize_path(filepath + ".txt")
+            self.bucket_name,
+            normalize_path(storage.root_path, filepath + ".txt"),
         )
         expect(status).to_equal(200)
         expect(data).to_equal("ACME-SEC")
@@ -91,7 +93,7 @@ class StorageTestCase(BaseS3TestCase):
         )
         status, data, _ = await storage.get_data(
             self.bucket_name,
-            storage.normalize_path(filepath + ".detectors.txt"),
+            normalize_path(storage.root_path, filepath + ".detectors.txt"),
         )
         expect(status).to_equal(200)
         expect(data).to_equal(b'{"some": "data"}')
@@ -122,7 +124,9 @@ class StorageTestCase(BaseS3TestCase):
         await storage.put(filepath, expected)
 
         status, data, _ = await storage.get_data(
-            self.bucket_name, storage.normalize_path(filepath), expiration=0
+            self.bucket_name,
+            normalize_path(storage.root_path, filepath),
+            expiration=0,
         )
 
         expect(status).to_equal(410)
@@ -139,7 +143,7 @@ class StorageTestCase(BaseS3TestCase):
             storage.__class__, "get_location", return_value=None
         ):
             response = await storage.upload(
-                storage.normalize_path(filepath),
+                normalize_path(storage.root_path, filepath),
                 b"ACME-SEC2",
                 "application/text",
                 "https://my-site.com/{bucket_name}",
@@ -161,7 +165,7 @@ class StorageTestCase(BaseS3TestCase):
         filepath = f"/test/can_put_file_{uuid4()}"
 
         await storage.upload(
-            storage.normalize_path(filepath + ".txt"),
+            normalize_path(storage.root_path, filepath + ".txt"),
             b"ACME-SEC2",
             "application/text",
             "http://my-site.com",
@@ -178,7 +182,7 @@ class StorageTestCase(BaseS3TestCase):
         storage = Storage(self.context)
         filepath = f"/test/can_put_file_{uuid4()}"
         await storage.upload(
-            storage.normalize_path(filepath + ".detectors.txt"),
+            normalize_path(storage.root_path, filepath + ".detectors.txt"),
             b'{"some": "data"}',
             "application/text",
             "",
diff --git a/thumbor_aws/result_storage.py b/thumbor_aws/result_storage.py
index 25caf32..de46903 100644
--- a/thumbor_aws/result_storage.py
+++ b/thumbor_aws/result_storage.py
@@ -10,7 +10,6 @@
 
 
 from datetime import datetime, timezone
-from urllib.parse import unquote
 
 from deprecated import deprecated
 from thumbor.engines import BaseEngine
diff --git a/thumbor_aws/storage.py b/thumbor_aws/storage.py
index eae8036..6a97b41 100644
--- a/thumbor_aws/storage.py
+++ b/thumbor_aws/storage.py
@@ -10,7 +10,6 @@
 
 from json import dumps, loads
 from typing import Any
-from urllib.parse import unquote
 
 from thumbor import storages
 from thumbor.engines import BaseEngine
diff --git a/thumbor_aws/utils.py b/thumbor_aws/utils.py
index 0e2e306..f29c972 100644
--- a/thumbor_aws/utils.py
+++ b/thumbor_aws/utils.py
@@ -1,5 +1,6 @@
 from urllib.parse import unquote
 
+
 def normalize_path(prefix: str, path: str) -> str:
     """Convert a URL received from Thumbor to a key in a S3 bucket."""
     # Thumbor calls load/store functions with a URL that is  URL-encoded and

@github-actions
Copy link
Contributor

This PR is stale because it has been open 45 days with no activity. Remove the stale label or add a comment, or this PR will be closed in 10 days. You can always re-open if you feel this is something we should still keep working on. Tag @heynemann for more information.

@github-actions github-actions bot added the Stale label Jan 30, 2023
@aaugustin
Copy link
Contributor Author

aaugustin commented Jan 31, 2023

I fixed the tests (thanks @Tenzer!) and rebased on top of main.

I am happy to invest a bit of time to bring this PR in a mergeable state. However, before I invest, I would love a sign of life from current maintainers. The only sign we're getting on #66, #80, and here is the stale-bot insisting that the issue is stale when actually it's still present.

Tenzer's original report mentioned CloudFront and nginx. They're so widely used that it would be very surprising if they had URL-encoding bugs. In addition, I'm not using anything in front of Thumbor and I'm seeing the same issue.

You can easily reproduce by creating a an image file with a special character (e.g. café.jpg) in its name (or path) in a S3 bucket and attempting to fetch it with thumbor.

I think the issue should also happens if there's a space in the name or path. That would be worth fixing, even if you reject the idea to support non-ASCII file names.

EDIT - what I said isn't 100% correct - actually @guilhermef asked @heynemann to review #80.

@guilhermef
Copy link
Member

Thanks for the PR @aaugustin, thumbor-aws should support non-ASCII file names.

@coveralls
Copy link

coveralls commented Jan 31, 2023

Pull Request Test Coverage Report for Build 4051457320

  • 25 of 25 (100.0%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 91.52%

Totals Coverage Status
Change from base Build 3762712213: 0%
Covered Lines: 313
Relevant Lines: 342

💛 - Coveralls

@aaugustin
Copy link
Contributor Author

Thank you so much! I can drop my fork now :-)

aaugustin added a commit to aaugustin/myks-thumbor that referenced this pull request Feb 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Paths with URL encoded characters
4 participants