druid.storage.type=s3 now honors druid.storage.zip to allow storing segments in s3 without zip compression #18544
Conversation
Force-pushed from 93a05a5 to 981707c (…egments in s3 without zip compression)
```java
  );
}
catch (AmazonServiceException e) {
  if (S3Utils.ERROR_ENTITY_TOO_LARGE.equals(S3Utils.getS3ErrorCode(e))) {
```
This whole block is duplicated, would be good to dedupe it.
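One possible shape for the dedupe, assuming both call sites share this catch-and-fallback pattern (`upload` and `fallback` are hypothetical placeholders, not existing Druid methods; `AmazonServiceException` is unchecked, so plain `Runnable`s suffice):

```java
// Hypothetical helper: run the primary upload and fall back only when S3
// reports the object as too large; rethrow anything else.
private static void uploadWithEntityTooLargeFallback(Runnable upload, Runnable fallback)
{
  try {
    upload.run();
  }
  catch (AmazonServiceException e) {
    if (S3Utils.ERROR_ENTITY_TOO_LARGE.equals(S3Utils.getS3ErrorCode(e))) {
      fallback.run();
    }
    else {
      throw e;
    }
  }
}
```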
...nsions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/S3DataSegmentPuller.java
```java
final URI uri = objectLocation.toUri(S3StorageDruidModule.SCHEME);
final ByteSource byteSource = getByteSource(uri);
final File outFile = new File(outDir, Paths.get(objectLocation.getPath()).getFileName().toString());
outFile.createNewFile();
```
Code scanning / CodeQL notice: Ignored error status of call
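A minimal way to address this, assuming a pre-existing file should be an error at this point (a sketch, not the PR's code; alternatively the status could be logged and deliberately ignored):

```java
// createNewFile() returns false when the file already exists; surface that
// instead of silently discarding the status.
if (!outFile.createNewFile()) {
  throw new IOException("Output file already exists: " + outFile);
}
```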
```java
private static final DataSegment DATA_SEGMENT_1_NO_ZIP = new DataSegment(
    "test",
    Intervals.of("2015-04-12/2015-04-13"),
    "1",
    ImmutableMap.of("bucket", TEST_BUCKET, "key", KEY_1 + "/"),
    null,
    null,
    NoneShardSpec.instance(),
    0,
    1
);
```
Code scanning / CodeQL notice (test): Deprecated method or constructor invocation: DataSegment.DataSegment
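If cleaning up the deprecation is in scope, `DataSegment` also has a builder; a sketch of the same fixture with it (the builder method names below are assumptions from memory of the Druid codebase and worth double-checking):

```java
// Same fixture via the builder; 0 and 1 map to binaryVersion and size in the
// deprecated constructor above.
private static final DataSegment DATA_SEGMENT_1_NO_ZIP = DataSegment.builder()
    .dataSource("test")
    .interval(Intervals.of("2015-04-12/2015-04-13"))
    .version("1")
    .loadSpec(ImmutableMap.of("bucket", TEST_BUCKET, "key", KEY_1 + "/"))
    .shardSpec(NoneShardSpec.instance())
    .binaryVersion(0)
    .size(1)
    .build();
```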
```java
private static final DataSegment DATA_SEGMENT_2_NO_ZIP = new DataSegment(
    "test",
    Intervals.of("2015-04-13/2015-04-14"),
    "1",
    ImmutableMap.of("bucket", TEST_BUCKET, "key", KEY_2 + "/"),
    null,
    null,
    NoneShardSpec.instance(),
    0,
    1
);
```
Code scanning / CodeQL notice (test): Deprecated method or constructor invocation: DataSegment.DataSegment
```java
object1.setBucketName(bucket);
object1.setKey(keyPrefix + "meta.smoosh");
object1.getObjectMetadata().setLastModified(new Date(0));
object1.setObjectContent(new FileInputStream(tmpFile));
```
Code scanning / CodeQL warning (test): Potential input resource leak
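One way to resolve the warning, assuming the fixture file is small enough to buffer (a `ByteArrayInputStream` holds no OS handle, so nothing leaks if the test exits early):

```java
// Read the fixture into memory once; ByteArrayInputStream needs no close().
// Requires java.io.ByteArrayInputStream and java.nio.file.Files.
object1.setObjectContent(new ByteArrayInputStream(Files.readAllBytes(tmpFile.toPath())));
```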
```java
object2.setBucketName(bucket);
object2.setKey(keyPrefix + "00000.smoosh");
object2.getObjectMetadata().setLastModified(new Date(0));
object2.setObjectContent(new FileInputStream(tmpFile));
```
Code scanning / CodeQL warning (test): Potential input resource leak
Description
This PR adds the capability to store segments in s3 without zip compression, similar to the 'local' deep storage option. This is mainly for experimentation at this point, but I went ahead and documented it just in case anyone else wants to experiment 🤷
When `druid.storage.zip` is false, the load spec stores the prefix to use instead of the exact object location; the pullers/killers/movers check for a path that ends with `/`, do a list operation under that prefix, and apply the operation to each of the results.
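A minimal sketch of that dispatch (not the PR's actual code), using AWS SDK v1 types already visible in this diff; `applyTo` stands in for the concrete pull/kill/move step:

```java
import java.util.function.BiConsumer;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.S3ObjectSummary;

class PrefixDispatchSketch
{
  // Apply an operation to a load-spec location: a key ending in "/" is a
  // prefix (unzipped layout), anything else is a single object (zipped layout).
  static void applyToLocation(AmazonS3 s3, String bucket, String key, BiConsumer<String, String> applyTo)
  {
    if (key.endsWith("/")) {
      ListObjectsV2Request request = new ListObjectsV2Request()
          .withBucketName(bucket)
          .withPrefix(key);
      ListObjectsV2Result result;
      do {
        result = s3.listObjectsV2(request);
        for (S3ObjectSummary summary : result.getObjectSummaries()) {
          applyTo.accept(summary.getBucketName(), summary.getKey());
        }
        // Paginate: S3 returns at most 1000 keys per list call.
        request.setContinuationToken(result.getNextContinuationToken());
      } while (result.isTruncated());
    } else {
      applyTo.accept(bucket, key);
    }
  }
}
```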