-
Notifications
You must be signed in to change notification settings - Fork 86
chore: Adding implmentation of RequestBody and RewindableContent #3363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… checksum support.
| String getMd5() throws IOException { | ||
| try (SeekableByteChannel in = Files.newByteChannel(path, StandardOpenOption.READ)) { | ||
| in.position(readOffset); | ||
| MessageDigest md = MessageDigest.getInstance("MD5"); | ||
| byte[] buffer = new byte[8192]; | ||
| ByteBuffer byteBuffer = ByteBuffer.wrap(buffer); | ||
| while (in.read(byteBuffer) != -1) { | ||
| byteBuffer.flip(); | ||
| md.update(byteBuffer); | ||
| byteBuffer.clear(); | ||
| } | ||
| return Base64.getEncoder().encodeToString(md.digest()); | ||
| } catch (NoSuchAlgorithmException e) { | ||
| throw new IOException(e); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to send MD5 to gcs, or can we send just the crc32c?
There's not much use in computing both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do without MD5 for now.
Will add it later if we get a strong pushback from Salesforce.
| com.google.common.hash.Hasher crc32cHasher = Hashing.crc32c().newHasher(); | ||
| for (ByteBuffer buffer : buffers) { | ||
| crc32cHasher.putBytes(buffer.duplicate()); | ||
| } | ||
| int crc32cInt = crc32cHasher.hash().asInt(); | ||
| ByteBuffer bb = ByteBuffer.allocate(4); | ||
| bb.putInt(crc32cInt); | ||
| return Base64.getEncoder().encodeToString(bb.array()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use our existing crc32c infrastructure and options.
| com.google.common.hash.Hasher crc32cHasher = Hashing.crc32c().newHasher(); | |
| for (ByteBuffer buffer : buffers) { | |
| crc32cHasher.putBytes(buffer.duplicate()); | |
| } | |
| int crc32cInt = crc32cHasher.hash().asInt(); | |
| ByteBuffer bb = ByteBuffer.allocate(4); | |
| bb.putInt(crc32cInt); | |
| return Base64.getEncoder().encodeToString(bb.array()); | |
| GuavaHasher hasher; | |
| { | |
| Hasher defaultHasher = Hasher.defaultHasher(); | |
| if (defaultHasher instanceof NoOpHasher) { | |
| return null; | |
| } else { | |
| hasher = Hasher.enabled(); | |
| } | |
| } | |
| Crc32cLengthKnown cumulative = Crc32cValue.zero(); | |
| for (ByteBuffer buffer : buffers) { | |
| cumulative = cumulative.concat(hasher.hash(buffer::duplicate)); | |
| } | |
| return Utils.crc32cCodec.encode(cumulative.getValue()); |
| String getMd5() throws IOException { | ||
| return ""; | ||
| } | ||
|
|
||
| @Override | ||
| String getCrc32c() throws IOException { | ||
| return ""; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These values aren't correct. Empty bytes do have md5 and crc32c values that aren't empty string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WIP changes. It was a draft PR.
| abstract String getMd5() throws IOException; | ||
|
|
||
| abstract String getCrc32c() throws IOException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is the best place for this logic to live. Especially in the case of the File based one, the entire file would have to be read from disk before processing could proceed. gRPC has streaming checksum support, and JSON should be getting it soon as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me raise this PR along with UploadParts changes only, it would make it easierin terms of review.
| import java.security.MessageDigest; | ||
| import java.security.NoSuchAlgorithmException; | ||
| import java.util.Arrays; | ||
| import java.util.Base64; // Changed from com.google.api.client.util.Base64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is new, not changed from a previous value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WIP changes. It was a draft PR.
| Preconditions.checkArgument( | ||
| offset <= totalLength, | ||
| "provided offset must be less than or equal to totalLength (%s <= %s)", | ||
| "provided offset must be less than orequal to totalLength (%s <= %s)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert this please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WIP changes. It was a draft PR.
| assertThrows( | ||
| HttpResponseException.class, | ||
| () -> | ||
| multipartUploadHttpRequestManager.sendAbortMultipartUploadRequest( | ||
| endpoint, request, httpStorageOptions)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this change, please revert here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WIP changes. It was a draft PR.
| public UploadPartResponse uploadPart(UploadPartRequest request, RequestBody requestBody) { | ||
| return retrier.run( | ||
| retryAlgorithmManager.idempotent(), | ||
| () -> httpRequestManager.sendUploadPartRequest(uri, request, requestBody), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to do dirty state tracking, and to actually rewind the content before a followup invocation. See
java-storage/google-cloud-storage/src/main/java/com/google/cloud/storage/JsonResumableSession.java
Lines 62 to 76 in 763be91
| AtomicBoolean dirty = new AtomicBoolean(false); | |
| return retrier.run( | |
| () -> { | |
| if (dirty.getAndSet(true)) { | |
| ResumableOperationResult<@Nullable StorageObject> query = query(); | |
| long persistedSize = query.getPersistedSize(); | |
| if (contentRange.endOffsetEquals(persistedSize) || query.getObject() != null) { | |
| return query; | |
| } else { | |
| task.rewindTo(persistedSize); | |
| } | |
| } | |
| return task.call(); | |
| }, | |
| Decoder.identity()); |
Additionally, this should pass the RewindableContent to sendUploadPartRequest so that this method manages the lifecycle of it and integrates correctly with the retries.
| return false; | ||
| } | ||
|
|
||
| abstract String getCrc32c(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: return Crc32cLengthKnown and let the consumer of this value own the logic of encoding the value for the respected use.
For example, in grpc, crc32c values are ints not base64 encoded 4-byte big-endian representation of an unsigned 32-bit integer.
| private void addHeadersForUploadPart(RequestBody requestBody, HttpHeaders headers) { | ||
| headers.put("x-goog-hash", "crc32c=" + requestBody.getContent().getCrc32c()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: one-line logic-less helper methods don't help
Is there a plan to add more in here?
Refer: Support XML MPU in GCS Java SDK #3348