feat: server-side passthrough for same-credential encrypted COPY #107
Merged
ServerSideHannes merged 2 commits intoJul 3, 2026
Scylla Manager's backup copies each SSTable to its snapshot-tagged key
(CopyObject, MetadataDirective=COPY, same bucket). The proxy handled every
encrypted copy via _copy_encrypted, i.e. GET source from upstream -> decrypt ->
re-encrypt -> PUT dest. Because client-side encryption defeats a native
server-side copy, a metadata-only "rename" became a full download + re-upload
of the whole object. Measured live, this drove ~750 MB/s each way to Hetzner
(encrypt ~= decrypt), saturating the CPU/AES-bound fleet and triggering a
PutObject 503 SlowDown storm that stalled the daily backup at ~16%.
The ciphertext is not bound to the object key name: GCM AAD is None, the DEK is random and stored
in object metadata (isec) / the multipart sidecar, and nonces are embedded in
the ciphertext. So a byte-identical copy that keeps the same wrapped-DEK
metadata decrypts fine regardless of its key name.
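As a rough illustration of that argument, the toy model below (not the proxy's code, and with no real cryptography) lists the inputs a decrypt would read and shows that a byte-identical copy yields exactly the same inputs. The 12-byte nonce prefix, the StoredObject type and both helper names are assumptions made for this sketch; only the isec / isec-kid metadata names and the AAD of None come from the description above.

```python
from dataclasses import dataclass, field

NONCE_LEN = 12  # assumption: a 96-bit GCM nonce prefixed to the ciphertext


@dataclass
class StoredObject:
    """Toy stand-in for an object as the upstream bucket stores it."""
    body: bytes                                   # nonce || ciphertext || tag
    metadata: dict = field(default_factory=dict)  # user metadata (isec, isec-kid)


def decryption_inputs(obj: StoredObject):
    """Collect everything a decrypt would need; the object key name is absent."""
    wrapped_dek = obj.metadata["isec"]
    kid = obj.metadata["isec-kid"]
    nonce, ciphertext = obj.body[:NONCE_LEN], obj.body[NONCE_LEN:]
    aad = None  # per the description: GCM AAD is None
    return wrapped_dek, kid, nonce, ciphertext, aad


def server_side_copy(bucket: dict, src: str, dst: str) -> None:
    """Model CopyObject with MetadataDirective=COPY: same bytes, same metadata."""
    source = bucket[src]
    bucket[dst] = StoredObject(body=source.body, metadata=dict(source.metadata))


bucket = {
    "X": StoredObject(
        body=b"\x00" * NONCE_LEN + b"opaque-ciphertext-and-tag",
        metadata={"isec": "wrapped-dek-placeholder", "isec-kid": "cred-a"},
    )
}
server_side_copy(bucket, "X", "X.sm_tag")

# The copy exposes the same decryption inputs as the source.
same_inputs = decryption_inputs(bucket["X"]) == decryption_inputs(bucket["X.sm_tag"])
```

If the AAD included the key name, the last comparison would still hold but decryption of the copy would fail authentication, which is why the AAD being None matters here.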
handle_copy_object now takes a native server-side CopyObject when:
- metadata_directive == COPY, and
- the source was wrapped by the calling credential (src kid == caller), so
re-keying would be a no-op, and
- the ciphertext is within the single-op CopyObject limit (5 GiB).
For multipart objects it also server-side-copies the .meta frame-map sidecar.
Cross-credential copies (must re-key under the caller's KEK), REPLACE (new
metadata) and >5 GiB objects still take the decrypt/re-encrypt path.
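The three conditions can be read as a single predicate. The sketch below is one possible shape for it, not the code in handle_copy_object: the function name, parameter names and the inclusive 5 GiB boundary are assumptions, and the .meta sidecar copy for multipart objects is not modeled.

```python
COPY_OBJECT_MAX_BYTES = 5 * 1024**3  # single-op CopyObject limit (5 GiB)


def can_passthrough(metadata_directive: str, src_kid: str, caller_kid: str,
                    ciphertext_size: int) -> bool:
    """Return True when a native server-side CopyObject is safe to use."""
    return (
        metadata_directive == "COPY"                  # metadata is kept as-is
        and src_kid == caller_kid                     # re-keying would be a no-op
        and ciphertext_size <= COPY_OBJECT_MAX_BYTES  # fits a single CopyObject
    )


# Same credential, COPY directive, 1.68 GB object: passthrough.
same_cred = can_passthrough("COPY", "cred-a", "cred-a", 1_680_000_000)
# Each of the following falls back to decrypt/re-encrypt.
replace = can_passthrough("REPLACE", "cred-a", "cred-a", 1_680_000_000)
cross_cred = can_passthrough("COPY", "cred-a", "cred-b", 1_680_000_000)
too_large = can_passthrough("COPY", "cred-a", "cred-a", 6 * 1024**3)
```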
Verified against real Hetzner: CopyObject COPY on a 1.68 GB encrypted SSTable
preserves isec/isec-kid user metadata byte-for-byte.
Tests: tests/integration/test_copy_passthrough.py covers single-object and
multipart passthrough (no source download / dest re-upload, round-trips) and
the REPLACE re-encrypt fallback.
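One way to assert "no source download / dest re-upload" is to run the copy against an upstream double that records every operation. The sketch below shows the idea only and does not reproduce test_copy_passthrough.py: RecordingUpstream and passthrough_copy are hypothetical names invented for this example.

```python
class RecordingUpstream:
    """Hypothetical in-memory upstream that logs which operations were called."""

    def __init__(self):
        self.objects = {}
        self.calls = []

    def put_object(self, key, body, metadata):
        self.calls.append("PutObject")
        self.objects[key] = (body, dict(metadata))

    def get_object(self, key):
        self.calls.append("GetObject")
        return self.objects[key]

    def copy_object(self, src, dst):
        self.calls.append("CopyObject")
        body, metadata = self.objects[src]
        self.objects[dst] = (body, dict(metadata))


def passthrough_copy(upstream, src, dst):
    """Stand-in for the passthrough branch: one server-side copy, nothing else."""
    upstream.copy_object(src, dst)


upstream = RecordingUpstream()
upstream.put_object("X", b"ciphertext", {"isec": "wrapped-dek", "isec-kid": "cred-a"})
upstream.calls.clear()  # ignore the setup upload

passthrough_copy(upstream, "X", "X.sm_tag")

assert upstream.calls == ["CopyObject"]                        # no GET, no PUT
assert upstream.objects["X.sm_tag"] == upstream.objects["X"]   # bytes and metadata kept
```

The same double can check the fallback path by asserting that a REPLACE copy does record a GetObject followed by a PutObject.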
Problem
Scylla Manager backs up each SSTable, then copies it to its snapshot-tagged key (X -> X.sm_<tag>, same bucket, CopyObject with MetadataDirective=COPY). Normally a copy is a free server-side metadata op. But because this proxy does client-side AES-256-GCM, _copy_encrypted implements every encrypted copy as GET source -> decrypt -> re-encrypt -> PUT dest, turning a "rename" into a full download + re-upload of the whole object.

Measured live: the fleet was pushing ~750 MB/s up and ~730 MB/s down to Hetzner simultaneously (bytes_encrypted ~= bytes_decrypted), with ~1000 COPY_OBJECT/3 min. Real new-data goodput was tiny (~24 MB/s); the rest was copy amplification. It saturated the CPU/AES-bound proxy and produced a PutObject 503 SlowDown storm (940 rejections/3 min on scylla), stalling the daily backup at ~16%.

Fix
The ciphertext isn't bound to the object key name: GCM AAD is None, the DEK is random and stored in object metadata (isec) / the multipart sidecar, and nonces are embedded in the ciphertext. So a byte-identical copy that keeps the same wrapped-DEK metadata decrypts fine under any key name.

handle_copy_object now issues a native server-side CopyObject when all hold:
- metadata_directive == "COPY",
- the source was wrapped by the calling credential (src kid == caller), so re-keying is a no-op,
- the ciphertext is within the single-op CopyObject limit (5 GiB).

Multipart objects also get their .meta frame-map sidecar server-side-copied. Unchanged fallbacks: cross-credential copies still re-encrypt (re-key under the caller's KEK), and so do REPLACE and >5 GiB objects.

Verification
- Against real Hetzner: CopyObject COPY on a 1.68 GB encrypted SSTable preserves isec/isec-kid user metadata byte-for-byte (same size). Temp copy deleted after.
- Integration tests (tests/integration/test_copy_passthrough.py): single-object + multipart passthrough (asserts no source download / no dest re-upload, round-trips to original plaintext) and the REPLACE re-encrypt fallback.
- Full unit suite (457) + per-key/streaming-copy/copy-governing suites green; ruff clean.

Impact
Same-credential dedup copies (Scylla's case) drop from "whole file down + re-encrypt + whole file up" to a metadata-only server-side op, removing the ~750 MB/s-each-way amplification and the 503 storm, so the backup can finish and finalize its manifest.
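For a sense of scale, the measured figures can be turned into a rough amplification estimate. This is back-of-the-envelope arithmetic on the numbers quoted above, assuming the ~24 MB/s of new data is counted inside the ~750 MB/s upload; the variable names are only for this sketch.

```python
upload_mb_s = 750.0     # measured upload to Hetzner
download_mb_s = 730.0   # measured download from Hetzner
goodput_mb_s = 24.0     # measured new-data upload

# Upload bandwidth spent re-uploading copied objects.
copy_upload_mb_s = upload_mb_s - goodput_mb_s

# Fraction of all upload traffic that was copy re-upload.
copy_share = copy_upload_mb_s / upload_mb_s

# Bytes uploaded per byte of genuinely new data.
amplification = upload_mb_s / goodput_mb_s

print(f"copy re-upload: {copy_upload_mb_s:.0f} MB/s ({copy_share:.1%} of upload)")
print(f"upload amplification: {amplification:.2f}x")
```

The estimated copy re-upload rate (about 726 MB/s) is close to the measured download rate (~730 MB/s), which is consistent with each copied object being downloaded once and re-uploaded once.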
Follow-up (out of scope)