Add S3-style multi-part upload mechanism for large file parallel upload #1801
Conversation
Add endpoints for chunked file uploads to support large files:
- POST /files/upload/init - Initialize upload session
- PUT /files/upload/{uploadId} - Upload individual parts
- POST /files/upload/{uploadId}/complete - Assemble final file
- DELETE /files/upload/{uploadId} - Abort and cleanup
Parts are stored in a temp directory and assembled sequentially
on completion using copy_file_range for efficiency.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
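For orientation, here is a hypothetical client-side sketch of the flow in Go. The path/totalSize/partSize/uploadId field names and the response shape are assumptions pieced together from the commit messages below, not a confirmed wire format; error handling is trimmed.

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "sync"
)

const partSize = 8 << 20 // 8 MiB per part (arbitrary choice for this sketch)

func uploadLargeFile(base, path string, data []byte) error {
    // 1. Init: declare the destination path, total size, and part size up front.
    initBody, _ := json.Marshal(map[string]any{
        "path":      path,      // assumed field name
        "totalSize": len(data), // assumed field name
        "partSize":  partSize,  // assumed field name
    })
    resp, err := http.Post(base+"/files/upload/init", "application/json", bytes.NewReader(initBody))
    if err != nil {
        return err
    }
    defer resp.Body.Close()

    var initResp struct {
        UploadID string `json:"uploadId"` // assumed response shape
    }
    if err := json.NewDecoder(resp.Body).Decode(&initResp); err != nil {
        return err
    }

    // 2. Upload parts in parallel; the server writes each one at a fixed offset.
    numParts := (len(data) + partSize - 1) / partSize
    var wg sync.WaitGroup
    for i := 0; i < numParts; i++ {
        wg.Add(1)
        go func(part int) {
            defer wg.Done()
            start := part * partSize
            end := start + partSize
            if end > len(data) {
                end = len(data)
            }
            url := fmt.Sprintf("%s/files/upload/%s?part=%d", base, initResp.UploadID, part)
            req, _ := http.NewRequest(http.MethodPut, url, bytes.NewReader(data[start:end]))
            if r, err := http.DefaultClient.Do(req); err == nil {
                r.Body.Close()
            }
        }(i)
    }
    wg.Wait()

    // 3. Complete: the server verifies all parts are present and contiguous.
    r, err := http.Post(fmt.Sprintf("%s/files/upload/%s/complete", base, initResp.UploadID), "application/json", nil)
    if err != nil {
        return err
    }
    r.Body.Close()

    return nil
}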
- Remove fsync in Complete (was ~40-100ms for 100MB)
- Move temp directory cleanup to background goroutine

For sandbox use cases, immediate durability is not critical; the kernel will flush data to disk eventually.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add POST/files/upload/ to allowed path prefixes for auth bypass
- Add max session limit (100) to prevent resource exhaustion
- Fix race condition in complete handler by copying parts under lock
- Add logging for background cleanup errors
- Add startup cleanup of stale temp directories
- Fix test bug with part number string conversion
- Add test for max sessions limit

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…s issues

Security fixes:
- Validate uploadId is a valid UUID to prevent path traversal attacks
- Add max part size limit (100MB) to prevent DoS via disk exhaustion
- Validate part numbers are non-negative

Robustness fixes:
- Add completed flag to prevent race between part uploads and complete/abort
- Clean up destination file on assembly errors
- Validate parts are contiguous (0, 1, 2, ..., n-1) before assembly
- Warn on duplicate part number uploads (last write wins)
- Add session TTL (1 hour) with background cleanup goroutine
- Explicit destFile.Close() before marking success

Features:
- Add ETag header (MD5 hash) for uploaded parts
- Add CreatedAt timestamp to sessions for TTL tracking

Tests:
- Invalid upload ID format (path traversal)
- Negative part numbers
- Missing parts in sequence
- Upload part after complete started (conflict)
- Part size limit exceeded
- ETag header verification

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ETag was being computed and returned but never validated or used. Remove the unnecessary MD5 hash computation and header to simplify the code and improve performance.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace temp-file-based multipart upload with direct writes to the destination:
- Add totalSize and partSize to init request (breaking API change)
- Create and preallocate destination file at init time
- Write parts directly to file at computed offsets using WriteAt
- Eliminate temp directory, assembly phase, and copy operations
- Keep open file handle across session for concurrent part writes

This removes multiple layers of I/O overhead:
- No temp file creation per part
- No reading temp files back during assembly
- No sequential copy loop
- Single file open/close cycle

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
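The offset arithmetic this relies on, as a minimal sketch (names are illustrative, not the PR's actual identifiers): every part except the last is exactly partSize bytes, so part N always starts at N * partSize and parts can arrive in any order.

import "os"

// writePart is illustrative: the destination file is preallocated at init,
// so each part is written at its fixed, non-overlapping offset.
func writePart(f *os.File, partSize, partNumber int64, data []byte) error {
    _, err := f.WriteAt(data, partNumber*partSize)

    return err
}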
- Add blank lines before return statements (nlreturn)
- Check json.Encode error return values (errchkjson)
- Use integer range syntax for Go 1.22+ (intrange)
- Add t.Parallel() to TestMultipartUploadRouting (paralleltest)
- Use assert.Empty() instead of assert.Equal() (testifylint)
- Fix import ordering (gci)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The test was using "non-existent" as the upload ID, but the handler validates UUID format before checking session existence, causing a 400 instead of the expected 404.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The prefix-based matching for allowed paths caused "POST/files" to incorrectly match "POST/filesystem.Filesystem/*" Connect RPC endpoints, allowing unauthenticated access to filesystem operations on secure sandboxes.

Split allowed paths into exact matches (for static paths like /files) and prefix matches (for paths with dynamic segments like /files/upload/).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
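A minimal sketch of that split, with illustrative names and path sets (the real tables in the PR may differ):

import "strings"

// Exact matches cover static paths; prefixes are reserved for routes with
// dynamic segments, so "POST/files" can no longer shadow
// "POST/filesystem.Filesystem/*" Connect RPC endpoints.
var (
    allowedExactPaths   = map[string]struct{}{"POST/files": {}}
    allowedPathPrefixes = []string{"PUT/files/upload/", "POST/files/upload/", "DELETE/files/upload/"}
)

func isAllowedWithoutAuth(methodAndPath string) bool {
    if _, ok := allowedExactPaths[methodAndPath]; ok {
        return true
    }
    for _, prefix := range allowedPathPrefixes {
        if strings.HasPrefix(methodAndPath, prefix) {
            return true
        }
    }

    return false
}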
Explicitly clean paths returned by ExpandAndResolve to remove any .. or . components. This addresses the GitHub Advanced Security warning about path traversal, even though filepath.Abs already normalizes paths.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Extract path sanitization into a dedicated function that:
- Cleans the path using filepath.Clean
- Validates the result is an absolute path
- Rejects paths containing null bytes (path injection attack)

This breaks the taint chain for CodeQL security analysis by adding explicit validation before the path is used in file operations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Reverts the sanitizePath changes to path.go as they are not needed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move the mutex lock to only protect the PartsWritten map access instead of the entire WriteAt operation. WriteAt is safe for concurrent writes at different offsets, so parallel chunk uploads no longer serialize on disk I/O.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
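Roughly, the locking pattern becomes the following sketch. It assumes a session struct with a mutex (mu) and the PartsWritten, DestFile, and PartSize fields referenced elsewhere in this PR; exact names may differ.

// Unlocked: io.WriterAt permits parallel WriteAt calls on the same file
// when the ranges do not overlap, so parts no longer serialize on disk I/O.
func (s *MultipartUploadSession) writePart(partNumber int64, data []byte) error {
    if _, err := s.DestFile.WriteAt(data, partNumber*s.PartSize); err != nil {
        return err
    }

    // Locked: the shared map is the only state that needs the mutex.
    s.mu.Lock()
    s.PartsWritten[partNumber] = true
    s.mu.Unlock()

    return nil
}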
Add support for gzip-compressed request bodies on POST /files and
PUT /files/upload/{uploadId} endpoints. Clients can now send compressed
uploads by setting the Content-Encoding: gzip header.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
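A minimal sketch of how a handler can honor that header (illustrative, not the PR's exact code):

import (
    "compress/gzip"
    "io"
    "net/http"
)

// requestBody returns a reader that transparently decompresses the body
// when the client set Content-Encoding: gzip.
func requestBody(r *http.Request) (io.ReadCloser, error) {
    if r.Header.Get("Content-Encoding") == "gzip" {
        return gzip.NewReader(r.Body)
    }

    return r.Body, nil
}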
Keep gzip Content-Encoding support only in the multipart upload API.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move the max upload sessions check earlier in PostFilesUploadInit to fail fast before creating and preallocating the destination file.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The early check before file operations is sufficient.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
return
}

// Validate signing if needed
not sure about this part
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 35d1347079
- Reject truncated uploads by enforcing size == expectedSize instead of only checking size > expectedSize. This prevents silent data corruption when a client disconnects early.
- Reject part uploads when TotalSize is 0 (NumParts == 0) to prevent writing unexpected data to files declared as empty.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
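The first fix reduces to a strict equality guard, roughly like this sketch (names illustrative):

import "fmt"

// validatePartSize rejects both truncated and oversized parts; a short read
// from a client that disconnected early is now an error instead of silently
// leaving a hole in the preallocated file.
func validatePartSize(got, want int64, partNumber int) error {
    if got != want {
        return fmt.Errorf("part %d has size %d, expected %d", partNumber, got, want)
    }

    return nil
}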
Cursor Bugbot has reviewed your changes and found 3 potential issues.
jsonError(w, http.StatusTooManyRequests, fmt.Errorf("too many concurrent upload sessions (max %d)", maxUploadSessions))

return
}
Session limit check-then-act race condition
Medium Severity
The session limit check is not atomic with session insertion. The code checks len(a.uploads) with RLock, releases the lock, then performs file operations, and only later acquires the write lock to insert the session. Multiple concurrent requests can all pass the limit check before any inserts, allowing maxUploadSessions to be exceeded. This could lead to resource exhaustion (file descriptors, disk space).
Additional Locations (1)
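One possible fix, as a sketch: make the limit check and the reservation a single atomic step with a counter, and release the slot if the later file setup fails. This mirrors the maxUploadSessions limit of 100 from the commits above; the counter would live on the API struct in real code.

import "sync/atomic"

const maxUploadSessions = 100

var activeUploads atomic.Int64

// tryReserveUploadSlot atomically claims a session slot before any file I/O.
// Callers must call releaseUploadSlot on setup failure and session teardown.
func tryReserveUploadSlot() bool {
    for {
        n := activeUploads.Load()
        if n >= maxUploadSessions {
            return false
        }
        if activeUploads.CompareAndSwap(n, n+1) {
            return true
        }
    }
}

func releaseUploadSlot() {
    activeUploads.Add(-1)
}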
jsonError(w, http.StatusConflict, fmt.Errorf("upload session is already completing or aborted"))

return
}
Upload part and complete endpoints lack authentication validation
Medium Severity
PutFilesUploadUploadId, PostFilesUploadUploadIdComplete, and DeleteFilesUploadUploadId bypass general auth via allowedPathPrefixes but don't call validateSigning. Unlike PostFilesUploadInit which validates signatures, these endpoints rely entirely on the upload ID as an implicit bearer token. Anyone who obtains an upload ID can upload arbitrary parts, complete, or abort the upload without authentication.
Additional Locations (1)
This feels important.
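Sketch only — this PR doesn't show validateSigning's signature, so the call shape below is an assumption: the part/complete/abort handlers would run the same check PostFilesUploadInit does before trusting the uploadId.

// Hypothetical: validateSigning's real signature may differ, and the
// generated handler also takes part-number params (trimmed here).
func (a *API) PutFilesUploadUploadId(w http.ResponseWriter, r *http.Request, uploadId string /* , params ... */) {
    if err := a.validateSigning(r); err != nil { // assumed call shape
        jsonError(w, http.StatusUnauthorized, err)

        return
    }
    // ... existing part-upload logic ...
}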
numParts := int((body.TotalSize + body.PartSize - 1) / body.PartSize)
if numParts == 0 && body.TotalSize == 0 {
    numParts = 0 // Empty file, no parts needed
}
Missing validation allows DoS via excessive part count
High Severity
There's no minimum partSize validation (only > 0) and no limit on the resulting numParts. An attacker can create a session with totalSize near max int64 and partSize of 1, producing quintillions of parts. When PostFilesUploadUploadIdComplete is called, the loop at line 383 iterates over all NumParts, appending each missing part to a slice, causing CPU exhaustion and memory exhaustion (OOM crash).
Additional Locations (1)
This is a pretty valid point. We should probably have a minimum part size, or maximum parts, or both.
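Both guards together could look like this sketch; the limits are placeholders (S3 itself uses a 5 MiB minimum part size and a 10,000-part cap), and it assumes partSize > 0 was already validated.

import "fmt"

const (
    minPartSize = 5 << 20 // placeholder, mirrors S3's minimum
    maxParts    = 10_000  // placeholder, mirrors S3's part-count cap
)

func validateInitLimits(totalSize, partSize int64) error {
    // Allow a single small part for files smaller than minPartSize.
    if partSize < minPartSize && partSize < totalSize {
        return fmt.Errorf("partSize must be at least %d bytes", minPartSize)
    }
    if numParts := (totalSize + partSize - 1) / partSize; numParts > maxParts {
        return fmt.Errorf("too many parts (%d), max is %d", numParts, maxParts)
    }

    return nil
}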
| "404": | ||
| $ref: "#/components/responses/UploadNotFound" |
Might also need a UploadIncomplete error; 400?
| "400": | ||
| $ref: "#/components/responses/InvalidPath" |
What does "InvalidPath" mean? We also probably need a "PartAlreadyUploadedError", and/or "PartUploadInProgress"
a.uploadsLock.Lock()
now := time.Now()
for uploadID, session := range a.uploads {
    if now.Sub(session.CreatedAt) > uploadSessionTTL {
        // Mark as completed to prevent races
        if session.completed.CompareAndSwap(false, true) {
            delete(a.uploads, uploadID)
            // Close file handle and remove file in background
            go func(s *MultipartUploadSession) {
                s.DestFile.Close()
                if err := os.Remove(s.FilePath); err != nil && !os.IsNotExist(err) {
                    a.logger.Warn().Err(err).Str("filePath", s.FilePath).Msg("failed to cleanup expired upload file")
                }
            }(session)
            a.logger.Info().Str("uploadId", uploadID).Msg("cleaned up expired multipart upload session")
        }
    }
}
a.uploadsLock.Unlock()
It might read better if this were in a different function; then the ticker loop would just be:

for range ticker.C {
    a.checkUploads()
}

and you could use defer to ensure lock/unlock.
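The extracted function might look like this sketch, built from the quoted code above; cleanupSessionFile is a hypothetical helper, not something in the PR.

func (a *API) checkUploads() {
    a.uploadsLock.Lock()
    defer a.uploadsLock.Unlock()

    now := time.Now()
    for uploadID, session := range a.uploads {
        if now.Sub(session.CreatedAt) <= uploadSessionTTL {
            continue
        }
        if session.completed.CompareAndSwap(false, true) {
            delete(a.uploads, uploadID)
            go a.cleanupSessionFile(session) // hypothetical helper: close handle, remove file
            a.logger.Info().Str("uploadId", uploadID).Msg("cleaned up expired multipart upload session")
        }
    }
}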
if err := os.Remove(s.FilePath); err != nil && !os.IsNotExist(err) {
    a.logger.Warn().Err(err).Str("filePath", s.FilePath).Msg("failed to cleanup expired upload file")
}
dunno if you'd prefer this, but sometimes it's easier to read:

if err := os.Remove(s.FilePath); ignoreNotExist(err) != nil {
    a.logger.Warn(...)
}

func ignoreNotExist(err error) error {
    if os.IsNotExist(err) {
        return nil
    }
    return err
}
}

// Create upload ID
uploadID := uuid.New().String()
Pretty sure there's a uuid.NewString()
// Validate uploadId is a valid UUID to prevent path traversal
if _, err := uuid.Parse(uploadId); err != nil {
    a.logger.Error().Err(err).Str(string(logs.OperationIDKey), operationID).Str("uploadId", uploadId).Msg("invalid upload ID format")
    jsonError(w, http.StatusBadRequest, fmt.Errorf("invalid upload ID format: must be a valid UUID"))

    return
}
Don't think you need this, as session, exists := a.uploads[uploadId] does a better job of validating whether it's a real uploadId or not.
return
}

partNumber := params.Part
Can we make Part a uint so that parsing will fail, and we can avoid some checks?
TotalSize int64 // Total expected file size
PartSize  int64 // Size of each part (except possibly last)
NumParts  int   // Total number of expected parts
We could make these uints to avoid some failure modes.
There are some important things here (security, memory exhaustion attack, some missing errors). Let me know when those are done and I'll check it again.


Note
Medium Risk
Introduces new file-write endpoints with concurrency, disk-space, and lifecycle handling, plus changes request auth bypass logic for additional routes; bugs here could affect file integrity or expose unintended unauthenticated access.
Overview
Adds an S3-style multipart upload flow to envd with new endpoints to init an upload (POST /files/upload/init), upload parts (PUT /files/upload/{uploadId}?part=N), complete (POST /files/upload/{uploadId}/complete), and abort (DELETE /files/upload/{uploadId}). Uploads are written directly into a preallocated destination file, tracked in-memory with session limits/TTL cleanup, and validated for UUID upload IDs, part ranges, and exact part sizes.

Updates auth bypass logic to allow prefix-based matching for dynamic upload routes, extends the OpenAPI spec and generated server/client types, and adds unit tests covering happy paths and key failure cases (missing parts, invalid IDs, session limits, abort/complete races).
Written by Cursor Bugbot for commit 8183127.