improvement(uploads): add multipart upload + batching + retries#938
improvement(uploads): add multipart upload + batching + retries#938
Conversation
|
The latest updates on your projects. Learn more about Vercel for Git ↗︎
1 Skipped Deployment
|
There was a problem hiding this comment.
Greptile Summary
This PR implements a comprehensive file upload system with retry logic, multipart uploads, and enhanced UI feedback for the knowledge base feature. The changes span three key areas:
Core Upload Logic (use-knowledge-upload.ts): The hook now supports three upload strategies - direct uploads via presigned URLs, multipart uploads for large files (>50MB), and API fallback when cloud storage isn't configured. Key additions include exponential backoff retry logic with configurable timeouts, concurrent batch processing with race condition prevention, and detailed progress tracking per file. The implementation uses atomic state updates to handle complex concurrent upload scenarios while providing real-time status updates.
Multipart Upload API (multipart/route.ts): A new API endpoint enables S3 multipart uploads through four actions: initiate, get-part-urls, complete, and abort. This follows AWS S3's standard multipart upload pattern, allowing large files to be uploaded in smaller chunks that can be retried independently. The endpoint is currently restricted to S3 storage providers and hardcoded for knowledge-base uploads.
Enhanced UI (upload-modal.tsx): The upload modal now displays individual file progress bars, status icons (loading, success, failure), and detailed error messages. The interface shows granular progress information including current upload stage and file completion counts. The modal height was increased to accommodate the expanded status display.
These changes work together to solve reliability issues with large file uploads while providing much better user feedback during the upload process. The architecture supports various file sizes and storage configurations while handling edge cases like network failures and concurrent uploads.
PR Description Notes:
- The PR description is a template with no actual content filled in
- Missing description of changes, issue reference, testing details, and type of change selection
Confidence score: 2/5
- This PR requires careful review due to security concerns and missing validation in critical upload functionality
- Score lowered due to authentication bypass, input validation gaps, and complex state management without comprehensive error handling
- Pay close attention to
apps/sim/app/api/files/multipart/route.tsfor security issues andapps/sim/app/workspace/[workspaceId]/knowledge/hooks/use-knowledge-upload.tsfor race conditions
3 files reviewed, 5 comments
apps/sim/app/workspace/[workspaceId]/knowledge/hooks/use-knowledge-upload.ts
Show resolved
Hide resolved
apps/sim/app/workspace/[workspaceId]/knowledge/hooks/use-knowledge-upload.ts
Show resolved
Hide resolved
apps/sim/app/workspace/[workspaceId]/knowledge/hooks/use-knowledge-upload.ts
Show resolved
Hide resolved
apps/sim/app/workspace/[workspaceId]/knowledge/hooks/use-knowledge-upload.ts
Show resolved
Hide resolved
There was a problem hiding this comment.
Greptile Summary
This review covers only the changes made since the last review (commit c444a5d), not the entire PR.
The most recent changes focus on adding authentication to the file upload system across three API endpoints. The changes add session-based authentication checks to the /api/files/upload, /api/files/presigned, and /api/files/multipart routes using the getSession() function from the auth library.
In the upload route, the authentication check simply blocks unauthenticated users from uploading files. The presigned URL endpoint goes further by replacing the client-provided userId parameter with a sessionUserId extracted from the authenticated session, eliminating the security risk of trusting user-provided identity data. The multipart upload endpoint also includes the same authentication pattern.
These changes address a critical security vulnerability where the file upload endpoints were previously accessible to unauthenticated users, which could lead to abuse, unauthorized storage costs, and potential malicious file uploads. The implementation follows established patterns used throughout the codebase's API endpoints and maintains backward compatibility while strengthening security posture.
Confidence score: 4/5
- This authentication addition is a crucial security improvement with minimal risk of breaking existing functionality
- Score reflects solid implementation following established patterns, but the multipart endpoint still has underlying issues noted in previous reviews
- Pay close attention to the multipart endpoint which still lacks proper validation, rate limiting, and error handling mechanisms
3 files reviewed, no comments
…tudioai#938) * File upload retries + multipart uploads * Lint * FIle uploads * File uploads 2 * Lint * Fix file uploads * Add auth to file upload routes * Lint
Summary
Fixes #(issue)
Type of Change
Testing
Manual testing
Checklist