@waleedlatif1
Collaborator

Summary

  • Handle larger files in the knowledge base (KB)

Type of Change

  • Bug fix

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel bot commented Dec 12, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
docs Skipped Skipped Dec 12, 2025 1:25am

@greptile-apps
Contributor

greptile-apps bot commented Dec 12, 2025

Greptile Overview

Greptile Summary

This PR improves handling of larger files in the knowledge base by extending processing timeouts and adding direct upload support for medium-sized files.

Key Changes:

  • Extended dead process threshold from 150s to 600s (10 minutes) to prevent premature timeout failures for larger file processing
  • Added direct upload path for files under 50MB when presigned URLs support it, improving upload reliability
  • Updated CSP to allow S3 and Azure Blob Storage image sources for serving uploaded files
  • Enhanced storage key extraction to properly handle s3/ and blob/ prefixes in file paths
  • Added CSP headers to login/signup pages for security consistency
  • Cleaned up duplicate comment blocks in upload hook

The changes are well-structured and address the core issue of handling larger files through multiple complementary improvements.
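
As a rough illustration of the timeout change listed above, the stale-processing check could look something like the following. This is a minimal sketch, not the code in `service.ts`: the constant name, the `ProcessingDocument` shape, and `isProcessingDead` are all hypothetical, and only the 150 s to 600 s values come from this summary.

```typescript
// Hypothetical names throughout; only the 150 s -> 600 s values come from the PR summary.
const DEAD_PROCESS_THRESHOLD_MS = 600 * 1000; // 10 minutes (previously 150 s)

// Assumed minimal shape of a document that is still being processed.
interface ProcessingDocument {
  id: string;
  processingStartedAt: Date;
}

// Returns true when processing has run longer than the threshold,
// i.e. the document would be marked as failed.
function isProcessingDead(
  doc: ProcessingDocument,
  now: Date = new Date(),
  thresholdMs: number = DEAD_PROCESS_THRESHOLD_MS,
): boolean {
  return now.getTime() - doc.processingStartedAt.getTime() > thresholdMs;
}
```

Under this sketch, a document that has been processing for 200 seconds is no longer treated as dead, whereas with the old 150-second threshold it would have been marked as failed.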

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • All changes are focused, well-implemented, and directly address the stated issue. The timeout extension is reasonable for large file processing, the direct upload logic properly checks for support before using it, CSP changes appropriately allow necessary storage domains, and file path handling correctly strips storage-specific prefixes.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| apps/sim/app/workspace/[workspaceId]/knowledge/hooks/use-knowledge-upload.ts | 5/5 | Added direct upload support for medium-sized files, removed duplicate comment blocks |
| apps/sim/lib/core/security/csp.ts | 5/5 | Added S3 and Azure Blob Storage domains to CSP img-src to support serving uploaded files |
| apps/sim/lib/knowledge/documents/service.ts | 5/5 | Increased timeout threshold from 150s to 600s for marking documents as failed |
| apps/sim/lib/uploads/utils/file-utils.ts | 5/5 | Enhanced storage key extraction to handle s3/ and blob/ prefixes in file paths |
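
To make the prefix handling in `file-utils.ts` more concrete, here is a small sketch of stripping a leading `s3/` or `blob/` segment from a path. The function name `stripStoragePrefix` and the example paths are assumptions for illustration; the real helper and the exact path shapes it accepts may differ.

```typescript
// Hypothetical helper, not the actual function in file-utils.ts.
// Strips a leading "s3/" or "blob/" segment so the remainder can be used as a storage key.
const STORAGE_PREFIXES = ["s3/", "blob/"] as const;

function stripStoragePrefix(path: string): string {
  // Drop any leading slashes first so "/s3/key" and "s3/key" behave the same.
  const trimmed = path.replace(/^\/+/, "");
  for (const prefix of STORAGE_PREFIXES) {
    if (trimmed.startsWith(prefix)) {
      return trimmed.slice(prefix.length);
    }
  }
  // No known storage prefix: return the path without leading slashes.
  return trimmed;
}
```

In this sketch only a prefix at the start of the path is removed, so a key that merely contains `s3/` further along is left untouched.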

Sequence Diagram

sequenceDiagram
    participant Client
    participant UploadHook as use-knowledge-upload
    participant API as /api/files
    participant Storage as S3/Azure Blob
    participant Processor as Document Processor
    
    Client->>UploadHook: uploadFiles(files)
    UploadHook->>UploadHook: Check file size
    
    alt File > 50MB (Large File)
        UploadHook->>API: Request multipart upload
        API-->>UploadHook: uploadId + part URLs
        UploadHook->>Storage: Upload parts in chunks (8MB)
        Storage-->>UploadHook: ETags
        UploadHook->>API: Complete multipart upload
        API-->>UploadHook: File URL
    else File <= 50MB with directUploadSupported
        UploadHook->>API: Request presigned URL
        API-->>UploadHook: Presigned URL + headers
        UploadHook->>Storage: Direct PUT with file
        Storage-->>UploadHook: Success
        UploadHook->>API: Confirm upload
        API-->>UploadHook: File URL
    else File <= 50MB without presigned
        UploadHook->>API: Upload through API
        API->>Storage: Store file
        API-->>UploadHook: File URL
    end
    
    UploadHook->>Processor: Trigger processing
    Processor->>Processor: Process document (up to 10 min timeout)
    
    alt Processing exceeds 10 minutes
        Processor->>Processor: Mark as failed (timeout)
    else Processing completes
        Processor-->>Client: Document ready
    end
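
The branching in the diagram above can be sketched as a small selection function. This is an illustrative sketch under stated assumptions, not the logic in `use-knowledge-upload.ts`: the names `chooseUploadStrategy` and `partCount` are hypothetical, "MB" is assumed to mean 1024 × 1024 bytes, and the boundary follows the diagram (`> 50MB` is multipart, `<= 50MB` is not).

```typescript
// Thresholds taken from the diagram; treating 1 MB as 1024 * 1024 bytes is an assumption.
const LARGE_FILE_THRESHOLD_BYTES = 50 * 1024 * 1024; // 50 MB
const CHUNK_SIZE_BYTES = 8 * 1024 * 1024; // 8 MB per multipart chunk

type UploadStrategy = "multipart" | "direct" | "api";

// Hypothetical helper mirroring the three branches of the diagram.
function chooseUploadStrategy(
  fileSizeBytes: number,
  directUploadSupported: boolean,
): UploadStrategy {
  if (fileSizeBytes > LARGE_FILE_THRESHOLD_BYTES) {
    return "multipart"; // large file: upload in chunks
  }
  // Medium or small file: direct PUT to storage if supported, else go through the API.
  return directUploadSupported ? "direct" : "api";
}

// Number of chunks a multipart upload would need at the assumed chunk size.
function partCount(fileSizeBytes: number): number {
  return Math.max(1, Math.ceil(fileSizeBytes / CHUNK_SIZE_BYTES));
}
```

For example, under these assumptions a 60 MB file takes the multipart path and is split into 8 parts, while a 10 MB file is uploaded directly when `directUploadSupported` is true and through the API otherwise.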

@greptile-apps greptile-apps bot left a comment

2 files reviewed, no comments

@waleedlatif1
Collaborator Author

@greptile

@greptile-apps greptile-apps bot left a comment

7 files reviewed, no comments

@waleedlatif1 waleedlatif1 merged commit 855e2c4 into staging Dec 12, 2025
9 checks passed
@waleedlatif1 waleedlatif1 deleted the fix/kb branch December 12, 2025 02:22