feat(platform): optimized file preview endpoint + rich artifact cards#13281
feat(platform): optimized file preview endpoint + rich artifact cards#13281Abhi1992002 wants to merge 7 commits into
Conversation
Add GET /api/workspace/files/{id}/preview which returns a small preview
instead of the full file: Pillow-resized WebP for images, pypdfium2
page-1 raster for PDFs, embedded docProps/thumbnail for Office docs, and
byte-capped partial reads for text. Thumbnails cache in Redis (by
checksum) and responses set browser cache headers.
Frontend renders per-type cards off the new endpoint: image/pdf/office
thumbnails, CSV mini-table, JSON, ICS event card, VCF contact card, and a
typed illustration fallback (React badge for .jsx/.tsx). Per-kind size
caps mirror the backend so large files skip the request entirely. Rename
the Artifacts page and navbar entry to "Files".
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds an end-to-end file preview feature: backend FastAPI preview endpoint that renders/caches WebP thumbnails and byte-limited text previews, storage helpers for partial-range retrieval, and frontend preview classification, parsers, and modular preview UI components; UI text rebrands “Artifacts” → “Files.” ChangesFile Preview System
Sequence DiagramsequenceDiagram
participant Browser
participant Frontend as UI
participant PreviewAPI
participant Storage as WorkspaceStorage
participant Cache as Redis
participant Renderer as WebPRenderer
Browser->>UI: user opens artifacts page
UI->>PreviewAPI: GET /api/workspace/files/{id}/preview?w&bytes
PreviewAPI->>Cache: GET cache_key
alt cache miss
PreviewAPI->>Storage: retrieve or retrieve_partial(storage_path, max_bytes)
Storage-->>PreviewAPI: file bytes
PreviewAPI->>Renderer: render (Pillow / pypdfium2 / extract thumbnail)
Renderer-->>Cache: SET cache_key(base64 webp)
Renderer-->>PreviewAPI: webp bytes
else cache hit
Cache-->>PreviewAPI: base64 webp
end
PreviewAPI-->>UI: 200 image/webp
UI-->>Browser: render preview image
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🔍 PR Overlap DetectionThis check compares your PR against all other open PRs targeting the same branch to detect potential merge conflicts early. 🔴 Merge Conflicts DetectedThe following PRs have been tested and will have merge conflicts if merged after this PR. Consider coordinating with the authors.
🟢 Low Risk — File Overlap OnlyThese PRs touch the same files but different sections (click to expand)
Summary: 5 conflict(s), 0 medium risk, 2 low risk (out of 7 PRs with file overlap) Auto-generated on push. Ignores: |
| async def _office_thumbnail(file: WorkspaceFile, width: int) -> bytes: | ||
| async def render() -> bytes: | ||
| content = await _retrieve(file) | ||
| raw = await asyncio.to_thread(_extract_zip_thumbnail, content) |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Fixed in 06dbda4: wrapped the _extract_zip_thumbnail call so corrupt/mislabeled Office files now map to 415 instead of a 500. Added test_office_preview_corrupt_zip_returns_415.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## dev #13281 +/- ##
==========================================
+ Coverage 72.55% 72.64% +0.09%
==========================================
Files 2349 2359 +10
Lines 174823 175583 +760
Branches 17726 17804 +78
==========================================
+ Hits 126836 127553 +717
- Misses 44220 44239 +19
- Partials 3767 3791 +24
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@autogpt_platform/backend/backend/api/features/workspace/preview.py`:
- Around line 135-140: The route's render() calls _extract_zip_thumbnail outside
the existing error-mapping, so zipfile.BadZipFile from corrupt/mislabeled
DOCX/PPTX/XLSX bubbles up as a 500; wrap the call to _extract_zip_thumbnail in a
try/except that catches zipfile.BadZipFile (and optionally zipfile.LargeZipFile)
and raises fastapi.HTTPException(status_code=415, detail="No embedded thumbnail"
or a similar 415 message) so Office zip errors are normalized to 415; keep other
exceptions unhandled so they continue to surface, and ensure zipfile is imported
if not already.
In `@autogpt_platform/backend/backend/api/middleware/security.py`:
- Around line 28-29: The SecurityHeadersMiddleware uses CACHEABLE_PATHS to allow
caching only for explicit paths, but the existing tests omit the workspace
preview path; update backend/api/middleware/security_test.py to add the
"/api/workspace/files/*/preview" entry to the cacheable-endpoint test cases so
the middleware-level regression is covered, ensuring the test hits
SecurityHeadersMiddleware (not just the router-mounted preview_test.py) and
verifies that requests matching that wildcard pattern receive the cacheable
headers rather than the default no-store behavior; refer to
SecurityHeadersMiddleware and CACHEABLE_PATHS in the test to mirror the
production matching rules.
In `@autogpt_platform/backend/backend/util/workspace_storage.py`:
- Around line 326-334: The retrieve_partial method currently does a
check-then-open using file_path.exists() which creates a TOCTOU race; instead
open the path returned by _parse_storage_path(storage_path) directly with
aiofiles.open and handle FileNotFoundError from the open/read to raise the same
FileNotFoundError with the storage_path context. Update retrieve_partial to
remove the exists() check, wrap the async with aiofiles.open(...) as f in a
try/except catching FileNotFoundError (and re-raising FileNotFoundError(f"File
not found: {storage_path}")) so concurrent deletions are handled correctly.
In `@autogpt_platform/backend/pyproject.toml`:
- Around line 105-106: Update the Pillow dependency to require a minimum version
that includes PIL.Image.Resampling so preview.py's use of
Image.Resampling.LANCZOS is safe; change the toml entry for the pillow package
to a range such as >=9.1.0,<12 (or a newer upper bound you prefer) so runtime
imports of Resampling succeed and avoid older Pillow releases that lack that
enum.
In
`@autogpt_platform/frontend/src/app/`(platform)/artifacts/components/ArtifactsList/ArtifactCard/MediaPreview.tsx:
- Around line 44-50: The video element in MediaPreview uses onLoadedData while
preload="metadata" guarantees loadedmetadata; update the <video> handler to use
onLoadedMetadata={() => setIsLoaded(true)} (replace the onLoadedData usage) so
setIsLoaded is triggered correctly and the preview won't remain hidden; locate
the Video/MediaPreview component and the setIsLoaded call to make this change.
In
`@autogpt_platform/frontend/src/app/`(platform)/artifacts/components/ArtifactsList/ArtifactCard/PreviewParts.tsx:
- Around line 14-30: The preview effect in PreviewParts keeps previous text
until the new fetch resolves, causing stale previews; inside the useEffect for
url/onError, reset the cached preview state (call setText with an empty string
or undefined) before invoking async load(), then proceed with the existing fetch
logic (keeping cancelled handling and onError) so the component renders a
loading/empty state immediately when url changes; refer to the useEffect block,
the load() async function, setText, cancelled, url, and onError.
In
`@autogpt_platform/frontend/src/app/`(platform)/artifacts/components/ArtifactsList/parsers.ts:
- Around line 28-41: The parser unconditionally drops the last line and
naive-splits on delimiter which breaks quoted fields; update the logic around
lines, delimiter and cells so you only pop the last line when the fetched text
was actually truncated (e.g., detect truncation by checking if the original text
does NOT end with a newline/CR or by using a truncation flag provided by the
fetch), and replace the line.split(delimiter) call with a quote-aware CSV
tokenizer (either call an existing CSV parse helper or add a parseCsvLine(line,
delimiter) utility that handles quoted fields and escaped quotes) while still
enforcing maxRows and maxCols when building cells (respecting sniffDelimiter,
maxRows and maxCols).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b626b9aa-e426-44a3-98ed-84810d3793e6
⛔ Files ignored due to path filters (1)
autogpt_platform/backend/poetry.lockis excluded by!**/*.lock
📒 Files selected for processing (21)
autogpt_platform/backend/backend/api/features/workspace/preview.pyautogpt_platform/backend/backend/api/features/workspace/preview_test.pyautogpt_platform/backend/backend/api/features/workspace/routes.pyautogpt_platform/backend/backend/api/middleware/security.pyautogpt_platform/backend/backend/util/gcs_utils.pyautogpt_platform/backend/backend/util/workspace_storage.pyautogpt_platform/backend/pyproject.tomlautogpt_platform/frontend/src/app/(platform)/artifacts/__tests__/main.test.tsxautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/ArtifactCard/CardPreview.tsxautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/ArtifactCard/MediaPreview.tsxautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/ArtifactCard/PreviewParts.tsxautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/ArtifactCard/StructuredPreviews.tsxautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/ArtifactCard/TextPreviews.tsxautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/ArtifactsList.tsxautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/FileIllustration.test.tsxautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/FileIllustration.tsxautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/helpers.test.tsautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/helpers.tsautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/parsers.tsautogpt_platform/frontend/src/app/(platform)/artifacts/page.tsxautogpt_platform/frontend/src/components/layout/Navbar/Navbar.tsx
Resolves frontend lint (prettier --check) and backend 'poetry lock' CI checks. Lock regenerated with poetry 2.2.1 to match CI; only marker ordering changed, no dependency versions.
Adds unit tests for gcs_utils.download_range (range header, headers fallback, 404 mapping) and WorkspaceStorage.retrieve_partial (local + GCS). Lifts backend patch coverage for the preview feature to ~91%.
Backend: - Office thumbnail: map corrupt/mislabeled DOCX/PPTX/XLSX to 415 instead of a 500 (BadZipFile now wrapped) (CodeRabbit/Sentry). - retrieve_partial: drop check-then-open TOCTOU race; open directly and translate FileNotFoundError (CodeRabbit). - Pin pillow >=9.1.0 so Image.Resampling is guaranteed (CodeRabbit); lock regenerated with poetry 2.2.1. - Tests: corrupt-Office 415 case + preview path in middleware cacheable allow-list test. Frontend: - VideoPreview: use onLoadedMetadata (guaranteed with preload=metadata) so the preview is not left permanently hidden (CodeRabbit). - useFileText: clear cached text on URL change to avoid stale previews on reused cards (CodeRabbit). - parseCsv: quote-aware tokenizer (handles "a, b" and "" escapes) and only drop the trailing row when the fetch was truncated mid-line (CodeRabbit); add parsers unit tests.
| _ensure_size(file, PREVIEW_MAX_TEXT_BYTES) | ||
| content = await _text_preview(file, max_bytes) |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Fixed in 9f464c2: build_preview_response now wraps dispatch and maps FileNotFoundError (file row present but bytes gone from storage) to a 404 across all preview kinds. Added test_preview_404_when_bytes_missing_from_storage.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@autogpt_platform/backend/backend/api/features/workspace/preview_test.py`:
- Around line 166-174: The test test_office_preview_corrupt_zip_returns_415
currently only checks the status code; update it to also assert the error detail
to ensure the corrupt-zip branch is hit. After calling response =
client.get("/files/file-001/preview") assert something like
response.json()["detail"] (or response.get_json()["detail"]) contains a
distinguishing phrase from the corrupt-archive error (e.g., "not a zip",
"invalid zip", or the exact message used by the preview handler). Target the
response object in this test function so the assertion ties the 415 to the
corrupt-zip path rather than any other 415 cause.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0a6e749c-86d1-4d85-82bc-4d1e6faa961a
⛔ Files ignored due to path filters (1)
autogpt_platform/backend/poetry.lockis excluded by!**/*.lock
📒 Files selected for processing (9)
autogpt_platform/backend/backend/api/features/workspace/preview.pyautogpt_platform/backend/backend/api/features/workspace/preview_test.pyautogpt_platform/backend/backend/api/middleware/security_test.pyautogpt_platform/backend/backend/util/workspace_storage.pyautogpt_platform/backend/pyproject.tomlautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/ArtifactCard/MediaPreview.tsxautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/ArtifactCard/PreviewParts.tsxautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/parsers.test.tsautogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/parsers.ts
✅ Files skipped from review due to trivial changes (1)
- autogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/parsers.test.ts
🚧 Files skipped from review as they are similar to previous changes (6)
- autogpt_platform/backend/pyproject.toml
- autogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/ArtifactCard/MediaPreview.tsx
- autogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/ArtifactCard/PreviewParts.tsx
- autogpt_platform/backend/backend/util/workspace_storage.py
- autogpt_platform/frontend/src/app/(platform)/artifacts/components/ArtifactsList/parsers.ts
- autogpt_platform/backend/backend/api/features/workspace/preview.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: check API types
- GitHub Check: integration_test
- GitHub Check: lint
- GitHub Check: end-to-end tests
- GitHub Check: type-check (3.11)
- GitHub Check: type-check (3.13)
- GitHub Check: type-check (3.12)
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
- GitHub Check: test (3.13)
- GitHub Check: lint
- GitHub Check: Seer Code Review
- GitHub Check: Check PR Status
- GitHub Check: Analyze (typescript)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
autogpt_platform/backend/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development
autogpt_platform/backend/**/*.py: Usepoetry run ...command for executing Python package dependencies
Use top-level imports only — avoid local/inner imports except for lazy imports of heavy optional dependencies likeopenpyxl
Use absolute imports withfrom backend.module import ...for cross-package imports; single-dot relative imports are acceptable for sibling modules within the same package; avoid double-dot relative imports
Do not use duck typing — avoidhasattr/getattr/isinstancefor type dispatch; use typed interfaces/unions/protocols instead
Use Pydantic models over dataclass/namedtuple/dict for structured data
Do not use linter suppressors — no# type: ignore,# noqa,# pyright: ignore; fix the type/code instead
Prefer list comprehensions over manual loop-and-append patterns
Use early return with guard clauses first to avoid deep nesting
Use%sfor deferred interpolation indebuglog statements for efficiency; use f-strings elsewhere for readability (e.g.,logger.debug("Processing %s items", count)vslogger.info(f"Processing {count} items"))
Sanitize error paths by usingos.path.basename()in error messages to avoid leaking directory structure
Be aware of TOCTOU (Time-Of-Check-Time-Of-Use) issues — avoid check-then-act patterns for file access and credit charging
Usetransaction=Truefor Redis pipelines to ensure atomicity on multi-step operations
Usemax(0, value)guards for computed values that should never be negative
Keep files under ~300 lines; if a file grows beyond this, split by responsibility (extract helpers, models, or a sub-module into a new file)
Keep functions under ~40 lines; extract named helpers when a function grows longer
...
Files:
autogpt_platform/backend/backend/api/middleware/security_test.pyautogpt_platform/backend/backend/api/features/workspace/preview_test.py
autogpt_platform/{backend,autogpt_libs}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Format Python code with
poetry run format
Files:
autogpt_platform/backend/backend/api/middleware/security_test.pyautogpt_platform/backend/backend/api/features/workspace/preview_test.py
autogpt_platform/backend/**/api/**/*.py
📄 CodeRabbit inference engine (autogpt_platform/backend/AGENTS.md)
autogpt_platform/backend/**/api/**/*.py: UseSecurity()instead ofDepends()for authentication dependencies to get proper OpenAPI security specification
Follow SSE (Server-Sent Events) protocol: usedata:lines for frontend-parsed events (must match Zod schema) and: commentlines for heartbeats/status
Files:
autogpt_platform/backend/backend/api/middleware/security_test.pyautogpt_platform/backend/backend/api/features/workspace/preview_test.py
autogpt_platform/backend/**/*_test.py
📄 CodeRabbit inference engine (autogpt_platform/backend/AGENTS.md)
autogpt_platform/backend/**/*_test.py: Use pytest with snapshot testing for API responses
Colocate test files with source files using*_test.pynaming convention
Mock at boundaries — mock where the symbol is used, not where it's defined; after refactoring, update mock targets to match new module paths
UseAsyncMockfromunittest.mockfor async functions in tests
When writing tests, use Test-Driven Development (TDD): write failing tests marked with@pytest.mark.xfailbefore implementation, then remove the marker once the implementation is complete
When creating snapshots in tests, usepoetry run pytest path/to/test.py --snapshot-update; always review snapshot changes withgit diffbefore committing
Files:
autogpt_platform/backend/backend/api/middleware/security_test.pyautogpt_platform/backend/backend/api/features/workspace/preview_test.py
autogpt_platform/backend/**/api/middleware/**/*.py
📄 CodeRabbit inference engine (autogpt_platform/backend/AGENTS.md)
The cache protection middleware in
backend/api/middleware/security.pydisables caching for ALL endpoints by default usingCache-Control: no-store, no-cache, must-revalidate, private; only explicitly permitted paths inCACHEABLE_PATHScan be cached
Files:
autogpt_platform/backend/backend/api/middleware/security_test.py
autogpt_platform/backend/backend/api/features/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Update routes in '/backend/backend/api/features/' and add/update Pydantic models in the same directory for API development
Files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
🔇 Additional comments (1)
autogpt_platform/backend/backend/api/middleware/security_test.py (1)
121-122: LGTM!Also applies to: 129-130
A file row can exist while its bytes are gone from storage; FileNotFoundError now maps to 404 (was an unhandled 500) across all preview kinds (Sentry). Tightens the corrupt-zip test to assert the 415 detail and adds a missing-storage 404 test (CodeRabbit).
| preload="metadata" | ||
| muted | ||
| playsInline | ||
| onLoadedMetadata={() => setIsLoaded(true)} |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Fixed in c871be4: now reveal on either onLoadedData (first painted frame → no blank box, your point) or onLoadedMetadata (fallback so the preview is never left permanently hidden when the browser only fetches metadata under preload="metadata", per the earlier CodeRabbit note). Handles both concerns.
Handles both review notes: loadeddata reveals on the first painted frame (no blank box, Sentry) and loadedmetadata is the fallback so the preview is never left permanently hidden under preload=metadata (CodeRabbit).
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
autogpt_platform/backend/backend/api/features/workspace/preview_test.py (2)
156-163:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAssert the 415 detail to pin this test to the missing-thumbnail branch.
As written, this passes for any 415 (e.g., a corrupt-ZIP failure), so it doesn't actually prove the
raw is Nonepath. Mirror the tightening already applied totest_office_preview_corrupt_zip_returns_415by asserting the distinguishing detail ("No embedded thumbnail"per_office_thumbnail).Suggested test tightening
response = client.get("/files/file-001/preview") assert response.status_code == 415 + assert response.json()["detail"] == "No embedded thumbnail"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@autogpt_platform/backend/backend/api/features/workspace/preview_test.py` around lines 156 - 163, The test test_office_preview_without_thumbnail_returns_415 is currently only asserting a 415 status and can be triggered by any 415 error; update it to also assert the error detail matches the missing-thumbnail branch by checking the response JSON detail equals (or contains) the specific message emitted by _office_thumbnail (e.g., "No embedded thumbnail"); locate the test function and add an assertion on response.json()["detail"] (or response.json().get("detail")) to pin the failure to the raw is None path.
10-10:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix
preview_test.pyimport style and strengthen the 415 assertion
- Move the fixture-local import
from autogpt_libs.auth.jwt_utils import get_jwt_payload(line 25) to a top-level import to match the “avoid local/inner imports” rule.- In
test_office_preview_without_thumbnail_returns_415(lines 156-163), assert the response body detail (e.g.,response.json()["detail"]) in addition tostatus_code, like the corrupt-zip test already does.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@autogpt_platform/backend/backend/api/features/workspace/preview_test.py` at line 10, Move the fixture-local import of get_jwt_payload to a top-level import (add "from autogpt_libs.auth.jwt_utils import get_jwt_payload" alongside the existing top-level imports such as FPDF) to comply with the no-local-import rule, and in the test_office_preview_without_thumbnail_returns_415 test add an assertion on the response body detail (e.g., assert response.status_code == 415 and "detail" in response.json() and expected substring or exact message == response.json()["detail"]) similar to the corrupt-zip test to validate the error message content; update only the import location and enhance the assertion in test_office_preview_without_thumbnail_returns_415 (leave other behavior unchanged).
🧹 Nitpick comments (1)
autogpt_platform/backend/backend/api/features/workspace/preview_test.py (1)
23-29: 💤 Low valuePrefer a top-level import for
get_jwt_payload.
get_jwt_payloadis a lightweight symbol; the inner import in the fixture isn't a lazy heavy-dependency case. Move it to module scope unless there's an import-time side effect that requires deferral.As per coding guidelines: "Use top-level imports only — avoid local/inner imports except for lazy imports of heavy optional dependencies like
openpyxl".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@autogpt_platform/backend/backend/api/features/workspace/preview_test.py` around lines 23 - 29, Move the inner import of get_jwt_payload out of the setup_app_auth fixture to module scope: add a top-level import for get_jwt_payload from autogpt_libs.auth.jwt_utils at the top of the test module, then update the setup_app_auth fixture to reference get_jwt_payload directly (it currently sets app.dependency_overrides[get_jwt_payload] = mock_jwt_user["get_jwt_payload"]); keep the yield and app.dependency_overrides.clear() behavior unchanged and ensure no import-time side effects exist before making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@autogpt_platform/backend/backend/api/features/workspace/preview_test.py`:
- Around line 156-163: The test
test_office_preview_without_thumbnail_returns_415 is currently only asserting a
415 status and can be triggered by any 415 error; update it to also assert the
error detail matches the missing-thumbnail branch by checking the response JSON
detail equals (or contains) the specific message emitted by _office_thumbnail
(e.g., "No embedded thumbnail"); locate the test function and add an assertion
on response.json()["detail"] (or response.json().get("detail")) to pin the
failure to the raw is None path.
- Line 10: Move the fixture-local import of get_jwt_payload to a top-level
import (add "from autogpt_libs.auth.jwt_utils import get_jwt_payload" alongside
the existing top-level imports such as FPDF) to comply with the no-local-import
rule, and in the test_office_preview_without_thumbnail_returns_415 test add an
assertion on the response body detail (e.g., assert response.status_code == 415
and "detail" in response.json() and expected substring or exact message ==
response.json()["detail"]) similar to the corrupt-zip test to validate the error
message content; update only the import location and enhance the assertion in
test_office_preview_without_thumbnail_returns_415 (leave other behavior
unchanged).
---
Nitpick comments:
In `@autogpt_platform/backend/backend/api/features/workspace/preview_test.py`:
- Around line 23-29: Move the inner import of get_jwt_payload out of the
setup_app_auth fixture to module scope: add a top-level import for
get_jwt_payload from autogpt_libs.auth.jwt_utils at the top of the test module,
then update the setup_app_auth fixture to reference get_jwt_payload directly (it
currently sets app.dependency_overrides[get_jwt_payload] =
mock_jwt_user["get_jwt_payload"]); keep the yield and
app.dependency_overrides.clear() behavior unchanged and ensure no import-time
side effects exist before making this change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c9baa972-7cbc-4812-ae6d-1b863830a5e5
📒 Files selected for processing (2)
autogpt_platform/backend/backend/api/features/workspace/preview.pyautogpt_platform/backend/backend/api/features/workspace/preview_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
- autogpt_platform/backend/backend/api/features/workspace/preview.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: integration_test
- GitHub Check: check API types
- GitHub Check: Seer Code Review
- GitHub Check: end-to-end tests
- GitHub Check: test (3.13)
- GitHub Check: Check PR Status
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
autogpt_platform/backend/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development
autogpt_platform/backend/**/*.py: Usepoetry run ...command for executing Python package dependencies
Use top-level imports only — avoid local/inner imports except for lazy imports of heavy optional dependencies likeopenpyxl
Use absolute imports withfrom backend.module import ...for cross-package imports; single-dot relative imports are acceptable for sibling modules within the same package; avoid double-dot relative imports
Do not use duck typing — avoidhasattr/getattr/isinstancefor type dispatch; use typed interfaces/unions/protocols instead
Use Pydantic models over dataclass/namedtuple/dict for structured data
Do not use linter suppressors — no# type: ignore,# noqa,# pyright: ignore; fix the type/code instead
Prefer list comprehensions over manual loop-and-append patterns
Use early return with guard clauses first to avoid deep nesting
Use%sfor deferred interpolation indebuglog statements for efficiency; use f-strings elsewhere for readability (e.g.,logger.debug("Processing %s items", count)vslogger.info(f"Processing {count} items"))
Sanitize error paths by usingos.path.basename()in error messages to avoid leaking directory structure
Be aware of TOCTOU (Time-Of-Check-Time-Of-Use) issues — avoid check-then-act patterns for file access and credit charging
Usetransaction=Truefor Redis pipelines to ensure atomicity on multi-step operations
Usemax(0, value)guards for computed values that should never be negative
Keep files under ~300 lines; if a file grows beyond this, split by responsibility (extract helpers, models, or a sub-module into a new file)
Keep functions under ~40 lines; extract named helpers when a function grows longer
...
Files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
autogpt_platform/backend/backend/api/features/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Update routes in '/backend/backend/api/features/' and add/update Pydantic models in the same directory for API development
Files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
autogpt_platform/{backend,autogpt_libs}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Format Python code with
poetry run format
Files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
autogpt_platform/backend/**/api/**/*.py
📄 CodeRabbit inference engine (autogpt_platform/backend/AGENTS.md)
autogpt_platform/backend/**/api/**/*.py: UseSecurity()instead ofDepends()for authentication dependencies to get proper OpenAPI security specification
Follow SSE (Server-Sent Events) protocol: usedata:lines for frontend-parsed events (must match Zod schema) and: commentlines for heartbeats/status
Files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
autogpt_platform/backend/**/*_test.py
📄 CodeRabbit inference engine (autogpt_platform/backend/AGENTS.md)
autogpt_platform/backend/**/*_test.py: Use pytest with snapshot testing for API responses
Colocate test files with source files using*_test.pynaming convention
Mock at boundaries — mock where the symbol is used, not where it's defined; after refactoring, update mock targets to match new module paths
UseAsyncMockfromunittest.mockfor async functions in tests
When writing tests, use Test-Driven Development (TDD): write failing tests marked with@pytest.mark.xfailbefore implementation, then remove the marker once the implementation is complete
When creating snapshots in tests, usepoetry run pytest path/to/test.py --snapshot-update; always review snapshot changes withgit diffbefore committing
Files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
🧠 Learnings (16)
📚 Learning: 2026-04-08T17:28:23.439Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/AGENTS.md:0-0
Timestamp: 2026-04-08T17:28:23.439Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : When creating snapshots in tests, use `poetry run pytest path/to/test.py --snapshot-update`; always review snapshot changes with `git diff` before committing
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-04-08T17:28:23.439Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/AGENTS.md:0-0
Timestamp: 2026-04-08T17:28:23.439Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Use pytest with snapshot testing for API responses
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-04-08T17:28:23.439Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/AGENTS.md:0-0
Timestamp: 2026-04-08T17:28:23.439Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Colocate test files with source files using `*_test.py` naming convention
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-04-08T17:28:23.439Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/AGENTS.md:0-0
Timestamp: 2026-04-08T17:28:23.439Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Mock at boundaries — mock where the symbol is **used**, not where it's **defined**; after refactoring, update mock targets to match new module paths
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-02-04T16:49:42.490Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-02-04T16:49:42.490Z
Learning: Applies to autogpt_platform/backend/**/test/**/*.py : Use snapshot testing with '--snapshot-update' flag in backend tests when output changes; always review with 'git diff'
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-02-04T16:49:42.490Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-02-04T16:49:42.490Z
Learning: Applies to autogpt_platform/backend/backend/api/features/**/*.py : Update routes in '/backend/backend/api/features/' and add/update Pydantic models in the same directory for API development
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-03-05T15:42:08.207Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-03-16T16:35:40.236Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/api/features/workflow_import.py:54-63
Timestamp: 2026-03-16T16:35:40.236Z
Learning: Avoid using the word 'competitor' in public-facing identifiers and text. Use neutral naming for API paths, model names, function names, and UI text. Examples: rename 'CompetitorFormat' to 'SourcePlatform', 'convert_competitor_workflow' to 'convert_workflow', '/competitor-workflow' to '/workflow'. Apply this guideline to files under autogpt_platform/backend and autogpt_platform/frontend.
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-03-31T15:37:38.626Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12623
File: autogpt_platform/backend/backend/copilot/tools/agent_generator/fixer.py:37-47
Timestamp: 2026-03-31T15:37:38.626Z
Learning: When validating/constructing Anthropic API model IDs in Significant-Gravitas/AutoGPT, allow the hyphen-separated Claude Opus 4.6 model ID `claude-opus-4-6` (it corresponds to `LlmModel.CLAUDE_4_6_OPUS` in `autogpt_platform/backend/backend/blocks/llm.py`). Do NOT require the dot-separated form in Anthropic contexts. Only OpenRouter routing variants should use the dot separator (e.g., `anthropic/claude-opus-4.6`); `claude-opus-4-6` should be treated as correct when passed to Anthropic, and flagged only if it’s used in the OpenRouter path where the dot form is expected.
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-04-15T02:43:36.890Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12780
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:0-0
Timestamp: 2026-04-15T02:43:36.890Z
Learning: When reviewing Python exception handlers, do not flag `isinstance(e, X)` checks as dead/unreachable if the caught exception `X` is a subclass of the exception type being handled. For example, if `X` (e.g., `VirusScanError`) inherits from `ValueError` (directly or via an intermediate class) and it can be raised within an `except ValueError:` block, then `isinstance(e, X)` inside that handler is reachable and should not be treated as dead code.
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-05-23T05:29:43.085Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 13200
File: autogpt_platform/backend/backend/executor/scheduler.py:590-593
Timestamp: 2026-05-23T05:29:43.085Z
Learning: When reviewing Python code that uses Pydantic discriminated/tagged unions (e.g., `Annotated[Union[...], Field(discriminator="kind")]`), recognize that using `isinstance(x, SomeVariantInfo)` to narrow the union is an intentional and correct runtime guard and should also enable static type narrowing in tools like Pyright. Do not recommend replacing such `isinstance`-based narrowing with `cast(...)` when the check already proves the variant at runtime.
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-04-22T11:46:04.431Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12881
File: autogpt_platform/backend/backend/copilot/config.py:0-0
Timestamp: 2026-04-22T11:46:04.431Z
Learning: Do not flag the Claude Sonnet 4.6 model ID as incorrect when it uses the project’s established hyphenated convention: `anthropic/claude-sonnet-4-6`. This hyphen form is the intentional, production convention and should be treated as valid (including in files like llm.py, blocks tests, reasoning.py, `_is_anthropic_model` tests, and config defaults). Note that OpenRouter also accepts the dot variant `anthropic/claude-sonnet-4.6`, so either form may be tolerated, but `anthropic/claude-sonnet-4-6` should be considered the standard to match project usage.
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-04-22T11:46:12.892Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12881
File: autogpt_platform/backend/backend/copilot/baseline/service.py:322-332
Timestamp: 2026-04-22T11:46:12.892Z
Learning: In this codebase (Significant-Gravitas/AutoGPT), OpenRouter-routed Anthropic model IDs should use the hyphen-separated convention (e.g., `anthropic/claude-sonnet-4-6`, `anthropic/claude-opus-4-6`). Although OpenRouter may accept both hyphen and dot variants, treat the hyphen-separated form as the intended, correct codebase-wide convention and do not flag it as an error. Only flag the dot-separated variant (e.g., `anthropic/claude-sonnet-4.6`) as incorrect when reviewing/validating model ID strings for OpenRouter-routed Anthropic models.
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-05-07T18:48:14.242Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 13040
File: autogpt_platform/backend/backend/blocks/llm.py:0-0
Timestamp: 2026-05-07T18:48:14.242Z
Learning: In this repository, isort may split imports from the same module into separate blocks when some imports are aliased (e.g., `from module import X as Y`) and others are not. Preserve the two-block layout when it results from isort (such as keeping `from openai.types.chat import ChatCompletion as OpenAIChatCompletion` separate from non-aliased imports from `openai.types.chat`). Do not treat that split as a style issue during review; merging them into a single block can fail CI with `Imports are incorrectly sorted and/or formatted`.
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
📚 Learning: 2026-05-26T14:24:34.866Z
Learnt from: Abhi1992002
Repo: Significant-Gravitas/AutoGPT PR: 13217
File: autogpt_platform/backend/backend/api/features/search/service.py:137-137
Timestamp: 2026-05-26T14:24:34.866Z
Learning: In the Significant-Gravitas/AutoGPT backend, treat `user_id` (an opaque UUID used only for correlation/tracing) as non-PII. Do not flag direct logging of `user_id` in `logger.warning`/`logger.info` statements as a PII exposure issue, as the established convention is to log `user_id` for tracing while reserving PII for fields like email or display name.
Applied to files:
autogpt_platform/backend/backend/api/features/workspace/preview_test.py
Why / What / How
Why: The Files/Artifacts page (behind
ARTIFACTS_PAGE) previewed every file by downloading the whole file through/api/workspace/files/{id}/download— a 10 MB image or 200 KB log was fully transferred just to paint a tiny card, and non-image types (PDF, Office, CSV, ICS/VCF) had no real preview at all. On a grid of N files that's N full downloads over the expensive backend→browser leg.What: A dedicated, cheaper preview endpoint plus richer per-type client rendering, scoped to the popular/cheap-to-render types. Large files skip the request and show an illustration.
How: New
GET /api/workspace/files/{id}/previewreads the file on the cheap storage→backend leg and returns something small on the expensive leg:?w=)pypdfium2(Apache/BSD, not AGPL) → WebPdocProps/thumbnailextracted via stdlibzipfile(no render; 415 when absent → client shows illustration)?bytes=) via a newretrieve_partial()on the storage backends (true partial read locally; GCSRangerequest with full-read fallback)Screen.Recording.2026-06-03.at.9.44.35.AM.mov
Thumbnails are cached in Redis keyed by file checksum; the endpoint sets
Cache-Control: private, max-age=86400and is added to the cache middleware allow-list so the browser caches too. CPU work (Pillow/pypdfium2/zip) runs inasyncio.to_thread. Per-kind size ceilings are enforced server-side (413) and mirrored on the frontend so oversized files never hit the endpoint.Frontend routes each kind to a small component: shared
<img>for image/pdf/office thumbnails, a CSV mini-table, JSON, an ICS event card (weekday → day → title → time), a VCF contact card (photo/initials, name, org, phone, email), and a typed illustration fallback (React-logo badge for.jsx/.tsx). The page heading and navbar entry are renamed Artifacts → Files (route/flag unchanged).Changes 🏗️
Backend
backend/api/features/workspace/preview.py— MIME dispatch, image/PDF/Office rendering, text partial read, Redis cache.GET /files/{id}/previewroute (w,bytesparams); reusesget_workspace+get_workspace_filefor ownership.workspace_storage.py—retrieve_partial()on the ABC + Local + GCS;gcs_utils.py—download_range().security.py— allow caching for/api/workspace/files/*/preview.pillow(<12, moviepy constraint),pypdfium2.preview_test.py(9 cases incl. 413 too-large, 415 unsupported, cache hit).Frontend
helpers.ts—getFilePreviewUrl(), extendedgetPreviewKind()with size caps mirroring the backend.parsers.ts(CSV/ICS/VCF) + preview sub-components (MediaPreview,TextPreviews,StructuredPreviews,PreviewParts);CardPreviewis now a thin router.FileIllustration.tsx— React-apptypeKey+ filename-awarepickFileTypeKey.page.tsx,Navbar.tsx, empty state).Checklist 📋
For code changes:
poetry run pytest backend/api/features/workspace/preview_test.py(9 passed) +poetry run lintcleanpnpm test:unitfor the artifacts suiteopenapi.json(pnpm generate:api) and confirm the API-types check passes/preview(small WebP / byte-capped) instead of/download; reload serves from cache; Download button still returns the full file