fix(api): "File validation failed" on Chatflow follow-up with custom file type + memory #35891
Merged
wylswz merged 6 commits on May 11, 2026
wylswz
reviewed
May 8, 2026
Contributor
Pull request overview
Fixes a chatflow regression where file inputs that were accepted on the first turn could fail validation on subsequent turns when conversation history is replayed (notably with CUSTOM/“Other file types” + memory enabled). The change updates backend file validation to treat CUSTOM as an extension-gated fallback bucket and prevents re-validating persisted/history files during prompt reconstruction.
Changes:
- Refactor `is_file_valid_with_config` to implement “bucket semantics” and normalize extension allowlists case-/dot-insensitively.
- Skip file re-validation on history replay paths in `TokenBufferMemory` and `BaseAgentRunner` by passing `config=None` into message-file rehydration.
- Add unit tests covering the new validation semantics and the “no re-validation on replay” contract.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| api/factories/file_factory/validation.py | Updates file validation logic (CUSTOM fallback bucket + normalized extension matching). |
| api/core/memory/token_buffer_memory.py | Prevents history replay from re-validating persisted files by rehydrating with config=None. |
| api/core/agent/base_agent_runner.py | Aligns agent history/user prompt reconstruction with “no re-validation on replay”. |
| api/tests/unit_tests/factories/test_file_validation.py | Adds unit tests for new validation/bucketing + extension normalization behavior. |
| api/tests/unit_tests/core/memory/test_token_buffer_memory.py | Adds a unit test asserting replay calls build_from_message_file(..., config=None). |
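The "replay calls `build_from_message_file(..., config=None)`" contract from the table above could be checked roughly as follows. This is a minimal sketch, not the PR's actual test: `replay_history` is a hypothetical stand-in for the replay path, and the builder is a `MagicMock` rather than the real factory function.

```python
from unittest.mock import MagicMock


def replay_history(message_files, build_from_message_file):
    """Hypothetical replay helper: rebuild files from history without re-validating."""
    return [
        build_from_message_file(message_file=mf, config=None)
        for mf in message_files
    ]


# The mock stands in for the real builder so we can inspect how it was called.
builder = MagicMock(return_value="file")
rebuilt = replay_history(["mf-1", "mf-2"], builder)

# Every call made on the replay path should carry config=None.
configs_seen = [call.kwargs["config"] for call in builder.call_args_list]
assert configs_seen == [None, None]
```

Asserting on the keyword argument, rather than on the validation outcome, keeps the test focused on the contract itself: replay must not hand a validation config to the builder.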
A Chatflow file uploaded into the CUSTOM type slot is coerced to its detected type by _resolve_file_type (PNG -> IMAGE), and MessageFile.type persists that resolved type. On history replay, build_from_message_file rebuilds mapping["type"] from MessageFile.type, so a file that passed round 1 (mapping["type"]=="custom") was rejected on round 2 (mapping["type"]=="image") even though the workflow config was unchanged.
- Refactor is_file_valid_with_config with bucket semantics: CUSTOM acts as a fallback bucket gated by allowed_file_extensions, compared case- and dot-insensitively. This also fixes a parallel mismatch where a user whitelist of [".PNG", "png", "JPG", ...] failed to match the upload-side ".png" (always lowercase with leading dot).
- Skip re-validation when rehydrating files from conversation history in TokenBufferMemory and BaseAgentRunner; history files were validated at upload time, mirroring build_file_from_stored_mapping.
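The failure described in this commit message can be reproduced with a simplified model of the pre-fix validator. The sketch below is an assumption built only from the description (type gate bypassed solely for literal `"custom"`, raw `in` check on extensions); `legacy_is_valid` is a hypothetical name and not the real function.

```python
def legacy_is_valid(input_type, extension, allowed_types, allowed_extensions):
    """Simplified model of the pre-fix behavior (hypothetical, for illustration)."""
    # Type gate: only a literal "custom" input type bypasses it.
    if input_type != "custom" and input_type not in allowed_types:
        return False
    # Extension gate: raw membership test, no case or dot normalization.
    if input_type == "custom" and allowed_extensions is not None:
        return extension in allowed_extensions
    return True


# Round 1: the upload arrives with type "custom" and passes.
round_1 = legacy_is_valid("custom", ".png", ["custom"], [".png"])
# Round 2: replay rebuilds the type from the persisted value "image" and fails.
round_2 = legacy_is_valid("image", ".png", ["custom"], [".png"])
# Parallel mismatch: a user-typed whitelist never matches the upload-side ".png".
mixed_case = legacy_is_valid("custom", ".png", ["custom"], [".PNG", "png", "JPG"])

print(round_1, round_2, mixed_case)  # True False False
```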
Follow-up to the prior fix. The bucket-semantics rewrite changed the extension-whitelist guard from `is not None` to truthiness, which silently widened behavior for the empty-list case (UI never submits it, but DSL / API paths could). Restore the original deny-on-empty posture: when a file falls into the CUSTOM bucket, an explicitly set whitelist (including []) is authoritative.
Also tightens _normalize_extension so whitespace-only input returns "" consistent with empty input, and locks two contracts with tests:
- empty whitelist + CUSTOM bucket rejects (regression guard for the silent widening)
- TokenBufferMemory passes config=None to build_from_message_file (regression guard for the replay-skips-validation contract)
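The difference between the two guards is easy to miss, so here is a small illustration. Both function names are hypothetical, and the "no whitelist means allow" branch for `None` is an assumption inferred from the wording "explicitly set whitelist".

```python
def allows_with_truthiness_guard(ext, whitelist):
    # [] is falsy, so the extension check is skipped and the file is accepted.
    if whitelist:
        return ext in whitelist
    return True


def allows_with_explicit_guard(ext, whitelist):
    # Any explicitly set whitelist, including [], is authoritative.
    if whitelist is not None:
        return ext in whitelist
    return True


widened = allows_with_truthiness_guard("png", [])   # True: silently widened
denied = allows_with_explicit_guard("png", [])      # False: deny-on-empty
unset = allows_with_explicit_guard("png", None)     # True: no whitelist configured
```

The two guards agree for `None` and for non-empty lists; they differ only on `[]`, which is why the regression was silent.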
A whitelist with an empty / whitespace entry (e.g. a stray comma in DSL) combined with an extensionless file would spuriously match — both sides normalize to "" and pass. Filter empty normalized whitelist entries and short-circuit when the input extension itself normalizes to empty, so invalid whitelist entries can't widen the allowlist. Reported by Copilot on PR review.
The walrus filter was redundant given the early return on empty input: empty whitelist entries normalize to "" and can never match a non-empty input extension, and empty input is already rejected upfront.
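The reasoning in the two commit messages above can be sketched as follows. `normalize_extension` and `extension_allowed` are illustrative names (the PR's helper is `_normalize_extension`, whose exact body is not shown here), and the normalization steps are an assumption based on the "case- and dot-insensitive" description.

```python
from typing import Iterable, Optional


def normalize_extension(ext: Optional[str]) -> str:
    """Lowercase, trim whitespace, and drop leading dots; empty-ish input gives ""."""
    if not ext:
        return ""
    return ext.strip().lstrip(".").lower()


def extension_allowed(ext: Optional[str], whitelist: Iterable[str]) -> bool:
    wanted = normalize_extension(ext)
    # Early return: an extensionless file can never match, so a stray ""
    # entry in the whitelist cannot widen the allowlist.
    if not wanted:
        return False
    # No extra filtering of empty entries is needed: they normalize to ""
    # and "" can never equal the non-empty `wanted`.
    return wanted in {normalize_extension(entry) for entry in whitelist}


print(extension_allowed(".png", [".PNG", "png", "JPG"]))  # True
print(extension_allowed("", ["png", "", "  "]))           # False
print(extension_allowed(".pdf", ["png", ""]))             # False
```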
Both helpers in factories/file_factory/message_files.py are only invoked from replay paths that intentionally skip re-validation, so the config argument was always None. Remove it from the signatures and update the two call sites; module docstring records the design intent.
wylswz
approved these changes
May 11, 2026
Summary
A Chatflow whose LLM node has memory enabled and File Upload set to `Other file types` (CUSTOM) fails on the second turn with `File validation failed for file: <name>`, even when no new file is uploaded.

Root cause. A file uploaded into the CUSTOM slot is coerced to its detected type by `_resolve_file_type` (PNG → `IMAGE`), and `MessageFile.type` persists the resolved type. On history replay, `build_from_message_file` rebuilds `mapping["type"]` from `MessageFile.type`, so a file that passed round 1 (`mapping["type"]=="custom"`) is rejected on round 2 (`mapping["type"]=="image"`). The validator only bypassed the type gate for literal `"custom"`, not for "config has CUSTOM as a fallback bucket".

A parallel mismatch was reachable on round 1: the extension whitelist used a raw `in` check, so a user-typed list like `[".PNG", "png", "JPG", ...]` failed to match the upload-side `".png"` (always lowercased with leading dot).

Fix.
- Refactor `is_file_valid_with_config` to bucket semantics. `CUSTOM` is a fallback bucket gated by `allowed_file_extensions`, compared case- and dot-insensitively. Empty whitelist while in the CUSTOM bucket continues to deny (defensive against DSL/API paths that bypass the UI).
- Skip re-validation when rehydrating files from conversation history in `TokenBufferMemory` and `BaseAgentRunner`, mirroring the `build_file_from_stored_mapping` pattern. Validation belongs at upload time, not on replay.

Validator behavior matrix
| `allowed_file_types` | `input_file_type` | `allowed_file_extensions` | `file_extension` |
|---|---|---|---|
| `[CUSTOM]` | `custom` (round 1) | `[".png"]` | `.png` |
| `[CUSTOM]` | `image` (replay) | `[".png"]` | `.png` |
| `[CUSTOM]` | `custom` | `[".PNG", "png", "JPG"]` | `.png` |
| `[IMAGE, CUSTOM]` | `document` | `[".pdf"]` | `.pdf` |
| `[CUSTOM]` | `custom` | `[]` | `.png` |
| `[IMAGE]` | `video` | | `.mp4` |
| … `TOOL_FILE`) | | | |

Bold rows are the user-visible fixes. The empty-whitelist row preserves the original deny-on-empty posture for paths that bypass the UI.
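The bucket semantics described above can be sketched as a small standalone function and run against the matrix rows. This is a hedged model, not the code in `api/factories/file_factory/validation.py`: `is_file_valid_sketch` is a hypothetical name, types are plain strings rather than enums, the expected outcomes are read off the prose, and treating `allowed_file_extensions=None` as "no whitelist configured" is an assumption.

```python
CUSTOM = "custom"


def _norm(ext):
    # Case- and dot-insensitive comparison key.
    return ext.strip().lstrip(".").lower()


def is_file_valid_sketch(input_type, extension, allowed_types, allowed_extensions):
    # A type that matches a named (non-CUSTOM) bucket is accepted directly.
    if input_type != CUSTOM and input_type in allowed_types:
        return True
    # Everything else can only be accepted through the CUSTOM fallback bucket.
    if CUSTOM not in allowed_types:
        return False
    # Assumption: None means no whitelist was set; any list, even [], is authoritative.
    if allowed_extensions is None:
        return True
    wanted = _norm(extension)
    if not wanted:
        return False
    return wanted in {_norm(entry) for entry in allowed_extensions}


# (allowed_types, input_type, allowed_extensions, extension, expected)
MATRIX = [
    (["custom"], "custom", [".png"], ".png", True),                 # round 1
    (["custom"], "image", [".png"], ".png", True),                  # replay
    (["custom"], "custom", [".PNG", "png", "JPG"], ".png", True),   # mixed-case whitelist
    (["image", "custom"], "document", [".pdf"], ".pdf", True),      # fallback bucket
    (["custom"], "custom", [], ".png", False),                      # deny-on-empty
    (["image"], "video", None, ".mp4", False),  # whitelist cell not recoverable; None assumed
]

results = [
    is_file_valid_sketch(in_type, ext, types, exts)
    for types, in_type, exts, ext, _ in MATRIX
]
assert results == [expected for *_, expected in MATRIX]
```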
Screenshots
N/A — backend-only change.
Checklist
`make lint && make type-check` (backend) and `cd web && pnpm exec vp staged` (frontend) to appease the lint gods