Best practices for file ingestion and state management in workflows #2941

dragospadurariu · 2025-12-17T19:08:56Z

Hi Team,

I am building a workflow with agent-framework that needs to ingest and analyze various file types (PDFs, images, DOCX, etc.). I'm looking for guidance on the best way to handle this in a graph-based workflow:

State Management: Is it best practice to load file content into the WorkflowContext / State object, or should nodes pass file paths and load data locally as needed?
Multi-modality: How should image data be "carried" through the workflow so downstream multimodal executors can access it?
Patterns: Is a single Ingestor node preferred or should I use a fan-out to specialized parsing nodes for different file types?

The current workflow samples focus on text inputs, so any advice on handling file-heavy state would be helpful. Thanks!

moonbox3 · 2026-01-12T08:25:30Z

Maintainer

Great questions, @dragospadurariu.

Here are the recommended patterns for file ingestion and state management in workflows:

State Management: References vs. Content

Best practice: Store content in shared state, pass lightweight references (IDs/paths) in messages.

Large payloads (file content, images, parsed data) should be stored in SharedState and accessed via keys. Messages between executors should carry only identifiers.

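import mimetypes
from uuid import uuid4

import aiofiles
from agent_framework import WorkflowContext, executor

@executor(id="ingest_file")
async def ingest_file(file_path: str, ctx: WorkflowContext[str]) -> None:
    # Load file content once
    async with aiofiles.open(file_path, "rb") as f:
        content = await f.read()

    # Guess the MIME type from the extension; fall back to a generic binary type
    mime_type = mimetypes.guess_type(file_path)[0] or "application/octet-stream"

    # Store in shared state under a unique key
    file_id = str(uuid4())
    await ctx.set_shared_state(f"file:{file_id}", {
        "path": file_path,
        "content": content,
        "mime_type": mime_type,
    })

    # Pass only the lightweight reference downstream
    await ctx.send_message(file_id)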

Downstream executors retrieve the stored entry with await ctx.get_shared_state(f"file:{file_id}").

See shared_states_with_agents.py for a complete example of this pattern.
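
As a framework-free illustration of the same idea, the sketch below stores a payload once and hands only its key to the next step. `InMemoryState`, `ingest`, and `summarize` are hypothetical names used for illustration; they are not part of agent-framework, and the real shared state may behave differently.

```python
import asyncio
from uuid import uuid4


class InMemoryState:
    """Hypothetical stand-in for shared state: an async key/value store."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}

    async def set(self, key: str, value: dict) -> None:
        self._data[key] = value

    async def get(self, key: str) -> dict:
        return self._data[key]


async def ingest(content: bytes, mime_type: str, state: InMemoryState) -> str:
    """Store the payload once and return only a lightweight reference."""
    file_id = str(uuid4())
    await state.set(f"file:{file_id}", {"content": content, "mime_type": mime_type})
    return file_id


async def summarize(file_id: str, state: InMemoryState) -> str:
    """Downstream step: resolve the reference only when the bytes are needed."""
    entry = await state.get(f"file:{file_id}")
    return f"{entry['mime_type']}: {len(entry['content'])} bytes"


async def main() -> str:
    state = InMemoryState()
    file_id = await ingest(b"%PDF-1.7 example", "application/pdf", state)
    return await summarize(file_id, state)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The message that travels between the two steps is a short string, so its size does not grow with the file.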

Multi-modality: Carrying Image Data

For multimodal workflows where downstream agents need image access:

Store raw bytes + metadata in shared state:

await ctx.set_shared_state(f"image:{image_id}", {
    "bytes": image_bytes,
    "mime_type": "image/png",
    "width": 1024,
    "height": 768,
})
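
If the dimensions are not known up front, they can be read from the image itself. The sketch below builds the same dictionary for a PNG by reading the width and height from the IHDR chunk using only the standard library. `png_metadata` is a hypothetical helper name, and other formats would need their own parsing or an imaging library.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_metadata(image_bytes: bytes) -> dict:
    """Build the shared-state entry for a PNG, reading its size from IHDR."""
    if (
        len(image_bytes) < 24
        or image_bytes[:8] != PNG_SIGNATURE
        or image_bytes[12:16] != b"IHDR"
    ):
        raise ValueError("not a PNG image")
    # IHDR data starts at byte 16: width and height as big-endian uint32.
    width, height = struct.unpack(">II", image_bytes[16:24])
    return {
        "bytes": image_bytes,
        "mime_type": "image/png",
        "width": width,
        "height": height,
    }
```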

When invoking a multimodal agent, reconstruct the content in the executor that calls the agent:

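from agent_framework import ChatMessage, DataContent, Role, TextContent, WorkflowContext, executor

@executor(id="analyze_image")
async def analyze_image(image_id: str, ctx: WorkflowContext[str]) -> None:
    image_data = await ctx.get_shared_state(f"image:{image_id}")

    # Build multimodal message for the agent.
    # DataContent carries inline bytes plus a media type; confirm the class
    # name against your installed agent-framework version.
    messages = [ChatMessage(
        role=Role.USER,
        contents=[
            TextContent(text="Describe this image"),
            DataContent(data=image_data["bytes"], media_type=image_data["mime_type"]),
        ],
    )]
    # ... invoke agent with messages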
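
Some model APIs expect an image as a base64 data URI rather than raw bytes. If that applies to the agent being called, the stored bytes and MIME type are enough to build one. This is a minimal sketch; `to_data_uri` is a hypothetical helper name, not an agent-framework function.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str) -> str:
    """Encode raw image bytes as an RFC 2397 data URI."""
    payload = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{payload}"
```

Building the URI inside the executor that calls the agent keeps the roughly one-third base64 size overhead out of shared state.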

Patterns: Single Ingestor vs. Specialized Parsers

Recommendation: Use a dispatcher pattern with fan-out to specialized parsers.

                          ┌─> PDFParser ─────┐
Input -> FileDispatcher ──┼─> ImageParser ───┼─> Aggregator -> ...
                          └─> DocxParser ────┘

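import mimetypes
from dataclasses import dataclass
from uuid import uuid4
from agent_framework import WorkflowContext, executor

@dataclass
class FileRef:
    id: str
    mime: str

@executor(id="file_dispatcher")
async def file_dispatcher(file_path: str, ctx: WorkflowContext[FileRef]) -> None:
    mime = mimetypes.guess_type(file_path)[0] or "application/octet-stream"
    file_id = str(uuid4())

    # Record the file's location and type; parsers load the bytes themselves
    await ctx.set_shared_state(f"file:{file_id}", {"path": file_path, "mime": mime})

    # Emit a typed reference; conditional edges route it by MIME type
    await ctx.send_message(FileRef(id=file_id, mime=mime))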

# Route by MIME type with one conditional edge per parser.
# FileDispatcher, PDFParser, and ImageParser are class-based executors defined elsewhere.
workflow = (
    WorkflowBuilder()
    .register_executor(lambda: FileDispatcher(id="dispatcher"), name="dispatcher")
    .register_executor(lambda: PDFParser(id="pdf_parser"), name="pdf_parser")
    .register_executor(lambda: ImageParser(id="image_parser"), name="image_parser")
    .add_edge("dispatcher", "pdf_parser", condition=lambda m: m.mime == "application/pdf")
    .add_edge("dispatcher", "image_parser", condition=lambda m: m.mime.startswith("image/"))
    .set_start_executor("dispatcher")
    # ...
    .build()
)

This approach:

Keeps each parser focused on one file type
Enables parallel processing via fan-out when handling multiple files
Scales by adding new parsers without modifying existing ones
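
The routing decision itself can be sketched and unit-tested without the framework. The example below keeps one predicate per parser, mirroring the conditional edges. `ROUTES` and `select_targets` are hypothetical names, and the `docx_parser` entry is an assumed addition based on the diagram.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FileRef:
    id: str
    mime: str


# (target name, predicate) pairs, mirroring one conditional edge per parser.
ROUTES: list[tuple[str, Callable[[FileRef], bool]]] = [
    ("pdf_parser", lambda ref: ref.mime == "application/pdf"),
    ("image_parser", lambda ref: ref.mime.startswith("image/")),
    (
        "docx_parser",
        lambda ref: ref.mime
        == "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ),
]


def select_targets(ref: FileRef) -> list[str]:
    """Return the parsers whose predicate accepts this file reference."""
    return [name for name, accepts in ROUTES if accepts(ref)]
```

Adding a parser means appending one entry to `ROUTES`. An unsupported type yields an empty list, so a real workflow would likely want a fallback route for that case.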

For large-scale file processing with parallel map/reduce, see map_reduce_and_visualization.py, which demonstrates passing file paths through shared state to bound memory usage.
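
Independently of that sample, one general way to bound memory when many files arrive at once is to hand workers only paths and cap how many files are read concurrently. The sketch below uses plain asyncio. `count_bytes`, `process_all`, and `max_in_flight` are illustrative names, not agent-framework APIs.

```python
import asyncio
from pathlib import Path


async def count_bytes(path: Path, limit: asyncio.Semaphore) -> tuple[str, int]:
    """Map step: read one file while holding a slot, return a small result."""
    async with limit:
        # The blocking read runs in a worker thread; only the length is kept.
        data = await asyncio.to_thread(path.read_bytes)
        return path.name, len(data)


async def process_all(paths: list[Path], max_in_flight: int = 4) -> dict[str, int]:
    """Fan out over paths with at most max_in_flight reads at once, then reduce."""
    limit = asyncio.Semaphore(max_in_flight)
    results = await asyncio.gather(*(count_bytes(p, limit) for p in paths))
    return dict(results)
```

Only the per-file summaries are collected, so peak memory depends on `max_in_flight` and the largest files rather than on the total number of files.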
