
Graph extractor processes chunks serially within each document, making large PDFs very slow #1867

@earayu


Problem

During the Singapore enterprise ApeRAG import on 2026-04-29, vector and fulltext indexing converged for 169 documents, but graph index creation was much slower. At runtime, graph work ran concurrently at the document level only: with 2 API pods and graph worker concurrency=4 per pod, about 7 graph documents were RUNNING at a time. The ACTIVE count still rose very slowly, because each large document processes its chunks sequentially.

Code path:

  • aperag/indexing/orchestrator.py: run_graph_worker = _entrypoint(Modality.GRAPH, concurrency=4) controls document-level worker concurrency.
  • aperag/indexing/graph_extractor.py: _extractor(chunks) iterates with for chunk in chunks and awaits _extract_one_chunk(...) for each chunk before moving to the next one.
  • Default per-chunk timeout is 60s, so large PDFs with many chunks can occupy one graph worker for a long time.
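
To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes every chunk takes the full 60s timeout and ignores retries and scheduling overhead. The function name and the chunk counts are hypothetical and are not taken from the ApeRAG codebase.

```python
import math


def worst_case_occupancy_seconds(
    n_chunks: int,
    timeout_s: float = 60.0,
    chunk_concurrency: int = 1,
) -> float:
    """Rough upper bound on how long one document holds a graph worker.

    Assumes each chunk takes at most `timeout_s` and that at most
    `chunk_concurrency` chunks are in flight at once (1 = today's serial loop).
    """
    if n_chunks < 0 or chunk_concurrency < 1:
        raise ValueError("n_chunks must be >= 0 and chunk_concurrency >= 1")
    # Chunks are processed in "waves" of at most `chunk_concurrency` each.
    waves = math.ceil(n_chunks / chunk_concurrency)
    return waves * timeout_s
```

Under these assumptions, a hypothetical 500-chunk PDF could occupy one worker for up to 30000 seconds (about 8.3 hours) with the serial loop, versus up to 7500 seconds with four chunks in flight.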

Impact

  • CPU can stay low while graph completion is slow because work is mostly remote LLM I/O.
  • Scaling API pods increases document-level graph worker count, but does not reduce latency for one large document.
  • In deployments where API and workers are in the same process, long graph jobs also increase API tail latency/readiness risk.

Suggested fix

Add bounded per-document chunk-level concurrency for graph extraction, configurable in collection.config.knowledge_graph_config or deployment config. For example:

  • A chunk_concurrency setting with a small default (e.g. 4) and an enforced upper bound.
  • Use an asyncio.Semaphore around per-chunk LLM calls and asyncio.gather(..., return_exceptions=True) while preserving per-chunk failure isolation.
  • Keep global/document-level graph worker concurrency separate from chunk-level concurrency to avoid overloading the model provider.
  • Add tests that prove multiple chunks can be in flight and that one failed chunk does not fail the whole document.
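
A minimal sketch of what this could look like, assuming the per-chunk call can be passed in as an async callable. The names extract_chunks_bounded, extract_one, and MAX_CHUNK_CONCURRENCY are hypothetical, as is the upper bound of 16; the real _extract_one_chunk signature may differ.

```python
import asyncio
from typing import Any, Awaitable, Callable, Sequence

# Hypothetical hard cap so a misconfigured collection cannot flood the provider.
MAX_CHUNK_CONCURRENCY = 16


async def extract_chunks_bounded(
    chunks: Sequence[Any],
    extract_one: Callable[[Any], Awaitable[Any]],
    chunk_concurrency: int = 4,
    timeout_s: float = 60.0,
) -> list[Any]:
    """Run extract_one over all chunks with at most `chunk_concurrency` in flight.

    Returns one entry per chunk, in input order. A failed or timed-out chunk
    yields its exception object instead of aborting the whole document.
    """
    # Clamp the configured value into [1, MAX_CHUNK_CONCURRENCY].
    limit = max(1, min(chunk_concurrency, MAX_CHUNK_CONCURRENCY))
    sem = asyncio.Semaphore(limit)

    async def guarded(chunk: Any) -> Any:
        # The semaphore bounds in-flight LLM calls for this document only.
        async with sem:
            return await asyncio.wait_for(extract_one(chunk), timeout=timeout_s)

    # return_exceptions=True keeps per-chunk failures isolated.
    return await asyncio.gather(
        *(guarded(chunk) for chunk in chunks),
        return_exceptions=True,
    )
```

The caller can then split the results with isinstance(result, Exception) to record failed chunks and merge the successful ones. Because the semaphore is created per call, it limits concurrency per document; the total load on the model provider is still roughly document-level concurrency times chunk_concurrency, which is why the two settings should stay separate.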

Related operational finding

This is separate from ApeRAG#1866, where graph compaction logs non-fatal warnings due to missing keyword arguments.
