[Bug]: Exceptions from Trio nursery #5761

@sebastiangeislerlipo

Is there an existing issue for the same bug?

  • I have checked the existing issues.

RAGFlow workspace code commit ID

?

RAGFlow image version

v0.17.0-57-g4f950430 full

Other environment information

Docker nightly pull

Actual behavior

2025-03-07 08:29:23,561 INFO 29 HTTP Request: POST http://host.docker.internal:666/v1/chat/completions "HTTP/1.1 200 OK"
2025-03-07 08:29:36,630 INFO 29 HTTP Request: POST http://host.docker.internal:666/v1/chat/completions "HTTP/1.1 200 OK"
2025-03-07 08:29:37,055 INFO 29 set_progress(6e0cb884fb2411efac46760ef4049cc8), progress: -1, progress_msg: 08:29:37 [ERROR][Exception]: Exceptions from Trio nursery (1 sub-exception)
2025-03-07 08:29:37,095 ERROR 29 handle_task got exception for task {"id": "6e0cb884fb2411efac46760ef4049cc8", "doc_id": "fe441a36f5e411efb063563e61c9c160", "from_page": 100000000, "to_page": 100000000, "retry_count": 0, "kb_id": "13ff7924f5db11efb2dec6a17a35e4d4", "parser_id": "paper", "parser_config": {"auto_keywords": 0, "auto_questions": 0, "raptor": {"use_raptor": true, "prompt": "Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\n {cluster_content}\nThe above is the content you need to summarize.", "max_token": 256, "threshold": 0.1, "max_cluster": 64, "random_seed": 0}, "graphrag": {"use_graphrag": true, "entity_types": ["organization", "person", "geo", "event", "category"], "method": "light", "resolution": true, "community": true}}, "name": "xxx.pdf", "type": "pdf", "location": "xxx.pdf", "size": 1554186, "tenant_id": "577c828aeebc11ef8ae60242ac130006", "language": "English", "embd_id": "sentence-transformers/all-MiniLM-L6-v2@FastEmbed", "pagerank": 0, "kb_parser_config": {"auto_keywords": 0, "auto_questions": 0, "raptor": {"use_raptor": true, "prompt": "Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\n {cluster_content}\nThe above is the content you need to summarize.", "max_token": 256, "threshold": 0.1, "max_cluster": 64, "random_seed": 0}, "graphrag": {"use_graphrag": true, "entity_types": ["organization", "person", "geo", "event", "category"], "method": "light", "resolution": true, "community": true}}, "img2txt_id": "", "asr_id": "", "llm_id": "AI___OpenAI-API@OpenAI-API-Compatible", "update_time": 1741331935552, "task_type": "graphrag"}

  + Exception Group Traceback (most recent call last):
  |   File "/ragflow/rag/svr/task_executor.py", line 617, in handle_task
  |     await do_handle_task(task)
  |   File "/ragflow/rag/svr/task_executor.py", line 529, in do_handle_task
  |     await run_graphrag(task, chat_model, task_language, embedding_model, progress_callback)
  |   File "/ragflow/rag/svr/task_executor.py", line 471, in run_graphrag
  |     await dealer()
  |   File "/ragflow/graphrag/general/index.py", line 62, in __call__
  |     ents, rels = await self.ext(self.chunks, self.callback)
  |   File "/ragflow/graphrag/general/extractor.py", line 103, in __call__
  |     async with trio.open_nursery() as nursery:
  |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_core/_run.py", line 1058, in __aexit__
  |     raise combined_error_from_nursery
  | exceptiongroup.ExceptionGroup: Exceptions from Trio nursery (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/ragflow/graphrag/light/graph_extractor.py", line 95, in _process_single_content
    |     final_result = await trio.to_thread.run_sync(lambda: self._chat(hint_prompt, [{"role": "user", "content": "Output:"}], gen_conf))
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync
    |     return msg_from_thread.unwrap()
    |   File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    |     raise captured_error
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result
    |     return result.unwrap()
    |   File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    |     raise captured_error
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn
    |     ret = context.run(sync_fn, *args)
    |   File "/ragflow/graphrag/light/graph_extractor.py", line 95, in <lambda>
    |     final_result = await trio.to_thread.run_sync(lambda: self._chat(hint_prompt, [{"role": "user", "content": "Output:"}], gen_conf))
    |   File "/ragflow/graphrag/general/extractor.py", line 63, in _chat
    |     response = self._llm.chat(system_msg[0]["content"], hist, conf)
    | IndexError: list index out of range
    +------------------------------------
2025-03-07 08:29:49,270 INFO 29 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2025-03-07T08:29:49.269+01:00", "boot_at": "2025-03-07T08:15:17.104+01:00", "pending": 0, "lag": 0, "done": 1, "failed": 1, "current": {}}
2025-03-07 08:30:19,291 INFO 29 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2025-03-07T08:30:19.291+01:00", "boot_at": "2025-03-07T08:15:17.104+01:00", "pending": 0, "lag": 0, "done": 1, "failed": 1, "current": {}}
2025-03-07 08:30:33,592 INFO 16 172.19.0.6 - - [07/Mar/2025 08:30:33] "GET /v1/user/info HTTP/1.1" 200 -
2025-03-07 08:30:33,605 INFO 16 172.19.0.6 - - [07/Mar/2025 08:30:33] "GET /v1/user/tenant_info HTTP/1.1" 200 -
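
The progress message in the log only reports "Exceptions from Trio nursery (1 sub-exception)", while the actual cause (the IndexError raised when `system_msg[0]` is indexed in `_chat`) appears only in the nested traceback. The log does not show why `system_msg` was empty. As a rough sketch of how the nested cause could be surfaced in a one-line message, the helpers below walk an exception group and collect its leaf exceptions. The names `iter_leaf_exceptions` and `summarize` are hypothetical and not part of RAGFlow or Trio; the only assumption about the group object is that it exposes its children through an `exceptions` attribute, as both the built-in `ExceptionGroup` (Python 3.11+) and the `exceptiongroup` backport do.

```python
from typing import Iterator


def iter_leaf_exceptions(exc: BaseException) -> Iterator[BaseException]:
    """Yield the non-group exceptions nested inside `exc`.

    Hypothetical helper: anything without a non-empty `exceptions`
    attribute is treated as a leaf (a "real" error, not a wrapper).
    """
    children = getattr(exc, "exceptions", None)
    if not children:
        yield exc
        return
    for child in children:
        # Groups can be nested, so recurse into each child.
        yield from iter_leaf_exceptions(child)


def summarize(exc: BaseException) -> str:
    """Build a one-line message naming every leaf exception."""
    return "; ".join(
        f"{type(leaf).__name__}: {leaf}" for leaf in iter_leaf_exceptions(exc)
    )
```

Applied to the group in the traceback above, `summarize` would produce a message naming `IndexError: list index out of range` rather than only the group's own text. Whether such a message should replace or supplement the existing progress message is a design choice for the maintainers.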

Expected behavior

No response

Steps to reproduce

Parsing a 77-page PDF produces this error. Other, smaller PDFs are parsed without problems.

Additional information

[Image]
