Description
Is there an existing issue for the same bug?
- I have checked the existing issues.
RAGFlow workspace code commit ID
?
RAGFlow image version
v0.17.0-57-g4f950430 full
Other environment information
Docker nightly pull
Actual behavior
2025-03-07 08:29:23,561 INFO 29 HTTP Request: POST http://host.docker.internal:666/v1/chat/completions "HTTP/1.1 200 OK"
2025-03-07 08:29:36,630 INFO 29 HTTP Request: POST http://host.docker.internal:666/v1/chat/completions "HTTP/1.1 200 OK"
2025-03-07 08:29:37,055 INFO 29 set_progress(6e0cb884fb2411efac46760ef4049cc8), progress: -1, progress_msg: 08:29:37 [ERROR][Exception]: Exceptions from Trio nursery (1 sub-exception)
2025-03-07 08:29:37,095 ERROR 29 handle_task got exception for task {"id": "6e0cb884fb2411efac46760ef4049cc8", "doc_id": "fe441a36f5e411efb063563e61c9c160", "from_page": 100000000, "to_page": 100000000, "retry_count": 0, "kb_id": "13ff7924f5db11efb2dec6a17a35e4d4", "parser_id": "paper", "parser_config": {"auto_keywords": 0, "auto_questions": 0, "raptor": {"use_raptor": true, "prompt": "Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\n {cluster_content}\nThe above is the content you need to summarize.", "max_token": 256, "threshold": 0.1, "max_cluster": 64, "random_seed": 0}, "graphrag": {"use_graphrag": true, "entity_types": ["organization", "person", "geo", "event", "category"], "method": "light", "resolution": true, "community": true}}, "name": "xxx.pdf", "type": "pdf", "location": "xxx.pdf", "size": 1554186, "tenant_id": "577c828aeebc11ef8ae60242ac130006", "language": "English", "embd_id": "sentence-transformers/all-MiniLM-L6-v2@FastEmbed", "pagerank": 0, "kb_parser_config": {"auto_keywords": 0, "auto_questions": 0, "raptor": {"use_raptor": true, "prompt": "Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\n {cluster_content}\nThe above is the content you need to summarize.", "max_token": 256, "threshold": 0.1, "max_cluster": 64, "random_seed": 0}, "graphrag": {"use_graphrag": true, "entity_types": ["organization", "person", "geo", "event", "category"], "method": "light", "resolution": true, "community": true}}, "img2txt_id": "", "asr_id": "", "llm_id": "AI___OpenAI-API@OpenAI-API-Compatible", "update_time": 1741331935552, "task_type": "graphrag"}
+ Exception Group Traceback (most recent call last):
| File "/ragflow/rag/svr/task_executor.py", line 617, in handle_task
| await do_handle_task(task)
| File "/ragflow/rag/svr/task_executor.py", line 529, in do_handle_task
| await run_graphrag(task, chat_model, task_language, embedding_model, progress_callback)
| File "/ragflow/rag/svr/task_executor.py", line 471, in run_graphrag
| await dealer()
| File "/ragflow/graphrag/general/index.py", line 62, in __call__
| ents, rels = await self.ext(self.chunks, self.callback)
| File "/ragflow/graphrag/general/extractor.py", line 103, in __call__
| async with trio.open_nursery() as nursery:
| File "/ragflow/.venv/lib/python3.10/site-packages/trio/_core/_run.py", line 1058, in __aexit__
| raise combined_error_from_nursery
| exceptiongroup.ExceptionGroup: Exceptions from Trio nursery (1 sub-exception)
+-+---------------- 1 ----------------
| Traceback (most recent call last):
| File "/ragflow/graphrag/light/graph_extractor.py", line 95, in _process_single_content
| final_result = await trio.to_thread.run_sync(lambda: self._chat(hint_prompt, [{"role": "user", "content": "Output:"}], gen_conf))
| File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync
| return msg_from_thread.unwrap()
| File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
| raise captured_error
| File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result
| return result.unwrap()
| File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
| raise captured_error
| File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn
| ret = context.run(sync_fn, *args)
| File "/ragflow/graphrag/light/graph_extractor.py", line 95, in <lambda>
| final_result = await trio.to_thread.run_sync(lambda: self._chat(hint_prompt, [{"role": "user", "content": "Output:"}], gen_conf))
| File "/ragflow/graphrag/general/extractor.py", line 63, in _chat
| response = self._llm.chat(system_msg[0]["content"], hist, conf)
| IndexError: list index out of range
+------------------------------------
2025-03-07 08:29:49,270 INFO 29 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2025-03-07T08:29:49.269+01:00", "boot_at": "2025-03-07T08:15:17.104+01:00", "pending": 0, "lag": 0, "done": 1, "failed": 1, "current": {}}
2025-03-07 08:30:19,291 INFO 29 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2025-03-07T08:30:19.291+01:00", "boot_at": "2025-03-07T08:15:17.104+01:00", "pending": 0, "lag": 0, "done": 1, "failed": 1, "current": {}}
2025-03-07 08:30:33,592 INFO 16 172.19.0.6 - - [07/Mar/2025 08:30:33] "GET /v1/user/info HTTP/1.1" 200 -
2025-03-07 08:30:33,605 INFO 16 172.19.0.6 - - [07/Mar/2025 08:30:33] "GET /v1/user/tenant_info HTTP/1.1" 200 -
Expected behavior
No response
Steps to reproduce
Parsing a 77-page PDF produces this error. Smaller PDFs parse without problems.
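
The traceback ends at `system_msg[0]["content"]` in `_chat`, so `system_msg` appears to have been an empty list when it was indexed. The log does not show why it was empty. Below is a minimal sketch of a guard that would turn the bare `IndexError` into a more descriptive failure. The helper name `first_system_content` is hypothetical and is not part of RAGFlow; only the `system_msg[0]["content"]` expression comes from the traceback.

```python
def first_system_content(system_msg: list[dict]) -> str:
    """Return the content of the first message, or fail with a clear error.

    Hypothetical helper for illustration only; it mirrors the expression
    `system_msg[0]["content"]` seen in the traceback.
    """
    if not system_msg:
        # An empty list here would otherwise raise
        # "IndexError: list index out of range".
        raise ValueError("system_msg is empty; no system prompt to send to the LLM")
    return system_msg[0]["content"]


# Normal case: a non-empty message list.
print(first_system_content([{"role": "system", "content": "Extract entities."}]))

# Failing case: an empty list now yields a descriptive error.
try:
    first_system_content([])
except ValueError as exc:
    print(f"caught: {exc}")
```

A guard like this would not fix the underlying cause, but it would make the failing condition visible in the task's `progress_msg` instead of a generic nursery exception.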