
[Bug]: Deploy on Nvidia 5080, Docker starts with GPU configuration, and the knowledge base parses the document with an error. #5907

Open
@zmxccxy

Description


Is there an existing issue for the same bug?

  • I have checked the existing issues.

RAGFlow workspace code commit ID

Latest v0.17.0

RAGFlow image version

Latest v0.17.0

Other environment information

Windows 11, CPU i7-14700KF, GPU Nvidia 5080, with Docker Desktop installed.

Installed the graphics card driver, CUDA Toolkit 12.8, and cuDNN 9.8.1.

I then switched to CUDA 12.6 with the corresponding cuDNN version, but the same error was still reported.

Actual behavior

Specific error information:
ERROR:root:Fail to bind embedding model: CUDA error: no kernel image is available for execution on the device
2025-03-11 14:57:45 CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
2025-03-11 14:57:45 For debugging consider passing CUDA_LAUNCH_BLOCKING=1
2025-03-11 14:57:45 Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
2025-03-11 14:57:45 Traceback (most recent call last):
2025-03-11 14:57:45   File "/ragflow/rag/svr/task_executor.py", line 519, in do_handle_task
2025-03-11 14:57:45     vts, _ = embedding_model.encode(["ok"])
2025-03-11 14:57:45   File "<@beartype(api.db.services.llm_service.LLMBundle.encode) at 0x7f089fb60430>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/api/db/services/llm_service.py", line 240, in encode
2025-03-11 14:57:45     embeddings, used_tokens = self.mdl.encode(texts)
2025-03-11 14:57:45   File "<@beartype(rag.llm.embedding_model.DefaultEmbedding.encode) at 0x7f08a1bddf30>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/rag/llm/embedding_model.py", line 104, in encode
2025-03-11 14:57:45     ress.extend(self._model.encode(texts[i:i + batch_size]).tolist())
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2025-03-11 14:57:45     return func(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/FlagEmbedding/flag_models.py", line 96, in encode
2025-03-11 14:57:45     last_hidden_state = self.model(**inputs, return_dict=True).last_hidden_state
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
2025-03-11 14:57:45     return self._call_impl(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
2025-03-11 14:57:45     return forward_call(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 986, in forward
2025-03-11 14:57:45     extended_attention_mask: torch.Tensor = self.get_extended_attention_mask(attention_mask, input_shape)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 969, in get_extended_attention_mask
2025-03-11 14:57:45     extended_attention_mask = extended_attention_mask.to(dtype=dtype) # fp16 compatibility
2025-03-11 14:57:45 RuntimeError: CUDA error: no kernel image is available for execution on the device
2025-03-11 14:57:45 CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
2025-03-11 14:57:45 For debugging consider passing CUDA_LAUNCH_BLOCKING=1
2025-03-11 14:57:45 Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
2025-03-11 14:57:45
2025-03-11 14:57:45 ERROR:root:handle_task got exception for task {"id": "22c765f2fe4611ef98d00242ac120006", "doc_id": "2ea8b848fe1f11efa1b20242ac120006", "from_page": 0, "to_page": 94, "retry_count": 0, "kb_id": "1ddb13e4fe1f11efbe600242ac120006", "parser_id": "table", "parser_config": {"layout_recognize": "DeepDOC", "chunk_token_num": 128, "delimiter": "\n!?;\u3002\uff1b\uff01\uff1f", "auto_keywords": 0, "auto_questions": 0, "html4excel": false, "graphrag": {"use_graphrag": false}, "pages": []}, "name": "\u4e09\u8f6e\u592e\u7763\u6574\u6539\u8fdb\u5c55.xlsx", "type": "doc", "location": "\u4e09\u8f6e\u592e\u7763\u6574\u6539\u8fdb\u5c55.xlsx", "size": 149379, "tenant_id": "3fcce5fafe1e11efa9770242ac120006", "language": "Chinese", "embd_id": "BAAI/bge-large-zh-v1.5@BAAI", "pagerank": 0, "kb_parser_config": {"layout_recognize": "DeepDOC", "chunk_token_num": 128, "delimiter": "\n!?;\u3002\uff1b\uff01\uff1f", "auto_keywords": 0, "auto_questions": 0, "html4excel": false, "raptor": {"use_raptor": false}, "graphrag": {"use_graphrag": false}}, "img2txt_id": "", "asr_id": "", "llm_id": "deepseek-r1-distill-llama-70b@Tongyi-Qianwen", "update_time": 1741676265665, "task_type": ""}
2025-03-11 14:57:45 Traceback (most recent call last):
2025-03-11 14:57:45   File "/ragflow/rag/svr/task_executor.py", line 662, in handle_task
2025-03-11 14:57:45     do_handle_task(task)
2025-03-11 14:57:45   File "/ragflow/rag/svr/task_executor.py", line 519, in do_handle_task
2025-03-11 14:57:45     vts, _ = embedding_model.encode(["ok"])
2025-03-11 14:57:45   File "<@beartype(api.db.services.llm_service.LLMBundle.encode) at 0x7f089fb60430>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/api/db/services/llm_service.py", line 240, in encode
2025-03-11 14:57:45     embeddings, used_tokens = self.mdl.encode(texts)
2025-03-11 14:57:45   File "<@beartype(rag.llm.embedding_model.DefaultEmbedding.encode) at 0x7f08a1bddf30>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/rag/llm/embedding_model.py", line 104, in encode
2025-03-11 14:57:45     ress.extend(self._model.encode(texts[i:i + batch_size]).tolist())
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2025-03-11 14:57:45     return func(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/FlagEmbedding/flag_models.py", line 96, in encode
2025-03-11 14:57:45     last_hidden_state = self.model(**inputs, return_dict=True).last_hidden_state
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
2025-03-11 14:57:45     return self._call_impl(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
2025-03-11 14:57:45     return forward_call(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 986, in forward
2025-03-11 14:57:45     extended_attention_mask: torch.Tensor = self.get_extended_attention_mask(attention_mask, input_shape)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 969, in get_extended_attention_mask
2025-03-11 14:57:45     extended_attention_mask = extended_attention_mask.to(dtype=dtype) # fp16 compatibility
2025-03-11 14:57:45 RuntimeError: CUDA error: no kernel image is available for execution on the device
2025-03-11 14:57:45 CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
2025-03-11 14:57:45 For debugging consider passing CUDA_LAUNCH_BLOCKING=1
2025-03-11 14:57:45 Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
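
A "no kernel image is available" error usually means the installed PyTorch build ships no kernels for the GPU's architecture. One possible way to check this is to run a small script with the Python interpreter inside the RAGFlow container and compare the device's compute capability with the architectures the build was compiled for. The following is only a sketch: `parse_arch` and `arch_supported` are hypothetical helper names, the compatibility rule is simplified, and the script assumes `torch` is importable in the container's virtualenv.

```python
import re


def parse_arch(entry):
    """Split an entry such as 'sm_86' or 'compute_90' into (kind, major, minor)."""
    m = re.fullmatch(r"(sm|compute)_(\d{2,})[a-z]?", entry)
    if m is None:
        return None
    digits = m.group(2)
    # The last digit is the minor version, the rest is the major version.
    return m.group(1), int(digits[:-1]), int(digits[-1])


def arch_supported(capability, arch_list):
    """Simplified check: can a build compiled for arch_list run on this device?

    capability is a (major, minor) tuple. Assumed rule: an 'sm_XY' binary runs
    on devices with the same major version and an equal or higher minor
    version; 'compute_XY' PTX can be JIT-compiled for any equal or newer device.
    """
    for entry in arch_list:
        parsed = parse_arch(entry)
        if parsed is None:
            continue
        kind, major, minor = parsed
        if kind == "sm" and major == capability[0] and minor <= capability[1]:
            return True
        if kind == "compute" and (major, minor) <= tuple(capability):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        if not torch.cuda.is_available():
            print("CUDA is not available to torch")
        else:
            cap = torch.cuda.get_device_capability(0)
            archs = torch.cuda.get_arch_list()
            print("torch:", torch.__version__, "built for CUDA:", torch.version.cuda)
            print("device capability:", cap)
            print("compiled architectures:", archs)
            print("supported:", arch_supported(cap, archs))
```

If this prints `supported: False`, the PyTorch build in the image would be the likely cause, rather than the CUDA or cuDNN version installed on the Windows host.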

Images: [four screenshots attached]

Expected behavior

No response

Steps to reproduce

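Deploy RAGFlow v0.17.0 with Docker using the GPU configuration in the environment described above, then parse a document in a knowledge base.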

Additional information

No response

Labels: 🐞 bug (Something isn't working)
