Description
Is there an existing issue for the same bug?
- I have checked the existing issues.
RAGFlow workspace code commit ID
Latest release, v0.17.0
RAGFlow image version
Latest release, v0.17.0
Other environment information
Windows 11, CPU i7-14700KF, GPU 5080, with Docker Desktop installed.
The GPU driver, CUDA Toolkit 12.8, and cuDNN 9.8.1 are installed.
Actual behavior
Full error output:
ERROR:root:Fail to bind embedding model: CUDA error: no kernel image is available for execution on the device
2025-03-11 14:57:45 CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
2025-03-11 14:57:45 For debugging consider passing CUDA_LAUNCH_BLOCKING=1
2025-03-11 14:57:45 Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
2025-03-11 14:57:45 Traceback (most recent call last):
2025-03-11 14:57:45   File "/ragflow/rag/svr/task_executor.py", line 519, in do_handle_task
2025-03-11 14:57:45     vts, _ = embedding_model.encode(["ok"])
2025-03-11 14:57:45   File "<@beartype(api.db.services.llm_service.LLMBundle.encode) at 0x7f089fb60430>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/api/db/services/llm_service.py", line 240, in encode
2025-03-11 14:57:45     embeddings, used_tokens = self.mdl.encode(texts)
2025-03-11 14:57:45   File "<@beartype(rag.llm.embedding_model.DefaultEmbedding.encode) at 0x7f08a1bddf30>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/rag/llm/embedding_model.py", line 104, in encode
2025-03-11 14:57:45     ress.extend(self._model.encode(texts[i:i + batch_size]).tolist())
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2025-03-11 14:57:45     return func(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/FlagEmbedding/flag_models.py", line 96, in encode
2025-03-11 14:57:45     last_hidden_state = self.model(**inputs, return_dict=True).last_hidden_state
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
2025-03-11 14:57:45     return self._call_impl(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
2025-03-11 14:57:45     return forward_call(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 986, in forward
2025-03-11 14:57:45     extended_attention_mask: torch.Tensor = self.get_extended_attention_mask(attention_mask, input_shape)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 969, in get_extended_attention_mask
2025-03-11 14:57:45     extended_attention_mask = extended_attention_mask.to(dtype=dtype) # fp16 compatibility
2025-03-11 14:57:45 RuntimeError: CUDA error: no kernel image is available for execution on the device
2025-03-11 14:57:45 CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
2025-03-11 14:57:45 For debugging consider passing CUDA_LAUNCH_BLOCKING=1
2025-03-11 14:57:45 Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
2025-03-11 14:57:45
2025-03-11 14:57:45 ERROR:root:handle_task got exception for task {"id": "22c765f2fe4611ef98d00242ac120006", "doc_id": "2ea8b848fe1f11efa1b20242ac120006", "from_page": 0, "to_page": 94, "retry_count": 0, "kb_id": "1ddb13e4fe1f11efbe600242ac120006", "parser_id": "table", "parser_config": {"layout_recognize": "DeepDOC", "chunk_token_num": 128, "delimiter": "\n!?;\u3002\uff1b\uff01\uff1f", "auto_keywords": 0, "auto_questions": 0, "html4excel": false, "graphrag": {"use_graphrag": false}, "pages": []}, "name": "\u4e09\u8f6e\u592e\u7763\u6574\u6539\u8fdb\u5c55.xlsx", "type": "doc", "location": "\u4e09\u8f6e\u592e\u7763\u6574\u6539\u8fdb\u5c55.xlsx", "size": 149379, "tenant_id": "3fcce5fafe1e11efa9770242ac120006", "language": "Chinese", "embd_id": "BAAI/bge-large-zh-v1.5@BAAI", "pagerank": 0, "kb_parser_config": {"layout_recognize": "DeepDOC", "chunk_token_num": 128, "delimiter": "\n!?;\u3002\uff1b\uff01\uff1f", "auto_keywords": 0, "auto_questions": 0, "html4excel": false, "raptor": {"use_raptor": false}, "graphrag": {"use_graphrag": false}}, "img2txt_id": "", "asr_id": "", "llm_id": "deepseek-r1-distill-llama-70b@Tongyi-Qianwen", "update_time": 1741676265665, "task_type": ""}
2025-03-11 14:57:45 Traceback (most recent call last):
2025-03-11 14:57:45   File "/ragflow/rag/svr/task_executor.py", line 662, in handle_task
2025-03-11 14:57:45     do_handle_task(task)
2025-03-11 14:57:45   File "/ragflow/rag/svr/task_executor.py", line 519, in do_handle_task
2025-03-11 14:57:45     vts, _ = embedding_model.encode(["ok"])
2025-03-11 14:57:45   File "<@beartype(api.db.services.llm_service.LLMBundle.encode) at 0x7f089fb60430>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/api/db/services/llm_service.py", line 240, in encode
2025-03-11 14:57:45     embeddings, used_tokens = self.mdl.encode(texts)
2025-03-11 14:57:45   File "<@beartype(rag.llm.embedding_model.DefaultEmbedding.encode) at 0x7f08a1bddf30>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/rag/llm/embedding_model.py", line 104, in encode
2025-03-11 14:57:45     ress.extend(self._model.encode(texts[i:i + batch_size]).tolist())
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2025-03-11 14:57:45     return func(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/FlagEmbedding/flag_models.py", line 96, in encode
2025-03-11 14:57:45     last_hidden_state = self.model(**inputs, return_dict=True).last_hidden_state
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
2025-03-11 14:57:45     return self._call_impl(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
2025-03-11 14:57:45     return forward_call(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 986, in forward
2025-03-11 14:57:45     extended_attention_mask: torch.Tensor = self.get_extended_attention_mask(attention_mask, input_shape)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 969, in get_extended_attention_mask
2025-03-11 14:57:45     extended_attention_mask = extended_attention_mask.to(dtype=dtype) # fp16 compatibility
2025-03-11 14:57:45 RuntimeError: CUDA error: no kernel image is available for execution on the device
2025-03-11 14:57:45 CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
2025-03-11 14:57:45 For debugging consider passing CUDA_LAUNCH_BLOCKING=1
2025-03-11 14:57:45 Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
After some research, I'm wondering: does RAGFlow currently not support 50-series graphics cards?
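One way to narrow this down: "no kernel image is available" generally means the PyTorch build in use contains no kernels for the GPU's compute capability. The sketch below compares the two. It is a rough diagnostic, not part of RAGFlow: `parse_arch` and `build_can_run_on` are hypothetical helper names, the compatibility rule is simplified, and the only PyTorch calls used are `torch.cuda.is_available`, `torch.cuda.get_device_capability`, and `torch.cuda.get_arch_list`. It is assumed to be run with the Python interpreter from `/ragflow/.venv` inside the container.

```python
import re


def parse_arch(tag):
    """Split a tag such as 'sm_86' or 'compute_90' into (kind, major, minor)."""
    m = re.fullmatch(r"(sm|compute)_(\d+)(\d)[a-z]?", tag)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), int(m.group(3))


def build_can_run_on(capability, arch_list):
    """Return True if a build compiled for arch_list can run on the device.

    Simplified rule (an assumption of this sketch):
    - 'sm_XY' binaries run on devices with the same major version and a
      minor version >= Y;
    - 'compute_XY' PTX can be JIT-compiled for any device with capability
      >= X.Y.
    Architecture-specific suffixes such as 'sm_90a' are treated like 'sm_90'.
    """
    dev_major, dev_minor = capability
    for tag in arch_list:
        parsed = parse_arch(tag)
        if parsed is None:
            continue
        kind, major, minor = parsed
        if kind == "sm" and major == dev_major and minor <= dev_minor:
            return True
        if kind == "compute" and (major, minor) <= (dev_major, dev_minor):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        if not torch.cuda.is_available():
            print("CUDA is not available to PyTorch")
        else:
            cap = torch.cuda.get_device_capability(0)
            archs = torch.cuda.get_arch_list()
            print("device capability:", cap)
            print("build arch list:", archs)
            print("build can run on device:", build_can_run_on(cap, archs))
```

If the last line prints `False`, the likely cause is the PyTorch build bundled in the image rather than the host driver or CUDA Toolkit installation.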
Expected behavior
No response
Steps to reproduce
As shown above.
Additional information
No response