[Bug]: Deploy on Nvidia 5080, Docker starts with GPU configuration, and the knowledge base parses the document with an error.

### Is there an existing issue for the same bug?

- [x] I have checked the existing issues.

### RAGFlow workspace code commit ID

Latest v0.17.0

### RAGFlow image version

Latest v0.17.0

### Other environment information

```Markdown
Windows 11, CPU i7-14700KF, GPU Nvidia 5080, Docker Desktop installed.

Installed the graphics card driver, CUDA Toolkit 12.8, and cuDNN 9.8.1.

I then switched to CUDA 12.6 with the corresponding cuDNN version, but the same problem was still reported.
```

### Actual behavior

Specific error information:

ERROR:root:Fail to bind embedding model: CUDA error: no kernel image is available for execution on the device
2025-03-11 14:57:45 CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
2025-03-11 14:57:45 For debugging consider passing CUDA_LAUNCH_BLOCKING=1
2025-03-11 14:57:45 Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
2025-03-11 14:57:45 Traceback (most recent call last):
2025-03-11 14:57:45   File "/ragflow/rag/svr/task_executor.py", line 519, in do_handle_task
2025-03-11 14:57:45     vts, _ = embedding_model.encode(["ok"])
2025-03-11 14:57:45   File "<@beartype(api.db.services.llm_service.LLMBundle.encode) at 0x7f089fb60430>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/api/db/services/llm_service.py", line 240, in encode
2025-03-11 14:57:45     embeddings, used_tokens = self.mdl.encode(texts)
2025-03-11 14:57:45   File "<@beartype(rag.llm.embedding_model.DefaultEmbedding.encode) at 0x7f08a1bddf30>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/rag/llm/embedding_model.py", line 104, in encode
2025-03-11 14:57:45     ress.extend(self._model.encode(texts[i:i + batch_size]).tolist())
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2025-03-11 14:57:45     return func(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/FlagEmbedding/flag_models.py", line 96, in encode
2025-03-11 14:57:45     last_hidden_state = self.model(**inputs, return_dict=True).last_hidden_state
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
2025-03-11 14:57:45     return self._call_impl(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
2025-03-11 14:57:45     return forward_call(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 986, in forward
2025-03-11 14:57:45     extended_attention_mask: torch.Tensor = self.get_extended_attention_mask(attention_mask, input_shape)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 969, in get_extended_attention_mask
2025-03-11 14:57:45     extended_attention_mask = extended_attention_mask.to(dtype=dtype) # fp16 compatibility
2025-03-11 14:57:45 RuntimeError: CUDA error: no kernel image is available for execution on the device
2025-03-11 14:57:45 CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
2025-03-11 14:57:45 For debugging consider passing CUDA_LAUNCH_BLOCKING=1
2025-03-11 14:57:45 Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
2025-03-11 14:57:45
2025-03-11 14:57:45 ERROR:root:handle_task got exception for task {"id": "22c765f2fe4611ef98d00242ac120006", "doc_id": "2ea8b848fe1f11efa1b20242ac120006", "from_page": 0, "to_page": 94, "retry_count": 0, "kb_id": "1ddb13e4fe1f11efbe600242ac120006", "parser_id": "table", "parser_config": {"layout_recognize": "DeepDOC", "chunk_token_num": 128, "delimiter": "\n!?;\u3002\uff1b\uff01\uff1f", "auto_keywords": 0, "auto_questions": 0, "html4excel": false, "graphrag": {"use_graphrag": false}, "pages": []}, "name": "\u4e09\u8f6e\u592e\u7763\u6574\u6539\u8fdb\u5c55.xlsx", "type": "doc", "location": "\u4e09\u8f6e\u592e\u7763\u6574\u6539\u8fdb\u5c55.xlsx", "size": 149379, "tenant_id": "3fcce5fafe1e11efa9770242ac120006", "language": "Chinese", "embd_id": "BAAI/bge-large-zh-v1.5@BAAI", "pagerank": 0, "kb_parser_config": {"layout_recognize": "DeepDOC", "chunk_token_num": 128, "delimiter": "\n!?;\u3002\uff1b\uff01\uff1f", "auto_keywords": 0, "auto_questions": 0, "html4excel": false, "raptor": {"use_raptor": false}, "graphrag": {"use_graphrag": false}}, "img2txt_id": "", "asr_id": "", "llm_id": "deepseek-r1-distill-llama-70b@Tongyi-Qianwen", "update_time": 1741676265665, "task_type": ""}
2025-03-11 14:57:45 Traceback (most recent call last):
2025-03-11 14:57:45   File "/ragflow/rag/svr/task_executor.py", line 662, in handle_task
2025-03-11 14:57:45     do_handle_task(task)
2025-03-11 14:57:45   File "/ragflow/rag/svr/task_executor.py", line 519, in do_handle_task
2025-03-11 14:57:45     vts, _ = embedding_model.encode(["ok"])
2025-03-11 14:57:45   File "<@beartype(api.db.services.llm_service.LLMBundle.encode) at 0x7f089fb60430>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/api/db/services/llm_service.py", line 240, in encode
2025-03-11 14:57:45     embeddings, used_tokens = self.mdl.encode(texts)
2025-03-11 14:57:45   File "<@beartype(rag.llm.embedding_model.DefaultEmbedding.encode) at 0x7f08a1bddf30>", line 31, in encode
2025-03-11 14:57:45   File "/ragflow/rag/llm/embedding_model.py", line 104, in encode
2025-03-11 14:57:45     ress.extend(self._model.encode(texts[i:i + batch_size]).tolist())
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2025-03-11 14:57:45     return func(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/FlagEmbedding/flag_models.py", line 96, in encode
2025-03-11 14:57:45     last_hidden_state = self.model(**inputs, return_dict=True).last_hidden_state
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
2025-03-11 14:57:45     return self._call_impl(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
2025-03-11 14:57:45     return forward_call(*args, **kwargs)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 986, in forward
2025-03-11 14:57:45     extended_attention_mask: torch.Tensor = self.get_extended_attention_mask(attention_mask, input_shape)
2025-03-11 14:57:45   File "/ragflow/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 969, in get_extended_attention_mask
2025-03-11 14:57:45     extended_attention_mask = extended_attention_mask.to(dtype=dtype) # fp16 compatibility
2025-03-11 14:57:45 RuntimeError: CUDA error: no kernel image is available for execution on the device
2025-03-11 14:57:45 CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
2025-03-11 14:57:45 For debugging consider passing CUDA_LAUNCH_BLOCKING=1
2025-03-11 14:57:45 Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
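
A possible way to narrow this down: "no kernel image is available for execution on the device" is commonly raised when the installed PyTorch build contains no kernels for the GPU's compute capability. Whether that is the cause here is an assumption, not something the log confirms. The sketch below compares the device capability with the architectures the PyTorch build was compiled for, using `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`. The helper names `parse_arch` and `build_supports_device` are hypothetical, and the compatibility rule is a simplification of CUDA's binary and PTX compatibility rules.

```python
import re


def parse_arch(entry):
    """Split an entry such as 'sm_86' or 'compute_90' into (kind, (major, minor), suffix)."""
    match = re.fullmatch(r"(sm|compute)_(\d+)([a-z]?)", entry)
    if match is None:
        return None
    kind, digits, suffix = match.groups()
    major, minor = divmod(int(digits), 10)  # 'sm_120' -> (12, 0), 'sm_86' -> (8, 6)
    return kind, (major, minor), suffix


def build_supports_device(capability, arch_list):
    """Return True if any compiled architecture can plausibly run on the device.

    Simplified rule (an assumption of this sketch):
    - 'sm_XY' binaries run on devices with the same major version and minor >= Y.
    - 'compute_XY' PTX can be JIT-compiled for any device with capability >= X.Y.
    - Suffixed targets such as 'sm_90a' only match that exact architecture.
    """
    for entry in arch_list:
        parsed = parse_arch(entry)
        if parsed is None:
            continue  # ignore entries this sketch does not understand
        kind, arch, suffix = parsed
        if suffix:
            if arch == capability:
                return True
            continue
        if kind == "sm" and arch[0] == capability[0] and arch[1] <= capability[1]:
            return True
        if kind == "compute" and arch <= capability:
            return True
    return False


def main():
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    if not torch.cuda.is_available():
        print("torch cannot see a CUDA device")
        return
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("torch version:", torch.__version__, "| built for CUDA:", torch.version.cuda)
    print("device capability:", capability)
    print("compiled architectures:", arch_list)
    print("build supports device:", build_supports_device(capability, arch_list))


if __name__ == "__main__":
    main()
```

To be meaningful, this would have to run in the same Python environment that appears in the traceback (`/ragflow/.venv` inside the container), since the PyTorch build used there is the one that matters, not the CUDA toolkit installed on the Windows host.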

Images:

![Image](https://github.com/user-attachments/assets/06d4ee19-9a59-4303-a335-66d20098006a)

![Image](https://github.com/user-attachments/assets/c616e0eb-9cf0-41df-9cb6-641c755243d3)

![Image](https://github.com/user-attachments/assets/ed1847be-349b-4bd1-989c-5e3ffdd1899a)

![Image](https://github.com/user-attachments/assets/2c03553f-f4eb-4563-a7a6-3cf16efcde1a)

### Expected behavior

_No response_

### Steps to reproduce

```Markdown
Deploy as described above (Docker started with the GPU configuration), then parse a document in the knowledge base.
```

### Additional information

_No response_


[Bug]: Deploy on Nvidia 5080, Docker starts with GPU configuration, and the knowledge base parses the document with an error. #5907
