[Bug]: api/v1/retrieval API is slow and CPU usage is high

### Self Checks

- [x] I have searched for existing issues [search for existing issues](https://github.com/infiniflow/ragflow/issues), including closed ones.
- [x] I confirm that I am using English to submit this report ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
- [x] Non-English title submissions will be closed directly ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
- [x] Please do not modify this template :) and fill in all the required fields.

### RAGFlow workspace code commit ID

no

### RAGFlow image version

v0.20.5

### Other environment information

```Markdown
8 CPU cores, 32 GB RAM
```

### Actual behavior

Hardware configuration: 8-core CPU, 32 GB RAM, single-instance deployment.

Stress test results: at 10 QPS, the retrieval interface (`api/v1/retrieval`) slows down significantly.

- Without a reranking model: CPU usage reaches 400% (4 cores).
- With the Tongyi Qianwen reranking model: CPU usage reaches 100% (1 core).

The hardware has not reached its limits. As the request rate increases further, CPU usage stops rising, but response times keep getting longer.

Questions:

1. Is the API limited by the number of concurrent threads? If so, how can that number be increased?
2. Why is CPU usage higher when the reranking model is not used?
3. How can CPU usage be reduced?

### Expected behavior

_No response_

### Steps to reproduce

```Markdown
-
```
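
No reproduction script was attached, so the following is only a minimal sketch of how a fixed-rate load like the 10 QPS test described above could be generated. It is not the script used for the reported numbers. The helper names (`run_load`, `summarize`, `make_retrieval_sender`), the base URL, the API key, the dataset ID, and the request body fields (`question`, `dataset_ids`) are assumptions and should be checked against the RAGFlow HTTP API reference for the deployed version.

```python
import json
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def run_load(send, qps, duration_s, max_workers=32):
    """Call send() at a fixed rate; return per-request latencies in seconds."""
    def timed_call():
        # Latency is measured from the moment a worker thread starts the call,
        # so time spent waiting for a free worker is not included.
        start = time.perf_counter()
        send()
        return time.perf_counter() - start

    interval = 1.0 / qps
    total = int(qps * duration_s)
    futures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        t0 = time.perf_counter()
        for i in range(total):
            # Open-loop pacing: request i is submitted at t0 + i * interval,
            # regardless of whether earlier requests have finished.
            delay = t0 + i * interval - time.perf_counter()
            if delay > 0:
                time.sleep(delay)
            futures.append(pool.submit(timed_call))
        return [f.result() for f in futures]


def summarize(latencies):
    """Return count, mean, 95th percentile (nearest rank), and max latency."""
    ordered = sorted(latencies)
    p95_index = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


def make_retrieval_sender(base_url, api_key, dataset_id, question):
    """Build a send() callable that POSTs one retrieval request.

    The body fields and the Bearer header are assumptions, not confirmed
    by this issue.
    """
    url = base_url.rstrip("/") + "/api/v1/retrieval"
    body = json.dumps({"question": question, "dataset_ids": [dataset_id]}).encode()
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }

    def send():
        req = urllib.request.Request(url, data=body, headers=headers, method="POST")
        with urllib.request.urlopen(req, timeout=60) as resp:
            resp.read()

    return send


# Example usage (placeholder values, replace before running):
# send = make_retrieval_sender("http://localhost:9380", "YOUR_API_KEY",
#                              "YOUR_DATASET_ID", "test question")
# print(summarize(run_load(send, qps=10, duration_s=30)))
```

Running the same load once with and once without a reranking model configured, while watching per-process CPU usage on the server, would give numbers comparable to the two cases under "Actual behavior".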

### Additional information

_No response_

[Bug]: api/v1/retrieval api slow and cpu high #10487
