[Bug]: api/v1/retrieval api slow and cpu high #10487

@DJ-RunTu

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

no

RAGFlow image version

v0.20.5

Other environment information

8C 32G

Actual behavior

Hardware configuration: 8-core CPU, 32 GB RAM, single-instance deployment.
Stress test results:
At 10 QPS, the retrieval interface (api/v1/retrieval) responds significantly more slowly.
Without a reranking model: CPU usage reaches 400% (4 cores).
With the Tongyi Qianwen reranking model: CPU usage reaches 100% (1 core).
The hardware resources are not exhausted. When the request rate is increased further, CPU usage stops rising, but the interface responds even more slowly.
Questions:
1. Is the API limited by the number of concurrent threads? If so, how can that number be increased?
2. Why is CPU usage higher when no reranking model is used?
3. How can CPU usage be reduced?
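
For anyone trying to reproduce the 10 QPS measurement, a minimal fixed-rate load sketch might look like the following. It is not the tool used for the results above. `run_fixed_rate`, `summarize`, and the `send` callable are hypothetical names introduced here. `send` is assumed to perform one request against api/v1/retrieval with whatever HTTP client and request body the deployment already uses.

```python
import statistics
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_fixed_rate(send, qps, duration_s, max_workers=64):
    """Call send() at roughly `qps` calls per second for `duration_s` seconds.

    `send` is a caller-supplied function that issues one request.
    Returns the per-call latencies in seconds (failed calls included).
    """
    latencies = []
    lock = threading.Lock()

    def timed_call():
        start = time.perf_counter()
        try:
            send()
        except Exception:
            pass  # in this sketch a failed call is timed like any other
        finally:
            elapsed = time.perf_counter() - start
            with lock:
                latencies.append(elapsed)

    total = int(qps * duration_s)
    interval = 1.0 / qps
    t0 = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for i in range(total):
            # Open-loop schedule: call i is due at t0 + i * interval,
            # whether or not earlier calls have finished.
            delay = t0 + i * interval - time.perf_counter()
            if delay > 0:
                time.sleep(delay)
            pool.submit(timed_call)
        # Leaving the `with` block waits for all submitted calls.
    return latencies


def summarize(latencies):
    """Return count, mean, median (p50) and p95 latency in seconds."""
    ordered = sorted(latencies)
    p95_index = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "p50": statistics.median(ordered),
        "p95": ordered[p95_index],
    }
```

Running `summarize(run_fixed_rate(send, qps=10, duration_s=60))` with and without the reranking model, and again at a higher `qps`, would give comparable latency figures for the two configurations. Note that the timer starts when a worker thread picks up the call, so `max_workers` should stay above the number of in-flight requests to keep client-side queueing out of the numbers.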

Expected behavior

No response

Steps to reproduce

-

Additional information

No response
