
[Bug]: During phrase match performance testing, QPS was normal at first, then at some point all requests timed out. #39894

Closed
@zhuwenxing

Description

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: master-20250214-f7d95877-amd64
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka):
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Image

hit rate 0.1

Image

hit rate 0.01

Image

hit rate 0.001

Image

# Test phrases with their probabilities
TEST_PHRASES = {
    "vector similarity": 0.1,     # Most common phrase
    "milvus search": 0.01,        # Medium frequency phrase
    "nearest neighbor": 0.001,    # Less common phrase
    "high dimensional": 0.0001,   # Rare phrase
}
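For readers who want to approximate the data distribution, the following is a minimal sketch of how a corpus with these hit rates could be generated. It assumes each phrase is inserted into a document independently with its listed probability; the actual test generator is not shown in this issue, and `FILLER_WORDS`, `make_document`, and `observed_hit_rates` are hypothetical names used only for illustration.

```python
import random

# Phrase probabilities copied from the issue.
TEST_PHRASES = {
    "vector similarity": 0.1,
    "milvus search": 0.01,
    "nearest neighbor": 0.001,
    "high dimensional": 0.0001,
}

# Hypothetical filler vocabulary; the real test corpus is not shown in the issue.
FILLER_WORDS = ["data", "index", "query", "storage", "engine", "cluster", "node", "field"]


def make_document(rng, phrases=TEST_PHRASES, length=20):
    """Build one document of filler words, inserting each phrase with its probability."""
    words = [rng.choice(FILLER_WORDS) for _ in range(length)]
    for phrase, probability in phrases.items():
        if rng.random() < probability:
            # Insert the phrase as a single list element so its words stay adjacent.
            position = rng.randrange(len(words) + 1)
            words.insert(position, phrase)
    return " ".join(words)


def observed_hit_rates(documents, phrases=TEST_PHRASES):
    """Return the fraction of documents that contain each phrase as a substring."""
    total = len(documents)
    return {p: sum(p in d for d in documents) / total for p in phrases}


rng = random.Random(42)  # fixed seed so runs are repeatable
corpus = [make_document(rng) for _ in range(20000)]
rates = observed_hit_rates(corpus)
for phrase, rate in rates.items():
    print(f"{phrase!r}: target={TEST_PHRASES[phrase]}, observed={rate:.4f}")
```

With a corpus this small, the rarest phrase may appear only a handful of times or not at all, so its observed rate will be noisy.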

Expected Behavior

No response

Steps To Reproduce

Milvus Log

failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/phrase_match_perf_test/detail/phrase_match_perf_test/13/pipeline

log:
artifacts-phrase-match-test-13-server-logs.tar.gz

cluster: 4am
ns: chaos-testing
pod info


[2025-02-14T09:24:00.737Z] + kubectl get pods -o wide
[2025-02-14T09:24:00.740Z] + grep phrase-match-test-13
[2025-02-14T09:24:00.740Z] phrase-match-test-13-etcd-0                                       1/1     Running            0                54m     10.104.15.160   4am-node20   <none>           <none>
[2025-02-14T09:24:00.740Z] phrase-match-test-13-etcd-1                                       1/1     Running            0                54m     10.104.32.133   4am-node39   <none>           <none>
[2025-02-14T09:24:00.740Z] phrase-match-test-13-etcd-2                                       1/1     Running            0                54m     10.104.20.45    4am-node22   <none>           <none>
[2025-02-14T09:24:00.740Z] phrase-match-test-13-milvus-standalone-66d74df56-w4g6q            1/1     Running            1 (53m ago)      54m     10.104.15.159   4am-node20   <none>           <none>
[2025-02-14T09:24:00.740Z] phrase-match-test-13-minio-5859bc996-lrb4v                        1/1     Running            0                54m     10.104.15.158   4am-node20   <none>           <none>
[2025-02-14T09:24:00.740Z] phrase-match-test-13-pulsarv3-bookie-0                            1/1     Running            0                54m     10.104.15.162   4am-node20   <none>           <none>
[2025-02-14T09:24:00.740Z] phrase-match-test-13-pulsarv3-bookie-1                            1/1     Running            0                54m     10.104.30.187   4am-node38   <none>           <none>
[2025-02-14T09:24:00.740Z] phrase-match-test-13-pulsarv3-bookie-2                            1/1     Running            0                54m     10.104.20.46    4am-node22   <none>           <none>
[2025-02-14T09:24:00.740Z] phrase-match-test-13-pulsarv3-bookie-init-95pcg                   0/1     Completed          0                54m     10.104.15.147   4am-node20   <none>           <none>
[2025-02-14T09:24:00.741Z] phrase-match-test-13-pulsarv3-broker-0                            1/1     Running            0                54m     10.104.15.151   4am-node20   <none>           <none>
[2025-02-14T09:24:00.741Z] phrase-match-test-13-pulsarv3-broker-1                            1/1     Running            0                54m     10.104.23.14    4am-node27   <none>           <none>
[2025-02-14T09:24:00.741Z] phrase-match-test-13-pulsarv3-proxy-0                             1/1     Running            0                54m     10.104.15.148   4am-node20   <none>           <none>
[2025-02-14T09:24:00.741Z] phrase-match-test-13-pulsarv3-proxy-1                             1/1     Running            0                54m     10.104.32.131   4am-node39   <none>           <none>
[2025-02-14T09:24:00.741Z] phrase-match-test-13-pulsarv3-pulsar-init-zjjnw                   0/1     Completed          0                54m     10.104.15.149   4am-node20   <none>           <none>
[2025-02-14T09:24:00.741Z] phrase-match-test-13-pulsarv3-recovery-0                          1/1     Running            0                54m     10.104.15.150   4am-node20   <none>           <none>
[2025-02-14T09:24:00.741Z] phrase-match-test-13-pulsarv3-zookeeper-0                         1/1     Running            0                54m     10.104.23.22    4am-node27   <none>           <none>
[2025-02-14T09:24:00.741Z] phrase-match-test-13-pulsarv3-zookeeper-1                         1/1     Running            0                54m     10.104.15.161   4am-node20   <none>           <none>
[2025-02-14T09:24:00.741Z] phrase-match-test-13-pulsarv3-zookeeper-2                         1/1     Running            0                54m     10.104.26.155   4am-node32   <none>           <none>

Anything else?

The timeouts do not always begin at the same hit rate.

https://qa-jenkins.milvus.io/blue/organizations/jenkins/phrase_match_perf_test/detail/phrase_match_perf_test/12/pipeline
In this run (build 12), the timeouts began while the hit rate was 0.01.

Image
The chart shows only a single peak of successful requests.
pod info

[2025-02-14T08:22:20.558Z] + kubectl get pods -o wide
[2025-02-14T08:22:20.570Z] + grep phrase-match-test-12
[2025-02-14T08:22:20.570Z] phrase-match-test-12-etcd-0                                       1/1     Running       0                61m     10.104.27.49    4am-node31   <none>           <none>
[2025-02-14T08:22:20.570Z] phrase-match-test-12-etcd-1                                       1/1     Running       0                61m     10.104.23.240   4am-node27   <none>           <none>
[2025-02-14T08:22:20.570Z] phrase-match-test-12-etcd-2                                       1/1     Running       0                61m     10.104.25.179   4am-node30   <none>           <none>
[2025-02-14T08:22:20.570Z] phrase-match-test-12-milvus-standalone-5d7d9dd55f-csh6r           1/1     Running       2 (60m ago)      61m     10.104.27.50    4am-node31   <none>           <none>
[2025-02-14T08:22:20.570Z] phrase-match-test-12-minio-78d98d6d5b-gtsx8                       1/1     Running       0                61m     10.104.27.52    4am-node31   <none>           <none>
[2025-02-14T08:22:20.570Z] phrase-match-test-12-pulsarv3-bookie-0                            1/1     Running       0                61m     10.104.27.51    4am-node31   <none>           <none>
[2025-02-14T08:22:20.570Z] phrase-match-test-12-pulsarv3-bookie-1                            1/1     Running       0                61m     10.104.25.178   4am-node30   <none>           <none>
[2025-02-14T08:22:20.570Z] phrase-match-test-12-pulsarv3-bookie-2                            1/1     Running       0                61m     10.104.23.244   4am-node27   <none>           <none>
[2025-02-14T08:22:20.571Z] phrase-match-test-12-pulsarv3-bookie-init-mk8z6                   0/1     Completed     0                61m     10.104.27.34    4am-node31   <none>           <none>
[2025-02-14T08:22:20.571Z] phrase-match-test-12-pulsarv3-broker-0                            1/1     Running       0                61m     10.104.9.223    4am-node14   <none>           <none>
[2025-02-14T08:22:20.571Z] phrase-match-test-12-pulsarv3-broker-1                            1/1     Running       0                61m     10.104.27.39    4am-node31   <none>           <none>
[2025-02-14T08:22:20.571Z] phrase-match-test-12-pulsarv3-proxy-0                             1/1     Running       0                61m     10.104.9.224    4am-node14   <none>           <none>
[2025-02-14T08:22:20.571Z] phrase-match-test-12-pulsarv3-proxy-1                             1/1     Running       0                61m     10.104.27.41    4am-node31   <none>           <none>
[2025-02-14T08:22:20.571Z] phrase-match-test-12-pulsarv3-pulsar-init-2npv9                   0/1     Completed     0                61m     10.104.9.222    4am-node14   <none>           <none>
[2025-02-14T08:22:20.571Z] phrase-match-test-12-pulsarv3-recovery-0                          1/1     Running       0                61m     10.104.27.38    4am-node31   <none>           <none>
[2025-02-14T08:22:20.571Z] phrase-match-test-12-pulsarv3-zookeeper-0                         1/1     Running       0                61m     10.104.27.48    4am-node31   <none>           <none>
[2025-02-14T08:22:20.571Z] phrase-match-test-12-pulsarv3-zookeeper-1                         1/1     Running       0                61m     10.104.25.175   4am-node30   <none>           <none>
[2025-02-14T08:22:20.571Z] phrase-match-test-12-pulsarv3-zookeeper-2                         1/1     Running       0                61m     10.104.23.241   4am-node27   <none>           <none>
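
In both pod listings the milvus-standalone pod shows a nonzero restart count, with the last restart within about a minute of pod creation. Whether those restarts are related to the timeouts is not established here. As a triage aid, here is a small sketch that flags restarted pods in Jenkins-captured `kubectl get pods -o wide` output. The regular expression assumes the column layout shown above, and `restarted_pods` is a hypothetical helper name.

```python
import re

# Three lines copied from the listings in this issue (column padding condensed).
SAMPLE = """\
[2025-02-14T09:24:00.740Z] phrase-match-test-13-etcd-0   1/1   Running   0   54m   10.104.15.160   4am-node20   <none>   <none>
[2025-02-14T09:24:00.740Z] phrase-match-test-13-milvus-standalone-66d74df56-w4g6q   1/1   Running   1 (53m ago)   54m   10.104.15.159   4am-node20   <none>   <none>
[2025-02-14T08:22:20.570Z] phrase-match-test-12-milvus-standalone-5d7d9dd55f-csh6r   1/1   Running   2 (60m ago)   61m   10.104.27.50   4am-node31   <none>   <none>
"""

# Assumed layout: [timestamp] NAME READY STATUS RESTARTS [(last restart)] AGE ...
LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+"
    r"(?P<name>\S+)\s+"
    r"(?P<ready>\d+/\d+)\s+"
    r"(?P<status>\S+)\s+"
    r"(?P<restarts>\d+)(?:\s+\((?P<last>[^)]*)\))?\s+"
    r"(?P<age>\S+)"
)


def restarted_pods(text):
    """Return (pod name, restart count, last restart) for pods that restarted."""
    result = []
    for line in text.splitlines():
        match = LINE.match(line)
        # Command echo lines such as "+ kubectl get pods" do not match and are skipped.
        if match and int(match.group("restarts")) > 0:
            result.append(
                (match.group("name"), int(match.group("restarts")), match.group("last"))
            )
    return result


for name, count, last in restarted_pods(SAMPLE):
    print(f"{name}: {count} restart(s), last {last}")
```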

Labels

- kind/bug: Issues or changes related a bug
- resolution/won't fix: Indicates an issue that can not or will not be fixed
- triage/accepted: Indicates an issue or PR is ready to be actively worked on.
