[Bug]: search failed with error segment lacks[segment=451679606836035900]: channel not available after standalone pod kill chaos test #35361

Closed
@zhuwenxing

Description

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:2.4-20240807-d14d00b0-amd64
- Deployment mode(standalone or cluster):standalone
- MQ type(rocksmq, pulsar or kafka): kafka
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:37 - DEBUG - ci_test]: (api_request)  : [Connections.connect] args: ['default', '', '', 'default', ''], kwargs: {'host': '10.255.181.57', 'port': 19530} (api_request.py:62)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:37 - DEBUG - ci_test]: (api_response) : None  (api_request.py:37)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:37 - DEBUG - ci_test]: (api_request)  : [Connections.has_connection] args: ['default'], kwargs: {} (api_request.py:62)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:37 - DEBUG - ci_test]: (api_response) : True  (api_request.py:37)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:37 - DEBUG - ci_test]: (api_request)  : [Collection] args: ['Checker__P5GMBjpl', {'auto_id': False, 'description': '', 'fields': [{'name': 'int64', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float', 'description': '', 'type': <DataType.FLOAT: 10>}, {'name': 'varchar', 'description': '', 'type': <DataType......, kwargs: {'consistency_level': 'Strong'} (api_request.py:62)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:37 - DEBUG - ci_test]: (api_response) : <Collection>:

[2024-08-07T09:13:11.193Z] -------------

[2024-08-07T09:13:11.193Z] <name>: Checker__P5GMBjpl

[2024-08-07T09:13:11.193Z] <description>: 

[2024-08-07T09:13:11.193Z] <schema>: {'auto_id': False, 'description': '', 'fields': [{'name': 'int64', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float', 'description': '', 'type': <DataType.FLOAT: 10>}......  (api_request.py:37)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:37 - DEBUG - ci_test]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': 180} (api_request.py:62)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:38 - DEBUG - ci_test]: (api_response) : None  (api_request.py:37)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:38 - DEBUG - ci_test]: (api_request)  : [Collection.compact] args: [180], kwargs: {} (api_request.py:62)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:38 - DEBUG - ci_test]: (api_response) : None  (api_request.py:37)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:39 - DEBUG - ci_test]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': 180} (api_request.py:62)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:49 - DEBUG - ci_test]: (api_response) : None  (api_request.py:37)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:49 - INFO - ci_test]: assert create collection: 0.34676313400268555, init_entities: 136995 (test_all_collections_after_chaos.py:49)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:50 - DEBUG - ci_test]: (api_request)  : [Collection.insert] args: [[[-3000, -2999, -2998, -2997, -2996, -2995, -2994, -2993, -2992, -2991, -2990, -2989, -2988, -2987, -2986, -2985, -2984, -2983, -2982, -2981, -2980, -2979, -2978, -2977, -2976, -2975, -2974, -2973, -2972, -2971, -2970, -2969, -2968, -2967, -2966, -2965, -2964, -2963, -2962, -2961, -2960, -2959, -29......, kwargs: {'timeout': 180} (api_request.py:62)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:51 - DEBUG - ci_test]: (api_response) : (insert count: 2000, delete count: 0, upsert count: 0, timestamp: 451679840116670466, success count: 2000, err count: 0  (api_request.py:37)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:51 - INFO - ci_test]: assert insert: 1.272249460220337 (test_all_collections_after_chaos.py:57)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:51 - DEBUG - ci_test]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': 180} (api_request.py:62)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:59 - DEBUG - ci_test]: (api_response) : None  (api_request.py:37)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:10:59 - DEBUG - ci_test]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': 180} (api_request.py:62)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:11:09 - DEBUG - ci_test]: (api_response) : None  (api_request.py:37)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:11:09 - INFO - ci_test]: assert flush: 7.671832799911499, entities: 138995 (test_all_collections_after_chaos.py:67)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:11:09 - INFO - ci_test]: index info: [{'collection': 'Checker__P5GMBjpl', 'field': 'float_vector', 'index_name': 'float_vector', 'index_param': {'metric_type': 'L2', 'params': {'M': 48, 'efConstruction': 500}, 'index_type': 'HNSW'}}, {'collection': 'Checker__P5GMBjpl', 'field': 'image_emb', 'index_name': 'image_emb', 'index_param': {'index_type': 'HNSW', 'metric_type': 'L2', 'params': {'M': 48, 'efConstruction': 500}}}, {'collection': 'Checker__P5GMBjpl', 'field': 'text_emb', 'index_name': 'text_emb', 'index_param': {'index_type': 'HNSW', 'metric_type': 'L2', 'params': {'M': 48, 'efConstruction': 500}}}, {'collection': 'Checker__P5GMBjpl', 'field': 'voice_emb', 'index_name': 'voice_emb', 'index_param': {'index_type': 'HNSW', 'metric_type': 'L2', 'params': {'M': 48, 'efConstruction': 500}}}, {'collection': 'Checker__P5GMBjpl', 'field': 'int64', 'index_name': 'int64', 'index_param': {'index_type': 'INVERTED'}}, {'collection': 'Checker__P5GMBjpl', 'field': 'float', 'index_name': 'float', 'index_param': {'index_type': 'INVERTED'}}, {'collection': 'Checker__P5GMBjpl', 'field': 'varchar', 'index_name': 'varchar', 'index_param': {'index_type': 'INVERTED'}}] (test_all_collections_after_chaos.py:71)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:11:09 - INFO - ci_test]: index info: [{'collection': 'Checker__P5GMBjpl', 'field': 'voice_emb', 'index_name': 'voice_emb', 'index_param': {'index_type': 'HNSW', 'metric_type': 'L2', 'params': {'M': 48, 'efConstruction': 500}}}, {'collection': 'Checker__P5GMBjpl', 'field': 'int64', 'index_name': 'int64', 'index_param': {'index_type': 'INVERTED'}}, {'collection': 'Checker__P5GMBjpl', 'field': 'float', 'index_name': 'float', 'index_param': {'index_type': 'INVERTED'}}, {'collection': 'Checker__P5GMBjpl', 'field': 'varchar', 'index_name': 'varchar', 'index_param': {'index_type': 'INVERTED'}}, {'collection': 'Checker__P5GMBjpl', 'field': 'float_vector', 'index_name': 'float_vector', 'index_param': {'metric_type': 'L2', 'params': {'M': 48, 'efConstruction': 500}, 'index_type': 'HNSW'}}, {'collection': 'Checker__P5GMBjpl', 'field': 'image_emb', 'index_name': 'image_emb', 'index_param': {'index_type': 'HNSW', 'metric_type': 'L2', 'params': {'M': 48, 'efConstruction': 500}}}, {'collection': 'Checker__P5GMBjpl', 'field': 'text_emb', 'index_name': 'text_emb', 'index_param': {'index_type': 'HNSW', 'metric_type': 'L2', 'params': {'M': 48, 'efConstruction': 500}}}] (test_all_collections_after_chaos.py:86)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:11:09 - DEBUG - ci_test]: (api_request)  : [Collection.load] args: [None, 1, 180], kwargs: {} (api_request.py:62)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:11:09 - DEBUG - ci_test]: (api_response) : None  (api_request.py:37)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:11:09 - DEBUG - ci_test]: (api_request)  : [Collection.search] args: [[[0.10321139337677954, 0.12237845132389466, 0.0008106671732834907, 0.07056239880516353, 0.08767849430454548, 0.0672583600964388, 0.023293687165044725, 0.0807304363631584, 0.008693150625500795, 0.03487908435206497, 0.11515404612929885, 0.14392505232247643, 0.06795949935007947, 0.024356334911214687, ......, kwargs: {} (api_request.py:62)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:11:09 - ERROR - pymilvus.decorators]: RPC error: [search], <MilvusException: (code=503, message=failed to search: segment lacks[segment=451679606836035900]: channel not available[channel=by-dev-rootcoord-dml_7_451679330252759529v0])>, <Time:{'RPC start': '2024-08-07 09:11:09.288978', 'RPC error': '2024-08-07 09:11:09.291185'}> (decorators.py:146)

[2024-08-07T09:13:11.193Z] [2024-08-07 09:11:09 - ERROR - ci_test]: Traceback (most recent call last):

[2024-08-07T09:13:11.193Z]   File "/home/jenkins/agent/workspace/tests/python_client/utils/api_request.py", line 32, in inner_wrapper

[2024-08-07T09:13:11.193Z]     res = func(*args, **_kwargs)

[2024-08-07T09:13:11.193Z]   File "/home/jenkins/agent/workspace/tests/python_client/utils/api_request.py", line 63, in api_request

[2024-08-07T09:13:11.193Z]     return func(*arg, **kwargs)

[2024-08-07T09:13:11.193Z]   File "/usr/local/lib/python3.8/dist-packages/pymilvus/orm/collection.py", line 801, in search

[2024-08-07T09:13:11.193Z]     resp = conn.search(

[2024-08-07T09:13:11.193Z]   File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 147, in handler

[2024-08-07T09:13:11.193Z]     raise e from e

[2024-08-07T09:13:11.193Z]   File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 143, in handler

[2024-08-07T09:13:11.193Z]     return func(*args, **kwargs)

[2024-08-07T09:13:11.193Z]   File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 182, in handler

[2024-08-07T09:13:11.193Z]     return func(self, *args, **kwargs)

[2024-08-07T09:13:11.193Z]   File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 122, in handler

[2024-08-07T09:13:11.193Z]     raise e from e

[2024-08-07T09:13:11.193Z]   File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 87, in handler

[2024-08-07T09:13:11.193Z]     return func(*args, **kwargs)

[2024-08-07T09:13:11.193Z]   File "/usr/local/lib/python3.8/dist-packages/pymilvus/client/grpc_handler.py", line 801, in search

[2024-08-07T09:13:11.193Z]     return self._execute_search(request, timeout, round_decimal=round_decimal, **kwargs)

[2024-08-07T09:13:11.193Z]   File "/usr/local/lib/python3.8/dist-packages/pymilvus/client/grpc_handler.py", line 742, in _execute_search

[2024-08-07T09:13:11.193Z]     raise e from e

[2024-08-07T09:13:11.193Z]   File "/usr/local/lib/python3.8/dist-packages/pymilvus/client/grpc_handler.py", line 735, in _execute_search

[2024-08-07T09:13:11.193Z]     check_status(response.status)

[2024-08-07T09:13:11.194Z]   File "/usr/local/lib/python3.8/dist-packages/pymilvus/client/utils.py", line 63, in check_status

[2024-08-07T09:13:11.194Z]     raise MilvusException(status.code, status.reason, status.error_code)

[2024-08-07T09:13:11.194Z] pymilvus.exceptions.MilvusException: <MilvusException: (code=503, message=failed to search: segment lacks[segment=451679606836035900]: channel not available[channel=by-dev-rootcoord-dml_7_451679330252759529v0])>

[2024-08-07T09:13:11.194Z]  (api_request.py:45)

[2024-08-07T09:13:11.194Z] [2024-08-07 09:11:09 - ERROR - ci_test]: (api_response) : <MilvusException: (code=503, message=failed to search: segment lacks[segment=451679606836035900]: channel not available[channel=by-dev-rootcoord-dml_7_451679330252759529v0])> (api_request.py:46)
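
For triage, the segment ID and channel name can be pulled out of the error message and matched against the server logs. This is a minimal sketch, assuming the message keeps the format shown in the log above. `parse_search_error` is a hypothetical helper, not part of pymilvus.

```python
import re

# Matches the message format seen in the log above. The format is an
# assumption and may differ across Milvus versions.
ERROR_RE = re.compile(
    r"code=(?P<code>\d+), message=failed to search: "
    r"segment lacks\[segment=(?P<segment>\d+)\]: "
    r"channel not available\[channel=(?P<channel>[^\]]+)\]"
)


def parse_search_error(text):
    """Return the error code, segment ID, and channel name, or None if no match."""
    m = ERROR_RE.search(text)
    if m is None:
        return None
    return {
        "code": int(m.group("code")),
        "segment": int(m.group("segment")),
        "channel": m.group("channel"),
    }


# Error text copied from the log above.
line = (
    "<MilvusException: (code=503, message=failed to search: "
    "segment lacks[segment=451679606836035900]: channel not available"
    "[channel=by-dev-rootcoord-dml_7_451679330252759529v0])>"
)
info = parse_search_error(line)
print(info)
```

The extracted segment ID and channel name can then be used as search keys in the server logs from the artifacts archive.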

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

Failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/chaos-test-kafka-for-release-cron/detail/chaos-test-kafka-for-release-cron/15665/pipeline
Log: artifacts-standalone-pod-kill-15665-server-logs.tar.gz

Cluster: 4am
Namespace: chaos-testing
Pods:

[2024-08-07T09:09:15.217Z] + grep standalone-pod-kill-15665

[2024-08-07T09:09:15.472Z] standalone-pod-kill-15665-etcd-0                                  1/1     Running       0               32m     10.104.34.193   4am-node37   <none>           <none>

[2024-08-07T09:09:15.472Z] standalone-pod-kill-15665-etcd-1                                  1/1     Running       0               32m     10.104.24.148   4am-node29   <none>           <none>

[2024-08-07T09:09:15.472Z] standalone-pod-kill-15665-etcd-2                                  1/1     Running       0               32m     10.104.26.188   4am-node32   <none>           <none>

[2024-08-07T09:09:15.472Z] standalone-pod-kill-15665-kafka-0                                 2/2     Running       2 (31m ago)     32m     10.104.24.147   4am-node29   <none>           <none>

[2024-08-07T09:09:15.472Z] standalone-pod-kill-15665-kafka-1                                 2/2     Running       1 (31m ago)     32m     10.104.23.72    4am-node27   <none>           <none>

[2024-08-07T09:09:15.472Z] standalone-pod-kill-15665-kafka-2                                 2/2     Running       1 (31m ago)     32m     10.104.26.190   4am-node32   <none>           <none>

[2024-08-07T09:09:15.472Z] standalone-pod-kill-15665-kafka-exporter-5788cd868d-wqbks         1/1     Running       4 (31m ago)     32m     10.104.33.65    4am-node36   <none>           <none>

[2024-08-07T09:09:15.472Z] standalone-pod-kill-15665-milvus-standalone-655bdcc4d7-gdsxg      1/1     Running       0               9m14s   10.104.33.78    4am-node36   <none>           <none>

[2024-08-07T09:09:15.473Z] standalone-pod-kill-15665-minio-747dd9479d-d7r6l                  1/1     Running       0               32m     10.104.33.69    4am-node36   <none>           <none>

[2024-08-07T09:09:15.473Z] standalone-pod-kill-15665-zookeeper-0                             1/1     Running       0               32m     10.104.32.92    4am-node39   <none>           <none>

[2024-08-07T09:09:15.473Z] standalone-pod-kill-15665-zookeeper-1                             1/1     Running       0               32m     10.104.23.73    4am-node27   <none>           <none>

[2024-08-07T09:09:15.473Z] standalone-pod-kill-15665-zookeeper-2                             1/1     Running       0               32m     10.104.26.191   4am-node32   <none>           <none>
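
In the listing above, the standalone pod has an AGE of 9m14s while every other pod is at 32m, which is consistent with it being the pod that was recreated after the kill. The sketch below picks the most recently created pod out of such a listing. It assumes the column layout seen here (which looks like `kubectl get pods -o wide` output with a Jenkins timestamp prefix). `age_seconds` and `youngest_pod` are hypothetical helper names.

```python
import re

ROW_RE = re.compile(
    r"^(?:\[[^\]]+\]\s+)?"                    # optional Jenkins timestamp prefix
    r"(?P<name>\S+)\s+"
    r"(?P<ready>\d+/\d+)\s+"
    r"(?P<status>\S+)\s+"
    r"(?P<restarts>\d+)(?:\s+\([^)]*\))?\s+"  # restarts, e.g. "2 (31m ago)"
    r"(?P<age>\S+)\s"
)
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def age_seconds(age):
    """Convert an AGE value such as '32m' or '9m14s' to seconds."""
    return sum(int(n) * UNIT_SECONDS[u] for n, u in re.findall(r"(\d+)([dhms])", age))


def youngest_pod(lines):
    """Return the name of the pod with the smallest AGE, or None if no row parses."""
    rows = []
    for line in lines:
        m = ROW_RE.match(line)
        if m:
            rows.append((age_seconds(m.group("age")), m.group("name")))
    return min(rows)[1] if rows else None


# A few rows copied from the listing above (trailing columns shortened).
sample = [
    "[2024-08-07T09:09:15.217Z] + grep standalone-pod-kill-15665",
    "[2024-08-07T09:09:15.472Z] standalone-pod-kill-15665-etcd-0   1/1   Running   0   32m   10.104.34.193   4am-node37",
    "[2024-08-07T09:09:15.472Z] standalone-pod-kill-15665-kafka-0   2/2   Running   2 (31m ago)   32m   10.104.24.147   4am-node29",
    "[2024-08-07T09:09:15.472Z] standalone-pod-kill-15665-milvus-standalone-655bdcc4d7-gdsxg   1/1   Running   0   9m14s   10.104.33.78   4am-node36",
]
print(youngest_pod(sample))
```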

Anything else?

No response

Metadata

Labels

kind/bug, test/chaos, triage/accepted
