[Bug]: [benchmark][standalone] Each request RT gradually increases in concurrent DDL scene #32277

Open
1 task done
wangting0128 opened this issue Apr 15, 2024 · 11 comments
Labels: 2.4-features, kind/bug, priority/critical-urgent, test/benchmark, triage/accepted

Comments

@wangting0128
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.4-20240412-9613d368-amd64
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka): rocksmq   
- SDK version(e.g. pymilvus v2.0.0rc2): 2.4.0rc66
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

argo task: fouramf-multi-vector-d5bqx
test case name: test_concurrent_locust_50m_multi_ivf_sq8_ddl_dql_standalone

server:

NAME                                                              READY   STATUS                            RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-multi-vr-d5bqx-67-9394-etcd-0                             1/1     Running                           0                2d21h   10.104.15.103   4am-node20   <none>           <none>
fouramf-multi-vr-d5bqx-67-9394-milvus-standalone-58d69d94dv8zxg   1/1     Running                           0                2d21h   10.104.28.222   4am-node33   <none>           <none>
fouramf-multi-vr-d5bqx-67-9394-minio-54f8dffc6c-gqjr8             1/1     Running                           0                2d21h   10.104.19.120   4am-node28   <none>           <none>
[Screenshots: 2024-04-15 19 38 35, 2024-04-15 19 38 56]

CreateCollection
[Screenshot: 2024-04-15 19 34 31]

DropCollection
[Screenshot: 2024-04-15 19 34 49]

DescribeCollection
[Screenshot: 2024-04-15 19 35 52]

DescribeIndex
[Screenshot: 2024-04-15 19 35 21]

GetCollectionStatistics
[Screenshot: 2024-04-15 19 36 22]

Flush
[Screenshot: 2024-04-15 19 36 55]

LoadCollection
[Screenshot: 2024-04-15 19 37 28]

GetLoadState
[Screenshot: 2024-04-15 19 37 49]

client pod name: fouramf-multi-vector-d5bqx-3501960532
client monitor:
[image]

Expected Behavior

No response

Steps To Reproduce

concurrent test and calculation of RT and QPS

        :test steps:
            1. create collection with fields:
                'float_vector': 128dim,
                'float_vector_1': 200dim,
                scalar field: id(pk)
            2. build indexes:
                IVF_SQ8(nlist=2048): 'float_vector'
                IVF_SQ8(nlist=1024): 'float_vector_1',
                DEFAULT index type(STL_SORT): 'id'
            3. insert 50 million data
            4. flush collection
            5. build indexes again using the same params
            6. load collection
                replica: 1
            7. concurrent request:
                - search
                - query
                - load
                - hybrid_search
                - scene_test
                    (collection: create->insert->flush->index->drop)
                - scene_hybrid_search_test: 4 vector fields, 3 scalar fields
                    (collection: create->insert->flush->index->load->hybrid_search->drop)
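
The scene_test cycle from step 7 can be sketched as a small timing harness. This is only a minimal sketch under stated assumptions, not the fouram benchmark code: `run_timed` and `scene_test_steps` are hypothetical helper names, the field name and index parameters are copied from the steps above, and the pymilvus part assumes a reachable Milvus instance with a default connection already opened.

```python
import random
import time


def run_timed(steps):
    """Run (name, callable) steps in order and return {name: elapsed milliseconds}."""
    timings = {}
    for name, step in steps:
        start = time.perf_counter()
        step()
        timings[name] = (time.perf_counter() - start) * 1000.0
    return timings


def scene_test_steps(name, dim=128, nb=3000):
    """Build the create->insert->flush->index->drop cycle (hypothetical helper).

    Assumes connections.connect(...) has already been called.
    """
    # Imported lazily so run_timed stays usable without pymilvus installed.
    from pymilvus import Collection, CollectionSchema, DataType, FieldSchema

    schema = CollectionSchema([
        FieldSchema("id", DataType.INT64, is_primary=True),
        FieldSchema("float_vector", DataType.FLOAT_VECTOR, dim=dim),
    ])
    index = {"index_type": "IVF_SQ8", "metric_type": "L2", "params": {"nlist": 2048}}
    # Column-based data: one list per field, in schema order.
    data = [
        list(range(nb)),
        [[random.random() for _ in range(dim)] for _ in range(nb)],
    ]
    state = {}

    def create():
        state["collection"] = Collection(name, schema)

    return [
        ("create", create),
        ("insert", lambda: state["collection"].insert(data)),
        ("flush", lambda: state["collection"].flush()),
        ("index", lambda: state["collection"].create_index("float_vector", index)),
        ("drop", lambda: state["collection"].drop()),
    ]
```

With a server available, calling `connections.connect(host="localhost", port="19530")` and then `run_timed(scene_test_steps("scene_tmp"))` would return per-step latencies in milliseconds. The host and collection name are placeholders. Logging these per-step timings over many iterations is one way to see which DDL call drives the rising RT.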

Milvus Log

No response

Anything else?

test result:

[2024-04-15 06:48:41,216 -  INFO - fouram]: Print locust final stats. (locust_runner.py:56)
[2024-04-15 06:48:41,217 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-04-15 06:48:41,217 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-04-15 06:48:41,217 -  INFO - fouram]: grpc     hybrid_search                                                                 312313     0(0.00%) |     32      20    6121     28 |    1.45        0.00 (stats.py:789)
[2024-04-15 06:48:41,217 -  INFO - fouram]: grpc     load                                                                           15587     0(0.00%) |     57       2    4804      4 |    0.07        0.00 (stats.py:789)
[2024-04-15 06:48:41,217 -  INFO - fouram]: grpc     query                                                                         155462     0(0.00%) |      8       3    6117      6 |    0.72        0.00 (stats.py:789)
[2024-04-15 06:48:41,217 -  INFO - fouram]: grpc     scene_hybrid_search_test                                                       15593     0(0.00%) | 121010   20755 1357139 119000 |    0.07        0.00 (stats.py:789)
[2024-04-15 06:48:41,217 -  INFO - fouram]: grpc     scene_test                                                                     30908     0(0.00%) |  78120   63249 1298588  77000 |    0.14        0.00 (stats.py:789)
[2024-04-15 06:48:41,217 -  INFO - fouram]: grpc     search                                                                        311586     0(0.00%) |     13       8    9036     10 |    1.44        0.00 (stats.py:789)
[2024-04-15 06:48:41,217 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-04-15 06:48:41,217 -  INFO - fouram]:          Aggregated                                                                    841449     0(0.00%) |   5131       2 1357139     13 |    3.90        0.00 (stats.py:789)
[2024-04-15 06:48:41,218 -  INFO - fouram]:  (stats.py:790)
[2024-04-15 06:48:41,221 -  INFO - fouram]: [PerfTemplate] Report data: 
{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'standalone',
            'config_name': 'standalone_32c128m',
            'config': {'standalone': {'resources': {'limits': {'cpu': '32.0',
                                                               'memory': '128Gi'},
                                                    'requests': {'cpu': '17.0',
                                                                 'memory': '65Gi'}}},
                       'cluster': {'enabled': False},
                       'etcd': {'replicaCount': 1,
                                'metrics': {'enabled': True,
                                            'podMonitor': {'enabled': True}}},
                       'minio': {'mode': 'standalone',
                                 'metrics': {'podMonitor': {'enabled': True}}},
                       'pulsar': {'enabled': False},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'image': {'all': {'repository': 'harbor.milvus.io/milvus/milvus',
                                         'tag': '2.4-20240412-9613d368-amd64'}}},
            'host': 'fouramf-multi-vr-d5bqx-67-9394-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_concurrent_locust_50m_multi_ivf_sq8_ddl_dql_standalone',
            'test_case_params': {'dataset_params': {'metric_type': 'L2',
                                                    'dim': 128,
                                                    'scalars_index': {'id': {}},
                                                    'vectors_index': {'float_vector_1': {'index_type': 'IVF_SQ8',
                                                                                         'index_param': {'nlist': 1024},
                                                                                         'metric_type': 'L2'}},
                                                    'scalars_params': {'float_vector_1': {'params': {'dim': 200},
                                                                                          'other_params': {'dataset': 'text2img',
                                                                                                           'dim': 200}}},
                                                    'dataset_name': 'sift',
                                                    'dataset_size': 50000000,
                                                    'ni_per': 25000},
                                 'collection_params': {'other_fields': ['float_vector_1'],
                                                       'shards_num': 2},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False,
                                                          'reset_db': False},
                                 'index_params': {'index_type': 'IVF_SQ8',
                                                  'index_param': {'nlist': 2048}},
                                 'concurrent_params': {'concurrent_number': 20,
                                                       'during_time': '60h',
                                                       'interval': 20,
                                                       'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'search',
                                                       'weight': 20,
                                                       'params': {'nq': 10,
                                                                  'top_k': 10,
                                                                  'search_param': {'nprobe': 16},
                                                                  'expr': None,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'output_fields': None,
                                                                  'ignore_growing': False,
                                                                  'group_by_field': None,
                                                                  'timeout': 60,
                                                                  'random_data': True}},
                                                      {'type': 'query',
                                                       'weight': 10,
                                                       'params': {'ids': None,
                                                                  'expr': ' '
                                                                          '110 '
                                                                          '> '
                                                                          'id '
                                                                          '> '
                                                                          '100',
                                                                  'output_fields': None,
                                                                  'offset': None,
                                                                  'limit': None,
                                                                  'ignore_growing': False,
                                                                  'partition_names': None,
                                                                  'timeout': 60,
                                                                  'random_data': False,
                                                                  'random_count': 0,
                                                                  'random_range': [0,
                                                                                   1],
                                                                  'field_name': 'id',
                                                                  'field_type': 'int64'}},
                                                      {'type': 'load',
                                                       'weight': 1,
                                                       'params': {'replica_number': 1,
                                                                  'timeout': 30}},
                                                      {'type': 'scene_test',
                                                       'weight': 2,
                                                       'params': {'dim': 128,
                                                                  'data_size': 3000,
                                                                  'nb': 3000,
                                                                  'index_type': 'IVF_SQ8',
                                                                  'index_param': {'nlist': 2048},
                                                                  'metric_type': 'L2',
                                                                  'other_fields': [],
                                                                  'scalars_params': {},
                                                                  'scalars_index': {},
                                                                  'vectors_index': {}}},
                                                      {'type': 'hybrid_search',
                                                       'weight': 20,
                                                       'params': {'nq': 1,
                                                                  'top_k': 10,
                                                                  'reqs': [{'search_param': {'nprobe': 128},
                                                                            'anns_field': 'float_vector',
                                                                            'top_k': 100},
                                                                           {'search_param': {'nprobe': 64},
                                                                            'anns_field': 'float_vector_1',
                                                                            'top_k': 10}],
                                                                  'rerank': {'WeightedRanker': [0.85,
                                                                                                0.95]},
                                                                  'output_fields': ['*'],
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': 600,
                                                                  'random_data': True}},
                                                      {'type': 'scene_hybrid_search_test',
                                                       'weight': 1,
                                                       'params': {'nq': 1,
                                                                  'top_k': 1,
                                                                  'reqs': [{'search_param': {'nprobe': 128},
                                                                            'anns_field': 'float_vector',
                                                                            'top_k': 100},
                                                                           {'search_param': {'nprobe': 32},
                                                                            'anns_field': 'float_vector_1',
                                                                            'top_k': 10},
                                                                           {'search_param': {'ef': 32},
                                                                            'anns_field': 'float_vector_2',
                                                                            'top_k': 5},
                                                                           {'search_param': {'search_list': 20},
                                                                            'anns_field': 'float_vector_3',
                                                                            'top_k': 10}],
                                                                  'rerank': {'RRFRanker': []},
                                                                  'output_fields': None,
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': 600,
                                                                  'random_data': True,
                                                                  'dataset': 'local',
                                                                  'dim': 128,
                                                                  'shards_num': 2,
                                                                  'data_size': 3000,
                                                                  'nb': 3000,
                                                                  'index_type': 'IVF_SQ8',
                                                                  'index_param': {'nlist': 2048},
                                                                  'metric_type': 'L2',
                                                                  'other_fields': ['float_vector_1',
                                                                                   'float_vector_2',
                                                                                   'float_vector_3',
                                                                                   'int64_1',
                                                                                   'bool_1',
                                                                                   'varchar_1'],
                                                                  'replica_number': 1,
                                                                  'scalars_params': {'float_vector_1': {'params': {'dim': 128},
                                                                                                        'other_params': {'dataset': 'sift',
                                                                                                                         'dim': 128}},
                                                                                     'float_vector_2': {'params': {'dim': 128},
                                                                                                        'other_params': {'dataset': 'sift',
                                                                                                                         'dim': 128}},
                                                                                     'float_vector_3': {'params': {'dim': 128},
                                                                                                        'other_params': {'dataset': 'sift',
                                                                                                                         'dim': 128}}},
                                                                  'scalars_index': {'int64_1': {},
                                                                                    'bool_1': {'index_type': 'INVERTED'},
                                                                                    'varchar_1': {'index_type': 'INVERTED'}},
                                                                  'vectors_index': {'float_vector_1': {'index_type': 'IVF_FLAT',
                                                                                                       'index_param': {'nlist': 1024},
                                                                                                       'metric_type': 'L2'},
                                                                                    'float_vector_2': {'index_type': 'HNSW',
                                                                                                       'index_param': {'M': 8,
                                                                                                                       'efConstruction': 200},
                                                                                                       'metric_type': 'L2'},
                                                                                    'float_vector_3': {'index_type': 'DISKANN',
                                                                                                       'index_param': {},
                                                                                                       'metric_type': 'IP'}},
                                                                  'prepare_before_insert': False,
                                                                  'hybrid_search_counts': 10,
                                                                  'new_connect': False,
                                                                  'new_user': False}}]},
            'run_id': 2024041248268691,
            'datetime': '2024-04-12 09:40:26.761458',
            'client_version': '2.2'},
 'result': {'test_result': {'index': {'RT': 11161.4768,
                                      'float_vector_1': {'RT': 9247.9628},
                                      'id': {'RT': 7131.9659}},
                            'insert': {'total_time': 3530.8539,
                                       'VPS': 14160.8805,
                                       'batch_time': 1.7654,
                                       'batch': 25000},
                            'flush': {'RT': 2.5276},
                            'load': {'RT': 157.8468},
                            'Locust': {'Aggregated': {'Requests': 841449,
                                                      'Fails': 0,
                                                      'RPS': 3.9,
                                                      'fail_s': 0.0,
                                                      'RT_max': 1357139.54,
                                                      'RT_avg': 5131.41,
                                                      'TP50': 13,
                                                      'TP99': 114000.0},
                                       'hybrid_search': {'Requests': 312313,
                                                         'Fails': 0,
                                                         'RPS': 1.45,
                                                         'fail_s': 0.0,
                                                         'RT_max': 6121.38,
                                                         'RT_avg': 32.3,
                                                         'TP50': 28,
                                                         'TP99': 76},
                                       'load': {'Requests': 15587,
                                                'Fails': 0,
                                                'RPS': 0.07,
                                                'fail_s': 0.0,
                                                'RT_max': 4804.15,
                                                'RT_avg': 57.38,
                                                'TP50': 4,
                                                'TP99': 1000.0},
                                       'query': {'Requests': 155462,
                                                 'Fails': 0,
                                                 'RPS': 0.72,
                                                 'fail_s': 0.0,
                                                 'RT_max': 6117.47,
                                                 'RT_avg': 8.06,
                                                 'TP50': 6,
                                                 'TP99': 24},
                                       'scene_hybrid_search_test': {'Requests': 15593,
                                                                    'Fails': 0,
                                                                    'RPS': 0.07,
                                                                    'fail_s': 0.0,
                                                                    'RT_max': 1357139.54,
                                                                    'RT_avg': 121010.15,
                                                                    'TP50': 119000.0,
                                                                    'TP99': 208000.0},
                                       'scene_test': {'Requests': 30908,
                                                      'Fails': 0,
                                                      'RPS': 0.14,
                                                      'fail_s': 0.0,
                                                      'RT_max': 1298588.57,
                                                      'RT_avg': 78120.83,
                                                      'TP50': 77000.0,
                                                      'TP99': 96000.0},
                                       'search': {'Requests': 311586,
                                                  'Fails': 0,
                                                  'RPS': 1.44,
                                                  'fail_s': 0.0,
                                                  'RT_max': 9036.07,
                                                  'RT_avg': 13.2,
                                                  'TP50': 10,
                                                  'TP99': 36}}}}} 
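
The aggregated average RT in the report is dominated by the two scene tasks. The sketch below recomputes it as a request-weighted mean of the per-task averages. The numbers are copied from the 'Locust' section of the report; the function names are illustrative only.

```python
# (requests, average RT in ms) per task, copied from the 'Locust' report section.
locust = {
    "hybrid_search": (312313, 32.3),
    "load": (15587, 57.38),
    "query": (155462, 8.06),
    "scene_hybrid_search_test": (15593, 121010.15),
    "scene_test": (30908, 78120.83),
    "search": (311586, 13.2),
}


def weighted_avg_rt(stats):
    """Request-weighted mean of the per-task average RTs."""
    total_requests = sum(reqs for reqs, _ in stats.values())
    total_time = sum(reqs * avg for reqs, avg in stats.values())
    return total_time / total_requests


def time_share(stats, names):
    """Fraction of the summed request time spent in the named tasks."""
    total_time = sum(reqs * avg for reqs, avg in stats.values())
    picked = sum(stats[n][0] * stats[n][1] for n in names)
    return picked / total_time


aggregated = weighted_avg_rt(locust)
scene_share = time_share(locust, ["scene_test", "scene_hybrid_search_test"])
print(f"aggregated RT_avg ~ {aggregated:.2f} ms, scene tasks share = {scene_share:.2%}")
```

The recomputed mean agrees with the reported aggregated RT_avg of 5131.41 ms to within rounding, and the two scene tasks account for more than 99% of the summed request time even though they make up only about 5.5% of the requests.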
@wangting0128 added the kind/bug, needs-triage, test/benchmark, and 2.4-features labels on Apr 15, 2024
@wangting0128 added this to the 2.4.0 milestone on Apr 15, 2024
@yanliang567
Contributor

It looks obvious that scene_hybrid_search_test is getting slower and slower.
@wangting0128 one more question: can we tell whether the screenshots above (CreateCollection, DropCollection, and so on) come from scene_hybrid_search_test or scene_test?
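
One way to quantify "slower and slower" is a least-squares slope of RT against elapsed time. The sketch below is a minimal illustration: `rt_slope` is a hypothetical helper, and the sample series is synthetic, not taken from this run.

```python
def rt_slope(samples):
    """Least-squares slope of RT (ms) against elapsed time (s).

    samples is a list of (elapsed_seconds, rt_ms) pairs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_rt = sum(rt for _, rt in samples) / n
    cov = sum((t - mean_t) * (rt - mean_rt) for t, rt in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    if var == 0:
        raise ValueError("all samples share the same timestamp")
    return cov / var


# Synthetic example: RT grows by 1000 ms per hour over 60 hours.
synthetic = [(hour * 3600, 60000 + 1000 * hour) for hour in range(60)]
slope_ms_per_hour = rt_slope(synthetic) * 3600
```

A slope that stays clearly positive across the run points to a steady increase, while a slope near zero points to stable latency. Feeding the helper per-interval averages for each task would also help separate scene_test from scene_hybrid_search_test.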

/unassign

@yanliang567 added the triage/accepted label and removed the needs-triage label on Apr 16, 2024
@wangting0128
Contributor Author

The CreateCollection and DropCollection screenshots include requests from both scene_test and scene_hybrid_search_test.

I did an initial check with @czs007. It is caused by too many collection metrics being held in rootCoord.

@xiaofan-luan
Collaborator

This might be due to our snapshot GC issue.

All the meta takes 24 hours to be garbage collected.
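
A rough back-of-the-envelope check of this hypothesis, using the req/s values from the Locust stats in the issue body. It assumes that every scene request creates and drops exactly one collection, that the rates are steady, and that dropped-collection meta is kept for the 24 hours mentioned above. The names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# req/s per scene task, copied from the Locust final stats.
scene_rates = {"scene_test": 0.14, "scene_hybrid_search_test": 0.07}


def pending_meta(rates, retention_seconds):
    """Dropped collections whose meta is still waiting for GC at steady state."""
    return sum(rates.values()) * retention_seconds


backlog = pending_meta(scene_rates, SECONDS_PER_DAY)
print(f"about {round(backlog)} dropped collections awaiting GC")
```

Under these assumptions roughly 18,000 dropped collections would still have meta retained at any point after the first day, which is in the same range as the 10K+ collections discussed later in this thread.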

@yanliang567 modified the milestones: 2.4.0, 2.4.1 on Apr 18, 2024
@wangting0128 added the priority/critical-urgent label on Apr 23, 2024
@yanliang567 modified the milestones: 2.4.1, 2.4.2 on May 7, 2024
@yanliang567
Contributor

@wangting0128 any updates on the latest Milvus 2.4 build? I believe rootcoord might no longer be the bottleneck with 10K+ collections.

@wangting0128
Contributor Author

No update yet. I will test again today using the latest master branch image.

@wangting0128
Contributor Author

Verified with image master-20240520-555df49d-amd64: the rising request RT seems to have been alleviated.
[Screenshots: 2024-05-21 10 53 12, 2024-05-21 10 52 49]

@wangting0128
Contributor Author

The 2.4 branch still seems to have this problem.

argo task: fouramf-pxv6r-release
image: 2.4-20240520-2f260cd3-amd64
test case name: test_concurrent_locust_50m_multi_ivf_sq8_ddl_dql_standalone

server:

NAME                                                              READY   STATUS        RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-pxv6r-release-86-8408-etcd-0                              1/1     Running       0               23h     10.104.24.203   4am-node29   <none>           <none>
fouramf-pxv6r-release-86-8408-milvus-standalone-c49d84dcb-ttpm6   1/1     Running       3 (23h ago)     23h     10.104.30.176   4am-node38   <none>           <none>
fouramf-pxv6r-release-86-8408-minio-79dd5dc784-27hlh              1/1     Running       0               23h     10.104.21.152   4am-node24   <none>           <none>
[Screenshots: 2024-05-22 10 49 45, 2024-05-22 10 50 00, 2024-05-22 10 50 24]

client pod name: fouramf-pxv6r-release-3102168818
client monitor:
[image]

test result:

{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'standalone',
            'config_name': 'standalone_32c128m',
            'config': {'standalone': {'resources': {'limits': {'cpu': '32.0',
                                                               'memory': '128Gi'},
                                                    'requests': {'cpu': '17.0',
                                                                 'memory': '65Gi'}}},
                       'cluster': {'enabled': False},
                       'etcd': {'replicaCount': 1,
                                'metrics': {'enabled': True,
                                            'podMonitor': {'enabled': True}}},
                       'minio': {'mode': 'standalone',
                                 'metrics': {'podMonitor': {'enabled': True}}},
                       'pulsar': {'enabled': False},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'image': {'all': {'repository': 'harbor.milvus.io/milvus/milvus',
                                         'tag': '2.4-20240520-2f260cd3-amd64'}}},
            'host': 'fouramf-pxv6r-release-86-8408-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_concurrent_locust_50m_multi_ivf_sq8_ddl_dql_standalone',
            'test_case_params': {'dataset_params': {'metric_type': 'L2',
                                                    'dim': 128,
                                                    'scalars_index': {'id': {}},
                                                    'vectors_index': {'float_vector_1': {'index_type': 'IVF_SQ8',
                                                                                         'index_param': {'nlist': 1024},
                                                                                         'metric_type': 'L2'}},
                                                    'scalars_params': {'float_vector_1': {'params': {'dim': 200},
                                                                                          'other_params': {'dataset': 'text2img',
                                                                                                           'dim': 200}}},
                                                    'dataset_name': 'sift',
                                                    'dataset_size': 50000000,
                                                    'ni_per': 25000},
                                 'collection_params': {'other_fields': ['float_vector_1'],
                                                       'shards_num': 2},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False,
                                                          'reset_db': False},
                                 'index_params': {'index_type': 'IVF_SQ8',
                                                  'index_param': {'nlist': 2048}},
                                 'concurrent_params': {'concurrent_number': 20,
                                                       'during_time': '12h',
                                                       'interval': 20,
                                                       'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'search',
                                                       'weight': 20,
                                                       'params': {'nq': 10,
                                                                  'top_k': 10,
                                                                  'search_param': {'nprobe': 16},
                                                                  'expr': None,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'output_fields': None,
                                                                  'ignore_growing': False,
                                                                  'group_by_field': None,
                                                                  'timeout': 60,
                                                                  'random_data': True}},
                                                      {'type': 'query',
                                                       'weight': 10,
                                                       'params': {'ids': None,
                                                                  'expr': ' '
                                                                          '110 '
                                                                          '> '
                                                                          'id '
                                                                          '> '
                                                                          '100',
                                                                  'output_fields': None,
                                                                  'offset': None,
                                                                  'limit': None,
                                                                  'ignore_growing': False,
                                                                  'partition_names': None,
                                                                  'timeout': 60,
                                                                  'random_data': False,
                                                                  'random_count': 0,
                                                                  'random_range': [0,
                                                                                   1],
                                                                  'field_name': 'id',
                                                                  'field_type': 'int64'}},
                                                      {'type': 'load',
                                                       'weight': 1,
                                                       'params': {'replica_number': 1,
                                                                  'timeout': 30}},
                                                      {'type': 'scene_test',
                                                       'weight': 2,
                                                       'params': {'dim': 128,
                                                                  'data_size': 3000,
                                                                  'nb': 3000,
                                                                  'index_type': 'IVF_SQ8',
                                                                  'index_param': {'nlist': 2048},
                                                                  'metric_type': 'L2',
                                                                  'other_fields': [],
                                                                  'scalars_params': {},
                                                                  'scalars_index': {},
                                                                  'vectors_index': {}}},
                                                      {'type': 'hybrid_search',
                                                       'weight': 20,
                                                       'params': {'nq': 1,
                                                                  'top_k': 10,
                                                                  'reqs': [{'search_param': {'nprobe': 128},
                                                                            'anns_field': 'float_vector',
                                                                            'top_k': 100},
                                                                           {'search_param': {'nprobe': 64},
                                                                            'anns_field': 'float_vector_1',
                                                                            'top_k': 10}],
                                                                  'rerank': {'WeightedRanker': [0.85,
                                                                                                0.95]},
                                                                  'output_fields': ['*'],
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': 600,
                                                                  'random_data': True}},
                                                      {'type': 'scene_hybrid_search_test',
                                                       'weight': 1,
                                                       'params': {'nq': 1,
                                                                  'top_k': 1,
                                                                  'reqs': [{'search_param': {'nprobe': 128},
                                                                            'anns_field': 'float_vector',
                                                                            'top_k': 100},
                                                                           {'search_param': {'nprobe': 32},
                                                                            'anns_field': 'float_vector_1',
                                                                            'top_k': 10},
                                                                           {'search_param': {'ef': 32},
                                                                            'anns_field': 'float_vector_2',
                                                                            'top_k': 5},
                                                                           {'search_param': {'search_list': 20},
                                                                            'anns_field': 'float_vector_3',
                                                                            'top_k': 10}],
                                                                  'rerank': {'RRFRanker': []},
                                                                  'output_fields': None,
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': 600,
                                                                  'random_data': True,
                                                                  'dataset': 'local',
                                                                  'dim': 128,
                                                                  'shards_num': 2,
                                                                  'data_size': 3000,
                                                                  'nb': 3000,
                                                                  'index_type': 'IVF_SQ8',
                                                                  'index_param': {'nlist': 2048},
                                                                  'metric_type': 'L2',
                                                                  'other_fields': ['float_vector_1',
                                                                                   'float_vector_2',
                                                                                   'float_vector_3',
                                                                                   'int64_1',
                                                                                   'bool_1',
                                                                                   'varchar_1'],
                                                                  'replica_number': 1,
                                                                  'scalars_params': {'float_vector_1': {'params': {'dim': 128},
                                                                                                        'other_params': {'dataset': 'sift',
                                                                                                                         'dim': 128}},
                                                                                     'float_vector_2': {'params': {'dim': 128},
                                                                                                        'other_params': {'dataset': 'sift',
                                                                                                                         'dim': 128}},
                                                                                     'float_vector_3': {'params': {'dim': 128},
                                                                                                        'other_params': {'dataset': 'sift',
                                                                                                                         'dim': 128}}},
                                                                  'scalars_index': {'int64_1': {},
                                                                                    'bool_1': {'index_type': 'INVERTED'},
                                                                                    'varchar_1': {'index_type': 'INVERTED'}},
                                                                  'vectors_index': {'float_vector_1': {'index_type': 'IVF_FLAT',
                                                                                                       'index_param': {'nlist': 1024},
                                                                                                       'metric_type': 'L2'},
                                                                                    'float_vector_2': {'index_type': 'HNSW',
                                                                                                       'index_param': {'M': 8,
                                                                                                                       'efConstruction': 200},
                                                                                                       'metric_type': 'L2'},
                                                                                    'float_vector_3': {'index_type': 'DISKANN',
                                                                                                       'index_param': {},
                                                                                                       'metric_type': 'IP'}},
                                                                  'prepare_before_insert': False,
                                                                  'hybrid_search_counts': 10,
                                                                  'new_connect': False,
                                                                  'new_user': False}}]},
            'run_id': 2024052129701132,
            'datetime': '2024-05-21 03:42:50.076883',
            'client_version': '2.2'},
 'result': {'test_result': {'index': {'RT': 7631.7959,
                                      'float_vector_1': {'RT': 6649.1025},
                                      'id': {'RT': 5278.8701}},
                            'insert': {'total_time': 3443.8363,
                                       'VPS': 14518.6924,
                                       'batch_time': 1.7219,
                                       'batch': 25000},
                            'flush': {'RT': 2.5153},
                            'load': {'RT': 127.4589},
                            'Locust': {'Aggregated': {'Requests': 202200,
                                                      'Fails': 0,
                                                      'RPS': 4.68,
                                                      'fail_s': 0.0,
                                                      'RT_max': 370200.46,
                                                      'RT_avg': 4266.24,
                                                      'TP50': 16,
                                                      'TP99': 84000.0},
                                       'hybrid_search': {'Requests': 74794,
                                                         'Fails': 0,
                                                         'RPS': 1.73,
                                                         'fail_s': 0.0,
                                                         'RT_max': 9580.78,
                                                         'RT_avg': 46.94,
                                                         'TP50': 40,
                                                         'TP99': 130.0},
                                       'load': {'Requests': 3728,
                                                'Fails': 0,
                                                'RPS': 0.09,
                                                'fail_s': 0.0,
                                                'RT_max': 3456.41,
                                                'RT_avg': 9.11,
                                                'TP50': 5,
                                                'TP99': 41},
                                       'query': {'Requests': 37679,
                                                 'Fails': 0,
                                                 'RPS': 0.87,
                                                 'fail_s': 0.0,
                                                 'RT_max': 8792.82,
                                                 'RT_avg': 11.66,
                                                 'TP50': 9,
                                                 'TP99': 31},
                                       'scene_hybrid_search_test': {'Requests': 3647,
                                                                    'Fails': 0,
                                                                    'RPS': 0.08,
                                                                    'fail_s': 0.0,
                                                                    'RT_max': 364687.24,
                                                                    'RT_avg': 85282.19,
                                                                    'TP50': 85000.0,
                                                                    'TP99': 142000.0},
                                       'scene_test': {'Requests': 7420,
                                                      'Fails': 0,
                                                      'RPS': 0.17,
                                                      'fail_s': 0.0,
                                                      'RT_max': 370200.46,
                                                      'RT_avg': 73652.97,
                                                      'TP50': 73000.0,
                                                      'TP99': 85000.0},
                                       'search': {'Requests': 74932,
                                                  'Fails': 0,
                                                  'RPS': 1.73,
                                                  'fail_s': 0.0,
                                                  'RT_max': 7573.87,
                                                  'RT_avg': 14.96,
                                                  'TP50': 11,
                                                  'TP99': 40}}}}}
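
To see which task types drive the aggregated latency, the per-task Locust figures above can be combined into a request-weighted mean. The snippet below is only a sketch: the numbers are copied from the `Locust` section of the result, and treating `scene_test` and `scene_hybrid_search_test` as the DDL-heavy tasks is an assumption based on the test case name, not something the result states.

```python
# Per-task Locust stats copied from the test result above: (Requests, RT_avg).
per_task = {
    "hybrid_search": (74794, 46.94),
    "load": (3728, 9.11),
    "query": (37679, 11.66),
    "scene_hybrid_search_test": (3647, 85282.19),
    "scene_test": (7420, 73652.97),
    "search": (74932, 14.96),
}

# Total request count and total latency (requests x average RT) over all tasks.
total_requests = sum(n for n, _ in per_task.values())
total_latency = sum(n * rt for n, rt in per_task.values())

# Request-weighted mean RT, comparable to Aggregated['RT_avg'].
weighted_avg = total_latency / total_requests

# Assumption: the two scene_* tasks are the ones exercising DDL.
ddl_tasks = ("scene_test", "scene_hybrid_search_test")
ddl_latency = sum(per_task[t][0] * per_task[t][1] for t in ddl_tasks)
ddl_share = ddl_latency / total_latency

print(f"requests={total_requests} weighted RT_avg={weighted_avg:.2f}")
print(f"share of total latency from scene tasks: {ddl_share:.1%}")
```

The weighted mean reproduces the reported `Aggregated` values (202200 requests, RT_avg 4266.24), and the two scene tasks account for over 99% of the summed latency although they are only about 5% of the requests.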

@xiaofan-luan
Collaborator

Some data may be leaking in the coordinator, causing the latency to go up.

@shaoting-huang
please help investigate it.

@yanliang567 yanliang567 modified the milestones: 2.4.2, 2.4.3, 2.4.4 May 24, 2024
@shaoting-huang
Contributor

The earliest version in this issue (2.4-20240412-9613d368-amd64) uses datacoord channel manager v1, which is based on etcd, and this results in the DDL RT increasing over time.
image

By comparison, versions 2.4-20240520-2f260cd3-amd64 and master-20240520-555df49d-amd64 use datacoord channel manager v2, which is based on RPC, so the DDL RT increase is alleviated. I do not see any delay with version 2.4-20240520-2f260cd3-amd64.
[Screenshot: 2024-06-03 10:56:08]
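
One way to put a number on "RT gradually increases" when comparing two builds is a least-squares slope of RT against elapsed time. The sketch below is illustrative only: `rt_slope` is a hypothetical helper, and the sample points are synthetic, not measurements from either version.

```python
def rt_slope(samples):
    """Least-squares slope of RT against elapsed time for (t, rt) pairs."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_rt = sum(rt for _, rt in samples) / n
    # Covariance of (t, rt) and variance of t, both unnormalised.
    cov = sum((t - mean_t) * (rt - mean_rt) for t, rt in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    if var == 0:
        raise ValueError("all samples share the same timestamp")
    return cov / var


# Synthetic (hours elapsed, RT) samples, made up for illustration only.
growing = [(0, 10.0), (1, 20.0), (2, 30.0), (3, 40.0)]
flat = [(0, 10.0), (1, 10.0), (2, 10.0), (3, 10.0)]

print(rt_slope(growing), rt_slope(flat))
```

A slope close to zero over a long run would be consistent with no drift, while a clearly positive slope indicates RT growing with run time.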

@yanliang567 yanliang567 modified the milestones: 2.4.5, 2.4.6 Jun 26, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.6, 2.4.7 Jul 19, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.7, 2.4.8 Aug 12, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.8, 2.4.10 Aug 19, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.10, 2.4.11 Sep 5, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.11, 2.4.12 Sep 18, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.12, 2.4.13 Sep 27, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.13, 2.4.14 Oct 15, 2024
@yanliang567 yanliang567 assigned XuanYang-cn and unassigned czs007 Oct 28, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.14, 2.4.16 Nov 14, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.16, 2.4.17, 2.4.18 Nov 21, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.18, 2.4.19, 2.4.20 Dec 24, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.20, 2.4.21 Jan 6, 2025