[Bug]: datanode memory usage increased to 150GB when there are 50m vectors to be flushed #26177

Closed
yanliang567 opened this issue Aug 7, 2023 · 18 comments

@yanliang567
Contributor

yanliang567 commented Aug 7, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: master-20230805-241117dd
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

I scaled the datanode down to 0 and inserted 50m_768d vectors. Then I scaled the datanode back up to 1, and the datanode memory usage increased to 150GB within 15 minutes.
(screenshot)

Expected Behavior

The baseline on master-20230802-df26b909: datanode memory usage stays around 1.3-2.3GB for the same volume of vectors.

Steps To Reproduce

1. Create a collection with 20k_768d vectors and build an HNSW index
2. Scale the datanode down to 0
3. Insert 50m_768d vectors
4. Scale the datanode back up to 1
5. Wait and check the tt lag, datanode CPU and memory

Milvus Log

pod names on devops:

yanliang-ttlag-milvus-datanode-cbf79cbdc-bx4h6                  1/1     Running       0               34m     10.102.7.245    devops-node11   <none>           <none>
yanliang-ttlag-milvus-indexnode-6699c566d7-9l49n                1/1     Running       2 (2m16s ago)   6h54m   10.102.7.231    devops-node11   <none>           <none>
yanliang-ttlag-milvus-mixcoord-987654d85-pfzg2                  1/1     Running       0               6h54m   10.102.7.238    devops-node11   <none>           <none>
yanliang-ttlag-milvus-proxy-df7b5955f-5twjd                     1/1     Running       0               6h54m   10.102.7.239    devops-node11   <none>           <none>
yanliang-ttlag-milvus-querynode-76cf9c9b55-rcwx9                1/1     Running       0               6h54m   10.102.7.232    devops-node11   <none>           <none>

Anything else?

The suspected PR: #26144

@yanliang567 yanliang567 added kind/bug Issues or changes related to a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 7, 2023
@yanliang567 yanliang567 self-assigned this Aug 7, 2023
@yanliang567
Contributor Author

/assign @congqixia
/unassign

@sre-ci-robot sre-ci-robot assigned congqixia and unassigned yanliang567 Aug 7, 2023
@yanliang567 yanliang567 added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 7, 2023
@yanliang567 yanliang567 added this to the 2.3 milestone Aug 7, 2023
@congqixia
Contributor

From the pprof, there are lots of msg packs buffered in memory
(screenshot)

There are some channels that are too large, which could cause this problem:

  • MsgStream buffers (mq buffer & receive buffer) 1024*2
  • Flowgraph node buffers (input node -> dd node -> insert buffer node) 1024*2

Under high read pressure, all channels will be full, which will lead to a 102448MB memory cost.

And the flush manager will buffer flush tasks as well, which will multiply this memory cost.
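
To make the buffering math concrete, here is a minimal Go sketch with all names and sizes invented for illustration (this is not Milvus's actual code): two chained buffered channels stand in for the MsgStream and flowgraph buffers, and a stalled flush consumer lets every stage fill to capacity.

package main

import (
	"fmt"
	"time"
)

// pack stands in for a msg pack; the payload is only 1 KB here so the sketch
// is safe to run, whereas real insert msg packs are megabytes each.
type pack struct{ payload []byte }

func main() {
	const depth = 1024 * 2 // buffer depth cited in the analysis above

	// Two chained buffered stages, e.g. MsgStream receive buffer -> flowgraph input.
	mqBuf := make(chan pack, depth)
	fgBuf := make(chan pack, depth)

	// Producer: keeps pulling from the MQ with no backpressure from flush.
	go func() {
		for {
			mqBuf <- pack{payload: make([]byte, 1024)}
		}
	}()
	// Forwarder: moves packs downstream.
	go func() {
		for p := range mqBuf {
			fgBuf <- p
		}
	}()
	// Consumer (flush/sync) is stalled, so both buffers fill to capacity.
	time.Sleep(time.Second)
	fmt.Printf("packs pinned in buffers: %d\n", len(mqBuf)+len(fgBuf))
}

With msg packs of several megabytes instead of the 1 KB placeholder, a few thousand buffered slots per stage already adds up to tens of gigabytes, which matches the growth pattern reported above.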

@congqixia
Contributor

@yanliang567 after #26179 is merged, could you please verify with this parameter enlarged:

dataNode:
  dataSync:
    maxParallelSyncTaskNum: 2 # Maximum number of sync tasks executed in parallel in each flush manager
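
As an illustration only (the value below is hypothetical, not a recommended setting), the enlarged override for the verification run could look like:

dataNode:
  dataSync:
    maxParallelSyncTaskNum: 6 # illustrative value for the test; tune to the node's CPU and memory budget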

@xiaofan-luan
Collaborator

can we simplify the mqstream logic to make it easier to understand?

@yanliang567 yanliang567 modified the milestones: 2.3, 2.3.2 Oct 11, 2023
@yanliang567
Contributor Author

@congqixia any plans for fixing this issue in v2.3.2?

@congqixia
Contributor

congqixia commented Oct 16, 2023

@yanliang567 nope, L0 delta and other datanode refinements will be implemented after 2.3.2

@yanliang567
Contributor Author

moving to 2.3.3

@yanliang567 yanliang567 modified the milestones: 2.3.2, 2.3.3 Oct 16, 2023
@yanliang567 yanliang567 modified the milestones: 2.3.3, 2.3.4 Nov 16, 2023
@yanliang567
Contributor Author

moving to 2.4 for L0 deletion

@yanliang567 yanliang567 modified the milestones: 2.3.4, 2.4.0 Dec 5, 2023
@congqixia
Contributor

@yanliang567 now we shall verify whether this problem persists when the L0 segment is enabled
/assign @yanliang567

@yanliang567
Contributor Author

Will do once the L0 segment is enabled.
/unassign @congqixia

@yanliang567 yanliang567 removed the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Mar 5, 2024
@yanliang567
Contributor Author

Tested on 2.4-20240407-e3b65203-amd64:
datanode memory goes up to 38GB
(screenshot)

and the tt lag catches up from 5.3h to 200ms in about 60 minutes
(screenshot)

@xiaofan-luan
Collaborator

I thought the key might be to increase flush concurrency to make sure flush can catch up with the insertion rate.

@xiaofan-luan
Collaborator

/assign @congqixia

@congqixia
Contributor

@xiaofan-luan the scenario here is to verify the datanode behavior when the datanode is down for a long time.

@yanliang567 the last run did not limit the memory of the datanode. Memory usage went to around 40GB, so maybe it's still an issue here. Let's check what the behavior is when the datanode has a memory limit.

The catch-up time is about one hour for a 5-hour ttlag with insertion. Is this value good enough for our system? @xiaofan-luan @yanliang567 @tedxu @jaime0815

@xiaofan-luan
Collaborator

  1. How long do we stop the cluster?
  2. Is there anything we can improve? what is the bottleneck?

@yanliang567
Contributor Author

We did not stop the cluster; we just scaled the datanode replicas down to 0, inserted for 6 hours (~50M_768d data), and then brought one datanode back up.
@congqixia is working on a PR

congqixia added a commit to congqixia/milvus that referenced this issue Apr 11, 2024
See also milvus-io#27675 milvus-io#26177

Make memory check evict memory buffer until memory water level is safe.
Also make `EvictBuffer` wait until sync task done.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
congqixia added a commit to congqixia/milvus that referenced this issue Apr 11, 2024
See also milvus-io#27675 milvus-io#26177

Make memory check evict memory buffer until memory water level is safe.
Also make `EvictBuffer` wait until sync task done.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Apr 12, 2024
See also #27675 #26177

Make memory check evict memory buffer until memory water level is safe.
Also make `EvictBuffer` wait until sync task done.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
congqixia added a commit to congqixia/milvus that referenced this issue Apr 12, 2024
See also milvus-io#27675 milvus-io#26177

Make memory check evict memory buffer until memory water level is safe.
Also make `EvictBuffer` wait until sync task done.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Apr 12, 2024
…32172) (#32201)

Cherry-pick from master
pr: #32172
See also #27675 #26177

Make memory check evict memory buffer until memory water level is safe.
Also make `EvictBuffer` wait until sync task done.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
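
The commit message above describes the intended behavior; a minimal Go sketch of that idea, with hypothetical names and thresholds rather than Milvus's real types, could look like this: a memory check keeps syncing the largest segment buffer, waiting for each sync to finish, until total usage falls below the water level.

package main

import "fmt"

// segmentBuffer is a stand-in for a segment write buffer; all names and
// thresholds here are illustrative, not Milvus's actual implementation.
type segmentBuffer struct {
	segmentID int64
	sizeBytes int64
}

// syncAndRelease flushes the buffer and blocks until the sync task is done,
// mirroring "make EvictBuffer wait until sync task done".
func (b *segmentBuffer) syncAndRelease() {
	fmt.Printf("synced segment %d, released %d bytes\n", b.segmentID, b.sizeBytes)
	b.sizeBytes = 0
}

// evictUntilSafe keeps syncing the largest buffer until total usage is at or
// below the water level, mirroring "evict memory buffer until memory water
// level is safe".
func evictUntilSafe(buffers []*segmentBuffer, waterLevel int64) {
	for {
		var total, largest int64
		idx := -1
		for i, b := range buffers {
			total += b.sizeBytes
			if b.sizeBytes > largest {
				largest, idx = b.sizeBytes, i
			}
		}
		if total <= waterLevel || idx < 0 {
			return
		}
		buffers[idx].syncAndRelease()
	}
}

func main() {
	bufs := []*segmentBuffer{{1, 300 << 20}, {2, 700 << 20}, {3, 200 << 20}}
	evictUntilSafe(bufs, 512<<20) // keep buffered writes under an example 512MB water level
}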
@yanliang567 yanliang567 modified the milestones: 2.4.0, 2.4.1 Apr 18, 2024
@yanliang567
Contributor Author

On master-20240426-bed6363f, the tt lag catches up quickly, but the datanode uses memory without any limit; OOM occurred several times in an 8c32g datanode pod.
(screenshots)

@yanliang567 yanliang567 modified the milestones: 2.4.1, 2.4.2 May 7, 2024

stale bot commented Jun 10, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no updates for 30 days label Jun 10, 2024
@stale stale bot closed this as completed Jul 1, 2024