[Bug]: datanode memory usage increased to 150GB when there are 50m vectors to be flushed #26177
Comments
/assign @congqixia
@yanliang567 after #26179 is merged, could you please verify with this parameter enlarged:
```yaml
dataNode:
  dataSync:
    maxParallelSyncTaskNum: 2 # Maximum number of sync tasks executed in parallel in each flush manager
```
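For reference, an enlarged setting might look like the sketch below in `milvus.yaml`; the value 6 is purely illustrative, not a tested recommendation:
```yaml
dataNode:
  dataSync:
    # Default is 2; a larger value lets each flush manager run more
    # sync (flush) tasks concurrently, at the cost of higher peak I/O.
    maxParallelSyncTaskNum: 6 # illustrative value only
```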
can we simplify the mqstream logic to make it easier to understand?
@congqixia any plans for fixing this issue in v2.3.2?
@yanliang567 nope, L0 delta and other datanode refinements will be implemented after 2.3.2
moving to 2.3.3
moving to 2.4 for L0 deletion
@yanliang567 now we shall verify whether this problem persists when the L0 segment is enabled
will do once the L0 segment is enabled.
I thought the key might be to increase flush concurrency to make sure flushing can catch up with the insertion rate
/assign @congqixia
@xiaofan-luan the scenario here is to verify the datanode behavior when the datanode is down for a long time.
@yanliang567 the last run did not limit the memory of the datanode; memory usage went to around 40GB, so maybe it's still an issue here. Let's check what the behavior is when the datanode has a memory limit. The catch-up time is about one hour for a 5-hour timetick lag with ongoing insertion. Is this value good enough for our system? @xiaofan-luan @yanliang567 @tedxu @jaime0815
|
We did not stop the cluster; we just scaled the datanode replicas down to 0, inserted for 6 hours (~50M 768-dim vectors), and then brought one datanode up.
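(A back-of-envelope check based on the figures above: ~50M rows over 6 hours is roughly 2,300 inserts per second, and catching up a 5-hour lag in about one hour while insertion continues means the datanode consumed roughly six hours' worth of data in that hour, i.e. about 6x the insertion rate.)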
See also milvus-io#27675 milvus-io#26177
Make memory check evict memory buffer until memory water level is safe. Also make `EvictBuffer` wait until sync task done.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Is there an existing issue for this?
Environment
Current Behavior
I scaled the datanode down to 0 and inserted 50M 768-dim vectors. Then I scaled the datanode back up to 1, and its memory usage increased to 150GB within 15 minutes.
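For scale, assuming float32 vectors: 50M rows × 768 dimensions × 4 bytes ≈ 153.6 GB of raw vector data, so 150GB of datanode memory suggests nearly the whole backlog was buffered in memory rather than flushed incrementally.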
Expected Behavior
The baseline on master-20230802-df26b909: datanode memory usage is about 1.3-2.3GB for the same volume of vectors.
Steps To Reproduce
Milvus Log
pod names on devops:
Anything else?
The suspected PR: #26144