[Bug]: Channels are not evenly balanced among dataNode #33583

Open
1 task done
ThreadDao opened this issue Jun 4, 2024 · 4 comments
Labels
kind/bug Issues or changes related a bug triage/accepted Indicates an issue or PR is ready to be actively worked on.

@ThreadDao
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: cardinal-milvus-io-2.4-deebae70a-20240527
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka):
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Milvus has two dataNodes, and the collection has 2 shards. One dataNode subscribes to both channels while the other sits idle.
image

laion1b-test-2-milvus-datanode-7d95c68964-d8zqd                   1/1     Running            0                  7d23h   10.104.13.73    4am-node16   <none>           <none>
laion1b-test-2-milvus-datanode-7d95c68964-k6bl7                   1/1     Running            0                  7d23h   10.104.16.152   4am-node21   <none>           <none>
laion1b-test-2-milvus-mixcoord-5755449b5c-77gbk                   1/1     Running            0                  7d23h   10.104.34.103   4am-node37   <none>           <none>

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

@ThreadDao ThreadDao added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 4, 2024
@ThreadDao ThreadDao added this to the 2.4.4 milestone Jun 4, 2024
@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 4, 2024
@yanliang567 yanliang567 removed their assignment Jun 4, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.4, 2.4.5 Jun 5, 2024
@xiaofan-luan
Collaborator

/assign @yiwangdr
Please help with this.

@yiwangdr
Contributor

yiwangdr commented Jun 7, 2024

@ThreadDao I can't reproduce it. Could you provide a link to the log?

@yanliang567 yanliang567 modified the milestones: 2.4.5, 2.4.6 Jun 26, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.6, 2.4.7 Jul 19, 2024
@XuanYang-cn
Contributor

/assign @weiliu1031
/unassign @yiwangdr

@sre-ci-robot sre-ci-robot assigned weiliu1031 and unassigned yiwangdr Jul 25, 2024
sre-ci-robot pushed a commit that referenced this issue Jul 26, 2024
issue: #33583
The old policy permitted a datanode to hold at most 2 more channels than
any other datanode. So if Milvus has 2 datanodes and 2 channels, both
channels can be assigned to one datanode, leaving the other datanode empty.

This PR refines the balance policy to fix channel imbalance across datanodes.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
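
The effect described in the commit message above can be illustrated with a small model. This is only a sketch and not Milvus's actual balancing code: the functions `is_balanced` and `rebalance`, the `max_diff` parameter, and the node and channel names are all hypothetical. It assumes a policy that tolerates a gap of up to `max_diff` channels between the busiest and the idlest datanode, and shows why a tolerance of 2 leaves the 2-datanode, 2-channel case untouched while a tolerance of 1 spreads the channels out.

```python
from typing import Dict, List


def is_balanced(assignment: Dict[str, List[str]], max_diff: int) -> bool:
    """True when the busiest node holds at most `max_diff` more channels than the idlest."""
    counts = [len(channels) for channels in assignment.values()]
    if not counts:
        return True
    return max(counts) - min(counts) <= max_diff


def rebalance(assignment: Dict[str, List[str]], max_diff: int) -> Dict[str, List[str]]:
    """Move channels from the busiest node to the idlest until the gap is within `max_diff`.

    Hypothetical model only; `max_diff` must be at least 1, because a gap of 0
    is unreachable when the channel count is not a multiple of the node count.
    """
    if max_diff < 1:
        raise ValueError("max_diff must be >= 1")
    result = {node: list(channels) for node, channels in assignment.items()}
    while not is_balanced(result, max_diff):
        busiest = max(result, key=lambda node: len(result[node]))
        idlest = min(result, key=lambda node: len(result[node]))
        result[idlest].append(result[busiest].pop())
    return result


# Hypothetical names standing in for the two datanodes and two channels in the report.
reported = {"datanode-a": ["ch-0", "ch-1"], "datanode-b": []}

# With a tolerance of 2 the reported state counts as balanced, so nothing moves.
print(is_balanced(reported, max_diff=2))
print(rebalance(reported, max_diff=2))

# With a tolerance of 1 the same state is flagged and one channel is moved.
print(is_balanced(reported, max_diff=1))
print(rebalance(reported, max_diff=1))
```

Under this model, any tolerance of 2 or more can never trigger a move when the total number of channels is 2, which matches the idle datanode observed in the report.
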
weiliu1031 added a commit to weiliu1031/milvus that referenced this issue Jul 26, 2024
issue: milvus-io#33583
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Jul 29, 2024
issue: #33583
pr: #34984
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
sumitd2 pushed a commit to sumitd2/milvus that referenced this issue Aug 6, 2024
issue: milvus-io#33583
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Signed-off-by: Sumit Dubey <sumit.dubey2@ibm.com>
@yanliang567 yanliang567 removed this from the 2.4.7 milestone Aug 12, 2024
@yanliang567 yanliang567 added this to the 2.4.8 milestone Aug 12, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.8, 2.4.10 Aug 19, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.10, 2.4.11 Sep 5, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.11, 2.4.12 Sep 18, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.12, 2.4.13 Sep 27, 2024
@ThreadDao
Contributor Author

/assign @ThreadDao To verify

@ThreadDao ThreadDao assigned ThreadDao and unassigned weiliu1031 Oct 9, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.13, 2.4.14 Oct 15, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.14, 2.4.16 Nov 14, 2024
6 participants