[Bug]: [cluster] Nightly test case hangs for panic in querynode and datacoord #15283
Comments
querynode was restarted and did not come back into service. |
For nightly 305, there are panics in datacoord and querynode:
(1) md-305-n-milvus-datacoord-f455cddc-56x5m_milvus-ci_datacoord-5a3376049d42d661cb64cc4484b97d98aaa684b96997104babf32713a1e527c4.log:
(2) md-305-n-milvus-querynode-75c88f95cc-58qrm_milvus-ci_querynode-56feae97f0a74b6188854829f418a01551c57f8157014e43d4bcc690ce6ad0e9.log:
|
Panic occurs again in nightly 306: (cluster mode)
(2) md-306-n-milvus-querynode-7886c46b96-t6nf8_milvus-ci_querynode-56caf7d1f2e6c2b8e85235916c15e9ebceff9cf1095b7548f8258e73e570aea0.log:
|
The test hangs again in cluster mode on the latest nightly: milvus: 21f999f
@DragonDriver could you please check whether this has the same root cause as this issue? If not, I will open another issue to track it, thanks. |
@DragonDriver rootcoord crashes many times. Please help check.
|
@DragonDriver any progress? |
I filed an issue in Pulsar, apache/pulsar#13920. |
querycoord panicked when rootcoord dropped a collection, and rootcoord's release-collection call failed during the drop. So after querycoord restarted, it called rootcoord.showpartition(), got an error message, and panicked again. As a result, querycoord panics after every reboot.
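The crash loop described above comes from treating a lookup error for an already-dropped collection as fatal during recovery. One possible mitigation is sketched below in Python for brevity (Milvus itself is written in Go). This is not the fix that was applied in Milvus. `FakeRootCoord`, `CollectionNotFound`, and `recover_loaded_collections` are hypothetical names invented for illustration and do not correspond to the real rootcoord or querycoord APIs.

```python
class CollectionNotFound(Exception):
    """Hypothetical error: the metadata service no longer knows the collection."""


class FakeRootCoord:
    """Stand-in for the metadata service; an assumption, not the Milvus API."""

    def __init__(self, partitions_by_collection):
        self._partitions = dict(partitions_by_collection)

    def show_partitions(self, collection_id):
        try:
            return list(self._partitions[collection_id])
        except KeyError:
            raise CollectionNotFound(collection_id) from None


def recover_loaded_collections(root_coord, loaded_ids):
    """Rebuild state on restart without crashing on already-dropped collections.

    Returns (recovered, stale): the partitions of each collection that still
    exists, and the ids whose leftover load records should be cleaned up.
    """
    recovered, stale = {}, []
    for cid in loaded_ids:
        try:
            recovered[cid] = root_coord.show_partitions(cid)
        except CollectionNotFound:
            # Treat "already dropped" as stale state to clean up, not a fatal error.
            stale.append(cid)
    return recovered, stale


# Collection 2 was dropped while its load record was left behind.
root = FakeRootCoord({1: ["_default"]})
recovered, stale = recover_loaded_collections(root, [1, 2])
```

In this sketch, collection 2 ends up in `stale` instead of aborting the whole recovery, so a restart can finish and the leftover record can be released afterwards.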
|
Panic occurs again: milvus: a62e2ef
|
Same issue occurs again: milvus: b4bfe58
|
assign @xiaofan-luan |
Similar issue occurs again: 1 Log:
|
/assign @xiaofan-luan |
Hey guys, I hit a similar issue again after I manually deleted the querycoord node. Here is the log; I've hit this issue several times before.
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This has not been reproduced for a long time, so closing it; will reopen if it occurs again in the future. |
Is there an existing issue for this?
Environment
Current Behavior
Test case hangs in cluster mode
Expected Behavior
All the test cases executed successfully
Steps To Reproduce
Anything else?
The tests hang and no logs are generated at this point, but the environment still exists, so you can log in to the machine to check. Thanks.