[Bug]: Flush hang and mixcoord panic: runtime error: index out of range [0] with length 0 #28628
Closed
Description
Is there an existing issue for this?
- I have searched the existing issues
Environment
- Milvus version: master-20231121-cc952e04
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka):
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
- Create collection `fouram_79xHGrEL` -> index -> load
- Create partitions: p1, p2, p3
- Insert 6m, 6m, and 8m same entities into partitions p1, p2, and p3 respectively
- Delete pk [0, 2800000) from partition `p2` in 9 batches, manually calling flush() after each deletion
- The flush then hangs (called with no timeout, i.e. timeout=None), and mixcoord restarts due to a panic
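For anyone trying to reproduce this outside the 4am job, here is a rough sketch of the delete-then-flush loop from the steps above. It is not the fouram test code. The pk field name `id`, the near-equal contiguous split of the range, and the range-style delete expression are all assumptions; `delete_in_batches` and `split_range` are hypothetical helper names. The only pymilvus calls assumed are `Collection.delete(expr, partition_name=...)` and `Collection.flush(timeout=...)`.

```python
def split_range(start, stop, parts):
    """Split the half-open range [start, stop) into `parts` contiguous
    half-open sub-ranges whose sizes differ by at most one."""
    base, extra = divmod(stop - start, parts)
    ranges = []
    lo = start
    for i in range(parts):
        # The first `extra` sub-ranges absorb the remainder, one pk each.
        hi = lo + base + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi
    return ranges


def delete_in_batches(collection, partition_name, start, stop, parts,
                      pk_field="id", flush_timeout=None):
    """Delete pks in [start, stop) in `parts` batches, flushing after each.

    `collection` is expected to behave like a pymilvus Collection.
    `pk_field` is an assumed name; the issue does not state the schema.
    """
    for lo, hi in split_range(start, stop, parts):
        # Assumed expression form; the original job may use `id in [...]`.
        expr = f"{pk_field} >= {lo} and {pk_field} < {hi}"
        collection.delete(expr, partition_name=partition_name)
        # flush_timeout=None waits indefinitely, matching the reported hang.
        collection.flush(timeout=flush_timeout)
```

With a real collection this would be called as `delete_in_batches(collection, "p2", 0, 2800000, 9)`. Passing a finite `flush_timeout` (the value is up to the caller) should make the client raise an error instead of blocking forever if the flush never completes.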
[2023-11-21 12:16:25,574 - INFO - fouram]: [FunctionalCases] Flush params: {} (functional_cases.py:137)
[2023-11-21 12:16:25,574 - DEBUG - fouram]: (api_request) : [Collection.flush] args: [], kwargs: {} (api_request.py:72)
[2023/11/21 12:43:16.361 +00:00] [INFO] [datacoord/compaction.go:224] ["Compaction finished"] [planID=445793404926507826] [nodeID=7]
[2023/11/21 12:43:16.361 +00:00] [INFO] [datacoord/compaction.go:243] ["Compaction scheduler status"] [waiting="[445793404926514005,445793404926514597]"] [executing="[]"]
panic: runtime error: index out of range [0] with length 0
goroutine 426 [running]:
panic({0x4b8ef60, 0xc0027e9530})
/usr/local/go/src/runtime/panic.go:987 +0x3bb fp=0xc0032d9060 sp=0xc0032d8fa0 pc=0x19adc5b
runtime.goPanicIndex(0x0, 0x0)
/usr/local/go/src/runtime/panic.go:113 +0x7f fp=0xc0032d90a0 sp=0xc0032d9060 pc=0x19ab99f
github.com/milvus-io/milvus/internal/datacoord.(*meta).PrepareCompleteCompactionMutation(0xc001549500, 0xc003141400, 0xc0031119e0)
/go/src/github.com/milvus-io/milvus/internal/datacoord/meta.go:1056 +0x151d fp=0xc0032d94b0 sp=0xc0032d90a0 pc=0x3bb2a5d
github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).handleMergeCompactionResult(0xc0000528c0, 0xc003141400, 0x62fc6bd67db3b32?)
/go/src/github.com/milvus-io/milvus/internal/datacoord/compaction.go:420 +0x45 fp=0xc0032d9798 sp=0xc0032d94b0 pc=0x3b6d9e5
github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).completeCompaction(0xc0000528c0, 0xc0031119e0)
/go/src/github.com/milvus-io/milvus/internal/datacoord/compaction.go:406 +0x176 fp=0xc0032d9888 sp=0xc0032d9798 pc=0x3b6d596
github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).updateCompaction(0xc0000528c0, 0x62fc798ac340002)
/go/src/github.com/milvus-io/milvus/internal/datacoord/compaction.go:482 +0x75e fp=0xc0032d9dc8 sp=0xc0032d9888 pc=0x3b6eb7e
github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).start.func1()
/go/src/github.com/milvus-io/milvus/internal/datacoord/compaction.go:299 +0x47d fp=0xc0032d9fe0 sp=0xc0032d9dc8 pc=0x3b6c0bd
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0032d9fe8 sp=0xc0032d9fe0 pc=0x19e79e1
created by github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).start
/go/src/github.com/milvus-io/milvus/internal/datacoord/compaction.go:277 +0xcb
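To make the call chain in the trace above easier to read, here is a small triage sketch that pairs each function line with the file:line that follows it and keeps only Milvus frames. `parse_frames` is a hypothetical helper, not a Milvus or Go tool, and it assumes the usual Go traceback layout of a function line followed by a source-location line. The sample is an excerpt copied from the trace in this issue.

```python
import re

# Source-location line, e.g. "/go/src/.../meta.go:1056 +0x151d fp=..."
FILE_LINE = re.compile(r"^(/\S+\.(?:go|s)):(\d+)")
# Trailing argument list on a function line, e.g. "(0x0, 0x0)" or "()"
ARGS = re.compile(r"\([^()]*\)$")

SAMPLE_TRACE = """\
panic: runtime error: index out of range [0] with length 0
goroutine 426 [running]:
runtime.goPanicIndex(0x0, 0x0)
/usr/local/go/src/runtime/panic.go:113 +0x7f fp=0xc0032d90a0 sp=0xc0032d9060 pc=0x19ab99f
github.com/milvus-io/milvus/internal/datacoord.(*meta).PrepareCompleteCompactionMutation(0xc001549500, 0xc003141400, 0xc0031119e0)
/go/src/github.com/milvus-io/milvus/internal/datacoord/meta.go:1056 +0x151d fp=0xc0032d94b0 sp=0xc0032d90a0 pc=0x3bb2a5d
github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).handleMergeCompactionResult(0xc0000528c0, 0xc003141400, 0x62fc6bd67db3b32?)
/go/src/github.com/milvus-io/milvus/internal/datacoord/compaction.go:420 +0x45 fp=0xc0032d9798 sp=0xc0032d94b0 pc=0x3b6d9e5
"""


def parse_frames(trace):
    """Return (function, file, line) tuples, innermost frame first."""
    frames = []
    pending = None  # most recent non-location line, args stripped
    for raw in trace.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FILE_LINE.match(line)
        if match is None:
            # Header lines ("panic: ...", "goroutine ...") also land here,
            # but they are overwritten before any location line follows.
            if line.startswith("created by "):
                line = line[len("created by "):]
            pending = ARGS.sub("", line)
        elif pending is not None:
            frames.append((pending, match.group(1), int(match.group(2))))
            pending = None
    return frames


milvus_frames = [f for f in parse_frames(SAMPLE_TRACE)
                 if "milvus-io/milvus" in f[0]]
for func, path, lineno in milvus_frames:
    print(f"{func}  ({path}:{lineno})")
```

On the full trace, the innermost Milvus frame is `(*meta).PrepareCompleteCompactionMutation` at `meta.go:1056`, reached from the compaction handler's `start.func1` -> `updateCompaction` -> `completeCompaction` -> `handleMergeCompactionResult`, so the out-of-range index happens while datacoord is applying a finished merge compaction result.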
Expected Behavior
No response
Steps To Reproduce
4am argo link: https://argo-workflows.zilliz.cc/workflows/qa/test-delete-functional-20m-partitions-5?tab=workflow&nodeId=test-delete-functional-20m-partitions-5&nodePanelView=containers
Milvus Log
4am pods:
test-delete-cluster-1-milvus-datanode-7fc5765545-67f59 1/1 Running 0 3h24m 10.104.17.213 4am-node23 <none> <none>
test-delete-cluster-1-milvus-datanode-7fc5765545-cd7bt 1/1 Running 0 3h24m 10.104.16.133 4am-node21 <none> <none>
test-delete-cluster-1-milvus-indexnode-6df54c8f4f-wzgnr 1/1 Running 0 3h24m 10.104.24.46 4am-node29 <none> <none>
test-delete-cluster-1-milvus-mixcoord-67cdf4df98-vl77f 1/1 Running 1 (141m ago) 3h24m 10.104.20.198 4am-node22 <none> <none>
test-delete-cluster-1-milvus-proxy-79559fbc9b-lgkjk 1/1 Running 0 3h24m 10.104.14.98 4am-node18 <none> <none>
test-delete-cluster-1-milvus-querynode-57b596c57c-ctwcx 1/1 Running 0 3h24m 10.104.1.254 4am-node10 <none> <none>
test-delete-cluster-1-milvus-querynode-57b596c57c-mj9bz 1/1 Running 0 3h24m 10.104.16.134 4am-node21 <none> <none>
Anything else?
No response