Description
Describe the bug
21:58:32 2020-06-20:13:58:32,104 INFO [client.py:123] Building index start, collection_name: sift_1m_128_128_l2, index_type: IVFLAT
21:58:32 2020-06-20:13:58:32,105 INFO [client.py:125] {'nlist': 4096}
22:01:09 create_index
22:01:09 <_MultiThreadedRendezvous of RPC that terminated with:
22:01:09 status = StatusCode.UNAVAILABLE
22:01:09 details = "Socket closed"
22:01:09 debug_error_string = "{"created":"@1592661660.495892340","description":"Error received from peer ipv4:10.44.0.1:19530","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"Socket closed","grpc_status":14}"
22:01:09 >
22:01:09 2020-06-20:14:01:00,497 ERROR [grpc_handler.py:41] create_index
22:01:09 <_MultiThreadedRendezvous of RPC that terminated with:
22:01:09 status = StatusCode.UNAVAILABLE
22:01:09 details = "Socket closed"
22:01:09 debug_error_string = "{"created":"@1592661660.495892340","description":"Error received from peer ipv4:10.44.0.1:19530","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"Socket closed","grpc_status":14}"
22:01:09 >
22:01:09 2020-06-20:14:01:00,497 ERROR [main.py:69] 'tuple' object has no attribute 'OK'
22:01:09 2020-06-20:14:01:00,499 ERROR [main.py:70] Traceback (most recent call last):
22:01:09 File "main.py", line 67, in queue_worker
22:01:09 runner.run(run_type, collection)
22:01:09 File "/home/jenkins/agent/workspace/milvus-benchmark-0.8.1/milvus_benchmark/k8s_runner.py", line 155, in run
22:01:09 milvus_instance.create_index(index_type, index_param)
22:01:09 File "/home/jenkins/agent/workspace/milvus-benchmark-0.8.1/milvus_benchmark/client.py", line 32, in wrapper
22:01:09 result = func(*args, **kwargs)
22:01:09 File "/home/jenkins/agent/workspace/milvus-benchmark-0.8.1/milvus_benchmark/client.py", line 127, in create_index
22:01:09 self.check_status(status)
22:01:09 File "/home/jenkins/agent/workspace/milvus-benchmark-0.8.1/milvus_benchmark/client.py", line 71, in check_status
22:01:09 if not status.OK():
22:01:09 AttributeError: 'tuple' object has no attribute 'OK'
22:01:09
22:01:09 2020-06-20:14:01:00,500 DEBUG [k8s_runner.py:66] benchmark-test-fxghtjsw
22:01:09 Error: uninstall: Release not loaded: benchmark-test-gzelwvgk: release: not found
22:01:09 2020-06-20:14:01:00,575 DEBUG [utils.py:259] helm uninstall -n milvus benchmark-test-fxghtjsw
22:01:09 release "benchmark-test-fxghtjsw" uninstalled
22:01:09 2020-06-20:14:01:00,797 DEBUG [main.py:75] All task finished in queue: poseidon
2020-06-20 22:01:01,256 | INFO | default | [SERVER] Milvus Release version: v0.8.1, built at 2020-06-20 12:45.41
2020-06-20 22:01:01,256 | INFO | default | [SERVER] CPU edition
2020-06-20 22:01:01,266 | INFO | default | [ENGINE] Using SQLite
2020-06-20 22:01:01,305 | INFO | default | [WAL] record type 5 record lsn 140734830687464 error code 0
2020-06-20 22:01:01,305 | INFO | default | [WAL] record type 5 collection lsn 0
2020-06-20 22:01:02,306 | INFO | default | [WAL] record type 5 collection lsn 0
2020-06-20 22:01:02,311 | INFO | default | [SERVER] Server received critical signal: 11
2020-06-20 22:01:02,311 | INFO | default | [SERVER] Call stack:
2020-06-20 22:01:02,312 | INFO | default | [SERVER] ../bin/milvus_server() [0x5e4584]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] ../bin/milvus_server() [0x5e4ca8]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] /lib64/libc.so.6(+0x36400) [0x7fd70ece2400]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] ../bin/milvus_server() [0x7c8274]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] ../bin/milvus_server() [0x63bfce]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] ../bin/milvus_server() [0x6411ca]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] ../bin/milvus_server() [0x4acb8d]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] ../bin/milvus_server() [0x4a79ca]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] ../bin/milvus_server() [0xd12bff]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] /lib64/libpthread.so.0(+0x7ea5) [0x7fd70fab6ea5]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] /lib64/libc.so.6(clone+0x6d) [0x7fd70edaa8dd]
2020-06-20 22:01:02,364 | INFO | default | [WAL] record type 5 collection lsn 0
error logs:
2020-06-20 22:01:01,304 | ERROR | default | [WAL] bad wal file 2
2020-06-20 22:01:02,307 | ERROR | default | [ENGINE] Collection file doesn't exist: /test/milvus/db_data_080/sift_1m_128_128_l2/db/tables/sift_1m_128_128_l2/1592661205816766000/1592661205816766000 in path: /test/milvus/db_data_080/sift_1m_128_128_l2/db for collection: sift_1m_128_128_l2
2020-06-20 22:01:02,310 | ERROR | default | [ENGINE] Failed to open file: /test/milvus/db_data_080/sift_1m_128_128_l2/db/tables/sift_1m_128_128_l2/1592661204384909000/deleted_docs, error: No such file or directory
2020-06-20 22:01:02,310 | ERROR | default | [ENGINE] Failed to load segment from /test/milvus/db_data_080/sift_1m_128_128_l2/db/tables/sift_1m_128_128_l2/1592661204384909000/1592661204384909000
2020-06-20 22:01:02,311 | ERROR | default | [ENGINE] Failed to load segment from
2020-06-20 22:01:02,366 | ERROR | default | [ENGINE] Collection file doesn't exist: /test/milvus/db_data_080/sift_1m_128_128_l2/db/tables/sift_1m_128_128_l2/1592661205816766000/1592661205816766000 in path: /test/milvus/db_data_080/sift_1m_128_128_l2/db for collection: sift_1m_128_128_l2
2020-06-20 22:01:02,367 | ERROR | default | [ENGINE] Failed to open file: /test/milvus/db_data_080/sift_1m_128_128_l2/db/tables/sift_1m_128_128_l2/1592661204384909000/deleted_docs, error: No such file or directory
2020-06-20 22:01:02,367 | ERROR | default | [ENGINE] Failed to load segment from /test/milvus/db_data_080/sift_1m_128_128_l2/db/tables/sift_1m_128_128_l2/1592661204384909000/1592661204384909000
2020-06-20 22:01:02,367 | ERROR | default | [ENGINE] Failed to load segment from
2020-06-20 22:01:02,419 | ERROR | default | [ENGINE] Failed to build index 1592661662311016000, reason: Resource deadlock avoided
debug logs:
2020-06-20 22:01:02,367 | DEBUG | default | [ENGINE] Index params: {"dim":128,"gpu_id":0,"metric_type":"L2"}
2020-06-20 22:01:02,367 | DEBUG | default | [SERVER] BuildIndexJob 2 finish index file: 36
2020-06-20 22:01:02,367 | DEBUG | default | [SERVER] cpu load BuildIndexTask
2020-06-20 22:01:02,367 | DEBUG | default | [SERVER] BuildIndexJob 2 all done
2020-06-20 22:01:02,367 | DEBUG | default | [ENGINE] Building index job 2 succeed.
2020-06-20 22:01:02,367 | DEBUG | default | [ENGINE] Unmark ongoing file:1592661204384909000 refcount:0
2020-06-20 22:01:02,367 | DEBUG | default | [ENGINE] Finish build index file 1592661204384909000
2020-06-20 22:01:02,367 | DEBUG | default | [ENGINE] Index params: {"dim":128,"gpu_id":0,"metric_type":"L2"}
2020-06-20 22:01:02,368 | DEBUG | default | [SERVER] BuildIndexJob 3 finish index file: 37
2020-06-20 22:01:02,368 | DEBUG | default | [SERVER] BuildIndexJob 3 all done
2020-06-20 22:01:02,368 | DEBUG | default | [ENGINE] Building index job 3 succeed.
2020-06-20 22:01:02,368 | DEBUG | default | [ENGINE] Unmark ongoing file:1592661205816766000 refcount:0
2020-06-20 22:01:02,368 | DEBUG | default | [ENGINE] Finish build index file 1592661205816766000
2020-06-20 22:01:02,368 | DEBUG | default | [ENGINE] Background build index thread finished
2020-06-20 22:01:02,368 | DEBUG | default | [ENGINE] DB background thread exit
2020-06-20 22:01:02,369 | DEBUG | default | [ENGINE] Remove collection file type as NEW
2020-06-20 22:01:02,370 | DEBUG | default | [ENGINE] Clean 1 files
2020-06-20 22:01:02,418 | DEBUG | default | [ENGINE] DB background metric thread exit
2020-06-20 22:01:02,419 | DEBUG | default | [ENGINE] Update single collection file, file id = 1592661662311016000
2020-06-20 22:01:02,419 | DEBUG | default | [SERVER] BuildIndexJob 0 finish index file: 0
2020-06-20 22:01:02,419 | DEBUG | default | [SERVER] XBuildIndexTask::Execute 0: totally cost (0.108663 second [108.663442 ms])
2020-06-20 22:01:02,419 | DEBUG | default | [SERVER] XBuildIndexTask::Execute 0: totally cost (0.000003 second [0.003230 ms])
2020-06-20 22:01:02,419 | DEBUG | default | [SERVER] XBuildIndexTask::Execute 0: totally cost (0.000003 second [0.002880 ms])
2020-06-20 22:01:02,419 | DEBUG | default | [SERVER] XBuildIndexTask::Execute 0: totally cost (0.000003 second [0.002614 ms])
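Signal 11 is SIGSEGV, and the call stack in the server log above only shows raw addresses. As a rough aid for whoever triages this, the sketch below pulls the frame addresses that belong to `milvus_server` out of the log text so they can be passed to `addr2line`. The helper name `extract_frames` and the regular expression are my own assumptions, not part of Milvus, and the addresses will only resolve if the matching binary with symbols is available.

```python
import re

# Matches a call-stack line such as
#   "... [SERVER] ../bin/milvus_server() [0x7c8274]"
# and captures the module path and the bracketed address.
FRAME_RE = re.compile(r"\[SERVER\]\s+(\S+?)\(.*?\)\s+\[(0x[0-9a-fA-F]+)\]")


def extract_frames(log_text, binary_suffix="milvus_server"):
    """Return the addresses of stack frames that belong to the given binary."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(1).endswith(binary_suffix):
            frames.append(match.group(2))
    return frames


# A few lines copied from the server log in this report.
sample = """\
2020-06-20 22:01:02,311 | INFO | default | [SERVER] Server received critical signal: 11
2020-06-20 22:01:02,312 | INFO | default | [SERVER] ../bin/milvus_server() [0x5e4584]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] /lib64/libc.so.6(+0x36400) [0x7fd70ece2400]
2020-06-20 22:01:02,312 | INFO | default | [SERVER] ../bin/milvus_server() [0x7c8274]
"""

addresses = extract_frames(sample)
# Build a command line to run against the server binary from the same image.
print("addr2line -f -C -e milvus_server " + " ".join(addresses))
```

Frames from `libc` and `libpthread` are skipped on purpose, since their addresses are load addresses of shared objects and would need the library offsets instead.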
Steps/Code to reproduce behavior
- Create a collection, insert the sift-1m dataset into it, and flush.
- Clean up the container.
- Start the test with a new container and build an index with ivf_flat (index_type: IVFLAT, nlist: 4096).
Index creation fails because the server crashes (critical signal 11 in the server log).
Data/log path on NAS: /test/milvus/db_data_080/sift_1m_128_128_l2
Expected behavior
Environment details
0.8.1-cpu
commit id:
registry.zilliz.com/milvus/engine 0.8.1-cpu-centos7-release 51e4c23dbd63
Screenshots
Additional context