Server received critical signal: 11 #3906

Closed
@markoarnauto

Description

Describe the bug
I have a streaming setup in which new data is inserted continuously. Once a certain amount of data has been reached, new data is written into a new partition and the oldest partition is dropped. After a while, Milvus throws an exception and stops (see info.log below). I wrote a minimal script that reproduces my setup (see below).

Steps/Code to reproduce behavior

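    import random
    import time

    import milvus
    from milvus import IndexType, MetricType

    # connection settings (assumed here; milvus_conf was not shown in the original script)
    milvus_conf = {'host': '127.0.0.1', 'port': '19530'}

    # set up the Milvus client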
    milv_coll = 'stress_test'
    dimension = 256
    param = {'collection_name': milv_coll, 'dimension': dimension, 'metric_type': MetricType.IP}
    milv = milvus.Milvus(**milvus_conf)
    if milv.has_collection(milv_coll):
        milv.drop_collection(milv_coll)
        time.sleep(2)
    milv.create_collection(param)
    milv.create_index(milv_coll, IndexType.IVF_FLAT, {'nlist': 4096, 'index_file_size': 2048})
   
    # minimum streaming example
    batch_size = 10  # batch size of one insert
    part_size = 1000  # number of documents in a partition
    look_back = 30  # number of partitions to keep
    vecs = [[random.random() for _ in range(dimension)] for _ in range(batch_size)]
    i = 1

    while True:
        ids = [j for j in range(i, i + batch_size)]
        i += batch_size
        part_id = i // part_size
        cnt_partition = f'part_{part_id}'

        milv.search(milv_coll, query_records=vecs, top_k=1, params={'nprobe': 32})
        res, _ = milv.insert(milv_coll, records=vecs, ids=ids, partition_tag=cnt_partition)
        if res.code == 1:  # partition does not exist yet
            oldest_partition = f'part_{part_id - look_back}'  # get partition beyond look_back
            milv.drop_partition(milv_coll, oldest_partition)  # drop old partition
            milv.create_partition(milv_coll, cnt_partition)  # and create a new one
            milv.insert(milv_coll, records=vecs, ids=ids, partition_tag=cnt_partition)  # insert vecs into new partition
        res = milv.flush(collection_name_array=[milv_coll])
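To make the rotation pattern in the script easier to follow, here is a minimal sketch that replays only the index arithmetic of the loop, without talking to Milvus. The function name `rotation_schedule` is hypothetical and introduced just for illustration; it assumes a partition is created the first time its tag appears, which is what the `res.code == 1` branch is meant to do.

```python
def rotation_schedule(batch_size, part_size, look_back, n_batches):
    """Replay the loop's index arithmetic (no Milvus calls).

    Returns one (batch_index, created_tag, dropped_tag) tuple for every
    batch on which the script would switch to a new partition.
    """
    events = []
    seen = set()  # partition tags that have been created so far
    i = 1
    for batch in range(n_batches):
        i += batch_size  # same update order as in the script
        part_id = i // part_size
        tag = f'part_{part_id}'
        if tag not in seen:
            seen.add(tag)
            # the script drops the partition look_back steps behind
            events.append((batch, tag, f'part_{part_id - look_back}'))
    return events


# same parameters as the script above, first 250 batches
events = rotation_schedule(batch_size=10, part_size=1000, look_back=30, n_batches=250)
for event in events:
    print(event)
```

With these parameters a partition switch happens roughly every 100 batches, and for the first `look_back` switches the tag passed to `drop_partition` has a negative index, so the script asks Milvus to drop a partition that was never created. Whether that is related to the crash is not something this sketch can show.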

Method of installation

  • [x] Docker/cpu
  • [ ] Docker/gpu
  • [ ] Build from source

Environment details

  • Hardware/Software conditions (OS, CPU, GPU, Memory)
    Ubuntu 18.04

  • Milvus version (master or released version)
    0.10.3

Configuration file
Settings you made in server_config.yaml or milvus.yaml

# Copyright (C) 2019-2020 Zilliz. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software distributed under the License
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
# or implied. See the License for the specific language governing permissions and limitations under the License.

version: 0.5

#----------------------+------------------------------------------------------------+------------+-----------------+
# Cluster Config       | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# enable               | If running with Mishards, set true, otherwise false.       | Boolean    | false           |
#----------------------+------------------------------------------------------------+------------+-----------------+
# role                 | Milvus deployment role: rw / ro                            | role       | rw              |
#----------------------+------------------------------------------------------------+------------+-----------------+
cluster:
  enable: false
  role: rw

#----------------------+------------------------------------------------------------+------------+-----------------+
# General Config       | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# time_zone            | Use UTC-x or UTC+x to specify a time zone.                 | Timezone   | UTC+8           |
#----------------------+------------------------------------------------------------+------------+-----------------+
# meta_uri             | URI for metadata storage, using SQLite (for single server  | URL        | sqlite://:@:/   |
#                      | Milvus) or MySQL (for distributed cluster Milvus).         |            |                 |
#                      | Format: dialect://username:password@host:port/database     |            |                 |
#                      | Keep 'dialect://:@:/', 'dialect' can be either 'sqlite' or |            |                 |
#                      | 'mysql', replace other texts with real values.             |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
general:
  timezone: UTC+0
  meta_uri: sqlite://:@:/

#----------------------+------------------------------------------------------------+------------+-----------------+
# Network Config       | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# bind.address         | IP address that Milvus server monitors.                    | IP         | 0.0.0.0         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# bind.port            | Port that Milvus server monitors. Port range (1024, 65535) | Integer    | 19530           |
#----------------------+------------------------------------------------------------+------------+-----------------+
# http.enable          | Enable web server or not.                                  | Boolean    | true            |
#----------------------+------------------------------------------------------------+------------+-----------------+
# http.port            | Port that Milvus web server monitors.                      | Integer    | 19121           |
#                      | Port range (1024, 65535)                                   |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
network: 
  bind.address: 0.0.0.0
  bind.port: 19530
  http.enable: true
  http.port: 19121

#----------------------+------------------------------------------------------------+------------+-----------------+
# Storage Config       | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# path                 | Path used to save meta data, vector data and index data.   | Path       | /var/lib/milvus |
#----------------------+------------------------------------------------------------+------------+-----------------+
# auto_flush_interval  | The interval, in seconds, at which Milvus automatically    | Integer    | 1 (s)           |
#                      | flushes data to disk.                                      |            |                 |
#                      | 0 means disable the regular flush.                         |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
storage:
  path: /var/lib/milvus
  auto_flush_interval: 0

#----------------------+------------------------------------------------------------+------------+-----------------+
# WAL Config           | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# enable               | Whether to enable write-ahead logging (WAL) in Milvus.     | Boolean    | true            |
#                      | If WAL is enabled, Milvus writes all data changes to log   |            |                 |
#                      | files in advance before implementing data changes. WAL     |            |                 |
#                      | ensures the atomicity and durability for Milvus operations.|            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# recovery_error_ignore| Whether to ignore logs with errors that happens during WAL | Boolean    | false           |
#                      | recovery. If true, when Milvus restarts for recovery and   |            |                 |
#                      | there are errors in WAL log files, log files with errors   |            |                 |
#                      | are ignored. If false, Milvus does not restart when there  |            |                 |
#                      | are errors in WAL log files.                               |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# buffer_size          | Sum total of the read buffer and the write buffer in MBs.  | Integer    | 256 (MB)        |
#                      | buffer_size must be in range [64, 4096] (MB).              |            |                 |
#                      | If the value you specified is out of range, Milvus         |            |                 |
#                      | automatically uses the boundary value closest to the       |            |                 |
#                      | specified value. It is recommended you set buffer_size to  |            |                 |
#                      | a value greater than the inserted data size of a single    |            |                 |
#                      | insert operation for better performance.                   |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# path                 | Location of WAL log files.                                 | String     |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
wal:
  enable: false
  recovery_error_ignore: true
  buffer_size: 256MB
  path: /var/lib/milvus/wal

#----------------------+------------------------------------------------------------+------------+-----------------+
# Cache Config         | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# cache_size           | The size of CPU memory used for caching data for faster    | Integer    | 4 (GB)          |
#                      | query. The sum of 'cpu_cache_capacity' and                 |            |                 |
#                      | 'insert_buffer_size' must be less than system memory size. |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# insert_buffer_size   | Buffer size used for data insertion.                       | Integer    | 1 (GB)          |
#                      | The sum of 'insert_buffer_size' and 'cpu_cache_capacity'   |            |                 |
#                      | must be less than system memory size.                      |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# preload_collection   | A comma-separated list of collection names that need to    | StringList |                 |
#                      | be pre-loaded when Milvus server starts up.                |            |                 |
#                      | '*' means preload all existing tables (single-quote or     |            |                 |
#                      | double-quote required).                                    |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
cache:
  cache_size: 4GB
  insert_buffer_size: 1GB
  preload_collection:

#----------------------+------------------------------------------------------------+------------+-----------------+
# GPU Config           | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# enable               | Enable GPU resources or not.                               | Boolean    | false           |
#----------------------+------------------------------------------------------------+------------+-----------------+
# cache_size           | The size of GPU memory per card used for cache.            | Integer    | 1 (GB)          |
#----------------------+------------------------------------------------------------+------------+-----------------+
# gpu_search_threshold | A Milvus performance tuning parameter. This value will be  | Integer    | 1000            |
#                      | compared with 'nq' to decide if the search computation will|            |                 |
#                      | be executed on GPUs only.                                  |            |                 |
#                      | If nq >= gpu_search_threshold, the search computation will |            |                 |
#                      | be executed on GPUs only;                                  |            |                 |
#                      | if nq < gpu_search_threshold, the search computation will  |            |                 |
#                      | be executed on both CPUs and GPUs.                         |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# search_resources     | The list of GPU devices used for search computation.       | DeviceList | gpu0            |
#                      | Must be in format gpux.                                    |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# build_index_resources| The list of GPU devices used for index building.           | DeviceList | gpu0            |
#                      | Must be in format gpux.                                    |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
gpu:
  enable: false
  cache_size: 1GB
  gpu_search_threshold: 1000
  search_devices:
    - gpu0
  build_index_devices:
    - gpu0

#----------------------+------------------------------------------------------------+------------+-----------------+
# Logs Config          | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# level                | Log level in Milvus. Must be one of debug, info, warning,  | String     | debug           |
#                      | error, fatal                                               |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# trace.enable         | Whether to enable trace level logging in Milvus.           | Boolean    | true            |
#----------------------+------------------------------------------------------------+------------+-----------------+
# path                 | Absolute path to the folder holding the log files.         | String     |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# max_log_file_size    | The maximum size of each log file, size range [512, 4096]  | Integer    | 1024 (MB)       |
#----------------------+------------------------------------------------------------+------------+-----------------+
# log_rotate_num       | The maximum number of log files that Milvus keeps for each | Integer    | 0               |
#                      | logging level, num range [0, 1024], 0 means unlimited.     |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
logs:
  level: info
  trace.enable: true
  path: /var/lib/milvus/logs
  max_log_file_size: 1024MB
  log_rotate_num: 0

#----------------------+------------------------------------------------------------+------------+-----------------+
# Metric Config        | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# enable               | Enable monitoring function or not.                         | Boolean    | false           |
#----------------------+------------------------------------------------------------+------------+-----------------+
# address              | Pushgateway address                                        | IP         | 127.0.0.1       |
#----------------------+------------------------------------------------------------+------------+-----------------+
# port                 | Pushgateway port, port range (1024, 65535)                 | Integer    | 9091            |
#----------------------+------------------------------------------------------------+------------+-----------------+
metric:
  enable: false
  address: 127.0.0.1
  port: 9091

Screenshots
Stack trace (info.log):

[2020-09-28 10:08:51,947][INFO][SERVER][Insert][grpcpp_sync_ser] Request [65324] Insert begin.
[2020-09-28 10:08:51,948][INFO][SERVER][OnExecute][reqsched_thread] [insert][0] Execute insert request.
[2020-09-28 10:08:51,951][INFO][SERVER][Insert][grpcpp_sync_ser] Request [65324] Insert end.
[2020-09-28 10:08:51,953][INFO][SERVER][Flush][grpcpp_sync_ser] Request [65325] Flush begin.
[2020-09-28 10:08:51,962][INFO][SERVER][Flush][grpcpp_sync_ser] Request [65325] Flush end.
[2020-09-28 10:08:51,991][INFO][SERVER][Search][grpcpp_sync_ser] Request [65323] Search end.
[2020-09-28 10:08:51,996][INFO][SERVER][Insert][grpcpp_sync_ser] Request [65326] Insert begin.
[2020-09-28 10:08:51,996][INFO][SERVER][OnExecute][reqsched_thread] [insert][0] Execute insert request.
[2020-09-28 10:08:52,000][INFO][SERVER][Insert][grpcpp_sync_ser] Request [65326] Insert end.
[2020-09-28 10:08:52,003][INFO][SERVER][Flush][grpcpp_sync_ser] Request [65327] Flush begin.
[2020-09-28 10:08:52,008][INFO][SERVER][Flush][grpcpp_sync_ser] Request [65327] Flush end.
[2020-09-28 10:08:52,013][INFO][SERVER][Search][grpcpp_sync_ser] Request [65328] Search begin.
[2020-09-28 10:08:52,013][INFO][SERVER][OnPreExecute][grpcpp_sync_ser] [search][0] Search pre-execute. Check search parameters
[2020-09-28 10:08:52,013][INFO][SERVER][OnExecute][reqsched_thread] [search][0] Search execute.
[2020-09-28 10:08:52,016][INFO][SERVER][HandleSignal][jobmgr_thread] Release lock!11
[2020-09-28 10:08:52,016][INFO][SERVER][HandleSignal][jobmgr_thread] Server received critical signal: 11
[2020-09-28 10:08:52,016][INFO][SERVER][PrintStacktrace][jobmgr_thread] Call stack:
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /var/lib/milvus/bin/milvus_server() [0x6b11ec]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /var/lib/milvus/bin/milvus_server() [0x6b1cd4]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /usr/lib64/libc.so.6(+0x363b0) [0x7f9ea55f73b0]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /usr/lib64/libc.so.6(abort+0x297) [0x7f9ea55f8b77]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /usr/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x165) [0x7f9ea5f077d5]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /usr/lib64/libstdc++.so.6(+0x5e746) [0x7f9ea5f05746]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /usr/lib64/libstdc++.so.6(+0x5e773) [0x7f9ea5f05773]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /usr/lib64/libstdc++.so.6(__cxa_rethrow+0x49) [0x7f9ea5f059e9]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /var/lib/milvus/bin/milvus_server() [0x4c1b05]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /var/lib/milvus/bin/milvus_server() [0x4a99a4]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /var/lib/milvus/bin/milvus_server() [0xeedf3f]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /usr/lib64/libpthread.so.0(+0x7e65) [0x7f9ea63cbe65]
[2020-09-28 10:08:52,017][INFO][SERVER][PrintStacktrace][jobmgr_thread] /usr/lib64/libc.so.6(clone+0x6d) [0x7f9ea56bf88d]
[2020-09-28 10:08:52,020][INFO][SERVER][Cmd][grpcpp_sync_ser] Request [65329] Cmd begin.
[2020-09-28 10:08:52,020][INFO][SERVER][Cmd][grpcpp_sync_ser] Request [65329] Cmd end.
[2020-09-28 10:08:52,028][INFO][SERVER][Search][grpcpp_sync_ser] Request [65330] Search begin.
[2020-09-28 10:08:52,028][INFO][SERVER][OnPreExecute][grpcpp_sync_ser] [search][0] Search pre-execute. Check search parameters
[2020-09-28 11:29:33,353][INFO][SERVER][HandleSignal][milvus_server] Release lock!15
[2020-09-28 11:29:33,353][INFO][SERVER][HandleSignal][milvus_server] Server received critical signal: 15
[2020-09-28 11:29:33,353][INFO][SERVER][PrintStacktrace][milvus_server] Call stack:
[2020-09-28 11:29:33,353][INFO][SERVER][PrintStacktrace][milvus_server] /var/lib/milvus/bin/milvus_server() [0x6b11ec]
[2020-09-28 11:29:33,353][INFO][SERVER][PrintStacktrace][milvus_server] /var/lib/milvus/bin/milvus_server() [0x6b1cd4]
[2020-09-28 11:29:33,353][INFO][SERVER][PrintStacktrace][milvus_server] /usr/lib64/libc.so.6(+0x363b0) [0x7f9ea55f73b0]
[2020-09-28 11:29:33,353][INFO][SERVER][PrintStacktrace][milvus_server] /usr/lib64/libpthread.so.0(pause+0x2d) [0x7f9ea63d2f1d]
[2020-09-28 11:29:33,353][INFO][SERVER][PrintStacktrace][milvus_server] /var/lib/milvus/bin/milvus_server() [0x42d899]
[2020-09-28 11:29:33,353][INFO][SERVER][PrintStacktrace][milvus_server] /usr/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f9ea55e3505]
[2020-09-28 11:29:33,353][INFO][SERVER][PrintStacktrace][milvus_server] /var/lib/milvus/bin/milvus_server() [0x43ed27]
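The log above reports two different "critical signal" numbers. As a small aid for reading it, the following sketch pulls those lines out and maps the numbers to signal names with Python's standard `signal` module. The two sample lines are copied from the log; the regular expression and the helper name `decode_signals` are assumptions made for this example, not part of Milvus.

```python
import re
import signal

# two lines copied from the info.log excerpt above
LOG_LINES = [
    "[2020-09-28 10:08:52,016][INFO][SERVER][HandleSignal][jobmgr_thread] "
    "Server received critical signal: 11",
    "[2020-09-28 11:29:33,353][INFO][SERVER][HandleSignal][milvus_server] "
    "Server received critical signal: 15",
]

# timestamp is the first bracketed field, thread name the last one
SIGNAL_RE = re.compile(
    r"^\[([^\]]+)\].*\[([^\]]+)\] Server received critical signal: (\d+)$"
)


def decode_signals(lines):
    """Return (timestamp, thread, signal_name) for each matching log line."""
    decoded = []
    for line in lines:
        match = SIGNAL_RE.match(line)
        if match:
            timestamp, thread, number = match.groups()
            decoded.append((timestamp, thread, signal.Signals(int(number)).name))
    return decoded


decoded = decode_signals(LOG_LINES)
for row in decoded:
    print(row)
```

On Linux, signal 11 is SIGSEGV and signal 15 is SIGTERM, so the first entry (on `jobmgr_thread`) is the crash itself, while the second one, more than an hour later, is a termination request rather than a second crash.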

Labels

kind/bug: Issues or changes related to a bug
severity/critical: Critical, lead to crash, data missing, wrong result, function totally doesn't work.