TiKV running over 2 years may panic #11940

Closed
@you06

Description

Bug Report

What version of TiKV are you using?

nightly (this case also appears on v3.0.5)

What operating system and CPU are you using?

Steps to reproduce

Run TiKV for 795 days; after that, an RPC failure may cause a panic.

A simple reproduction using Chaos Mesh:

tiup playground --tiflash=0 --monitor=false
sudo kill {pid_of_pd}
sudo ./watchmaker -pid {pid_of_tikv} -sec_delta 68719436 -clk_ids=CLOCK_MONOTONIC
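
The `-sec_delta` value lines up with the 795-day figure if one assumes the timer wheel in the backtrace has 6 levels of 64 slots each at 1 ms resolution, which gives a range of 2^36 ms. The panic message confirms a length of 6, but the slot count and tick size are assumptions here, as are the helper names in this sketch. It only checks the arithmetic:

```rust
// Assumption: 6 levels, 64 slots per level, 1 ms ticks.
// Under that assumption the wheel can represent 64^6 = 2^36 ms.
const NUM_LEVELS: u32 = 6;
const SLOT_BITS: u32 = 6; // 2^6 = 64 slots per level
const WHEEL_RANGE_MS: u64 = 1 << (NUM_LEVELS * SLOT_BITS);

/// Wheel range expressed in days.
fn range_in_days() -> f64 {
    WHEEL_RANGE_MS as f64 / 1000.0 / 86_400.0
}

/// Whole seconds left before the shifted monotonic clock reaches the range.
fn headroom_secs(sec_delta: u64) -> u64 {
    WHEEL_RANGE_MS / 1000 - sec_delta
}

fn main() {
    println!("wheel range: {} ms (~{:.2} days)", WHEEL_RANGE_MS, range_in_days());
    println!("headroom after the shift: {} s", headroom_secs(68_719_436));
}
```

Under these assumptions, shifting `CLOCK_MONOTONIC` by 68719436 seconds puts the process a few tens of seconds short of the 2^36 ms boundary, so the failure can be triggered shortly after the shift instead of after two years of uptime.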

What did you expect?

No fatal error with TiKV.

What happened?

TiKV quits with a fatal error.

[2022/02/08 20:32:34.418 +08:00] [ERROR] [util.rs:419] ["request failed, retry"] [err_code=KV:PD:Unknown] [err="Other(\"[components/pd_client/src/tso.rs:85]: TimestampRequest channel is closed\")"]
[2022/02/08 20:32:34.418 +08:00] [ERROR] [util.rs:419] ["request failed, retry"] [err_code=KV:PD:Unknown] [err="Other(\"[components/pd_client/src/tso.rs:85]: TimestampRequest channel is closed\")"]
[2022/02/08 20:32:34.418 +08:00] [ERROR] [util.rs:419] ["request failed, retry"] [err_code=KV:PD:Unknown] [err="Other(\"[components/pd_client/src/tso.rs:85]: TimestampRequest channel is closed\")"]
[2022/02/08 20:32:34.418 +08:00] [ERROR] [util.rs:419] ["request failed, retry"] [err_code=KV:PD:Unknown] [err="Other(\"[components/pd_client/src/tso.rs:85]: TimestampRequest channel is closed\")"]
[2022/02/08 20:32:34.421 +08:00] [FATAL] [lib.rs:465] ["index out of bounds: the len is 6 but the index is 6"] [backtrace="   0: tikv_util::set_panic_hook::{{closure}}\n             at /home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tikv/components/tikv_util/src/lib.rs:464:18\n   1: std::panicking::rust_panic_with_hook\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/panicking.rs:626:17\n   2: std::panicking::begin_panic_handler::{{closure}}\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/panicking.rs:519:13\n   3: std::sys_common::backtrace::__rust_end_short_backtrace\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/sys_common/backtrace.rs:141:18\n   4: rust_begin_unwind\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/panicking.rs:515:5\n   5: core::panicking::panic_fmt\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/core/src/panicking.rs:92:14\n   6: core::panicking::panic_bounds_check\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/core/src/panicking.rs:69:5\n   7: <usize as core::slice::index::SliceIndex<[T]>>::index_mut\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/core/src/slice/index.rs:190:14\n      core::slice::index::<impl core::ops::index::IndexMut<I> for [T]>::index_mut\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/core/src/slice/index.rs:26:9\n      <alloc::vec::Vec<T,A> as core::ops::index::IndexMut<I>>::index_mut\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/alloc/src/vec/mod.rs:2445:9\n      tokio_timer::wheel::Wheel<T>::insert\n             at /rust/registry/src/github.com-1ecc6299db9ec823/tokio-timer-0.2.13/src/wheel/mod.rs:114:9\n      tokio_timer::timer::Timer<T,N>::add_entry\n             at 
/rust/registry/src/github.com-1ecc6299db9ec823/tokio-timer-0.2.13/src/timer/mod.rs:324:15\n   8: tokio_timer::timer::Timer<T,N>::process_queue\n             at /rust/registry/src/github.com-1ecc6299db9ec823/tokio-timer-0.2.13/src/timer/mod.rs:301:21\n   9: <tokio_timer::timer::Timer<T,N> as tokio_executor::park::Park>::park\n             at /rust/registry/src/github.com-1ecc6299db9ec823/tokio-timer-0.2.13/src/timer/mod.rs:361:9\n      tokio_timer::timer::Timer<T,N>::turn\n             at /rust/registry/src/github.com-1ecc6299db9ec823/tokio-timer-0.2.13/src/timer/mod.rs:256:21\n  10: tikv_util::timer::start_global_timer::{{closure}}\n             at /home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tikv/components/tikv_util/src/timer.rs:98:17\n  11: std::sys_common::backtrace::__rust_begin_short_backtrace\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/sys_common/backtrace.rs:125:18\n  12: std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}}\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/thread/mod.rs:476:17\n  13: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/panic.rs:347:9\n  14: std::panicking::try::do_call\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/panicking.rs:401:40\n      std::panicking::try\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/panicking.rs:365:19\n      std::panic::catch_unwind\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/panic.rs:434:14\n      std::thread::Builder::spawn_unchecked::{{closure}}\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/thread/mod.rs:475:30\n      core::ops::function::FnOnce::call_once{{vtable.shim}}\n             at 
/rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/core/src/ops/function.rs:227:5\n  15: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/alloc/src/boxed.rs:1572:9\n      <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/alloc/src/boxed.rs:1572:9\n      std::sys::unix::thread::Thread::new::thread_start\n             at /rustc/2faabf579323f5252329264cc53ba9ff803429a3/library/std/src/sys/unix/thread.rs:91:17\n  16: start_thread\n  17: __GI___clone\n"] [location=/rust/registry/src/github.com-1ecc6299db9ec823/tokio-timer-0.2.13/src/wheel/mod.rs:114] [thread_name=timer]
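
One possible reading of the `index out of bounds: the len is 6 but the index is 6` message, sketched under assumptions: the wheel picks a level from the highest bit in which the elapsed time and the deadline differ, with 6 bits per level. The function below is an illustrative reconstruction, not code copied from tokio-timer, and the names are hypothetical.

```rust
const NUM_LEVELS: usize = 6;

// Sketch: choose a level from the most significant bit in which the
// wheel's elapsed time and the deadline differ (both in milliseconds),
// using 6 bits (64 slots) per level.
fn level_for(elapsed_ms: u64, when_ms: u64) -> usize {
    let masked = elapsed_ms ^ when_ms;
    assert!(masked != 0, "deadline equals elapsed time");
    let significant = 63 - masked.leading_zeros() as usize;
    significant / 6
}

fn main() {
    let range: u64 = 1 << 36; // 64^6 ms, roughly 795 days

    // Both values below 2^36 ms: the level is a valid index.
    let ok = level_for(range - 2_000, range - 1_000);

    // The deadline crosses 2^36 ms while elapsed is still below it:
    // bit 36 differs, so the computed level is 6.
    let bad = level_for(range - 1_000, range + 1_000);

    let levels = vec![0u8; NUM_LEVELS];
    println!("ok level in bounds:  {}", levels.get(ok).is_some());
    println!("bad level in bounds: {}", levels.get(bad).is_some());
}
```

If the real code indexes its level vector directly with such a value, a deadline that crosses the 2^36 ms mark would produce exactly this kind of bounds panic on the `timer` thread.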

Metadata

    Labels

    affects-4.0: This bug affects 4.0.x versions.
    affects-5.0: This bug affects 5.0.x versions.
    affects-5.1: This bug affects 5.1.x versions.
    affects-5.2: This bug affects 5.2.x versions.
    affects-5.3: This bug affects 5.3.x versions.
    affects-5.4: This bug affects the 5.4.x (LTS) versions.
    severity/critical
    type/bug: The issue is confirmed as a bug.
