Panic in analyze sampling #11192

Closed
gengliqi opened this issue Oct 30, 2021 · 6 comments · Fixed by #12696
Labels: affects-5.3, affects-5.4, affects-6.0, affects-6.1, severity/major, type/bug

Comments

@gengliqi
Member

gengliqi commented Oct 30, 2021

Bug Report

What version of TiKV are you using?

https://github.com/tikv/tikv/tree/d0c129dde8d3f41bbac26ab786419bb4b5e24878

What operating system and CPU are you using?

doesn't matter

Steps to reproduce

I ran the tipocket pipeline test and found that several TiKV instances panicked and got stuck in an infinite crash loop.

What did you expect?

TiKV should not panic.

What happened?

[FATAL] [lib.rs:465] ["called `Option::unwrap()` on a `None` value"] [backtrace="   0: tikv_util::set_panic_hook::{{closure}}\n             at data1/glq/tikv/components/tikv_util/src/lib.rs:464:18\n   1: std::panicking::rust_panic_with_hook\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/panicking.rs:626:17\n   2: std::panicking::begin_panic_handler::{{closure}}\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/panicking.rs:517:13\n   3: std::sys_common::backtrace::__rust_end_short_backtrace\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/sys_common/backtrace.rs:141:18\n   4: rust_begin_unwind\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/panicking.rs:515:5\n   5: core::panicking::panic_fmt\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/core/src/panicking.rs:92:14\n   6: core::panicking::panic\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/core/src/panicking.rs:50:5\n   7: core::option::Option<T>::unwrap\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/core/src/option.rs:722:21\n      tikv::coprocessor::statistics::analyze::RowSampleCollector::sampling\n             at data1/glq/tikv/src/coprocessor/statistics/analyze.rs:520:19\n      tikv::coprocessor::statistics::analyze::RowSampleCollector::collect_column\n             at data1/glq/tikv/src/coprocessor/statistics/analyze.rs:512:9\n      tikv::coprocessor::statistics::analyze::RowSampleBuilder<S>::collect_column_stats::{{closure}}\n             at data1/glq/tikv/src/coprocessor/statistics/analyze.rs:399:17\n   8: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/core/src/future/mod.rs:80:19\n      tikv::coprocessor::statistics::analyze::AnalyzeContext<S>::handle_full_sampling::{{closure}}\n             at 
data1/glq/tikv/src/coprocessor/statistics/analyze.rs:104:26\n      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/core/src/future/mod.rs:80:19\n      <tikv::coprocessor::statistics::analyze::AnalyzeContext<S> as tikv::coprocessor::RequestHandler>::handle_request::__handle_request::{{closure}}\n             at data1/glq/tikv/src/coprocessor/statistics/analyze.rs:267:27\n   9: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/core/src/future/mod.rs:80:19\n  10: <core::pin::Pin<P> as core::future::future::Future>::poll\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/core/src/future/future.rs:119:9\n      <tikv::coprocessor::interceptors::deadline::DeadlineChecker<F> as core::future::future::Future>::poll\n             at data1/glq/tikv/src/coprocessor/interceptors/deadline.rs:34:9\n      <tikv::coprocessor::interceptors::tracker::Tracker<F> as core::future::future::Future>::poll\n             at data1/glq/tikv/src/coprocessor/interceptors/tracker.rs:49:19\n  11: <tikv::coprocessor::interceptors::concurrency_limiter::ConcurrencyLimiter<PF,F> as core::future::future::Future>::poll\n             at data1/glq/tikv/src/coprocessor/interceptors/concurrency_limiter.rs:111:15\n  12: tikv::coprocessor::endpoint::Endpoint<E>::handle_unary_request_impl::{{closure}}\n             at data1/glq/tikv/src/coprocessor/endpoint.rs:428:13\n      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/core/src/future/mod.rs:80:19\n      <resource_metering::InTags<T> as core::future::future::Future>::poll\n             at data1/glq/tikv/components/resource_metering/src/lib.rs:157:9\n      tikv::read_pool::ReadPoolHandle::spawn_handle::{{closure}}\n            
 at data1/glq/tikv/src/read_pool.rs:145:27\n      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/core/src/future/mod.rs:80:19\n  13: tikv::read_pool::ReadPoolHandle::spawn::{{closure}}\n             at data1/glq/tikv/src/read_pool.rs:121:25\n      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/core/src/future/mod.rs:80:19\n  14: <yatp::task::future::Runner as yatp::pool::runner::Runner>::handle\n             at root/.cargo/git/checkouts/yatp-e704b73c3ee279b6/0c477fb/src/task/future.rs:261:20\n  15: <tikv_util::yatp_pool::YatpPoolRunner<T> as yatp::pool::runner::Runner>::handle\n             at data1/glq/tikv/components/tikv_util/src/yatp_pool/mod.rs:104:24\n      <yatp::queue::multilevel::MultilevelRunner<R> as yatp::pool::runner::Runner>::handle\n             at root/.cargo/git/checkouts/yatp-e704b73c3ee279b6/0c477fb/src/queue/multilevel.rs:245:19\n      yatp::pool::worker::WorkerThread<T,R>::run\n             at root/.cargo/git/checkouts/yatp-e704b73c3ee279b6/0c477fb/src/pool/worker.rs:48:13\n      yatp::pool::builder::LazyBuilder<T>::build::{{closure}}\n             at root/.cargo/git/checkouts/yatp-e704b73c3ee279b6/0c477fb/src/pool/builder.rs:91:25\n      std::sys_common::backtrace::__rust_begin_short_backtrace\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/sys_common/backtrace.rs:125:18\n  16: std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}}\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/thread/mod.rs:476:17\n      <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/panic.rs:347:9\n      std::panicking::try::do_call\n             at 
rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/panicking.rs:401:40\n      std::panicking::try\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/panicking.rs:365:19\n      std::panic::catch_unwind\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/panic.rs:434:14\n      std::thread::Builder::spawn_unchecked::{{closure}}\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/thread/mod.rs:475:30\n      core::ops::function::FnOnce::call_once{{vtable.shim}}\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/core/src/ops/function.rs:227:5\n  17: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/alloc/src/boxed.rs:1572:9\n      <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/alloc/src/boxed.rs:1572:9\n      std::sys::unix::thread::Thread::new::thread_start\n             at rustc/b70888601af92f6cdc0364abab3446e418b91d36/library/std/src/sys/unix/thread.rs:91:17\n  18: <unknown>\n  19: clone\n"] [location=src/coprocessor/statistics/analyze.rs:520] [thread_name=unified-read-pool-1]

if self.samples.len() < self.max_sample_size {
    need_push = true;
} else if self.samples.peek().unwrap().0.0 < cur_rng {
    need_push = true;
    let (_, evicted) = self.samples.pop().unwrap().0;
    self.memory_usage -= evicted.iter().map(|x| x.capacity()).sum::<usize>();
}

It seems this panic can be triggered only when samples is empty and max_sample_size is 0.
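
To make that condition concrete, here is a minimal sketch. It assumes `samples` is a min-heap built from `BinaryHeap<Reverse<...>>`, which is what the `.0.0` access suggests. `SketchCollector` and `would_unwrap_none` are hypothetical names for illustration, not the real TiKV types.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical, simplified stand-in for the collector in the report.
struct SketchCollector {
    samples: BinaryHeap<Reverse<(i64, Vec<Vec<u8>>)>>,
    max_sample_size: usize,
}

impl SketchCollector {
    fn new(max_sample_size: usize) -> Self {
        SketchCollector {
            samples: BinaryHeap::new(),
            max_sample_size,
        }
    }

    /// Returns true when the `else if` branch would be reached with an
    /// empty heap, i.e. when `peek().unwrap()` would hit `None`.
    fn would_unwrap_none(&self) -> bool {
        // With max_sample_size == 0, `0 < 0` is false, so the first
        // branch is skipped even though the heap holds nothing.
        let first_branch_taken = self.samples.len() < self.max_sample_size;
        !first_branch_taken && self.samples.peek().is_none()
    }
}
```

With any `max_sample_size` of at least 1, an empty heap always takes the first branch, so the `unwrap()` is only reached once the heap holds at least one sample.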

@gengliqi
Member Author

/cc @hicqu @winoros

gengliqi added the type/bug label Oct 30, 2021
@bufferflies
Contributor

bufferflies commented Nov 4, 2021

The cause seems to be that the TiDB version is higher than v5.3 while the TiKV version is lower than v5.3. In v5.3, TiDB changed the analyze behavior to use sampling, so you can set analyze version=1 or upgrade TiKV to solve it.
tidb pr: pingcap/tidb#28999

@Lily2025

Lily2025 commented Nov 4, 2021

/severity major

@gengliqi
Member Author

gengliqi commented Nov 9, 2021

The cause seems to be that the TiDB version is higher than v5.3 while the TiKV version is lower than v5.3. In v5.3, TiDB changed the analyze behavior to use sampling, so you can set analyze version=1 or upgrade TiKV to solve it. tidb pr: pingcap/tidb#28999

Got it. BTW, I think we can make this code more robust.
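
One way the sampling step could be made more robust is sketched below. This is only an illustration and not necessarily what the eventual fix does. `GuardedCollector` and its methods are hypothetical names, and the heap layout (`BinaryHeap<Reverse<(i64, Vec<Vec<u8>>)>>`) is an assumption based on the snippet in the report.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical collector; field names follow the snippet in the report.
struct GuardedCollector {
    samples: BinaryHeap<Reverse<(i64, Vec<Vec<u8>>)>>,
    max_sample_size: usize,
    memory_usage: usize,
}

impl GuardedCollector {
    fn new(max_sample_size: usize) -> Self {
        GuardedCollector {
            samples: BinaryHeap::new(),
            max_sample_size,
            memory_usage: 0,
        }
    }

    fn sample_count(&self) -> usize {
        self.samples.len()
    }

    /// Smallest random key currently kept, if any.
    fn min_rng(&self) -> Option<i64> {
        self.samples.peek().map(|entry| entry.0 .0)
    }

    fn sampling(&mut self, cur_rng: i64, data: Vec<Vec<u8>>) {
        // A zero-sized sample set can never hold a row, so return early
        // instead of reaching the peek/pop path with an empty heap.
        if self.max_sample_size == 0 {
            return;
        }
        let mut need_push = false;
        if self.samples.len() < self.max_sample_size {
            need_push = true;
        } else if let Some(min_rng) = self.min_rng() {
            // No unwrap: an empty heap simply skips this branch.
            if min_rng < cur_rng {
                need_push = true;
                if let Some(Reverse((_, evicted))) = self.samples.pop() {
                    self.memory_usage -=
                        evicted.iter().map(|x| x.capacity()).sum::<usize>();
                }
            }
        }
        if need_push {
            self.memory_usage += data.iter().map(|x| x.capacity()).sum::<usize>();
            self.samples.push(Reverse((cur_rng, data)));
        }
    }
}
```

Calling `sampling` on a collector built with `GuardedCollector::new(0)` leaves it empty instead of panicking, and a non-zero size keeps the eviction behavior of the snippet in the report. Replacing both `unwrap()` calls with pattern matching means an unexpected request parameter can no longer crash the node.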

@winoros
Contributor

winoros commented Nov 22, 2021

/assign winoros

@LykxSassinator
Contributor

/assign LykxSassinator

LykxSassinator added a commit to LykxSassinator/tikv that referenced this issue May 30, 2022
… tidb tried to do sampling with `max_sample_size == 0`

Fix the `panic` error when tidb tried to sample with an abnormal setting - `max_sample_size == 0`.

Signed-off-by: Lucasliang <nkcs_lykx@hotmail.com>
ti-chi-bot added a commit that referenced this issue Jun 24, 2022
close #11192, ref #11425

Signed-off-by: Lucasliang <nkcs_lykx@hotmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
ti-chi-bot added a commit that referenced this issue Jun 24, 2022
…) (#12906)

close #11192, ref #11425, ref #12696

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

Co-authored-by: Lucas <nkcs_lykx@hotmail.com>
Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
ethercflow pushed a commit to ethercflow/tikv that referenced this issue Jun 28, 2022
…#12696)

close tikv#11192, ref tikv#11425

Signed-off-by: Lucasliang <nkcs_lykx@hotmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
9 participants