
apply term should be assigned when applying snapshot #10225

Closed
@BusyJay

Description

Bug Report

What version of TiKV are you using?

v5.0.1, though v4.0.x should also be affected.

What operating system and CPU are you using?

Doesn't matter.

Steps to reproduce

Suppose a region has 3 replicas a, b and c, where a is the leader and c is isolated and has not been initialized. a and b decide to add a new replica d and promote it to voter before it is initialized. After a has applied up to its last log entry, it sends a snapshot to d and initializes it with the latest data. Then a transfers leadership to d. Right after d wins the election, c recovers and requests a snapshot from d before d commits any entry.
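
For reference, a rough sketch of the scenario in test_raftstore style. This is a sketch only: the helper names follow patterns used by existing TiKV integration tests, but treat the exact calls as assumptions, and hitting the precise race (c requesting a snapshot from d before d commits anything of its own term) likely needs failpoints on top of what is shown here.

```rust
// Sketch of the reproduction scenario; not a verified, deterministic test.
use test_raftstore::*;

fn sketch_apply_term_after_snapshot() {
    let mut cluster = new_node_cluster(0, 4);
    let pd_client = cluster.pd_client.clone();
    pd_client.disable_default_operator();

    // Region starts with a single replica a on store 1.
    let r = cluster.run_conf_change();

    // Isolate store 3 so that c (added below) never gets initialized.
    cluster.add_send_filter(IsolationFilterFactory::new(3));

    // a and b form the working majority; c stays an uninitialized peer.
    pd_client.must_add_peer(r, new_peer(2, 2));
    pd_client.must_add_peer(r, new_peer(3, 3));
    cluster.must_put(b"k1", b"v1");

    // Add d as a voter before it is initialized; a catches it up with a snapshot.
    pd_client.must_add_peer(r, new_peer(4, 4));

    // Transfer leadership to d, then lift the isolation so c asks d for a
    // snapshot before d has committed any entry of its own term.
    cluster.must_transfer_leader(r, new_peer(4, 4));
    cluster.clear_send_filters();

    // Without the fix, c panics in RaftLog::commit_to while handling the
    // snapshot d generated with a stale apply term.
    cluster.must_put(b"k2", b"v2");
}
```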

What did you expect?

c should be initialized properly.

What happened?

c panics with the following stack:

[2021/05/20 19:02:04.640 +00:00] [FATAL] [lib.rs:465] ["to_commit 41 is out of range [last_index 0], raft_id: 149208, region_id: 149205"] [backtrace="stack backtrace:
   0: tikv_util::set_panic_hook::{{closure}}
             at /tikv/components/tikv_util/src/lib.rs:464
   1: std::panicking::rust_panic_with_hook
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/panicking.rs:595
   2: std::panicking::begin_panic_handler::{{closure}}
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/panicking.rs:497
   3: std::sys_common::backtrace::__rust_end_short_backtrace
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/sys_common/backtrace.rs:141
   4: rust_begin_unwind
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/panicking.rs:493
   5: std::panicking::begin_panic_fmt
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/panicking.rs:435
   6: raft::raft_log::RaftLog<T>::commit_to
             at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft_log.rs:252
   7: raft::raft::Raft<T>::restore
             at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft.rs:2389
      raft::raft::Raft<T>::handle_snapshot
             at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft.rs:2307
   8: raft::raft::Raft<T>::step_follower
             at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft.rs:2135
      raft::raft::Raft<T>::step
             at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft.rs:1440
   9: raft::raw_node::RawNode<T>::step
             at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raw_node.rs:378
      raftstore::store::peer::Peer<EK,ER>::step
             at /tikv/components/raftstore/src/store/peer.rs:1139
      raftstore::store::fsm::peer::PeerFsmDelegate<EK,ER,T>::on_raft_message
             at /tikv/components/raftstore/src/store/fsm/peer.rs:1228
  10: raftstore::store::fsm::peer::PeerFsmDelegate<EK,ER,T>::handle_msgs
             at /tikv/components/raftstore/src/store/fsm/peer.rs:484
  11: <raftstore::store::fsm::store::RaftPoller<EK,ER,T> as batch_system::batch::PollHandler<raftstore::store::fsm::peer::PeerFsm<EK,ER>,raftstore::store::fsm::store::StoreFsm<EK>>>::handle_normal
             at /tikv/components/raftstore/src/store/fsm/store.rs:829
  12: batch_system::batch::Poller<N,C,Handler>::poll
             at /tikv/components/batch-system/src/batch.rs:295
  13: batch_system::batch::BatchSystem<N,C>::spawn::{{closure}}
             at /tikv/components/batch-system/src/batch.rs:399
      std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/sys_common/backtrace.rs:125
  14: std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}}
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/thread/mod.rs:474
      <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/panic.rs:322
      std::panicking::try::do_call
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/panicking.rs:379
      std::panicking::try
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/panicking.rs:343
      std::panic::catch_unwind
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/panic.rs:396
      std::thread::Builder::spawn_unchecked::{{closure}}
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/thread/mod.rs:473
      core::ops::function::FnOnce::call_once{{vtable.shim}}
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/core/src/ops/function.rs:227
  15: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/alloc/src/boxed.rs:1484
      <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/alloc/src/boxed.rs:1484
      std::sys::unix::thread::Thread::new::thread_start
             at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/sys/unix/thread.rs:71
  16: start_thread
  17: clone
"] [location=/rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft_log.rs:252] [thread_name=raftstore-1-1]

Since v4.0.0, snapshots are generated using the apply index and apply term recorded by the apply worker. The apply worker initializes these fields when a peer fsm has applied a snapshot, but applying a snapshot only updates the apply index and leaves the apply term untouched, so the apply worker holds a stale apply term until it applies the next entry. If a snapshot is generated during that window, its metadata carries the wrong log term.
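
A minimal self-contained sketch of that bookkeeping gap (the types and function names are made up for illustration and are not TiKV's actual code): applying a snapshot advances only the apply index, so a snapshot generated before the next entry is applied carries whatever stale term was left over, 0 for a peer that was just created from a snapshot.

```rust
// Illustrative model only; names do not match TiKV's real types.
#[derive(Default)]
struct ApplyState {
    applied_index: u64,
    applied_term: u64,
}

struct SnapshotMeta {
    index: u64,
    term: u64,
}

// Behaviour described above: applying a snapshot only advances the index,
// leaving the term at whatever stale value it had (0 for a freshly created peer).
fn apply_snapshot_buggy(state: &mut ApplyState, snap_index: u64, _snap_term: u64) {
    state.applied_index = snap_index;
    // state.applied_term is NOT updated: this is the bug.
}

// The fix suggested by the issue title: assign the apply term as well.
fn apply_snapshot_fixed(state: &mut ApplyState, snap_index: u64, snap_term: u64) {
    state.applied_index = snap_index;
    state.applied_term = snap_term;
}

// A later snapshot reuses the recorded apply index/term as its metadata.
fn build_snapshot_meta(state: &ApplyState) -> SnapshotMeta {
    SnapshotMeta {
        index: state.applied_index,
        term: state.applied_term,
    }
}

fn main() {
    // d is created from a snapshot at (index 41, term 6) and asked to
    // generate a snapshot for c before applying any further entry.
    let mut buggy = ApplyState::default();
    apply_snapshot_buggy(&mut buggy, 41, 6);
    let meta = build_snapshot_meta(&buggy);
    println!("buggy: index={}, term={}", meta.index, meta.term); // term=0

    let mut fixed = ApplyState::default();
    apply_snapshot_fixed(&mut fixed, 41, 6);
    let meta = build_snapshot_meta(&fixed);
    println!("fixed: index={}, term={}", meta.index, meta.term); // term=6
}
```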

Because raft-rs returns 0 for any term query on entries beyond its log, the term check can succeed: the receiver fast-forwards past the snapshot and commits to its index, and then panics because it has no such log entry at all. Generally, PD only promotes initialized learners to voters, so a learner has to apply at least one log entry before it can start to campaign, and the apply term is assigned the correct value before it sends any snapshot.
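
On the receiver side, the failing check can be modeled like this (a simplified, self-contained model of the restore behaviour described above, not raft-rs's actual code): the snapshot carries term 0, the term query for an index beyond the empty log also yields 0, so the follower concludes it already has the snapshot's data and merely fast-forwards its commit index, tripping the same out-of-range assertion as in the stack above.

```rust
// Simplified model of the receiver-side check; not raft-rs's actual code.
struct LocalLog {
    terms: Vec<u64>, // terms[i] is the term of log entry i + 1
    committed: u64,
}

impl LocalLog {
    fn last_index(&self) -> u64 {
        self.terms.len() as u64
    }

    // A term query for an index the log does not contain yields 0 instead of
    // an error, mirroring the behaviour described above.
    fn term(&self, idx: u64) -> u64 {
        if idx == 0 || idx > self.last_index() {
            0
        } else {
            self.terms[(idx - 1) as usize]
        }
    }

    fn match_term(&self, idx: u64, term: u64) -> bool {
        self.term(idx) == term
    }

    fn commit_to(&mut self, to_commit: u64) {
        // The assertion that fires in the panic above.
        assert!(
            to_commit <= self.last_index(),
            "to_commit {} is out of range [last_index {}]",
            to_commit,
            self.last_index()
        );
        self.committed = to_commit;
    }
}

fn handle_snapshot(log: &mut LocalLog, snap_index: u64, snap_term: u64) {
    if log.match_term(snap_index, snap_term) {
        // "I already have everything the snapshot covers": just fast-forward
        // the commit index instead of restoring the snapshot data.
        log.commit_to(snap_index);
    } else {
        // Restore from the snapshot (omitted in this sketch).
    }
}

fn main() {
    // c has an empty log; the snapshot from d carries a stale term of 0.
    let mut log = LocalLog { terms: vec![], committed: 0 };
    handle_snapshot(&mut log, 41, 0); // panics: to_commit 41 out of range [last_index 0]
}
```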

However, if a follower that has been isolated for a long time becomes leader right after accepting a snapshot, it can still generate snapshots with wrong metadata. In this case the stale term may not be zero, and the receiver can apply the data with the wrong metadata. The receiver should still detect the conflict when it receives the next empty entry and request a new snapshot, so there seems to be no harm beyond the wrong metadata itself.

Metadata

Labels

affects-4.0: This bug affects 4.0.x versions.
affects-5.0: This bug affects 5.0.x versions.
affects-5.1: This bug affects 5.1.x versions.
severity/critical
sig/raft: Component: Raft, RaftStore, etc.
type/bug: The issue is confirmed as a bug.
