apply term should be assigned when applying snapshot #10225
Description
Bug Report
What version of TiKV are you using?
v5.0.1, though v4.0.x should also be affected.
What operating system and CPU are you using?
Doesn't matter.
Steps to reproduce
Suppose a region contains 3 replicas a, b and c, a is leader, c is isolated and has not been initialized. a and b decides to add a new replica d, and promote it as voter before it's initialized. After a has applied to last log, it sends a snapshot to d and initialize it with latest data. And then a transfer leader to d. Right after d wins the election, c is recovered and requests snapshot from d before d commits any entry.
What did you expect?
c will be initialized properly.
What did happened?
c will panic with following stack.
[2021/05/20 19:02:04.640 +00:00] [FATAL] [lib.rs:465] ["to_commit 41 is out of range [last_index 0], raft_id: 149208, region_id: 149205"] [backtrace="stack backtrace:
0: tikv_util::set_panic_hook::{{closure}}
at /tikv/components/tikv_util/src/lib.rs:464
1: std::panicking::rust_panic_with_hook
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/panicking.rs:595
2: std::panicking::begin_panic_handler::{{closure}}
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/panicking.rs:497
3: std::sys_common::backtrace::__rust_end_short_backtrace
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/sys_common/backtrace.rs:141
4: rust_begin_unwind
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/panicking.rs:493
5: std::panicking::begin_panic_fmt
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/panicking.rs:435
6: raft::raft_log::RaftLog<T>::commit_to
at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft_log.rs:252
7: raft::raft::Raft<T>::restore
at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft.rs:2389
raft::raft::Raft<T>::handle_snapshot
at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft.rs:2307
8: raft::raft::Raft<T>::step_follower
at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft.rs:2135
raft::raft::Raft<T>::step
at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft.rs:1440
9: raft::raw_node::RawNode<T>::step
at /rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raw_node.rs:378
raftstore::store::peer::Peer<EK,ER>::step
at /tikv/components/raftstore/src/store/peer.rs:1139
raftstore::store::fsm::peer::PeerFsmDelegate<EK,ER,T>::on_raft_message
at /tikv/components/raftstore/src/store/fsm/peer.rs:1228
10: raftstore::store::fsm::peer::PeerFsmDelegate<EK,ER,T>::handle_msgs
at /tikv/components/raftstore/src/store/fsm/peer.rs:484
11: <raftstore::store::fsm::store::RaftPoller<EK,ER,T> as batch_system::batch::PollHandler<raftstore::store::fsm::peer::PeerFsm<EK,ER>,raftstore::store::fsm::store::StoreFsm<EK>>>::handle_normal
at /tikv/components/raftstore/src/store/fsm/store.rs:829
12: batch_system::batch::Poller<N,C,Handler>::poll
at /tikv/components/batch-system/src/batch.rs:295
13: batch_system::batch::BatchSystem<N,C>::spawn::{{closure}}
at /tikv/components/batch-system/src/batch.rs:399
std::sys_common::backtrace::__rust_begin_short_backtrace
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/sys_common/backtrace.rs:125
14: std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}}
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/thread/mod.rs:474
<std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/panic.rs:322
std::panicking::try::do_call
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/panicking.rs:379
std::panicking::try
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/panicking.rs:343
std::panic::catch_unwind
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/panic.rs:396
std::thread::Builder::spawn_unchecked::{{closure}}
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/std/src/thread/mod.rs:473
core::ops::function::FnOnce::call_once{{vtable.shim}}
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/core/src/ops/function.rs:227
15: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/alloc/src/boxed.rs:1484
<alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35/library/alloc/src/boxed.rs:1484
std::sys::unix::thread::Thread::new::thread_start
at /rustc/bc39d4d9c514e5fdb40a5782e6ca08924f979c35//library/std/src/sys/unix/thread.rs:71
16: start_thread
17: clone
"] [location=/rust/git/checkouts/raft-rs-42b8049ef2e3af07/91a60ce/src/raft_log.rs:252] [thread_name=raftstore-1-1]
After v4.0.0, snapshot are generated using apply index and apply term from apply worker. Apply worker initializes these fields when a peer fsm has applied the snapshot. But applying snapshot only updates apply index and leave apply term untouched, so apply worker will set apply term to a stale value until it applies next entry. If a snapshot is generated at that time, the snapshot will be set to a wrong log term.
Because raft-rs return 0 for any term query on entries beyond its logs, so term check can succeed. Receiver will fast-forward the snapshot and commit the index. Hence panic as receiver has no such log at all. Generally, PD only promotes initialized learner to voter, so the learner has to apply at least one log before it starts to campaign, apply term will be assigned to correct value before sending snapshots.
But if a long time isolated follower becomes leader right after accepting snapshot, it can still generate snapshots with wrong metadata. In this case, the stale term may not be zero and receiver can apply the data with wrong metadata. Though receiver should be able to find conflict when accepting next empty entry and request a new snapshot again. There seems to be no harm except wrong metadata itself.