Mis-reuse StoragePool::max_data_page_id cause data corruption after changing tiflash replica number #8695

Closed
JaySon-Huang opened this issue Jan 17, 2024 · 3 comments · Fixed by #8698
Labels: affects-7.5, component/storage, report/customer (customers have encountered this bug), severity/critical, type/bug (the issue is confirmed as a bug)

JaySon-Huang commented Jan 17, 2024

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. Create a table, insert some data, add tiflash replica
  2. Set tiflash replica to 0, and wait for its data on TiFlash to be GCed
  3. Set tiflash replica to 1 again, and do some queries
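
The steps above could be driven with statements along these lines. This is only a sketch: the schema `test`, the table `t`, its columns, and the inserted rows are placeholders, not the table from the original report. `ALTER TABLE ... SET TIFLASH REPLICA` and the `READ_FROM_STORAGE` hint are standard TiDB syntax. The wait in step 2 is manual; in the logs below, the physical drop happened about 15 minutes after the tombstone. Whether a table this small is enough to trigger the corruption is not established here.

```python
def repro_statements(db: str = "test", table: str = "t") -> list[str]:
    """Build SQL for the three reproduce steps (all names are placeholders)."""
    fq = f"`{db}`.`{table}`"
    return [
        # Step 1: create a table, insert some data, add a TiFlash replica.
        f"CREATE TABLE {fq} (id INT PRIMARY KEY, v VARCHAR(64));",
        f"INSERT INTO {fq} VALUES (1, 'a'), (2, 'b'), (3, 'c');",
        f"ALTER TABLE {fq} SET TIFLASH REPLICA 1;",
        # Step 2: remove the replica, then wait (outside this script) until
        # TiFlash logs "Physically dropped table" for this table.
        f"ALTER TABLE {fq} SET TIFLASH REPLICA 0;",
        # Step 3: add the replica back and run a query served by TiFlash.
        f"ALTER TABLE {fq} SET TIFLASH REPLICA 1;",
        f"SELECT /*+ READ_FROM_STORAGE(TIFLASH[{table}]) */ COUNT(*) FROM {fq};",
    ]


if __name__ == "__main__":
    print("\n".join(repro_statements()))
```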

2. What did you expect to see? (Required)

3. What did you see instead (Required)

# table is tombstoned after setting tiflash replica to 0
[2024/01/11 14:05:08.373 +08:00] [INFO] [SchemaBuilder.cpp:1203] ["Tombstone table db_2855.testdemo begin, table_id=5578"] [source="keyspace=4294967295"] [thread_id=434]
[2024/01/11 14:05:08.382 +08:00] [INFO] [SchemaBuilder.cpp:1225] ["Tombstone table db_2855.testdemo end, table_id=5578"] [source="keyspace=4294967295"] [thread_id=434]
...
# table gets physically removed
[2024/01/11 14:20:14.622 +08:00] [INFO] [SchemaSyncService.cpp:218] ["Schema GC begin, last_safepoint=446943112160346112 safepoint=446943269446746112"] [source="keyspace=4294967295"] [thread_id=58]
[2024/01/11 14:20:14.622 +08:00] [INFO] [SchemaSyncService.cpp:256] ["Detect stale table, database_name=db_2855 table_name=t_5578 database_tombstone=0 table_tombstone=446943227634515973 safepoint=446943269446746112"] [thread_id=58]
[2024/01/11 14:20:14.623 +08:00] [INFO] [SchemaSyncService.cpp:295] ["Physically dropping table, table_tombstone=446943227634515973 safepoint=446943269446746112 baocai.testdemo, database_id=2855 table_id=5578"] [source="keyspace=4294967295"] [thread_id=58]
[2024/01/11 14:20:14.646 +08:00] [INFO] [SchemaSyncService.cpp:306] ["Physically dropped table baocai.testdemo, database_id=2855 table_id=5578"] [source="keyspace=4294967295"] [thread_id=58]
[2024/01/11 14:20:14.646 +08:00] [INFO] [SchemaSyncService.cpp:383] ["Schema GC done, tables_removed=1 databases_removed=0 safepoint=446943269446746112"] [source="keyspace=4294967295"] [thread_id=58]
...
# re-create the table after setting tiflash replica to 1
[2024/01/11 14:30:51.736 +08:00] [INFO] [SchemaBuilder.cpp:1163] ["Create table baocai.testdemo (database_id=2855 table_id=5578) with statement: CREATE TABLE `db_2855`.`t_5578`(...)"] [source="keyspace=4294967295"] [thread_id=417]
[2024/01/11 14:30:51.747 +08:00] [INFO] [SchemaBuilder.cpp:1182] ["Creat table baocai.testdemo end, database_id=2855 table_id=5578"] [source="keyspace=4294967295"] [thread_id=417]
[2024/01/11 14:30:51.747 +08:00] [INFO] [TiDBSchemaSyncer.cpp:247] ["Sync table schema end after syncSchemas, table_id=5578"] [source="keyspace=4294967295"] [thread_id=417]

New queries may fail with exceptions like:

[2024/01/11 14:51:05.844 +08:00] [ERROR] [SegmentReader.cpp:119] ["ErrMsg: Unknown compression method: 8: (while reading from DTFile: /data2/tidb-data/tiflash-9001/data/t_5578/stable/dmf_37480) StackTrace 
       0x1ec569e    DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) [tiflash+32265886]
                    dbms/src/Common/Exception.h:46
       0x1d97d6f    DB::CompressedReadBufferBase<false>::readCompressedData(unsigned long&, unsigned long&) [tiflash+31030639]
                    dbms/src/IO/CompressedReadBufferBase.cpp:87
       0x1dc6093    DB::CompressedReadBufferFromFileProvider<false>::nextImpl() [tiflash+31219859]
                    dbms/src/Encryption/CompressedReadBufferFromFileProvider.cpp:32
       0x1dc6805    DB::CompressedReadBufferFromFileProvider<false>::seek(unsigned long, unsigned long) [tiflash+31221765]
                    dbms/src/Encryption/CompressedReadBufferFromFileProvider.cpp:128
       0x7685530    std::__1::__function::__func<DB::DM::DMFileReader::readFromDisk(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::mutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, bool)::$_5, std::__1::allocator<DB::DM::DMFileReader::readFromDisk(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::mutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, bool)::$_5>, DB::ReadBuffer* (std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&)>::operator()(std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&) [tiflash+124278064]
                    /usr/local/bin/../include/c++/v1/__functional/function.h:345
       0x2085cea    DB::IDataType::deserializeBinaryBulkWithMultipleStreams(DB::IColumn&, std::__1::function<DB::ReadBuffer* (std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&)> const&, unsigned long, double, bool, std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> >&) const [tiflash+34102506]
                    dbms/src/DataTypes/IDataType.h:162
       0x76826b6    DB::DM::DMFileReader::readColumn(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, unsigned long) [tiflash+124266166]
                    dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp:824
       0x767fb0d    DB::DM::DMFileReader::read() [tiflash+124254989]
                    dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp:703
       0x76736a5    DB::DM::DMFileBlockInputStream::read() [tiflash+124204709]
                    dbms/src/Storages/DeltaMerge/File/DMFileBlockInputStream.h:62
       0x7618ac0    DB::DM::ConcatSkippableBlockInputStream<true>::read() [tiflash+123833024]
                    dbms/src/Storages/DeltaMerge/SkippableBlockInputStream.h:184
       0x75d8ee0    DB::DM::DMRowKeyFilterBlockInputStream<true>::read() [tiflash+123571936]
                    dbms/src/Storages/DeltaMerge/RowKeyFilter.h:219
       0x75b642b    DB::DM::readNextBlock(std::__1::shared_ptr<DB::IBlockInputStream> const&) [tiflash+123429931]
                    dbms/src/Storages/DeltaMerge/DeltaMergeHelpers.h:260
       0x1dba745    DB::DM::DMVersionFilterBlockInputStream<0>::initNextBlock() [tiflash+31172421]
                    dbms/src/Storages/DeltaMerge/DMVersionFilterBlockInputStream.h:143
       0x1db9695    DB::DM::DMVersionFilterBlockInputStream<0>::read(DB::PODArray<unsigned char, 4096ul, Allocator<false>, 15ul, 16ul>*&, bool) [tiflash+31168149]
                    dbms/src/Storages/DeltaMerge/DMVersionFilterBlockInputStream.cpp:51
       0x762bfb4    DB::DM::BitmapFilter::set(std::__1::shared_ptr<DB::IBlockInputStream>&) [tiflash+123912116]
                    dbms/src/Storages/DeltaMerge/BitmapFilter/BitmapFilter.cpp:32
       0x75b15ec    DB::DM::Segment::buildBitmapFilter(DB::DM::DMContext const&, std::__1::shared_ptr<DB::DM::SegmentSnapshot> const&, std::__1::vector<DB::DM::RowKeyRange, std::__1::allocator<DB::DM::RowKeyRange> > const&, std::__1::shared_ptr<DB::DM::RSOperator> const&, unsigned long, unsigned long) [tiflash+123409900]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:2711
       0x7590d23    DB::DM::Segment::getBitmapFilterInputStream(DB::DM::DMContext const&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> > const&, std::__1::shared_ptr<DB::DM::SegmentSnapshot> const&, std::__1::vector<DB::DM::RowKeyRange, std::__1::allocator<DB::DM::RowKeyRange> > const&, std::__1::shared_ptr<DB::DM::PushDownFilter> const&, unsigned long, unsigned long, unsigned long) [tiflash+123276579]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:3147
       0x758ecdd    DB::DM::Segment::getInputStream(DB::DM::ReadMode const&, DB::DM::DMContext const&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> > const&, std::__1::shared_ptr<DB::DM::SegmentSnapshot> const&, std::__1::vector<DB::DM::RowKeyRange, std::__1::allocator<DB::DM::RowKeyRange> > const&, std::__1::shared_ptr<DB::DM::PushDownFilter> const&, unsigned long, unsigned long) [tiflash+123268317]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:791
       0x76089ab    DB::DM::SegmentReadTaskPool::buildInputStream(std::__1::shared_ptr<DB::DM::SegmentReadTask>&) [tiflash+123767211]
                    dbms/src/Storages/DeltaMerge/SegmentReadTaskPool.cpp:177
       0x76fe27b    DB::DM::MergedTask::initOnce() [tiflash+124772987]
                    dbms/src/Storages/DeltaMerge/ReadThread/MergedTask.cpp:54
       0x7704c77    DB::DM::SegmentReader::run() [tiflash+124800119]
                    dbms/src/Storages/DeltaMerge/ReadThread/SegmentReader.cpp:149
       0x77063b2    void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (DB::DM::SegmentReader::*)(), DB::DM::SegmentReader*> >(void*) [tiflash+124806066]
                    /usr/local/bin/../include/c++/v1/thread:291
  0x7fa3d27fafed    <unknown symbol> [libpthread.so.0+36845]
  0x7fa3d265016f    clone [libc.so.6+1036655]"] [thread_id=2]

[2024/01/11 14:51:37.217 +08:00] [ERROR] [Exception.cpp:91] ["Code: 89, e.displayText() = DB::Exception: Unknown compression method: 8: (while reading from DTFile: /data1/tidb-data/tiflash-9000/data/t_5578/stable/dmf_36863): Error while GC segment, segment=<segment_id=1 epoch=10 range=[-9223372036854775808,1161418) next_segment_id=4250 delta_rows=0 delta_bytes=0 delta_deletes=0 stable_file=dmf_36863 stable_rows=180224 stable_bytes=33765295 dmf_rows=797819 dmf_bytes=149471027 dmf_packs=98> table=t_5578, e.what() = DB::Exception, Stack trace:
       0x1ec569e    DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) [tiflash+32265886]
                    dbms/src/Common/Exception.h:46
       0x1d97d6f    DB::CompressedReadBufferBase<false>::readCompressedData(unsigned long&, unsigned long&) [tiflash+31030639]
                    dbms/src/IO/CompressedReadBufferBase.cpp:87
       0x1dc6093    DB::CompressedReadBufferFromFileProvider<false>::nextImpl() [tiflash+31219859]
                    dbms/src/Encryption/CompressedReadBufferFromFileProvider.cpp:32
       0x1dc6805    DB::CompressedReadBufferFromFileProvider<false>::seek(unsigned long, unsigned long) [tiflash+31221765]
                    dbms/src/Encryption/CompressedReadBufferFromFileProvider.cpp:128
       0x7685530    std::__1::__function::__func<DB::DM::DMFileReader::readFromDisk(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::mutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, bool)::$_5, std::__1::allocator<DB::DM::DMFileReader::readFromDisk(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::mutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, bool)::$_5>, DB::ReadBuffer* (std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&)>::operator()(std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&) [tiflash+124278064]
                    /usr/local/bin/../include/c++/v1/__functional/function.h:345
       0x2085cea    DB::IDataType::deserializeBinaryBulkWithMultipleStreams(DB::IColumn&, std::__1::function<DB::ReadBuffer* (std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&)> const&, unsigned long, double, bool, std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> >&) const [tiflash+34102506]
                    dbms/src/DataTypes/IDataType.h:162
       0x76826b6    DB::DM::DMFileReader::readColumn(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, unsigned long) [tiflash+124266166]
                    dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp:824
       0x767fb0d    DB::DM::DMFileReader::read() [tiflash+124254989]
                    dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp:703
       0x76736a5    DB::DM::DMFileBlockInputStream::read() [tiflash+124204709]
                    dbms/src/Storages/DeltaMerge/File/DMFileBlockInputStream.h:62
       0x75d7d9d    DB::DM::ConcatSkippableBlockInputStream<false>::read() [tiflash+123567517]
                    dbms/src/Storages/DeltaMerge/SkippableBlockInputStream.h:184
       0x75de30f    DB::DM::DeltaMergeBlockInputStream<DB::DM::DeltaValueReader, DB::DM::DTCompactedEntries<55ul, 20ul, 3ul>::Iterator, false, false>::fillStableBlockIfNeeded() [tiflash+123593487]
                    dbms/src/Storages/DeltaMerge/DeltaMerge.h:433
       0x75dc7f3    DB::DM::DeltaMergeBlockInputStream<DB::DM::DeltaValueReader, DB::DM::DTCompactedEntries<55ul, 20ul, 3ul>::Iterator, false, false>::read() [tiflash+123586547]
                    dbms/src/Storages/DeltaMerge/DeltaMerge.h:208
       0x75d8ee0    DB::DM::DMRowKeyFilterBlockInputStream<true>::read() [tiflash+123571936]
                    dbms/src/Storages/DeltaMerge/RowKeyFilter.h:219
       0x75b642b    DB::DM::readNextBlock(std::__1::shared_ptr<DB::IBlockInputStream> const&) [tiflash+123429931]
                    dbms/src/Storages/DeltaMerge/DeltaMergeHelpers.h:260
       0x75b517e    DB::DM::PKSquashingBlockInputStream<false>::read() [tiflash+123425150]
                    dbms/src/Storages/DeltaMerge/PKSquashingBlockInputStream.h:74
       0x75b642b    DB::DM::readNextBlock(std::__1::shared_ptr<DB::IBlockInputStream> const&) [tiflash+123429931]
                    dbms/src/Storages/DeltaMerge/DeltaMergeHelpers.h:260
       0x1dbdf85    DB::DM::DMVersionFilterBlockInputStream<1>::initNextBlock() [tiflash+31186821]
                    dbms/src/Storages/DeltaMerge/DMVersionFilterBlockInputStream.h:143
       0x1dbbaa8    DB::DM::DMVersionFilterBlockInputStream<1>::read(DB::PODArray<unsigned char, 4096ul, Allocator<false>, 15ul, 16ul>*&, bool) [tiflash+31177384]
                    dbms/src/Storages/DeltaMerge/DMVersionFilterBlockInputStream.cpp:51
       0x75e642a    DB::ConcatBlockInputStream::readImpl(DB::PODArray<unsigned char, 4096ul, Allocator<false>, 15ul, 16ul>*&, bool) [tiflash+123626538]
                    dbms/src/DataStreams/ConcatBlockInputStream.h:54
       0x75e6394    DB::ConcatBlockInputStream::readImpl() [tiflash+123626388]
                    dbms/src/DataStreams/ConcatBlockInputStream.h:45
       0x7774e85    DB::IProfilingBlockInputStream::read(DB::PODArray<unsigned char, 4096ul, Allocator<false>, 15ul, 16ul>*&, bool) [tiflash+125259397]
                    dbms/src/DataStreams/IProfilingBlockInputStream.cpp:82
       0x7774b85    DB::IProfilingBlockInputStream::read() [tiflash+125258629]
                    dbms/src/DataStreams/IProfilingBlockInputStream.cpp:48
       0x75b642b    DB::DM::readNextBlock(std::__1::shared_ptr<DB::IBlockInputStream> const&) [tiflash+123429931]
                    dbms/src/Storages/DeltaMerge/DeltaMergeHelpers.h:260
       0x1dbdf85    DB::DM::DMVersionFilterBlockInputStream<1>::initNextBlock() [tiflash+31186821]
                    dbms/src/Storages/DeltaMerge/DMVersionFilterBlockInputStream.h:143
       0x1dbbaa8    DB::DM::DMVersionFilterBlockInputStream<1>::read(DB::PODArray<unsigned char, 4096ul, Allocator<false>, 15ul, 16ul>*&, bool) [tiflash+31177384]
                    dbms/src/Storages/DeltaMerge/DMVersionFilterBlockInputStream.cpp:51
       0x75e4014    DB::DM::DMVersionFilterBlockInputStream<1>::read() [tiflash+123617300]
                    dbms/src/Storages/DeltaMerge/DMVersionFilterBlockInputStream.h:95
       0x758519d    DB::DM::createNewStable(DB::DM::DMContext&, std::__1::shared_ptr<std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> > > const&, std::__1::shared_ptr<DB::IBlockInputStream> const&, unsigned long, DB::DM::WriteBatches&) [tiflash+123228573]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:202
       0x75a3519    DB::DM::Segment::prepareMerge(DB::DM::DMContext&, std::__1::shared_ptr<std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> > > const&, std::__1::vector<std::__1::shared_ptr<DB::DM::Segment>, std::__1::allocator<std::__1::shared_ptr<DB::DM::Segment> > > const&, std::__1::vector<std::__1::shared_ptr<DB::DM::SegmentSnapshot>, std::__1::allocator<std::__1::shared_ptr<DB::DM::SegmentSnapshot> > > const&, DB::DM::WriteBatches&) [tiflash+123352345]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:2085
       0x75545d1    DB::DM::DeltaMergeStore::segmentMerge(DB::DM::DMContext&, std::__1::vector<std::__1::shared_ptr<DB::DM::Segment>, std::__1::allocator<std::__1::shared_ptr<DB::DM::Segment> > > const&, DB::DM::DeltaMergeStore::SegmentMergeReason) [tiflash+123028945]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore_InternalSegment.cpp:308
       0x7547f9c    DB::DM::DeltaMergeStore::onSyncGc(long, DB::DM::GCOptions const&) [tiflash+122978204]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore_InternalBg.cpp:927
       0x8a359c6    DB::GCManager::work() [tiflash+144923078]
                    dbms/src/Storages/GCManager.cpp:108
       0x7ffeebb    void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, DB::BackgroundProcessingPool::BackgroundProcessingPool(int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >)::$_1> >(void*) [tiflash+134213307]
                    /usr/local/bin/../include/c++/v1/thread:291"] [source="bool DB::GCManager::work()"] [thread_id=430]
[2024/01/11 14:51:05.838 +08:00] [ERROR] [SegmentReader.cpp:119] ["ErrMsg: checksum framed file /data2/tidb-data/tiflash-9001/data/t_5578/stable/dmf_37281/%2D1.dat is not seekable: (while reading from DTFile: /data2/tidb-data/tiflash-9001/data/t_5578/stable/dmf_37281) StackTrace 
       0x1ed9311    DB::TiFlashException::TiFlashException(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::TiFlashError const&) [tiflash+32346897]
                    dbms/src/Common/TiFlashException.h:263
       0x1d9525f    DB::FramedChecksumReadBuffer<DB::Digest::XXH3>::doSeek(long, int) [tiflash+31019615]
                    dbms/src/IO/ChecksumBuffer.h:446
       0x1dc67f1    DB::CompressedReadBufferFromFileProvider<false>::seek(unsigned long, unsigned long) [tiflash+31221745]
                    dbms/src/Encryption/CompressedReadBufferFromFileProvider.cpp:125
       0x7685530    std::__1::__function::__func<DB::DM::DMFileReader::readFromDisk(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::mutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, bool)::$_5, std::__1::allocator<DB::DM::DMFileReader::readFromDisk(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::mutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, bool)::$_5>, DB::ReadBuffer* (std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&)>::operator()(std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&) [tiflash+124278064]
                    /usr/local/bin/../include/c++/v1/__functional/function.h:345
       0x2085cea    DB::IDataType::deserializeBinaryBulkWithMultipleStreams(DB::IColumn&, std::__1::function<DB::ReadBuffer* (std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&)> const&, unsigned long, double, bool, std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> >&) const [tiflash+34102506]
                    dbms/src/DataTypes/IDataType.h:162
       0x76826b6    DB::DM::DMFileReader::readColumn(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, unsigned long) [tiflash+124266166]
                    dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp:824
       0x767fb0d    DB::DM::DMFileReader::read() [tiflash+124254989]
                    dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp:703
       0x76736a5    DB::DM::DMFileBlockInputStream::read() [tiflash+124204709]
                    dbms/src/Storages/DeltaMerge/File/DMFileBlockInputStream.h:62
       0x7618ac0    DB::DM::ConcatSkippableBlockInputStream<true>::read() [tiflash+123833024]
                    dbms/src/Storages/DeltaMerge/SkippableBlockInputStream.h:184
       0x75d8ee0    DB::DM::DMRowKeyFilterBlockInputStream<true>::read() [tiflash+123571936]
                    dbms/src/Storages/DeltaMerge/RowKeyFilter.h:219
       0x75b642b    DB::DM::readNextBlock(std::__1::shared_ptr<DB::IBlockInputStream> const&) [tiflash+123429931]
                    dbms/src/Storages/DeltaMerge/DeltaMergeHelpers.h:260
       0x1dba745    DB::DM::DMVersionFilterBlockInputStream<0>::initNextBlock() [tiflash+31172421]
                    dbms/src/Storages/DeltaMerge/DMVersionFilterBlockInputStream.h:143
       0x1db9695    DB::DM::DMVersionFilterBlockInputStream<0>::read(DB::PODArray<unsigned char, 4096ul, Allocator<false>, 15ul, 16ul>*&, bool) [tiflash+31168149]
                    dbms/src/Storages/DeltaMerge/DMVersionFilterBlockInputStream.cpp:51
       0x762bfb4    DB::DM::BitmapFilter::set(std::__1::shared_ptr<DB::IBlockInputStream>&) [tiflash+123912116]
                    dbms/src/Storages/DeltaMerge/BitmapFilter/BitmapFilter.cpp:32
       0x75b15ec    DB::DM::Segment::buildBitmapFilter(DB::DM::DMContext const&, std::__1::shared_ptr<DB::DM::SegmentSnapshot> const&, std::__1::vector<DB::DM::RowKeyRange, std::__1::allocator<DB::DM::RowKeyRange> > const&, std::__1::shared_ptr<DB::DM::RSOperator> const&, unsigned long, unsigned long) [tiflash+123409900]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:2711
       0x7590d23    DB::DM::Segment::getBitmapFilterInputStream(DB::DM::DMContext const&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> > const&, std::__1::shared_ptr<DB::DM::SegmentSnapshot> const&, std::__1::vector<DB::DM::RowKeyRange, std::__1::allocator<DB::DM::RowKeyRange> > const&, std::__1::shared_ptr<DB::DM::PushDownFilter> const&, unsigned long, unsigned long, unsigned long) [tiflash+123276579]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:3147
       0x758ecdd    DB::DM::Segment::getInputStream(DB::DM::ReadMode const&, DB::DM::DMContext const&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> > const&, std::__1::shared_ptr<DB::DM::SegmentSnapshot> const&, std::__1::vector<DB::DM::RowKeyRange, std::__1::allocator<DB::DM::RowKeyRange> > const&, std::__1::shared_ptr<DB::DM::PushDownFilter> const&, unsigned long, unsigned long) [tiflash+123268317]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:791
       0x76089ab    DB::DM::SegmentReadTaskPool::buildInputStream(std::__1::shared_ptr<DB::DM::SegmentReadTask>&) [tiflash+123767211]
                    dbms/src/Storages/DeltaMerge/SegmentReadTaskPool.cpp:177
       0x76fe27b    DB::DM::MergedTask::initOnce() [tiflash+124772987]
                    dbms/src/Storages/DeltaMerge/ReadThread/MergedTask.cpp:54
       0x7704c77    DB::DM::SegmentReader::run() [tiflash+124800119]
                    dbms/src/Storages/DeltaMerge/ReadThread/SegmentReader.cpp:149
       0x77063b2    void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (DB::DM::SegmentReader::*)(), DB::DM::SegmentReader*> >(void*) [tiflash+124806066]
                    /usr/local/bin/../include/c++/v1/thread:291
  0x7fa3d27fafed    <unknown symbol> [libpthread.so.0+36845]
  0x7fa3d265016f    clone [libc.so.6+1036655]"] [thread_id=9]
[2024/01/11 14:57:00.512 +08:00] [ERROR] [Exception.cpp:91] ["Code: 361, e.displayText() = DB::Exception: Seek position is beyond the decompressed block (pos: 65277807, block size: 0): (while reading from DTFile: /data2/tidb-data/tiflash-9001/data/t_5578/stable/dmf_37501), e.what() = DB::Exception, Stack trace:
       0x1ec569e    DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) [tiflash+32265886]
                    dbms/src/Common/Exception.h:46
       0x1dc6941    DB::CompressedReadBufferFromFileProvider<false>::seek(unsigned long, unsigned long) [tiflash+31222081]
                    dbms/src/Encryption/CompressedReadBufferFromFileProvider.cpp:131
       0x7685530    std::__1::__function::__func<DB::DM::DMFileReader::readFromDisk(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::mutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, bool)::$_5, std::__1::allocator<DB::DM::DMFileReader::readFromDisk(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::mutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, bool)::$_5>, DB::ReadBuffer* (std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&)>::operator()(std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&) [tiflash+124278064]
                    /usr/local/bin/../include/c++/v1/__functional/function.h:345
       0x2085cea    DB::IDataType::deserializeBinaryBulkWithMultipleStreams(DB::IColumn&, std::__1::function<DB::ReadBuffer* (std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> > const&)> const&, unsigned long, double, bool, std::__1::vector<DB::IDataType::Substream, std::__1::allocator<DB::IDataType::Substream> >&) const [tiflash+34102506]
                    dbms/src/DataTypes/IDataType.h:162
       0x76826b6    DB::DM::DMFileReader::readColumn(DB::DM::ColumnDefine const&, COWPtr<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, unsigned long) [tiflash+124266166]
                    dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp:824
       0x767fb0d    DB::DM::DMFileReader::read() [tiflash+124254989]
                    dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp:703
       0x76736a5    DB::DM::DMFileBlockInputStream::read() [tiflash+124204709]
                    dbms/src/Storages/DeltaMerge/File/DMFileBlockInputStream.h:62
       0x75d7d9d    DB::DM::ConcatSkippableBlockInputStream<false>::read() [tiflash+123567517]
                    dbms/src/Storages/DeltaMerge/SkippableBlockInputStream.h:184
       0x75de30f    DB::DM::DeltaMergeBlockInputStream<DB::DM::DeltaValueReader, DB::DM::DTCompactedEntries<55ul, 20ul, 3ul>::Iterator, false, false>::fillStableBlockIfNeeded() [tiflash+123593487]
                    dbms/src/Storages/DeltaMerge/DeltaMerge.h:433
       0x75dc7f3    DB::DM::DeltaMergeBlockInputStream<DB::DM::DeltaValueReader, DB::DM::DTCompactedEntries<55ul, 20ul, 3ul>::Iterator, false, false>::read() [tiflash+123586547]
                    dbms/src/Storages/DeltaMerge/DeltaMerge.h:208
       0x75d8ee0    DB::DM::DMRowKeyFilterBlockInputStream<true>::read() [tiflash+123571936]
                    dbms/src/Storages/DeltaMerge/RowKeyFilter.h:219
       0x779e2fc    DB::SquashingBlockInputStream::readImpl() [tiflash+125428476]
                    dbms/src/DataStreams/SquashingBlockInputStream.cpp:38
       0x7774e85    DB::IProfilingBlockInputStream::read(DB::PODArray<unsigned char, 4096ul, Allocator<false>, 15ul, 16ul>*&, bool) [tiflash+125259397]
                    dbms/src/DataStreams/IProfilingBlockInputStream.cpp:82
       0x75aad4d    DB::DM::Segment::ensurePlace(DB::DM::DMContext const&, std::__1::shared_ptr<DB::DM::SegmentSnapshot> const&, std::__1::shared_ptr<DB::DM::DeltaValueReader> const&, std::__1::vector<DB::DM::RowKeyRange, std::__1::allocator<DB::DM::RowKeyRange> > const&, unsigned long) const [tiflash+123383117]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:2518
       0x7592dce    DB::DM::Segment::getReadInfo(DB::DM::DMContext const&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> > const&, std::__1::shared_ptr<DB::DM::SegmentSnapshot> const&, std::__1::vector<DB::DM::RowKeyRange, std::__1::allocator<DB::DM::RowKeyRange> > const&, unsigned long) const [tiflash+123284942]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:2330
       0x7595a9a    DB::DM::Segment::prepareMergeDelta(DB::DM::DMContext&, std::__1::shared_ptr<std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> > > const&, std::__1::shared_ptr<DB::DM::SegmentSnapshot> const&, DB::DM::WriteBatches&) const [tiflash+123296410]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:1161
       0x7556a8c    DB::DM::DeltaMergeStore::segmentMergeDelta(DB::DM::DMContext&, std::__1::shared_ptr<DB::DM::Segment> const&, DB::DM::DeltaMergeStore::MergeDeltaReason, std::__1::shared_ptr<DB::DM::SegmentSnapshot>) [tiflash+123038348]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore_InternalSegment.cpp:470
       0x752100a    DB::DM::DeltaMergeStore::checkSegmentUpdate(std::__1::shared_ptr<DB::DM::DMContext> const&, std::__1::shared_ptr<DB::DM::Segment> const&, DB::DM::DeltaMergeStore::ThreadType, DB::DM::DeltaMergeStore::InputType) [tiflash+122818570]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp:1725
       0x751f9b6    DB::DM::DeltaMergeStore::deleteRange(DB::Context const&, DB::Settings const&, DB::DM::RowKeyRange const&) [tiflash+122812854]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp:724
       0x8b05799    DB::RegionTable::removeRegion(unsigned long, bool, DB::RegionTaskLock const&) [tiflash+145774489]
                    dbms/src/Storages/KVStore/Decode/RegionTable.cpp:233
       0x8a1d52f    DB::KVStore::removeRegion(unsigned long, bool, DB::RegionTable&, DB::KVStoreTaskLock const&, DB::RegionTaskLock const&) [tiflash+144823599]
                    dbms/src/Storages/KVStore/KVStore.cpp:271
       0x8aa52c8    DB::KVStore::handleAdminRaftCmd(raft_cmdpb::AdminRequest&&, raft_cmdpb::AdminResponse&&, unsigned long, unsigned long, unsigned long, DB::TMTContext&) [tiflash+145380040]
                    dbms/src/Storages/KVStore/MultiRaft/RaftCommandsKVS.cpp:355
       0x8a551d8    HandleAdminRaftCmd [tiflash+145052120]
                    dbms/src/Storages/KVStore/FFI/ProxyFFI.cpp:120
  0x7fa3d428b31c    proxy_ffi::engine_store_helper_impls::_$LT$impl$u20$proxy_ffi..interfaces..root..DB..EngineStoreServerHelper$GT$::handle_admin_raft_cmd::h48034f5b662ff094 [libtiflash_proxy.so+26047260]
  0x7fa3d434a5ce    _$LT$engine_store_ffi..observer..TiFlashObserver$LT$T$C$ER$GT$$u20$as$u20$raftstore..coprocessor..AdminObserver$GT$::post_exec_admin::h435f8eeea21e0965 [libtiflash_proxy.so+26830286]
  0x7fa3d53bc071    raftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::apply_raft_cmd::hc241bbf5517beb31 [libtiflash_proxy.so+44073073]
  0x7fa3d53d5acf    raftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::process_raft_cmd::h7a47a865be4fd30e [libtiflash_proxy.so+44178127]
  0x7fa3d53d7ee6    raftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::handle_raft_committed_entries::h63bee235686cb7ba [libtiflash_proxy.so+44187366]
  0x7fa3d53ac51c    raftstore::store::fsm::apply::ApplyFsm$LT$EK$GT$::handle_apply::h3cd768b362e8f910 [libtiflash_proxy.so+44008732]
  0x7fa3d53b0aa2    raftstore::store::fsm::apply::ApplyFsm$LT$EK$GT$::handle_tasks::h07dbf9e3d7892e28 [libtiflash_proxy.so+44026530]
  0x7fa3d4451e1e    _$LT$raftstore..store..fsm..apply..ApplyPoller$LT$EK$GT$$u20$as$u20$batch_system..batch..PollHandler$LT$raftstore..store..fsm..apply..ApplyFsm$LT$EK$GT$$C$raftstore..store..fsm..apply..ControlFsm$GT$$GT$::handle_normal::hb066bfcc6a574726 [libtiflash_proxy.so+27909662]
  0x7fa3d43b3753    batch_system::batch::Poller$LT$N$C$C$C$Handler$GT$::poll::hd4fd3db00dda0f31 [libtiflash_proxy.so+27260755]"] [source="DB::EngineStoreApplyRes DB::HandleAdminRaftCmd(const DB::EngineStoreServerWrap *, DB::BaseBuffView, DB::BaseBuffView, DB::RaftCmdHeader)"] [thread_id=404]

After a restart, TiFlash may hit an exception like the following:

[2024/01/11 14:57:36.026 +08:00] [ERROR] [Exception.cpp:91] ["Code: 10017, e.displayText() = DB::Exception: try to create external version with invalid state [ver=18629.0] [state={type:VAR_REF, create_ver: 11601.0, is_deleted: true, delete_ver: 11978.0, ori_page_id: 5578.36156, being_ref_count: 1, num_entries: 0}]:  [type=PUT_EXTERNAL] [page_id=5578.36275] [ver=18629.0], e.what() = DB::Exception, Stack trace:
       0x1ec569e    DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) [tiflash+32265886]
                    dbms/src/Common/Exception.h:46
       0x1e4cdaf    DB::PS::V3::VersionedPageEntries<DB::PS::V3::u128::PageDirectoryTrait>::createNewExternal(DB::PS::V3::PageVersion const&, DB::PS::V3::PageEntryV3 const&) [tiflash+31772079]
                    dbms/src/Storages/Page/V3/PageDirectory.cpp:280
       0x1e70d72    DB::PS::V3::PageDirectoryFactory<DB::PS::V3::u128::FactoryTrait>::applyRecord(std::__1::unique_ptr<DB::PS::V3::PageDirectory<DB::PS::V3::u128::PageDirectoryTrait>, std::__1::default_delete<DB::PS::V3::PageDirectory<DB::PS::V3::u128::PageDirectoryTrait> > > const&, DB::PS::V3::PageEntriesEdit<DB::UInt128>::EditRecord const&, bool) [tiflash+31919474]
                    dbms/src/Storages/Page/V3/PageDirectoryFactory.cpp:286
       0x1e6eba4    DB::PS::V3::PageDirectoryFactory<DB::PS::V3::u128::FactoryTrait>::create(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::FileProvider>&, std::__1::shared_ptr<DB::PSDiskDelegator>&, DB::PS::V3::WALConfig const&) [tiflash+31910820]
                    dbms/src/Storages/Page/V3/PageDirectoryFactory.cpp:45
       0x865f766    DB::PS::V3::PageStorageImpl::restore() [tiflash+140900198]
                    dbms/src/Storages/Page/V3/PageStorageImpl.cpp:74
       0x761ca46    DB::DM::GlobalStoragePool::restore() [tiflash+123849286]
                    dbms/src/Storages/DeltaMerge/StoragePool.cpp:122
       0x78a2901    DB::Context::initializeGlobalStoragePoolIfNeed(DB::PathPool const&) [tiflash+126494977]
                    dbms/src/Interpreters/Context.cpp:1738
       0x1f7ea45    DB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) [tiflash+33024581]
                    dbms/src/Server/Server.cpp:1305
       0x91269aa    Poco::Util::Application::run() [tiflash+152201642]
                    contrib/poco/Util/src/Application.cpp:335
       0x1f73e1a    DB::Server::run() [tiflash+32980506]
                    dbms/src/Server/Server.cpp:262
       0x9131294    Poco::Util::ServerApplication::run(int, char**) [tiflash+152244884]
                    contrib/poco/Util/src/ServerApplication.cpp:618
       0x1f8ee09    mainEntryClickHouseServer(int, char**) [tiflash+33091081]
                    dbms/src/Server/Server.cpp:1832
       0x1eb1e7c    main [tiflash+32185980]
                    dbms/src/Server/main.cpp:173
  0x7f8bb5088b67    __libc_start_main [libc.so.6+154471]
       0x1b97ee9    <unknown symbol> [tiflash+28933865]"] [source="bool DB::Context::initializeGlobalStoragePoolIfNeed(const DB::PathPool &)"] [thread_id=1]

4. What is your TiFlash version? (Required)

v7.5.0

@JaySon-Huang JaySon-Huang added type/bug The issue is confirmed as a bug. component/storage labels Jan 17, 2024
@JaySon-Huang
Contributor Author

Setting the TiFlash replica count to 0 physically removes the table's data and its IStorage instance in TiFlash once the GC safepoint has passed (since v7.2):

if (table_info->replica_info.count == 0)
{
    // if set 0, drop table in TiFlash
    auto storage = tmt_context.getStorages().get(keyspace_id, table_info->id);
    if (unlikely(storage == nullptr))
    {
        LOG_ERROR(
            log,
            "Storage instance is not exist in TiFlash, applySetTiFlashReplica is ignored, table_id={}",
            table_id);
        return;
    }
    applyDropTable(database_id, table_id);
    return;
}

When the IStorage instance is re-created, its StoragePool cannot restore the correct max_data_page_id from PageStorage:

case PageStorageRunMode::ONLY_V3:
{
    max_log_page_id = log_storage_v3->getMaxId();
    max_data_page_id = data_storage_v3->getMaxId();
    max_meta_page_id = meta_storage_v3->getMaxId();
    storage_pool_metrics = CurrentMetrics::Increment{CurrentMetrics::StoragePoolV3Only};
    break;
}

template <typename Trait>
UInt64 PageDirectory<Trait>::getMaxIdAfterRestart() const
{
    std::shared_lock read_lock(table_rw_mutex);
    return max_page_id;
}

@JaySon-Huang
Contributor Author

So when the TiFlash replica is added back, TiFlash may reuse a data_page_id and create a DMFile whose path matches an entry that is still in mark_cache. That cached entry is stale, so queries that hit it fail.
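
To make the failure mode concrete, here is a minimal, self-contained sketch of the id reuse described above. It is not TiFlash code: ToyPageDirectory, ToyStoragePool, toyDMFilePath, and the cache layout are hypothetical stand-ins. It also assumes that the max id handed to a re-created pool is a snapshot that does not advance while the process runs, which is one reading of the snippets above.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a page directory: it remembers the largest page id
// seen when it was restored and, in this sketch, never updates that snapshot.
struct ToyPageDirectory
{
    uint64_t max_id_at_restore = 0;
    uint64_t getMaxIdAfterRestart() const { return max_id_at_restore; }
};

// Hypothetical stand-in for a per-table storage pool that allocates data page ids.
struct ToyStoragePool
{
    uint64_t max_data_page_id;

    explicit ToyStoragePool(const ToyPageDirectory & dir)
        : max_data_page_id(dir.getMaxIdAfterRestart())
    {}

    uint64_t newDataPageId() { return ++max_data_page_id; }
};

// Hypothetical path layout; the real DMFile path format is not shown in this issue.
std::string toyDMFilePath(uint64_t table_id, uint64_t page_id)
{
    return "t_" + std::to_string(table_id) + "/dmf_" + std::to_string(page_id);
}

struct ToyResult
{
    uint64_t first_id;
    uint64_t reused_id;
    std::string cached_marks; // what a query on the re-created table reads from the cache
};

ToyResult simulateDropAndRecreate()
{
    ToyPageDirectory dir{100}; // snapshot taken at process start
    std::unordered_map<std::string, std::string> mark_cache; // file path -> cached marks

    uint64_t first_id = 0;
    {
        // First incarnation of the table: allocate an id and cache marks for its file.
        ToyStoragePool pool_v1(dir);
        first_id = pool_v1.newDataPageId();
        mark_cache[toyDMFilePath(5578, first_id)] = "marks of old file";
    } // replica set to 0 and data GCed: the pool is gone, the cache entry is not

    // Replica set back to 1: a fresh pool is seeded from the same stale snapshot.
    ToyStoragePool pool_v2(dir);
    uint64_t reused_id = pool_v2.newDataPageId();

    // Same table id and same page id give the same path, so the lookup hits the stale entry.
    auto it = mark_cache.find(toyDMFilePath(5578, reused_id));
    return {first_id, reused_id, it == mark_cache.end() ? std::string() : it->second};
}
```

In this toy, both pools hand out id 101, so the new file resolves to the cache entry left behind by the old one. If the directory reported its current maximum instead of the snapshot, the second pool would start past 101 and the lookup would miss.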

ti-chi-bot bot pushed a commit that referenced this issue Jan 25, 2024
JaySon-Huang added a commit to ti-chi-bot/tiflash that referenced this issue Jan 25, 2024
ti-chi-bot bot pushed a commit that referenced this issue Jan 26, 2024
@seiya-annie

/found customer

@ti-chi-bot ti-chi-bot bot added the report/customer Customers have encountered this bug. label Jun 4, 2024