
Fix data race on ColumnFamilyData::flush_reason by letting FlushRequest/Job own flush_reason instead of CFD #11111

Closed · wants to merge 5 commits from flush_reason_race

Conversation

@hx235 (Contributor) commented Jan 20, 2023

Context:
Concurrent flushes on the same CF can overwrite ColumnFamilyData::flush_reason before each other finishes. One symptom is a CF ending up with a different flush_reason from the others even though all of them belong to the same atomic flush: `db_stress: db/db_impl/db_impl_compaction_flush.cc:423: rocksdb::Status rocksdb::DBImpl::AtomicFlushMemTablesToOutputFiles(const rocksdb::autovector<rocksdb::DBImpl::BGFlushArg>&, bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::Env::Priority): Assertion cfd->GetFlushReason() == cfds[0]->GetFlushReason() failed.`

Summary:
As suggested by @ltamasi, we refactor so that FlushRequest/FlushJob owns flush_reason, since there is no good way to define ColumnFamilyData::flush_reason in the face of concurrent flushes on the same CF (which was not the case long ago, when ColumnFamilyData::flush_reason was first introduced). See the sketch after the test list below for the general shape of the change.

Tests:

  • new unit test
  • make check
  • aggressive crash test rehearsal
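
For illustration, a minimal sketch of the ownership change described above, in simplified C++. The names FlushRequest, FlushJob, FlushReason, and cfd_to_max_mem_id_to_persist mirror RocksDB internals, but the exact fields and signatures shown here are assumptions for the sketch, not the actual implementation:

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Sketch: previously the flush reason lived on the shared, mutable
// ColumnFamilyData, so two concurrent flushes on the same CF could overwrite
// each other's reason. After the refactor, each flush request/job carries its
// own copy, fixed at scheduling time.

enum class FlushReason { kManualFlush, kGetLiveFiles, kWriteBufferFull /* ... */ };

struct ColumnFamilyData;  // shared per-CF state; no longer stores flush_reason

// Each scheduled flush owns the reason it was scheduled for.
struct FlushRequest {
  FlushReason flush_reason;
  // CFs (and max memtable IDs) covered by this request; >1 entry for atomic flush.
  std::vector<std::pair<ColumnFamilyData*, uint64_t>> cfd_to_max_mem_id_to_persist;
};

class FlushJob {
 public:
  FlushJob(ColumnFamilyData* cfd, FlushReason flush_reason)
      : cfd_(cfd), flush_reason_(flush_reason) {}

  // The job reports the reason it was created with; it never reads shared
  // mutable CFD state, so concurrent flushes cannot race on it.
  FlushReason GetFlushReason() const { return flush_reason_; }

 private:
  ColumnFamilyData* cfd_;
  const FlushReason flush_reason_;  // immutable for the lifetime of the job
};
```

With the reason fixed per request, the atomic-flush assertion that all CFs in one request report the same flush reason compares state owned by that request rather than a shared field another flush may have just overwritten.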

@hx235 force-pushed the flush_reason_race branch 2 times, most recently from 5d6066f to 1112bcd on January 20, 2023 at 19:36
@facebook-github-bot (Contributor)

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@hx235 force-pushed the flush_reason_race branch from 1112bcd to 36949d8 on January 20, 2023 at 20:35
@facebook-github-bot (Contributor)

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot (Contributor)

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor)

@hx235 has updated the pull request. You must reimport the pull request before landing.

@ajkr (Contributor) left a comment


LGTM, thanks!

HISTORY.md Outdated
@@ -16,6 +16,7 @@
* Fixed a heap use after free in async scan prefetching if dictionary compression is enabled, in which case sync read of the compression dictionary gets mixed with async prefetching
* Fixed a data race bug of `CompactRange()` under `change_level=true` acts on overlapping range with an ongoing file ingestion for level compaction. This will either result in overlapping file ranges corruption at a certain level caught by `force_consistency_checks=true` or protentially two same keys both with seqno 0 in two different levels (i.e, new data ends up in lower/older level). The latter will be caught by assertion in debug build but go silently and result in read returning wrong result in release build. This fix is general so it also replaced previous fixes to a similar problem for `CompactFiles()` (#4665), general `CompactRange()` and auto compaction (commit 5c64fb6 and 87dfc1d).
* Fixed a bug in compaction output cutting where small output files were produced due to TTL file cutting states were not being updated (#11075).
* Fixed a data race on `ColumnFamilyData::flush_reason` caused by concurrent `GetLiveFiles(flush=true)` and other flushes.
Contributor


This might rebase cleanly into the wrong release notes

Contributor Author


Fixed

// Coerce a manual flush to happen in the middle of GetLiveFiles's flush
bool get_live_files_paused_at_sync_point = false;
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"DBImpl::AtomicFlushMemTables:AfterScheduleFlush2", [&](void* /* arg */) {
Contributor


Can you reuse one of the existing two sync points at the same point in the code? IIRC non-callback syncpoints can still be used for callbacks with nullptr argument

Contributor Author


Sure - thanks for letting me know. Fixed.
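
For context on the suggestion above: RocksDB's test-only SyncPoint facility lets a callback be registered on any named point via SetCallBack(), and points hit through the plain TEST_SYNC_POINT(name) macro (i.e., with no callback argument at the call site) invoke registered callbacks with a nullptr argument. A minimal sketch of reusing an existing sync point this way; the point name "DBImpl::AtomicFlushMemTables:AfterScheduleFlush" is an assumed stand-in for one of the two existing points at that spot in the code:

```cpp
// Test-only sketch; the sync point name below is a stand-in, not necessarily
// the exact name used in the final version of the test.
bool get_live_files_paused_at_sync_point = false;
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
    "DBImpl::AtomicFlushMemTables:AfterScheduleFlush", [&](void* arg) {
      // Hit via TEST_SYNC_POINT(...), so `arg` is nullptr and simply unused.
      (void)arg;
      if (!get_live_files_paused_at_sync_point) {
        get_live_files_paused_at_sync_point = true;
        // ... kick off the concurrent manual flush here ...
      }
    });
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();
```

This avoids adding a near-duplicate sync point ("...AfterScheduleFlush2") right next to an existing one.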

HISTORY.md Outdated
@@ -16,6 +16,7 @@
* Fixed a heap use after free in async scan prefetching if dictionary compression is enabled, in which case sync read of the compression dictionary gets mixed with async prefetching
* Fixed a data race bug of `CompactRange()` under `change_level=true` acts on overlapping range with an ongoing file ingestion for level compaction. This will either result in overlapping file ranges corruption at a certain level caught by `force_consistency_checks=true` or protentially two same keys both with seqno 0 in two different levels (i.e, new data ends up in lower/older level). The latter will be caught by assertion in debug build but go silently and result in read returning wrong result in release build. This fix is general so it also replaced previous fixes to a similar problem for `CompactFiles()` (#4665), general `CompactRange()` and auto compaction (commit 5c64fb6 and 87dfc1d).
* Fixed a bug in compaction output cutting where small output files were produced due to TTL file cutting states were not being updated (#11075).
* Fixed a data race on `ColumnFamilyData::flush_reason` caused by concurrent `GetLiveFiles(flush=true)` and other flushes.
Contributor


Is this the only case it fixes? flush_reason has been suspicious (in the sense of reporting the wrong reason) many times historically though I don't remember the exact cases

Contributor Author


Will change this to "...concurrent flushes" instead cuz this is a generic fix.

Contributor Author


Fixed.

@hx235 force-pushed the flush_reason_race branch from 5123a65 to 25135ce on January 23, 2023 at 23:29
@facebook-github-bot (Contributor)

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 force-pushed the flush_reason_race branch from 25135ce to 03b8ced on January 23, 2023 at 23:41
@facebook-github-bot (Contributor)

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot (Contributor)

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor)

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot (Contributor)

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor)

@hx235 merged this pull request in 86fa259.

ajkr pushed a commit that referenced this pull request Feb 1, 2023
Fix data race on ColumnFamilyData::flush_reason by letting FlushRequest/Job own flush_reason instead of CFD (#11111)

Summary:
**Context:**
Concurrent flushes on the same CF can overwrite `ColumnFamilyData::flush_reason` before each other finishes. One symptom is a CF ending up with a different flush_reason from the others even though all of them belong to the same atomic flush: `db_stress: db/db_impl/db_impl_compaction_flush.cc:423: rocksdb::Status rocksdb::DBImpl::AtomicFlushMemTablesToOutputFiles(const rocksdb::autovector<rocksdb::DBImpl::BGFlushArg>&, bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::Env::Priority): Assertion cfd->GetFlushReason() == cfds[0]->GetFlushReason() failed.`

**Summary:**
As suggested by ltamasi, we refactor so that FlushRequest/FlushJob owns flush_reason, since there is no good way to define `ColumnFamilyData::flush_reason` in the face of concurrent flushes on the same CF (which was not the case long ago, when `ColumnFamilyData::flush_reason` was first introduced).

**Tests:**
- new unit test
- make check
- aggressive crash test rehearsal

Pull Request resolved: #11111

Reviewed By: ajkr

Differential Revision: D42644600

Pulled By: hx235

fbshipit-source-id: 8589c8184869d3415e5b780c887f877818a5ebaf
ajkr pushed a commit that referenced this pull request Feb 2, 2023
Fix data race on ColumnFamilyData::flush_reason by letting FlushRequest/Job own flush_reason instead of CFD (#11111)
