-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix data race on ColumnFamilyData::flush_reason
by letting FlushRequest/Job owns flush_reason instead of CFD
#11111
Conversation
5d6066f
to
1112bcd
Compare
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
1112bcd
to
36949d8
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@hx235 has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
HISTORY.md
Outdated
@@ -16,6 +16,7 @@ | |||
* Fixed a heap use after free in async scan prefetching if dictionary compression is enabled, in which case sync read of the compression dictionary gets mixed with async prefetching | |||
* Fixed a data race bug of `CompactRange()` under `change_level=true` acts on overlapping range with an ongoing file ingestion for level compaction. This will either result in overlapping file ranges corruption at a certain level caught by `force_consistency_checks=true` or protentially two same keys both with seqno 0 in two different levels (i.e, new data ends up in lower/older level). The latter will be caught by assertion in debug build but go silently and result in read returning wrong result in release build. This fix is general so it also replaced previous fixes to a similar problem for `CompactFiles()` (#4665), general `CompactRange()` and auto compaction (commit 5c64fb6 and 87dfc1d). | |||
* Fixed a bug in compaction output cutting where small output files were produced due to TTL file cutting states were not being updated (#11075). | |||
* Fixed a data race on `ColumnFamilyData::flush_reason` caused by concurrent `GetLiveFiles(flush=true)` and other flushes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might rebase cleanly into the wrong release notes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
db/db_flush_test.cc
Outdated
// Coerce a manual flush happenning in the middle of GetLiveFiles's flush | ||
bool get_live_files_paused_at_sync_point = false; | ||
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack( | ||
"DBImpl::AtomicFlushMemTables:AfterScheduleFlush2", [&](void* /* arg */) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you reuse one of the existing two sync points at the same point in the code? IIRC non-callback syncpoints can still be used for callbacks with nullptr argument
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure - thanks for letting me know. Fixed.
HISTORY.md
Outdated
@@ -16,6 +16,7 @@ | |||
* Fixed a heap use after free in async scan prefetching if dictionary compression is enabled, in which case sync read of the compression dictionary gets mixed with async prefetching | |||
* Fixed a data race bug of `CompactRange()` under `change_level=true` acts on overlapping range with an ongoing file ingestion for level compaction. This will either result in overlapping file ranges corruption at a certain level caught by `force_consistency_checks=true` or protentially two same keys both with seqno 0 in two different levels (i.e, new data ends up in lower/older level). The latter will be caught by assertion in debug build but go silently and result in read returning wrong result in release build. This fix is general so it also replaced previous fixes to a similar problem for `CompactFiles()` (#4665), general `CompactRange()` and auto compaction (commit 5c64fb6 and 87dfc1d). | |||
* Fixed a bug in compaction output cutting where small output files were produced due to TTL file cutting states were not being updated (#11075). | |||
* Fixed a data race on `ColumnFamilyData::flush_reason` caused by concurrent `GetLiveFiles(flush=true)` and other flushes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the only case it fixes? flush_reason has been suspicious (in the sense of reporting the wrong reason) many times historically though I don't remember the exact cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change this to "...concurrent flushes" instead cuz this is a generic fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
5123a65
to
25135ce
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
25135ce
to
03b8ced
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
…uest/Job owns flush_reason instead of CFD (#11111) Summary: **Context:** Concurrent flushes on the same CF can set on `ColumnFamilyData::flush_reason` before each other flush finishes. An symptom is one CF has different flush_reason with others though all of them are in an atomic flush `db_stress: db/db_impl/db_impl_compaction_flush.cc:423: rocksdb::Status rocksdb::DBImpl::AtomicFlushMemTablesToOutputFiles(const rocksdb::autovector<rocksdb::DBImpl::BGFlushArg>&, bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::Env::Priority): Assertion cfd->GetFlushReason() == cfds[0]->GetFlushReason() failed. ` **Summary:** Suggested by ltamasi, we now refactor and let FlushRequest/Job to own flush_reason as there is no good way to define `ColumnFamilyData::flush_reason` in face of concurrent flushes on the same CF (which wasn't the case a long time ago when `ColumnFamilyData::flush_reason ` first introduced`) **Tets:** - new unit test - make check - aggressive crash test rehearsal Pull Request resolved: #11111 Reviewed By: ajkr Differential Revision: D42644600 Pulled By: hx235 fbshipit-source-id: 8589c8184869d3415e5b780c887f877818a5ebaf
…uest/Job owns flush_reason instead of CFD (#11111) Summary: **Context:** Concurrent flushes on the same CF can set on `ColumnFamilyData::flush_reason` before each other flush finishes. An symptom is one CF has different flush_reason with others though all of them are in an atomic flush `db_stress: db/db_impl/db_impl_compaction_flush.cc:423: rocksdb::Status rocksdb::DBImpl::AtomicFlushMemTablesToOutputFiles(const rocksdb::autovector<rocksdb::DBImpl::BGFlushArg>&, bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::Env::Priority): Assertion cfd->GetFlushReason() == cfds[0]->GetFlushReason() failed. ` **Summary:** Suggested by ltamasi, we now refactor and let FlushRequest/Job to own flush_reason as there is no good way to define `ColumnFamilyData::flush_reason` in face of concurrent flushes on the same CF (which wasn't the case a long time ago when `ColumnFamilyData::flush_reason ` first introduced`) **Tets:** - new unit test - make check - aggressive crash test rehearsal Pull Request resolved: #11111 Reviewed By: ajkr Differential Revision: D42644600 Pulled By: hx235 fbshipit-source-id: 8589c8184869d3415e5b780c887f877818a5ebaf
Context:
Concurrent flushes on the same CF can set on
ColumnFamilyData::flush_reason
before each other flush finishes. An symptom is one CF has different flush_reason with others though all of them are in an atomic flushdb_stress: db/db_impl/db_impl_compaction_flush.cc:423: rocksdb::Status rocksdb::DBImpl::AtomicFlushMemTablesToOutputFiles(const rocksdb::autovector<rocksdb::DBImpl::BGFlushArg>&, bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::Env::Priority): Assertion cfd->GetFlushReason() == cfds[0]->GetFlushReason() failed.
Summary:
Suggested by @ltamasi, we now refactor and let FlushRequest/Job to own flush_reason as there is no good way to define
ColumnFamilyData::flush_reason
in face of concurrent flushes on the same CF (which wasn't the case a long time ago whenColumnFamilyData::flush_reason
first introduced`)Tets: