Deflake ThreadStatus related unit tests #12858

Closed

cbi42 (Member) commented Jul 12, 2024

Summary: Unit tests DBTest.ThreadStatusFlush and DBTestWithParam.ThreadStatusSingleCompaction have been flaky, failing with error messages like the following:

[ RUN      ] DBTest.ThreadStatusFlush
op_count: 0, expected_count 1
thread id: 718113, thread status: , cf_name 
thread id: 718114, thread status: , cf_name pikachu
/__w/rocksdb/rocksdb/db/db_test.cc:4817: Failure
Value of: VerifyOperationCount(env_, ThreadStatus::OP_FLUSH, 1)
  Actual: false
Expected: true
[  FAILED  ] DBTest.ThreadStatusFlush (106 ms)


[ RUN      ] DBTestWithParam/DBTestWithParam.ThreadStatusSingleCompaction/0
db/db_test.cc:4673: Failure
Expected equality of these values:
  op_count
    Which is: 0
  expected_count
    Which is: 1
[  FAILED  ] DBTestWithParam/DBTestWithParam.ThreadStatusSingleCompaction/0, where GetParam() = (1, false) 

One cause for this is that before flush/compaction finishes, we will go through ~WritableFileWriter(), either for a WAL or an SST file, and temporarily set thread_operation to UNKNOWN. This UNKNOWN thread operation seems to be there for some stress test verification. This PR fixes these tests by setting the IOActivity in ~WritableFileWriter() in debug builds.
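To illustrate the window described above, here is a minimal single-threaded sketch. Every name in it (`Op`, `ThreadState`, `Writer`, `CountOps`, and the two helper functions) is hypothetical and is not RocksDB code. It only models the idea that a destructor which temporarily resets the thread operation can make an observer count zero flush operations if it samples at the wrong moment.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-ins for illustration; not RocksDB's actual types.
enum class Op { kUnknown, kFlush, kCompaction };

struct ThreadState {
  Op op = Op::kUnknown;
};

// Counts how many of the given thread states currently report `target`.
inline int CountOps(const ThreadState* states, int n, Op target) {
  int count = 0;
  for (int i = 0; i < n; ++i) {
    if (states[i].op == target) ++count;
  }
  return count;
}

// Models a writer whose destructor temporarily resets the thread operation
// to kUnknown during cleanup and restores it afterwards.
class Writer {
 public:
  Writer(ThreadState* state, std::function<void()> observe)
      : state_(state), observe_(std::move(observe)) {}

  ~Writer() {
    const Op saved = state_->op;
    state_->op = Op::kUnknown;   // start of the problematic window
    if (observe_) observe_();    // an observer that happens to sample here
    state_->op = saved;          // end of the window
  }

 private:
  ThreadState* state_;
  std::function<void()> observe_;
};

// Flush count seen by an observer that samples inside the cleanup window.
inline int FlushCountSeenDuringCleanup() {
  ThreadState state;
  state.op = Op::kFlush;
  int seen = -1;
  {
    Writer w(&state, [&] { seen = CountOps(&state, 1, Op::kFlush); });
  }
  return seen;
}

// Flush count seen by an observer that samples after cleanup has finished.
inline int FlushCountSeenAfterCleanup() {
  ThreadState state;
  state.op = Op::kFlush;
  {
    Writer w(&state, nullptr);
  }
  return CountOps(&state, 1, Op::kFlush);
}
```

In this model the two helpers differ only in when the observer samples, which matches the intermittent `op_count: 0, expected_count 1` failures in the logs above.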

Test plan: monitor for future test failures.

@facebook-github-bot
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

// TODO: fix places where default IOOption() is used, which can cause
// io_activity to not match thread operation.
assert(io_activity == Env::IOActivity::kUnknown ||
options.io_activity == Env::IOActivity::kUnknown ||
Contributor

We kind of rely on NOT having this exception "options.io_activity == Env::IOActivity::kUnknown" to find incorrectly passed-in IOOptions. Is there any way to fix the UT instead? By the way, I like the refactoring.
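The concern above can be sketched as two predicates. This is an assumption-laden illustration, not the actual assert: the quoted diff is truncated, so the equality clause, the function names `StrictMatch` and `RelaxedMatch`, and the local `IOActivity` enum are all hypothetical.

```cpp
// Hypothetical stand-in for Env::IOActivity; values chosen for illustration.
enum class IOActivity { kUnknown, kFlush, kCompaction };

// Strict check: if the thread expects a specific activity, the options
// passed in must carry the same tag. (The equality clause is an assumption.)
inline bool StrictMatch(IOActivity expected, IOActivity passed) {
  return expected == IOActivity::kUnknown || passed == expected;
}

// Relaxed check: additionally accepts untagged (kUnknown) options, which is
// the exception the reviewer is wary of.
inline bool RelaxedMatch(IOActivity expected, IOActivity passed) {
  return StrictMatch(expected, passed) || passed == IOActivity::kUnknown;
}
```

Under these assumptions, default-constructed options passed during a flush fail the strict check but pass the relaxed one, so the relaxed form would no longer flag call sites that forgot to set the activity.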

Member Author

I could change the sync point used in the unit test to make it work. But I think it makes sense to keep the thread_operation as Flush/Compaction instead of setting it to UNKNOWN. I assume that setting the thread_operation to UNKNOWN is only there to pass the checks in the stress test. One alternative is to set the IOActivity in ~WritableFileWriter() based on thread_operation. What do you think?

Contributor

> But I think it makes sense to keep the thread_operation as Flush/Compaction instead of setting it to UNKNOWN

Right, unfortunately we can't pass a parameter into a destructor.

> I assume that setting the thread_operation to UNKNOWN is only there to pass the checks in stress test.

Right.

> One alternative is to set the IOActivity in ~WritableFileWriter() based on thread_operation, what do you think?

Yeah, without refactoring the cleanup out of the destructor, you may do that under debug mode.

Member Author

Updated to set IOActivity in destructor.
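A rough sketch of what "set IOActivity in destructor" could look like, assuming a thread-local operation and a simple mapping from operation to activity. The names `g_thread_op`, `ActivityFor`, `CleanupOptionsForCurrentThread`, and the local enums and `IOOptions` struct are hypothetical and do not reflect the merged change; the point is only that the destructor derives the tag from the current operation instead of resetting the operation.

```cpp
// Hypothetical stand-ins; not RocksDB's actual types or globals.
enum class ThreadOp { kUnknown, kFlush, kCompaction };
enum class IOActivity { kUnknown, kFlush, kCompaction };

struct IOOptions {
  IOActivity io_activity = IOActivity::kUnknown;
};

// Assumed per-thread record of the operation currently running.
inline thread_local ThreadOp g_thread_op = ThreadOp::kUnknown;

// Maps the current thread operation to the matching I/O activity tag.
inline IOActivity ActivityFor(ThreadOp op) {
  switch (op) {
    case ThreadOp::kFlush:
      return IOActivity::kFlush;
    case ThreadOp::kCompaction:
      return IOActivity::kCompaction;
    case ThreadOp::kUnknown:
      break;
  }
  return IOActivity::kUnknown;
}

// Builds the options a destructor-time cleanup would use. In debug builds
// the activity is derived from the thread operation, so the operation
// itself never has to be reset to kUnknown to satisfy a consistency check.
inline IOOptions CleanupOptionsForCurrentThread() {
  IOOptions opts;
#ifndef NDEBUG
  opts.io_activity = ActivityFor(g_thread_op);
#endif
  return opts;
}

class Writer {
 public:
  ~Writer() {
    const IOOptions opts = CleanupOptionsForCurrentThread();
    (void)opts;  // a real writer would pass these to its close/cleanup path
  }
};
```

With this shape, an observer sampling the thread operation while a `Writer` is being destroyed still sees the flush or compaction in progress.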

@cbi42 cbi42 force-pushed the deflake-thread-status-test branch from 95af88d to f9c4a34 Compare July 13, 2024 00:09
@facebook-github-bot
Contributor

@cbi42 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cbi42 cbi42 requested a review from hx235 July 13, 2024 04:48
@facebook-github-bot
Contributor

@cbi42 merged this pull request in b800b5e.
