Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a bug where OnErrorRecoveryBegin() is not called before auto-recovery #12860

Closed
wants to merge 1 commit into from

Conversation

hx235
Copy link
Contributor

@hx235 hx235 commented Jul 12, 2024

Context/Summary:
*auto_recovery needs to be set true in order for OnErrorRecoveryBegin() to be called before auto-recovery

if (*auto_recovery) {
listener->OnErrorRecoveryBegin(reason, *bg_error, auto_recovery);
}

Currently it's set false for auto-recovery. This PR fixes it.

Test plan:

  • Manual observation that it is called
  • Existing UT

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@@ -289,7 +289,8 @@ class DbStressListener : public EventListener {
}
}

void OnErrorRecoveryCompleted(Status /* old_bg_error */) override {
void OnErrorRecoveryEnd(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To use the later API

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check that recovery is successful?

Copy link
Contributor Author

@hx235 hx235 Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current stuff we do in this crash test listener is intended to be done even when recovery is not successful.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@hx235 hx235 requested review from cbi42 and jowlyzhang July 12, 2024 20:31
@jowlyzhang
Copy link
Contributor

My understanding is this auto recovery applies to when error handler delegates error handling to SstFileManager to handle no space type of error. The event listener notifies the user so they can also manually intervene like configuring more space. While auto resume is something internally handled by RocksDB so the event listener is not currently notifying the user. I don't know if this is intentional or we want to change that. But the name auto_recovery in the event listener came from before ErrorHandler can auto resume, so the naming can be a bit confusing. CC @anand1976

db/error_handler.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@jowlyzhang jowlyzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the fix!

@@ -466,10 +466,13 @@ void ErrorHandler::SetBGError(const Status& bg_status,
Status bg_err(new_bg_io_err, severity);
CheckAndSetRecoveryAndBGError(bg_err);
recover_context_ = context;
bool auto_recovery = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this taken into account the value of max_bgerror_resume_count since autorecovery may not be enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - fixed.

@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@hx235 merged this pull request in 4ff35af.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants