Skip to content
This repository was archived by the owner on Aug 27, 2025. It is now read-only.

Conversation

yaron-zilliqa
Copy link
Contributor

Description

Backward Compatibility

  • This is not a breaking change
  • This is a breaking change

Review Suggestion

Status

Implementation

  • ready for review

Integration Test (Core Team)

  • local machine test
  • small-scale cloud test

@github-actions github-actions bot changed the title ZIL-5403: check if a LevelDB object is null or not [master] ZIL-5403: check if a LevelDB object is null or not Oct 10, 2023
rrw-zilliqa
rrw-zilliqa previously approved these changes Oct 12, 2023
Copy link
Contributor

@rrw-zilliqa rrw-zilliqa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check comments, but I don't think this makes things worse .. worth testing quite thoroughly, though, because bugs here might be quite bad.

LOG_GENERAL(WARNING, "LevelDB " << dbName << " isn't available");
if (found) {
*found = false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the effect that if the db isn't available, we assume that it's empty. Under what circumstances might it not be available? I'm thinking of the case where the user induces the contract state db not to be available whilst calling a transition that checks a flag - the db is not there, the flag checks out as no, and the user is then allowed to do something they shouldn't ... ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design of this class is pretty bad from the start and indeed like you mentioned, it treats certain errors as if something is empty or not found instead of handling things correctly from the outset.
These changes are backward-compatible with this as I'm not convinced we can fix this completely and the change would be much larger.
Some databases are only created/read in lookup mode, for example. Another possibility is that something wasn't downloaded correctly or got corrupted.
The alternative is to let the process crash completely (which it would have done so far).. and also start downloading persistence upon the next restart, and then possibly crashing again, and so on...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're not wrong .. OK. Let's go with this on the basis that if at least 1/3 of the nodes have the right database, you won't be able to commit your txn? @JamesHinshelwood @DrZoltanFazekas - am I off base?

Copy link
Contributor

@JamesHinshelwood JamesHinshelwood Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yaron-zilliqa Sorry, I'm not understanding this properly, but the new code does jump out to me as a bit scary. I agree that its dangerous to assume 'DB doesn't exist' is the same as 'DB exists, but is empty'.

Can you explain what the initial bug you were fixing was? A description of how you believe this changes the logic would also be useful. Sorry to be a pain, but because the methods in the class have been re-ordered as well as edited, it makes the diff quite difficult to understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The methods haven't been reordered. The only change in logic here is checking for a nullptr before accessing it.
The previous methods were really a copy of one another with a different construction of leveldb::Slice, so the not simply invoke XXXXXImpl(...) instead of repeating the code.
The problem it solves is that the process won't crash (and then redownload persistence upon restart) if the required LevelDB object was NULL. I came across this scenario this time round when setting ARCHIVAL_LOOKUP_WITH_TX_TRACES to true and calling ots_getTransactionError.

Comment on lines +42 to +45
if (!s.IsNotFound()) {
LOG_GENERAL(WARNING, "LevelDB " << dbName << " status is not OK - "
<< s.ToString());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this makes sense. Would s.IsNotFound() be true if the DB exists, but the key (arg) doesn't exist in it?

Copy link
Contributor Author

@yaron-zilliqa yaron-zilliqa Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would; this is backward compatible with the current implementation. Haven't changed anything there. This piece of code simply replaced log_error which I've removed and placed here.

@rrw-zilliqa rrw-zilliqa dismissed their stale review October 13, 2023 14:13

Discussed with @yaron-zilliqa - best not to merge this now; requesting changes so it doesn't end up in master before v9.3

Copy link
Contributor

@rrw-zilliqa rrw-zilliqa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this as request changes to make sure it doesn't get merged prematurely.

@yaron-zilliqa yaron-zilliqa force-pushed the bugfix/ZIL-5403 branch 5 times, most recently from 384c631 to 61e7dac Compare October 23, 2023 07:54
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

Status: PRs needing review

Development

Successfully merging this pull request may close these issues.

3 participants