Stale read with loss of un-fsynced writes #14140
Comments
Based on my discussion with @endocrimes, this is also reproducible on v3.4.
I have reproduced loss of un-fsynced writes in my own testing using lazyfs, just running a single-node cluster with lazyfs underneath. Kill + `lazyfs::clear-cache` results in a large rollback of etcd state (tens of transactions). @ramses @aphyr How much confidence do we have in the lazyfs implementation? Has lazyfs been successfully used to find similar issues in other databases?
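(For reference, a minimal sketch of the fault-injection step described above, assuming lazyfs is mounted under etcd's data directory. The FIFO path and the etcd PID are placeholders for illustration, not values taken from the report.)

```go
package main

import (
	"log"
	"os"
	"syscall"
)

func main() {
	// Path to the lazyfs control FIFO (configured in lazyfs's .toml config);
	// this location is a hypothetical example.
	const fifo = "/tmp/lazyfs/faults.fifo"
	// PID of the etcd process running on top of the lazyfs mount (hypothetical).
	const etcdPid = 12345

	// Kill etcd without giving it a chance to fsync outstanding writes.
	if err := syscall.Kill(etcdPid, syscall.SIGKILL); err != nil {
		log.Fatalf("kill etcd: %v", err)
	}

	// Ask lazyfs to drop everything it has cached but not yet flushed,
	// simulating the loss of un-fsynced writes on power failure.
	f, err := os.OpenFile(fifo, os.O_WRONLY, 0)
	if err != nil {
		log.Fatalf("open fifo: %v", err)
	}
	defer f.Close()
	if _, err := f.WriteString("lazyfs::clear-cache\n"); err != nil {
		log.Fatalf("clear-cache: %v", err)
	}
}
```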
@serathius FWIW, the recent single-node data loss issues are a result of the same bug: we accept writes before they are actually flushed to disk.
@endocrimes I don't understand what you mean. Do you attribute this issue to #14370?
Please disregard my above comment; I found an issue in my setup. I'm still working on reproducing this issue.
The data loss issue #14370 affected only one-member clusters (and is already fixed in 3.5.5), but this issue was reproduced in a multi-member cluster based on the description above, so they are unrelated. I think we can close this ticket.
Also, it's hard to debug this "issue" without the transaction details (i.e. #14890 (comment)) and a data bundle (i.e. #14890 (comment)).
What happened?
In this Jepsen test involving the loss of un-fsynced writes (via lazyfs) in a five-node cluster of etcd 3.5.3 nodes, we observed a stale read (a linearizability violation) on a single key despite not using `serializable` reads.

Numbering transactions top-to-bottom as T1, T2, T3, and T4: T3 used an etcd transaction, guarded by an equality comparison on revision, to append the value `16` to key `180`. T4 observed T3's write: it read key 180 and saw a value ending in `[... 15 16]`. However, T2, which began executing almost five seconds after T4 completed (denoted by `rt` edges), read key 180 and saw a previous value, ending in `[... 15]`.

Again, this relies on lazyfs, which we're still sanding off bugs in. I'm not entirely confident in this finding, but I would like to register it in case y'all can reproduce it using your own tests. :-)
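(For readers unfamiliar with the guard described above, here is a rough sketch of that kind of revision-guarded append using etcd's Go `clientv3` API. This is not the actual Jepsen client, which is written in Clojure; the endpoint, key, and appended value are placeholders mirroring the report.)

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"}, // placeholder endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()
	key := "180" // the key from the report

	// Read the current value and remember the revision we saw.
	get, err := cli.Get(ctx, key)
	if err != nil {
		log.Fatal(err)
	}
	if len(get.Kvs) == 0 {
		log.Fatalf("key %s not found", key)
	}
	rev := get.Kvs[0].ModRevision
	newVal := string(get.Kvs[0].Value) + " 16" // append "16", as T3 did

	// Guard the write with an equality comparison on that revision, so the
	// append only commits if nobody modified the key since our read.
	resp, err := cli.Txn(ctx).
		If(clientv3.Compare(clientv3.ModRevision(key), "=", rev)).
		Then(clientv3.OpPut(key, newVal)).
		Commit()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("guard held, write applied:", resp.Succeeded)
}
```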
What did you expect to happen?
I expected reads to be linearizable, even with the loss of un-fsynced writes.
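(To illustrate the distinction the report relies on: with Go's `clientv3`, a plain `Get` is a linearizable quorum read, while a serializable read must be requested explicitly and may be stale. A minimal sketch, with a placeholder endpoint, is below.)

```go
package main

import (
	"context"
	"fmt"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints: []string{"127.0.0.1:2379"}, // placeholder endpoint
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()
	ctx := context.Background()

	// Default read: linearizable, served through quorum, must not be stale.
	linear, err := cli.Get(ctx, "180")
	if err != nil {
		log.Fatal(err)
	}

	// Serializable read: served from the local member's state, may be stale.
	serial, err := cli.Get(ctx, "180", clientv3.WithSerializable())
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println("linearizable:", linear.Kvs, "serializable:", serial.Kvs)
}
```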
How can we reproduce it (as minimally and precisely as possible)?
With https://github.com/jepsen-io/etcd at commit jepsen-io/etcd@181656b, run:
Anything else we need to know?
No response
Etcd version (please run commands below)
Etcd configuration (command line flags or environment variables)
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
Relevant log output
No response