etcd Go & Java client SDK's retry mechanism may break Serializable #18424

Open
@ahrtr

Background

The Jepsen team raised issue #14890, stating that etcd may exhibit lost updates and cyclic information flow. A long discussion followed.

First, there is strong evidence that it isn't an etcdserver issue: a key was written twice by the client. Refer to #14890 (comment). So we thought it might be an issue in jetcd or in Jepsen.

Eventually it turned out to be caused by client's retry mechanism. Refer to

Note that Jepsen uses jetcd (the Java client), but I believe the etcd Go client SDK has this issue as well.

Breaks Serializable

When a database system processes multiple concurrent transactions, it must produce the same effect as some serial execution of those transactions. This is what Serializable means.

But the retry mechanism of the etcd client SDK (both Go and Java) may break Serializable.

Let's work with an example, assuming there are two concurrent transactions,

  • transaction 1 (txn1): read k1, and write k2: 20
  • transaction 2 (txn2): read k2, and write k1: 10

By the definition of Serializable, the final result must be the same as executing the two transactions in some serial order. There are only two possibilities,

  • execute txn1 first, then txn2
    • txn1 read nothing for k1, and write k2 = 20
    • txn2 read 20 for k2, and write k1 = 10
  • execute txn2 first, then txn1
    • txn2 read nothing for k2, and write k1 = 10
    • txn1 read 10 for k1, and write k2 = 20

But the client's retry may lead to a third possibility. See the example workflow below,

  • execute txn1 first: read nothing for k1, and write k2 = 20.
    • But the client side gets an error response for some reason (e.g. a temporary network issue), even though the server has applied txn1;
  • execute txn2: read 20 for k2, and write k1 = 10
  • client retries txn1: read 10 for k1, and write k2 = 20

The result is a cyclic information flow, which breaks Serializable:

  • txn1 reads k1/10, which was written by txn2
  • txn2 reads k2/20, which was written by txn1
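The workflow above can be replayed against a toy in-memory store. This is only a sketch to make the cycle visible: `store`, `txn`, and `simulate` are hypothetical names for illustration, not part of etcd or its client SDK, and the "lost response" is modeled by simply discarding the result of the first attempt.

```go
package main

import "fmt"

// store is a toy single-node key-value store (hypothetical, not etcd).
// Each call to txn runs atomically because the program is single-threaded.
type store struct{ data map[string]int }

// txn reads readKey, then writes writeKey = val.
// It returns the value it read, or "nothing" if readKey did not exist.
func (s *store) txn(readKey, writeKey string, val int) string {
	read := "nothing"
	if v, ok := s.data[readKey]; ok {
		read = fmt.Sprint(v)
	}
	s.data[writeKey] = val
	return read
}

// simulate replays the workflow and returns what the client observed
// for txn1's read of k1 and txn2's read of k2.
func simulate(retry bool) (txn1Read, txn2Read string) {
	s := &store{data: map[string]int{}}

	// txn1 is applied by the server, but the response is lost,
	// so the client never learns what the first attempt read.
	_ = s.txn("k1", "k2", 20)
	txn1Read = "unknown"

	// txn2 runs and succeeds normally.
	txn2Read = s.txn("k2", "k1", 10)

	// The client blindly retries txn1 after the error.
	if retry {
		txn1Read = s.txn("k1", "k2", 20)
	}
	return txn1Read, txn2Read
}

func main() {
	r1, r2 := simulate(true)
	fmt.Printf("with retry:    txn1 read k1=%s, txn2 read k2=%s\n", r1, r2)
	r1, r2 = simulate(false)
	fmt.Printf("without retry: txn1 read k1=%s, txn2 read k2=%s\n", r1, r2)
}
```

With the retry, each transaction observes the other's write, which no serial order can produce. Without it, the client is left with an unknown outcome for txn1, but the history is still equivalent to running txn1 and then txn2.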

Breaks Read Committed

Let's work with an example/workflow,

  • client 1 sends a request to write k/v: 277/1;
  • client 2 sends a request to write k/v: 277/4;
  • client 2 receives a success response, which means 277/4 was successfully persisted;
  • kill the etcdserver & restart it;
  • the etcd client sdk retries the write of k/v: 277/1, so it is also successfully persisted.
    • But it's a problem if client 1 doesn't get a success response for whatever reason, e.g. a timeout.
  • client 3 reads k: 277, but gets 277/1 instead of 277/4.

From the client's perspective, the read should return 277/4 in this case, because that write was confirmed as committed. So it breaks Read Committed.

  • Note that breaking Read Committed usually means that a client sees uncommitted data, i.e. a dirty read.
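The same workflow can be sketched with a toy server that can be stopped and started. Everything here is hypothetical (`server`, `put`, `replay`), and the sketch assumes that client 1's first attempt fails because the server is briefly unreachable; the original workflow does not say why that attempt fails.

```go
package main

import (
	"errors"
	"fmt"
)

var errUnavailable = errors.New("server unavailable")

// server is a toy stand-in for etcdserver (hypothetical, not the real API).
// data survives stop/start, mimicking persisted state.
type server struct {
	data map[string]string
	up   bool
}

func (s *server) stop()  { s.up = false }
func (s *server) start() { s.up = true }

// put persists k/v only while the server is up.
func (s *server) put(k, v string) error {
	if !s.up {
		return errUnavailable
	}
	s.data[k] = v
	return nil
}

// replay runs the workflow and returns what client 3 reads for key 277.
func replay() string {
	s := &server{data: map[string]string{}}

	// client 1: first attempt fails (assumption: server unreachable).
	err1 := s.put("277", "1")

	// client 2: the write succeeds and is acknowledged.
	s.start()
	if err := s.put("277", "4"); err != nil {
		panic(err)
	}

	// kill and restart the server.
	s.stop()
	s.start()

	// the client SDK transparently retries client 1's write.
	if err1 != nil {
		_ = s.put("277", "1")
	}

	// client 3 reads the key.
	return s.data["277"]
}

func main() {
	fmt.Println("client 3 reads 277/" + replay())
}
```

Client 3 ends up reading the value from the delayed retry rather than the value that client 2 was told had been committed.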

EDIT: even without the client's retry, users can still run into this "issue", because a user may get a failure response even though etcdserver has already successfully processed the request. We know this is a little confusing to users, but it isn't an issue from etcd's perspective. The Proposal (see below) can mitigate it, but can't completely resolve it.

What did you expect to happen?

etcd should never break Serializable or Read Committed.

How can we reproduce it (as minimally and precisely as possible)?

See workflow mentioned above. We need to create two e2e test cases to reproduce this issue.

We can leverage gofailpoint to reproduce the Serializable issue. When etcdserver receives the two transaction requests, it intentionally returns a failure response for the first transaction, and only once; when etcdserver receives the retried transaction, it should return success.

We can also leverage a sleep gofailpoint to interleave the execution of the two transactions.
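The "fail only once" behaviour described above can be sketched in plain Go. This is not gofail itself: `failOnce` and `errInjected` are hypothetical names, and `sync.Once` merely stands in for a failpoint that is armed for a single hit. The important detail is that the operation is applied before the failure is reported, which is what makes the later retry harmful.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errInjected = errors.New("injected failure")

// failOnce reports a failure for the first request only (hypothetical helper).
type failOnce struct{ once sync.Once }

// wrap applies the operation, then returns errInjected on the first call
// and nil on every later call.
func (f *failOnce) wrap(apply func()) error {
	apply() // the server-side effect always happens

	var err error
	f.once.Do(func() { err = errInjected })
	return err
}

func main() {
	var fp failOnce
	applied := 0
	apply := func() { applied++ }

	first := fp.wrap(apply)  // applied, but reported as failed
	second := fp.wrap(apply) // the retry: applied again, reported as success

	fmt.Println("first attempt error: ", first)
	fmt.Println("second attempt error:", second)
	fmt.Println("times applied:       ", applied)
}
```

In a real e2e test, the equivalent hook would live inside etcdserver and be enabled by the test, together with a sleep to order the two transactions.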

Proposal

  • We should guarantee that the client never retries when the previous attempt may already have succeeded.
    • One valid case for retrying is when the client receives an auth failure.
  • We should expose an API to let users enable or disable the retry.

see also #14890 (comment)
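A minimal sketch of what the proposal could look like on the client side. All names here (`retryPolicy`, `safeToRetry`, `do`, and the two sentinel errors) are hypothetical and do not correspond to the existing etcd client API. The sketch assumes that an auth failure means the request was rejected before being applied, and that any other error leaves the outcome unknown.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	// Assumption: the request was rejected before being applied.
	errAuthFailed = errors.New("auth failed")
	// Assumption: the request may or may not have been applied.
	errTimeout = errors.New("timeout")
)

// retryPolicy lets the user enable or disable retries (hypothetical API).
type retryPolicy struct {
	enabled     bool
	maxAttempts int
}

// safeToRetry reports whether err guarantees that the previous attempt
// had no effect on the server.
func safeToRetry(err error) bool {
	return errors.Is(err, errAuthFailed)
}

// do runs op and retries only when the policy allows it and the error
// proves that the previous attempt was not applied.
func do(p retryPolicy, op func() error) (attempts int, err error) {
	for {
		attempts++
		err = op()
		if err == nil || !p.enabled || attempts >= p.maxAttempts || !safeToRetry(err) {
			return attempts, err
		}
	}
}

func main() {
	p := retryPolicy{enabled: true, maxAttempts: 3}

	// Outcome unknown: no retry.
	n, err := do(p, func() error { return errTimeout })
	fmt.Println(n, err)

	// Auth failure on the first attempt, then success: one retry.
	calls := 0
	n, err = do(p, func() error {
		calls++
		if calls == 1 {
			return errAuthFailed
		}
		return nil
	})
	fmt.Println(n, err)
}
```

The cost of this design is that ambiguous errors surface to the caller, who then has to decide whether the operation is safe to repeat.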

Action

  • Create an e2e test case to reproduce the "Serializable" issue.
  • Follow proposal above to resolve the issue
