-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make leases linearizable (excluding TTL) #13915
Comments
I think we may have a short-term plan and a long-term plan. Short-term plan: So in short the short-term plan is merge pull/13690 into main and backport to 3.5. Long-term plan: The long-term plan should only be performed on 3.6 or even 3.7. |
I don't agree that #13690 fixes the issue. It adds a wait time for apply on leader, however this makes the situation described in this issue even worse. It increases the window/time after leader check, meaning that there is a greater chance that leader has changed. Best short term solution would be to stop treating ttl=0 as lease not existing. |
I might be wrong, but
I think the above comment is only partially valid. Firstly, usually waitAppliedIndex returns immediately (which means there is no wait time at all) because normally the leader finishes each applying workflow before it receives the following/next gRPC request, i.e. LeaseRenew. Secondly, even in some extreme scenarios when Etcdserver hasn't finished the previous applying workflow before it receives the following gRPC request, the leader may change during the waitAppliedIndex . But it shouldn't be a problem, because Etcdserver will forward the request to the new leader. See lessor.go#L396 and v3_server.go#L321-L335. Please note that all above discussion are based on the assumption there is no big refactor for now. So it's for the short-term plan. With regard to the long-term plan, I am thinking my previous comment " |
@serathius Are you working on this? If not, I'd like to work on it. I can start to work on it about 1+ week later. We need to get all lease related issue & enhancement sorted out, and provide a clear plan. |
I have a WIP branch for this but didn't get much time in the last couple of weeks (I've been mostly out with a shoulder injury for the last week or so). The hardest bit to figure out conceptually has been figuring out how to handle time and clock drift - specifically assuming we want to switch away from having checkpoint decrement the TTL in Raft and towards having some kind of |
Not sure we want this. Please let's discuss the design and agree on who should implement it to avoid duplicated and wasted work. |
Thanks @serathius and @endocrimes for the feedback. Let's discuss the design & plan firstly, and balance the effort afterwards, I will try to drive this. Feel free to jump in if anyone has any thoughts or ideas. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions. |
As etcd leases depend on real time, they cannot provide same guarantees as rest of the API. In distributed systems we cannot depend on local time of single member, so etcd mitigates this by relying on cluster leader to decide what it lease TTL and whether it expired.
There is however there are two major issues related to current implementation of leases in etcd:
First issue means that methods that read/update lease TTL, are not guaranteed to provide accurate results/succeed. However, fixing them will require larger redesign of leases. We should revisit this in future releases.
Second issue means that leases might be reported as deleted, even though keys using them will still exist and they can be stil renewed. For v3.5 we should look into fixing this issue.
Proposal: Whether lease exist should depend on raft, making it linearizable.
With this change we should be able to address #13675, and improve reliability of lease API.
Changes:
This change brings one issue that we should be discussed. Clients might not be correctly handling case where lease has ttl=0. For example client could be using ttl as a delay to wait for lease to be revoked. We should double check client code to verify if this change will have an impact and consider having some minimal TTL that is returned (for example 1 second).
cc @ptabor @ahrtr @endocrimes @spzala
The text was updated successfully, but these errors were encountered: