Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: add a mechanism for various endpoint selection mode #4030

Merged
merged 2 commits into from
Dec 24, 2015

Conversation

mitake
Copy link
Contributor

@mitake mitake commented Dec 21, 2015

Current etcd client library chooses a default destination node from
every member of a cluster in a random manner. However, requests of
write and read (for consistent results) need to be forwarded to the
leader node as the nature of Raft algorithm. If the chosen node is a
follower, additional network traffic will be caused by the forwarding
from follower to leader.

Mainly for reducing the forward traffic, this commit adds a new
mechanism for various endpoint selection mode to the client library
which can be configured with client.Config.SelectionMode.

Currently, two modes are provided:

  • EndpointSelectionRandom: default, same to existing behavior (pick
    a node in a random manner)
  • EndpointSelectionPrioritizeLeader: prioritize leader, for the above
    purpose

I evaluated the effectiveness of the EndpointSelectionPrioritizeLeader
with 4 t1.micro instances of AWS (3 nodes for etcd cluster and 1 node
for etcd client). Client executes this simple benchmark
(https://github.com/mitake/etcd-things/tree/master/prioritize-leader-bench),
just writes 10000 keys. When SelectionMode == EndpointSelectionRandom
(default), the benchmark needed 1 min and 32.102 sec to finish. When
SelectionMode == EndpointSelectionPrioritizeLeader, the benchmark
needed 1 min 4.760 sec.

This PR also adds a new Leader() call to MembersAPI for the EndpointSelectionPrioritizeLeader mode.

type EndpointSelectionMode int

const (
// EndpointSelectionDefault: pick an endpoint in a random manner
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make comments full sentences as EndpointSelectionDefault is to pick an endpoint in a random manner.

@mitake mitake force-pushed the endpoint-selection branch from 767491e to 6eee6f0 Compare December 22, 2015 02:35
@mitake
Copy link
Contributor Author

mitake commented Dec 22, 2015

@gyuho fixed the style of comments

@gyuho
Copy link
Contributor

gyuho commented Dec 22, 2015

Thanks @mitake . I want to benchmark this function tomorrow, to double check in my side. Thanks for working on this.


const (
// EndpointSelectionDefault is to pick an endpoint in a random manner.
EndpointSelectionDefault EndpointSelectionMode = iota
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EndpointSelectionRandom?

@xiang90
Copy link
Contributor

xiang90 commented Dec 22, 2015

Let's fix these style issues and remove the "unnecessary" feature. Thanks a lot.

@mitake mitake force-pushed the endpoint-selection branch from 6eee6f0 to 53dabda Compare December 22, 2015 06:00
@mitake
Copy link
Contributor Author

mitake commented Dec 22, 2015

@xiang90 Removed EndpointSelectionPreference and fixed style problems pointed, PTAL.

@xiang90
Copy link
Contributor

xiang90 commented Dec 22, 2015

@mitake LGTM. @gyuho wants to double check the result so I would like to defer this to him. Thanks!

@gyuho
Copy link
Contributor

gyuho commented Dec 22, 2015

Yeah, I just want to confirm that I see the same performance gain as you.
We will merge tomorrow. Thanks @mitake.

@xiang90
Copy link
Contributor

xiang90 commented Dec 22, 2015

gofmt seems to be failing.

@mitake mitake force-pushed the endpoint-selection branch from 53dabda to 0d01926 Compare December 22, 2015 06:07
@mitake
Copy link
Contributor Author

mitake commented Dec 22, 2015

@xiang90 oops, sorry for that. Fixed it.

@mitake mitake force-pushed the endpoint-selection branch from 0d01926 to 7616765 Compare December 22, 2015 08:52
@mitake
Copy link
Contributor Author

mitake commented Dec 22, 2015

updated commit message


func (l *membersAPIActionLeader) HTTPRequest(ep url.URL) *http.Request {
u := v2MembersURL(ep)
req, _ := http.NewRequest("GET", u.String()+"/leader", nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the style of the other functions: https://github.com/coreos/etcd/blob/eaaf98348ce433ff0a8ff94552649af215754df3/client/members.go#L216
I also suggest splitting out const leaderSuffix = "leader".

@gyuho
Copy link
Contributor

gyuho commented Dec 22, 2015

@xiang90 @jonboulle I tested this in a simple 3-member cluster.

With 100 keys of random bytes:

  • Old: One loop took 298.645066ms with errCnt 22
  • New: One loop took 270.700415ms with errCnt 0

So current client takes longer per loop (one put request + one get request to the key from the put request). errCnt means how many times out of 100, it tries to talk to follower when the follower's log is not ready to answer the client request.

Looks good. Will wait on @mitake's fix for @jonboulle's feedback.

@mitake mitake force-pushed the endpoint-selection branch from 7616765 to 5d78417 Compare December 23, 2015 13:52
@mitake mitake force-pushed the endpoint-selection branch 2 times, most recently from c716ecb to aa52a42 Compare December 23, 2015 14:30
@gyuho
Copy link
Contributor

gyuho commented Dec 23, 2015

@jonboulle Can you go over @mitake's updates to your feedback?

@@ -35,6 +35,7 @@ var (
ErrTooManyRedirects = errors.New("client: too many redirects")
ErrClusterUnavailable = errors.New("client: etcd cluster is unavailable or misconfigured")
errTooManyRedirectChecks = errors.New("client: too many redirect checks")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, please swap lines 37/38

@jonboulle
Copy link
Contributor

I still feel a bit uneasy about this for long-lived clients. Maybe I'm just being overly paranoid, but it still seems like there are various opportunities for things to get into a weird state where it's not really a PrioritizeLeader mode any more. For example, Do() code will just pin a new arbitrary (available) endpoint if anything goes wrong with a request.

@xiang90
Copy link
Contributor

xiang90 commented Dec 23, 2015

@jonboulle Like I said, I will view this as an acceptable client side optimization when choosing the initial member to talk with without any guarantees. If user does not enable auto-sync, it probably means nothing after any configuration changes.

@xiang90
Copy link
Contributor

xiang90 commented Dec 23, 2015

@jonboulle Another way is that etcd-server always piggy-back the leader addr as a hint. The client can always decide to switch to leader or not. That is what we did for etcd0 and some other systems do for reducing latency.

Current etcd client library chooses a default destination node from
every member of a cluster in a random manner. However, requests of
write and read (for consistent results) need to be forwarded to the
leader node as the nature of Raft algorithm. If the chosen node is a
follower, additional network traffic will be caused by the forwarding
from follower to leader.

Mainly for reducing the forward traffic, this commit adds a new
mechanism for various endpoint selection mode to the client library
which can be configured with client.Config.SelectionMode.

Currently, two modes are provided:
 - EndpointSelectionRandom: default, same to existing behavior (pick
   a node in a random manner)
 - EndpointSelectionPrioritizeLeader: prioritize leader, for the above
   purpose

I evaluated the effectiveness of the EndpointSelectionPrioritizeLeader
with 4 t1.micro instances of AWS (3 nodes for etcd cluster and 1 node
for etcd client). Client executes this simple benchmark
(https://github.com/mitake/etcd-things/tree/master/prioritize-leader-bench),
just writes 10000 keys. When SelectionMode == EndpointSelectionRandom
(default), the benchmark needed 1 min and 32.102 sec to finish. When
SelectionMode == EndpointSelectionPrioritizeLeader, the benchmark
needed 1 min 4.760 sec.
@mitake mitake force-pushed the endpoint-selection branch from aa52a42 to a46ffc6 Compare December 24, 2015 02:03
@mitake
Copy link
Contributor Author

mitake commented Dec 24, 2015

@jonboulle fixed based on your comment, PTAL.

@mitake
Copy link
Contributor Author

mitake commented Dec 24, 2015

@jonboulle I understand your concern about the difference between actual server configuration and view from clients. However, such inconsistency can be caused by other reasons e.g. etcd cluster membership reconfiguration. Developers of long living clients must care about the inconsistency any way. I'd like to add this point to the docs.

@jonboulle
Copy link
Contributor

OK, thanks for your patience.

Developers of long living clients must care about the inconsistency any way. I'd like to add this point to the docs.

This would be great, thanks

@jonboulle
Copy link
Contributor

LGTM

jonboulle added a commit that referenced this pull request Dec 24, 2015
client: add a mechanism for various endpoint selection mode
@jonboulle jonboulle merged commit dac56fa into etcd-io:master Dec 24, 2015
@gyuho
Copy link
Contributor

gyuho commented Dec 24, 2015

Thanks Jon!

Sincerely,
Gyu-Ho Lee
On Dec 24, 2015 03:55, "Jonathan Boulle" notifications@github.com wrote:

Merged #4030 #4030.


Reply to this email directly or view it on GitHub
#4030 (comment).

@mitake mitake deleted the endpoint-selection branch December 24, 2015 15:29
@mitake
Copy link
Contributor Author

mitake commented Dec 24, 2015

@xiang90 @gyuho @jonboulle thanks a lot for detailed review!

mitake added a commit to mitake/etcd that referenced this pull request Jan 19, 2016
This commit adds a document that provides tips of how to use the go
client library. Currently it describes how to use the
client.SelectionMode parameter that is added in
etcd-io#4030.
mitake added a commit to mitake/etcd that referenced this pull request Jan 19, 2016
This commit adds a document that provides tips of how to use the go
client library. Currently it describes how to use the
client.SelectionMode parameter that is added in
etcd-io#4030.
mitake added a commit to mitake/etcd that referenced this pull request Jan 20, 2016
This commit adds a document that provides tips of how to use the go
client library. Currently it describes how to use the
client.SelectionMode parameter that is added in
etcd-io#4030.
mitake added a commit to mitake/etcd that referenced this pull request Jan 21, 2016
This commit adds a document that provides tips of how to use the go
client library. Currently it describes how to use the
client.SelectionMode parameter that is added in
etcd-io#4030.
mitake added a commit to mitake/etcd that referenced this pull request Jan 21, 2016
This commit adds a document that provides tips of how to use the go
client library. Currently it describes how to use the
client.SelectionMode parameter that is added in
etcd-io#4030.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants