Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expose lockWaitTimeout to lock options #56

Closed
wants to merge 3 commits into from
Closed

Conversation

shlompy
Copy link

@shlompy shlompy commented May 11, 2017

The default request timeout for a lock kv.get request which is being set is too sensitive for some environments - lockwaittime +1000ms.
This is causing random timeout errors which are emitting 'end' event for the lock.

This change is exposing the lock kv.get kv.get request timeout to the lock constructor options.
If not specified, the default for the request timeout will stay lockwaittime +1000ms.

Also updated README.md to include the 2 lock timeout options.

I hope this can be merged quickly, as currently I'm unable to use the lock as is, and I really wouldn't want to create a new module for my needs.

@shlompy
Copy link
Author

shlompy commented May 13, 2017

@silas I don't want to pressure, but I'm not sure how often this git repo is maintained. I'll need to know if there is an estimation on when this can be merged or I'll need to fork this module.

@silas
Copy link
Owner

silas commented May 13, 2017

I'm not ready to make this change as-is, if you need it right away I'd fork it.

lockWaitTime is the kv.get wait option, which is how long consul will hold the connection open waiting for changes. If consul isn't returning that GET request within 1 second of the wait timeout that indicates that something strange is happening (basically the HTTP request is already open, it just needs to respond back that there were no changes). This could be related to the node IO loop being locked, GC in consul, or something else I don't understand.

I'm open to increasing the http timeout, I just need to think through the ramifications. What are you setting it to in your environment to make it work?

@shlompy
Copy link
Author

shlompy commented May 13, 2017

THanks for replying!
I tend to disagree with your assumption that if after 1 seconds the request does not get back this indicate a problem.

You're setting the http request to lockWaitTime +1 second. AFAIK this setting is to the entire http client request - from the moment a connection to the server has been attempted until a respond has arrived.
It's not a question of what is the correct value for the request timeout. In my case, I was working against a Consul behind AWS ELB, and as it happens, due to bad latency, sometimes there is an overhead and the http request itself is taking more than 1 second until a respond arrives.
I've been able to increase the ammount of connection timeout by reducing it from 1 second to lower values. increasing it to more than 1 seconds reduced the failures. Just to be on the safe side I've increased it to 5 seconds which seemed more than stable enough.

Again, it's not a matter if the right value, and I don't think increasing it to 5 inside the module is right.
You can't assume 1 seconds is enough.. In my case, it was not enought. That's why I suggest to expose it to the options, so every user can tweak it to his needs

@shlompy
Copy link
Author

shlompy commented May 13, 2017

*Also note that because the http timeout value may be too sensitive to some network environments - this is causing false lock 'end' emittions.
If it the module was performing a retry, it would have also be fine, but not emitting 'end' just because one of the get request got timed out.

@silas silas closed this in 770b36d May 13, 2017
@shlompy
Copy link
Author

shlompy commented May 13, 2017

for a second, I thought to say it is a shame you don't understand the problem and closing the PR, but then I saw you've just committed similar changed to master and tagged version 0.29.0. SO THANK YOU for quickly taking care of this!!

This is my first contribution (attempt :)) to someone's else code in github - so just a short question - will this new version will be published to npm repository automatically?

@silas
Copy link
Owner

silas commented May 13, 2017

Thanks for the PR, I decided to use a relative time instead of an absolute time and I'm going to hold off on documenting it for now. I want to rewrite the lock documentation and probably rename it to leaderElection or something to better describe the actual intended usage.

I've published it to npm and thanks again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants