
Cluster round-robin scheduler loses request #3551

Closed
alexeyten opened this issue Oct 27, 2015 · 8 comments
Labels: cluster (Issues and PRs related to the cluster subsystem)

@alexeyten (Contributor) commented Oct 27, 2015

Example code is here: https://gist.github.com/alexeyten/45fb4319e4158feb793a

I've tested this code with Node.js 4.2.1.

The server just responds with OK and restarts its workers after every 200-300 completed requests. With the default cluster scheduler (SCHED_RR), the server eventually loses some requests and ab aborts with a timeout error. When I uncomment line 5 and switch to the SCHED_NONE scheduler, the server works fine.

/cc @indutny
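
For readers who don't open the gist, here is a minimal sketch of the kind of reproduction described above. It is an illustration built on assumptions, not the gist itself: the port (8000), the recycle threshold (250 requests), and having the master initiate `worker.disconnect()` are all choices made here for the sketch.

```js
'use strict';
const cluster = require('cluster');
const http = require('http');

// In the gist, uncommenting line 5 reportedly selects SCHED_NONE; the
// equivalent toggle in this sketch is the next line.
// cluster.schedulingPolicy = cluster.SCHED_NONE;

if (cluster.isMaster) {
  const fork = () => {
    const worker = cluster.fork();
    // When the worker reports it has served enough requests, disconnect it
    // from the master side and fork a replacement.
    worker.on('message', () => {
      worker.disconnect();
      fork();
    });
  };
  fork();
} else {
  let served = 0;
  http.createServer((req, res) => {
    res.end('OK\n');
    if (++served === 250) process.send('recycle'); // ask the master to recycle us
  }).listen(8000);
}
```

Driving a server like this with ApacheBench, e.g. `ab -n 100000 -c 10 http://127.0.0.1:8000/`, matches the reported symptom: under SCHED_RR a response occasionally never arrives and ab aborts on its timeout, while under SCHED_NONE the run completes.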

indutny added the cluster label (Issues and PRs related to the cluster subsystem) on Oct 27, 2015
@indutny (Member) commented Oct 27, 2015

cc @bnoordhuis

@evanlucas (Contributor)

I am able to reproduce this consistently.

@indutny (Member) commented Oct 30, 2015

@alexeyten I think it should be fixed by da21dba in master; could you please verify?

@bnoordhuis (Member)

I can still reproduce with master. I'll try to look into it next week.

@bnoordhuis (Member)

Interestingly enough, today's master works fine. @alexeyten Can you confirm?

@bnoordhuis (Member)

Okay, I take that back. It still happens, just infrequently.

@bnoordhuis (Member)

@alexeyten Can you confirm that #3677 fixes the issue for you?

@alexeyten (Contributor, Author)

@bnoordhuis looks so.

At least it handled 1M requests without problems.

bnoordhuis added a commit to bnoordhuis/io.js that referenced this issue Nov 6, 2015
Due to the race window between the master's "disconnect" message and the
worker's "handle received" message, connections sometimes got stuck in
the pending handles queue when calling `worker.disconnect()` in the
master process.

The observable effect from the client's perspective was a TCP or HTTP
connection that simply stalled.  This commit fixes that by closing open
handles in the master when the "disconnect" message is sent.

Fixes: nodejs#3551
PR-URL: nodejs#3677
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Fedor Indutny <fedor@indutny.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
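
To make the race easier to picture, here is a deliberately simplified, hypothetical model of a round-robin distributor with a pending-handle queue. The class and method names (RoundRobinDistributor, sendHandle, onHandleReceived, onDisconnect) are invented for illustration and do not correspond to Node's actual cluster internals; the point is only where handles can get stuck and why closing them on disconnect unsticks the client.

```js
'use strict';

// Hypothetical model, for illustration only (not Node's real cluster code).
class RoundRobinDistributor {
  constructor() {
    this.freeWorkers = [];    // workers ready to take a new connection
    this.pendingHandles = []; // accepted sockets not yet handed to a worker
  }

  // An incoming connection handle arrives on the master's listen socket.
  distribute(handle) {
    const worker = this.freeWorkers.shift();
    if (worker === undefined) {
      // No worker is free right now; park the handle until one acknowledges.
      this.pendingHandles.push(handle);
      return;
    }
    worker.sendHandle(handle); // the worker later acks with "handle received"
  }

  // The worker acknowledged a handle and can take the next parked one.
  onHandleReceived(worker) {
    const next = this.pendingHandles.shift();
    if (next !== undefined) worker.sendHandle(next);
    else this.freeWorkers.push(worker);
  }

  // The race: if the master processes "disconnect" before the worker's
  // "handle received" ack, anything still parked above is never delivered and
  // the client connection stalls. The fix is to close such handles here.
  // (This single-queue model closes everything pending; a simplification.)
  onDisconnect(worker) {
    this.freeWorkers = this.freeWorkers.filter((w) => w !== worker);
    for (const handle of this.pendingHandles.splice(0)) handle.close();
  }
}
```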
bnoordhuis added a commit that referenced this issue Nov 7, 2015 (same commit message as above; Fixes: #3551, PR-URL: #3677, reviewed by Colin Ihrig, Fedor Indutny, and James M Snell)
bnoordhuis added a commit that referenced this issue Nov 30, 2015 (same commit message as above)
bnoordhuis added a commit that referenced this issue Dec 4, 2015 (same commit message as above)
bnoordhuis added a commit that referenced this issue Dec 17, 2015 (same commit message as above)
bnoordhuis added a commit that referenced this issue Dec 23, 2015 (same commit message as above)