Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

timer,domain: timers don't maintain order after exception #1271

Closed
bnoordhuis opened this issue Mar 26, 2015 · 19 comments
Closed

timer,domain: timers don't maintain order after exception #1271

bnoordhuis opened this issue Mar 26, 2015 · 19 comments
Assignees
Labels
confirmed-bug Issues with confirmed bugs. domain Issues and PRs related to the domain subsystem. help wanted Issues that need assistance from volunteers or PRs that need help to proceed. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.

Comments

@bnoordhuis
Copy link
Member

var domain = require('domain').create();

domain.run(function() {
  setTimeout(function() { throw Error('FAIL'); }, 1);
  setTimeout(function() { console.log('timeout 1'); }, 1);
  setTimeout(function() { console.log('timeout 2'); }, 2);
});

domain.once('error', function() {});

Expected output:

timeout 1
timeout 2

Actual output:

timeout 2
timeout 1

(EDIT: I forgot to mention that you may need to run the test a few times.)

@Fishrock123 Fishrock123 added the timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. label Mar 26, 2015
@petkaantonov
Copy link
Contributor

FWIW:

io.js makes no guarantees about the exact timing of when the callback will fire, nor of the ordering things will fire in

@Fishrock123
Copy link
Contributor

It still points to larger problems within the timers impl though.

@petkaantonov
Copy link
Contributor

Yes I definitely think that timers should guarantee order to the extent that browsers do, currently they don't do that even in normal code (normal as in not using domains) and code that assumes e.g. setTimeout(..., 5) fires before setTimeout(..., 1000).is broken in node but works fine in browsers.

@silverwind silverwind added the domain Issues and PRs related to the domain subsystem. label Mar 26, 2015
@Fishrock123
Copy link
Contributor

I'm pretty sure this has something to do with the fact that after an error timers.js re-schedules to start processing again on the next tick, but doesn't actually stop processing presently: https://github.com/nodejs/io.js/blob/master/lib/timers.js#L100

This diff fixes the test, but causes other problems, so I'm confused.

diff --git a/lib/timers.js b/lib/timers.js
index 494c599..67b7a7a 100644
--- a/lib/timers.js
+++ b/lib/timers.js
@@ -99,6 +99,7 @@ function listOnTimeout() {
           process.domain = null;
           process.nextTick(listOnTimeoutNT, list);
           process.domain = oldDomain;
+          return;
         }
       }
     }

if you're curious, here's the error log from make test (8 errors): https://gist.github.com/Fishrock123/ad3e50ac7e4e76740c3e

@Fishrock123
Copy link
Contributor

Also, is it possible this only happens when the list gets processed at a time when all 3 should fire in the same tick? (i.e. >2ms)

@Fishrock123 Fishrock123 added the confirmed-bug Issues with confirmed bugs. label Jul 24, 2015
@trevnorris
Copy link
Contributor

Just to add to this confusion, here's a slight change to the test script:

'use strict';
const domain = require('domain').create();
const print = process._rawDebug;

domain.run(function() {
  setTimeout(function() { throw Error('FAIL') }, 1);
  setTimeout(function() { print('timeout 1') }, 1);
  setImmediate(function() { print('immediate 1') });
  setTimeout(function() { print('timeout 2') }, 2);
  setImmediate(function() { print('immediate 2') });
});

domain.once('error', function() {});

I've gotten the output of:

timeout 2  
timeout 1  
immediate 1
immediate 2

and

immediate 1
immediate 2
timeout 1  
timeout 2  

@jasnell
Copy link
Member

jasnell commented Apr 19, 2016

@trevnorris @bnoordhuis ... running this against master currently, given about 100 runs each on @bnoordhuis' original and @trevnorris' revised version and I'm not seeing any variability in the ordering over timer events. Could this one have been fixed?

@Fishrock123
Copy link
Contributor

Strange. I'll investigate soon.

@Fishrock123 Fishrock123 self-assigned this Apr 19, 2016
@jasnell
Copy link
Member

jasnell commented Apr 19, 2016

btw, ran the tests on OSX in case it happens to matter. Have not tried the same tests against v5 or v4 yet.

@Fishrock123
Copy link
Contributor

Fishrock123 commented May 2, 2016

Looks like I can still reproduce, at about the same rate.

Make sure you use a command looper. Here's a modified test to accommodate that.

const domain = require('domain').create();
var first = false;

domain.run(function() {
  setTimeout(function() { throw Error('FAIL'); }, 1);
  setTimeout(function() {
    first = true;
    console.log('timeout 1');
  }, 1);
  setTimeout(function() {
    console.log('timeout 2');
    if (first === false) {
      console.log('timers reordered!')
      process.exit(1);
    }
  }, 2);
});

domain.once('error', function() {});

... shell scripts: (fish-shell & bash)

while ../node/node timers-reorder.js
end
while ../node/node timers-reorder.js; do :; done

@Fishrock123 Fishrock123 removed their assignment May 2, 2016
@imyller
Copy link
Member

imyller commented Sep 16, 2016

Revisited this old issue with current master and v6.6.0:

  • Can still reproduce this on Linux
  • Could not reproduce on Mac

@Fishrock123 Fishrock123 self-assigned this Sep 17, 2016
@Fishrock123
Copy link
Contributor

Just repo'd on OS X 10.10.5. @imyller did you try my test case on repeat?

Maybe we should just call this "one of those odd domain bugs", close the issue, and call it a day.

Or I could spend like a week debugging it but that is probably not a good use of my time.

@imyller
Copy link
Member

imyller commented Sep 17, 2016

did you try my test case on repeat?

@Fishrock123 Yes, I just let it loop with shell while-do on OS X 10.11.6 and couldn't reproduce.

With Linux test environment this is very easy to reproduce with only few repeats.

Your call, but I personally wouldn't close the issue yet because the reproduceable problem still exists and the cause is not yet understood.

@Fishrock123 Fishrock123 added the help wanted Issues that need assistance from volunteers or PRs that need help to proceed. label Sep 17, 2016
@Fishrock123
Copy link
Contributor

I suppose. I suspect unless someone else throws a considerable amount of time at it, this will just go away when domains are likely removed in a couple years.

I'm able to help with the timers side of things if anyone wants to take a look into it.

Maybe @misterdjules would actually be the best to take a look?

@jBarz
Copy link
Contributor

jBarz commented Dec 29, 2016

I took a crack at this and hopefully what I found is useful :-)
The callbacks are stored in an associative array that is indexed by the number of milliseconds.
So list[1] is created for 1ms callbacks, and list[2] is created for 2ms callbacks.
Both these lists are in the queue for the current tick.
When list[1] encounters an unhandled exception, only the 1ms list is postponed to the next tick.
However, list[2] still exists on the current tick and will go through.

The solution seems to be skip all further list[x] to the next tick.
I tested this on linux and all tests pass. Also the test case in the first comment prints the timers in order.
jBarz@e93c751

@gibfahn
Copy link
Member

gibfahn commented Dec 29, 2016

@jBarz that sounds reasonable, could you submit a PR?

@Fishrock123
Copy link
Contributor

Hmmmm that does actually make sense now that I think about it.

@jBarz
Copy link
Contributor

jBarz commented Dec 29, 2016

I added @Fishrock123 's test case from #1271 (comment)
Without the looping though. So it runs only once.

@Fishrock123
Copy link
Contributor

Fixed by 9dac165

Fishrock123 pushed a commit that referenced this issue Feb 15, 2017
PR-URL: #10522
Fixes: #1271
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
italoacasas pushed a commit to italoacasas/node that referenced this issue Feb 16, 2017
PR-URL: nodejs#10522
Fixes: nodejs#1271
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
italoacasas pushed a commit that referenced this issue Feb 22, 2017
PR-URL: #10522
Fixes: #1271
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
jasnell pushed a commit that referenced this issue Mar 7, 2017
PR-URL: #10522
Fixes: #1271
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
MylesBorins pushed a commit that referenced this issue Mar 9, 2017
PR-URL: #10522
Fixes: #1271
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. domain Issues and PRs related to the domain subsystem. help wanted Issues that need assistance from volunteers or PRs that need help to proceed. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.
Projects
None yet
Development

No branches or pull requests

9 participants