Strategy for dealing with Stack Overflows

Stack overflow issues have come up frequently with async.  Most of the time, they come from using synchronous iteration functions with large inputs, e.g. `async.eachSeries(Array(1000000), function () { callback(); }, done)`.  (#66, #75, #110, #117, #133, #296, #413, #510, #573, #588, #590, #604, #622)  Some of those issues aren't caused by large arrays exactly, but most of the time they boil down to using a sync iteration function.

A common way to fix the problem is to use an `async.nextTick()` internally to re-set the call-stack for the next step of iteration.  This masks the problem, and slows performance.  In node and later IE's, you can use `setImmediate` or `process.nexTick` which is very fast, but in other browsers it is implemented as `setTimeout(fn, 0)`, which actually ends up being about `setTimeout(fn, 4)`.  It adds unnecessary delay to your processing, if your iteration functions are actually asynchronous.  This `setImmediate`-type guarding was added to `waterfall` and it appears to cause a pretty big performance hit:   #40, #152, #513, #621  

I feel like we need to come up with a comprehensive and consistent strategy for dealing with these competing issues.
### Option 1: Don't guard against stack overflows, leave it to the user

I propose removing all `async.nextTick`s and `setImmediate`s from async library functions, to remove the performance hit.  This would be a pretty simple change.  We can still offer the `async.nextTick` and `async.setImmediate` functions to help people fix their sync functions, when they do need to use them in a `waterfall` or something similar.

We would need to document what we actually mean by synchronous functions , based on the confusion in  #629.  A `RangeError` is a pretty good indicator of using a sync function.

This would be a breaking change, a 1.0-type feature.  We could also provide a wrapper function that will make sync functions work, to ease migration.  `stackSafe` from  #516  is an interesting idea.
### Option 2: Guard against stack overflows, but use a better `setImmediate`

There is a pretty comprehensive polyfill for setImmediate that handles a variety  of browsers and JS execution environments:  [YuzuJS/setImmediate](https://github.com/YuzuJS/setImmediate).  It uses the native `setImmediate` in node and later IE's, `postMessage` tricks in modern browsers, and `script.onreadystatechange` tricks in older IE's.  It gives sub-millisecond deferrals for 99%+ of JS execution environments.

If we were to use this for the expressed purpose of preventing stack overflows, we should use it everywhere.  There should be tests for every async library function with huge arrays and sync functions.  This would be a bit of work.

It would require a build-step (e.g. `browserify --standalone`) for the standalone distribution of async.  Async would now have an npm dependency,  The other option is to just copy the setImmediate code in, which is compatible with the MIT license I believe.
### Option 3: Have a "dev-mode" that can detect sync iterators

Based on an environment variable or `async._devMode` flag, we could conditionally add some guards to iteration functions to see if they callback on the same event loop tick.  Stack overflows would still happen, but you would get warnings that would make the source of the error easier to find.  This would be tricky to implement, but it is a compromise between Options 1 and 2.  

Implementing it without causing a performance hit will be difficult, and would probably require a build that can strip dead code.  I'm reminded of  how `envify` and Uglify can strip out `if (process.env.NODE_ENV !== "production") { ... }` blocks when browserifying, but I'm not sure if having both `async.js` and `async.dev.js` is desirable.

This is also a breaking, 1.0-type feature.
### Option 4: Do nothing

Just leave things as-is.  Some functions guard against stack overflows, others don't.  We have a mis-match of philosophies.  `waterfall` is slower than is has to be, `eachSeries` is fast, but will `RangeError` with a large array and a sync iteration function.  We will continue to receive bug reports about stack overflows and poor performance.

---

So -- thoughts anyone? (@caolan  @beaugunderson )  I am in favor of Option 1, but I am also biased because I also have experience with diagnosing the errors caused by sync iterators. :smiley: 


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Strategy for dealing with Stack Overflows #696

Option 1: Don't guard against stack overflows, leave it to the user

Option 2: Guard against stack overflows, but use a better `setImmediate`

Option 3: Have a "dev-mode" that can detect sync iterators

Option 4: Do nothing

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Strategy for dealing with Stack Overflows #696

Description

Option 1: Don't guard against stack overflows, leave it to the user

Option 2: Guard against stack overflows, but use a better setImmediate

Option 3: Have a "dev-mode" that can detect sync iterators

Option 4: Do nothing

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Option 2: Guard against stack overflows, but use a better `setImmediate`