Standardize a way for writable streams to do work after all data has been consumed #7348
Comments
I wanted this as well when we were speccing out streams2, but the idea got shot down for some reason. I can't remember the details.

I chatted briefly about this with @tjfontaine yesterday afternoon and I think he's open to the idea.
This is what documentation is for. It seems to be a reasonable expectation that if people are using streams, that they know what the methods are. Further, if a module uses the 'close' event, it seems reasonable to document that in the particular module. I see that's now been done for the module in question. So while I don't find that particular use case compelling, I can see potential use if the WritableStream has an internal buffer, like with Transform streams.

I ran into this case myself recently, but in my case I found my WritableStream was better modeled as a Transform stream, which made the problem go away, as I then had access to the `_flush` method.

If that works then that's a nice solution. In my case it doesn't make any sense to read data out of the stream, so to get it to actually consume (upload) the data I'd have to bind my own "data" listener or pipe it to some unused place. Transform streams are a special case of readable + writable streams, so it seems weird that they have a feature that exists on neither readable nor writable streams.
Great point. I would expect a Transform stream to be a pure union of Readable and Writable streams.
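For context on the hook being discussed above, this is roughly how `_flush` already works on Transform streams; a minimal sketch, with the class name and output string invented for illustration:

```js
var Transform = require('stream').Transform,
    util = require('util');

function Upcaser() {
  Transform.call(this);
}
util.inherits(Upcaser, Transform);

// Called once for each chunk written to the stream.
Upcaser.prototype._transform = function(chunk, encoding, cb) {
  this.push(chunk.toString().toUpperCase());
  cb();
};

// Called after all written data has been consumed, but before the
// readable side emits 'end' -- the hook this issue wants for Writable.
Upcaser.prototype._flush = function(cb) {
  this.push('--done--');
  cb();
};

process.stdin.pipe(new Upcaser()).pipe(process.stdout);
```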
To the folks finding this via Google and looking for the best workaround until there's a native _flush implementation, here's my monkey patch. It's a strange approach, but since it doesn't actually patch the stream itself, it's immune to any changes that may be made in reference to this. I'm using this in production to stream to S3, in a framework where 'finish' absolutely must mean that no part of the stream is still active.

```js
var EventEmitter = require('events').EventEmitter,
    Writable = require('stream').Writable,
    util = require('util');

function MyWriteStream() {
  Writable.call(this, {highWaterMark: 500});
}
util.inherits(MyWriteStream, Writable);

// Intercept 'finish' and run _flush first; all other events pass through.
// The Writable.prototype._flush check makes this a no-op if core ever
// gains a native implementation.
MyWriteStream.prototype.emit = function(evt) {
  if (!Writable.prototype._flush && evt === 'finish') {
    this._flush(function(err) {
      if (err)
        EventEmitter.prototype.emit.call(this, 'error', err);
      else
        EventEmitter.prototype.emit.call(this, 'finish');
    }.bind(this));
  }
  else {
    var args = Array.prototype.slice.call(arguments);
    EventEmitter.prototype.emit.apply(this, args);
  }
};

MyWriteStream.prototype._flush = function(cb) {
  // This now executes identically to Transform._flush.
  // 'finish' will be emitted when cb() is called. Subclasses override
  // this; the default must still call cb() or 'finish' never fires.
  cb();
};

module.exports = MyWriteStream;
```
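For anyone adapting the patch above, here's a hedged sketch of how a subclass might use it; `uploadPart` and `commitUpload` are hypothetical stand-ins for the real S3 calls:

```js
var util = require('util'),
    MyWriteStream = require('./my-write-stream');

function S3Upload() {
  MyWriteStream.call(this);
}
util.inherits(S3Upload, MyWriteStream);

S3Upload.prototype._write = function(chunk, encoding, cb) {
  uploadPart(chunk, cb); // hypothetical: send one part to S3
};

// Runs after the last _write completes; 'finish' is held back until cb().
S3Upload.prototype._flush = function(cb) {
  commitUpload(cb); // hypothetical: the final "complete the upload" request
};

var upload = new S3Upload();
upload.on('finish', function() {
  // Safe: the commit request has already completed when this fires.
});
```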
@TomFrost This looks like a reasonable solution to me, and worthy of publishing on npmjs.org until there's an official solution. As I understand it, you've created a new Writable stream module that will call a custom _flush method after the 'finish' event is emitted. Otherwise, it works like a normal Writable stream. Nicely done.

@markstos Actually, the key is that it calls _flush before emitting 'finish' ;-). But that's an excellent suggestion -- I'll turn that into something that can be used as a drop-in extendable and put it on NPM.
Thanks!
I ran into the exact same need: AWS S3 multipart upload requires a final HTTP request to commit the upload, after all data has been sent. Earlier suggestions to listen to the 'finish' event don't solve this, since 'finish' fires before that final request can complete. The `_flush()` method for Writable streams seems to be the best solution: it covers all issues raised here, and doesn't break anything. @TomFrost Thank you for your monkey patch solution! This is perfect in the meantime! Will this be fixed in time for node.js v0.12?
node v0.12.0 was released, without Writable._flush. Will this be fixed in 0.12.x, or should we wait for v0.14?
@triccardi-systran resolutions are being discussed in the io.js readable-stream repo if you are interested: |
Closing this in favor of nodejs/readable-stream#112. This should eventually land in either io.js or the converged stream and we'll pick it up from there. |
Still a work in progress: nodejs/node#2314
Transform streams can optionally have a `_flush` method that lets the stream do something after all incoming data has been consumed, but before the stream's readable side fires its 'end' event. It would be great if a similar convention existed for writable streams.

The particular use case I have in mind here is a writable stream that uploads data to S3 (source) that has to tell S3 that the upload is finished, then wait for a response. In our case, we chose to fire a 'close' event once that work is done. This is annoying for users though, because they have to know that special event exists. If they bind to the standard 'finish' event they may try to use the data produced by the writable stream before it's actually ready. Chaos ensues and tears are shed.
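As a minimal sketch of that footgun, with `S3WriteStream` and `readable` standing in for the hypothetical uploader and data source described above:

```js
var upload = new S3WriteStream(); // hypothetical S3-uploading Writable
readable.pipe(upload);

upload.on('finish', function() {
  // Fires once every write has been flushed -- but S3 hasn't acknowledged
  // the upload yet, so the data isn't actually usable here.
});

upload.on('close', function() {
  // The special event users have to know about; only now is the upload done.
});
```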
The two solutions I've thought about are (1) allowing writable streams to implement a `_flush` method, or (2) indicating when the last data is being written so that the stream can delay calling the last callback until it's finished with its post-processing work.

What do you think?
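Neither option exists in core, so the following is only a sketch of what each might look like; `UploadStream`, `uploadPart`, and `commitUpload` are all hypothetical:

```js
var Writable = require('stream').Writable,
    util = require('util');

function UploadStream() {
  Writable.call(this);
}
util.inherits(UploadStream, Writable);

// Option (1): a _flush hook symmetric with Transform._flush, called after
// the last _write has completed but before 'finish' is emitted.
UploadStream.prototype._flush = function(cb) {
  commitUpload(cb); // hypothetical final "the upload is done" request
};

// Option (2): mark the final write (here as a hypothetical extra argument)
// so the stream can hold the last callback until post-processing finishes.
UploadStream.prototype._write = function(chunk, encoding, isLast, cb) {
  uploadPart(chunk, function(err) {
    if (err || !isLast) return cb(err);
    commitUpload(cb); // 'finish' waits on this final callback
  });
};
```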