Support reading bytes into buffers allocated by user code on platforms where only async read is available #253
If the blocking I/O:
Then, we want to allow the user code to allocate a buffer with size determined based on their status. To realize this, yes, we need some API like In what form can we provide So, we need to add some interface to suppress automatic pull when we want to use If the blocking I/O:
Then,
(I assumed that the unblocking event leads to fulfillment of For this case, there's no big difference between |
Sorry for the delay on this. I had a hard time paging it back in. Unfortunately I think it is the first case:
This can be seen by looking at e.g. Win32 ReadFile or Unix fread and pread.
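The distinction matters because an fread-style call needs its destination buffer before any I/O begins, so a JS wrapper around it can only hand results back asynchronously. A minimal sketch of that constraint (all names hypothetical; the off-thread blocking read is simulated with a timer):

```js
// Hypothetical sketch: an fread-style source must be given the destination
// buffer *before* any I/O starts, so the fill can only complete
// asynchronously (e.g. via a blocking read performed on another thread).
function makeFreadStyleSource(data) {
  let position = 0;
  return {
    // Analogous to fread(buf, 1, view.length, file) run off-thread.
    readInto(view) {
      return new Promise(resolve => {
        setTimeout(() => { // stands in for the blocking read on a worker thread
          const n = Math.min(view.length, data.length - position);
          view.set(data.subarray(position, position + n));
          position += n;
          resolve(n); // bytes read; 0 signals EOF
        }, 0);
      });
    }
  };
}

const source = makeFreadStyleSource(Uint8Array.of(1, 2, 3, 4, 5));
const buf = new Uint8Array(4);
source.readInto(buf).then(n => console.log(n, buf));
```

Note there is no point at which a sync read() could return data here: until the caller supplies the buffer, nothing has been read.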
Yes, I agree, this is a problem. The only thing I can think of immediately is to remove sync read() and the whole waiting/readable state transition :(. Assuming that we would carry that over to ReadableStream that would be a lot of churn this late in the game which is frustrating. Maybe we can think of a better solution. E.g. maybe SeekableReadableByteStream only matches ReadableStream in its pipeTo API? :-/ |
Shall we add a new mode to ReadableByteStream, say, manual feed mode, named after manual paper feed of printers. ReadableByteStream would have a method named |
PR #282 is a quick prototype of the manual feedable stream. |
Warning: naming bikeshed ahead... feel free to ignore. I think an interface for manual allocation might be useful, but I'm not sure the "feedable" metaphor resonates. I've seen this kind of thing called an "allocation strategy" in other places, etc. It seems this would definitely be a power-user feature, though. So let's make sure it's optional? |
@tyoshino this is sounding promising. I think I need a bit more detail to understand how it's supposed to work though. Especially how it interacts with readInto. I think I like the general approach though, of a more-specialized variant of ReadableByteStream that helps with this case. Would love if you could flesh out the PR with some examples of consumer code, using read/readInto/feedArrayBuffer. As for naming, I don't have any strong preferences. "Strategy" is kind of already taken, but something about allocation might be nice. |
wanderview: I agree that this kind of stuff is called |
Could we use the revealing constructor pattern here again? Something like:
Then we don't need a new stream name. :-) If we add a
If we don't like confusing these attributes with the constructor "source" concept, then it could be set in a different method. Something like:
Would this address the problem? |
I don't think so, unless I'm misunderstanding. The scenario I think we're trying to solve is that the consumer of the stream wants to be able to do "fill up count bytes of ab starting at offset" and then the stream will use this to do an async (i.e., blocking in another thread) operation like I guess |
What would be most helpful for me to understand for all these proposed ideas is, how would a consumer---given a (Feedable?)ReadableByteStream from the UA---use the stream. For example, if we took a drastic approach and overhauled everything so that

```js
// Passes ab, 0, CHUNK_SIZE to fread, which uses it
rbs.readInto(ab, 0, CHUNK_SIZE).then(...)

// Uses the default chunk size, currently specified by the stream constructor
// (although we could allow it to be specified by consumers, hmm).
rbs.read().then(...)
```

What would it look like for the |
Actually, I had not thought about Async read seems ok to me, but I think it opens some questions. What does it do for concurrent read() calls? Do we allow overlap? Ordering guaranteed? |
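On the ordering question: one plausible answer (sketched here purely as an assumption, not anything proposed in the thread) is to serialize overlapping async read() calls so their promises settle in call order, with at most one underlying pull in flight:

```js
// Sketch: serialize concurrent read() calls against a pull-based source so
// results come back in call order, one underlying pull in flight at a time.
function makeOrderedReader(pull) {
  let tail = Promise.resolve();
  return {
    read() {
      const result = tail.then(() => pull());
      // The next read waits for this one to settle, success or failure.
      tail = result.then(() => {}, () => {});
      return result;
    }
  };
}

// Usage: even though both reads are issued at once, chunk order is preserved.
let i = 0;
const reader = makeOrderedReader(() => Promise.resolve(i++));
reader.read().then(v => console.log("first:", v));
reader.read().then(v => console.log("second:", v));
```

This forbids true overlap; a design that allows overlapping I/O would need to decide separately whether completion order is guaranteed.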
Well, that was kind of the entire point of ReadableByteStream, was to make it possible to do zero- or one-copy reads into user-allocated ArrayBuffers. For example you could do repeated readInto()s into the same array buffer that was pre-allocated to be the size of the file (determined by stat-ing) or response (determined by Content-Length). Async read was more of a strawman, and I'd rather not dive into it right now because it'd be a lot of spec churn. (I previously explored it here, although at the time I wasn't aware of this particular case.) It's also less efficient for read(2)-style streams like HTTP streams, at least in theory. If we do decide to go that way then ... I'd feel bad about the churn at what seems like a pretty late stage, but it'd be OK I guess. But you guys had some interesting ideas for meshing the existing wait-for-ready-then-sync-read(Into) design with this use case, and I mainly wanted to see what those would look like in practice. |
Sorry. You're right. Sorry for my confusion. I do think To answer your question, I think an allocator function could handle this like so:
I don't really like depending on state between two statements like that, but it seems the |
I think I see. OK, I think what we need to make this more concrete now is a series of use cases that we want to write sample code for. Here is what I have off the top of my head. Please give me more :).
Here is my sample code for setAllocator, let me know if I got this right.

```js
// Common prelude
const rbs = getReadableByteStreamForFile("/path/to/file.flac");
const ONE_MIB = 1024 * 1024;
const TEN_MIB = 10 * ONE_MIB;
```

(all)

```js
const ab = new ArrayBuffer(TEN_MIB);

// This begins the fread(ab, 0, ab.byteLength) call
rbs.setAllocator({
  allocate() { return ab; }
});

// fulfills when the fread() call finishes
rbs.ready.then(() => {
  const readValue = rbs.read();
  assert(readValue === ab);
  console.log("done!", ab);
  // stream is now closed
});
```

(chunkwise)

```js
const ab = new ArrayBuffer(ONE_MIB);

// This begins the fread(ab, 0, ab.byteLength) call
rbs.setAllocator({
  allocate() { return ab; }
});

// fulfills when the first fread(ab, 0, ONE_MIB) call finishes
rbs.ready.then(() => {
  const readChunk1 = rbs.read();
  assert(readChunk1 === ab);
  console.log("first chunk", ab);

  // queue is now empty but we're not done, so stream automatically kicks off
  // fread(ab, ONE_MIB, ONE_MIB). This next rbs.ready will fulfill when that
  // finishes
  rbs.ready.then(() => {
    const readChunk2 = rbs.read();
    assert(readChunk2 === readChunk1); // as desired; re-using the same buffer
    console.log("second chunk", ab);
    // Etc., until chunk 10, at which point the stream is closed.
    // Recursion left as an exercise for the reader.
  });
});
```

(ping-pong)

```js
const abs = [new ArrayBuffer(ONE_MIB), new ArrayBuffer(ONE_MIB)];
let counter = 0;

// This begins the fread(abs[0], 0, abs[0].byteLength) call
rbs.setAllocator({
  allocate() {
    return abs[counter++ % 2];
  }
});

// fulfills when the first fread(abs[0], 0, abs[0].byteLength) call finishes
rbs.ready.then(() => {
  const readChunk1 = rbs.read();
  assert(readChunk1 === abs[0]);
  console.log("first chunk", readChunk1);

  // after the rbs.read() call, the stream's queue is now empty but we're not
  // done, so stream automatically kicks off fread(abs[1], ONE_MIB, ONE_MIB).
  // This next rbs.ready will fulfill when that finishes
  Promise.all([
    processChunkAsync(readChunk1),
    rbs.ready
  ])
  .then(() => {
    const readChunk2 = rbs.read();
    assert(readChunk2 === abs[1]); // as desired
    console.log("second chunk", readChunk2);
    // Note that after calling rbs.read() this second time the stream is doing
    // fread(abs[0], 2 * ONE_MIB, ONE_MIB), so abs[0] === readChunk1 is
    // starting to be reused.
    // Anyway, keep going until chunk 10, abstract into recursive function, etc.
  });
});
```

Notably I didn't use your free hook for the (ping-pong) case, so maybe I'm missing something? |
I think you're right that I guess that raises the question of whether allocate() can return a promise, to delay the next fread() until a re-usable block is ready to be processed again. I didn't consider that. It's not completely clear to me how this would interact with
|
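The promise-returning allocate() idea floated here can be sketched as a small buffer pool (hypothetical, not a proposed API): the stream would await allocate() before starting its next fread(), and the consumer hands buffers back once it has finished processing them, giving natural backpressure:

```js
// Sketch of a promise-returning allocator: allocate() resolves immediately
// while recycled buffers are available, and otherwise waits until the
// consumer hands one back via release(). A stream awaiting allocate() before
// each fread() would thus stall until a buffer is free to be overwritten.
function makeBufferPool(byteLength, count) {
  const free = [];
  const waiters = [];
  for (let i = 0; i < count; i++) free.push(new ArrayBuffer(byteLength));
  return {
    allocate() {
      if (free.length > 0) return Promise.resolve(free.shift());
      return new Promise(resolve => waiters.push(resolve));
    },
    release(ab) {
      const waiter = waiters.shift();
      if (waiter !== undefined) waiter(ab);
      else free.push(ab);
    }
  };
}
```

With count = 2 this reproduces the (ping-pong) pattern: the third fread() cannot begin until the first chunk has been released, which is roughly the role the free hook was meant to play.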
I think maybe that is the motivation behind @tyoshino's feedArrayBuffer() paradigm. setAllocator() is a bit more automatic since you call it once instead of every time, but the ideas are starting to converge I think :).
Hmm, yeah, readInto() makes less sense here. Maybe the right way to phrase it is, I'm not sure what use case it would address. Perhaps readInto() makes more sense for the read(2)-esque streams instead of the fread-based streams, and wouldn't exist on AllocatorControlledReadableByteStream (or whatever)? |
One advantage of setAllocator over feedArrayBuffer is that once the allocator is set, you can pass off the ReadableByteStream to someone else and they can consume it as if it's a normal readable stream. I think with feedArrayBuffer they need to be aware of the feed/read cycle and can't just do ready/read like normal. |
It would be nice if this didn't require a new type. I think Any readInto call would be limited to the size of the allocator-returned buffer. If readInto completely drains the buffer, then the allocate() function is called again. |
I don't have much against new types, as long as they are duck-compatible on a large subset of the API. shrug |
Well yea, but you were talking about removing a method... so a duck with only one foot. :-) |
Yeah. Streams with an fread-style underlying source would be queue-backed streams (though it's likely to be a single-element queue) (see #275 (comment)). So,
I agree that the new user doesn't need to do the feeding, but the new user needs to:
These were not required for a normal ReadableStream user. So, I think streams to pass to someone else should have an allocator that doesn't reuse ArrayBuffers. For use cases in which the consumer code wants to adjust the size to pull (pass to
So, I suggest that we encapsulate Hmm, but now I wonder if this is something we should define in the Streams standard or not. Maybe we should just include some advice how to encapsulate blocking I/Os with the ReadableStream interface? |
What is really worrying to me is that we seem to have painted ourselves into a corner, so that now all the ideas that we're considering involve passing off the resulting complexity to authors. Whether it be

Stated another way. One of the things the current design does really well, I think, is focus on unifying push and pull sources into a single author-facing API: a "readable stream." I think it should be within our charter to unify read(2)-backed and fread-backed readable byte streams. (And, I also think it's important to make readable byte streams a special case of readable streams---or an extension of them, depending on how you look at it.)

At this point I really don't see any truly good solution except moving to async read---everywhere. But I feel really bad about this. One of the very original points of contention between your W3C streams draft and my WHATWG streams draft, inspired by Node streams, was saying that sync read was important. Now, I don't think at the time either of us had a complete understanding of all the issues involved, and how it would impact the byte-stream case due to this fread vs. read(2) dichotomy combined with the desire to specify up front how many bytes you want to read. (Or maybe you did and I just wasn't listening? :-/)

We've learned a lot about the problem domain since, and in general it's good to revise APIs in light of new information. But it's late! You guys have an implementation in Chrome that you want to ship soon! And making this big of a change this late gives off instability signals that make me quite sad.

I guess what I'm asking is---if I can draft up complete semantics for a switch to async read over the next week, how disruptive do you think this would be? To the implementation you guys are hoping to ship soon for Fetch, and to the ecosystem overall? Here's a tentative sketch of what I'm thinking, happy to revise:

ReadableStream
ReadableByteStream
Please let me know ASAP if you think I should work on this. If you think it's too late for such a big change and we should explore other options, I can understand. But if you think it's worth doing then I want to get it ready ASAP so as to minimize disruption. |
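For concreteness, the core of the async-read contract being floated can be mocked up with a plain queue. This is a sketch of the shape only (not the spec's algorithm, and the class name is made up):

```js
// Sketch of the proposed contract: read() always returns a promise for a
// { value, done } pair, so push- and pull-backed sources look identical to
// the consumer and there is no separate "waiting" vs. "readable" state.
class SimpleAsyncReadable {
  constructor(chunks) {
    this._queue = Array.from(chunks);
  }
  read() {
    if (this._queue.length === 0) {
      return Promise.resolve({ value: undefined, done: true });
    }
    return Promise.resolve({ value: this._queue.shift(), done: false });
  }
}

// Usage: drain the stream with a recursive promise loop.
function drain(stream, onChunk) {
  return stream.read().then(({ value, done }) => {
    if (done) return undefined;
    onChunk(value);
    return drain(stream, onChunk);
  });
}
```

The cost mentioned elsewhere in the thread is visible here too: each chunk now takes at least one microtask turn, even when data is already buffered.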
I'd like to do a survey and create a table of APIs and their characteristics (sync/async, blocking/non-blocking, bring-your-own-buffer/returns API-allocated buffer, pull and read are separate/pull and read are combined, ...), including some non-software API stuff for metaphor.
|
Domenic: Oh, OK. I'll think about revisiting everything. But for now, I'm optimistic about sync read(). Like

Current ReadableStream is relieving users from doing this manually by having an automatic token (without size arg) feeder named "strategy". Current ReadableByteStream is relieving users from allocating ArrayBuffers manually by allocating them automatically, but with size=underlyingByteSource.readBufferSize. I'd view them as easy-to-use variants of the base interface, ManualBufferFeed.*.

It looks like the pros/cons of the promise-returning version you've sketched in #253 (comment) are:
|
This does make me feel better. If we can make everything fit into one conceptual model, maybe we can even add async read/readInto on top as new methods (named e.g. |
Do we support a platform which has a BYOB async read but doesn't have a sync cancel-read? If we do, we cannot get back the buffer from the platform until the read ends. |
Maybe we say that we can get buffers back not immediately but at some point in the future. |
Hmm, I think so. At least I am not sure how to cancel an ongoing read(2) call. But I do not know my Unix APIs all that well :-/.
Yeah, as @tyoshino says, I think it is OK for the promises to take a while to fulfill with |
Question based on some stuff @yutakahirano and I have been discussing. And probably related to the last couple posts, now that I re-read them. Given this scenario:

```js
reader.read(view1).then(...);
reader.read(view2).then(...);
reader.releaseLock();
```

what should happen? We think Given that
I am leaning toward the latter now that I write them out. But I wanted to record them. |
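The option eventually adopted (per the commit messages later in this thread: releasing a reader with read()s pending throws) can be sketched as a guard in a mock reader. This is an illustrative mock, not the spec's machinery:

```js
// Sketch: a reader that refuses to release its lock while read()s are
// pending, rather than erroring the outstanding read promises.
class MockByteReader {
  constructor() {
    this._pending = 0;
    this._released = false;
  }
  read(view) {
    if (this._released) return Promise.reject(new TypeError("released"));
    this._pending++;
    // Simulate the underlying async fill completing on a later turn.
    return new Promise(resolve => setTimeout(resolve, 0)).then(() => {
      this._pending--;
      return { value: view, done: false };
    });
  }
  releaseLock() {
    if (this._pending > 0) {
      throw new TypeError("Cannot release a lock while read()s are pending");
    }
    this._released = true;
  }
}
```

The alternative (rejecting the pending reads and releasing immediately) would avoid the throw but makes it harder for byte streams to hand the in-flight buffers back.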
Based on discussions in #253. The key differences here from the previous async read() commits are:

- ReadableStreams no longer have read() methods directly; those only exist on readers. This drastically simplifies the stream/reader interaction, and also allows the possibility of different types of readers which have different reading behavior.
- read() promises fulfill with { value, done } instead of using an EOS sentinel value. This avoids a number of problems, and also provides a mechanism by which readable byte streams can smuggle out "unused" buffers given to them (using { value: zeroLengthViewOntoBuffer, done: true }).
- The state property is removed (from readable stream).

Another new semantic worth mentioning is that you cannot release a reader if the reader has read()s pending; doing so will throw. This slightly complicates the pipe algorithm in the { preventCancel: true } case.

This commit also adds some new infrastructure for _templated tests_, and ports some portion of the existing tests there. This is our solution for #217 and #264.

Finally, we re-merge all related code into a single readable-stream.js file, as the setup with the three separate files (readable-stream.js, exclusive-stream-reader.js, and readable-stream-abstract-ops.js) was problematic in causing circular dependencies.
Should we allow a readable stream to be cancelled while locked to a reader? I would lean toward "no," except: consider the case where you have set up a pipe chain (which locks the stream). How do you cancel that pipe chain upon e.g. the user navigating to a different page? (You could abort the ultimate destination stream, but we have the same question there: are you allowed to abort a writable stream that is locked to a probably-will-exist-soon exclusive writer?) |
This replaces the dual ready + read() approach previously, which was derived from the epoll(7) + read(2) paradigm. In #253 we did a deep dive on real-world implementation strategies for byte streams, and discovered that the ready + read() model causes a conflict with the semantics we want for byte streams. Briefly, because some byte streams will demand to know the size of the buffer they must fill before doing any I/O (the fread(3) model), the byte stream method that reads into a given typed array must be asynchronous. If such byte streams are then to conform to the readable stream interface, with a no-argument read() method that is an auto-allocating version of the read-into method, then read() must also be async, across all readable streams.

This is a usability upgrade for consumers, in many cases. However, for non-byte streams, it potentially costs more microtasks when multiple chunks of data would be available synchronously. We are hopeful that even if current JS engine microtask queues are not as fast as they could be, they will improve over time until this overhead is negligible (which in theory it should be).

Alongside this change, we moved the read() method from the readable stream to the reader (now called "readable stream reader" instead of "exclusive stream reader"). This drastically simplifies the stream/reader interaction, and also allows the possibility of different types of readers which have different reading behavior---again, very helpful for byte streams. The usability impact is also positive, as it guides consumers toward using piping instead of directly reading chunks from the stream.

Note that read() promises fulfill with { value, done } instead of using an EOS sentinel value. This avoids a number of problems (see extensive discussion in #253), and also provides a mechanism by which readable byte streams can smuggle out "unused" buffers given to them, by using { value: zeroLengthViewOntoBuffer, done: true }.

Finally, the state property is now removed (from readable stream), since there is no longer a "waiting" vs. "readable" distinction.

This commit also adds some new infrastructure for _templated tests_, and ports some portion of the existing tests there. This is our solution for #217 and #264. Note that we re-merged all related code into a single readable-stream.js file, as the setup with the three separate files (readable-stream.js, exclusive-stream-reader.js, and readable-stream-abstract-ops.js) was problematic in causing circular dependencies.
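The buffer-smuggling trick mentioned in this commit message can be sketched in isolation (the transfer/detach machinery a real byte stream would use is elided; this just shows the { value, done } shape at EOF):

```js
// Sketch: at EOF a byte reader fulfills with a zero-length view onto the
// caller's buffer, so the caller regains access to the memory it handed in
// even though no bytes were read.
function readAtEof(view) {
  // A real implementation would have transferred view.buffer and be handing
  // it back; here we simply build a zero-length view over the same backing.
  const empty = new Uint8Array(view.buffer, view.byteOffset, 0);
  return Promise.resolve({ value: empty, done: true });
}
```

Without this, the final read(view) would have no way to return the (possibly transferred) buffer to the caller, and that memory would be lost.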
- Closes #253!
- Closes #266 by simplifying reader-related stuff, removing the problematic `ready` promise, and ensuring that cancel() always returns a new promise instead of reusing [[closedPromise]].
- Closes #291 since the relevant tests have been re-written and the related infrastructure around pull simplified.
- Closes #264 by introducing templated tests.
What's your opinion about making pipeTo() itself cancellable, Domenic? |
Discussed at #297 (comment) |
Now that this Issue 253 is closed, when will "Streams API Binary Extension" be folded into the main Streams spec? Resolution of W3C MSE bug 27239 depends on this happening: /paulc |
@paulbrucecotton the meta-bug for ReadableByteStream is #300, in case it helps. @tyoshino and I were just talking the other day about when we could move it into the spec. I want to defer to him for first opinion. |
@domenic @paulbrucecotton I'd like to work on that asap but have been busy for other issues about the main non-binary-specific one. I'll try to start working on it this week and get something out on streams.spec.whatwg.org next week. |
Can you confirm that ReadableByteStream at https://streams.spec.whatwg.org/#rbs is what the MSE spec should reference for W3C MSE bug 27239: /paulc |
@paulbrucecotton sorry for the delay in answering. Yes, indeed that is the correct reference. |
Had a thought-provoking conversation with @piscisaureus today. He works on libuv and was giving me their perspective on how they do I/O and how it interfaces with Node.js's streams.
He pointed out that resources like files and pipes shared with other processes (seekable resources, he called them) don't have the epoll + read interface that ReadableByteStream's current design is based on. In Node.js, they do blocking I/O in a thread pool with pre-allocated buffers.
This works for a `ready` + `read()` JS interface, since then the implementation can pre-allocate (say) a 1 MiB buffer, read into it off-thread using blocking I/O, then fulfill the `ready` promise. When JS calls `read()`, it just returns the pre-allocated buffer.

It doesn't work as well with ReadableByteStream's `ready` + `readInto(ab, offset, sizeDesired)` interface, since we don't know what size buffer to allocate until too late. If we pre-allocate something too small, `readInto` will always be returning a number below the desired `size`, which is a bit user-hostile. If we pre-allocate too large, we are at the least wasting memory, and at the worst getting ourselves into a bunch of trouble as we need to figure out how to merge the "leftovers" with other chunks we read in the future.

(I am assuming here it is easy in V8 to map the [offset, offset + max(sizeRead, sizeDesired)] portion of `ab` onto a given C++ backing buffer.)

A `readIntoAsync(ab, offset, desiredSize)` model would work a bit better, since then we'd know to pre-allocate a buffer of size `desiredSize`. But, I was curious if you guys had any other thoughts? Did you think of this issue when designing the API, @tyoshino? I imagine @willchan could be helpful here too.