-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ReadableByteStream: should support an internal queue #353
Comments
The boundary between the underlying byte source and the readable byte stream doesn't need to correspond to any specific layer boundary (e.g. kernel-user). It just depends on the philosophy the author of the spec gave to the spec. When we want to drain some data to one world (storage, kernel, process, C++) from another (another storage, userland, another process, JS heap), they're always allowed to do that in various means.
Right. Having some strategy-controlled buffer example in the spec is good. But the spec shouldn't impress the readers that they must implement the buffer in JavaScript. My suggestion is having two separate classes. One is just describing all the publicly visible requirements (including invariants, interaction between methods, properties, etc.), and the other is a strategy-controlled buffer which is convenient for push source implementor. The buffer class is good example for explaining how to interpret backpressure and controlling buffer size, data generation in general. That can be referred to when writing non-JavaScript buffering components. And of course, it's also useful as a built-in JS library. This separation doesn't make much difference in ReadableStream, but I really want this separation for ReadableByteStream. If the source is push source, then it's convenient for the implementor of the source if the stream has a built-in buffer. In that case, the source doesn't understand BYOB style pulling, so we need to take care of the gap by copying the contents to the view provided by the consumer. But this is not expected for a source that understands BYOB. For BYOB read requests, the source just wants to get the view and fulfill the corresponding pending request with the view (possibly detached) filled with generated contents as-is. There shouldn't any automatic copy. Having a queue even for BYOB-capable source may be useful. But the queue system for BYOB-capable source and one for push source would be very different. I try to finish prototyping the ideas explained above. WIP branch is at https://github.com/whatwg/streams/blob/bytestream/index.bs. |
I've made much edit on the last post. Sorry but please take a look at the latest one. |
I don't think this is the way to go. We should not have a class X whose internals are specified, based on interaction with a strategy, and class Y, whose internals are not specified but whose invariants are. Instead we should just have one class, X, which is flexible enough that any Y can be expressed by chosing a specific strategy. And, if you can come up with such a strategy, then of course you're free as an implementer to implement it without using strategies at all, as long as you behave exactly the same. But it's still just one class and one spec. Maybe it would help if we added, somewhere prominent, a note like: Note for implementers: this spec describes stream classes controlled by underlying source or underlying sink JavaScript objects. This allows developers the flexibility to create streams with a wide variety of behaviors, while ensuring that their internal and external invariants are all maintained. Platform-created streams of these classes should behave as if they were also created from some conceivable JavaScript underlying source or sink object, in order to maintain those invariants. However, since such underlying source and sink objects are not externally visible once a stream has been constructed, an implementation is free to use any mechanism it wants to implement the resulting behavior; they don't literally need to create such objects.
I don't think this is necessarily true. For example if you were writing a source that was directly using However, I think it might be OK if we say that this is rare enough that we force authors (and implementers) to handle it on their own. That is, if we assume nobody will ever expose But that means there will still be uneven flow, even for
Having typed all the above, I then went to skim your changes and I see that you actually implemented the queue! And allowing both types of readers to be used, with interop on both sides! Wow!! This is very convenient, if you are OK with it. Now I am a little confused :) I will need to dig in to your branch more later to understand everything. (Now I know what you must feel like reviewing my big patch sets...) |
Yes. I understand it helps. But I'm feeling that the all-in-one class approach would make things complicated.
It's fine to have a queue for convenience. But
I was thinking as you described here. Right. I was trying to build an all-in-one class in that branch. Sorry for confusing. I was trying but seeing difficulties in designing the class without much complexity. Now it's almost done.
|
What is your general feeling? How much more complex is it, and would it be worth it? I am not sure anymore. Maybe the extra layer could be added future-compatibly?
Hmm, why? Maybe I don't fully understand what "partially filled" means? Doing some review of https://streams.spec.whatwg.org/branch-snapshots/bytestream/ (on the assumption we would want to go with this, even if that is not guaranteed pending more discussion above).
Maybe we should not allow ReadableByteStreams that don't support BYOB? It would help uncomplicate things, and would prevent the situation of giving a "false contract" where you support getByobReader() but actually it is just an inefficient shim over the normal reader.
Filed https://bugs.ecmascript.org/show_bug.cgi?id=4369 for you.
Probably TypeError; there isn't a developer-supplied number that is out of range here.
IMO it is OK to throw away the chunk if the stream has errored. It is better than rejecting with a non-Error. So I would just reject with stream@[[storedError]].
This would be a bit clearer if 2.c was nested under 2.b IMO.
I think it needs to do transfers before giving back chunk
Why? Doesn't this lead to situations where you call .read(view) once and then all of a sudden your reader is no longer active? I am probably missing something. |
Please suppose that the BYOB interface of a stream is given a Uint16Array with 1 element (i.e. backed with 2 byte long ArrayBuffer), where the stream is EOF-ed with only 1 byte of data. We planned to return an instance of the same ArrayBufferView variant, so, we want to return a Uint16Array. The stream has only 1 byte to return to the user, but we cannot create a Uint16Array and make it indicate that only 1 byte valid data has been saved into it (1-element Uint16Array is 2 byte long). |
I guess you meant s/ReadableByteStream/building ReadableByteStream with an underlying source that doesn't support BYOB/. Maybe you're right. I'll try it out. |
OK. Will fix. |
Oh, sorry. I didn't come up with what to do with this, and so I left it incomplete (passing error and chunk to CreateIterResultObject) ... and forgot to revisit. OK. Will fix. |
I see, that makes sense. So to be clear:
I cannot really tell whether or not this should be an error (or if the consumer should just deal with the fact they didn't get enough bytes) so am happy to go with your judgement. |
If we could return the number of bytes filled together with the ArrayBufferView, we can avoid erroring it. But, maybe, we should rather ask those who want to fill a Uint16Array from unsigned int16 data sharded into multiple streams to just change the view type to Uint8Array, and change back to Uint16Array once done. |
Right. It's a bug. Will fix.
Yeah, but should we automatically transfer it in library code than asking underlying source implementors to do it by themselves if necessary?
Sorry I cannot get what situation you're concerned with. Both in ReleaseReadableByteStreamReader and RespondToReadableByteStreamByobReaderReadIntoRequest, this step is invoked only when the stream is in "closed" or "errored" state. read(view) and read() issued after after detaching will be taken care of by the reader object. |
Pushed WIP reference implementation. |
I think it has to be part of the implementation, as otherwise it is possible to create observable data races.
Oh sorry, I missed that. All good.
Yay!! |
I remembered why I chose to have sourceSupportsByob. I thought that it's good if we could make it able to wrap a source that doesn't understand If we try to realize this without checking whether the source understands
To prepare for this, the stream needs to store the ArrayBufferViews inside itself and fill them on controller.enqueue() call. I wanted to avoid this queue which is unnecessary for sources that understand If we can require the source to understand |
Ah yes, that is a reasonable thing to do. In balance though, I think it is better to require the source to understand read(view) and use controller.respond(). You can indeed imagine cases where people want ReadableByteStreams that have a "fake BYOB" interface, that necessitates copying. But I think it is OK for that to be hard to implement, instead of easy. That is, the underlying source code would end up pretty complicated and ugly---it would need its own internal queue and it would copy from that queue into the supplied view inside read(view). You can still do it but it's not easy. In other words, I think if someone wants to create a ReadableByteStream with fake BYOB we should make them do that by writing a complicated underlying source, instead of take on that burden ourselves by complicating the ReadableByteStream machinery. |
Wait, upon re-reading I realize my argument may have been overzealous. controller.enqueue() is necessary---at least, that is the premise of this thread. But the intended usage is that you implement read(view) and then also use controller.enqueue() when your kernel buffer is overflowing (or similar). So I meant to argue against sourceSupportsByob and not against controller.enqueue(). (That said, the result of #295 might be that supporting both models is too hard and we have to remove controller.enqueue(). But in this thread we assume controller.enqueue() must stay and are discussing details like sourceSupportsByob.) |
Thanks for the suggestion. But it seems we might be able to support both. Please take a look at the quick progress summary at #295 (comment) Now, we allow an underlying byte source to be implemented with only enqueue() calls. Just respond() must not be called when there's no pending pullInto. |
All the ideas discussed in this issue have been incorporated into the PR #418. |
The current draft does not include such a queue; there is a one-to-one map of calls to
rbsReader.read()
andrbsUnderlyingSource.pull()
. I think this is potentially problematic, for two reasons:rbsUnderlyingSource.pull()
will not be a simple call toread(2)
or similar. Instead, the underlying source author will need to do their own buffering. That seems bad.This does make things more complicated though for the BYOB case, as we have to figure out what happens if the internal queue has things in it and then we switch to a BYOB reader. This was discussed at #177 (comment)
@tyoshino, have your thoughts changed since that discussion? Or is the current draft lacking a queue just as a first-draft thing?
The text was updated successfully, but these errors were encountered: