The default bug view has changed. See this FAQ.
Bug 1128959 (streams)

Implement the WHATWG Streams spec

NEW
Unassigned

Status

()

Core
DOM
2 years ago
2 days ago

People

(Reporter: Michael[tm] Smith, Unassigned)

Tracking

(Depends on: 2 bugs, Blocks: 3 bugs, {dev-doc-needed})

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(5 obsolete attachments)

(Reporter)

Description

2 years ago
https://streams.spec.whatwg.org/
Status: UNCONFIRMED → NEW
Ever confirmed: true

Comment 1

2 years ago
Please also study this while looking at this specification:

  https://github.com/yutakahirano/fetch-with-streams/

Google is pushing this so if we want something fundamentally different now would be the time to say something.
Blocks: 813450
Status: NEW → UNCONFIRMED
Ever confirmed: false

Updated

2 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true
(Reporter)

Updated

2 years ago
Component: DOM: Core & HTML → DOM
Does this allow receiving a ReadableStream, from for example a filesystem API or a network API, and then transferring that ReadableStream to a Worker thread using postMessage({ readThis: myStream }, [myStream]);

Likewise, can a page instantiate a ReadableStream (through a ReadableStream constructor or through some other means) that can then be transferred to a Worker thread?

This doesn't mean that all ReadableStreams have to be transferrable. For example a stream of plain JS objects can obviously not be transferred. So maybe we need a notion of transferrable and non-transferrable streams.


I also don't understand the purpose of having the "start"/"pull"/"cancel" interface which is used for the ReadableStream constructor. It seems like we're ending up with two writing interfaces, "start"/"pull"/"cancel" and WritableStream. Why do we need two? It seems silly if the only use-case is to enable having a standalone ReadableStream constructor. The pattern that's used for MessageChannel seems much better.


These are not new questions, I've asked these questions before so I assume that there are thought out answers.


I'd also like to understand if the intent of this API is intended to be a binary stream, or a generic object stream? I.e. is it appropriate to use for something like filesystem directory enumeration?

Finally, I'd really like to get some sort of feedback from TC39 on this. Especially if this API is intended to be used for things like directory enumeration since that likely means that we will eventually want ES syntax which uses this (like how "async function" will use Promise). Is TC39 prepared to build that on top of this Stream proposal, or are we going to end up with two incompatible solutions?
I'm sure Domenic would know best, but here's my read on this particular question.

(In reply to Jonas Sicking (:sicking) from comment #2)
> I also don't understand the purpose of having the "start"/"pull"/"cancel"
> interface which is used for the ReadableStream constructor. It seems like
> we're ending up with two writing interfaces, "start"/"pull"/"cancel" and
> WritableStream. Why do we need two? It seems silly if the only use-case is
> to enable having a standalone ReadableStream constructor. The pattern that's
> used for MessageChannel seems much better.

How would you implement a ReadableStream with a pull oriented source using WritableStream?  The "pull"/"cancel" feature seems reasonable and unique for that purpose.

The "start" feature is a bit redundant since the spec could also provide some kind of pipe class that provides both readable and writable.  Then a push source would just call write() on that class.  This is a but clumsy for extension, though, because now you are forcing objects to provide a WritableStream interface just to take data from a push source.  This means you can't easily build a duplex object where writing sends data out and reading pulls data in.

One could also envisage a hybrid source that pushes a certain amount of pre-buffered data at "start", and then waits for "pull" to retrieve the rest.  This would also not be easily achievable using solely WriteableStream.

Overall I like the ReadableStream's constructor "start".
Transferring does not seem addressed in the spec.  I opened an issue to ask the question:

  https://github.com/whatwg/streams/issues/276

Comment 5

2 years ago
> Transferring does not seem addressed in the spec.  I opened an issue to ask the question:

I'll respond to this over there.

> I also don't understand the purpose of having the "start"/"pull"/"cancel" interface which is used for the ReadableStream constructor.

I did some analysis on that a while back (along with some other issues): https://gist.github.com/domenic/197ae6fab8adc6890c7a#part-3-streams-as-two-sided-pipes

Basically, trying to move to this model makes it extremely awkward to author streams, as Ben hinted at. Please take that document with a grain of salt as it is old (so a few signatures are outdated) and cumulative (so section 3, which I linked to, assumes some of the stuff from sections 1 and 2, which also did not make it into the spec). But hopefully it explains why this path did not bear fruit. I can write out full examples of the awkwardness if you'd like, perhaps adapting all of these https://streams.spec.whatwg.org/#creating-examples into the shared-queue style.


> I'd also like to understand if the intent of this API is intended to be a binary stream, or a generic object stream? I.e. is it appropriate to use for something like filesystem directory enumeration?

It is intended to be an I/O stream, but in a broad sense. I/O often starts and ends with bytes, but in between you often have strings, JSON, video frames, protobufs, and other data structures. The interface is focused around mapping efficiently to low-level I/O primitives, but the chunks that pass through streams can be of any type.

I do not think this is the best primitive for filesystem directory enumeration, although it could do in a pinch. Let me explain. In general, given that we have values for sync/singular, iterators for sync/plural, and promises for async/singular, we'd like some shared abstraction for async/plural. Right now async iterator [1] or observable [2] are the leading candidates in TC39. A (readable) stream would be a specific subtype of the more general type, optimized for I/O. This is similar to how an array is a specific subtype of the more general iterable type, optimized for O(1) indexed access and memory locality. The parallels even extend to the development timeline, with array/stream being fleshed out first and then later retrofitted to be a subtype of iterable/async iterable once those were/are developed.

I think filesystem directory enumeration could use a readable stream, but I'm not sure the extra abilities it comes with (e.g. backpressure control, exclusive reader, piping through transform streams and to writable streams) would make very much sense for a directory listing. Maybe they would, especially backpressure---I'd have to learn more about the underlying OS APIs to understand that. But it does feel a bit like killing a fly with a bazooka.

[1]: https://github.com/zenparsing/async-iteration/
[2]: https://github.com/jhusain/asyncgenerator#introducing-observable

> Finally, I'd really like to get some sort of feedback from TC39 on this. Especially if this API is intended to be used for things like directory enumeration since that likely means that we will eventually want ES syntax which uses this (like how "async function" will use Promise). Is TC39 prepared to build that on top of this Stream proposal, or are we going to end up with two incompatible solutions?

This was brought up at TC39 and there was general agreement both on the fact that it should be a subtype of a more abstract solution (as above).
(In reply to Domenic Denicola from comment #5)
> > Transferring does not seem addressed in the spec.  I opened an issue to ask the question:
> 
> I'll respond to this over there.

FWIW, given that this targets IO streams, this seems like an extra important issue to solve since many times it's beneficial to do IO and parsing on background threads.

> > I also don't understand the purpose of having the "start"/"pull"/"cancel" interface which is used for the ReadableStream constructor.
> 
> I did some analysis on that a while back (along with some other issues):
> https://gist.github.com/domenic/197ae6fab8adc6890c7a#part-3-streams-as-two-
> sided-pipes

This argument seems to boil down to "JS doesn't have any syntax for instantiating two objects which has co-dependency". I don't feel competent enough in JS to have an opinion about this. I'd love to hear Dave Herman or Yehuda's opinion.


The way stuff like this is handled in C++ is through interfaces rather than classes. I.e. ReadableStream and WritableStream would be interfaces that doesn't have direct constructors, but instead just acts like contracts. So no one (not even some omnious platform backend) would be able to instantiate ReadableStream/WritableStream, instead you'd instantiate other objects which happen to expose ReadableStream/WritableStream interfaces.

But JS doesn't really have interfaces. The equivalent is duck types. So there's no "thenable" or "iterable" interfaces, instead anything with a .then() function is a thenable, and everything with a @@iterator() function is an iterable.

By that (quite possibly very flawed) logic, ReadableStream and WritableStream could fairly easily be created by having a Pipe constructor like:

class Pipe {
  constructor()

  get readable()
  get writable()
}

where the objects returned by .readable and .writable contain functions which both close over the same closure and thus can share access to underlying buffers. So there would be no ReadableStream class and thus no need for a ReadableStream constructor. So no cyclic dependency between the ReadableStream constructor and WriteableStream constructor.

So this is definitely doable without needing to rely on magic platform objects. The whole setup can be explained and implemented in JS.

But like I said, I defer to people that knows JS better than me (such as yourself) to if this is has other problems than "not impossible" :)

> > I'd also like to understand if the intent of this API is intended to be a binary stream, or a generic object stream? I.e. is it appropriate to use for something like filesystem directory enumeration?
> 
> It is intended to be an I/O stream, but in a broad sense. I/O often starts
> and ends with bytes, but in between you often have strings, JSON, video
> frames, protobufs, and other data structures. The interface is focused
> around mapping efficiently to low-level I/O primitives, but the chunks that
> pass through streams can be of any type.

Ok, that makes sense.
Jonas, how precisely do you think we need the transferability spec'd out before we proceed?  I think we have a rough idea of what to do. [1]  However, it sounds like Domenic would like to wait until its spec'd for promises first unless we really feel this is necessary for streams immediately.

[1]: https://github.com/whatwg/streams/issues/276
Flags: needinfo?(jonas)

Comment 8

2 years ago
To be clear I don't really insist on any particular ordering, I just thought it'd be better to tackle the simple case (promises) first before trying streams. But we didn't get very far with that plan so if there's motivation to "skip ahead" to streams then that seems fine.
I thought that some of the most important use cases we had for Streams was around inter-thread-communication. So it seems important that that can be done without making too many sacrifices.
Flags: needinfo?(jonas)
OS: Mac OS X → All
Hardware: x86 → All
I believe we have consensus on moving forward.  I've sent an intent to implement to dev-platform and blogged a bit about it here:

  https://blog.wanderview.com/blog/2015/06/19/intent-to-implement-streams-in-firefox/
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Keywords: dev-doc-needed
Duplicate of this bug: 891286
Alias: streams
Depends on: 1230148
We will need Promises implemented in JavaScript engine directly -- see bug 911216.
Depends on: 911216
(In reply to Yury Delendik (:yury) from comment #12)
> We will need Promises implemented in JavaScript engine directly -- see bug
> 911216.

I spoke with Boris about this previously and its more of a nice to have.  If js promises are not done we can have an intrinsic that creates one of our DOM promises.
Created attachment 8706963 [details] [diff] [review]
[WIP] ReadableStream self-hosted implementation

WIP for some of ReadableStream, ReadableStreamReader, ReadableStreamController
TODOs use JS Promises instead of DOM/Workers hacks, tee and pipe methods
I have the WIP from comment 14 just about re-based.  Just trying to figure out one last issue with integrating with the SPIDERMONKEY_PROMISE bits.
Created attachment 8749929 [details] [diff] [review]
Rebased wip ReadableStream patch

This rebases the previous patch to currently mozilla-central with --enable-sm-promise switched on.

There is currently a problem with the spidermonkey promises, though.  It seems if self hosted code does this:

  _GetOriginalPromiseConstructor().resolve()

Then we end up hitting this assertion in Debug_CheckSelfHosted():

  https://dxr.mozilla.org/mozilla-central/source/js/src/vm/Interpreter.cpp#186

The Promise code does some complicated things to support chrome wrappers.  It seems like the Promise_static_resolve() or the promiseCapability.resolve() functions are not properly marked as self hosted.

Till, do you have any ideas here?  For now I have commented out the assertions, but I'd like to make the spidermonkey promises usable from self hosted code.
Attachment #8706963 - Attachment is obsolete: true
Flags: needinfo?(till)
Note, if you want to see this yourself you can:

0) fix a couple bit rot issues with SPIDERMONKEY_PROMISES build
1) apply this patch
2) uncomment the assertions in Interpreter.cpp and GlobalObject.h
3) run ./mach web-platform-tests streams/readable-streams/general.https.html
I think Boris solved this for me.  I didn't know that I had to use callFunction() in self hosted code.  Sorry for the noise.
Flags: needinfo?(till)
Duplicate of this bug: 1271950
Created attachment 8751999 [details] [diff] [review]
Rebased wip ReadableStream.patch

Updated patch that uses callFunction().

I still have an issue with the bad-underlying-source tests hitting this assert, though:

https://dxr.mozilla.org/mozilla-central/source/js/src/vm/GlobalObject.h#678
Attachment #8749929 - Attachment is obsolete: true
Created attachment 8752001 [details] [diff] [review]
Rebased wip ReadableStream.patch

A little more cleanup that I missed in the last patch.
Attachment #8751999 - Attachment is obsolete: true
Created attachment 8752223 [details] [diff] [review]
Rebased wip ReadableStream.patch

Fix the last issue I mentioned.  Turns out a "use strict" violation that tries to create a new global variable in self hosting triggers an assertion in setIntrinsic().  That was not intuitive to me.
Attachment #8752001 - Attachment is obsolete: true
Depends on: 1272697
Comment on attachment 8752223 [details] [diff] [review]
Rebased wip ReadableStream.patch

I moved this patch over to bug 1272697.  I'll land this as P1 in that bug with further refinements in additional patches.
Attachment #8752223 - Attachment is obsolete: true
Its been a while since I commented here, but work is progressing.  Bringing the self-hosted js up-to-spec is taking a while because streams were majorly refactored.  At the end this will have support for both byte stream optimizations and tee'ing.

I'm maintaining my patch queue here:

https://github.com/wanderview/gecko-patches/tree/dev-streams

Its not functional yet, but I'm almost done translating the js side of things.  I then need to add the c++ intrinsics necessary and get the tests working again.
Blocks: 1283191
Duplicate of this bug: 1294957
Unfortunately I have to focus on a large service worker effort at the moment.  I need to drop this.  I believe we have someone lined up to take it on, though.
Assignee: bkelly → nobody
Status: ASSIGNED → NEW

Comment 27

6 months ago
Anything moved on?

Updated

5 months ago
Blocks: 802882

Updated

4 months ago
Blocks: 1191981
Depends on: 1333800
Depends on: 1335397

Comment 28

2 days ago
till, is there a bug to watch specifically about writable streams progress? I know you had a prototype written up in an afternoon some time recently.

FYI, the spec is quite stable these days. We have been tweaking edge case behaviors over the last month (e.g. what happens if you abort() then immediately close()) and have arrived at a place we are happy with and hoping to ship in Chrome.
You need to log in before you can comment on or make changes to this bug.