Closed Bug 1302036 Opened 8 years ago Closed 8 years ago

StructuredClone should not require SAB in transfer map

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox51 --- fixed
firefox52 --- fixed

People

(Reporter: lth, Assigned: lth)

References

Details

(Keywords: dev-doc-complete)

Attachments

(5 files)

There was an agreement back in July, chiefly among people who have an opinion on DOM matters (Anne, Domenic Denicola, Dan Ehrenberg), that postMessage should *not* require a SharedArrayBuffer to be present in the transfer list (indeed, we should disallow it from the transfer list), because the object is not really being transfered.  ISTR an analogy was made with blobs.

The decision was codified in the DOM/HTML companion spec for shared memory:
https://tc39.github.io/ecmascript_sharedmem/dom_shmem.html

I will implement this change in two steps.  The first step (this bug) is to allow code not to add the SAB to the transfer list.  Landing this will allow user code to start migrating.  The second step (bug TBD) is to prohibit code from adding the SAB to the transfer list.
Blocks: 1302037
For this patch, if a SAB is present in the transfer map we should probably print a warning in the console.
There's a snag here, from the point of view of Firefox, in that Firefox assumes that objects that aren't transfered can be persisted (they are after all being copied anyhow), but this is not true for SharedArrayBuffer where we'd only be copying the pointer to the RawBuffer.  Not sure yet if this is a semantic thing in the DOM or if it's just an implementation thing in Firefox.
It seems possible that what we want is a principled approach, where the code that requests serialization explicitly requests serialization of SharedArrayBuffer values if such serialization is desirable, something that is only possible if the value is not going to be persisted.  This could amount to one of several things, or a combination:

- a (boolean) flag to the write method indicating whether SharedArrayBuffer is supported, defaulting
  to false
- a requirement to support explicit callbacks to handle SharedArrayBuffer, probably with support
  provided by the structured clone machinery to avoid reinventing the wheel every time this is needed
- a flag on the JSAutoStructuredCloneBuffer, amounting to much the same thing

For parity with what we have now I think we need to support this on straight postMessage() but perhaps also on postMessage() on channels, if that is separate code.
Introduce an API that allows us to specify whether non-persistible non-transferable data should be serialized.  Right now the only example is SharedArrayBuffer.  There's been some speculation on #jslang that eg Promise may be another example, eventually.

There must be several ways to slice this but it seems to me that this is a concern that is orthogonal to the StructuredCloneScope, hence a new argument.  There are quite modest ripples up through the DOM to support this (see subsequent patch).
Attachment #8790714 - Flags: review?(sphink)
Adapt to JS engine API changes to handle cloning of non-transferable non-persistable data properly (currently only SharedArrayBuffer).

As I wrote in the note on the other patch, this seems orthogonal to StructuredCloneScope, so there's a new argument for specifying this.  Most places this argument defaults to "not".  To the best of my knowledge this patch does not change any behavior, since previously all transferable data was non-persistable and SharedArrayBuffer was transferable.

As to whether this should be an argument to Write, as I have it here, or an argument in the construction of the StructuredCloneHolder... I opted for the former since it seemed to place the decision at the right point.
Attachment #8790720 - Flags: review?(amarchesini)
I should mention test cases: I have a bunch of SharedArrayBuffer tests that are not part of our tree (they're in the TC39 repo at the moment).  These pass as well as they did before this change.
Lars, would it be possible to merge this with StructuredCloneScope? If not, I'd want a comment explaining why, given that they're so similar.
Flags: needinfo?(lhansen)
(In reply to Steve Fink [:sfink] [:s:] from comment #9)

> Lars, would it be possible to merge this with StructuredCloneScope? If not,
> I'd want a comment explaining why, given that they're so similar.

I didn't see that as a winner, because I see the concerns as orthogonal, as follows.

The "correct" serialization for SharedArrayBuffer when it is meaningful is as a pointer to the SharedArrayRawBuffer, this is the only thing that's useful to the receiver for postMessage, say.  But this pointer value should never be revealed when persisting the data.  You might argue that when IndexDB or XHR wants to "persist" a SharedArrayBuffer they might want to save a copy of the data, but this is a different kind of operation (and one we did not support before, when SABs had to be transfered).

Mixing all this complexity up with the scope did not seem like it would do any good to anyone, as scope does not have any obvious bearing on what data should be serialized or how it should be serialized.

There's a thread starting at https://github.com/whatwg/html/issues/935#issuecomment-246785244 that discusses the complexities that pertain to this, I think we're glossing over some of that now.

So, if we were to merge this with the scope then at a minimum the number of scope types would have to be expanded to the point where they also encode this policy bit, and at that point the policy bit might as well be separate.
Flags: needinfo?(lhansen)
(In reply to Lars T Hansen [:lth] from comment #10)
> (In reply to Steve Fink [:sfink] [:s:] from comment #9)
> 
> > Lars, would it be possible to merge this with StructuredCloneScope? If not,
> > I'd want a comment explaining why, given that they're so similar.
> 
> I didn't see that as a winner, because I see the concerns as orthogonal, as
> follows.

I somehow missed where you had already discussed that *twice* in the preceding bug comments. Sorry!

> The "correct" serialization for SharedArrayBuffer when it is meaningful is
> as a pointer to the SharedArrayRawBuffer, this is the only thing that's
> useful to the receiver for postMessage, say.  But this pointer value should
> never be revealed when persisting the data.  You might argue that when
> IndexDB or XHR wants to "persist" a SharedArrayBuffer they might want to
> save a copy of the data, but this is a different kind of operation (and one
> we did not support before, when SABs had to be transfered).
> 
> Mixing all this complexity up with the scope did not seem like it would do
> any good to anyone, as scope does not have any obvious bearing on what data
> should be serialized or how it should be serialized.

But this is the place I get lost. The scope *does* change how data gets serialized already. If you Transfer an ArrayBuffer with scope=SameProcessSameThread or scope=SameProcessDifferentThread, you will get a pointer to the data. If instead scope=DifferentProcess, you will get a copy of the data. In both cases, the original ArrayBuffer will be detached. It's true that there is no observable difference in this case, whereas with the SAB there is. But I believe we already 

Btw, I'm all for the general approach of this patch; I'd rather have it be explicit than rely on how you use these APIs (eg by setting callbacks.) But I'd go further, and say unless there's a good reason not to, I'd prefer information like this to be part of the actual structure or data of the structured clone buffer. I've been tempted to have two different classes for persistent vs nonpersistent buffers, since they are entirely different beasts. (Though for me, nonpersistent just meant "having Transferable goop".) Maybe that's akin to the Entangle vs Serialize/Deserialize distinction?

Anyway, I'm still thinking this through. I feel like we have a collection of similar-but-different things: scope of validity, same vs different thread/process, whether persistent across process instances, containing transferables, SAB allowed or not, pointers to an origin address space whose targets are immutable or not, etc.

I also need to look closer at existing usages.
Comment on attachment 8790720 [details] [diff] [review]
bug1302036-dom.patch

Review of attachment 8790720 [details] [diff] [review]:
-----------------------------------------------------------------

If you disagree with my comments, let me know before landing this patch. Thanks!

::: dom/messagechannel/MessagePort.cpp
@@ +443,5 @@
>        ProfileTimelineMessagePortOperationType::SerializeData,
>        MarkerTracingType::START);
>    }
>  
> +  data->Write(aCx, aMessage, transferable, JS::CloneDataClass::Any, aRv);

Persistable

MessagePort must work across processes.

::: dom/xhr/XMLHttpRequestWorker.cpp
@@ +1172,5 @@
>          }
>  
>          if (doClone) {
>            ErrorResult rv;
> +          Write(cx, response, transferable, JS::CloneDataClass::Persistable,

This can be Any. XHRs are in-process.
Attachment #8790720 - Flags: review?(amarchesini) → review+
I looked at all the uses, and I'd like to better understand the discrepancies. With baku's updates, we have:

Any+SameThread: nsGlobalWindow.cpp -> PostMessageEvent
Persistable+DifferentProcess: StructuredCloneData
Persistable+DifferentProcess: MessagePort -> SharedMessagePortMessage
Persistable+SameProcessDifferentThread: ServiceWorkerClient -> ServiceWorkerClientPostMessageRunnable
Any+SameProcessDifferentThread: WorkerPrivate{,Parent} -> MessageEventRunnable
Any+SameProcessDifferentThread: XMLHttpRequestWorker -> EventRunnable

All of these support Cloning and Transferring, except for XHR which does not do transferring.

So DifferentProcess implies Persistable (makes sense)
   SameProcessSameThread implies Any (also makes sense, though why I dunno why nsGlobalWindow is always same thread; I guess it's an internal channel for DOM events) 

The last three, with SameProcessDifferentThread, are the ones without a simple relationship, so they're worth focusing on to try to tease out what is nonorthogonal between Class and Scope.

Why are service workers Persistable while DOM workers are Any? I guess DOM workers are part of a page, but service workers are independent (and shared?)
Flags: needinfo?(amarchesini)
And for lth: the Scope is required when creating a StructuredCloneHolder. You have the Class passed into Write, yet it is fixed statically (it is always passed in as a constant). Do you anticipate a need for it to vary dynamically? If so, is there an underlying reason why it is more likely to vary than Scope is?

There doesn't have to be; perhaps Scope should be dynamic as well. It just seems nice to enforce these statically as long as it is correct to do so. The "type" of cloning we're dealing with must agree between writer and reader, so it's safer to communicate that all the time rather than relying on dynamic checks for rare cases. I'm somewhat tempted to make them template parameters to force the check at compile time, except that I think we still go through an untyped layer right now (moving things around as char* buffers.)
Flags: needinfo?(lhansen)
See Also: → 1232973
I believe the Scope can probably be made a template parameter.  I was uncertain, when I wrote the code, about how StructuredCloneHolders were being used exactly, and I decided to be conservative.  I will investigate this.

There is another dimension here that you allude to ("Why are service workers Persistable while DOM workers are Any?")  This is related to bug 1232973 - shared memory should not be sharable with (or from) service workers and shared workers because it is not deadlock-safe to do so; a tab that shares memory with such a worker, then takes a lock (by setting a value in the shared memory), and then disappears (is pushed into the window history or is closed) can make the shared worker hang, waiting for the lock to be released.

That prohibition needs to be in effect for SameProcessDifferentThread cases, of course.

There are some pending patches on that bug, but I withdrew one after baku disliked the other...  anyway, it seems to me that Persistable / Any is a semantic constraint while SameProcess / DifferentProcess is an implementation constraint and that they are orthogonal, hence two separate variables rather than an attempt to extend SameProcess etc as I did in the other bug.

With that view, DifferentProcess needs to imply Persistable (as it does), but the other two are open.  We could fold the Persistable / Any into the Scope enum, so that we'd have SameProcessDifferentThreadPersistable and SameProcessDifferentThreadAny as the two values for cross-thread SC (SameProcessSameThread would imply persistable?  Not sure.)  Better or worse?  When I attempted that on the other bug baku was not really in favor of it.
Flags: needinfo?(lhansen)
Comment on attachment 8790714 [details] [diff] [review]
bug1302036-engine.patch

Review of attachment 8790714 [details] [diff] [review]:
-----------------------------------------------------------------

I'm still not fond of this patch, but I don't really have any concrete alternative to suggest, and it should be pretty easy to redo this once the next concept arises that needs to be resolved. So I guess I'm ok with this, though I'd like to see the updated patch.

I'm tempted to revive my API rewrite, which allowed registering handlers by tag. Then you wouldn't need this, you could register or unregister the shared array buffer tag. But never mind, I'm not going to mess with that.

::: js/public/StructuredClone.h
@@ +57,5 @@
>  
>      SCTAG_TMO_USER_MIN
>  };
> +
> +enum class CloneDataClass {

Class is pretty vague. Is there something that conveys more semantics? CloneDataFilter or CloneDataPolicy or CloneDataRestrictions or something.

Right now, Persistable/Any seems strangely specific. The comments try to make it about in-process meaning, but that feels very much like the sort of implementation restriction you're trying to get away from (to avoid overlap with Scope). To me, it feels like it has more to do with a policy decision (eg service workers vs dom workers).

Sometimes the Scope forces the policy decision. Or maybe not? I could imagine using shared memory across processes, though then you'd be serializing a shm key or something instead of an address.

Anyway, I would prefer a different name, and for the comment to emphasize policy decisions over implementability (shared address space etc.)

@@ +74,5 @@
> +     * affect the ability to transfer.  Instead, ::Persistable
> +     * restricts the ability to persist non-transferable values, such
> +     * as SharedArrayBuffers.
> +     */
> +    Persistable,

Is "persistable" the right term? NoSharedMemory, DisallowRaces, Unshared, ...?

Using "persist" in the comment is similarly confusing. Presumably you don't mean persisting to disk, since you commonly send these between processes or threads. I guess you mean "persist" as in "serialize" as in "generate a structured clone buffer" -- but that has no implications for whether or not addresses are allowed. Especially given the cases where you *could* meaningfully send across a pointer but choose not to, I'd prefer policy-based terminology instead of pulling in persistence or serializability.

@@ +84,5 @@
> +     * Serialized data may contain values that only have in-process
> +     * meaning, eg, addresses (provided those addresses do not point
> +     * to objects that may be moved by GC or other means).
> +     */
> +    Any

Unrestricted?

@@ +244,5 @@
>  JS_WriteStructuredClone(JSContext* cx, JS::HandleValue v, JSStructuredCloneData* data,
>                          JS::StructuredCloneScope scope,
>                          const JSStructuredCloneCallbacks* optionalCallbacks,
> +                        void* closure, JS::HandleValue transferable,
> +                        JS::CloneDataClass cloneDataClass);

Can this go next to scope? (Sorry for the churn.)

::: js/src/builtin/TestingFunctions.cpp
@@ +3871,3 @@
>  "  Serialize 'data' using JS_WriteStructuredClone. Returns a structured\n"
> +"  clone buffer object.  If non-persistable is true then this will serialize\n"
> +"  even data that only make sense in-process (eg, SharedArrayBuffer)."),

I think this function has the potential to get very messy very quickly. Instead of a boolean, please accept an options object and check it for a property named "non-persistable". Or maybe something more general? Like "allowed" that can currently be "any" or "persistable"?

It's a bit of a nuisance, I know. SetGCCallback has some code to cut & paste.

::: js/src/vm/StructuredClone.cpp
@@ +91,5 @@
>      SCTAG_TYPED_ARRAY_OBJECT,
>      SCTAG_MAP_OBJECT,
>      SCTAG_SET_OBJECT,
>      SCTAG_END_OF_KEYS,
> +    SCTAG_SHARED_TYPED_ARRAY_OBJECT, // Obsolete

Probably ought to name this SCTAG_DO_NOT_USE_3 rather than having historical values here.

@@ +96,3 @@
>      SCTAG_DATA_VIEW_OBJECT,
>      SCTAG_SAVED_FRAME_OBJECT,
> +    SCTAG_SHARED_ARRAY_BUFFER_OBJECT,

I don't know that we don't have serialized clone buffers lying around containing SCTAG_JSPRINCIPALS, so I think this'll have to go after them.
Attachment #8790714 - Flags: review?(sphink) → feedback+
The point of confusion in the current patch is that whether a datum is "Persistable" (not a good word, "Rematerializable" would have been better) is not exclusively an aspect of the datum, but also an aspect of our implementation on a given operation system at a given point in time.  If it is possible to share memory across processes using some OS-level shared memory abstraction then there is no reason not to allow that to happen, if other restrictions on the sharing are obeyed, such as not sharing between a dedicated worker and a ServiceWorker.  But as we don't have that functionality now, it would be a change to allow it in the future; and it may be available on some platforms, but not on others.

The distinction between Persistable/Any ignores that problem; it assumes that the current system where we only share in-process is fixed.  (And generously, it may assume that if not fixed, then the ability to share cross-process will be available on all platforms at the same time.)

So the correct level of the API at the Structured Clone level is indeed whether to allow SharedArrayBuffer or not; then higher level code can make a determination about whether to do that, in those terms.  I don't think that's a big change, just a clarification.
Need this for FF51, really, so need to land+uplift before we branch for FF52.
Priority: -- → P1
I think this addresses all comments and the DOM code adapts easily.

Summary: Instead of taking an enum value, the cloner now takes an object that encapsulates serialization policy for certain data types (currently only SharedArrayBuffer).  It's the responsibility of the caller to select the correct policy for the correct types.  This adds transparency where the prior solution had little.
Attachment #8803862 - Flags: review?(sphink)
Blocks: 1312446
Comment on attachment 8803862 [details] [diff] [review]
bug1302036-engine-v2.patch

Review of attachment 8803862 [details] [diff] [review]:
-----------------------------------------------------------------

Yes, this addresses all of my concerns quite nicely, and provides a reasonable point for future expansion (or more importantly, doesn't suggest that the current usage is the whole and final purpose of the added parameters.) Thank you!
Attachment #8803862 - Flags: review?(sphink) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/87a03a7398fc22a06acd2da94949f27f50839683
Bug 1302036 - Make structured clone accept argument that controls serialization of some data types. r=sfink

https://hg.mozilla.org/integration/mozilla-inbound/rev/a92b09519d319779b27dae6604cf17a6ddb410e1
Bug 1302036 - DOM changes to conform to new JS engine API for cloning. r=baku
https://hg.mozilla.org/mozilla-central/rev/87a03a7398fc
https://hg.mozilla.org/mozilla-central/rev/a92b09519d31
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Flags: needinfo?(amarchesini)
(This is the patch that landed on mozilla-central, backported to aurora.)

Approval Request Comment

Background: we are trying to enable shared memory and atomics in JavaScript in FF51, see bug 1309861 for recent discussion and previous uplift requests.  This patch, and its subsequent companion, implement an API change that it is highly desirable to have when the feature ships.  (Namely, a SharedArrayBuffer is no longer required to be present in the transfer list for postMessage().)

[Feature/regressing bug #]:
Shared memory and atomics for JS

[User impact if declined]:
Stale API will be presented to user when the feature ships

[Describe test coverage new/current, TreeHerder]:
There are tests in the JS test suite, and additional tests currently in the proposal repository with Ecma TC39, which must be run manually at this time.  In addition there's a fair amount of content (internal & with partners) that will test this.

[Risks and why]: 
Low risks, as this touches only the new SharedArrayBuffer type.

[String/UUID change made/needed]:
New error messages in js.msg, if that applies.  Otherwise nothing I know of.
Attachment #8805067 - Flags: approval-mozilla-aurora?
(This is the patch that landed on mozilla-central, backported to aurora.)

Approval Request Comment

See the previous comment, for the companion patch, for justification.
Attachment #8805068 - Flags: approval-mozilla-aurora?
Comment on attachment 8805067 [details] [diff] [review]
bug1302036-aurora-structured-clone-js.patch

Enable shared memory feature in 51 aurora by default. Aurora 51+.
Attachment #8805067 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8805068 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
I've updated https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer

I guess ideally we should also update https://hacks.mozilla.org/2016/05/a-taste-of-javascripts-new-parallel-primitives/ if we want people to stop writing/using:
w.postMessage(sab, [sab])   // Transfer the buffer

Also opened https://github.com/tc39/ecmascript_sharedmem/issues/164 for the tutorial in the spec repo and marked bug 1302037 as dev-doc-needed as I suppose it will change this from a warning to an error.

Review appreciated!
Flags: needinfo?(lhansen)
All good points.

Jason, how do we go about tweaking a hacks post to correct for an API that has changed?
Flags: needinfo?(lhansen) → needinfo?(jweathersby)
(In reply to Florian Scholz [:fscholz] (MDN) from comment #29)
> I've updated
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/
> Global_Objects/SharedArrayBuffer

Looks good.

> Also opened https://github.com/tc39/ecmascript_sharedmem/issues/164 for the
> tutorial in the spec repo and marked bug 1302037 as dev-doc-needed as I
> suppose it will change this from a warning to an error.

Fixed.
Flags: needinfo?(jweathersby)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: