Closed Bug 566951 (mt-wrappers) Opened 10 years ago Closed 8 years ago

Implement multi-threaded object wrappers.

Categories

(Core :: JavaScript Engine, defect, P2)

defect

Tracking

()

RESOLVED WONTFIX

People

(Reporter: gal, Assigned: gal)

References

Details

(Whiteboard: [compartments])

Attachments

(2 files, 13 obsolete files)

No description provided.
Brendan is in the process of eliminating object scopes, which currently contain a title/lock. We want to drop that and instead require explicit sharing of objects. This will be implemented using JS_Becomes, which turns the object and its proto chain into a proxy-based wrapper that synchronizes access while maintaining identity.
Depends on: harmony:proxies
OS: Mac OS X → All
Priority: -- → P2
Hardware: x86 → All
Attached patch patch (obsolete) — Splinter Review
Assignee: general → gal
Attached patch patch (obsolete) — Splinter Review
Attachment #446416 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
Implement object sharing. This patch adds a proxy-based wrapper around objects. Existing objects can be shared using ShareObject(obj), which substitutes a new wrapper around obj for obj. The proto chain is shared as well. The parent chain is not. Access to the object is synchronized. Strings read from the object are made immutable. Objects read from it are shared as we read them. A new shell command (share) is available for testing. This patch essentially re-purposes threaded shell builds. The VM is still thread safe, but title is gone from scope and objects are no longer thread safe by default. Only after we apply share(obj) is that object safe to be touched from multiple threads. Unshared object access is as fast as in a single-threaded VM.
Attachment #446422 - Attachment is obsolete: true
Depends on: 567055
Attached patch patch (obsolete) — Splinter Review
Attachment #446446 - Attachment is obsolete: true
(In reply to comment #4)
> Existing objects can be shared using ShareObject(obj), which substitutes a new
> wrapper around obj for obj. The proto chain is shared as well. The parent chain
> is not.

Why not? I.e. what happens when someone shares a closure across threads?
Attached patch patch (obsolete) — Splinter Review
Report error in single threaded builds when attempting to share an object.
Attachment #446468 - Attachment is obsolete: true
(In reply to comment #6)
> (In reply to comment #4)
> > Existing objects can be shared using ShareObject(obj), which substitutes a new
> > wrapper around obj for obj. The proto chain is shared as well. The parent chain
> > is not.
> 
> Why not? I.e. what happens when someone shares a closure across threads?

Also, the patch does not deal with __proto__ manipulations for shared objects. We should either disallow that for shared objects or make sure that all objects are shared on the new proto chain.
Attached patch rip out some dead code (obsolete) — Splinter Review
Attachment #446470 - Attachment is obsolete: true
Attached patch a bit more cleanup (obsolete) — Splinter Review
Attachment #446476 - Attachment is obsolete: true
I am going to leave LOCK_OBJ/UNLOCK_OBJ in place for now so I don't conflict the hell out of brendan's patch stack.
The current object locking code makes sure that lock is not held accross 
a getter call etc. To be a valuable replacement the new proxy-based approach must make sure that this still holds. Otherwise it would be very difficult to avoid deadlocks.

For reasons like that I am still not convinced that share(obj) is the right approach. A more useful alternative IMO would be to make object shared at birth plus an option to create a special proxy that would allow to access from another thread any object. When such proxy will be used to access the object from non-home thread when the home thread is running, it will queue its operation on the home thread and trigger an operation callback there so the home thread can quickly suspend and carry the necessary operation.
(In reply to comment #6)
> (In reply to comment #4)
> > Existing objects can be shared using ShareObject(obj), which substitutes a new
> > wrapper around obj for obj. The proto chain is shared as well. The parent chain
> > is not.
> 
> Why not? I.e. what happens when someone shares a closure across threads?

Error -- an object can refuse to become another object, for Proxy.fix. An object can refuse to be frozen/sealed/preventExtensions'ed. See bug 558866 comment 4.

/be
(In reply to comment #13)
> Error -- an object can refuse to become another object, for Proxy.fix.

What about ordinary top-level functions that access some globals? Would they also be banned from sharing?

And what about non-function objects and the parent lookup in the security manager?
(In reply to bug 566818 comment #26)
> JS_FixProxy is not enough for JS_THREADSAFE elimination.

AFAICS without the parent chain sharing the current patch would make impossible to share scripted functions. That means that many valid current JS_THREADSAFE uses would be impossible to port using the share(obj). On the other hand sharing the parent chain implies shared globals and that is not supported.

So IMO instead of providing share(obj) it would be better to  direct script writers towards message passing interfaces or provide other alternatives like creating proxy for existing objects without changing the objects themself or provide an option to create thread-safe domain where everything is thread safe and slow.

Any of the above options does not need JS_Becomes and leaves JS_FixProxy the single user of JSObject::swap.
Message passing != shared object concurrency, and especially non-concurrency (hand-off, as in nsIThreads and runnables). You can rewrite almost any concurrent program using a different concurrency model, but at high cost.

Directing people to message passing is a good idea but it won't cause non-trivial rewrites to happen on any schedule. Breaking compatibility could force rewrites or simply drive addon developers to other browsers, whatever the concurrency model offered. Multiple factors come into play once the rewrite tax takes effect.

The plan for functions is a bit more subtle: if the function is a freshly created function expression and it has no free variables then it can be handed off cross-thread.

More work needed here, but in general we do not want wrapped functions or globals -- no non-natives on the scope chain (this may run into trouble with DOM event handlers, but currently DOM nodes are native -- they have JSScope maps and they emulate native JSObjectOps). 

/be
discussion with jorendorff:
- the stuff can deadlock like hell (totally, didn't even look at the looking stuff yet)
- have to catch exceptions (JSAutoExceptionSharer looks at cx->exception and shares it)
- when asked to wrap a global or function, throw an exception
Possible fix for the deadlocks: whenever you pass through a MT-proxy, unlock whichever object (if any) the calling thread currently has locked, before locking the wrappee. This can also be done with an RAII helper.
(In reply to comment #18)
> Possible fix for the deadlocks: whenever you pass through a MT-proxy, unlock
> whichever object (if any) the calling thread currently has locked, before
> locking the wrappee. This can also be done with an RAII helper.

That does not work in general since if thread 1 unlocks an object A to lock B then there is no guarantee that A is is a thread-safe state meaning that another thread 2 can mutate that in unexpected ways.
(In reply to comment #16)
> The plan for functions is a bit more subtle: if the function is a freshly
> created function expression and it has no free variables then it can be handed
> off cross-thread.

With the old model MT embeddings can share functions and objects via a common prototype for thread-private global objects. Inability to share functions would immediately kill that option. So what would be the message for those embeddings?
(In reply to comment #19)
> (In reply to comment #18)
> > Possible fix for the deadlocks: whenever you pass through a MT-proxy, unlock
> > whichever object (if any) the calling thread currently has locked, before
> > locking the wrappee. This can also be done with an RAII helper.
> 
> That does not work in general since if thread 1 unlocks an object A to lock B
> then there is no guarantee that A is is a thread-safe state meaning that
> another thread 2 can mutate that in unexpected ways.

Let's separate MT-proxy for A from underlying ST-only object A -- call the former MTProxy(A). The title lock is associated with MTProxy(A). The only way threads can mutated A is via MTProxy(A). Once A becomes MTProxy(A), the relocated A is not accessible to anything but MTProxy(A).

So how could A be in a thread-unsafe state at any "pass-through" point (handler trap call)?

(In reply to comment #20)
> (In reply to comment #16)
> > The plan for functions is a bit more subtle: if the function is a freshly
> > created function expression and it has no free variables then it can be handed
> > off cross-thread.
> 
> With the old model MT embeddings can share functions and objects via a common
> prototype for thread-private global objects. Inability to share functions would
> immediately kill that option. So what would be the message for those
> embeddings?

There's more going on: JM if not TM want mutable thread-local scripts, which imply thread-local function objects. But compiler caching wants immutable and unspecialized shared scripts (shared among threads). So we're going to have both ST-only functions and scripts, and probably (no one has signed up for this) immutable shared functions/scripts.

See bug 567198 at least.

Frozen objects can be shared without MTProxy creation -- gal and I discussed this and he will make a fast path that does not have the frozen object "become" its MTProxy, instead it will be returned by ShareObject. It may be that such embeddings will need to freeze shared functions.

The hard case for functions in the Firefox embedding is exemplified by

http://hg.mozilla.org/mozilla-central/annotate/825acf49308c/toolkit/components/passwordmgr/src/nsLoginManagerPrompter.js#l149

The nsIThread gets a runnable whose run function uses upvars hashKey and self from LoginManagerPromptFactory.prototype._doAsyncPrompt. These are set before the closure is formed and so can be flattened into a flat (or partially flat) closure for run. A flat closure is immutable and can be shared cross-thread.

But run also uses Components from its global scope. My thought was to make JS XPCOM components have a thread-safe global object, details to be worked out.

/be
(In reply to comment #21)
> Let's separate MT-proxy for A from underlying ST-only object A -- call the
> former MTProxy(A). The title lock is associated with MTProxy(A). The only way
> threads can mutated A is via MTProxy(A). Once A becomes MTProxy(A), the
> relocated A is not accessible to anything but MTProxy(A).
> 
> So how could A be in a thread-unsafe state at any "pass-through" point (handler
> trap call)?

Suppose A has a scripted getter. If that getter would access another shared object MTProxy(B), that would unlock MTProxy(A) per comment 18. But what then ensures that at the moment of the getter call the object A was in a consistent way and nothing were stored in the local variables on the native that cannot be deleted/altered from another thread?
(In reply to comment #21)
> There's more going on: JM if not TM want mutable thread-local scripts, which
> imply thread-local function objects. But compiler caching wants immutable and
> unspecialized shared scripts (shared among threads). So we're going to have
> both ST-only functions and scripts, and probably (no one has signed up for
> this) immutable shared functions/scripts.

That is why I think it would be very valuable to have a proxy that can be applied to any object and that would allow to access thread-safely the object from non-home threads without affecting the object itself. This way an access to ST functions and scripts can be provided.
(In reply to comment #22)
> (In reply to comment #21)
> > Let's separate MT-proxy for A from underlying ST-only object A -- call the
> > former MTProxy(A). The title lock is associated with MTProxy(A). The only way
> > threads can mutated A is via MTProxy(A). Once A becomes MTProxy(A), the
> > relocated A is not accessible to anything but MTProxy(A).
> > 
> > So how could A be in a thread-unsafe state at any "pass-through" point (handler
> > trap call)?
> 
> Suppose A has a scripted getter. If that getter would access another shared
> object MTProxy(B), that would unlock MTProxy(A) per comment 18. But what then
> ensures that at the moment of the getter call the object A was in a consistent
> way and nothing were stored in the local variables on the native that cannot be
> deleted/altered from another thread?

The same situation arises with getters today. We unlock before calling the getter, relock after. These unlock and relock operations are done via JS_LOCK_OBJ and JS_UNLOCK_OBJ or their _SCOPE underpinnings.

These macros do not imply any barrier, however. So another thread can race with the getter and see stale cache data for the object or its scope -- but not for obj->scope()->title.ownercx because that is updated only under protection of the rt->gcLock mutex, which has a barrier (flushes write buffers). So another thread would see the wrong ownercx, or null ownercx (meaning thin CAS or fat locking is required), and contend for obj.

But with MTProxy the same solution should be in effect. We have the title in the MTProxy. It unlocks a held MTProxy on entry to the trap, then locks the current proxy's title; and unlocks/relocks after. Execution is still within a request. The MTProxy's title usage either goes through a CAS or a full fat lock barrier, or relies on ownercx being the current cx while it is in its request.

Jorendorff's proposal simply moves the unlock/relock out of js_Native[GS]et and into the proxy handler trap dispatch code.

If you meant some other kind of inconsistency than what weak store order allows I don't know what it is. The memory consistency model should be the same here.

/be
(In reply to comment #23)
> (In reply to comment #21)
> > There's more going on: JM if not TM want mutable thread-local scripts, which
> > imply thread-local function objects. But compiler caching wants immutable and
> > unspecialized shared scripts (shared among threads). So we're going to have
> > both ST-only functions and scripts, and probably (no one has signed up for
> > this) immutable shared functions/scripts.
> 
> That is why I think it would be very valuable to have a proxy that can be
> applied to any object and that would allow to access thread-safely the object
> from non-home threads without affecting the object itself.

This requires message passing, to serialize all execution through the home thread's event loop. Which in turn can result in calls to other objects which would need to be serialized. You can't do this transparently unless you nest event loop but this violates run-to-completion.

We may indeed need MTProxies on the scope chain, at least for the JS XPCOM component global object. We may have non-natives on the scope chain behind the cacheable objects (Block, Call, etc.) -- for DOM event handlers if we use Proxy for the DOM. This all may be necessary in spite of "no non-natives on the scope chain" being the new rule we are trying to put in place.

But whatever we do, I do not see how message-passing is compatible with any of the code, Firefox toolkit/browser/add-on, which is currently assumed to be in Firefox 4 and still working.

/be
Attached patch patch (obsolete) — Splinter Review
This patch wraps exceptions and avoids deadlocks by preempting held locks as we recurse into nested membrane crossings. At most one lock is held at all times (!) by a thread (cx->thread->lockedTitle). If we want to take another lock, we unlock it first and restore it after the operation. As long the membrane doesn't leak this should be sound.
Attachment #446479 - Attachment is obsolete: true
> (In reply to comment #24)
> Jorendorff's proposal simply moves the unlock/relock out of js_Native[GS]et and
> into the proxy handler trap dispatch code.

Right, the rule about single lock is effectively observed currently in the code and that proposal just moves the points of the lock release. The only problem is that various callbacks like error reporting etc. will be called with the lock held, but that can be dealt with checks for cx->thread->lockedTitle from the new gal's patch and releasing it at the point of the callback call.

Also until something is done with the functions and their parents the whole point is not relevant as there would be no scripted code running inside the lock. 

(In reply to comment #25)
> This requires message passing, to serialize all execution through the home
> thread's event loop. Which in turn can result in calls to other objects which
> would need to be serialized. You can't do this transparently unless you nest
> event loop but this violates run-to-completion.

The event loop is needed only outside the request. When home thread is running JS request and the proxy is accessed from another thread the proxy triggers operation callback so the home thread can quickly suspend and carry out the necessary operation. Outside the request some interation with the event loop is necessary indeed unless one consider doing something like temporary reassigning JSThread associated with the object with the caller native thread and running the operation on the caller thread. 

> 
> But whatever we do, I do not see how message-passing is compatible with any of
> the code, Firefox toolkit/browser/add-on, which is currently assumed to be in
> Firefox 4 and still working.

This is very true.
Attached patch patch (obsolete) — Splinter Review
Clone and then seal scope objects on the parent chain when an object is shared.
Attachment #446601 - Attachment is obsolete: true
(In reply to comment #17)
> - when asked to wrap a global or function, throw an exception

That wrap reject should also include at least Arguments objects and E4X. In both cases quite a few places would be very surprised to find a proxy class where they expect JS_ArgumentClass or XML.

And it seems that a lot of other classes cannot be made shared. For example, consider the date_setTime method. That method calls ValueToNumber before calling the implementation. If ValueToNumber would turn the object into the proxy, then the latter call to SetUTCTime would reinterpret the Proxy as the Date.
Attached patch patch (obsolete) — Splinter Review
When sharing an object, keep track of all the scopes we snapshot and re-use them instead of re-snapshotting all the time. The top level clone operation is the snapshot point. We still error on trying to share functions. I will work on that next.
Attachment #446636 - Attachment is obsolete: true
I currently don't share objects that I snapshot into the scope object clones. That's wrong. I think we have to share those, too. That's going to get pretty nasty of course. It means that if you share anything, you really share your global object and everything in it and anything that touches. But I guess that's the price of MT objects.
Attached patch patch (obsolete) — Splinter Review
Ok, I think this is promising. I am now cloning the scope chain into a fresh snapshot, but I no longer share values eagerly. Instead, a class getter on ClonedScopeClass/ClonedGlobalClass will share the value right before its read. This will make a bunch of eager copies of the global object, but it will no longer share all the properties (especially function properties) on the global object. That only happens when those are actually touched from a different thread. This might be a quite feasible compromise.
Attachment #446646 - Attachment is obsolete: true
(In reply to comment #32)
> Created an attachment (id=446662) [details]
> patch
> 
> Ok, I think this is promising. I am now cloning the scope chain into a fresh
> snapshot, but I no longer share values eagerly. Instead, a class getter on
> ClonedScopeClass/ClonedGlobalClass will share the value right before its read.
> This will make a bunch of eager copies of the global object, but it will no
> longer share all the properties (especially function properties) on the global
> object. That only happens when those are actually touched from a different
> thread. This might be a quite feasible compromise.

This is precisely what I proposed here:

http://groups.google.com/group/mozilla.dev.tech.js-engine/msg/af16c4039f402ceb

Sorry if it wasn't clear. What you describe in comment 31 is the "Ice-9 disaster" that I said we should not do. (Google it!)

So, all's well that ends well, knock on wood.

/be
(In reply to comment #32)
> Created an attachment (id=446662) [details]
> patch
> 
> Ok, I think this is promising. I am now cloning the scope chain into a fresh
> snapshot, but I no longer share values eagerly.

It would be nice if the patch would not share read-only objects. This way one can continue to use a common read-only prototype for thread-local globals. To communicate between threads the prototype would use few special-purpose native or mutable thread-safe objects. IIRC a seup like this is what MikeM uses.

But that would requre a function like makeThreadSafe(scopeObject, array_of_mutable_objects). The function would make any object reachable from scopeObject readonly unless the object is in array_of_mutable_objects. Those would be turned into mutable thread-safe objects.

What is not clear is what to do when those mutable objects would be used to store single-threaded objects. Again, ST objects could be either made read-only or shared depending on the policy.
Attached patch patch (obsolete) — Splinter Review
The scope chain is now properly captured and shared when a property is accessed.

var x = ({});

share(x);

with (parent(x)) {
    print("Hello");
}

The above code fails with a sharing exception because print can't be shared.
Attachment #446662 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
Sharing functions via the scope chain works now. The function gets cloned and sealed as we read it off the scope chain. The with example above no longer throws.
Attachment #446818 - Attachment is obsolete: true
Attached patch patchSplinter Review
This patch avoids the leak described above by eagerly cloning and sealing all functions. This means that if any object is shared, we walk its parent chain to the global object and clone and seal the global object, which causes all its function properties to be cloned and sealed. Object properties of the global object are not cloned. Instead, they will be shared when accessed.
Attachment #446832 - Attachment is obsolete: true
Alias: mt-wrappers
Blocks: 584198
Blocks: compartments
This looks good, but I wonder if it can go in before my patch for bug 558451? I would take the rebase hit. What needs to be done besides rebasing the patch to get it to reviewable state?

/be
I will rebase and try to land.
rebasing this is a huge pita, but getting close
Attachment #464188 - Attachment is patch: true
Attachment #464188 - Attachment mime type: application/octet-stream → text/plain
removing the harmony:proxies dependency to make bug tracking easier.
No longer depends on: harmony:proxies
Whiteboard: [compartments]
Depends on: 558861
Blocks: 606029
No longer blocks: 606029
Let's discuss the requirements here.

To summarize what's already been said:

Object and string allocation used to be thread-safe. Sharing objects across threads used to be thread-safe (with some known hazards). Compartments break these longstanding JS_THREADSAFE API guarantees.

For binary extensions and embeddings, the upgrade path is to use compartments properly. So those use cases don't concern us in this bug. That leaves JS code that uses threads, which I think means nsIThread.dispatch. (Embeddings that do something *like* nsIThread.dispatch should be able to reuse our solution. But we're not making a hard requirement of it.)

Gecko only uses nsIThread.dispatch to fire events to the current thread. That will continue to work. So what is left to worry about? Only using nsIThread.dispatch to send functions from one thread to another.

There will be cases we can support well and cases we won't be able to support at all (e.g. obviously turning a global object into a proxy is right out). There will probably also be a lot of murky middle ground where what we can accomplish will be subtly different from the old behavior--and might or might not be close enough. I think the key here is to delineate exactly what kind of functions we want to support sending cross-thread, and make nsIThread.dispatch(any_other_function) throw.

We have to draw the boundary in such a way that MT proxies end up being usable, robust, and (crucially) low-maintenance in the long term. We don't need another E4X.
Is this going anywhere close to 2.0 ? In calendar land we have a few cases where it makes sense to dispatch a function call to a different thread, which creates and initializes an xpcom object and then hands this object back over to the main thread. If I understand things correctly, mt wrappers will help us here.
Yes, this should get fixed for Mozilla 2. We're strictly less crashy if we do this and of course test it, than if we make a wish and leave JS thread manager usage as is. We're strictly more compatible than if we disable JS thread manager usage (as we've done for now in b7).

Jason, see comment 37. The last patch did clone the global object, with function objects cloned/frozen, and other object-valued global properties lazily shared (MT-wrapped). We rely on the request model still to serialize such incremental and on-demand wrapping.

/be
Requesting blocking based on last comment and bug 608142 comment 63.
blocking2.0: --- → ?
Andreas, do you think this should block?
No. We have a patch in review that allows to send MT-safe wrapped natives to chrome workers, and also make MT-safe wrapped natives on a worker thread. Sounds like that will do.
blocking2.0: ? → ---
(In reply to comment #48)
> No. We have a patch in review that allows to send MT-safe wrapped natives to
> chrome workers, and also make MT-safe wrapped natives on a worker thread.
> Sounds like that will do.

That sounds like it could help, but "that will do" is pure untested wishful thinking. We need the add-on developers here to try using workers and the new XPCOM natives they can call before we make a decision about this bug.

/be
Is there any reason to move forward with this bug?
The world didn't end, so I don't think so, at least not the particular approach in this bug.  Other, less scary/general forms of cross-thread/process proxying are being experimented with, but that is for a different bug.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.