Last Comment Bug 608142 - Disallow sending JS objects to a different thread
: Disallow sending JS objects to a different thread
Status: RESOLVED FIXED
[compartments], fixed-in-tracemonkey
: crash, dev-doc-complete, regression, relnote, reproducible
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: x86 All
: -- critical (vote)
: ---
Assigned To: Andreas Gal :gal
:
Mentors:
: 604756 (view as bug list)
Depends on: 649151 617251
Blocks: ipc 604756
  Show dependency treegraph
 
Reported: 2010-10-28 16:20 PDT by Andreas Gal :gal
Modified: 2011-04-11 18:37 PDT (History)
45 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
beta7+


Attachments
patch (6.78 KB, patch)
2010-10-28 16:22 PDT, Andreas Gal :gal
no flags Details | Diff | Splinter Review
patch (2.50 KB, patch)
2010-10-28 16:23 PDT, Andreas Gal :gal
no flags Details | Diff | Splinter Review
patch (4.05 KB, patch)
2010-10-28 17:24 PDT, Andreas Gal :gal
no flags Details | Diff | Splinter Review
patch (3.98 KB, patch)
2010-10-28 17:40 PDT, Andreas Gal :gal
no flags Details | Diff | Splinter Review
patch (7.80 KB, patch)
2010-10-29 14:26 PDT, Andreas Gal :gal
jst: review+
Details | Diff | Splinter Review

Description Andreas Gal :gal 2010-10-28 16:20:54 PDT
Needed for b7 to make sure extension authors can clearly see that the dispatch API isn't intended for cross thread use.
Comment 1 Andreas Gal :gal 2010-10-28 16:22:00 PDT
Created attachment 486775 [details] [diff] [review]
patch
Comment 2 Andreas Gal :gal 2010-10-28 16:23:27 PDT
Created attachment 486777 [details] [diff] [review]
patch
Comment 3 Andreas Gal :gal 2010-10-28 17:24:38 PDT
Created attachment 486803 [details] [diff] [review]
patch
Comment 4 Andreas Gal :gal 2010-10-28 17:40:44 PDT
Created attachment 486807 [details] [diff] [review]
patch
Comment 5 Andreas Gal :gal 2010-10-28 17:41:22 PDT
In some cases thread was coming back as null, which caused the hang. This patch fixes that.
Comment 6 Boris Zbarsky [:bz] (TPAC) 2010-10-28 20:48:33 PDT
I assume web workers are the replacement here?
Comment 7 Boris Zbarsky [:bz] (TPAC) 2010-10-28 20:48:43 PDT
Er, chrome workers.
Comment 8 Andreas Gal :gal 2010-10-28 20:56:18 PDT
Yeah. We will beef up the backstage pass with a couple additional helpers to make that easier. Getting crypto back onto a separate thread will be the first major customer.
Comment 9 Andreas Gal :gal 2010-10-28 21:01:57 PDT
This temporarily broke sync, but with the backed out background crypto sync is working again with this patch applied. so I think this one is good to go into m-c.
Comment 10 Brendan Eich [:brendan] 2010-10-28 23:06:34 PDT
(In reply to comment #8)
> Yeah. We will beef up the backstage pass with a couple additional helpers to
> make that easier. Getting crypto back onto a separate thread will be the first
> major customer.

ChromeWorkers are getting ad-hoc feature additions for sync, but that is just the start of a long list. We haven't even begun to scope out what other APIs might be needed in a ChromeWorker to replace ThreadManager usage (which might be safe in spite of the risks) in existing add-ons, apps, etc.

Selling something we haven't tested or even created as a replacement is fraud, although MT wrappers are kind of phony too. But they use frozen snapshots for closures, so they're not as broken as our platform has been for years. I'm still pretty sure we are better off with MT wrappers for now. We're not going to extend ChromeWorkers much more if we want to ship Firefox 4 soon.

/be
Comment 11 Brendan Eich [:brendan] 2010-10-28 23:10:08 PDT
(In reply to comment #5)
> In some cases thread was coming back as null, which caused the hang. This patch
> fixes that.

Why was thread coming back null?

/be
Comment 12 Andreas Gal :gal 2010-10-29 11:22:24 PDT
*** Bug 604756 has been marked as a duplicate of this bug. ***
Comment 13 Andreas Gal :gal 2010-10-29 11:23:32 PDT
Review ping. I really want to land this to judge the fallout (if there is any).

#11: shutdown procedure. The thread object was de-allocated but other threads were still alive.
Comment 14 Andreas Gal :gal 2010-10-29 11:59:14 PDT
bsmedberg is worried about perf impact, i.e. for networking code that sends a lot of C++ stuff across threads. I will look into doing this in xpconnect, i.e. refusing to unpack a wrappedJS on the wrong thread.
Comment 15 Andreas Gal :gal 2010-10-29 14:26:56 PDT
Created attachment 487029 [details] [diff] [review]
patch
Comment 16 Andreas Gal :gal 2010-10-29 15:08:39 PDT
http://hg.mozilla.org/tracemonkey/rev/71728ff7002c
Comment 17 Brendan Eich [:brendan] 2010-10-29 16:05:34 PDT
All that refcounting for mThread -- cant you use a thread-id?

/be
Comment 18 Andreas Gal :gal 2010-10-29 16:08:17 PDT
I am not very familiar with that code. peterv suggested the COMPtr. I will take anything cheaper that works. Though, this is a pretty slow path (CallMethod).
Comment 19 Brendan Eich [:brendan] 2010-10-29 17:23:47 PDT
(In reply to comment #18)
> I am not very familiar with that code. peterv suggested the COMPtr. I will take
> anything cheaper that works. Though, this is a pretty slow path (CallMethod).

PRThread *PR_GetCurrentThread() is refcount free at home and from your parents' house :-).

/be
Comment 20 Jeff Walden [:Waldo] (remove +bmo to email) 2010-10-29 23:02:27 PDT
I backed out this change from TM, because it turned mochitest-plain-2 orange.
Comment 21 Jeff Walden [:Waldo] (remove +bmo to email) 2010-10-29 23:04:01 PDT
(Andreas, as an aside, would you mind using a commit message format for bug numbers that tbpl will linkify?  Bug foo somewhere in the message cuts it, dunno what else offhand.)
Comment 22 Andreas Gal :gal 2010-10-29 23:25:38 PDT
Thanks Waldo. Sorry for not catching this.
Comment 23 Andreas Gal :gal 2010-10-30 00:26:12 PDT
I don't think my patch is responsible for any of the orange. It looks all random, or stuff that came in later. In other words I think after the backout the tree will still be super orange. I ran a bunch of that orange locally with my patch applied but nothing that came later and it all works. I might be wrong of course, so I will wait for the tinderboxes to cycle and we take it from there.
Comment 24 Josh Matthews [:jdm] 2010-10-30 01:18:59 PDT
Nope, your patch consistently made test_xhr time out on debug machines, which is not a known intermittent orange.
Comment 25 Andreas Gal :gal 2010-10-30 05:34:36 PDT
I tried that test locally, it passes. Also, it doesn't test anything that I change or reasonably affected. I will try-server, but right now I have no indication that #24 is correct.
Comment 26 Andreas Gal :gal 2010-10-30 06:29:58 PDT
I have verified with gdb and with a printf that during that test the extra paths I added are not taken.
Comment 27 Andreas Gal :gal 2010-10-30 07:55:02 PDT
Running that test individually passes. Working all of mochitest hits my error path. The DOM worker XHR code seems to be sending a WrappedJS to the worker that was created on the main thread.

#0  nsXPCWrappedJS::CallMethod (this=0x7fffd3f5fb80, methodIndex=3, info=0x7fffe603df18, params=0x7fffce9fd4d0)
    at ../../../../../js/src/xpconnect/src/xpcwrappedjs.cpp:575
#1  0x00007ffff6442faf in PrepareAndDispatch (self=0x7fffd0474a60, methodIndex=3, args=0x7fffce9fd660, 
    gpregs=0x7fffce9fd5e0, fpregs=0x7fffce9fd610)
    at ../../../../../../../xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_linux.cpp:153
#2  0x00007ffff6443043 in SharedStub ()
    at ../../../../../../../xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_linux.cpp:159
#3  0x00007ffff56e9e52 in nsDOMWorkerMessageHandler::DispatchEvent (this=0x7fffcffe20f0, aEvent=0x7fffcffe3420, 
    _retval=0x0) at ../../../../dom/src/threads/nsDOMWorkerMessageHandler.cpp:329
#4  0x00007ffff56fadb5 in nsDOMWorkerXHREventTarget::DispatchEvent (this=0x7fffcffe20f0, evt=0x7fffcffe3420, 
    _retval=0x0) at ../../../../dom/src/threads/nsDOMWorkerXHR.h:73
#5  0x00007ffff56fe3ec in nsDOMWorkerXHRProxy::HandleWorkerEvent (this=0x7fffd000f200, aEvent=0x7fffcffe3420, 
    aUploadEvent=0) at ../../../../dom/src/threads/nsDOMWorkerXHRProxy.cpp:614
#6  0x00007ffff56e503a in nsDOMWorkerXHREvent::Run (this=0x7fffcffe3420)
    at ../../../../dom/src/threads/nsDOMWorkerEvents.cpp:539
#7  0x00007ffff57042cf in nsDOMWorkerXHRLastProgressOrLoadEvent::Run (this=0x7fffd24353e0)
    at ../../../../dom/src/threads/nsDOMWorkerXHRProxy.cpp:192
#8  0x00007ffff56d2573 in nsDOMWorkerRunnable::RunQueue (this=0x7fffd0140080, aCx=0x7fffd84e8800, 
    aCloseRunnableSet=0x7fffce9fdadc) at ../../../../dom/src/threads/nsDOMThreadService.cpp:493
#9  0x00007ffff56d1fa2 in nsDOMWorkerRunnable::Run (this=0x7fffd0140080)
    at ../../../../dom/src/threads/nsDOMThreadService.cpp:407
#10 0x00007ffff6427830 in nsThreadPool::Run (this=0x7fffd3e35120) at ../../../xpcom/threads/nsThreadPool.cpp:221
#11 0x00007ffff6423939 in nsThread::ProcessNextEvent (this=0x7fffd1fbbaf0, mayWait=1, result=0x7fffce9fdcbc)
    at ../../../xpcom/threads/nsThread.cpp:609
#12 0x00007ffff63a802c in NS_ProcessNextEvent_P (thread=0x7fffd1fbbaf0, mayWait=1) at nsThreadUtils.cpp:250
#13 0x00007ffff6422946 in nsThread::ThreadFunc (arg=0x7fffd1fbbaf0) at ../../../xpcom/threads/nsThread.cpp:277
#14 0x00007ffff3b0d503 in _pt_root (arg=0x7fffd1f88cf0) at ../../../../../nsprpub/pr/src/pthreads/ptthread.c:187
#15 0x00007ffff7bc69ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#16 0x00007ffff029b70d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#17 0x0000000000000000 in ?? ()
Comment 28 Andreas Gal :gal 2010-10-30 08:35:23 PDT
Actually what happens here is that a worker attaches to thread A and creates the WrappedJS and then the worker migrates to thread B and consumes the event there, which is a thread transition, and that breaks. I will switch to a main thread/not main thread flag instead of the thread comparison. That will do for now.
Comment 29 Andreas Gal :gal 2010-10-30 08:44:02 PDT
I backed out the back-out and landed a follow-up fix.

http://hg.mozilla.org/tracemonkey/rev/9c04e4174f4f
Comment 30 Brendan Eich [:brendan] 2010-10-30 09:48:05 PDT
(In reply to comment #28)
> Actually what happens here is that a worker attaches to thread A and creates
> the WrappedJS and then the worker migrates to thread B and consumes the event
> there, which is a thread transition, and that breaks. I will switch to a main
> thread/not main thread flag instead of the thread comparison. That will do for
> now.

It will do if we keep JS_THREADSAFE, or if no non-main thread A <-> B sharing happens, or both.

Based on experience, where add-ons (some with binary components) occasionally use main thread objects from a non-main thread, this is helpful. But to avoid crashes from add-ons that share across multiple threads, we should consider a general way to be safe.

MT wrappers is one, it needs work but it is symmetric.

Something like your patch here, with more hacks to migrate or "hand off" objects so they don't cause the error condition to bite, might work too. Could be whack-a-mole if there are lots of hand-off points and they are hard to see.

/be
Comment 32 Benjamin Smedberg [:bsmedberg] 2010-11-16 06:38:00 PST
Sheppy, can you paste doc links? I want to include the link from the error message I'm adding in bug 610381.
Comment 33 Eric Shepherd [:sheppy] 2010-11-16 06:54:48 PST
All we have are notes in the Thread Manager docs saying you can't do this anymore. If there's more needed, I'm not sure what it would be. Do you want me to put together a little section on this? Anything more that needs saying beyond "If you need to pass JS objects between threads, use a worker instead"?

https://developer.mozilla.org/en/The_Thread_Manager
Comment 34 Benjamin Smedberg [:bsmedberg] 2010-11-17 09:16:54 PST
I really think we need to have some code examples how one might transition from using the event manager to using a ChromeWorker.
Comment 35 Andreas Gal :gal 2010-11-17 09:19:49 PST
Yeah. I am waiting for sync to be ported, and then I would like to blog about that change with the concrete code examples.
Comment 36 Erik Vold 2010-12-06 20:57:05 PST
(In reply to comment #13)
> Review ping. I really want to land this to judge the fallout (if there is any).

bug 617251
Comment 37 Adrià Mercader 2010-12-07 01:36:27 PST
(In reply to comment #34)
> I really think we need to have some code examples how one might transition from
> using the event manager to using a ChromeWorker.

Any idea on when there will be some example / guidance / alternative to the Thread Manager? My extension (https://addons.mozilla.org/en-US/firefox/addon/91406/) is pretty much useless in Firefox 4 and I'm confused on how to update it. Will it be possible to pass Javascript objects or functions to a ChromeWorker? Or only strings like suggested in bug 617251?

Sorry if this isn't the right place to ask, but the release date of Firefox 4 is approaching and the developers are left a little on their own...
Comment 38 Erik Vold 2010-12-07 19:38:54 PST
Why was this patch implemented? I don't see a reason stated here except "to make sure extension authors can clearly see that the dispatch API isn't intended for cross thread use" which seems pretty vague to me, and since the example on https://developer.mozilla.org/en/The_Thread_Manager no longer works, it doesn't seem accurate to me either..
Comment 39 Erik Vold 2010-12-08 21:38:40 PST
Very simple test case: https://github.com/erikvold/thread-test-case

Since that is not working I'm very confused how threads are supposed to be used, can someone please respond?

Also suggesting ChromeWorker be used instead here https://developer.mozilla.org/en/The_Thread_Manager doesn't make much sense to me, since ChromeWorkers don't have access to Components and are basically the same as a regular Worker, and if a developer wanted a Worker then I doubt they'd end up at the Thread Manager documentation. Can someone please shine some light on this decision as well?
Comment 40 Erik Vold 2010-12-08 21:48:41 PST
(In reply to comment #10)
> ChromeWorkers are getting ad-hoc feature additions for sync, but that is just
> the start of a long list. We haven't even begun to scope out what other APIs
> might be needed in a ChromeWorker to replace ThreadManager usage (which might
> be safe in spite of the risks) in existing add-ons, apps, etc.
> 
> Selling something we haven't tested or even created as a replacement is fraud,

My feelings exactly.

> We're not going to extend ChromeWorkers much more if we want to ship Firefox 4 soon.

Which is why I'm so confused why ChromeWorkers are being suggested instead of threads on https://developer.mozilla.org/en/The_Thread_Manager atm..
Comment 41 Nils Maier [:nmaier] 2010-12-08 22:05:23 PST
(In reply to comment #39)
> Very simple test case: https://github.com/erikvold/thread-test-case
> 
> Since that is not working I'm very confused how threads are supposed to be
> used, can someone please respond?

You simply aren't supposed to use background threads to run js runnables any more. That feature is effectively gone.
If ChromeWorkers and/or Jetpack Processes are powerful enough for your task (no need for anything XPCOM related), then good. If not, then run your code on the main thread again: No threads for you. ;)
(Disclaimer: Not my decision, and I still oppose that decision)

> Also suggesting ChromeWorker be used instead here
> https://developer.mozilla.org/en/The_Thread_Manager doesn't make much sense to
> me, since ChromeWorkers don't have access to Components and are basically the
> same as a regular Worker, and if a developer wanted a Worker then I doubt
> they'd end up at the Thread Manager documentation. Can someone please shine
> some light on this decision as well?

The decision was made because the new JS engine introduced some significant changes especially when it comes to strings. Distinct javascript strings may now internally share memory which will cause crashes and/or data havoc if two threads manipulated strings in a way that that the shared memory would manipulated by different threads in parallel or manipulated by one thread and accessed by another thread in parallel.
In order to avoid any crashes and data havoc, and because extension authors and amo-editors are obviously considered incapable of dealing with this new situation themselves, the js team seems to have reached the decision to simply strip away the possibility to use real threads. Instead of making cross-thread data manipulation thread safe again. And/or instead of documenting the changes and proposing work-arounds (like using nsISupportsString and similar to avoid js string havoc).
Comment 42 Boris Zbarsky [:bz] (TPAC) 2010-12-08 22:17:22 PST
The last part (last two paragraphs) of comment 41 are almost entirely wrong.  I'm happy to send correct information via e-mail to anyone who requests; it really doesn't belong in this bug.
Comment 43 Karsten Düsterloh 2010-12-08 23:38:11 PST
(In reply to comment #42)
> The last part (last two paragraphs) of comment 41 are almost entirely wrong. 
> I'm happy to send correct information via e-mail to anyone who requests;

Well, it'd be much better if that information was somewhere in the public (Wiki or something), so you can point people there...
Comment 44 Nate Weiner 2010-12-09 15:19:15 PST
While it does seem like this change is something that needs to happen, I think slipping it into one of the final betas isn't the way to do it.  Even more so without providing an alternative right out of the gate.

My add-on relies on threading with access to components to do a number of long running tasks.  ChromeWorkers do not appear to be a suitable alternative because of the lack of component access.

If a major change is going to be required on the side of the developer, we need more time, warning, and information about what direction to take.  If I have to do a full rewrite, it'd be better to be able to pick the right route to use (binary, c-types, or something else).

If at all possible, I'd vote for reverting this change and saving it for the next major FF release or at least until a suitable alternative exists.
Comment 45 Shawn Wilsher :sdwilsh 2010-12-09 15:47:32 PST
(In reply to comment #44)
> My add-on relies on threading with access to components to do a number of long
> running tasks.  ChromeWorkers do not appear to be a suitable alternative
> because of the lack of component access.
I'm curious about what components your add-on accessed.  The vast majority of components in Gecko are not threadsafe (but they don't throw when accessed off of the GUI thread).
Comment 46 Nate Weiner 2010-12-09 16:14:08 PST
(In reply to comment #45)
> I'm curious about what components your add-on accessed.  The vast majority of
> components in Gecko are not threadsafe (but they don't throw when accessed off
> of the GUI thread).

Thanks Shawn,

Digging through this a bit further there seems to be two major components I access off thread:

"@mozilla.org/storage/service;1"
I pass a database to a thread to do a long running set of selects that builds the users dataset when starting up.  I think I can replace this with a mix of aSyncExecute's and Web Workers to process the results into the final set.

Components.interfaces.nsIFile
The add-on saves a number of files for offline use.  When I delete several files from this directory, I prefer to do it off thread because it can take a second or two.  Is there an alternate way to asynchronously delete files?
Comment 47 Boris Zbarsky [:bz] (TPAC) 2010-12-09 17:23:03 PST
> Components.interfaces.nsIFile

These aren't threadsafe.  Using them from off the main thread is a good way to get corrupt files and crashes.

> Is there an alternate way to asynchronously delete files?

No, sadly (if you want to delete them all at once).  You could delete one file at a time, though, right, with trips to the event loop in between?
Comment 48 Shawn Wilsher :sdwilsh 2010-12-09 17:40:23 PST
(In reply to comment #46)
> "@mozilla.org/storage/service;1"
> I pass a database to a thread to do a long running set of selects that builds
> the users dataset when starting up.  I think I can replace this with a mix of
> aSyncExecute's and Web Workers to process the results into the final set.
Yeah, this is exactly what mozIStorageConnection::executeAsync and mozIStorageBaseStatement:executeAsync were designed for.
Comment 49 Andreas Gal :gal 2010-12-09 17:59:40 PST
Nate, Shawn, can you guys help me describe the extension, the original code, and the async alternative? I am trying to collect a couple examples for a blog post.
Comment 50 Shawn Wilsher :sdwilsh 2010-12-09 18:01:58 PST
(In reply to comment #49)
> Nate, Shawn, can you guys help me describe the extension, the original code,
> and the async alternative? I am trying to collect a couple examples for a blog
> post.
I can certainly help with the mozStorage bits once I see what used to be happening.
Comment 51 Nate Weiner 2010-12-09 18:19:38 PST
(In reply to comment #49)
> Nate, Shawn, can you guys help me describe the extension, the original code,
> and the async alternative? I am trying to collect a couple examples for a blog
> post.

Not a problem and thanks for doing that.  I'm sure there are others who'd benefit from that post.

Here is a generalized version of the file/thread actions.  This function was used when a user 'cleared their cache'.  Basically it deleted a folder and then recreated an empty one.  If this folder had a large cache would lock the browser for an inappropriate amount of time if performed on the main thread.  I also have a function that uses moveTo (instead of delete) if a user wants to relocate their cache.

// set up the action for the thread
var threadHandler = 
{   
	run : function()
	{
		this.folder.remove(true);
	},
        
	QueryInterface: function(iid)
	{
		if (iid.equals(Components.interfaces.nsIRunnable) ||
		iid.equals(Components.interfaces.nsISupports))
                    return this;
	        throw Components.results.NS_ERROR_NO_INTERFACE;
	}
};

// get the folder to delete
var folder = Components.classes["@mozilla.org/file/local;1"].createInstance(Components.interfaces.nsILocalFile);
folder.initWithPath('/location/of/folder/');

// create and run the thread that deletes the folder
var thread = Components.classes["@mozilla.org/thread-manager;1"].getService().newThread(0);
thread.folder = folder;
thread.dispatch(threadHandler, thread.DISPATCH_NORMAL);
Comment 52 Erik Vold 2010-12-09 20:48:21 PST
(In reply to comment #47)
> > Components.interfaces.nsIFile
> 
> These aren't threadsafe.  Using them from off the main thread is a good way to
> get corrupt files and crashes.

Isn't it perfectly fine to use if you create the nsIFile instance in the bg thread and only touch the file from the bg thread?
Comment 53 Boris Zbarsky [:bz] (TPAC) 2010-12-09 20:55:44 PST
Generally speaking, no, from a read through the file code.  Some operations are maybe safe, but not everything on nsIFile.
Comment 54 timeless 2010-12-10 00:52:11 PST
so, instead of actually passing an nsIFile across the thread boundary, you should pass its .spec or the other serialized version and then create a file on the other thread and initialize it from the string serialization .spec/whatever. at least, historically that was the correct way to avoid asserts about threadsafety.

that doesn't speak to bz's reference about there being bits of the file reading class which is unsafe, i'm not sure what he means - i don't recall many statics floating around....
Comment 55 Nate Weiner 2010-12-10 11:03:17 PST
Follow up to using executeAsync as an alternative:

ExecuteAsync appears to run on the main thread.

For example:

statement.executeAsync({
  handleResult: function(aResultSet) {
    for (let row = aResultSet.getNextRow();
         row;
         row = aResultSet.getNextRow()) {

          let value = row.getResultByName("column_name");
    }
  },
  handleError: function(aError) {},
  handleCompletion: function(aReason) {}
});

If the result set in handleResult has a lot of rows, Firefox hangs enumerating through it.

I also tried:

handleResult: function(aResultSet) {
    let p = 0;
    for(let i=0; i<1000000000; i++)
    {
         // waste time
        p = i * 20.4;                   
     }
}

This also locked up Firefox until completed.

Is handleResult suppose to be occurring off thread?  If not, how can you handle a large sql result without blocking?
Comment 56 Boris Zbarsky [:bz] (TPAC) 2010-12-10 12:00:07 PST
> ExecuteAsync appears to run on the main thread.

ExecuteAsync does the database/disk access on a background thread, then calls your callback on the main thread.

> Is handleResult suppose to be occurring off thread? 

No.

> If not, how can you handle a large sql result without blocking?

You break up your handleResult across trips back to the event loop?  Either using explicit setTimeout calls or using a generator plus timeouts to drive it.  There are lots of well-understood techniques for doing this in JS...
Comment 57 Philipp Kewisch [:Fallen] 2010-12-10 12:47:06 PST
(In reply to comment #45)
> I'm curious about what components your add-on accessed.  The vast majority of
> components in Gecko are not threadsafe (but they don't throw when accessed off
> of the GUI thread).

Lightning uses libical, or rather the xpcom libical wrapper we created. Its advisable to do this on a thread, since the processing takes a moment and with tons of events it really slows down the UI if the processing is done on the main thread. We've been there before and I don't really want to have to go back there again.

Ideally, we could also move creating the calIEvent xpcom object into the thread, but this currently fails. I don't really need to access these objects from multiple threads, for me it would be sufficient if I could hand the objects back to the main thread as soon as I am done processing them.
Comment 58 Shawn Wilsher :sdwilsh 2010-12-10 13:57:31 PST
(In reply to comment #56)
> You break up your handleResult across trips back to the event loop?  Either
> using explicit setTimeout calls or using a generator plus timeouts to drive it.
Note that you will get multiple handleResult calls.  Storage will dispatch an event to the main thread ever 75ms or 15 rows (whichever comes first).
Comment 59 Nate Weiner 2010-12-10 15:12:28 PST
Here is a first stab at making an asynchronous method for reading a large result set from mozStorage:

https://gist.github.com/736966

It seems to be working pretty well so far, Shawn and Boris, since it was built on your suggestions, let me know if you find any pitfalls in it.
Comment 60 Shawn Wilsher :sdwilsh 2010-12-10 15:32:19 PST
(In reply to comment #59)
> It seems to be working pretty well so far, Shawn and Boris, since it was built
> on your suggestions, let me know if you find any pitfalls in it.
mozStorage already breaks it into chunks for you.  I think it'd probably be better to increase the interval by 100-200ms each time handleResult is called if you are getting too many callbacks.

That, or just push each row onto an array, and every Xms process N rows (or until empty).
Comment 61 Ben Turner (not reading bugmail, use the needinfo flag!) 2010-12-10 15:50:29 PST
I just filed bug 608142 to get ChromeWorkers access to (a very small number of) XPCOM objects.
Comment 62 Ben Turner (not reading bugmail, use the needinfo flag!) 2010-12-10 15:53:15 PST
(In reply to comment #61)
> I just filed bug 608142 to get ChromeWorkers access to (a very small number of)
> XPCOM objects.

Doh. Bug 618484.
Comment 63 Brendan Eich [:brendan] 2010-12-11 17:10:34 PST
https://wiki.mozilla.org/JavaScript/HandlingThreads is a slightly out of date, way-too-generous in assessing alternatives that I think are not realistic at all, wiki page on what to do.

I'm in favor of keeping JS runnables mostly working if the closure passed cross thread does not use main-thread-only objects (already an error due to bug 612745) or rely on mutating variables in its environment. I'm going to work on this over in bug 566951. I will need help from add-on authors who implement nsIRunnable in JS and do not abuse main-thread-only objects and mutable upvars.

Giving ChromeWorkers some XPCOM access is ok but it's a bit late to be not only implementing this but selling it as the migration path. This should have happened a release ago so we could actually test migration, iterate on design and code bug fixes, and be sure add-on developers could, would, and ultimately did migrate.

Since we didn't do anything like that, I think bug 566951 needs fixing now. We'll be able to deprecate JS nsIRunnable later, without a crisis that's already upon us due to poor planning, can-kicking, or just too much work to do and too few to do it.

/be
Comment 64 Brendan Eich [:brendan] 2010-12-11 17:13:32 PST
> I'm in favor of keeping JS runnables mostly working if the closure passed cross
> thread does not use main-thread-only objects

... off the main thread, of course.

There are lots of ways to go wrong with the old JS runnable / thread manager API but also ways to go right. Bath water, baby -- at least until we get a better bath tub.

/be
Comment 65 Boris Zbarsky [:bz] (TPAC) 2010-12-11 17:22:18 PST
> if the closure passed cross thread does not use main-thread-only objects

I should note that every single example I've looked at so far in extensions does.... just not reliably.  Not only that, but the patch for bug 612745 doesn't help much: an exception is thrown, but the corruption has already occurred.

> I will need help from add-on authors

It hasn't happened in the past; what makes you think it'll start happening now?
Comment 66 Boris Zbarsky [:bz] (TPAC) 2010-12-11 17:34:58 PST
I apparently underquoted in comment 65, since at least one person misunderstood me.  Addon authors have been helping all along here.  I just don't think we're giving them the tools to help in the particular way Brendan asks for: by not abusing main-thread-only objects and mutable upvars.  We have no documentation on what objects are main-thread-only.  We have no good way for addon authors to easily tell when they're entraining or accessing such objects.  We have no good documentation for addon authors about what the heck an upvar _is_, much less what constitutes a mutable upvar.

So unless we actually do something to address the above, I don't think addon authors will be able to help in the way Brendan hopes, much as they'd like to.
Comment 67 Brendan Eich [:brendan] 2010-12-11 21:25:33 PST
(In reply to comment #65)
> > if the closure passed cross thread does not use main-thread-only objects
> 
> I should note that every single example I've looked at so far in extensions
> does.... just not reliably. 

We need to be precise now: do you mean the closure uses main-thread-only objects off the main thread, or on it? I've seen JS runnables sent to the main thread precisely to interact with main-thread-only objects, when I last skimmed AMO sources.

> Not only that, but the patch for bug 612745
> doesn't help much: an exception is thrown, but the corruption has already
> occurred.

If that's true, please file a bug. The intent was to throw before any corruption could occur.

> It hasn't happened in the past; what makes you think it'll start happening now?

We are (trying to -- please file that bug if real) make it an error to use a main-thread-only object off the main thread. This should put clear diagnostics in the error console (see followup bug 618126). I'm considering making assignments to upvars in frozen closure environment clones always throw (as if in strict-mode code), so that should produce more smoking guns.

We can't do much better without trying, provided we have clear diagnostics of what to avoid and still pass a closure cross-thread. We've certainly done a lot worse by not trying and just kicking the can. ChromeWorkers are not compatible enough, however superior and future-proof they may be, as I've said from day one in this debate.

So again: speak now or I'm going to hack a patch to bug 566951.

/be
Comment 68 Boris Zbarsky [:bz] (TPAC) 2010-12-12 07:53:48 PST
> do you mean the closure uses main-thread-only objects off the main thread, or
> on it?

The former.  And it often does that without even realizing it, I bet (e.g does something like access Components in the closure, when the runnable was posted from a Window, not a component).

> The intent was to throw before any corruption could occur.

Let's move this discussion to bug 612745.
Comment 69 Boris Zbarsky [:bz] (TPAC) 2010-12-12 08:01:25 PST
> We are (trying to -- please file that bug if real) make it an error to use a
> main-thread-only object off the main thread.

Er, forgot this part.  I think we've prevented creation of a new wrapper for such objects.   I don't see how we've generally prevented their use.  And now we're relying on extension authors to test all codepaths to make sure they don't hit this new error and then change their code to avoid it.  Some certainly will.  I'm worried that "some" won't be nearly the same thing as "all"....

> I'm considering making assignments to upvars in frozen closure environment
> clones always throw 

That would help a lot, yes.

I agree ChromeWorkers are not all that compatible.  I'm not sure runnables will be either, once we actually disallow the unsafe bits...  I guess we'll see.  Far be it from me stopping someone from doing work.  ;)
Comment 70 Shawn Wilsher :sdwilsh 2011-04-11 17:58:58 PDT
This did break user defined SQL functions written in JavaScript from running asynchronously, which means add-on authors have to do disk I/O on the GUI thread (bug 649151).

Note You need to log in before you can comment on or make changes to this bug.