Last Comment Bug 650411 - assert that JSRuntime is only used in a single-threaded manner
: assert that JSRuntime is only used in a single-threaded manner
Status: RESOLVED FIXED
: dev-doc-needed
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
: -- normal with 4 votes (vote)
: mozilla10
Assigned To: Luke Wagner [:luke]
:
Mentors:
: 547712 (view as bug list)
Depends on: new-web-workers 663479 667535 668915 673473 674571 674597 675068 676376
Blocks: 720018 cpg 675078 678984 680562 688069 720013 720045
  Show dependency treegraph
 
Reported: 2011-04-15 15:21 PDT by Luke Wagner [:luke]
Modified: 2012-01-20 16:36 PST (History)
20 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
assert single-threaded runtime (23.67 KB, patch)
2011-07-13 12:07 PDT, Luke Wagner [:luke]
no flags Details | Diff | Review
add asserts to JS and XPConnect (54.95 KB, patch)
2011-07-28 16:05 PDT, Luke Wagner [:luke]
jorendorff: review+
dmandelin: superreview+
Details | Diff | Review

Description Luke Wagner [:luke] 2011-04-15 15:21:43 PDT
Bug 649537 should make it *possible* to have JSRuntime be single-threaded from the Mozilla perspective.  However, as bug 649537 comment 18 states, the goal for that bug is limited to workers and DOM perf, not to actually make JSRuntime require single-threaded use.  That discussion is for this bug.

Benefits include:
 - much simpler mental model for engine, especially important for newcomers/interns
 - we can hoist a bunch of stuff out of compartments and into the JSRuntime (which is a precondition for bug 650353) and generally use the JSRuntime without taking locks (e.g., atomization)

Note that the second item could also be solved by adding yet another new concept (proposed as "zones" in bug 621205).

Note that, using multiple JSRuntimes would *not* require multiple processes/IPC -- just that sharing between JSRuntimes be handled by the embedding (e.g., as is being done in bug 649537 for workers) via native classes or proxies.
Comment 1 Luke Wagner [:luke] 2011-07-13 12:07:49 PDT
Created attachment 545718 [details] [diff] [review]
assert single-threaded runtime

This bug is getting close.  The attach patch gives JSRuntime an "ownerThread" and asserts this is the current thread on every CHECK_REQUEST.  With the patches in the dependencies applied (and with one little 'quell' hack to silence one case that hasn't been fixed yet)), the patch passes try.

I'll make a newsgroup post announcing the intention to make JSRuntime officially single-threaded.
Comment 2 Mike Moening 2011-07-13 13:01:51 PDT
Talk about throwing the baby out with the bath water...
So what happens to those of us that heavy use threads, context pools etc...
Those of use that can't wait for real compartment level GC.

How are we supposed to have 500 runtimes going at the same time and not run out of memory?  Did the JSRuntime suddenly get lightweight?  Tell me how to accomplish this in the "new world".

Multi-threading support is one of the most useful features of the spidermonkey.
It's really not that hard to learn.  A simple MT-Hello World apps is all a guy needs. Maybe even better documentation!

Why hoist stuff out of compartments?  I thought we were going to have 1 compartment per thread?  Isn't that lock free?  What am I missing.

Maybe I'm making too much of this... I just seems like a bad path to take for such small gains.
Comment 3 Andreas Gal :gal 2011-07-13 13:38:42 PDT
My understanding is that JSRuntime will become very lightweight, yes.
Comment 4 Mike Moening 2011-07-13 13:41:47 PDT
How lightweight?  Who owns the memory pools?  Aren't those big & fat? Will there be a memory manager hoisted out of the runtime now?
Comment 5 Andreas Gal :gal 2011-07-13 13:46:32 PDT
Compartments are already the memory manager, so thats hoisted out. My biggest worry is fragmentation (compartment pools are 2MB each). We might have to go to smaller pools for that. Lets start with some numbers. Measure often, cut once.
Comment 6 Mike Moening 2011-07-13 13:51:15 PDT
Can you explain how one might setup spidermonkey if you needed to run between 200-500 threads?
Assumptions are that scripts are compiled, saved and run many times on many different contexts.
How many runtimes, compartments, contexts etc?
How would this work?

Thanks
Comment 7 Andreas Gal :gal 2011-07-13 13:56:56 PDT
Contexts are becoming near-0-cost or are going away.

Each concurrent thread needs its own compartment (only one thread is allowed to touch a compartment at a time). The compartment cost is mostly the arena pool (2MB increments).

You also need one runtime per thread, which should be near-0-cost as well, or at least not more expensive than the current JSThread, which will be folded into JSRuntime.

So your main worry would be fragmentation of the compartment pool (500*2MB=1GB). If that is indeed a problem, we can probably fix that through smaller pool sizes (we can move the compartment pointer into the arena and essentially eliminate fragmentation, for example).
Comment 8 Mike Moening 2011-07-13 14:03:44 PDT
Who owns the compiled scripts? I don't want to recompile each script for each compartment.  Lots of wasted overhead there.
1GB probably would kill the process due to out of memory conditions.
Comment 9 Luke Wagner [:luke] 2011-07-13 14:10:05 PDT
(In reply to comment #8)
> Who owns the compiled scripts? I don't want to recompile each script for
> each compartment.

Too late, that is already happening without this bug.

Basically, if you are using a SM release from after FF4, then you are either (1) already correctly using SM in a way that will translate naturally to single-threaded-runtime or (2) incorrectly using SM.

Yes, the docs are not up-to-date and need to be fixed.

Also, if you want 200-500 threads, unless you have 200-500 cores, it seems there should be multiplexing of threads+runtimes via some form of thread pool.
Comment 10 Andreas Gal :gal 2011-07-13 14:10:41 PDT
Compiled scripts are owned by the compartment. That would be a problem for you for sure. Do you have 500 threads operating concurrently? Its ok for different threads to touch a compartment, as long its sequential.
Comment 11 Mike Moening 2011-07-13 14:18:57 PDT
To comment #9:
Yes we have a thread pool. Any thread can use any context or compiled script.
As for "multiplexing" its called preemptive multitasking. The OS handles that very nicely for me.

To comment #10:
Would it be possible to update the compartment to allow hoisting out a "compiled script cache/hash"?  Since I believe scripts are stateless.
Comment 12 Andreas Gal :gal 2011-07-13 14:24:00 PDT
If you have 500 concurrent threads active, you are already blowing globs of memory on stacks.
Comment 13 Luke Wagner [:luke] 2011-07-13 15:09:12 PDT
(In reply to comment #11)
> Since I believe scripts are stateless.

Scripts are stateful (jits mutate them).
Comment 14 Mike Moening 2011-07-14 08:22:15 PDT
Luke,

Can you explain what you mean in detail about the jits mutating the scripts?
Thanks
Comment 15 Luke Wagner [:luke] 2011-07-14 10:20:39 PDT
(In reply to comment #14)
At runtime, the mjit reads and writes the JSScript fields: jitArityCheckNormal, jitArityCheckCtor, jitNormal, and jitCtor.
Comment 16 Luke Wagner [:luke] 2011-07-14 10:21:12 PDT
(In reply to comment #15)
Also I believe type inference has its own mutable set of attached data as well.
Comment 17 Mike Moening 2011-07-14 10:24:57 PDT
Ok so script is entrained with its compartment. Can I clone a compiled script from one compartment to the other to save on the compile hit?
Comment 18 Luke Wagner [:luke] 2011-07-14 10:38:40 PDT
Yes.  If any value (script or other) is moved between compartments in the same runtime, you need to be calling JS_WrapValue anyway.  This function takes care of cloning script objects.
Comment 19 Mike Moening 2011-07-14 10:42:15 PDT
If you make the runtime single threaded how will this work?  I'm want to cloning the script so that each new thread doesn't have to go through the work to re-compiling scripts. With this bug each runtime will require a different thread.
Comment 20 Luke Wagner [:luke] 2011-07-14 10:52:34 PDT
(In reply to comment #19)
Under the hood, JS_WrapValue implements the script cloning by using XDR to efficiently serialize then deserialize the script.  XDR is part of the public API (jsxdrapi.h), so you could use that.  Unfortunately, despite being quite old, noone seems to have added MDC documentation for XDR, so you'd have to read some SM code to see how it works (js_CloneScript, in particular).

However, if you your thread pool reuses runtimes (that is, the pool caches (thread,runtime) pairs), then a single runtime should be reused by many requests and script compilation time should be amortized.  Thus, as Andreas said above, I would just do the simple thing until profiling data shows that it matters.  If it reaches that point, XDR will work without too much fuss.
Comment 21 Mike Moening 2011-07-14 12:07:07 PDT
Luke, 
Do you think XDR serialization is more efficient the compilation?
Comment 22 Luke Wagner [:luke] 2011-07-14 12:14:52 PDT
(In reply to comment #21)
> Do you think XDR serialization is more efficient the compilation?

Yes, that's its purpose.  It is used in the Firefox "fast load cache" so that we don't recompile all browser/addon JS on startup.
Comment 23 Luke Wagner [:luke] 2011-07-28 16:05:14 PDT
Created attachment 549259 [details] [diff] [review]
add asserts to JS and XPConnect

It looks like this is getting close to landing.  The official plan is just to land this patch (which only asserts single-threaded-ness (albeit in release builds)) without any of the patches that actually take advantage of single-threaded-ness.  That way, if anything explodes and we can't fix it, we can trivially turn off the release-build asserts and everything works again.

Once the patch is on beta/release and nothing has exploded, the time of great reaping can begin.  I already have one patch to completely remove JSThread and JSThreadData.  There is much more to rip out after that.  Thus, this bug is simply for landing and tracking the assert patch.

The patch is actually pretty small and mostly deals with shell tests and js workers, which jorendorff wrote.
Comment 24 Jason Orendorff [:jorendorff] 2011-07-29 10:29:49 PDT
Comment on attachment 549259 [details] [diff] [review]
add asserts to JS and XPConnect

r=me with a few comments and questions.

In jscntxt.h:
>+    /* See comment for JS_AbortIfWrongThread in jsapi.h. */
>+#ifdef JS_THREADSAFE
>+    void clearOwnerThread();
>+    void setOwnerThread();
>+    bool onOwnerThread() const;
>+  private:
>+    void                *ownerThread_;
>+  public:
>+#else
>+    bool onOwnerThread() const { return true; }
>+#endif
>+
>     /* Context create/destroy callback. */
>     JSContextCallback   cxCallback;

When there are cpp "branches" in a classdef, I think maybe it's easier to
read, and more obviously correct, if public: or private: appears at the
beginning of each "basic block", particularly after the join point:

    #ifdef JS_THREADSAFE
      public:
        void clearOwnerThread();
        void setOwnerThread();
        bool onOwnerThread() const;
      private:
        void *ownerThread_;
    #else
      public:
        bool onOwnerThread() const { return true; }
    #endif

      public:
        /* Context create/destroy callback. */
        JSContextCallback   cxCallback;

Just a thought; feel free to ignore.

In shell/js.cpp:
>-typedef struct ScatterThreadData ScatterThreadData;

LOL

In shell/jsworkers.cpp:
>+struct AutoSetRuntimeThread
>+{
>+    JSRuntime *runtime;
>+
>+    AutoSetRuntimeThread(JSRuntime *runtime) : runtime(runtime) {
>+        JS_SetRuntimeThread(runtime);
>+    }
>+
>+    ~AutoSetRuntimeThread() {
>+        JS_ClearRuntimeThread(runtime);
>+    }
>+};

Maybe this building block belongs in jsapi.h.

>+struct AutoAssociateWithThread : AutoSetRuntimeThread,
>+                                 JSAutoRequest

The name here isn't clear. For the two places it's used, it seems as good
to declare two local variables.

In js/src/tests/js1_8_5/extensions/worker-fib.js:
>-    w.postMessage("10\t" + workerDir); // 0 1 1 2 3 5 8 13 21 34 55
>+    w.postMessage("8\t" + workerDir); // 0 1 1 2 3 5 8 13 21

Good change, but can you elaborate on what happened here? Are runtimes
not yet cheap enough to do this?

In js/src/tests/tests.py:
>-        resource.setrlimit(resource.RLIMIT_AS, (1*GB, 1*GB))
>+        resource.setrlimit(resource.RLIMIT_AS, (2*GB, 2*GB))

Same here.

In xpconnect/loader/mozJSSubScriptLoader.cpp, in mozJSSubScriptLoader::LoadSubScript:
>     JSContext *cx;
>     rv = cc->GetJSContext (&cx);
>     if (NS_FAILED(rv)) return NS_ERROR_FAILURE;
> 
>+    JS_AbortIfWrongThread(JS_GetRuntime(cx));

Why add this here rather than in the GetJSContext method?
Comment 25 Luke Wagner [:luke] 2011-07-29 11:42:29 PDT
Good points, all applied.

As for the worker-fib.js and rlimit changes: I was getting OOMs from mmap() for worker-fib.  The core dumps showed only about ~800MB mapped which is about (1+1+2+3+5+8+13+21+34+55) * 6MB (2MB for GC, 4MB for eagerly-committed VM stack).  (Note: browser workers max out at 20 threads and bug 656941 should lazily commit stacks for Linux/Mac, so realistic max usage should be 20 * (2+.1) MB.)
Comment 26 Luke Wagner [:luke] 2011-08-11 15:28:46 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/53dd7a930bae

Again: this is only a easily-backout-able release-mode assert; now we wait and see.
Comment 27 Luke Wagner [:luke] 2011-08-12 00:22:48 PDT
The backout did not fix the regression; assuming I'm reading the graphs right (boy howdy graphs-new.mozilla.org is nicer than than the old one!) I'll reland tomorrow.
Comment 28 Luke Wagner [:luke] 2011-08-12 00:23:16 PDT
Ignore comment 27; wrong bug.
Comment 29 Luke Wagner [:luke] 2011-08-12 00:50:30 PDT
*** Bug 547712 has been marked as a duplicate of this bug. ***
Comment 30 Matt Brubeck (:mbrubeck) 2011-08-12 07:58:19 PDT
https://hg.mozilla.org/mozilla-central/rev/53dd7a930bae
Comment 31 Luke Wagner [:luke] 2011-08-17 09:11:59 PDT
Backed out:
http://hg.mozilla.org/integration/mozilla-inbound/rev/5f0596a0b81e

Looks like a few more patches to security/manager/ssl need to land.  See bug 674571 comment 11 for details.
Comment 32 Jeff Walden [:Waldo] (remove +bmo to email) 2011-08-17 09:53:27 PDT
Based on timing I assume this made the aurora branch.  Probably back out there too?
Comment 33 Luke Wagner [:luke] 2011-08-17 10:52:42 PDT
Already requested approval in bug 678440.
Comment 35 Ed Morley [:emorley] 2011-11-07 17:25:11 PST
https://hg.mozilla.org/mozilla-central/rev/31d13586087f

dev-doc-needed?
Comment 36 Luke Wagner [:luke] 2011-11-07 19:08:02 PST
Yes, thanks.
Comment 37 Ben Turner (not reading bugmail, use the needinfo flag!) 2011-11-22 22:17:40 PST
(In reply to Luke Wagner [:luke] from comment #25)
> (Note: browser workers max out at 20 threads

Well... 20 threads *per origin*.

Note You need to log in before you can comment on or make changes to this bug.