Closed Bug 897655 Opened 6 years ago Closed 6 years ago

Use off thread JS parsing when loading scripts from XUL documents


(Core :: XUL, defect)

Not set





(Reporter: bhackett, Assigned: bhackett)




(8 files, 1 obsolete file)

Attached patch patch (868ce514bba7) (obsolete) — Splinter Review
With bug 875125 JS can now be parsed off the main thread, and since XUL documents already read in scripts asynchronously it would be good if they parsed the scripts asynchronously once reading is done.  This both reduces main thread time on cold startups and helps with potentially removing the need for XDR'ed bytecode (instead always reading in scripts directly, as source is more compact than XDR).

A cold startup on my Mac spends about 200ms parsing/emitting scripts.  About 1/3 of this parsing is triggered by scripts loaded from XUL documents.  The attached patch changes these parses to happen off thread.  Unfortunately, the 70ms of parsing that used to be on the main thread now takes about 310ms off thread.  220ms of the overhead, almost everything, is due to contention on the lock protecting the atoms table, and almost all this contention is because on Macs the browser triggers two parses for several rather large scripts (including browser.js) at almost the same time.

There are several options for dealing with this contention:

- Ignore it.  Most of the contention is happening off the main thread and the delay in getting scripts parsed may not affect how long the browser takes to start.  Also, the problem with parsing the same scripts multiple times doesn't affect non-Mac platforms (see bug 392650).

- Only allow one helper thread at a time (plus the main thread) to parse scripts.

- Modify the JS parser and VM so that locking is not required when using the atoms table.  This would require using a fake thread local atoms table when parsing off thread and fixing up atom pointers and state when merging back onto the main thread.

I need to do more measurements to see which of these approaches is best, but posting the complete patch as is for now.
The second option above seems reasonable.  When only one helper thread, and possibly the main thread, are allowed to parse there is much less contention for the exclusive access lock and the total amount of time blocked goes from 220ms to 17ms.
Attachment #780571 - Attachment is obsolete: true
Attached patch boring partsSplinter Review
Splitting the above patch up for review.  This has some boring changes allowing flattening and concatenation on ExclusiveContext threads (which own their zone, so this is fine) and a couple bugfixes.
Assignee: nobody → bhackett1024
Attachment #780668 - Flags: review?(wmccloskey)
Some more boring stuff to add new rooting APIs and relax JS_AddRoot* roots so that they can hold null pointers.  This means that while the main thread still needs to add/remove roots, they can be changed by helper threads (so long as those threads are paused during GC).
Attachment #780670 - Flags: review?(wmccloskey)
Remove some vestigial parser code that is unnecessary after CPG.
Attachment #780673 - Flags: review?(luke)
Attached patch XUL changesSplinter Review
Changes to XUL interfaces so that XUL documents can parse their scripts off thread after the source has fully loaded.
Attachment #780676 - Flags: review?(bzbarsky)
Attached patch JS changesSplinter Review
Changes to off thread parsing to support the new API for triggering off thread parses and merging the finished scripts into the target compartment/zone after parsing finishes.
Attachment #780678 - Flags: review?(wmccloskey)
Comment on attachment 780673 [details] [diff] [review]
remove vestigial code

my favorite kind of review
Attachment #780673 - Flags: review?(luke) → review+
I'm sorry for the lag here.  I should get to this tomorrow....
I was looking at the other JS patches and thinking that it would be useful to JS_ASSERT(parseFinishedList.empty()) when we can to ensure that we aren't leaking ParseTasks.  A first place to assert would be ~JSRuntime (since otherwise we'll leak them).  Other than that, the JS engine doesn't have a good pinch point, but Gecko should know when there are no outstanding scripts compiling and be able to call some JS_AssertNoPendingScriptCompilations.  (I see nsIOffThreadScriptReceiver isn't implemented yet, so this is a future request.)
(In reply to Luke Wagner [:luke] from comment #9)
> I was looking at the other JS patches and thinking that it would be useful
> to JS_ASSERT(parseFinishedList.empty()) when we can to ensure that we aren't
> leaking ParseTasks.  A first place to assert would be ~JSRuntime (since
> otherwise we'll leak them).  Other than that, the JS engine doesn't have a
> good pinch point, but Gecko should know when there are no outstanding
> scripts compiling and be able to call some
> JS_AssertNoPendingScriptCompilations.  (I see nsIOffThreadScriptReceiver
> isn't implemented yet, so this is a future request.)

Actually, XULDocument is an nsIOffThreadScriptReceiver.  I'm not sure where a good place would be to assert this in gecko since the script loading is already done asynchronously so we'd basically end up with some counter somewhere that needs to be updated in a threadsafe manner and would be updated at similar points to where parseFinishedList changes, which doesn't seem to add much.
(In reply to Brian Hackett (:bhackett) from comment #10)
> Actually, XULDocument is an nsIOffThreadScriptReceiver.

Oops, I missed that.

I agree just adding some on-the-side counter won't add much.  bz: by any chance do you know of any existing conditions in Gecko where we know there are no outstanding documents being loaded?
Comment on attachment 780676 [details] [diff] [review]
XUL changes

>+++ b/content/xul/content/src/nsXULElement.cpp
>+                              nsIOffThreadScriptReceiver *aOffThreadReceiver /* = NULL */)

nullptr, please.

>+++ b/content/xul/content/src/nsXULElement.h
>+                     nsIOffThreadScriptReceiver *aOffThreadReceiver = NULL);

And here.

>+++ b/content/xul/document/src/XULDocument.cpp
>+                // We will be notified via OnOffThreadCompileComplete when the
>+                // compile finishes. Keep the contents of the compiled script
>+                // alive until the compilation finishes.
>+                mOffThreadCompileString = stringStr;

This isn't actually safe: depending on exactly how stringStr was filled with data this assignment may copy.

It's probably better to assert that mOffThreadCompileString is empty, then assign it before the Compile() call, and pass its .get() and .Length().  Then if the Compile call returns an error, Truncate() it.

>+XULDocument::OnScriptCompileComplete(JSScript *aScript, nsresult aStatus)

The first arg should probably be a JS::Handle<JSScript*>.  Or is it a non-handle on purpose?  If so, it's worth documenting that and why that is...

>+    mOffThreadCompileString = nullptr;


Or if you want the "null string" behavior then SetIsVoid(), I guess, but just making it empty should be fine.

>+    // Clear mCurrentScriptProto now, but save it first for use below in
>+    // the compile/execute code

There is no compile code below, just execute.

You should probably add nsIOffThreadScriptReceiver to the QueryInterface implementation for XULDocument...  See more on this below.

>+++ b/dom/base/nsIScriptContext.h

Please give the new interface an IID, and maybe put it and its IID after nsIScriptContext?  You can forward-declare it for use in CompileScript.

>+                                 nsIOffThreadScriptReceiver* aOffThreadReceiver = NULL,


>+++ b/dom/base/nsJSEnvironment.cpp
>+  nsRefPtr<nsIOffThreadScriptReceiver> mReceiver;

nsCOMPtr, please.

>+  JS::FinishOffThreadScript(nsJSRuntime::sRuntime, mScript);
>+  return mReceiver->OnScriptCompileComplete(mScript, mScript ? NS_OK : NS_ERROR_FAILURE);

The first call effectively unroots mScript, according to the comments where mScript is defined.  What guarantees that it won't die during the middle of the second call?  I guess the callee is expected to immediately root it?  If so, that's worth documenting on the nsIOffThreadScriptReceiver interface.

>+                           nsIOffThreadScriptReceiver* aOffThreadReceiver /* = NULL */,


>+    aOffThreadReceiver->AddRef();


>+++ b/dom/base/nsJSEnvironment.h
>+                                 nsIOffThreadScriptReceiver* aOffThreadReceiver = NULL,


r=me with the above nits fixed.
Attachment #780676 - Flags: review?(bzbarsky) → review+
Comment on attachment 780668 [details] [diff] [review]
boring parts

Review of attachment 780668 [details] [diff] [review]:

::: js/src/jsatom.cpp
@@ -212,5 @@
>      /* We treat static strings as interned because they're never collected. */
>      if (StaticStrings::isStatic(atom))
>          return true;
> -    AtomSet::Ptr p = cx->runtime()->atoms.lookup(atom);

Oops. Can we make this private or something so that you have to access it through the ExclusiveContext getter?

::: js/src/vm/String.cpp
@@ +400,5 @@
>  }
>  template <AllowGC allowGC>
>  JSString *
> +js::ConcatStrings(ExclusiveContext *cx,

Shu is already making this a ThreadSafeContext. I think the right thing to do is to change ScopedThreadSafeStringInspector::ensureChars so that it calls ensureLinear even for ExclusiveContexts. Right now it will copy the chars out, which is probably less efficient.
Attachment #780668 - Flags: review?(wmccloskey) → review+
Depends on: 898886
Comment on attachment 780678 [details] [diff] [review]
JS changes

Review of attachment 780678 [details] [diff] [review]:

This looks nice. Sorry the review took so long.

::: js/src/jsapi.h
@@ +4011,5 @@
> +                 const jschar *chars, size_t length,
> +                 OffThreadCompileCallback callback, void *callbackData);
> +
> +extern JS_PUBLIC_API(void)
> +FinishOffThreadScript(JSRuntime *rt, JSScript *script);

Some comments about how to use these API functions would be nice, especially for FinishOffThreadScript. Please especially say which thread it all happens on.

::: js/src/jsgc.cpp
@@ +4764,5 @@
>      return compartment.forget();
>  }
>  void
> +gc::MergeCompartments(JSCompartment *source, JSCompartment *target)

I think we should set source->global_ to NULL.
I'd also like to assert that source->crossCompartmentWrappers is empty.
The same for source->regExps.inUse_.
Also, I think we should clear out all the tables that get swept in JSCompartment::sweep. That function will still be called by the GC for the worker's compartment.

@@ +4783,5 @@
> +        JS_ASSERT(base->compartment() == source);
> +        base->compartment_ = target;
> +    }
> +
> +    // Fixup zone pointers in source's zone to refer to target's zone.

It might be nice to assert that we're not messing up any other compartments. Maybe something like this:

  for (CompartmentsInZoneIter c(source->zone()); !c.done();
    JS_ASSERT(c.get() == source);

::: js/src/jsobj.cpp
@@ +5145,5 @@
>      return true;
>  }
>  JSObject *
> +js::GetClassPrototypePure(GlobalObject *global, JSProtoKey protoKey)

Can you rewrite the beginning of js_GetClassPrototype to use this code?

::: js/src/jsworkers.cpp
@@ +184,5 @@
>      if (options.principals)
>          JS_HoldPrincipals(options.principals);
>      if (options.originPrincipals)
>          JS_HoldPrincipals(options.originPrincipals);
> +    if (!JS_AddNamedObjectRootRT(rt, &this->scopeChain, "ParseTask::scopeChain"))

Rather than adding new API functions, I think we should use calls to js:: functions.

Alternatively, given the problems with CompileOptions in the heap, it might be easier to put code in MarkRuntime that iterates over all the ParseTasks and marks their state. That way we could mark the data in CompileOptions as well.

@@ +186,5 @@
>      if (options.originPrincipals)
>          JS_HoldPrincipals(options.originPrincipals);
> +    if (!JS_AddNamedObjectRootRT(rt, &this->scopeChain, "ParseTask::scopeChain"))
> +        MOZ_CRASH();
> +    if (!JS_AddNamedScriptRootRT(rt, &script, "ParseTask::script"))

If I'm not mistaken, we never actually need to root |script|. It will be allocated in the worker zone, and once it's transferred over, it should be rooted on the stack. I think it might be better not to root it so that we avoid the complexity of possibly-null roots, as well as any fears this might inspire about pointers from the runtime to worker thread objects. However, I'll leave it up to you what to do.

@@ +226,5 @@
>          return false;
>      // For now, type inference is always disabled in exclusive zones.
>      // This restriction would be fairly easy to lift.
> +    JS_ASSERT(!cx->typeInferenceEnabled());

I guess this is because, when we merge the thread zone back, we would need TI data if the main thread zone has TI enabled? That seems like a pretty important restriction on this interface. Can you note it in the comment?

@@ +466,5 @@
> +{
> +    JS_ASSERT(isLocked());
> +    if (parseWorklist.empty())
> +        return false;
> +    for (size_t i = 0; i < numThreads; i++) {

Please make some comment here about not allowing simultaneous parses to prevent contention on the atoms table.

@@ +484,5 @@
> +        AutoLockWorkerThreadState lock(rt);
> +        for (size_t i = 0; i < parseFinishedList.length(); i++) {
> +            if (parseFinishedList[i]->script == script) {
> +                parseTask = parseFinishedList[i];
> +                if (i + 1 == parseFinishedList.length())

It might be easier to do this as:
  parseFinishedList[i] = parseFinishedList.back();

@@ +511,5 @@
> +
> +    // Point the prototypes of any objects in the script's compartment to refer
> +    // to the corresponding prototype in the new compartment. This will briefly
> +    // create cross compartment pointers, which will be fixed by the
> +    // MergeCompartments call below.

As far as I can tell, the parent chain isn't getting fixed up. I'm pretty sure that a CPG invariant is that every object's parent chain end at the compartment's global. We assert this in JSObject::global.

::: js/src/jsworkers.h
@@ +329,1 @@
>      JSScript *script;

Can you put comments on these fields (scopeChain and script) about when they're used and whether they point into the worker zone.

::: js/src/vm/Shape.h
@@ +230,5 @@
>  class UnownedBaseShape;
>  struct StackBaseShape;
> +namespace gc {
> +    void MergeCompartments(JSCompartment *source, JSCompartment *target);

Usually we wouldn't indent this.
Attachment #780678 - Flags: review?(wmccloskey) → review+
Comment on attachment 780670 [details] [diff] [review]
boring rooting changes

Review of attachment 780670 [details] [diff] [review]:

::: js/src/jsapi.cpp
@@ +2255,5 @@
>      return AddObjectRoot(cx, rp, name);
>  }
> +JS_AddNamedObjectRootRT(JSRuntime *rt, JSObject **rp, const char *name)

As I said, I don't think we should add these to the API if we don't need to.
Attachment #780670 - Flags: review?(wmccloskey) → review+
Fix a bug where the 'load' event could fire on a XUL document after the script has been read in but before it has been parsed or executed.
Attachment #785771 - Flags: review?(bzbarsky)
Comment on attachment 785771 [details] [diff] [review]
block load events during parsing

Attachment #785771 - Flags: review?(bzbarsky) → review+
Blocks: 902506
Off thread parsing makes a lot of OS X tests more or less permaorange.  This isn't a bug introduced by off thread parsing --- bug 896054 covers the identical intermittent orange here and bug 887868 is another patch unrelated to graphics code which made these tests permaorange.  There doesn't seem to be an obvious fix so rather than just not land this patch I tracked down the off thread compilation which was triggering the orange and think it should just be blacklisted.
Attachment #789120 - Flags: review?(wmccloskey)
If we're convinced it's a graphics bug, can we instead disable the test with a bug to reenable?  Putting blacklisting code in the JS engine seems unnecessarily gnarly and if this is an easily tickle-able bugs, it'll just cause troubles for someone else later.
Given that this is a non-fatal assertion, I think we can just annotate these assertions as "expected". You can do that in the reftests manifest like so:
and the mochitest as follows:

Also, setting needinfo for Matt in case there's an easy fix.
Flags: needinfo?(matt.woodrow)
I don't know if it's an easy fix, I haven't been able to reproduce this before.

Can it be reproduced locally with these patches?

I'd be ok with disabling the test now, and filing a new bug (and assigning it to me).
Flags: needinfo?(matt.woodrow)
We can't disable "the test" or mark the assertions as expected, because the assertion is triggered irregularly during the test session, and so causes failures in many different mochitests sporadically.

I can reproduce the assertion firing locally on my 10.7 machine during a normal browser session, but Steven wasn't able to on his 10.8 machine.
Comment on attachment 789120 [details] [diff] [review]
blacklist for os x orange

Review of attachment 789120 [details] [diff] [review]:

OK, I guess if there are a lot of tests that fail.

Brian, can you give Matt instructions on how to reproduce this? It would be nice to get this fixed the right way.
Attachment #789120 - Flags: review?(wmccloskey) → review+
Matt, if you comment out the #ifdef XP_MACOSX stuff in js/src/jsapi.cpp, the assertion failures should reproduce on try.  I couldn't get them locally on my 10.8 machine.
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
Blocks: 906060
Could you explain ?
Flags: needinfo?(bhackett1024)
(In reply to :Ms2ger from comment #26)
> Could you explain
> ?

Umm, incompetence?  I disabled those tests while I was in the process of narrowing down the issue in comment 18 and forgot to undisable them.  The following push restores the tests, and thankfully doesn't restore the permaorange.
Flags: needinfo?(bhackett1024)
Depends on: 906331
Blocks: 906371
Depends on: 906372
Depends on: 907777
No longer depends on: 906591
Blocks: 913519
No longer blocks: 913519
Depends on: 913519
Depends on: 931747
Depends on: 994335
Depends on: 1000598
Depends on: 1009624
You need to log in before you can comment on or make changes to this bug.