Make JS-implemented WebIDL faster

RESOLVED WONTFIX

Status

()

Core
DOM
P3
normal
RESOLVED WONTFIX
4 years ago
26 days ago

People

(Reporter: bz, Unassigned)

Tracking

(Depends on: 2 bugs)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 1 obsolete attachment)

Created attachment 8432627 [details] [diff] [review]
Testing patch

I was asked to measure the performance of JS-implemented WebIDL.  With the attached
Created attachment 8432636 [details]
Testcase

Argh.  

With the attached patch and testcase, I see numbers like so:

Pre void(void): 9
Pre void(arg): 12
Pre string(void): 41
Contact void(void): 529
Contact void(arg): 719
Contact string(void): 602

This is clearly suboptimal.

I did some profiling, and the extra cost breaks down approximately as follows:

36% CallSetup constructor, breaking down as:
  12% the AutoEntryScript ctor doing a cx push
   5% nsScriptSecurityManager::ScriptAllowed
   5% GetNativeForGlobal
   3% WindowGlobalOrNull
   2% NS_IsMainThread
   2% xpc::WindowOrNull
   2% addrefing principals
   2% JSAutoCompartment ctor
   2% Getting the subject principal
20% The actual call into chrome JS via JS::Call, with this breakdown:
   6% self time in js::Invoke
   3% SPS entry marker ctors/dtors
   2% js::jit::CanEnterBaselineMethod
   3% in jit::EnterBaselineMethod but outside EnterBaseline
   2% JitActivation ctors/dtors
   3% running the actual JS
19% GetCallableProperty.  This is mostly under JS_GetProperty, which is spending half its
    time atomizing and the other half doing the actual baseops::GetProperty, complete with
    nativeLookup, etc.
17% CallSetup destructor, breaking down as:
    7% ~AutoCxPusher
    3% ~AutoEntryScript (releasing principals, etc).  
    2% JSAutoCompartment ctor (why???)
    2% JS_SaveFrameChain
    1% JSAutoCompartment dtor
    1% JS_RestoreFrameChain
Attachment #8432627 - Attachment description: test.patch → Testing patch
Bobby, how close are we to being able to get rid of the cx push/pop here?  Can we just elide the ScriptAllowed check in the aIsJSImplentedWebIDL case?  Can we make AutoEntryScript hold a weak ref to the principal?  Can we just skip the "WindowGlobalOrNull" bit for JS-implemented WebIDL, since afaict that will always report null and just fall back to GetNativeForGlobal immediately?
Depends on: 1019091
Flags: needinfo?(bobbyholley)
> So 20% is cx pushing alone?

Yes.

Kyle, we have some sort of per-thread place to hang jsids off that we could use in GetCallableProperty, right?
Flags: needinfo?(khuey)
Jason, is there anything we could do to make the JS::Call bit faster?
Flags: needinfo?(jorendorff)
Yes, we have a generated PerThreadAtomCache struct we can stick things in.  http://hg.mozilla.org/mozilla-central/annotate/6a984e21c2ca/dom/bindings/Codegen.py#l13200
Flags: needinfo?(khuey)
(In reply to Boris Zbarsky [:bz] from comment #3)
> Bobby, how close are we to being able to get rid of the cx push/pop here?

We're getting there, but probably months. First, we need to finish converting all of the existing cx pushes into AutoJSAPI (or derived). Then we need to switch all consumers that inspect the JSContext stack to inspecting the Script Setting Stack. Then we can stop pushing.

> Can we just elide the ScriptAllowed check in the aIsJSImplentedWebIDL case?

That sounds fine to me.

> Can we make AutoEntryScript hold a weak ref to the principal?

I _think_ so. We just have to be sure that they won't go away in stack scope.

> Can we just
> skip the "WindowGlobalOrNull" bit for JS-implemented WebIDL, since afaict
> that will always report null and just fall back to GetNativeForGlobal
> immediately?

Yes.
Flags: needinfo?(bobbyholley)
> I _think_ so. We just have to be sure that they won't go away in stack scope.

It's the principal we got from nsContentUtils::SubjectPrincipal(), so I'd think it can't die, right?
Flags: needinfo?(bobbyholley)
(In reply to Boris Zbarsky [:bz] from comment #8)
> > I _think_ so. We just have to be sure that they won't go away in stack scope.
> 
> It's the principal we got from nsContentUtils::SubjectPrincipal(), so I'd
> think it can't die, right?

In theory if we entered compartment A with an nsNullPrincipal, invoked nsContentUtils::SubjectPrincipal(), then immediately entered compartment B, and then later GCed, we could collect B and release the nsNullPrincipal. This seems unlikely, but also kind of dicey.
Flags: needinfo?(bobbyholley)
Streamlining js::Invoke looks hard. It's very cluttered in there--that is, there's no one thing taking much of the time. No low-hanging fruit. You could avoid, say, memcpying the arguments. You can eliminate a branch here and there. Maybe SPSProfiler::push could be streamlined. I doubt it would amount to much though.

(In reply to Boris Zbarsky [:bz] from comment #1)
> 19% GetCallableProperty.  This is mostly under JS_GetProperty, which is
> spending half its
>     time atomizing and the other half doing the actual baseops::GetProperty,
> complete with
>     nativeLookup, etc.

Hmm. The atom could be cached.

> 17% CallSetup destructor, breaking down as:
>     7% ~AutoCxPusher
>     3% ~AutoEntryScript (releasing principals, etc).  
>     2% JSAutoCompartment ctor (why???)
>     2% JS_SaveFrameChain
>     1% JSAutoCompartment dtor
>     1% JS_RestoreFrameChain

Seems like a bug in ~CallSetup(). We must be running some of that code about dealing with exceptions, even though nothing is throwing in this case.
> Hmm. The atom could be cached.

Yes, that's bug 1019160.

> We must be running some of that code about dealing with exceptions

Indeed, covered by bug 1019091.
The next obvious step here is to create a C++ inline cache class to eliminate the whole GetCallableProperty lookup, which is total nonsense. It could even use a custom (perhaps friend-api) JS::Call signature that lets you skip the memcpy and some of the weirder checks. 

I wish we could work on making fewer calls from C++ to JS rather than optimizing them...
Flags: needinfo?(jorendorff)
Fwiw, on trunk now the numbers look like this:

Contact void(void): 393
Contact void(arg): 565
Contact string(void): 451

So we've cut out 20-25% of the time...

But yes, the GetCallableProperty thing kinda sucks.  It's the rare case on the web, but the common case for JS-impl webidl methods....
Blocks: 1103446

Updated

3 years ago
No longer blocks: 1103446
The patches in bug 1275999 make the testcase here about 10% faster.  The patches in bug 1276276 will shave off another 5%...
Attachment #8432627 - Attachment is obsolete: true

Updated

26 days ago
Priority: -- → P3
It sounded like the consensus was to deprecate JS implemented WebIDL, so I think we're certainly not going to work on making it faster.
Status: NEW → RESOLVED
Last Resolved: 26 days ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.