Open Bug 1148678 Opened 9 years ago Updated 2 years ago

Script loaded with mozIJSSubScriptLoader.loadSubScript() much slower than module loaded with Components.utils.import()

Categories

(Core :: JavaScript Engine: JIT, defect, P5)

x86_64
Windows 7
defect

Tracking

()

People

(Reporter: Eduard.Braun2, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

3.65 KB, application/x-zip-compressed
Details
Attached file Example add-on
I wanted to know if there is any performance difference when using
  a) a script via mozIJSSubScriptLoader.loadSubScript()
  b) a module via Components.utils.import()
in a bootstrapped add-on.

Turns out there is!

See the attached example extension which tests both methods side-by-side:
- Some identical benchmarking code is executed at least 5 times slower
  in the script compared to the module version
- First run of both test methods is considerably faster than subsequent runs.
- First run of module version is (in nightly) so fast that I assume
  a potential error in measuring the times.

Let's see if the experts can figure out what's going on (or if there is simply a problem with my code).
Obvious difference is that the module is executed in its own sandbox while the script is executed in the scope of bootstrap.js.
> I wanted to know if there is any performance difference

In general, there is.  In particular, running the script via loadSubScript against some random object instead of the global means running against a weird scope chain, and the JS engine doesn't perform various optimizations in those cases.  Most importantly, accessing global variables in loadSubScript against a random object will be much slower than accessing them in a JS module.

In this case the only global variable being accessed is Math.  If I change the script to grab Math once up front into a var local to the function and then use that var throughout, things become equally fast in the subsript and the JSM.

Similarly, if I loadSubScript into the global instead of some random object, the performance of the subscript becomes the same as that of the JSM.

> - First run of module version is (in nightly) so fast that I assume
>  a potential error in measuring the times.

Well, there's an obvious bug in the test: the inner loop uses "i" for the loop condition and increment, not "j", across the board.  So for each of the three subtests only 1e6 iterations are performed of the inner loop body.

But it gets worse: in the module case we know we're working with the Math object (because we know we found Math on the global, etc), and we know how Math functions behave, and the inner loop bodies have no side-effects apart from assigning to z, but the value of 'z' never gets used.  So all that code is dead and gets eliminated by the JIT.  The result is that the testcase is measuring how long it takes to count from 1 to 1e6 three times, checking whether you've reached 1e6 after each increment.  Over here, I get 4ms for that, which seems like a totally reasonable, amount of time to increment a counter 3e6 times and branch 3e6 times on a modern CPU (about 4 CPU cycles per increment+branch in my case).

If I ignore the loop variable thing for now and just make 'z' not dead by changing the timeEnd call like so:

  console.timeEnd('test2', z);

then I get times more like 40ms than 4ms.  But even this is not really all that reliable, since the assignments to 'z' dominate each other, so a compiler should really be able to prove that they're all dead code except the very last assignment...  IonMonkey doesn't do that yet, as far as I know, but it's being worked on.
Anyway, that's the general overview.

The specific thing that ends up slow here is that we have a name lookup instead of a gname lookup, so we end up with a NameIC instead of whatever fast path gname ends up taking.

But even worse than that, the NameIC never actually generates an IC.  What happens is that we enter NameIC::update with the With for the random object as the scopeChain.  We call js::LookupName which walks up the scope chain and finds it on the global.  It sets both obj and holder to said global.  OK so far.

Then we call IsCacheableNameReadSlot which is all happy with our obj and holder until we get to IsCacheableScopeChain.  js::IsCacheableNonGlobalScope returns false for a With object, so we return false up the callstack.  

One problem, of course, is that "with" will in fact walk up the proto chain of its target object.  So just modifying IsCacheableNonGlobalScope to allow With objects if their target is nice enough and modifying GenerateScopeChainGuards to guard on the shape of the target instead of the shape of the With is not good enough: we would actually need to ensure the entire proto chain of the target is nice enough and guard on the shapes of everything in that proto chain.  Or something.

It's not clear to me how worthwhile this is.
Status: UNCONFIRMED → NEW
Component: JavaScript Engine → JavaScript Engine: JIT
Ever confirmed: true
> In general, there is.  In particular, running the script via loadSubScript
> against some random object instead of the global means running against a
> weird scope chain, and the JS engine doesn't perform various optimizations
> in those cases.  Most importantly, accessing global variables in
> loadSubScript against a random object will be much slower than accessing
> them in a JS module.
I see! That's something I didn't even think about. Is this somewhere documented? I think it should at least be mentioned on MDN (particularly in [1]) to prevent such pitfalls...

> Well, there's an obvious bug in the test: the inner loop uses "i" for the
> loop condition and increment, not "j", across the board.
Ah, stupid error. :-(


Regarding comment #2 - you lost me there... But thanks for the very insightful explanation otherwise. It's pretty clear now how the code ended up so slow and how one could fix it.
Also, while it seems both approaches (subscript/module) are equally fast in the end (or are there other things to consider?) using modules is basically less prone to lesser obvious scoping problems like in this case.


[1] https://developer.mozilla.org/docs/Mozilla/Tech/XPCOM/Reference/Interface/mozIJSSubScriptLoader
Comment #2 was more for the people watching the JIT component so they don't have to redo investigation I've already done.

I'm not aware of any documentation for this so far.  I agree that it's worth documenting on MDN, and if you don't get to it I'll try to at some point...
Blocks: sm-js-perf
Priority: -- → P5
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: