TI: big SunSpider regression with type inference enabled

RESOLVED FIXED

People

(Reporter: dmandelin, Unassigned)

Tracking

Trunk

Details

Attachments

(2 attachments)

(Reporter)

Description

7 years ago
SunSpider 0.9.1 in-browser with type inference enabled, Windows 7:

  Jun 6 m-c nightly:   200.7
  Jun 7 TI  nightly:   388.6
So TI is OK in the shell here but not in the browser?
Just tried SS on my Mac.  TI seems to have thrown off our don't-GC-during-SS heuristics, and now we always get a 109ms GC during bitwise-and.  Discounting that, TI enabled is about 5ms slower than the Firefox 5 beta (I haven't compared to nightly).  I imagine the problem is worse on Windows; I will look closer tomorrow.
Some Win7 numbers I get, nightly vs. inference enabled.

Nightly

Total:                 199.4ms +/- 1.4%
    morph:               5.3ms +/- 6.5%
    bitwise-and:         1.3ms +/- 26.6%
    format-tofte:       14.0ms +/- 0.0%
    partial-sums:        8.5ms +/- 4.4%
    base64:              3.1ms +/- 7.3%
    tagcloud:           14.9ms +/- 1.5%

Inference Enabled

Total:                  349.6ms +/- 0.6%
    morph:               21.3ms +/- 2.3%
    bitwise-and:        125.7ms +/- 0.7%
    format-tofte:        17.1ms +/- 2.4%
    partial-sums:        13.3ms +/- 2.6%
    base64:               8.2ms +/- 3.7%
    tagcloud:            17.3ms +/- 2.0%

The main issue is the GC during bitwise-and.  The other tests shown here are those where numbers are significantly different from the Mac AWFY numbers.  Fixing the issues with these tests should bring us to about the same place as stock TM.
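
To make the breakdown concrete, the following sketch recomputes the per-test slowdown from the two tables above. The numbers are copied from this comment; the variable names (`nightly`, `inference`, `deltas`, `totalGap`) are chosen here for illustration only.

```javascript
// Per-test SunSpider times in ms, copied from the Win7 tables in this comment.
const nightly = { morph: 5.3, "bitwise-and": 1.3, "format-tofte": 14.0,
                  "partial-sums": 8.5, base64: 3.1, tagcloud: 14.9 };
const inference = { morph: 21.3, "bitwise-and": 125.7, "format-tofte": 17.1,
                    "partial-sums": 13.3, base64: 8.2, tagcloud: 17.3 };

// Slowdown per test (inference enabled minus nightly), largest first.
const deltas = Object.keys(nightly)
  .map((name) => ({ name, delta: inference[name] - nightly[name] }))
  .sort((a, b) => b.delta - a.delta);

// Overall gap between the two "Total" lines.
const totalGap = 349.6 - 199.4;

for (const { name, delta } of deltas) {
  console.log(name.padEnd(14), delta.toFixed(1) + "ms");
}
console.log("total gap".padEnd(14), totalGap.toFixed(1) + "ms");
```

On these figures bitwise-and alone accounts for roughly 124ms of the roughly 150ms total gap. The six listed slowdowns sum to slightly more than the total gap, which suggests the unlisted tests are marginally faster on net, though that difference is within the reported error margins.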
(Reporter)

Comment 4

7 years ago
I currently see this on TM.
Created attachment 552551 [details]
GCTimer output

There is a very long (333ms) GC in there that is caused by a JS_API call. It happens exactly 4 sec after the previous GC so I suspect the 4sec trigger in the CC.
The long GC spends 250ms in the END callback. That might be something to look at as well.
The GCs (marking) take longer and longer. Something may be leaking as well.
Mmmm, sorry for reflexively blaming the GC (I remembered the problems we had here before with GC timing in SS).  The problem here is specific to TI: with TI enabled we never compile uncached eval scripts, which have a lifetime managed by API clients and can be destroyed at times other than on GC.  This plays havoc with the assumption inference makes that scripts are always safe to access from type constraints, so uncached eval scripts are not analyzed.  Top-level <script> tags in the browser are uncached eval scripts, unlike the global scripts in the shell; bitops-bitwise-and is basically just a long-running loop in global code, so it never leaves the interpreter.
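
For readers unfamiliar with the test, here is a rough sketch of what "a long-running loop in global code" looks like. It is an approximation written for this comment, not the benchmark source; the name `acc`, the starting value, and the iteration count are assumptions.

```javascript
// Hypothetical stand-in for the shape of bitops-bitwise-and: every statement
// runs at the top level of the script, with no enclosing function.
var acc = 4294967296;            // hypothetical name and starting value (2^32)
for (var i = 0; i < 600000; i++) {
  acc = acc & i;                 // each iteration reads and writes top-level names
}
console.log(acc);
```

Because there is no function to call into, the whole run time is spent in the one top-level script. If that script is never compiled, the loop stays in the interpreter for its entire duration. (In a module-based runtime such as Node these `var`s are module-scoped rather than true globals; in a browser <script> tag they would be properties of the global object.)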

Bug 674251 should remove this restriction: once all scripts have lifetimes managed by the GC, it will be straightforward to analyze and compile uncached eval scripts.
Depends on: 674251

Updated

7 years ago
Depends on: 678830
Created attachment 554110 [details] [diff] [review]
patch

Unblock on bug 674251.  This has EvaluateUCScriptForPrincipalsCommon and EvaluateInScope give script objects to the results and let the GC destroy them instead of doing so immediately after execution.  This allows us to analyze/compile such scripts and removes a fair bit of analysis cruft that was in place to account for them.

This also fixes a perf bug in GNAMEINC ops where we emitted a SETPROP in the decomposed version instead of a SETGNAME.  This was compiled as a normal SETPROP IC (which is safe to do here), but in the browser (not the shell) this PIC ends up getting disabled: Window objects have a class setter hook XPC_WN_Helper_SetProperty (???) which causes the ICs to bail out.

This will (I think) still cause bad browser perf for SETNAME ops which go to global objects.
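
As a conceptual illustration of the GNAMEINC point above: an increment on a global name is a read followed by a write, and the write is the step that goes through any setter hook on the holder object. The sketch below is plain JavaScript, not SpiderMonkey bytecode; `decomposedPostIncrement`, `holder`, and `writes` are hypothetical names, and the accessor property only stands in for a class setter hook.

```javascript
// Conceptual decomposition of `name++` on a global; not SpiderMonkey bytecode.
function decomposedPostIncrement(scope, name) {
  const old = Number(scope[name]); // read the value and convert it to a number
  scope[name] = old + 1;           // write back: the step that hits a setter hook
  return old;                      // post-increment yields the old value
}

// An accessor on the holder object stands in for a class setter hook:
// every write performed by an increment goes through it.
let writes = 0;
const holder = {
  _n: 0,
  get n() { return this._n; },
  set n(v) { writes++; this._n = v; },
};

for (let k = 0; k < 3; k++) decomposedPostIncrement(holder, "n");
holder.n++; // the built-in operator performs the same read-then-write
console.log(holder.n, writes);
```

Each increment triggers exactly one write through the setter, which is why the kind of set operation emitted for that write matters for performance in a hot loop.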

Confirmed this fixes the bitwise-and regression.

http://hg.mozilla.org/projects/jaegermonkey/rev/7dae91c263cf
Attachment #554110 - Flags: review?(dvander)
Attachment #554110 - Flags: review?(dvander) → review+
On latest Windows nightlies for SS I get 230ms for m-c and 223ms for JM.  This is similar to AWFY performance.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED

Comment 10

7 years ago
Similar? Can we get concrete numbers here?
The latest awfy-regress page (OS X, shell, x86) is 304ms m-c, 300ms JM, so Windows browser performance is showing a slightly larger edge for JM.
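
For reference, the relative edge implied by the figures quoted in the last two comments can be worked out as follows. The helper name `edgePercent` is made up for this sketch; the inputs are the quoted millisecond totals.

```javascript
// Relative improvement of JM over m-c, as a percentage of the m-c time.
function edgePercent(mcMs, jmMs) {
  return ((mcMs - jmMs) / mcMs) * 100;
}

const windowsBrowser = edgePercent(230, 223); // Windows nightlies, in-browser
const osxShell = edgePercent(304, 300);       // awfy-regress, OS X shell, x86

console.log("Windows browser:", windowsBrowser.toFixed(1) + "%");
console.log("OS X shell:     ", osxShell.toFixed(1) + "%");
```

This gives about 3.0% on Windows in the browser versus about 1.3% in the OS X shell.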

Updated

7 years ago
Depends on: 719189