HUGE performance regression on FF9+ (sunspider benchmark)

RESOLVED FIXED in Firefox 12

Status

Core
JavaScript Engine
--
major
RESOLVED FIXED
6 years ago
5 years ago

People

(Reporter: Miguel Angel, Assigned: dvander)

Tracking

9 Branch
mozilla12
x86
Linux
Points:
---

Firefox Tracking Flags

(firefox10-, firefox11- affected, firefox12+ fixed)

Details

(Whiteboard: [qa+])

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

6 years ago
User Agent: Mozilla/5.0 (X11; U; Linux i686; es-ES; rv:1.9.2.24) Gecko/20111101 Firefox/3.6.24
Build ID: 2011110100

Steps to reproduce:

Test JavaScript performance with the SunSpider 0.9.1 benchmark, using official i686 Linux builds.


Actual results:

Performance dropped 5-fold from Firefox 8.x to Firefox 9+
Currently FF9 performance is much worse than FF3.6 (2614.2ms in FF3.6)

FF9+ (beta channel) sunspider run:
============================================
RESULTS (means and 95% confidence intervals)
--------------------------------------------
Total:                 4263.0ms +/- 6.0%
--------------------------------------------

  3d:                   658.4ms +/- 10.4%
    cube:               204.7ms +/- 6.4%
    morph:              221.2ms +/- 6.9%
    raytrace:           232.5ms +/- 21.1%

  access:               931.1ms +/- 5.5%
    binary-trees:        94.4ms +/- 22.6%
    fannkuch:           452.1ms +/- 6.5%
    nbody:              241.2ms +/- 3.3%
    nsieve:             143.4ms +/- 2.8%

  bitops:               816.9ms +/- 7.1%
    3bit-bits-in-byte:  110.6ms +/- 7.7%
    bits-in-byte:       164.6ms +/- 24.1%
    bitwise-and:        333.1ms +/- 2.3%
    nsieve-bits:        208.6ms +/- 17.9%

  controlflow:          110.0ms +/- 19.6%
    recursive:          110.0ms +/- 19.6%

  crypto:               357.6ms +/- 20.0%
    aes:                167.0ms +/- 19.5%
    md5:                100.0ms +/- 29.7%
    sha1:                90.6ms +/- 10.4%

  date:                 224.6ms +/- 2.8%
    format-tofte:       132.2ms +/- 1.2%
    format-xparb:        92.4ms +/- 6.8%

  math:                 548.4ms +/- 5.1%
    cordic:             220.6ms +/- 1.1%
    partial-sums:       192.1ms +/- 9.6%
    spectral-norm:      135.7ms +/- 6.8%

  regexp:                26.8ms +/- 2.1%
    dna:                 26.8ms +/- 2.1%

  string:               589.2ms +/- 20.6%
    base64:             102.6ms +/- 12.1%
    fasta:              182.3ms +/- 22.2%
    tagcloud:           121.6ms +/- 26.5%
    unpack-code:         77.4ms +/- 22.6%
    validate-input:     105.3ms +/- 30.8%

FF8 sunspider run:
============================================
RESULTS (means and 95% confidence intervals)
--------------------------------------------
Total:                  856.5ms +/- 5.8%
--------------------------------------------

  3d:                   184.7ms +/- 4.2%
    cube:                56.0ms +/- 2.1%
    morph:               20.3ms +/- 2.9%
    raytrace:           108.4ms +/- 7.1%

  access:               130.1ms +/- 7.3%
    binary-trees:        58.6ms +/- 12.3%
    fannkuch:            27.7ms +/- 5.8%
    nbody:               18.2ms +/- 23.9%
    nsieve:              25.6ms +/- 10.5%

  bitops:                35.2ms +/- 25.1%
    3bit-bits-in-byte:    1.4ms +/- 26.4%
    bits-in-byte:        19.0ms +/- 43.0%
    bitwise-and:          3.0ms +/- 0.0%
    nsieve-bits:         11.8ms +/- 33.3%

  controlflow:           62.6ms +/- 11.2%
    recursive:           62.6ms +/- 11.2%

  crypto:                52.6ms +/- 7.8%
    aes:                 28.1ms +/- 12.8%
    md5:                 17.3ms +/- 8.5%
    sha1:                 7.2ms +/- 7.8%

  date:                 105.6ms +/- 15.5%
    format-tofte:        64.7ms +/- 2.4%
    format-xparb:        40.9ms +/- 40.0%

  math:                  60.5ms +/- 13.2%
    cordic:              28.7ms +/- 16.1%
    partial-sums:        18.7ms +/- 12.7%
    spectral-norm:       13.1ms +/- 8.7%

  regexp:                32.9ms +/- 13.2%
    dna:                 32.9ms +/- 13.2%

  string:               192.3ms +/- 6.4%
    base64:               8.4ms +/- 17.6%
    fasta:               44.4ms +/- 12.9%
    tagcloud:            59.2ms +/- 1.7%
    unpack-code:         58.8ms +/- 7.9%
    validate-input:      21.5ms +/- 29.8%


Expected results:

Performance should have increased or stayed the same.
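
The "5-fold" figure can be checked against the category subtotals of the two runs above. The following is a minimal sketch in Python; the dictionary names are illustrative and the numbers are copied from the two SunSpider reports.

```python
# Category subtotals (ms) copied from the two SunSpider 0.9.1 runs above.
ff9 = {"3d": 658.4, "access": 931.1, "bitops": 816.9, "controlflow": 110.0,
       "crypto": 357.6, "date": 224.6, "math": 548.4, "regexp": 26.8,
       "string": 589.2}
ff8 = {"3d": 184.7, "access": 130.1, "bitops": 35.2, "controlflow": 62.6,
       "crypto": 52.6, "date": 105.6, "math": 60.5, "regexp": 32.9,
       "string": 192.3}

# Slowdown factor per category: a value above 1 means FF9 is slower than FF8.
slowdown = {name: ff9[name] / ff8[name] for name in ff8}
total_slowdown = sum(ff9.values()) / sum(ff8.values())

# Print categories from most to least regressed, then the overall factor.
for name, factor in sorted(slowdown.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {factor:5.1f}x")
print(f"{'total':12s} {total_slowdown:5.1f}x")
```

On these numbers the arithmetic-heavy categories (bitops, math) regress the most, while regexp is the only category that does not regress.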
(Reporter)

Updated

6 years ago
Severity: normal → major
(Reporter)

Comment 1

6 years ago
FF 11.0a1 nightly is no better:

RESULTS (means and 95% confidence intervals)
--------------------------------------------
Total:                  5134.6ms +/- 11.2%

Comment 2

If you have extensions like Firebug installed, can you disable or uninstall them and see if that helps?

If that does not help, please make sure javascript.options.methodjit.content is enabled in about:config, and let us know whether you can reproduce with a clean profile (http://support.mozilla.com/en-US/kb/Managing-profiles).
(Reporter)

Comment 3

6 years ago
I did all the tests with a new profile created just for that.
No extensions at all (except Feedback 1.1.2)
I confirm javascript.options.methodjit.content is enabled (by default)

Comment 4

6 years ago
Miguel, would you be willing to use http://harthur.github.com/mozregression/ to figure out when the problem appears for you?  This is the first report of this that we have, so presumably something specific about your exact configuration is relevant...
(Reporter)

Comment 5

6 years ago
Sure!

Last good nightly: 2011-08-29
First bad nightly: 2011-08-30

Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2011-08-29&enddate=2011-08-30


Can I do anything else to help?
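
For readers unfamiliar with this kind of regression hunt, the idea is a binary search over nightly build dates. The sketch below only illustrates that idea and is not mozregression's actual implementation; `bisect_nightlies`, the `is_good` callback and the starting date range are hypothetical.

```python
from datetime import date, timedelta


def bisect_nightlies(good, bad, is_good):
    """Narrow [good, bad] down to two consecutive days: last good, first bad.

    `is_good` is a hypothetical callback that would run the benchmark on the
    nightly build for a given date and return True if performance is fine.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_good(mid):
            good = mid   # regression happened after `mid`
        else:
            bad = mid    # regression happened on or before `mid`
    return good, bad


# Simulated predicate that matches the result reported in this comment.
first_bad = date(2011, 8, 30)
last_good, found_bad = bisect_nightlies(
    date(2011, 7, 1), date(2011, 12, 1), lambda d: d < first_bad
)
print(last_good, found_bad)
```

Each step halves the date range, so a five-month window needs only about eight benchmark runs.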

Comment 6

6 years ago
Hmm.  That looks like the TI landing.  That's ... really odd.

Are these 32-bit or 64-bit Linux builds?
(Reporter)

Comment 7

6 years ago
Good old 32 bit.

Comment 8

Duplicate of this bug: 713063

Updated

6 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true

Comment 9

6 years ago
Miguel, could you attach your about:support page here?

Comment 10

6 years ago
Miguel, what's your exact CPU hardware?
(Reporter)

Comment 11

6 years ago
My CPU is an AMD AthlonXP (cpu family 6, model 10)

about:support

  Application Basics

        Name
        Firefox

        Version
        9.0

        User Agent
        Mozilla/5.0 (X11; Linux i686; rv:9.0) Gecko/20100101 Firefox/9.0

        Profile Directory

          Open Containing Folder

        Enabled Plugins

          about:plugins

        Build Configuration

          about:buildconfig

        Crash Reports

          about:crashes

        Memory Use

          about:memory

  Extensions

        Name

        Version

        Enabled

        ID

        Feedback
        1.1.2
        true
        testpilot@labs.mozilla.com

        openSUSE Firefox Extensions
        1.0.1
        false
        susefox@opensuse.org

  Modified Preferences

      Name

      Value

        browser.places.smartBookmarksVersion
        2

        browser.startup.homepage_override.buildID
        20111212185108

        browser.startup.homepage_override.mstone
        rv:9.0

        extensions.lastAppVersion
        9.0

        network.cookie.prefsMigrated
        true

        places.history.expiration.transient_current_max_pages
        39363

        places.history.expiration.transient_optimal_database_size
        62980096

        privacy.sanitize.migrateFx3Prefs
        true

  Graphics

        Adapter Description
        GLXtest process failed (exited with status 1): GLX version older than the required 1.3

        WebGL Renderer
        Blocked for your graphics card because of unresolved driver issues.

        GPU Accelerated Windows
        0/1. Blocked for your graphics driver version. Try updating your graphics driver to version <Anything with EXT_texture_from_pixmap support> or newer.

Comment 12

Miguel, could you try a few things for me?

1. Could you try SunSpider in version 8 with javascript.options.methodjit.content=false? This is to test the possibility that in 9, things are running in the interpreter only.

2. Could you try the V8 benchmarks in version 9? I'd like to know if it's specific to SunSpider or if it affects everything.
(Reporter)

Comment 13

6 years ago
1. SunSpider in version 8 with javascript.options.methodjit.content=false: 
Just as fast as javascript.options.methodjit.content=true

RESULTS (means and 95% confidence intervals)
--------------------------------------------
Total:                  899.2ms +/- 8.8%
--------------------------------------------

2. V8 benchmarks (version 6):
I'd say it affects everything. Score is about 6 times lower.

FF8:
Score: 889
Richards: 2962
DeltaBlue: 1256
Crypto: 2972
RayTrace: 291
EarleyBoyer: 380
RegExp: 625
Splay: 575

FF9:
Score: 150
Richards: 69.1
DeltaBlue: 80.9
Crypto: 99.4
RayTrace: 146
EarleyBoyer: 260
RegExp: 214
Splay: 387
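
The overall V8 score is the geometric mean of the individual test scores, which can be checked against the numbers above. A minimal sketch, assuming the per-test scores as reported (they are already rounded, so the result only matches to within rounding); the helper name `geometric_mean` is illustrative.

```python
import math

# Per-test scores copied from the two V8 (version 6) runs above.
ff8 = {"Richards": 2962, "DeltaBlue": 1256, "Crypto": 2972, "RayTrace": 291,
       "EarleyBoyer": 380, "RegExp": 625, "Splay": 575}
ff9 = {"Richards": 69.1, "DeltaBlue": 80.9, "Crypto": 99.4, "RayTrace": 146,
       "EarleyBoyer": 260, "RegExp": 214, "Splay": 387}


def geometric_mean(values):
    """Geometric mean computed via logarithms to avoid a huge product."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


ff8_score = geometric_mean(ff8.values())
ff9_score = geometric_mean(ff9.values())
print(round(ff8_score), round(ff9_score), round(ff8_score / ff9_score, 1))
```

This reproduces the reported scores of 889 and 150, and the ratio between them agrees with "about 6 times lower".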

Comment 14

6 years ago
> This is to test the possibility that in 9, things are running in the interpreter only.

Given comment 11, that's exactly what's happening.  "AMD AthlonXP (cpu family 6, model 10)" would probably be a CPU without SSE2 support according to <http://en.wikipedia.org/wiki/SSE2#Notable_IA-32_CPUs_not_supporting_SSE2>.  Miguel, could you confirm by looking at your /proc/cpuinfo ?

TraceMonkey worked on such CPUs, with runtime SSE2 detection.   JaegerMonkey does not: it just disables itself if SSE2 is not available.  So Miguel is getting pure-interp performance in 9.  That matches comment 13: disabling JM in 8 should be a performance hit on Sunspider if it were actually working.
(Reporter)

Comment 15

6 years ago
That is correct, there is no SSE2 in this cpu:
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow up
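
The same check can be scripted. The sketch below parses the `flags` line of `/proc/cpuinfo` text and tests for `sse2`; the function name `cpu_flags` is illustrative, and the sample string is the flags line quoted in this comment.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Flags line quoted in this comment (AMD Athlon XP).
sample = ("flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca "
          "cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow up")
flags = cpu_flags(sample)
print("sse2" in flags, "sse" in flags)

# On a Linux machine the same check could be run against the real file:
#   with open("/proc/cpuinfo") as f:
#       print("sse2" in cpu_flags(f.read()))
```

Splitting into a set of whole words matters here: a plain substring search for "sse" would also match "sse2".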

Comment 16

6 years ago
Yeah, then the regression is expected: you no longer have a working JIT.  :(
(Reporter)

Comment 17

6 years ago
I really hope it doesn't mean this is a wontfix.
There are a lot of 32-bit AMD CPUs out there, and I don't think it is very wise to drop the JIT altogether just because they lack a few SSE2 instructions which provide a marginal gain at best.

Comment 18

6 years ago
Setting "javascript.options.typeinference" to false seems to enable the JIT on my Athlon XP. I assume this is using TraceMonkey.

Correct me if I'm wrong, but I don't see any obvious x87 instructions in the Nitro assembler source code, so it looks like adding support for non-SSE2 chips would be a big job.
(Assignee)

Comment 19

6 years ago
The problem is that the x87 fpu is annoying to work with. It's a completely different instruction set. The trace JIT (which is removed in Fx11+) dealt with x87 by assuming it only had one register, so it generated bad code but at least it generated something. SSE2 is something normal and easy to use.

One option for JaegerMonkey would be to disable floating-point optimizations entirely if SSE2 isn't present. You'd still generate JIT code but floating point math would call out to C++, making it much slower than on a slightly newer CPU that had modern extensions. bug 696291 does this (it also disables type inference if there's no SSE2).
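
The three situations discussed so far can be summarised as a small decision function. This is a toy model of the behaviour described in this thread, not SpiderMonkey code; the function name `jit_mode` and both parameter names are hypothetical.

```python
def jit_mode(has_sse2, fp_fallback_patch):
    """Toy model of the behaviour described in this bug; not SpiderMonkey code.

    has_sse2          -- whether the CPU reports SSE2 support
    fp_fallback_patch -- whether the approach from bug 696291 is applied
    """
    if has_sse2:
        return "full JIT"
    if fp_fallback_patch:
        # JIT code is still generated; floating-point math calls out to C++
        # and type inference is disabled.
        return "JIT with floating-point stub calls, no type inference"
    # Behaviour reported for Firefox 9: JaegerMonkey disables itself.
    return "interpreter only"


for sse2 in (True, False):
    for patched in (True, False):
        print(f"sse2={sse2!s:5} patched={patched!s:5} -> {jit_mode(sse2, patched)}")
```

The point of the proposal is the middle branch: integer-heavy code keeps its JIT speedup, and only floating-point work pays the cost of the missing instructions.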

Comment 20

6 years ago
> One option for JaegerMonkey would be to disable floating-point optimizations
> entirely if SSE2 isn't present.

Well I would vote for that if it's easy enough to do, it's bound to be better than nothing.
Depends on: 696291

Comment 21

(In reply to David Anderson [:dvander] from comment #19)
> One option for JaegerMonkey would be to disable floating-point optimizations
> entirely if SSE2 isn't present. You'd still generate JIT code but floating
> point math would call out to C++, making it much slower than on a slightly
> newer CPU that had modern extensions. bug 696291 does this (it also disables
> type inference if there's no SSE2).

Is that patch ready to land?
(Assignee)

Comment 22

6 years ago
(In reply to David Mandelin from comment #21)
> Is that patch ready to land?

No, it doesn't seem to apply at all. I'll rebase.
(Assignee)

Comment 23

6 years ago
Created attachment 587160 [details] [diff] [review]
disable sse2 optimizations if not available

pushed to try
Assignee: general → dvander
Status: NEW → ASSIGNED
(Assignee)

Comment 24

6 years ago
Could anyone with this problem try a build from here:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/danderson@mozilla.com-41c5e3bac27d

(avoid the -debug ones)

Comment 25

6 years ago
Huge performance regression fixed for me.

Sunspider :
 Before (ff 8) : 1080 ms
 Today (ff 9) : 7800 ms
 After (ff 12a1 (2012-01-09)) : 1370 ms

Good Job B)
Thanks
(Reporter)

Comment 26

6 years ago
FF 12.0a1

RESULTS (means and 95% confidence intervals)
--------------------------------------------
Total:                 1097.9ms +/- 1.6%

That's much better: about 28% slower than FF8 but about 2.4 times faster than FF3.6.
May I suggest that this patch go in FF10? Since FF10 will be an LTS release, it wouldn't be wise to leave a good chunk of users with a low-performance LTS.
(Assignee)

Comment 27

6 years ago
Thanks for testing! I'm confident this patch basically works but need to see why it came up orange on the tryserver.

Comment 28

Nominating for tracking-firefox10 even though it's very late, because it would be bad not to have a JIT in ESR (for some CPUs).
tracking-firefox10: --- → ?

Comment 29

If this proves to be an issue for enterprise, we can consider uplifting after the 10.0 release. Let's wait for their feedback before tracking though.
tracking-firefox10: ? → -
(Assignee)

Comment 30

6 years ago
Created attachment 588224 [details] [diff] [review]
v2

fixes orange
Attachment #587160 - Attachment is obsolete: true
Attachment #588224 - Flags: review?(bhackett1024)

Comment 31

Comment on attachment 588224 [details] [diff] [review]
v2

Review of attachment 588224 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/methodjit/FastArithmetic.cpp
@@ +244,5 @@
>      bool canDoIntMath = op != JSOP_DIV && type != JSVAL_TYPE_DOUBLE &&
>                          !(rhs->isType(JSVAL_TYPE_DOUBLE) || lhs->isType(JSVAL_TYPE_DOUBLE));
>  
> +    if (!canDoIntMath || (frame.haveSameBacking(lhs, rhs) && !masm.supportsFloatingPoint()))
> +        return jsop_binary_slow(op, stub, type, lhs, rhs);

This test looks wrong: won't we always make a stub call instead of going through jsop_binary_double, even if the CPU has SSE2?

@@ +1627,2 @@
>  
> +        if (!lhs->isTypeKnown() || !rhs->isTypeKnown()) {

Can the code below this test just be unconditional?  This opaque test is confusing.

@@ +1718,3 @@
>  
> +        /* Link all incoming slow paths to here. */
> +        if (!lhs->isTypeKnown() || !rhs->isTypeKnown()) {

Ditto.
Attachment #588224 - Flags: review?(bhackett1024) → review+
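
The first review point can be seen by enumerating the condition from the quoted hunk. The sketch below models only its boolean structure in Python; the function and variable names are illustrative, and the thread does not show how the condition was rewritten in the version that landed.

```python
from itertools import product


def takes_slow_path(can_do_int_math, same_backing, supports_fp):
    # Mirrors the boolean structure of the condition in the quoted hunk:
    # !canDoIntMath || (haveSameBacking && !supportsFloatingPoint)
    return (not can_do_int_math) or (same_backing and not supports_fp)


# Full truth table of the reviewed condition.
for can_int, same, fp in product([False, True], repeat=3):
    print(can_int, same, fp, takes_slow_path(can_int, same, fp))

# Cases where floating point IS supported but the stub call is still taken.
suspicious = [(c, s) for c, s, f in product([False, True], repeat=3)
              if f and takes_slow_path(c, s, f)]
print(suspicious)
```

Whenever `can_do_int_math` is false the slow path is taken regardless of floating-point support, which is the case the reviewer is asking about.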
(Assignee)

Comment 32

6 years ago
Okay thanks, I've fixed those things and sent to try.
status-firefox11: --- → affected
status-firefox12: --- → affected
tracking-firefox11: --- → +
tracking-firefox12: --- → +
(Assignee)

Comment 33

6 years ago
https://bugzilla.mozilla.org/show_bug.cgi?id=712261
(Assignee)

Updated

6 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
(Assignee)

Comment 34

6 years ago
I botched comment #33 - this landed a week ago and appears to have stuck. It should appear in Firefox 12.

Comment 35

6 years ago
Is this something that is worth backporting to 11?
status-firefox12: affected → fixed
Target Milestone: --- → mozilla12
(Assignee)

Comment 36

6 years ago
Comment on attachment 588224 [details] [diff] [review]
v2

[Approval Request Comment]
Regression caused by (bug #): bug 698201
User impact if declined: older CPUs will have extremely slow JS (no JIT)
Testing completed (on m-c, etc.): yes
Risk to taking this patch (and alternatives if risky):
String changes made by this patch:
Attachment #588224 - Flags: approval-mozilla-beta?

Comment 37

(In reply to David Anderson [:dvander] from comment #36)
> Risk to taking this patch (and alternatives if risky):

Can you address the risk to uplifting to beta in our second to last beta?
(Assignee)

Comment 38

6 years ago
The risk is it could introduce a JIT bug, either a regression, or could expose an existing bug that users with older CPUs wouldn't have otherwise seen.

Comment 39

Comment on attachment 588224 [details] [diff] [review]
v2

[Triage Comment]
Given the risk evaluation and the fact that this is not a regression from FF10, let's let this bake more and then release with FF12. Thanks David.
Attachment #588224 - Flags: approval-mozilla-beta? → approval-mozilla-beta-

Updated

6 years ago
tracking-firefox11: + → -
Whiteboard: [qa+]