Kraken: Big performance regression in gaussian-blur




9 years ago
4 years ago


(Reporter: spammaaja, Assigned: billm)



Windows 7

Firefox Tracking Flags

(blocking2.0 betaN+)


(Whiteboard: softblocker, fixed-in-tracemonkey, )


(1 attachment)



9 years ago
User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b7) Gecko/20100101 Firefox/4.0b7
Build Identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b8pre) Gecko/20101208 Firefox/4.0b8pre

Beta 7		1038.7ms +/- 0.7%{%22v%22:%20%22kraken-1.0%22,%20%22ai-astar%22:[2451,2653,2621,2466,2642,2663,2629,2531,2668,2640],%22audio-beat-detection%22:[1089,1060,1050,1039,1050,1048,1049,1046,1043,1048],%22audio-dft%22:[1046,1047,1034,1041,1057,1035,1036,1046,1034,1076],%22audio-fft%22:[920,845,863,858,907,838,861,902,872,826],%22audio-oscillator%22:[1132,1120,1116,1124,1115,1116,1114,1129,1118,1123],%22imaging-gaussian-blur%22:[1029,1031,1041,1049,1055,1052,1036,1039,1026,1029],%22imaging-darkroom%22:[445,443,442,446,454,443,442,438,445,444],%22imaging-desaturate%22:[912,904,911,892,909,918,916,891,897,900],%22json-parse-financial%22:[229,228,229,238,231,230,231,233,233,225],%22json-stringify-tinderbox%22:[117,117,109,111,115,108,109,109,110,110],%22stanford-crypto-aes%22:[328,366,364,316,320,365,363,318,322,373],%22stanford-crypto-ccm%22:[242,241,244,254,237,243,243,251,246,236],%22stanford-crypto-pbkdf2%22:[604,601,601,602,615,603,591,601,694,604],%22stanford-crypto-sha256-iterative%22:[220,239,229,224,226,238,228,234,239,232]}

nightly		9059.5ms +/- 0.3%{%22v%22:%20%22kraken-1.0%22,%20%22ai-astar%22:[2697,2488,2627,2623,2624,2500,2628,2644,2487,2689],%22audio-beat-detection%22:[1123,1126,1127,1160,1136,1164,1131,1168,1129,1174],%22audio-dft%22:[1041,1040,1035,1046,1045,1037,1038,1044,1043,1047],%22audio-fft%22:[941,886,897,881,885,883,878,892,893,881],%22audio-oscillator%22:[1072,1072,1086,1066,1077,1072,1076,1061,1070,1070],%22imaging-gaussian-blur%22:[9070,9013,9048,9048,9018,9044,9058,9142,9106,9048],%22imaging-darkroom%22:[442,441,438,437,443,437,435,442,440,445],%22imaging-desaturate%22:[893,889,894,897,896,895,890,895,898,898],%22json-parse-financial%22:[229,228,231,232,235,227,228,232,232,233],%22json-stringify-tinderbox%22:[115,115,116,115,117,115,123,115,114,117],%22stanford-crypto-aes%22:[309,333,310,314,319,307,310,320,307,311],%22stanford-crypto-ccm%22:[239,233,231,236,237,234,234,240,238,236],%22stanford-crypto-pbkdf2%22:[685,696,698,687,678,706,692,682,679,697],%22stanford-crypto-sha256-iterative%22:[261,257,255,263,261,260,256,261,258,259]}

I'm trying to find a regression range, but it's gonna take some time.

Reproducible: Always


9 years ago
blocking2.0: --- → ?
Version: unspecified → Trunk


9 years ago
Hmm.  I just tried this (on Mac).  I get 549.0ms in b7 and 530.9ms in the Dec 7 m-c nightl.  527ms in the Dec 8 m-c nightly.
My results, also the gaussian blur test was throwing 'slow script warnings' 

A script on this page may be busy, or it may have stopped responding. You can stop the script now, or you can continue to see if the script will complete.

Each pass of the test for gaussian blur thre the warning, but the driver.html:xxx was a different number:  i.e. 96, or 97 in a couple of passes. 

Running Win7 x64 here latest nightly, today's.  
AMD Phenom II Quad, HD3200 on-board video chip 256meg
I'd still love a regression range.  I can't reproduce in either a 32-bit or 64-bit build on Mac...

Comment 5

9 years ago
Regression range:

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b8pre) Gecko/20101117 Firefox/4.0b8pre
Result: 1062.6ms +/- 1.9%

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b8pre) Gecko/20101118 Firefox/4.0b8pre
Result: 9306.6ms +/- 0.2%


Comment 6

9 years ago
Try using the TM builds to further narrow it down.
I just tested K:\Moz2\20101117133109 6d7b682a3cc4 which is before the TM merge in that range, and I was getting slow-script warnings.. Still looking.
Perhaps I'm cchasing ghosts - but I'm back to build:
20101028104052 cebb9d72f310

And still getting slow-script warnings - if I turn off in about:config pref:

The warnings stop - can someone tell me when that pref first landed before I spin any more wheels ?
20101026093127 3e08f8844f87 OK  - no pref
20101027142247 16c54dba9f6b bad - Has pref

For me between those two builds is when the pref:

appeared but I don't have time to narrow it further now, but in my mind the introduction of 'jitprofiling' is or may be cause, as turning it off in any build after that stops the slow-script warning and poor performance.

Comment 11

9 years ago
Regression range in tracemonkey tree (WinXP x86).

OK:20101022055746 95ba3d6a1ecf
Bad:20101023060640 ee1746020149

Comment 12

9 years ago
Are all of you running Windows 32-bit builds?


9 years ago
blocking2.0: ? → betaN+


9 years ago
Assignee: general → wmccloskey
I am running the win32 builds.
blocking2.0: betaN+ → ?

Comment 14

9 years ago
(In reply to comment #13)
> I am running the win32 builds.

Did you reset the block flag accidentally?


9 years ago
blocking2.0: ? → betaN+
I'm pretty sure this is what is happening: Everything happens as it's supposed to and we decide to trace all the right loops. Then for some reason we flush the JIT cache, which releases all traces and all profile data. Then we profile again, and this time we see a different iterations of the loop, and we decide not to trace. Specifically, we hit the continue below:
  for (...) {
    if (unlikely condition) continue;
    // lots of code that we want to trace
The profiler just sees a short loop with almost no code, so it chooses not to trace.

So the reason it's win32-only is that the timing of ResetJIT is platform-specific. I'll work on making the profiler less fragile here so that it decides to trace in both cases. Maybe we should try to profile again if we hit a continue statement...
This is relevant:

In particular, note the type change -- we start with an array-of-ints, but that changes into an array-of-doubles.  However, I would have thought that throwing the code away and re-recording would have helped, because the 2nd time we'd generate code for the array-of-doubles.
Whiteboard: softblocker
I landed a patch in bug 606890 that I'm hoping fixes this problem. However, I'm no longer able to reproduce the original problem on recent builds.

I'd appreciate it if someone who has recently reproduced the bug could check the fix. You'll need to use tomorrow's tracemonkey nightly. To see if the bug is fixed, run the Kraken benchmark with javascript.options.jitprofiling.content set to both true and false (set this in about:config). Hopefully, the results for Gaussian blur should be similar in the two cases.
Posted patch fixSplinter Review
The problem being discussed in bug 606890 comment 27 actually seems to be this bug as well. I tracked it down a bit better, and here's what seems to happen.

ResetJIT is getting called during the Gaussian blur benchmark. This happens both from the Kraken harness and from the browser (although not all the time). Most profile information gets reset at this point. However, there's a little bit of information in the methodjit as well. Namely, whether the loop is blacklisted and how long we should wait before calling into the tracer. The consequence is that we profile an outer loop before profiling the inner loop in some cases (if, for example, the inner loop was previously blacklisted). This messes up some of the tracer's heuristics.

This patch iterates through all the profiles when ResetJIT is called, and it resets the trace ICs associated with them. It seems to fix the problem that I'm seeing on AWFY now.
Attachment #502987 - Flags: review?(dmandelin)
Attachment #502987 - Flags: review?(dmandelin) → review+
(In reply to comment #19)
> ResetJIT is getting called during the Gaussian blur benchmark. This happens
> both from the Kraken harness and from the browser (although not all the time).

Why is it non-deterministic?
(In reply to comment #20)
> Why is it non-deterministic?

The JIT has space for four globalObj/globalObjShape pairs; when there are more, it resets. So when you reset depends on how many things were in this cache when the benchmark started.

Although now that the trace monitor is in the compartment, this should be a lot more deterministic (which may be why I couldn't get it to happen recently).

This is just my guess. There's a lot of complicated stuff going on here, and I don't pretend to understand it all.
These tracer-related data structures that fill up and then reset do introduce ND in the browser, and they also can cause performance sawtoothing as traces after a reset need to be re-recorded. Seems like LRU-SP or similar is always better. Not sure what to do about this, flying it here for quick reactions.

This fixed the problem with Gaussian blur on AWFY, so I'm going to mark the bug fixed-in-tracemonkey. I think that the original, in-browser version of the bug should also be gone. If anyone still sees it on a build that includes the patch above, please re-open.
Whiteboard: softblocker → softblocker, fixed-in-tracemonkey
Serendipitous that the ai-astar change helped diagnose this one! :)

Comment 26

9 years ago
JIT profiling on
Total: 		13590.1ms +/- 11.6%
gaussian-blur: 	3845.5ms +/- 41.0%

JIT profiling off
Total: 		11230.1ms +/- 0.5%
gaussian-blur:  1098.1ms +/- 1.1%

It's still slower with JIT profiling.

Comment 27

9 years ago
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b9pre) Gecko/20110112 Firefox/4.0b9pre (TM branch).
> Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b9pre) Gecko/20110112
> Firefox/4.0b9pre (TM branch).

If that's your build, it probably doesn't contain the patch for this bug.  The patch was checked in to tracemonkey on the 12th at 10am pacific; the nightly builds are usually created earlier than that.  Update to today's nightly?

Comment 29

9 years ago
It's up to date according to the about window.

Comment 30

9 years ago
This is from today's *Mozilla-Central* build:

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b10pre) Gecko/20110113 Firefox/4.0b10pre

Total: 11252.0ms +/- 0.9%
gaussian-blur: 1064.3ms +/- 1.9%

So, the TM build is slower.
JK, what's the build id of your TM build?

Comment 32

9 years ago
No, that's your build date.  The build id is the thing at the end of the URI in about:buildconfig.

But note that even your builddate is 6 hours before the patch was checked in.  If it claims up to date, then maybe today's nightly builds failed or something.  But your build certainly does NOT have this fix.
(In reply to comment #32)
> 20110112034642

Unfortunately, this doesn't include the patch. Could you download a new TM nightly? This one has the fix:

I'm not sure why mozilla-central was faster.

Comment 35

9 years ago

Total: 11358.9ms +/- 3.9%
gaussian-blur: 1065.8ms +/- 1.3%
Great! Thanks very much for your time and patience, JK.
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.