Closed Bug 1110539 Opened 10 years ago Closed 9 years ago

Significant pause times introduced in FF since 4 December

Categories

(Core :: JavaScript Engine: JIT, defect)

x86_64
macOS
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox37 + fixed

People

(Reporter: lth, Unassigned)

Details

(Keywords: regression)

Attachments

(1 file)

One shared-memory benchmark I have - iterated animated mandelbrot - ran smoothly at 44FPS on my MBP built from m-i on 4 December.

When built from sources from today, and also from sources from earlier this week, the animation has very visible intermittent pauses and gets about 39FPS.
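One way to put numbers on "intermittent pauses" is to record a timestamp per frame and flag frames that take much longer than the typical one. The following is a minimal sketch, not part of the benchmark: the function name, the millisecond timestamp list, and the 3x-median threshold are all assumptions made for illustration.

```python
from statistics import median


def summarize_frames(timestamps_ms, pause_factor=3.0):
    """Return (average_fps, pauses) for a list of frame timestamps in ms.

    A frame counts as a pause when its duration exceeds pause_factor times
    the median frame duration. The threshold and the input format are
    assumptions for this illustration, not something the benchmark defines.
    """
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two timestamps")

    # Duration of each frame: difference between consecutive timestamps.
    deltas = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]

    total_ms = timestamps_ms[-1] - timestamps_ms[0]
    average_fps = 1000.0 * len(deltas) / total_ms

    typical = median(deltas)
    # Report (frame index, duration) for every frame well above typical.
    pauses = [(i + 1, d) for i, d in enumerate(deltas)
              if d > pause_factor * typical]
    return average_fps, pauses
```

A run with a few long frames among many short ones gives a lower average FPS and a non-empty pause list, which matches the symptom described: a modest drop in frame rate combined with visible stalls.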

There's as yet no evidence at all that the slowdown is caused by GC, but I'm going to tag it as such for the time being.  I'll be investigating.
If you set javascript.options.mem.log to true in about:config and then open the browser console, you can see if the pauses match up with when we're doing GC or CC.
Though, if the GCs are on workers you won't see that in the log.
(No output in the log.  There should be very little GC, but one never knows.)

Bisecting, I find that the problem is intriguingly caused by this change set:

changeset:   218782:becc88436330
user:        Dan Gohman <sunfish@mozilla.com>
date:        Mon Dec 08 18:20:30 2014 -0800
summary:     Bug 1065339 - IonMonkey: x86 VEX encoding support for several operators r=jandem
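For readers unfamiliar with bisecting, the idea is a binary search over the changeset history between a known-good and a known-bad build. A rough sketch follows, assuming the regression stays present once introduced. The function and its `is_bad` callback are hypothetical; in practice a tool such as `hg bisect` drives this and "is_bad" means building and running the benchmark by hand.

```python
def first_bad(revisions, is_bad):
    """Return the first revision for which is_bad(rev) is true.

    Assumes revisions are ordered oldest to newest, the first one is known
    good, the last one is known bad, and the regression persists once it
    has been introduced.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return revisions[hi]
```

Each step halves the candidate range, so the number of builds to test grows only with the logarithm of the number of changesets in the range.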

I haven't tried to figure out why yet.  The machine is an early-2014 MacBook Pro with 4 cores with hyperthreading, and the benchmark uses 8 threads in parallel.  Obvious possibilities are:

- the patch introduced some sort of weirdness around the atomics
- the patch started using CPU features that are more scarce than what we were using before,
  causing the jerkiness

(I'll post STR etc later on.)

ni? from :sunfish in case he has any good ideas.
Flags: needinfo?(sunfish)
Hardware: x86 → x86_64
Definitely VEX related in some way, since changing HaveAVX to return false on m-i tip restores the higher frame rate and smooth behavior.
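For context, the conventional x86 test for whether AVX (and so VEX-encoded instructions) may be used looks at CPUID leaf 1 and the XCR0 register. The sketch below shows only that bit logic on values passed in as plain integers. It is not SpiderMonkey's HaveAVX implementation, and the function name is hypothetical.

```python
CPUID_ECX_OSXSAVE = 1 << 27  # OS has enabled XSAVE/XGETBV
CPUID_ECX_AVX = 1 << 28      # CPU implements AVX
XCR0_SSE = 1 << 1            # OS saves and restores XMM state
XCR0_AVX = 1 << 2            # OS saves and restores the upper YMM state


def avx_usable(cpuid1_ecx, xcr0):
    """Decide whether AVX may be used, given CPUID.1:ECX and XCR0.

    Both arguments are raw register values supplied by the caller; reading
    them from the hardware is outside the scope of this sketch.
    """
    needed = CPUID_ECX_OSXSAVE | CPUID_ECX_AVX
    if (cpuid1_ecx & needed) != needed:
        return False
    # The OS must also have opted in to saving both XMM and YMM state.
    state = XCR0_SSE | XCR0_AVX
    return (xcr0 & state) == state
```

Forcing such a check to return false, as described above, makes the JIT fall back to the legacy SSE encodings.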
Component: JavaScript: GC → JavaScript Engine: JIT
[Tracking Requested - why for this release]: Performance regression.
Keywords: regression
STR (on my system).

1) Clone https://github.com/lars-t-hansen/sab-demo.git

2) Apply the attached patch to mozilla-inbound, rebuild Firefox in release mode

3) Open sab-demo/mandelbrot-iterated/mandelbrot.html and observe the display.  With VEX enabled, I get
   occasional, very noticeable pauses; with VEX disabled, they disappear.  The number of threads
   used for the computation can be varied using e.g. .../mandelbrot.html?cores=4 (defaults to 8).

My MacBook Pro identifies itself as "Retina, 15-inch, Late 2013" / "MacBookPro11,3", 2.6GHz Core i7, 256KB L2 cache per core, 6MB L3 cache.
The attached patch applies with minimal fuzz to m-i 62a9e591e57d (11 December).
(The fix for bug 1110570 does not fix the problem, not that I expected it to.)
Have there been any slowdowns other than on SAB test cases? Do you see the pauses if you set cores=1?

If it's related to running on multiple cores, could the slowdowns be related to context switching? However, we aren't using any 256-bit instructions in the JIT currently, so in theory the OS shouldn't have to save and restore the full ymm state.

I'm not able to reproduce the slowdown on my machine. Do you have a performance tool which can observe the OTHER_ASSISTS.AVX_TO_SSE hardware counter?
Flags: needinfo?(sunfish)
I have not tried cores=1 yet, among other things.  I did try cores=4 and the pauses were significantly worse, if anything.  But I need to collect the data properly.

As for the rest ... I'm traveling most of the day today, I'll report back next week.
AVX support on SpiderMonkey trunk is now considerably more complete, with changes in the way AVX support is detected and in how it's used. Can you test whether the pauses are still present?
The pauses appear to have gone away, and the frame rate is up to what it used to be.  Thanks!
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
I should note (in reference to Hannes's comment https://bugzilla.mozilla.org/show_bug.cgi?id=1118235#c16) that I had upgraded to Mac OS X 10.10 by the time I last tested this, but the initial slowdown was reported with 10.9.  I did not think of that as a variable but I guess it might be.