Closed Bug 920508 Opened 11 years ago Closed 6 years ago

Peacekeeper grid benchmark much slower than in Chrome

Categories

(Core :: Layout, defect)

defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1493420

People

(Reporter: jandem, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(3 files)

Attached file Testcase
I'm attaching a standalone version of http://www.peacekeeper.therichins.net/test2.html

I had to rewrite the test a bit to not require jQuery but that shouldn't influence the results.

Chrome 31 / Safari:  6 ms per frame
Opera 12.16:        15 ms per frame
Nightly:            30 ms per frame

Webkit is a lot faster but Presto is/was also 2x faster than us.

The test creates a grid of divs, then modifies the background color of each of them:

    this.pixels[y][x].style.backgroundColor = this.hex(r, g, 0);

When I change this line to:

    this.pixels[y][x].foo = this.hex(r, g, 0);

Every frame takes ~2 ms, same as with Chrome.

According to Instruments, most of the time is in Layout/Graphics.
FWIW, Peacekeeper uses the same grid code for multiple tests. The only difference is the grid size. I hope that if we can make this fast, the tests with smaller grids will also improve:

http://www.peacekeeper.therichins.net/test0.html
http://www.peacekeeper.therichins.net/test1.html
Just to make sure, is the peacekeeper harness measuring the time being shown here?  Because afaict the time being shown is just the time to twiddle the styles, not the time it takes to then process the restyles or paint, right?
Flags: needinfo?(jdemooij)
Depends on: 903372
Assuming that the "time per frame" time is what's being benchmarked here...

What I see is:

* 20% or so in jitcode and various JS engine bits _not_ counting proxies.
*  4% under HTMLElementBinding::get_style (called off the fast ion path, yay).
* 11% getting from js::Proxy::set to CSS2PropertiesBinding::set_backgroundColor.
      If we could fix TI to deal and generate a direct call to the latter, that
      would sure be nice.  See bug 822442, basically, though maybe efaust has
      other bugs on this too.
* 10% parsing the new property value.
* 10% calling AttributeWillChange; a good chunk of this (6.5%) is
      HasAttributeDependentStyle.
* 4% getting the inline style rule to modify.
* 3% getting the base URI... we need to kill off xml:base.  :(
* 23% the SetAttrAndNotify call.  Some of this is the SetAndTakeAttr (largely
      the free() call on the old style rule; about 3-4% here!) and about 13% is
      AttributeChanged.  This time the HasAttributeDependentStyle is only
      about 5%; there is also 2% in GetStyleDisplay(), 2% adding things to the
      restyle tracker, and a bunch of self time in
      RestyleManager::AttributeChanged.

We should really be able to do better for the AttributeWillChange/AttributeChanged bits for style attributes in particular...  :(
Depends on: 822442
(In reply to Boris Zbarsky [:bz] from comment #2)
> Just to make sure, is the peacekeeper harness measuring the time being shown
> here?  Because afaict the time being shown is just the time to twiddle the
> styles, not the time it takes to then process the restyles or paint, right?

The harness does the following for every frame, iterations == 15000 for this test:
---
var testTime = benchmark.elapsedTime();

if (testTime > benchmark.test.iterations) {
    benchmark.test.result = benchmark.renderedFrames / (testTime / 1000);
    // submit
    return;
}

benchmark.renderedFrames++;
benchmark.test.run(testTime);
// setTimeout, 5 ms

---

So it's (number of frames) / (number of seconds)... Sorry, I should have added that same calculation to the standalone test...
Flags: needinfo?(jdemooij)
Ah, so it's measuring wall-clock time, not style-twiddling time, I see.
This is showing, for me, about 130ms/frame times for us and about 80ms/frame times for Chrome.

Note that all the stuff I profiled before is about 12% of our time now...
OK, profiling the new thing we have:

* 40% restyle processing.  Some of this (~6%?) is TreeMatchContext stuff, 10% is resolving
  the new style context, 10% is CalcStyleDifference to determine what changed.  5% gcing the
  ruletree.  We should really try do do better with inline style changes in particular; was
  sure we had bugs on this...
* 5% display list construction
* 17% building layers
* 17% painting background colors.

and most of the rest is the stuff from comment 3 but scaled down by a factor of 7.
(In reply to Boris Zbarsky [:bz] from comment #7)
> * 40% restyle processing.  Some of this (~6%?) is TreeMatchContext stuff,
> 10% is resolving
>   the new style context, 10% is CalcStyleDifference to determine what
> changed.  5% gcing the
>   ruletree.  We should really try do do better with inline style changes in
> particular; was
>   sure we had bugs on this...

David, do you know if we have bugs on this maybe? This is one of the Peacekeeper tests where we are much slower than Chrome and I'm trying to set dependencies for all these bugs so that we know what has to happen :)
Flags: needinfo?(dbaron)
I don't know; I'm not sure what bugs Boris was talking about.

Are we rerunning selector matching as a result of style attribute changes?  I thought we weren't anymore, but maybe not.  I'm having trouble telling if that's the 6% plus 10% that bz is talking about, though I suppose I could re-profile.
Flags: needinfo?(dbaron)
(In reply to David Baron [:dbaron] (needinfo? me) (UTC-8) from comment #9)
> Are we rerunning selector matching as a result of style attribute changes? 
> I thought we weren't anymore, but maybe not.  I'm having trouble telling if
> that's the 6% plus 10% that bz is talking about, though I suppose I could
> re-profile.

Er, we still do that, and I should have known that.

Anyway, filed bug 977991.
(Still, that covers at most 16% if I'm understanding bz's data correctly, and it wouldn't reduce it to 0, either, though maybe close enough.)
I was mostly talking about not rerunning selector matching when changing inline style, yes.
On 64bit linux we're doing really badly with attachment 809984 [details]
Nightly 110ms, Chromium 33ms.
65% of refresh driver tick is flushing styles.
Cam: can you have a quick look at Olli's test case and see if our current restyle perf bugs already cover it? Thanks!
Flags: needinfo?(cam)
To my knowledge, bug 977991 (already landed) was the only one that would help with this test.

On my machine I'm getting:

18% RefreshDriverTimer::Tick -> nsRefreshDriver::Tick -> ... -> DoProcessRestyles
12% nsRefreshDriverTimer::FinishedWaitingForTransaction -> nsRefreshDriver::Tick -> ... -> DoProcessRestyles

and of the call stack (is there a way I can merge these two in Cleopatra?):

* 8.4% is under RestyleSelf, of which about half is in CalcStyleDifference and its rule tree walking and style computation, and most of the rest under ResolveStyleWithReplacement
* 2.8% GCing the rule tree and destroying style contexts
* ~2.5% doing TreeMatchContext stuff now -- which actually isn't necessary since we're not going to do any selector matching as part of the restyle
Flags: needinfo?(cam)
(In reply to Cameron McCormack (:heycam) from comment #15)
> and of the call stack (is there a way I can merge these two in Cleopatra?):

of the first call stack
Depends on: 1109939
on a fresh nightly:

profile: http://bit.ly/2CMSUdr
profile with sequential styling: http://bit.ly/2CMSYtH

almost all the time is spent in styling
Any chance of getting a profile with sequential styling? I suspect this is bug 1493420 basically.
(In reply to Emilio Cobos Álvarez (:emilio) from comment #18)
> Any chance of getting a profile with sequential styling? I suspect this is
> bug 1493420 basically.

written in comment 17 
profile with sequential styling: http://bit.ly/2CMSYtH
Flags: needinfo?(emilio)
(In reply to Mayank Bansal from comment #19)
> (In reply to Emilio Cobos Álvarez (:emilio) from comment #18)
> > Any chance of getting a profile with sequential styling? I suspect this is
> > bug 1493420 basically.
> 
> written in comment 17 
> profile with sequential styling: http://bit.ly/2CMSYtH

Bleh, I'm blind.

Yeah, 74% of the time in ensure_child, that's bug 1493420.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(emilio)
Resolution: --- → DUPLICATE

We look roughly as fast or faster than chromium on the latest nightly, can you confirm it matches what you see too?

I see other stuff we could optimize as well...

Flags: needinfo?(jdemooij)

If you still see measurable differences, then please reopen and ni? me and I can take a look at those other optimizations.

(In reply to Emilio Cobos Álvarez (:emilio) from comment #22)

If you still see measurable differences, then please reopen and ni? me and I can take a look at those other optimizations.

What I'm seeing here on OS X:

  • Latest Nightly: 8-10 ms
  • Chrome Canary: 5-7 ms
  • Safari 12.1: 2-3 ms

We've come a long way and I don't know if this is worth spending more time on. However the Safari numbers are curious...

Flags: needinfo?(jdemooij) → needinfo?(emilio)

Oh my bad. I was using the original test case. On https://bug920508.bmoattachments.org/attachment.cgi?id=809984 we indeed match Chrome now if I also enable WebRender, about 22-24 ms.

Flags: needinfo?(emilio)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: