Closed Bug 713532 Opened 13 years ago Closed 1 year ago

http://hakim.se/experiments/css/domtree/ is very slow in Firefox but fast in Chrome

Categories

(Core :: Graphics, defect)

defect

Tracking

()

RESOLVED DUPLICATE of bug 1493359
mozilla12

People

(Reporter: pcwalton, Unassigned)

References

()

Details

Attachments

(3 files, 1 obsolete file)

http://hakim.se/experiments/css/domtree/ is very smooth in Chrome but is 1-2 fps in Firefox. I see a lot of calls to widget and Cairo in the profile, so it looks like we aren't layerizing enough.
Crappy on Windows w/ D2D as well.
OS: Mac OS X → All
Hardware: x86 → All
Chrome renders it incorrectly. Notice that the radio buttons have a white square background in Chrome, which shouldn't be there. In Firefox the radio buttons are correctly transparent outside the button. That may have something to do with it...
Did a profile on Windows.

4% reresolving style.
7% doing reflows. That would be improved by bug 524925.
At least 83% painting:
  5% building display lists
  9% optimizing display lists
  12% building layer tree
    mostly in nsIntRegion::Or called by ProcessDisplayItems, that needs more investigation
  48% in EndTransaction
    46% in ThebesLayerD3D10::DrawRegion
      22% in nsNativeThemeWin::DrawWidgetBackground, which does a whole bunch of temporary surface stuff, especially filling them, plus actual theme drawing. Surprisingly less than 1% in actual alpha recovery computation.
      11% flushing while destroying gfxContexts. Probably because we have a lot of layers and they're small so flushing for each layer really hurts
  At least 9% in the Nvidia driver doing who knows what

Maybe some of the layer invalidation is unnecessary and could be avoided, but a lot of the widgets in the tree are animated so must be rerendered.

OMTC with animation would really help here since the animation would be smooth, even if the widget contents updated at a lower rate.
Bas, I wonder if the D2D flush we do in ThebesLayerD3D10::DrawRegion destroying the gfxContext(s) could be deferred somehow? Seems like we don't actually need the flush to complete until we actually reach RenderLayer?
Many of the layers use component alpha, which is dumb since we're transforming those layers in ways that will mess up the assumptions behind component alpha. Basically we should not use component alpha when the effective transform is not just a translation. That would make the content look better and speed things up.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #4)
> Bas, I wonder if the D2D flush we do in ThebesLayerD3D10::DrawRegion
> destroying the gfxContext(s) could be deferred somehow?

Bas explained that this wouldn't help. There's no way to avoid the D2D flush work being performed on the thread where we call it.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #2)
> Chrome renders it incorrectly.

That was on Windows BTW. Mac might well be different since Quartz can draw to transparent surfaces sanely.

Some of the performance issues here are likely to be platform-specific.
Attachment #584916 - Flags: review?(tnikkel) → review+
Attached patch part 1 v2Splinter Review
I decided to redo this patch as an explicit control for subpixel AA. It seems clearer. Also, only disable subpixel AA for descendants of transformed *retained* layers. Temporary nonretained layers don't need to disable subpixel AA since we won't use transformed intermediate buffers for them in general.
Attachment #584916 - Attachment is obsolete: true
Attachment #584921 - Flags: review?(tnikkel)
Attachment #584921 - Flags: review?(tnikkel) → review+
This fixes the layer construction hotspot that I mentioned in comment #3.
Attachment #584947 - Flags: review?(tnikkel)
I noticed that we're frequently repainting layers containing only form elements such as checkboxes and textfields that aren't animated so shouldn't need to be repainted. I'll look into that next.
That's happening because when the transform for the checkbox etc changes, its overflow area relative to its parent block changes, which changes the line overflow area, which nsBlockFrame::ReflowLine uses as a cue to invalidate the frame subtree for all the frames on the line...

Bug 524925 would avoid that. We need to focus on getting that finished and landed, then revisit this to see how we're doing. Patches 1 and 3 at least are still good though.
Depends on: 524925
Comment on attachment 584947 [details] [diff] [review]
Part 3: use SimplifyOutward to avoid regions getting too complex

>     /**
>      * The region containing the bounds of all display items in the layer,
>      * regardless of visbility.
>      * Same coordinate system as mVisibleRegion.
>+     * This is a conservative approximation: it contains the true region.
>      */
>     nsIntRegion  mDrawRegion;

This comment is on mDrawRegion not mDrawAboveRegion.

So this patch could have negative consequences in ContainerState::FindOpaqueBackgroundColorFor where if we fluff out the region we may fail to find an opaque background color and in ContainerState::FindThebesLayerFor where we may create more thebes layers due to a fluffed out region. Do we think this is a reasonable trade off to take?
(In reply to Timothy Nikkel (:tn) from comment #14)
> Comment on attachment 584947 [details] [diff] [review]
> Part 3: use SimplifyOutward to avoid regions getting too complex
> 
> >     /**
> >      * The region containing the bounds of all display items in the layer,
> >      * regardless of visbility.
> >      * Same coordinate system as mVisibleRegion.
> >+     * This is a conservative approximation: it contains the true region.
> >      */
> >     nsIntRegion  mDrawRegion;
> 
> This comment is on mDrawRegion not mDrawAboveRegion.

I'll move it.

> So this patch could have negative consequences in
> ContainerState::FindOpaqueBackgroundColorFor where if we fluff out the
> region we may fail to find an opaque background color and in
> ContainerState::FindThebesLayerFor where we may create more thebes layers
> due to a fluffed out region. Do we think this is a reasonable trade off to
> take?

We have to take it, since otherwise we have O(N^2) behavior in ProcessDisplayItems.
Attachment #584947 - Flags: review?(tnikkel) → review+
Landed the two reviewed patches:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2ddeb63ac2d9
https://hg.mozilla.org/integration/mozilla-inbound/rev/27cc07ec8b88
Marked an UNEXPECTED-PASS test as passing:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7c9a9e02b08c

This bug is not fixed yet, please don't close after merging inbound.
Whiteboard: [don't close after merging inbound]
This is a lot faster now that bug 524925 has landed. It still slows down a lot when you move the mouse in the window.

Off-main-thread CSS animation is probably the next big thing needed to make this smoother.
Depends on: 706179
Depends on: 724197
Firefox uses a full core but it's smooth and pretty.
Chrome uses almost 2 cores, and the radio button has a white box background... very ugly.
I'm on win7 HWA ON.
Assignee: roc → nobody
Status: ASSIGNED → NEW
Severity: normal → S3

Bug 1493359 has more relevant dependents, so I'm closing this one.

Status: NEW → RESOLVED
Closed: 1 year ago
Duplicate of bug: 1493359
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: