Closed Bug 584494 Opened 14 years ago Closed 14 years ago

Regression: Transform rotation testcase performs significantly worse

Categories

(Core :: Graphics, defect)

x86
Windows 7
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla2.0b7
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: bugzilla, Assigned: roc)

References

()

Details

(Keywords: perf, regression, testcase)

Attachments

(7 files, 4 obsolete files)

5.30 KB, text/html
Details
1.01 KB, patch
roc
: review+
Details | Diff | Splinter Review
16.01 KB, patch
dbaron
: review+
Details | Diff | Splinter Review
1.52 KB, patch
tnikkel
: review+
Details | Diff | Splinter Review
3.18 KB, patch
cjones
: review+
Details | Diff | Splinter Review
19.23 KB, patch
roc
: review-
Details | Diff | Splinter Review
18.87 KB, patch
roc
: review+
Details | Diff | Splinter Review
Build ID: Mozilla/5.0 (Windows; Windows NT 6.1; rv:2.0b3pre) Gecko/20100804 Minefield/4.0b3pre

Direct2D enabled.

Steps to reproduce
1. Go to http://www.tapper-ware.net/stable/web.dom.stresstest.transform/
2. Notice that spinning Firefox logo is much slower than latest beta

It performed well in Firefox 4 beta 2, it's now significantly slower
It seems to work pretty well for me on the nightly? How slow is 'significantly slower'? And do you know when this regressed?
It is much slower for me. Beta 2 was very fast and now its quite slow.

Regarding the regression date, where can I find older Firefox builds after beta 2?
Not sure that I looked in the right place, but I tested different builds from here:
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/

And the results:
2010-08-01-04-mozilla-central	fast
2010-08-02-04-mozilla-central	slow
Keywords: regression
That's the right place.  What are the build ids of those two builds (from about:buildconfig)?
Pending that, a likely regression range: http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2010-08-01+01&enddate=2010-08-02+08

There's a number of painting-related changes in there.
blocking2.0: --- → ?
Keywords: perf
So it looks like atleast for D2D rendering, doing transformed layers the 'new way' has actually slowed down rendering. Can we see if there's anything that we can do? For what it's worth this will probably go better once we make D2D run through D3D9 layers.
(In reply to comment #4)
> That's the right place.  What are the build ids of those two builds (from
> about:buildconfig)?

Fast:
Sun Aug 01 02:32:15 2010 -0700 (at Sun Aug 01 02:32:15 2010 -0700)

Slow:
Mon Aug 02 01:28:10 2010 -0700 (at Mon Aug 02 01:28:10 2010 -0700)

Let me know please if it is possible to narrow this done somehow
José, I meant to type "about:buildconfig" into the url bar in each build and copy the "Built from" url and paste it in this bug.
I think the immediate thing to do is to back off a bit by making transform layers inactive unless there is an active child layer. That means we'll go back to the old rendering behaviour in most cases, but transformed video etc will perform much better than it used to, especially with acceleration.
blocking2.0: ? → betaN+
Assignee: nobody → roc
Custom version of the stress test with a fps counter using mozPaintCount.

With this I get (debug build of tip trunk):

Current code: 27 fps
Old code (always Thebes items): 46fps
Forcing inactive layers always: 46fps

Based on this an easy solution would be to revert to the old behaviour in the presence of a basic layer manager where there's no benefit to having this as a separate layer.

I'll attach a patch for this, but I'm happy to use your suggestion roc if you think that's preferable.
Attached patch Proposed fix (obsolete) — Splinter Review
Attachment #463421 - Flags: review?(roc)
I think making the layer inactive when there's no active child is a better approach. It's more consistent across layer managers, it will work with 3D transforms, and it avoids the code duplication you have here.
Will tabcandy content have active layers? Accelerating the effects used by tabcandy probably should be a consideration for this.
blocking2.0: betaN+ → ?
Tabcandy uses canvases for the thumbnails, and those are always active at the moment, so yes.

Although I *think* Tabcandy mostly does its scaling by just setting the CSS size of the canvas elements, rather than using CSS tranforms. (Canvas scaling uses layer transforms.)
Was the removal of the betaN+ flag intentional in comment 15?
blocking2.0: ? → betaN+
(In reply to comment #16)
> Tabcandy uses canvases for the thumbnails, and those are always active at the
> moment, so yes.
> 
> Although I *think* Tabcandy mostly does its scaling by just setting the CSS
> size of the canvas elements, rather than using CSS tranforms. (Canvas scaling
> uses layer transforms.)

All correct. We do our scaling by resizing the canvas elements. However, we do use -moz-transform:rotate on those canvases when they go into a stack (http://img.skitch.com/20100806-tysmkn5xr4ffncdw4mnjx5c28j.png).
Attachment #463421 - Attachment is obsolete: true
Attachment #463990 - Flags: review?(roc)
Attachment #463421 - Flags: review?(roc)
Attachment #463990 - Flags: review?(roc) → review+
I looked into this stress test and figured out why we're repainting a lot.

Currently whenever the transform changes, we have to reflow the transformed frame to get its overflow area updated and so that ancestors can show or hide scrollbars if necessary.

On every transform, the overflow area for the transformed frame does change (because it's rotating), and whenever the overflow area for a transformed frame changes, nsIFrame::FinishAndStoreOverflow invalidates the frame's entire overflow area. See
http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsFrame.cpp#5919

I think I can fix that. However, we hit another invalidation:
http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsAbsoluteContainingBlock.cpp#519
Again the change in overflow areas triggers this.

This code was introduced in bug 389224. I understand what that fix was trying to do, but I need to think about whether there's a better way to do it.
This patch fixes a couple of things.

It backs out most of the patch for bug 389224. Responsibility for invalidating the new overflow rect when 'clip' has changed falls to FinishAndStoreOverflow, which uses a frame state bit to track when a frame had 'clip' but no longer does. (This bit --- and associated code --- will go away when we move to display-list-based invalidation in the future.)

It also changes the treatment of the case where a frame's overflow area changes and the frame has a transform. In that case we do not need to invalidate the ThebesLayer contents of the frame.

I added a reftest for bug 389224 and verified that if I disable the clip state code in this patch, the test fails.

With this patch, running a local copy of that "stress test", the only ThebesLayer invalidations performed are the ones updating the FPS counter. The testcase is running significantly faster.
Attachment #465204 - Flags: review?(dbaron)
Attached patch Avoid invalidating, v2 (obsolete) — Splinter Review
Fixes a bug ... instead of calling Invalidate with INVALIDATE_NO_THEBES_LAYERS, call InvalidateLayer so that if there happens to not be a retained layer, we'll call Invalidate instead.
Attachment #465204 - Attachment is obsolete: true
Attachment #465484 - Flags: review?(dbaron)
Attachment #465204 - Flags: review?(dbaron)
Comment on attachment 465484 [details] [diff] [review]
Avoid invalidating, v2

>+      // changed (in particular, if it grows), we have to repaint the

Maybe worth noting explicitly that growing can happen due to removal of 'clip'?

In fact, the comment doesn't make it clear why checking for why checking 
|hasAbsPosClip| is needed.  It seems like didHaveAbsPosClip should be
sufficient.
... except the comment is wrong.  'clip' changes trigger only
eChangeHint_ReflowFrame.  So I think you do need both checks, and should
fix the comment.  (That said, changing 'clip' doesn't really require
reflow either...)

r=dbaron with that, I guess, though at some point I need to understand the layer stuff better...
Attachment #465484 - Flags: review?(dbaron) → review+
(In reply to comment #23)
> Maybe worth noting explicitly that growing can happen due to removal of 'clip'?

Sure

> In fact, the comment doesn't make it clear why checking for why checking 
> |hasAbsPosClip| is needed.  It seems like didHaveAbsPosClip should be
> sufficient.
> ... except the comment is wrong.  'clip' changes trigger only
> eChangeHint_ReflowFrame.  So I think you do need both checks, and should
> fix the comment.  (That said, changing 'clip' doesn't really require
> reflow either...)

It requires reflow to update the overflow rect. But you're right about the rest. I'll fix the comment.
Whiteboard: [needs landing]
In fact, we need to have 'clip' changes trigger eChangeHint_RepaintFrame. We need to ensure that the *old* overflow area is repainted for the case where we're shrinking the clip rect. So we need to make that comment true by adding eChangeHint_RepaintFrame for clip changes.
Changes since the last patch:
-- added another test, for clip shrinking
-- made 'clip' changes nsChangeHint_RepaintFrame as well as reflow
-- made nsBlockFrame::InvalidateInternal not 'clip' the damage rect if the invalidation is targeted at this frame. Otherwise the invalidation done by ApplyRenderingChangeToTree gets clipped by the new 'clip', which is wrong.
Attachment #466218 - Flags: review?(dbaron)
Whiteboard: [needs landing] → [needs review]
Comment on attachment 466218 [details] [diff] [review]
avoid invalidating, v3

Given the nsChangeHint_RepaintFrame, do you still need all the invalidation for clip in FinishAndStoreOverflow?  Now I'm wondering if you only need the |hasAbsPosClip| check (which would mean you wouldn't need the state bit).  (I'm trying to remember why I didn't come to that conclusion yesterday, though.)

If you want to avoid reindenting much of nsBlockFrame::InvalidateInternal, you could inline the GetStyleDisplay() call inside the parameter list of GetAbsPosClipRect and combine the two ifs with &&.

r=dbaron
Attachment #466218 - Flags: review?(dbaron) → review+
(In reply to comment #27)
> Given the nsChangeHint_RepaintFrame, do you still need all the invalidation for
> clip in FinishAndStoreOverflow?  Now I'm wondering if you only need the
> |hasAbsPosClip| check (which would mean you wouldn't need the state bit).  (I'm
> trying to remember why I didn't come to that conclusion yesterday, though.)

In fact, we only need the didHaveAbsPosClip check, so we still need the state bit: the code in FinishAndStoreOverflow needs to run when the overflow area grows, i.e. when the clip is removed.

I'll remove the check of hasAbsPosClip, although it doesn't really matter performance-wise.
Whiteboard: [needs review] → [needs landing]
Will this make beta 5?
Is this still gonna make beta 5?
Hrm, avoid invalidatoin v3 landed, but in my local tests doesn't seem to make this much faster? Should it?
Where did it land? AFAIK it hasn't landed on mozilla-central.
(In reply to comment #33)
> Where did it land? AFAIK it hasn't landed on mozilla-central.

this? http://hg.mozilla.org/mozilla-central/rev/f6916eb6b678
Ah, right! Thanks!

Bas, are you still seeing us invalidate ThebesLayers in that "stress" test? We'd invalidate for the FPS counter but if you disable that, you shouldn't see ThebesLayer repainting.
Whiteboard: [needs landing]
I can confirm that the attached URL (without counter) did not become faster.
Jose, are you testing with D3D on or off?
On my machine, on trunk with D3D off I get about 24fps, with D3D on I get 30fps. With beta2 I get 65fps. Hmm...
Attached patch Avoid clipping when possible (obsolete) — Splinter Review
The ClipToRegion we do to clip to the visible region when we blit out a ThebesLayer is not necessary if the visible region is the whole layer buffer (which it is in this testcase). This clip can be expensive because it might not be axis-aligned rectangles if the layer is transformed. Indeed, it gets the D2D/BasicLayers fps up to 38. (Doesn't affect D3D of course.)
Attachment #465484 - Attachment is obsolete: true
Attachment #472550 - Flags: review?(jones.chris.g)
I noticed we were creating empty nsDisplayTransform items when a transformed frame had nothing to paint inside. Don't do that.
Attachment #472551 - Flags: review?(tnikkel)
Comment on attachment 472551 [details] [diff] [review]
Avoid creating transform/opacity wrapper display items for empty lists

Why not do the same for SVG effects just above?
Attachment #472551 - Flags: review?(tnikkel) → review+
(In reply to comment #37)
> Jose, are you testing with D3D on or off?

I am testing with it on. Latest trunk build Windows 7 64-bit. I get around 14 fps, used to get around 40 if I remember it correctly.
(In reply to comment #41)
> Why not do the same for SVG effects just above?

Because filters can paint stuff even if there's no contents.
I like this version better. We can draw the whole buffer if it's all valid, even if it's not all visible. We should probably also clip if it's cheap to do so, since that could improve performance.

I've also updated the documentation for Layer::SetVisibleRegion to reflect reality. Need to check what D3D is doing here...
Attachment #472550 - Attachment is obsolete: true
Attachment #472580 - Flags: review?(jones.chris.g)
Attachment #472550 - Flags: review?(jones.chris.g)
Attachment #472580 - Flags: review?(jones.chris.g) → review+
Keywords: checkin-needed
Whiteboard: [needs landing]
I wanted to land this, but I was kind of lost on which patches should be applied and in which order...
Attachments 472551 and 472580 need to land.
http://hg.mozilla.org/mozilla-central/rev/5522a4f458cc
http://hg.mozilla.org/mozilla-central/rev/f0367ec52766

You should probably double-check that I landed the right things, though.
Status: NEW → RESOLVED
Closed: 14 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [needs landing]
Target Milestone: --- → mozilla2.0b7
Build ID: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b7pre) Gecko/20100918 Firefox/4.0b7pre

This is still not as fast as it used to be. I get around 10 fps when I used to get around 40 fps before this regressed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Retested with the 2010-08-01-04-mozilla-central build and got 50 fps, 500% times faster than latest nightly build (09-19-2010).
This is true and known. We also have a fairly good understanding of the reasons for this. At the moment I think it's 'too bad', and going forward there's probably some steps that may be taken to improve this.

We should also realize that this particularly testcase is an extremely rare situation in reality though and it would be silly to optimize for it.

Never-the-less it's curious that it only does 10-15 fps and it requires some profiling. In theory all it should be doing is compositing a bunch of layers together. No invalidation need ever occur in theory.
(In reply to comment #51)
> This is true and known. We also have a fairly good understanding of the reasons
> for this.

We do? What are those reasons?
If we have a single child in a container, and that child has no clip rect applied to it, then we can forward our transform to that child and avoid using an intermediate surface.

Opacity can always be forwarded in the single child case. This patch also adjusts the clipping code to the new behavior (and in passing by fixes a bug in current clipping code).

For me this patch raises FPS to 60 with D2D and 40 without.
Attachment #477167 - Flags: review?(roc)
Reftests came back green for my patch.
+  if (!firstChild || (!firstChild->GetNextSibling() &&
+      !firstChild->GetClipRect())) {

Fix indent/wrapping so that !firstChild->GetNextSibling() && !firstChild->GetClipRect() is on the same line

Wouldn't it be simpler to have ContainerLayerD3D9::RenderLayer always include the child layer's mTransform in the aTransform it passes to each child? Ditto for opacity? That would remove code from each child layer type.
Keywords: testcase
Enabling layers.use-d3d10 gives me 50 fps on the testcase. It is around 10 if set to false.
(In reply to comment #55)
> +  if (!firstChild || (!firstChild->GetNextSibling() &&
> +      !firstChild->GetClipRect())) {
> 
> Fix indent/wrapping so that !firstChild->GetNextSibling() &&
> !firstChild->GetClipRect() is on the same line

Sure, as Mattwoodrow correctly concluded btw, the next clause if (aMatrix.IsIdentity() && !firstChild && !firstChild->GetNextSibling()) { needs a little correction too, I'll upload a patch with that fixed as well.

> 
> Wouldn't it be simpler to have ContainerLayerD3D9::RenderLayer always include
> the child layer's mTransform in the aTransform it passes to each child? Ditto
> for opacity? That would remove code from each child layer type.

We could certainly do that, I didn't do it since it felt the functions wouldn't be as nicely isolated. But if you prefer that I'm open to doing that, D3D doesn't wiw.
Matt's patch to give Images their own layer in bug 586683 might help a bit here. Instead of creating one temporary ThebesLayer surface per image and drawing all those surfaces, each with a different transform, we'll have a single underlying image surface and many ImageLayers all drawing from the *same* surface with a different transform.
Depends on: 586683
Comment on attachment 477167 [details] [diff] [review]
Avoid creating intermediate surface

Need a new patch, per comment #57
Attachment #477167 - Flags: review?(roc) → review-
bug 586683 doesn't have any noticeable effect on performance. It does lower memory usage by around 80mb though.

bug 599507 takes fps on mac (with accelerated layers) from 15 -> 55 for me.
Updated to be correct.
Attachment #484916 - Flags: review?(roc)
Attachment #484916 - Flags: review?(roc) → review+
http://hg.mozilla.org/mozilla-central/rev/73f8c0079bc1
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
Depends on: 614651
Depends on: 618556
Depends on: 691225
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: