Closed
Bug 584494
Opened 14 years ago
Closed 14 years ago
Regression: Transform rotation testcase performs significantly worse
Categories
(Core :: Graphics, defect)
Tracking
()
VERIFIED
FIXED
mozilla2.0b7
Tracking | Status | |
---|---|---|
blocking2.0 | --- | betaN+ |
People
(Reporter: bugzilla, Assigned: roc)
References
()
Details
(Keywords: perf, regression, testcase)
Attachments
(7 files, 4 obsolete files)
5.30 KB,
text/html
|
Details | |
1.01 KB,
patch
|
roc
:
review+
|
Details | Diff | Splinter Review |
16.01 KB,
patch
|
dbaron
:
review+
|
Details | Diff | Splinter Review |
1.52 KB,
patch
|
tnikkel
:
review+
|
Details | Diff | Splinter Review |
3.18 KB,
patch
|
cjones
:
review+
|
Details | Diff | Splinter Review |
19.23 KB,
patch
|
roc
:
review-
|
Details | Diff | Splinter Review |
18.87 KB,
patch
|
roc
:
review+
|
Details | Diff | Splinter Review |
Build ID: Mozilla/5.0 (Windows; Windows NT 6.1; rv:2.0b3pre) Gecko/20100804 Minefield/4.0b3pre
Direct2D enabled.
Steps to reproduce
1. Go to http://www.tapper-ware.net/stable/web.dom.stresstest.transform/
2. Notice that spinning Firefox logo is much slower than latest beta
It performed well in Firefox 4 beta 2, it's now significantly slower
Reporter | ||
Updated•14 years ago
|
Comment 1•14 years ago
|
||
It seems to work pretty well for me on the nightly? How slow is 'significantly slower'? And do you know when this regressed?
Reporter | ||
Comment 2•14 years ago
|
||
It is much slower for me. Beta 2 was very fast and now its quite slow.
Regarding the regression date, where can I find older Firefox builds after beta 2?
Reporter | ||
Comment 3•14 years ago
|
||
Not sure that I looked in the right place, but I tested different builds from here:
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/
And the results:
2010-08-01-04-mozilla-central fast
2010-08-02-04-mozilla-central slow
Reporter | ||
Updated•14 years ago
|
Keywords: regression
Comment 4•14 years ago
|
||
That's the right place. What are the build ids of those two builds (from about:buildconfig)?
Comment 5•14 years ago
|
||
Pending that, a likely regression range: http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2010-08-01+01&enddate=2010-08-02+08
There's a number of painting-related changes in there.
blocking2.0: --- → ?
Keywords: perf
Comment 6•14 years ago
|
||
So it looks like atleast for D2D rendering, doing transformed layers the 'new way' has actually slowed down rendering. Can we see if there's anything that we can do? For what it's worth this will probably go better once we make D2D run through D3D9 layers.
Reporter | ||
Comment 7•14 years ago
|
||
(In reply to comment #4)
> That's the right place. What are the build ids of those two builds (from
> about:buildconfig)?
Fast:
Sun Aug 01 02:32:15 2010 -0700 (at Sun Aug 01 02:32:15 2010 -0700)
Slow:
Mon Aug 02 01:28:10 2010 -0700 (at Mon Aug 02 01:28:10 2010 -0700)
Let me know please if it is possible to narrow this done somehow
Comment 8•14 years ago
|
||
José, I meant to type "about:buildconfig" into the url bar in each build and copy the "Built from" url and paste it in this bug.
Reporter | ||
Comment 9•14 years ago
|
||
Sorry, here we go
Fast:
http://hg.mozilla.org/mozilla-central/rev/070d9d46d88b
Slow:
http://hg.mozilla.org/mozilla-central/rev/a4d86c3a3494
Comment 10•14 years ago
|
||
Exact regression range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=070d9d46d88b&tochange=a4d86c3a3494
Assignee | ||
Comment 11•14 years ago
|
||
I think the immediate thing to do is to back off a bit by making transform layers inactive unless there is an active child layer. That means we'll go back to the old rendering behaviour in most cases, but transformed video etc will perform much better than it used to, especially with acceleration.
Updated•14 years ago
|
blocking2.0: ? → betaN+
Updated•14 years ago
|
Assignee: nobody → roc
Comment 12•14 years ago
|
||
Custom version of the stress test with a fps counter using mozPaintCount.
With this I get (debug build of tip trunk):
Current code: 27 fps
Old code (always Thebes items): 46fps
Forcing inactive layers always: 46fps
Based on this an easy solution would be to revert to the old behaviour in the presence of a basic layer manager where there's no benefit to having this as a separate layer.
I'll attach a patch for this, but I'm happy to use your suggestion roc if you think that's preferable.
Comment 13•14 years ago
|
||
Attachment #463421 -
Flags: review?(roc)
Assignee | ||
Comment 14•14 years ago
|
||
I think making the layer inactive when there's no active child is a better approach. It's more consistent across layer managers, it will work with 3D transforms, and it avoids the code duplication you have here.
Comment 15•14 years ago
|
||
Will tabcandy content have active layers? Accelerating the effects used by tabcandy probably should be a consideration for this.
blocking2.0: betaN+ → ?
Assignee | ||
Comment 16•14 years ago
|
||
Tabcandy uses canvases for the thumbnails, and those are always active at the moment, so yes.
Although I *think* Tabcandy mostly does its scaling by just setting the CSS size of the canvas elements, rather than using CSS tranforms. (Canvas scaling uses layer transforms.)
Reporter | ||
Comment 17•14 years ago
|
||
Was the removal of the betaN+ flag intentional in comment 15?
Updated•14 years ago
|
blocking2.0: ? → betaN+
Comment 18•14 years ago
|
||
(In reply to comment #16)
> Tabcandy uses canvases for the thumbnails, and those are always active at the
> moment, so yes.
>
> Although I *think* Tabcandy mostly does its scaling by just setting the CSS
> size of the canvas elements, rather than using CSS tranforms. (Canvas scaling
> uses layer transforms.)
All correct. We do our scaling by resizing the canvas elements. However, we do use -moz-transform:rotate on those canvases when they go into a stack (http://img.skitch.com/20100806-tysmkn5xr4ffncdw4mnjx5c28j.png).
Comment 19•14 years ago
|
||
Attachment #463421 -
Attachment is obsolete: true
Attachment #463990 -
Flags: review?(roc)
Attachment #463421 -
Flags: review?(roc)
Assignee | ||
Updated•14 years ago
|
Attachment #463990 -
Flags: review?(roc) → review+
Assignee | ||
Comment 20•14 years ago
|
||
I looked into this stress test and figured out why we're repainting a lot.
Currently whenever the transform changes, we have to reflow the transformed frame to get its overflow area updated and so that ancestors can show or hide scrollbars if necessary.
On every transform, the overflow area for the transformed frame does change (because it's rotating), and whenever the overflow area for a transformed frame changes, nsIFrame::FinishAndStoreOverflow invalidates the frame's entire overflow area. See
http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsFrame.cpp#5919
I think I can fix that. However, we hit another invalidation:
http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsAbsoluteContainingBlock.cpp#519
Again the change in overflow areas triggers this.
This code was introduced in bug 389224. I understand what that fix was trying to do, but I need to think about whether there's a better way to do it.
Assignee | ||
Comment 21•14 years ago
|
||
This patch fixes a couple of things.
It backs out most of the patch for bug 389224. Responsibility for invalidating the new overflow rect when 'clip' has changed falls to FinishAndStoreOverflow, which uses a frame state bit to track when a frame had 'clip' but no longer does. (This bit --- and associated code --- will go away when we move to display-list-based invalidation in the future.)
It also changes the treatment of the case where a frame's overflow area changes and the frame has a transform. In that case we do not need to invalidate the ThebesLayer contents of the frame.
I added a reftest for bug 389224 and verified that if I disable the clip state code in this patch, the test fails.
With this patch, running a local copy of that "stress test", the only ThebesLayer invalidations performed are the ones updating the FPS counter. The testcase is running significantly faster.
Attachment #465204 -
Flags: review?(dbaron)
Assignee | ||
Comment 22•14 years ago
|
||
Fixes a bug ... instead of calling Invalidate with INVALIDATE_NO_THEBES_LAYERS, call InvalidateLayer so that if there happens to not be a retained layer, we'll call Invalidate instead.
Attachment #465204 -
Attachment is obsolete: true
Attachment #465484 -
Flags: review?(dbaron)
Attachment #465204 -
Flags: review?(dbaron)
Comment on attachment 465484 [details] [diff] [review]
Avoid invalidating, v2
>+ // changed (in particular, if it grows), we have to repaint the
Maybe worth noting explicitly that growing can happen due to removal of 'clip'?
In fact, the comment doesn't make it clear why checking for why checking
|hasAbsPosClip| is needed. It seems like didHaveAbsPosClip should be
sufficient.
... except the comment is wrong. 'clip' changes trigger only
eChangeHint_ReflowFrame. So I think you do need both checks, and should
fix the comment. (That said, changing 'clip' doesn't really require
reflow either...)
r=dbaron with that, I guess, though at some point I need to understand the layer stuff better...
Attachment #465484 -
Flags: review?(dbaron) → review+
Assignee | ||
Comment 24•14 years ago
|
||
(In reply to comment #23)
> Maybe worth noting explicitly that growing can happen due to removal of 'clip'?
Sure
> In fact, the comment doesn't make it clear why checking for why checking
> |hasAbsPosClip| is needed. It seems like didHaveAbsPosClip should be
> sufficient.
> ... except the comment is wrong. 'clip' changes trigger only
> eChangeHint_ReflowFrame. So I think you do need both checks, and should
> fix the comment. (That said, changing 'clip' doesn't really require
> reflow either...)
It requires reflow to update the overflow rect. But you're right about the rest. I'll fix the comment.
Whiteboard: [needs landing]
Assignee | ||
Comment 25•14 years ago
|
||
In fact, we need to have 'clip' changes trigger eChangeHint_RepaintFrame. We need to ensure that the *old* overflow area is repainted for the case where we're shrinking the clip rect. So we need to make that comment true by adding eChangeHint_RepaintFrame for clip changes.
Assignee | ||
Comment 26•14 years ago
|
||
Changes since the last patch:
-- added another test, for clip shrinking
-- made 'clip' changes nsChangeHint_RepaintFrame as well as reflow
-- made nsBlockFrame::InvalidateInternal not 'clip' the damage rect if the invalidation is targeted at this frame. Otherwise the invalidation done by ApplyRenderingChangeToTree gets clipped by the new 'clip', which is wrong.
Attachment #466218 -
Flags: review?(dbaron)
Assignee | ||
Updated•14 years ago
|
Whiteboard: [needs landing] → [needs review]
Comment on attachment 466218 [details] [diff] [review]
avoid invalidating, v3
Given the nsChangeHint_RepaintFrame, do you still need all the invalidation for clip in FinishAndStoreOverflow? Now I'm wondering if you only need the |hasAbsPosClip| check (which would mean you wouldn't need the state bit). (I'm trying to remember why I didn't come to that conclusion yesterday, though.)
If you want to avoid reindenting much of nsBlockFrame::InvalidateInternal, you could inline the GetStyleDisplay() call inside the parameter list of GetAbsPosClipRect and combine the two ifs with &&.
r=dbaron
Attachment #466218 -
Flags: review?(dbaron) → review+
Assignee | ||
Comment 28•14 years ago
|
||
(In reply to comment #27)
> Given the nsChangeHint_RepaintFrame, do you still need all the invalidation for
> clip in FinishAndStoreOverflow? Now I'm wondering if you only need the
> |hasAbsPosClip| check (which would mean you wouldn't need the state bit). (I'm
> trying to remember why I didn't come to that conclusion yesterday, though.)
In fact, we only need the didHaveAbsPosClip check, so we still need the state bit: the code in FinishAndStoreOverflow needs to run when the overflow area grows, i.e. when the clip is removed.
I'll remove the check of hasAbsPosClip, although it doesn't really matter performance-wise.
Assignee | ||
Updated•14 years ago
|
Whiteboard: [needs review] → [needs landing]
Reporter | ||
Comment 29•14 years ago
|
||
Will this make beta 5?
Assignee | ||
Comment 30•14 years ago
|
||
Yes.
Comment 31•14 years ago
|
||
Is this still gonna make beta 5?
Comment 32•14 years ago
|
||
Hrm, avoid invalidatoin v3 landed, but in my local tests doesn't seem to make this much faster? Should it?
Assignee | ||
Comment 33•14 years ago
|
||
Where did it land? AFAIK it hasn't landed on mozilla-central.
Comment 34•14 years ago
|
||
(In reply to comment #33)
> Where did it land? AFAIK it hasn't landed on mozilla-central.
this? http://hg.mozilla.org/mozilla-central/rev/f6916eb6b678
Assignee | ||
Comment 35•14 years ago
|
||
Ah, right! Thanks!
Bas, are you still seeing us invalidate ThebesLayers in that "stress" test? We'd invalidate for the FPS counter but if you disable that, you shouldn't see ThebesLayer repainting.
Assignee | ||
Updated•14 years ago
|
Whiteboard: [needs landing]
Reporter | ||
Comment 36•14 years ago
|
||
I can confirm that the attached URL (without counter) did not become faster.
Assignee | ||
Comment 37•14 years ago
|
||
Jose, are you testing with D3D on or off?
Assignee | ||
Comment 38•14 years ago
|
||
On my machine, on trunk with D3D off I get about 24fps, with D3D on I get 30fps. With beta2 I get 65fps. Hmm...
Assignee | ||
Comment 39•14 years ago
|
||
The ClipToRegion we do to clip to the visible region when we blit out a ThebesLayer is not necessary if the visible region is the whole layer buffer (which it is in this testcase). This clip can be expensive because it might not be axis-aligned rectangles if the layer is transformed. Indeed, it gets the D2D/BasicLayers fps up to 38. (Doesn't affect D3D of course.)
Attachment #465484 -
Attachment is obsolete: true
Attachment #472550 -
Flags: review?(jones.chris.g)
Assignee | ||
Comment 40•14 years ago
|
||
I noticed we were creating empty nsDisplayTransform items when a transformed frame had nothing to paint inside. Don't do that.
Attachment #472551 -
Flags: review?(tnikkel)
Comment 41•14 years ago
|
||
Comment on attachment 472551 [details] [diff] [review]
Avoid creating transform/opacity wrapper display items for empty lists
Why not do the same for SVG effects just above?
Attachment #472551 -
Flags: review?(tnikkel) → review+
Reporter | ||
Comment 42•14 years ago
|
||
(In reply to comment #37)
> Jose, are you testing with D3D on or off?
I am testing with it on. Latest trunk build Windows 7 64-bit. I get around 14 fps, used to get around 40 if I remember it correctly.
Assignee | ||
Comment 43•14 years ago
|
||
(In reply to comment #41)
> Why not do the same for SVG effects just above?
Because filters can paint stuff even if there's no contents.
Assignee | ||
Comment 44•14 years ago
|
||
I like this version better. We can draw the whole buffer if it's all valid, even if it's not all visible. We should probably also clip if it's cheap to do so, since that could improve performance.
I've also updated the documentation for Layer::SetVisibleRegion to reflect reality. Need to check what D3D is doing here...
Attachment #472550 -
Attachment is obsolete: true
Attachment #472580 -
Flags: review?(jones.chris.g)
Attachment #472550 -
Flags: review?(jones.chris.g)
Updated•14 years ago
|
Attachment #472580 -
Flags: review?(jones.chris.g) → review+
Assignee | ||
Updated•14 years ago
|
Keywords: checkin-needed
Whiteboard: [needs landing]
Comment 45•14 years ago
|
||
I wanted to land this, but I was kind of lost on which patches should be applied and in which order...
Assignee | ||
Comment 46•14 years ago
|
||
Attachments 472551 and 472580 need to land.
http://hg.mozilla.org/mozilla-central/rev/5522a4f458cc
http://hg.mozilla.org/mozilla-central/rev/f0367ec52766
You should probably double-check that I landed the right things, though.
Status: NEW → RESOLVED
Closed: 14 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [needs landing]
Target Milestone: --- → mozilla2.0b7
Assignee | ||
Comment 48•14 years ago
|
||
Looks good, thanks!
Reporter | ||
Comment 49•14 years ago
|
||
Build ID: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b7pre) Gecko/20100918 Firefox/4.0b7pre
This is still not as fast as it used to be. I get around 10 fps when I used to get around 40 fps before this regressed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter | ||
Comment 50•14 years ago
|
||
Retested with the 2010-08-01-04-mozilla-central build and got 50 fps, 500% times faster than latest nightly build (09-19-2010).
Comment 51•14 years ago
|
||
This is true and known. We also have a fairly good understanding of the reasons for this. At the moment I think it's 'too bad', and going forward there's probably some steps that may be taken to improve this.
We should also realize that this particularly testcase is an extremely rare situation in reality though and it would be silly to optimize for it.
Never-the-less it's curious that it only does 10-15 fps and it requires some profiling. In theory all it should be doing is compositing a bunch of layers together. No invalidation need ever occur in theory.
Assignee | ||
Comment 52•14 years ago
|
||
(In reply to comment #51)
> This is true and known. We also have a fairly good understanding of the reasons
> for this.
We do? What are those reasons?
Comment 53•14 years ago
|
||
If we have a single child in a container, and that child has no clip rect applied to it, then we can forward our transform to that child and avoid using an intermediate surface.
Opacity can always be forwarded in the single child case. This patch also adjusts the clipping code to the new behavior (and in passing by fixes a bug in current clipping code).
For me this patch raises FPS to 60 with D2D and 40 without.
Attachment #477167 -
Flags: review?(roc)
Comment 54•14 years ago
|
||
Reftests came back green for my patch.
Assignee | ||
Comment 55•14 years ago
|
||
+ if (!firstChild || (!firstChild->GetNextSibling() &&
+ !firstChild->GetClipRect())) {
Fix indent/wrapping so that !firstChild->GetNextSibling() && !firstChild->GetClipRect() is on the same line
Wouldn't it be simpler to have ContainerLayerD3D9::RenderLayer always include the child layer's mTransform in the aTransform it passes to each child? Ditto for opacity? That would remove code from each child layer type.
Reporter | ||
Comment 56•14 years ago
|
||
Enabling layers.use-d3d10 gives me 50 fps on the testcase. It is around 10 if set to false.
Comment 57•14 years ago
|
||
(In reply to comment #55)
> + if (!firstChild || (!firstChild->GetNextSibling() &&
> + !firstChild->GetClipRect())) {
>
> Fix indent/wrapping so that !firstChild->GetNextSibling() &&
> !firstChild->GetClipRect() is on the same line
Sure, as Mattwoodrow correctly concluded btw, the next clause if (aMatrix.IsIdentity() && !firstChild && !firstChild->GetNextSibling()) { needs a little correction too, I'll upload a patch with that fixed as well.
>
> Wouldn't it be simpler to have ContainerLayerD3D9::RenderLayer always include
> the child layer's mTransform in the aTransform it passes to each child? Ditto
> for opacity? That would remove code from each child layer type.
We could certainly do that, I didn't do it since it felt the functions wouldn't be as nicely isolated. But if you prefer that I'm open to doing that, D3D doesn't wiw.
Assignee | ||
Comment 58•14 years ago
|
||
Matt's patch to give Images their own layer in bug 586683 might help a bit here. Instead of creating one temporary ThebesLayer surface per image and drawing all those surfaces, each with a different transform, we'll have a single underlying image surface and many ImageLayers all drawing from the *same* surface with a different transform.
Depends on: 586683
Assignee | ||
Comment 59•14 years ago
|
||
Comment on attachment 477167 [details] [diff] [review]
Avoid creating intermediate surface
Need a new patch, per comment #57
Attachment #477167 -
Flags: review?(roc) → review-
Comment 60•14 years ago
|
||
bug 586683 doesn't have any noticeable effect on performance. It does lower memory usage by around 80mb though.
bug 599507 takes fps on mac (with accelerated layers) from 15 -> 55 for me.
Assignee | ||
Updated•14 years ago
|
Attachment #484916 -
Flags: review?(roc) → review+
Comment 62•14 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Reporter | ||
Updated•14 years ago
|
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•