Closed Bug 350148 Opened 18 years ago Closed 18 years ago

scrolling regression in gtk1/gtk2 (worse in cairo) not noticed with windows

Categories

(Core :: Layout, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: timeless, Assigned: roc)

References

(Depends on 1 open bug, )

Details

(Keywords: perf, regression)

Attachments

(8 files, 1 obsolete file)

215.28 KB, application/zip
Details
298.43 KB, application/x-bzip
Details
268.02 KB, application/x-bzip
Details
237.39 KB, application/x-bzip
Details
2.96 KB, patch
dbaron
: review+
dbaron
: superreview+
Details | Diff | Splinter Review
548 bytes, text/html
Details
8.16 KB, patch
dbaron
: review+
dbaron
: superreview+
Details | Diff | Splinter Review
8.03 KB, patch
dbaron
: review+
dbaron
: superreview+
Details | Diff | Splinter Review
amorphous steps to reproduce:
load something like gmail and use your mouse wheel to scroll up/down a lot.
pause, waiting for gecko to catch up, this takes a while.

steps to reproduce:
1. load jar:http://webwizardry.net/~timeless/317375-testcase.html.zip!/317375-testcase.html
(there's a .bz2 and a plain, if you'd rather a 1.5 or 35mb file instead of a 2.5mb file)
2. enter one of the following into the urlbar and press enter:
javascript:void (function(){for (var x=0; x<10000; x++)
window.scrollTo(0,x*10)})()
javascript:void (function(){for (var x=0; x<10000; x++)
window.scrollTo(0,document.height/2+(1-2*(x%2))*x*1000)})()

expected results:
if you use windows or a gtk1/gtk2 build from before bug 317375, gecko paints responsively/quickly, doesn't lag and actually looks nice (especially w/ the second js url where you get a heisenberg cat thumb running away from itself).

actual results:
on some systems the system's cpu is pegged and no painting happens until the end which takes longer to reach.

note: the regression window was chased using mozilla.org nightlies, and then the likely candidate (plus all its bustage fixes) was built and that showed the performance difference that was noticed between last year and now. general cairo performance is much worse than noncairo at the moment, so we ignored cairo.

windows doesn't seem to have any problems with this, not sure why.
Flags: blocking1.9?
Are we repainting the whole window after each scroll movement, post 317375, while we weren't before?
that seems to be a reasonable conclusion. note that i'm not in a position to answer this question and bz isn't cc'd, so i'm not sure to whom your question was addressed.
It's easy to see, just turn on paint flashing in a debug GTK2 build; set nglayout.debug.paint_flashing and then press capslock
fullafterjprof.html   -  many scrolls on  www.gmail.com after bug 317375
fullbeforejprof.html  -  many scrolls on  www.gmail.com before bug 317375
shortafterjprof.html  -  short scroll on  www.gmail.com after bug 317375
shortbeforejprof.html -  short scroll on  www.gmail.com before bug 317375
On the testcase I profiled, we only repaint the newly-exposed strip.  We also blit the widget and all (see the control flow in the profile).
We're spending a lot of time in ComputeRepaintRegionForCopy, to see what can and can't be scrolled via bitblt. This now traverses the frame tree where it used to just traverse the view tree.
The testcase contains big tables. For any table intersecting the displayed area, we traverse all the rows to while constructing the display list. We could avoid that using a cursor optimization like the block's line cursor, and that's probably a good thing to do. I don't know if it will be enough to resolve this bug.
Status: NEW → ASSIGNED
Ah, indeed.  The view tree would be smaller by far here.  And the large table thing is basically bug 241796, I think.  I suspect that using a cursor would help some.  Bailing out of the row traversal once we're "far enough down" if we can (e.g. if none of the rows overflow upwards, since unlike row ends row starts are nicely increasing -- cells don't span up) would help the rest of the way.
Depends on: 241796
(In reply to comment #12)
> Bailing out of the row traversal once we're "far enough down" if we
> can (e.g. if none of the rows overflow upwards, since unlike row ends row
> starts are nicely increasing -- cells don't span up) would help the rest of the
> way.

That'll be included in the cursor optimization one way or another.
(In reply to comment #11)
> The testcase contains big tables. For any table intersecting the displayed

Exactly testcase it is Gmail.com mail inbox, without chats, with 100 emails on page, Full screen.
If everything is actually visible, table row cursor optimizations won't help :-(
But then I guess you wouldn't be scrolling...
I've implemented the table row cursor but it doesn't seem to help at all :-(
(In reply to comment #7)
> Created an attachment (id=235669) [edit]
> Updated after Jprof more hit count,  JP_RTC_HZ=8192

This is a "big scroll", right? so each scroll operation repaints the entire window because the visible areas before and after scrolling don't overlap?
If that's the case, then I'm confused, because ComputeRepaintRegionForCopy should not be showing up in the profile as anything significant, because it should be really fast when the incoming rectangle, the intersection of the areas visible before and after, is empty...
the original problem involves using a mouse wheel or similar device, i.e. you're never scrolling a full page, you always have some slice from the previous view onscreen. some of my js testcases didn't honor that faithfully.

note that 100 email "rows" (6 lines per message because of wrapping) will never fit on a 420px high screen if you expect to read it (10pt font) and have a couple of rows for a header (say 250px) and a footer (250px).

My guess is that there are about 36 screens (this is based on a shipping product that isn't gecko) so something like 15120px tall.

i'm not sure what normal jitter is, but it's probably 10-60px. The most common user input does not involve using scrollbars, so you'd *never* get full painting, at best you'd painting 95% fresh and having to deal w/ 5% that moved from bottom to top or top to bottom.
Okay, so only the "small scroll" test is really relevant. And it's where we do trigger the extra analysis pass.

I may be able to somehow combine the analysis pass with the construction of the display list for actual painting, into a single display list construction. It would add complexity which is why I didn't do it initially.
OK, so bonsai breaks up its output into tables with 20-70 rows in them.  So the testcase in the URL bar doesn't benefit that much from the table cursor (well, sometimes we only have to walk 10 rows instead of 50... but sometimes not, since a lot of rows can fit on screen at once, and given that bug 240275 is fixed the simple act of walking rows is not too bad, since we do nothing for them).
bz: well, gmail only has 100 rows which isn't really that much more than 50, and there's only one such table. it's just very hard to use gmail as a testcase, and it seemed like bonsai emulated gmail tolerably well.
Did bonsai regress too, in your testing?
I have tried to investigate what function is slowly....

Way:
nsViewManager::CanScrollWithBitBlt->
nsLayoutUtils::ComputeRepaintRegionForCopy->
AddItemsToRegion->
clip.IntersectRect(aClipRect, NS_STATIC_CAST(nsDisplayClip*, item)->GetClipRect());

All this parts and calls can be commented, and I can't see any visible issues after disabling functionality on this way.

also I have tried  to insert after
http://lxr.mozilla.org/mozilla/source/layout/base/nsLayoutUtils.cpp#751:
....
clip.height = 0;
....

It scrolling also works very fast without visible issues...

May be ROc can explain what problem here?
ALso i have check height values before and after Intersect function:

nsRect::IntersectRectTest: height = 0, temp = 13515, y = 0
nsRect::IntersectRectTest: after height = 13515
So if you just comment out the call from ComputeRepaintRegionForCopy to AddItemsToRegion, the problem goes away? That is very interesting.
Actually I don't believe that because AddItemsToRegion has only 5 hits in bug317375_short_rtc/shortafterjprof.html
(In reply to comment #28)
> Actually I don't believe that because AddItemsToRegion has only 5 hits in
> bug317375_short_rtc/shortafterjprof.html
> 

Problem not in hits on this functions, ... I think problem in wrong calculated height.... and paint actions that try to paint this calculations....

see
........................
also I have tried  to insert after
http://lxr.mozilla.org/mozilla/source/layout/base/nsLayoutUtils.cpp#751:
....
clip.height = 0;
....
It scrolling also works very fast without visible issues...
.........................

When height = 0 all works fast, when it = 13515 then slow...

Okay. Now that I've finally got around to checking, it turns out that on trunk we're repainting the whole window when I scroll gmail. I think that's the problem, and I should be able to fix it.

That bonsai testcase was a total red herring, guys.
This fixes the problem at hand. It's very simple, we were concluding that Gmail's all-enclosing IFRAME had a border which needed to be painted, the border rect intersects the area being scrolled, but isn't being scrolled itself, so ComputeRepaintRegionForCopy says "need to repaint everything".

In fact there is no border on that IFRAME. The border-style is set to something other than 'none', but the width is zero. The solution is trivial, just check the border width instead of the style. Other code ensures that when the style is none, the computed border width is zero.
Attachment #238361 - Flags: superreview?(dbaron)
Attachment #238361 - Flags: review?(dbaron)
Would:

  return GetStyleBorder()->GetBorder() != nsMargin(0, 0, 0, 0);

be better?  If not really, should probably use NS_FOR_CSS_SIDES here instead of manual for loop.
(In reply to comment #31)
> Created an attachment (id=238361) [edit]
> fix border visibility check

I have tested this patch, it works some faster, but not so fast as with clip.height = 0; or with commented lines ...

May be something else tries to redraw there?.


(In reply to comment #32)
> Would:
> 
>   return GetStyleBorder()->GetBorder() != nsMargin(0, 0, 0, 0);
> 
> be better?

Yes.
(In reply to comment #33)
> (In reply to comment #31)
> > Created an attachment (id=238361) [edit]
> > fix border visibility check
> 
> I have tested this patch, it works some faster, but not so fast as with
> clip.height = 0; or with commented lines ...
> 
> May be something else tries to redraw there?.

Enable paint flashing and see if there's any painting difference. I'm not sure why there would be.

I do have some ideas for to make things faster still, I'll get to that tomorrow.
Comment on attachment 238361 [details] [diff] [review]
fix border visibility check

It'd probably be good to use NS_FOR_CSS_SIDES for the for-loop.

r+sr=dbaron
Attachment #238361 - Flags: superreview?(dbaron)
Attachment #238361 - Flags: superreview+
Attachment #238361 - Flags: review?(dbaron)
Attachment #238361 - Flags: review+
I'll just go with bz's "return GetStyleBorder()->GetBorder() != nsMargin(0, 0, 0, 0);"
If gmail did use a border on its IFRAME, the previous patch would of course be defeated. This testcase tests such a situation. Adding an outline would also cause the same sort of problem.
This patch fixes that last testcase. We observe that when the visible region's bounds are wholly contained inside the inner edge of a border or outline, the border or outline is not in fact visible. Then when we do our scroll analysis, such a border or outline will not be considered when we look for non-moving but visible content that could interfere with the bitblt.
Attachment #238457 - Flags: superreview?(dbaron)
Attachment #238457 - Flags: review?(dbaron)
We should take both patches.
Here's another easy optimization that will help a lot on all kinds of pages. When we build the display list for scroll analysis, we descend into the moving frame and its children only to look for content whose rendering depends on position and therefore can't be safely bitblitted. Currently this is only background-attachment:fixed content. We can keep a flag in the prescontext to record whether we've ever rendered any such content in this prescontext; if we haven't, there is no need to descend into the moving frame when building a display list for scroll analysis. This will hugely speed up ComputeRepaintRegionForCopy in cases when background-attachment:fixed is not used and we don't have to descend to find the placeholders for visible positioned elements ... typically most of the content visible in the area to be scrolled is in fact the scrolled element and its contents, so skipping traversal of that is a big win.

It slightly refactors code in BuildDisplayListForChild to share the test of NS_FRAME_FORCE_DISPLAY_LIST_DESCEND_INTO, and in the process fixes what I think is a possible bug where we might fail to render positioned elements whose placeholders are in something visibility:collapsed.
Attachment #238475 - Flags: superreview?(dbaron)
Attachment #238475 - Flags: review?(dbaron)
Altogether I hope these patches will put this bug to rest.

But I still don't understand why hacking AddItemsToRegion is making a difference after the first patch, and that worries me. In my tests, after the first patch, AddItemsToRegion isn't adding anything to the region, and it's not taking much time itself, so changing it in any way shouldn't be helping.
(In reply to comment #42)
> Altogether I hope these patches will put this bug to rest.
> 
> But I still don't understand why hacking AddItemsToRegion is making a
> difference after the first patch, and that worries me. In my tests, after the
> first patch, AddItemsToRegion isn't adding anything to the region, and it's not
> taking much time itself, so changing it in any way shouldn't be helping.
> 

I have tested all pathes together ... works very good, I think we can close this bug after applying all pathes...

PS: May be  first patch can be implemented as
nsFrame.h:
...........
inline PRBool HasBorder... {
 return GetStyleBorder()->GetBorder() != nsMargin(0, 0, 0, 0);
}
........
?


For the HasBorder thing, btw, do we want to have a static nsMargin that we compare to so that we don't have to keep creating nsMargins?
I hope it all just gets inlined away to a few compares with zero.
sorry. bonsai was slow and used to be faster, no one likes using google mail for testcases becuase it's too dynamic, so i tried something that seemed similar. i knew there was a risk it was iframe related :(.
The last patch will help the bonsai testcase as well, I hope.
Comment on attachment 238457 [details] [diff] [review]
fix borders and outlines interfering with scrolled children

This seems reasonable, although two ideas that you can take or leave:

* it might be nicer for these methods to call the base-class method, i.e., begin with:

if (!nsDisplayItem::OptimizeVisibility(...))
  return PR_FALSE;

and then do the one bit that they add before returning true (especially if you need to add to the base class method in the future).  That said, what you did is *probably* faster...


* I think you might be bailing too easily on the handling of border/outline-radius and especially outline-offset.  Inflating by the outline-offset should be trivial; deflating by the largest or larger-applicable border/outline-radius shouldn't be too hard either.
Attachment #238457 - Flags: superreview?(dbaron)
Attachment #238457 - Flags: superreview+
Attachment #238457 - Flags: review?(dbaron)
Attachment #238457 - Flags: review+
Comment on attachment 238475 [details] [diff] [review]
prevent descending into the scrolled frame during BuildDisplayList for analysis, in most cases

You need to initialize mRenderedPositionVaryingContent to PR_FALSE in nsPresContext's constructor.

Why is NS_STYLE_VISIBILITY_COLLAPSE treated specially?  (I know it's just code you're moving.)  It normally (i.e., for most frame types) means the same thing as hidden.  My initial intuition is that that test should be removed entirely.

Where's the test that this code is used for building the display list when doing scroll analysis?  Or is it used more generally?
(In reply to comment #49)
> (From update of attachment 238475 [details] [diff] [review] [edit])
> You need to initialize mRenderedPositionVaryingContent to PR_FALSE in
> nsPresContext's constructor.

It's NS_DECL_AND_IMPL_ZEROING_OPERATOR_NEW so I don't think I do.

> Why is NS_STYLE_VISIBILITY_COLLAPSE treated specially?  (I know it's just code
> you're moving.)  It normally (i.e., for most frame types) means the same thing
> as hidden.  My initial intuition is that that test should be removed entirely.

Will do.

> Where's the test that this code is used for building the display list when
> doing scroll analysis?  Or is it used more generally?

When we're not doing scroll analysis, "if (aBuilder->GetRootMovingFrame() == aChild" will fail because GetRootMovingFrame() returns nsnull.
I'll explain the latter with a comment there and in nsDisplayList.h for GetRootMovingFrame.
Would this optimization prevent using BitBlt if the moving frame has a solid color background, since you never actually look at its background?  (I'm assuming the moving frame would be the scrolled frame inside a scrollframe.  Is that correct?)
Except I suppose with the current CSS spec, the moving frame would never have a background at all; it would be the scrollframe that has the background.  I suppose I should look at how the scroll analysis works, since it doesn't currently make sense to me.
Comment on attachment 238475 [details] [diff] [review]
prevent descending into the scrolled frame during BuildDisplayList for analysis, in most cases

OK, this makes a little more sense after reading the comments in ComputeRepaintRegionForCopy.

r+sr=dbaron with the visibility:collapse check removed (are there any other similar checks elsewhere?)
Attachment #238475 - Flags: superreview?(dbaron)
Attachment #238475 - Flags: superreview+
Attachment #238475 - Flags: review?(dbaron)
Attachment #238475 - Flags: review+
(In reply to comment #53)
> Except I suppose with the current CSS spec, the moving frame would never have
> a background at all; it would be the scrollframe that has the background.

Right.

You're right, this could be a problem if we used ComputeRepaintRegionForCopy more generally (like if we used it to accelerate positioning changes). But at least it would be a problem of performance, not correctness. I could fix it by changing
+    if (aBuilder->GetRootMovingFrame() == aChild 
to
+    if (aBuilder->GetRootMovingFrame() == this 

Let me know if that's OK.

(In reply to comment #54)
> (From update of attachment 238475 [details] [diff] [review] [edit])
> OK, this makes a little more sense after reading the comments in
> ComputeRepaintRegionForCopy.
> 
> r+sr=dbaron with the visibility:collapse check removed (are there any other
> similar checks elsewhere?)

Yes, here:
http://lxr.mozilla.org/mozilla/source/layout/generic/nsFrame.cpp#1152

I'll remove that too.
(In reply to comment #48)
> (From update of attachment 238457 [details] [diff] [review] [edit])
> This seems reasonable, although two ideas that you can take or leave:
> 
> * it might be nicer for these methods to call the base-class method, i.e.,
> begin with:
> 
> if (!nsDisplayItem::OptimizeVisibility(...))
>   return PR_FALSE;
> 
> and then do the one bit that they add before returning true (especially if you
> need to add to the base class method in the future).  That said, what you did
> is *probably* faster...

I'll do this.

> * I think you might be bailing too easily on the handling of
> border/outline-radius and especially outline-offset.  Inflating by the
> outline-offset should be trivial; deflating by the largest or larger-applicable
> border/outline-radius shouldn't be too hard either.

I don't want to add additional code to handle these very rare cases, so I'll skip this.
> Let me know if that's OK.

I took the liberty of doing that.

All patches checked in.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Flags: blocking1.9? → in-testsuite?
Depends on: 451526
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: