Optimize scrolling some more

RESOLVED FIXED in mozilla1.9.2b1

Status

()

defect
P2
normal
RESOLVED FIXED
10 years ago
10 years ago

People

(Reporter: roc, Assigned: roc)

Tracking

({verified1.9.2})

Trunk
mozilla1.9.2b1
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9.2 +

Firefox Tracking Flags

(status1.9.2 beta1-fixed)

Details

Attachments

(9 attachments, 4 obsolete attachments)

789 bytes, text/html
Details
63.21 KB, patch
Details | Diff | Splinter Review
6.19 KB, patch
dbaron
: review+
Details | Diff | Splinter Review
60.04 KB, patch
Details | Diff | Splinter Review
49.24 KB, text/html
Details
61.83 KB, patch
Details | Diff | Splinter Review
747 bytes, text/html
Details
63.51 KB, patch
dbaron
: review+
Details | Diff | Splinter Review
4.31 KB, patch
Details | Diff | Splinter Review
Currently when we scroll an element we divide its area into two regions: a region that can must be repainted, and the rest. Then we pick the largest rectangle in the rectangle list defining "the rest", and blit that, then repaint the part of that rectangle that was scrolled in from outside, and also repaint everything outside that rectangle in the scrolled area.

We can improve this in two ways:
-- Support blitting of a complex region, not just a rectangle, so platforms that support it don't need to restrict blitting to the largest rectangle of the moving region
-- Identify parts of the scrolled area that don't need to be repainted or blitted because they're opaque content that's not moving, and don't repaint them

So I propose extending nsLayoutUtils::ComputeRepaintRegionForCopy to return two regions: aRepaintRegion, the region that must be repainted, and aBlitRegion, the region that should be copied. Then we extend nsIWidget::Scroll with a region parameter defining the region to blit. Then we hook these up in nsScrollPortView.

The challenging part is to make the resulting code understandable. Testability would also be good, but that's also difficult.
I think Fennec needs this.
Flags: blocking1.9.2?
Posted patch WIP (obsolete) — Splinter Review
This seems to generate correct rectangles in at least some cases, but gdk_window_move_region doesn't seem to be working for me. It seems to blit the bounding-box of the region.
Posted file testcase
This testcase doesn't work. Scrolling to the top and then pressing the "Down" button shows in the debugger that we've generated the right parameters to nsWindow::Scroll, but at least on GTK2, we have the gdk_window_move_region problem.
Posted patch WIP v2 (obsolete) — Splinter Review
Fixes some pretty bad bugs in the cross-platform nsScrollPortView code and some issues in the Mac code. This seems to work well on Mac.
Attachment #393676 - Attachment is obsolete: true
Posted patch WIP v3Splinter Review
Karl determined that there is a GDK bug here which prevents gdk_window_move_region from working correctly with complex regions.

This version of the patch works around it by doing one gdk_window_move_region call per rectangle. It seems to work pretty well.

I just need to build and test the Windows code now.
Attachment #393712 - Attachment is obsolete: true
For the record, the bug in GDK (version 2.16.5) was that _gdk_x11_window_move_region uses GDK_GC_XGC when it should have been using GDK_GC_GET_XGC.  Client-side windows has landed on GDK 2.17.x, resulting in much of this being completely reorganized.  It looks like the move_region bug is fixed in git.

These GDK changes may give us other problems as it looks like moves may not happen until gdk_window_process_updates() is called (and some other situations), but it doesn't seem right to optimize for a development version of GDK yet as I don't know the timeline there and things may change further.
This gets in the way of the work I want to do in this bug. In particular, content positioned over scrolling content appears to be *under* the nsDisplaySummary for the moving content's outline list, which means common cases where we want to use optimizations don't work. We need to descend into the moving frame subtree to see if there are really any outlines etc.
Attachment #393936 - Flags: review?(dbaron)
Traversing the visible frames every time we scroll the window is a bit annoying, but I don't think it will be a performance issue; building display lists is fast and hardly ever shows up on profiles.
-- Add aSaveVisibleRegionAboveMovingContent feature to SetMovingFrame
-- Factor out code to form SubtractFromVisibleRegion
-- Rename aAreaRect to aUpdateRect in AccumulateItemInRegion for clarity
-- Rename aRect to aUpdateRect in AddItemsToRegion for clarity
-- In AddItemsToRegion, compute a region to blit as well as a region to repaint. This lets us leave a region that should be neither blitted nor repainted. The region to blit is everything moving that's visible under non-moving content (visibleRegionOfMovingContent) minus the area that's going to be repainted. We also restrict the area to repaint to be inside visibleRegionOfMovingContent.
-- Change nsIWidget::Scroll to take a list of rectangles to copy. These rectangles are guaranteed to be safe to copy in order.
-- Mac, Windows and GTK implementations of the new API. They all just copy rectangles one by one.
-- Change nsTreeBodyFrame for the new Scroll API.
-- Change nsViewManager::CanScrollWithBitBlt to GetRegionsForBlit, since we're always going to try to scroll with blitting.
-- nsScrollPortView::Scroll is basically reimplemented. We always take the "try to blit" path when we have a widget (that's not transparent). The only tricky bit here is building the list of device pixel rectangles. First we have to shrink the appunits blit region into the region of device pixels it contains, in ConvertToInnerPixelRegion. Then we sort the rectangles into a safe order for copying, in SortBlitRectsForCopy. The code for the latter is nice and small, but the proof that this approach is correct is surprisingly subtle (surprising to me, anyway) (especially for diagonal scrolling). See http://weblogs.mozillazine.org/roc/archives/2009/08/homework_answer.html
Attachment #393941 - Flags: review?(dbaron)
Whiteboard: [needs review]
I'd like to be able to test this automatically but that will require new infrastructure, and I'm open to suggestions for the best way to do that.
Posted patch Part 2 v2 (obsolete) — Splinter Review
Forgot to qref...
Attachment #393941 - Attachment is obsolete: true
Attachment #393942 - Flags: review?(dbaron)
Attachment #393941 - Flags: review?(dbaron)
Would be interesting to be able to test this w/ something along the lines of Taras' Tpan (I forget what he calls the extension), combined w/ reference images.  Tpan already does perf testing.
Part 2 has a bug ... when we descend into an nsDisplayClip we clip the visible region that gets passed down. So if the first moving content is in an nsDisplayClip we return a bogus visible region through mSaveVisibleRegionAboveMovingContent. I'm not yet sure what the best thing is to do here...
Posted patch Part 2 v3Splinter Review
This fixes the problem by going all-out and accumulating the actual region of visible moving content into mSaveVisibleRegionAboveMovingContent. This enables additional optimizations as well, but there may be some cost to display list construction on complex pages. I'm looking into it...
Attachment #393942 - Attachment is obsolete: true
Attachment #393942 - Flags: review?(dbaron)
Posted file testcase
Worst-ish-case testcase to stress display list analysis.

Profiling my debug build with Shark, we spend 30% of the time painting and 60-ish% of the time in nsLayoutUtils::ComputeRepaintRegionForCopy (only about 1% of the time actually in nsChildView::Scroll).

13.2% of the time is in BuildDisplayListForStackingContext (i.e., actually collecting the display list). 44.2% of the time is in OptimizeVisibility. 40.2% in CheckVisibleRegionOfMovingContent. 39.7% in that Or operation that combines the current visible region into the accumulated region of visible moving content. I think I can improve that a fair bit...
One thing that'd be neat to do is to reuse the display list constructed for scrolling for painting as well, but probably not in this patch...
Posted patch Part 2 v4Splinter Review
This improves things a lot on the worst-case testcase. We compute an approximation to the region of visible moving content, by accumulating a bounding rect of moving content in nsDisplayList::OptimizeVisibility, and only intersect with the visible region at the end.

There's significantly more that could be done to speed up the display list analysis, including reducing 2 display list constructions to 1 as I mentioned above, but I think this is probably good enough. This reduces OptimizeVisibility during ComputeRepaintRegionForCopy to 10% of the profile.
Attachment #393996 - Flags: review?(dbaron)
On my Mac debug build, with a 540x1033 content area, we get about 45 frames per second in the testcase in comment #15.
(In reply to comment #10)
> I'd like to be able to test this automatically but that will require new
> infrastructure, and I'm open to suggestions for the best way to do that.

I've been watching this bug and asking around for ideas on how best to test this.  I notice your test case is all about performance, and this is a bug on "optimizing", so that's obviously important.  

I liked getting something like Tara's Tpan test into the main talos as Crowder suggested in comment 12.  Ideally, we'd have a Tpan and a set of "hard to scroll/render" pages to run in that framework.

The other half of this is of course testing for correctness that the rendering is performed properly when we scroll: without flicker, superimposed stationary elements remain stationary etc.  I'm not sure how to test that either.  I've been trying to think if there are any sort of lightweight hooks or events that we could put into the code to make determining the "correctness" of a scroll operation.  Since several scroll tests have histories of being [orange] tests it'd be awesome to have a better way to test this section too.  Any thoughts on this type of approach?

The best way I can think to do correctness is to do some carefully hand crafted reftests, but for mochitests, I'm really drawing a blank.
Comment on attachment 393936 [details] [diff] [review]
Part 1: remove nsDisplaySummary optimization

r=dbaron
Attachment #393936 - Flags: review?(dbaron) → review+
I filed bug 510110 on extending MozAfterPaint to make scrolling more testable.

I'm leaving town soon for a week's holiday and I don't have time to land this today. Anyone else is welcome to land it.
Depends on: 510110
Keywords: checkin-needed
Whiteboard: [needs review] → [needs landing]
I haven't even reviewed the main patch yet.  That r=dbaron was for part 1.
Hmm. Oh well, we can still land part 1 :-)
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P2
There's a problem with the approach in part 2, demonstrated in this testcase. The problem is that when we scan the display list to find visible moving content that needs to be blitted, we don't catch stuff that has been scrolled out of view. In this testcase, we end up deciding not to blit anything at all.

I'm going to have to find a more conservative approach.
Posted patch Part 2 v5Splinter Review
This version fixes that problem by making nsDisplayList responsible for computing the visible region of moving content both in the after position and in the before position.

To make this work we have to suppress the elision of nsDisplayClip items around empty lists, because those nsDisplayClip items might be hiding moving content that's been scrolled out of view, and we need to add possible visible regions of moving content based on them.
Comment on attachment 394258 [details] [diff] [review]
Part 2 v5

>+   * and then repaint aRepaintRegion; then the area aUpdateRect will be

s/repaint/repaints/
s/;/,/

>    * d) except that if the same display list element is visible in b) and c)
>    * for a frame that did not move and paints a uniform color within its
>    * bounds, then the intersection of its old and new bounds can be excluded
>    * when it is processed by b) and c).

Shouldn't this be removed, now that it's covered by the definition of
the "visible moving area"?

>+   * We may return a larger region for aRepaintRegion and/or aBlitRegion
>+   * if computing the above regions precisely is too expensive.

Will it still be true that they never intersect?  i.e., you're really 
computing a larger area for the "visible moving area" or for aRepaintRegion, 
and then aBlitRegion is still their exact difference?
(In reply to comment #26)
> (From update of attachment 394258 [details] [diff] [review])
> >    * d) except that if the same display list element is visible in b) and c)
> >    * for a frame that did not move and paints a uniform color within its
> >    * bounds, then the intersection of its old and new bounds can be excluded
> >    * when it is processed by b) and c).
> 
> Shouldn't this be removed, now that it's covered by the definition of
> the "visible moving area"?

Yes, you're right.

> >+   * We may return a larger region for aRepaintRegion and/or aBlitRegion
> >+   * if computing the above regions precisely is too expensive.
> 
> Will it still be true that they never intersect?  i.e., you're really 
> computing a larger area for the "visible moving area" or for aRepaintRegion, 
> and then aBlitRegion is still their exact difference?

The latter statement is true, and thus they will never intersect.

I ran this latest patch through tryserver and tests passed (except for I think a random failure on Windows).
Comment on attachment 394258 [details] [diff] [review]
Part 2 v5

I wonder if ConvertToInnerPixelRegion is too pessimistic, given that 
we're snapping most edges anyway.  But I suppose we can worry about that
later.

ConvertToInnerPixelRegion's comments could also be clearer that 
aRepaintRegion is in-out (although it only gets added to), and that the
return value is the device-pixel region resulting from snapping 
aBlitRegion in to the nearest pixels.



It seems like it's probably worth optimizing SortBlitRectsForCopy for 
the common cases of horizontal and vertical scrolling, but that could be 
a followup.  In particular, the most common case is probably vertical 
scrolling, and the sort call doesn't do anything at all useful for that
case.

>       // We're going to bit-blit.  Let the viewmanager know so it can
>       // adjust dirty regions appropriately.
>       mViewManager->WillBitBlit(this, aTwipsDelta);

It looks like the current code for this always does union, so things 
should still be safe (if, say, the dirty regions are inside something 
opaque that's not scrolling).

r=dbaron

I'll try to address the easier comments here and then land it; I think 
the rest can be addressed later.
Attachment #394258 - Flags: review+
The common case is also that there is only one rectangle, and due to the way we simplify the blit region while building it in the display list code, the blit region can never be extremely complex, so I don't think it's worth optimizing SortBlitRectsForCopy for a large number of rectangles.
Here's an interdiff for the review comments that I'm addressing on landing.

I was thinking I'd land this separately, but I don't think it's worth it, and I'll just roll it in to the main patch.
http://hg.mozilla.org/mozilla-central/rev/29df79466189
http://hg.mozilla.org/mozilla-central/rev/e6034ded61fd
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [needs landing] → [needs 1.9.2 landing]
Target Milestone: --- → mozilla1.9.3
Scrolling improvement in DHTML Linux at least (assuming 509889 wasn't the cause).

e0cc032cb852:
NOISE: _x_x_mozilla_page_load,741.4705882352941,NaN,NaN
NOISE: _x_x_mozilla_page_load_details,avgmedian|741.4705882352941|average|740.91|minimum|NaN|maximum|NaN|stddev|NaN
NOISE: |i|pagename|median|mean|min|max|runs|
NOISE: |0;colorfade.html;1296;1295;1290;1307;1295;1297;1298;1290;1307
NOISE: |1;diagball.html;1551;1551.5;1544;1561;1544;1556;1546;1561;1560
NOISE: |2;fadespacing.html;2086.5;2087.75;2077;2109;2109;2077;2077;2096;2101
NOISE: |3;imageslide.html;372;372.25;367;385;367;378;385;373;371
NOISE: |4;layers1.html;287.5;287.25;284;316;316;286;290;289;284
NOISE: |5;layers2.html;10;10.25;10;11;11;10;10;11;10
NOISE: |6;layers4.html;8.5;8.5;8;9;8;9;9;8;9
NOISE: |7;layers5.html;250.5;249.75;245;254;254;252;245;253;249
NOISE: |8;layers6.html;22;22;22;23;22;22;22;23;22
NOISE: |9;meter.html;1074;1075;1071;1082;1081;1075;1082;1073;1071
NOISE: |10;movingtext.html;730.5;729.75;719;739;739;739;719;733;728
NOISE: |11;mozilla.html;2611.5;2614.5;2606;2630;2612;2611;2630;2629;2606
NOISE: |12;replaceimages.html;367.5;361.75;343;372;369;372;369;366;343
NOISE: |13;scrolling.html;731;730.75;725;750;727;735;725;750;736
NOISE: |14;slidein.html;2398;2394.5;2377;2417;2403;2393;2405;2377;2417
NOISE: |15;slidingballs.html;289.5;291.25;288;299;299;289;290;298;288
NOISE: |16;textslide.html;665.5;663.75;655;671;655;665;669;671;666
NOISE: |17;zoom.html;465;464.5;461;477;464;477;467;461;466

e6034ded61fd:
NOISE: |i|pagename|median|mean|min|max|runs|
NOISE: |0;colorfade.html;1296.5;1296;1293;1300;1297;1293;1300;1298;1296
NOISE: |1;diagball.html;1553.5;1552.75;1548;1562;1562;1548;1554;1556;1553
NOISE: |2;fadespacing.html;2084.5;2082.25;2060;2107;2107;2060;2100;2080;2089
NOISE: |3;imageslide.html;375;375;370;384;370;380;375;375;384
NOISE: |4;layers1.html;300;299.75;291;312;296;304;312;308;291
NOISE: |5;layers2.html;10;10;10;10;10;10;10;10;10
NOISE: |6;layers4.html;8;8;8;9;8;8;9;8;8
NOISE: |7;layers5.html;234.5;232.75;226;239;233;236;239;226;236
NOISE: |8;layers6.html;22;21.75;21;22;22;22;22;21;22
NOISE: |9;meter.html;1073;1074;1067;1086;1073;1067;1083;1073;1086
NOISE: |10;movingtext.html;742;741;737;748;741;743;737;748;743
NOISE: |11;mozilla.html;2609.5;2609.25;2600;2619;2609;2619;2610;2600;2618
NOISE: |12;replaceimages.html;360;362.25;358;373;360;360;371;358;373
NOISE: |13;scrolling.html;639;638.75;631;649;637;649;641;631;646
NOISE: |14;slidein.html;2390;2390.5;2380;2405;2402;2387;2405;2380;2393
NOISE: |15;slidingballs.html;297.5;292.75;272;316;272;304;303;316;292
NOISE: |16;textslide.html;658;657.5;655;663;663;659;657;655;659
NOISE: |17;zoom.html;469.5;469.75;465;479;465;475;470;469;479
Blocks: 505665
Keywords: checkin-needed
Depends on: 510656
Depends on: 510856
That's weird because scrolling.html doesn't exercise our scrolling code.
verified with Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.2a2pre) Gecko/20090911 Namoroka/3.6a2pre
Keywords: verified1.9.2
You need to log in before you can comment on or make changes to this bug.