Closed Bug 483417 Opened 12 years ago Closed 12 years ago

optimize CanvasBrowser region handling

Categories

(Firefox for Android Graveyard :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: vlad, Assigned: vlad)

Details

Attachments

(2 files, 4 obsolete files)

No description provided.
Attached patch optimize canvasbrowser (obsolete) — Splinter Review
Er, hit enter too soon.

CanvasBrowser is inefficient in how it handles update regions and drawWindow calls.  On MozAfterPaint, for each rectangle, _redrawRect was being called which would immediately call flushRect because we weren't loading.  In many DHTML cases, we'll get MozAfterPaint with a bunch of rects in it because many elements are moving, as in bubblemark.

This patch changes things around so that _redrawRect becomes _redrawRects, and it takes an array of rects instead of a single rect.  Then those rects are added to the region, and drawWindow can get a better set of update rects from the region.

Then the code that does the actual drawWindow call can examine the list of rects and decide whether it's more efficient to just do one draw call.  This patch does a simple pixels-based test; if the number of pixels to be drawn by drawing each update rect is 90% or more than the number of pixels in the total update bounds, then we just draw the update bounds again.  This percentage could be tweaked, but should help us avoid per-drawWindow overhead when we have multiple update regions while still handling the case of disjoint update rects in the region well.

Canvas will still invalidate on each drawRect (the entire canvas, though better with bug 483412).  A better/future optimization would be to have a way to pause invalidations until all the drawWindows are finished, and then invalidate the final region at the end.  This avoids some weirdness in gtk's update handling, where it will often process the first invalidate quickly, whereas subsequent ones will get batched.

Another additional change here is switching from setInterval to setTimeout; setInterval has some rather crappy code that tries to "catch up" if it falls behind in the number of events that should be fired, so if we ever have a delay longer than the setInterval time, it could cause multiple invocations of the function to be triggered in quick succession.  The change here just uses setTimeout, and continues setting setTimeout again while the page is loading or until the timeout is canceled.
Attachment #367400 - Flags: review?
Attachment #367400 - Flags: review? → review?(gavin.sharp)
tracking-fennec: --- → ?
Comment on attachment 367400 [details] [diff] [review]
optimize canvasbrowser

Bah, stupid bug in here, will update in a min.
Attachment #367400 - Attachment is obsolete: true
Attachment #367400 - Flags: review?(gavin.sharp)
Attached patch fixed (obsolete) — Splinter Review
Fixed patch.
Attachment #367411 - Flags: review?(gavin.sharp)
(In reply to comment #1)
> Created an attachment (id=367400) [details]
> optimize canvasbrowser
> 
> Er, hit enter too soon.
> 
> CanvasBrowser is inefficient in how it handles update regions and drawWindow
> calls.  On MozAfterPaint, for each rectangle, _redrawRect was being called
> which would immediately call flushRect because we weren't loading.  In many
> DHTML cases, we'll get MozAfterPaint with a bunch of rects in it because many
> elements are moving, as in bubblemark.

Yes
> 
> This patch changes things around so that _redrawRect becomes _redrawRects, and
> it takes an array of rects instead of a single rect.  Then those rects are
> added to the region, and drawWindow can get a better set of update rects from
> the region.

Yes. and I have a tiny patch that does that.

> 
> Then the code that does the actual drawWindow call can examine the list of
> rects and decide whether it's more efficient to just do one draw call.  This
> patch does a simple pixels-based test; if the number of pixels to be drawn by
> drawing each update rect is 90% or more than the number of pixels in the total
> update bounds, then we just draw the update bounds again.  This percentage
> could be tweaked, but should help us avoid per-drawWindow overhead when we have
> multiple update regions while still handling the case of disjoint update rects
> in the region well.

That's something I've been pondering too, but haven't had much luck getting mozafterpaints involving multiple rects(bubblemark does it ~ 1/10 times here). 

> 
> Canvas will still invalidate on each drawRect (the entire canvas, though better
> with bug 483412).  A better/future optimization would be to have a way to pause
> invalidations until all the drawWindows are finished, and then invalidate the
> final region at the end.  This avoids some weirdness in gtk's update handling,
> where it will often process the first invalidate quickly, whereas subsequent
> ones will get batched.

Now that would be an awesome opt. May I suggest hooking into save/restore as a starting point? All canvas code I've seen batches draws in that fashion, so it might be a reasonable thing to do with a pref(esp for mobile).

> 
> Another additional change here is switching from setInterval to setTimeout;
> setInterval has some rather crappy code that tries to "catch up" if it falls
> behind in the number of events that should be fired, so if we ever have a delay
> longer than the setInterval time, it could cause multiple invocations of the
> function to be triggered in quick succession.  The change here just uses
> setTimeout, and continues setting setTimeout again while the page is loading or
> until the timeout is canceled.

Cool, didn't know about that.

So to summarize:
1. This patch does mozafterpaint batching(can be done already with _isPanning), but forces the common case of a single mozafter paint to go through an array. 
2. Adding the rect-unioning is an interesting , do you have perf numbers on that? A potential problem is that region likes to return a list of rectnagles where 4 or so rectangles have height/width of only a few pixels.

3. setTimeout change sounds like a good idea
4. I've done the paint batching opt and it did reduce stuttering in bubblemark(ie when the multirect mozafterpaint happens), but I also got a much better net(around 10%) win out of removing wsRect/etc allocation out of the drawing path and getting rid of redundant xpconnect calls. This patch complicates the path from event to final draw a fair bit, I disapprove.
Attachment #367411 - Flags: review-
(In reply to comment #4)
> Now that would be an awesome opt. May I suggest hooking into save/restore as a
> starting point? All canvas code I've seen batches draws in that fashion, so it
> might be a reasonable thing to do with a pref(esp for mobile).

Can't; there's no requirement that you call restore() at the end of your drawing, since it's just there for state and not for batching.  I would probably just add startUpdateBatch() and endUpdateBatch() calls, though.

> 4. I've done the paint batching opt and it did reduce stuttering in
> bubblemark(ie when the multirect mozafterpaint happens), but I also got a much
> better net(around 10%) win out of removing wsRect/etc allocation out of the
> drawing path and getting rid of redundant xpconnect calls. This patch
> complicates the path from event to final draw a fair bit, I disapprove.

Hrm.  I think this actually simplifies the drawing path a good amount :)  It just changes where loops go, instead of looping over rectangles multiple times; and it just moves addToRegion into redrawRects because it was only called from that one spot.

I get a pretty significant speedup with this, much bigger than 10% than without it, but it should be possible to remove wsRect etc. allocations if they're causing that big of a slowdown (especially if they're taking up 10% of painting time!)  It's worth seeing if plain arrays are faster, with some helper functions to work on them maybe, or we should just yell at the JS guys to fix it.
Here are some numbers..

With the code as written/patched here (bubblemark, 1:1 zoom):

+ mozAfterPaint: prep 18ms
+ redrawRects prep 3ms ( 9 real rects)
+ flushRegion prep: 5ms
+ flushRegion drawWindows: 18ms
+ flushRegion total: 26ms

the mozAfterPaint "prep" work is the time to build the rects array, drawWindows is the time it takes to do all the drawWindows.  In this case (and in most cases where there were a lot of update rects), the 9 rects were coalesced into 1.

Without the rect-coalescing, that is doing N separate drawWindow calls:

+ mozAfterPaint: prep 12ms
+ redrawRects prep 3ms ( 7 real rects)
+ flushRegion prep: 3ms
+ flushRegion drawWindows: 36ms
+ flushRegion total: 40ms

+ mozAfterPaint: prep 17ms
+ redrawRects prep 3ms ( 10 real rects)
+ flushRegion prep: 5ms
+ flushRegion drawWindows: 56ms
+ flushRegion total: 63ms

So the coalescing is clearly a good perf boost.  But the mozAfterPaint prep time is still way too expensive, so let's look at that... it does a loop where it grabs a rect from the region, creates a wsRect using it and some math, and pushes it to an array.  There's a lot of xpconnect going on here.

My first thought was that getting the rect was expensive:

+ mozAfterPaint: iterating clientRects: 8ms
+ mozAfterPaint: prep 16ms
+ redrawRects prep 3ms ( 9 real rects)

So half the time is spent just calling item().  But there's still some other time happening here, calling wsRect with the math.  So here's doing the math but not creating a wsRect:

+ mozAfterPaint: iterating clientRects+math: 14ms
+ mozAfterPaint: prep 16ms
+ redrawRects prep 3ms ( 8 real rects)

Ok, so the majority of the time is actually calling item() and doing the math.  This isn't surprising -- the math is (e.left + cwin.scrollX, e.top + cwin.scrollY, e.width, e.height).  All of those are xpconnect getters.  scrollX and scrollY are constants though, so no need to grab them in the loop.. timing just how long it takes to get them:

+ mozAfterPaint: iterating clientRects+math(sxsy): 10ms
+ mozAfterPaint: prep 11ms
+ redrawRects prep 3ms ( 7 real rects)

So just those are pretty expensive (that's calling item() and doing k = cwin.scrollX + cwin.scrollY).  A simple opt is pulling out the scrollX/scrollY out of the loop and just using a local variable for them.  This gives us:

+ mozAfterPaint: prep(c) 12ms
+ redrawRects prep 3ms ( 10 real rects)

So down to 12ms for 10 rects vs 17ms earlier just by pulling those two variables out.  Also, the time it takes to create 100 wsRects and push them to an array is around 3-4ms (wsRect(0,0,0,0)).  The right place to look for additional speedups here is to get rid of the xpconnect overhead in dealing with regions, either by rewriting to relevant bits of code in JS or experimenting with quickstubs.

I'll post an updated patch that pulls out the constant scrollX/scrollY values out of the loop as well.
Attached patch updated (obsolete) — Splinter Review
Updated with puling the constant getters out of the loop in mozAfterPaint.
Attachment #367411 - Attachment is obsolete: true
Attachment #367489 - Flags: review?(gavin.sharp)
Attachment #367411 - Flags: review?(gavin.sharp)
Comment on attachment 367489 [details] [diff] [review]
updated

>     for (let i = 0; i < aEvent.clientRects.length; i++) {
>       let e = aEvent.clientRects.item(i);
.clientRects is also an xpconnect call.

I'll look at it more monday
Right, all that stuff is heavy xpconnect.  Pulling out aEvent.clientRects into a variable would probably also yield a small win, but it sucks that we need to deal with xpconnect as much as we do here at all.  Hopefully quickstubs can win here.
quickstubs are really fast indeed. Can you hold off on pushing this until I inspect it some more?
I like this approach, but if an r+ comes soon, should we hold off pushing this until beta 1 is released or branched?
Comment on attachment 367489 [details] [diff] [review]
updated

>+    if (drawls.length > 1 &&
>+        pixelsInRegion > (updateBounds.width * updateBounds.height * 0.90))
>+    {
>+      drawls = [updateBounds];
>+    }

This does work well for painting. Except that moving subtractRect out of the draw loop means we call subtractRect too much now.

>+      if (this._pageLoading)  {
>+        // if this rect would push us beyond our known size to the bottom or right,
>+        // maybe redo zoom-to-page
>+        // XXX why bottom > 0 and not bottom > _visibleBounds.bottom?

bottom > 0 means that the rect is on the page, instead of some magical thing above the page.I was getting a lot of bottom < 0 mozafterpaints while writing this.
>+    function resizeAndPaint(self) {
>+      if (self._maybeZoomToPage) {
>+        self.zoomToPage();
>+        self._maybeZoomToPage = false;
>+      }
>+      // draw visible area..freeze during pans
>+      if (!self._isPanning)
>+        self.flushRegion(true);
>+
>+      if (self._pageLoading) {
>+        // kick ourselves off 1s later while we're still loading
>+        self._drawTimeout = setTimeout(resizeAndPaint, 1000, self);

This really should be 2s. resizeAndPaint can trigger additional reflows by checking for page sizes during zooming, some of which take >1second. I don't see any benefit of drawing faster than that. I do realize that now it's (draw time + 1s), but i think (draw time + 2s) is fine too.

>+      } else {
>+        self._drawTimeout = 0;
>+      }
>+    }
>+
>+    let flushNow = !this._pageLoading;
>+
>+    // page is loading, and we don't have an existing draw here.  Note that
>+    // setTimeout is used (and reset in resizeAndPaint)
>+    if (this._pageLoading && !this._drawInterval) {
>+      //always flush the first draw
>+      flushNow = true;
>+      this._maybeZoomToPage = true;
>+      this._drawTimeout = setTimeout(resizeAndPaint, 2000, this);
>+    }

This causes a motherload of runaway settimeouts during slow pageloads. s/drawInterval/drawTimeout/
Attachment #367489 - Flags: review-
Attached patch updated (obsolete) — Splinter Review
Updated --

- fixed drawInterval thing
- moved subtractRect back into loop, after the optimization
- hoisted aEvent.clientRects into variable outside of loop in mozAfterPaint
- fixed 1000ms -> 2000ms
- did maxBottom/maxRight thing for deciding whether to zoom a newly-loading page
Attachment #367489 - Attachment is obsolete: true
Attachment #367616 - Flags: superreview?(gavin.sharp)
Attachment #367616 - Flags: review?
Attachment #367489 - Flags: review?(gavin.sharp)
Attachment #367616 - Flags: review? → review?(tglek)
Comment on attachment 367616 [details] [diff] [review]
updated

>diff --git a/chrome/content/CanvasBrowser.js b/chrome/content/CanvasBrowser.js
>--- a/chrome/content/CanvasBrowser.js
>+++ b/chrome/content/CanvasBrowser.js
>@@ -60,14 +60,22 @@ CanvasBrowser.prototype = {
>   // used to force paints to not be delayed during panning, otherwise things
>   // some things seem to get drawn at the wrong offset, not sure why
>   _isPanning: false,
>-  
>+
>+  // the max right/bottom coords we saw from paint events
>+  // while we were loading a page.  If we see something that's bigger than
>+  // either, we'll trigger a page zoom.
>+  _maxRight: 0,
>+  _maxBottom: 0,
>+
>+  _drawTimeout: 0,
^ No real need for newlines here

>+    let rects = [];
>+    for (let i = 0; i < clientRects.length; i++) {

Should loop backwards here to avoid extra length calls.
Attachment #367616 - Flags: review?(tglek) → review+
Attached patch more updatesSplinter Review
Ok, some more updates.

- loops backwards starting from length-1
- "while I was here" moved a few things into the constructor to avoid possibly hard-to-track-down brokenness if we ever have more than one CanvasBrowser
- added some more comments and newlines :)
Attachment #367616 - Attachment is obsolete: true
Attachment #367616 - Flags: superreview?(gavin.sharp)
I took out floor/ceil where they didn't do anything(ie _redrawRects rounds too) and changed post-blit region subtraction code to floor
Attachment #367635 - Flags: review?(vladimir)
Attachment #367635 - Flags: review?(vladimir) → review+
Comment on attachment 367635 [details] [diff] [review]
short-circuit drawing for visible stuff

Looks fine, though I don't think this is quite correct:

>-    rgn.subtractRect(dx, dy, cWidth, cHeight);
>+    // ceil because dx/dy are not always integers
>+    rgn.subtractRect(Math.ceil(dx), Math.ceil(dy), cWidth, cHeight);

It's no worse than the current code is though (although floor() would be more accurate to what it's doing now), but it really wants a roundOut() type of operation on the entire rectangle I think).  I'd just put a comment to that effect in here.  As is it could result in 1px-wide seams here and there when things are being repainted.
Checked in my patch -- taras, is your followon ready to go?  You mentioned something about changing the ceil/etc. pieces on irc yesterday.
pushed http://hg.mozilla.org/mobile-browser/rev/dc47de886765
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
tracking-fennec: ? → ---
You need to log in before you can comment on or make changes to this bug.