Last Comment Bug 701623 - Using dirty rects for repainting causes corruption
: Using dirty rects for repainting causes corruption
Status: RESOLVED FIXED
:
Product: Firefox for Android
Classification: Client Software
Component: General (show other bugs)
: unspecified
: ARM Android
: P1 normal (vote)
: ---
Assigned To: Patrick Walton (:pcwalton)
:
Mentors:
Depends on:
Blocks: native_droid_panning
  Show dependency treegraph
 
Reported: 2011-11-10 21:57 PST by Patrick Walton (:pcwalton)
Modified: 2012-01-09 14:57 PST (History)
7 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed
11+


Attachments
Proposed patch. (46.08 KB, patch)
2011-11-15 22:54 PST, Patrick Walton (:pcwalton)
chrislord.net: review-
Details | Diff | Splinter Review

Description Patrick Walton (:pcwalton) 2011-11-10 21:57:50 PST
See cnn.com; the header gets duplicated.
Comment 1 Patrick Walton (:pcwalton) 2011-11-14 10:29:02 PST
I've been hearing lots of reports of this anecdotally. I think we should consider making this higher priority.
Comment 2 Patrick Walton (:pcwalton) 2011-11-15 22:54:21 PST
Created attachment 574806 [details] [diff] [review]
Proposed patch.

Here's a patch for this. It combines several things, that all contribute to fix this bug in most cases:

* Paints all dirty rects.
* Moves pan logic into the widget code and bundles it with the paint events, so that we don't have races.

It also refactors some of the GL tiles code, for good measure.

There are still redraw issues when moving from page to page, but this is much better in my experience. I don't see the header duplicated on cnn.com, for example.
Comment 3 Chris Lord [:cwiiis] 2011-11-16 05:01:05 PST
Comment on attachment 574806 [details] [diff] [review]
Proposed patch.

Review of attachment 574806 [details] [diff] [review]:
-----------------------------------------------------------------

I like the idea of locking any viewport metrics to the draw call, but there are too many unanswered questions with this patch to r+ it. I'd like an explanation of the deck size change (and assuming it isn't a mistake, this should be commented in the xul file too), and it'd be good to get roc's feedback on the possible ramifications of that Translate call.

I also wish you'd based this on top of the current displayport patch, as that patch contains a lot of refactoring that makes the code more readable (in my opinion at least), and rebasing that on top of this is a much larger job than the other way round. Is there a reason you haven't done this?

::: embedding/android/GeckoAppShell.java
@@ +478,5 @@
>              }
>          });
>  
> +        GeckoEvent event = new GeckoEvent(GeckoEvent.SIZE_CHANGED, WINDOW_WIDTH, WINDOW_HEIGHT,
> +                                          WINDOW_WIDTH, WINDOW_HEIGHT);

The last pair of numbers should be the screen size, not the window size. It's ok for the screen size to be smaller than the window size.

::: embedding/android/gfx/TileLayer.java
@@ +174,5 @@
> +
> +        ByteBuffer buffer = mImage.lockBuffer();
> +        try {
> +            /*
> +             * Upload the changed rect. We have to widen to the full width of the texture

The gfx team make a copy rather than always forcing to the full width when they do the same thing in the EGL layers back-end. This doesn't hold up this particular patch, as nothing has changed here, but have you discussed this with them? I expect we may want to do the same (we may also want to take advantage of the Tegra extension that lets us set the stride too so we can avoid this altogether).

::: mobile/chrome/content/browser.xul
@@ +12,5 @@
>    <script type="application/javascript" src="chrome://browser/content/exceptions.js"/>
>    <script type="application/javascript" src="chrome://browser/content/sanitize.js"/>
>  
>    <stack flex="1">
> +    <deck id="browsers" top="0" bottom="47952" left="0" right="8976"/>

Where do these numbers come from and why can this no longer be flex? Looks like we'd break on very long/wide pages...?

::: widget/src/android/nsWindow.cpp
@@ +819,5 @@
>          ALOG("### Failed to create a valid surface");
>          return PR_FALSE;
>      }
>  
> +    nsIntRect boundsRect(0, 0, TILE_WIDTH, TILE_HEIGHT);

I think we should get these values from the code rather than duplicate them here. We're probably going to want different tile-sizes depending on screen-size/device capabilities (certainly, a 1280x800 tablet wants a different size to a 800x480 phone). Again, this is something that was there before though, so doesn't hold up review of this particular patch.

@@ +1078,5 @@
>                      AutoLayerManagerSetup
>                        setupLayerManager(this, ctx, BasicLayerManager::BUFFER_NONE);
> +
> +                    gfxPoint translation(-gAndroidViewportOffset.x, -gAndroidViewportOffset.y);
> +                    ctx->Translate(translation);

Is it ok to do this? The display-list/layers builder has no knowledge of this transform, I wonder if this may cause problems? (roc would be the best person to ask on this)
Comment 4 Chris Lord [:cwiiis] 2011-11-16 05:10:41 PST
Actually, on further thought, the translate ought to be ok. I'm actually not sure why you've set the window width/height so large?

Surely it just needs to be (tileWidth + (tileWidth - viewportWidth)) x (tileHeight + (tileHeight - viewportHeight)) (i.e. the tile size plus the largest possible viewport offset)?
Comment 5 Chris Lord [:cwiiis] 2011-11-16 07:23:49 PST
And on even closer inspection, I didn't realise this patch removed setting of the scroll coordinates - which explains the need for a massive window, why it removes race conditions, etc.

I don't think this is going to work unless we have a way to separate viewport size from element size.

Because the page's width depends on the viewport width, we can't really size the page at the document width - so we can't really disable horizontal scrolling. Any scroll offset will break this patch.

The part of this patch that sets width/height attributes on the browser doesn't work - a child element of a deck will always assume the size of the deck, regardless of its width/height.

Really, I think we want the displayport-style method; We have a render area (the displayport), we have an element which is a sub-rectangle of this area (the viewport), and we scroll the document to expose the document area (and optionally move the viewport to wherever we believe is most optimal).

I think we'd be better off finding a way to send the metrics for these transforms (as is done in JS in the displayport patch) synchronously with the endDrawing call.

Doing the viewport offset as is done in this patch is probably the way to go, we also, I suppose, need to find a reliable way of getting the browser frame so that we can send the scroll coordinates as well.
Comment 6 Patrick Walton (:pcwalton) 2011-11-16 09:18:56 PST
(In reply to Chris Lord [:cwiiis] from comment #5)
> And on even closer inspection, I didn't realise this patch removed setting
> of the scroll coordinates - which explains the need for a massive window,
> why it removes race conditions, etc.
> 
> I don't think this is going to work unless we have a way to separate
> viewport size from element size.
> 
> Because the page's width depends on the viewport width, we can't really size
> the page at the document width - so we can't really disable horizontal
> scrolling. Any scroll offset will break this patch.
> 
> The part of this patch that sets width/height attributes on the browser
> doesn't work - a child element of a deck will always assume the size of the
> deck, regardless of its width/height.
> 
> Really, I think we want the displayport-style method; We have a render area
> (the displayport), we have an element which is a sub-rectangle of this area
> (the viewport), and we scroll the document to expose the document area (and
> optionally move the viewport to wherever we believe is most optimal).
> 
> I think we'd be better off finding a way to send the metrics for these
> transforms (as is done in JS in the displayport patch) synchronously with
> the endDrawing call.
> 
> Doing the viewport offset as is done in this patch is probably the way to
> go, we also, I suppose, need to find a reliable way of getting the browser
> frame so that we can send the scroll coordinates as well.

I guess we have these options:

1. Continue down the road of trying to squash duplicate frames and keep all the synchronization in the Java side (by dropping frames, etc).
2. This patch's approach: move all the panning code into widget.
3. Remove the paint code in widget. Have painting strictly controlled by browser.js, which adds the tile offset and zoom factor to each paint event before sending it off to Java.

(1) is a real nightmare, as we've discovered. It's also wasteful for performance.

(2) seems to work (and better than (1); this patch really makes a big difference), but it does have the downsides you mention (breaking scrolling, massive window leading to ugly XUL), and zoom is going to be a bit annoying since we have to poke the PresShell, which isn't something that's straightforward to do from widget.

(3) is attractive for a lot of reasons -- it removes races while keeping the CSS transform approach. Is this what you were trying to do with MozAfterPaint? How did it go?
Comment 7 Chris Lord [:cwiiis] 2011-11-16 11:14:59 PST
(3) is what I tried, but it ends up that MozAfterPaint isn't tied directly to window draws, so what would often happen is;

MozAfterPaint <viewport metrics 1>
MozAfterPaint <viewport metrics 2>
endDrawing <viewport metrics 1>
endDrawing <viewport metrics 2>

or even, two MozAfterPaints and one endDrawing, the other way round, and various other combinations - basically, it doesn't work :(


I think a fourth option,

- Do the viewport offset how you've done it in this patch - but never change it. Always assume that the viewport offset places the viewport in the centre of the display port. If we do it how you've done here, changing it means we have to force Gecko to redraw everything, so it would outweigh the benefit of having a larger buffered area at the edges of the page.

- Size the window so that it's (tileWidth + (tileWidth - viewportWidth)) x (tileHeight + (tileHeight - viewportHeight)), as mentioned in the above comment. This should mitigate any problems with setting the viewport transform via a translate that the display list builder is unaware of.

- Before calling endDrawing, in the native code, read the scroll position of the browser frame and send the coordinates back with the draw call. We should be able to do this - we may want to mark the browser frame somehow in JS so it's easy to find in the native code (I don't know how to do this, but in the worst case, we could add something to nsIDOMWindowUtils)

- Continue to send scroll requests/viewport resizes to browser.js, and continue sending page-size back from browser.js. Page size needn't be perfectly in sync, so there's no need to complicate things further. We could add this in the native code if it's easy, however.

Any thoughts on that?
Comment 8 Robert O'Callahan (:roc) (email my personal email if necessary) 2011-11-16 12:52:15 PST
(In reply to Chris Lord [:cwiiis] from comment #7)
> (3) is what I tried, but it ends up that MozAfterPaint isn't tied directly
> to window draws, so what would often happen is;
> 
> MozAfterPaint <viewport metrics 1>
> MozAfterPaint <viewport metrics 2>
> endDrawing <viewport metrics 1>
> endDrawing <viewport metrics 2>
> 
> or even, two MozAfterPaints and one endDrawing, the other way round, and
> various other combinations - basically, it doesn't work :(

This doesn't seem right. Are you sending widget-level paint events other than outside these endDrawing calls?

nsViewManager::Refresh has an nsAutoScriptBlocker. nsPresShell::Paint via nsPresContext::NotifyDidPaintForSubtree uses AddScriptRunner to dispatch the MozAfterPaint event(s). Those runnables should run when nsViewManager::Refresh releases its scriptblocker, i.e., during the dispatch of the paint event from the widget.

> Any thoughts on that?

Sounds reasonable.
Comment 9 Patrick Walton (:pcwalton) 2011-11-29 17:59:59 PST
This should now be fixed with the patch in bug 703821.
Comment 10 Steffen Wilberg 2011-12-06 02:33:38 PST
Comment on attachment 574806 [details] [diff] [review]
Proposed patch.

Obsoleting patch and removing feedback request.

(In reply to Patrick Walton (:pcwalton) from comment #9)
> This should now be fixed with the patch in bug 703821.

Note You need to log in before you can comment on or make changes to this bug.