Aborting painting outside the viewport

RESOLVED FIXED in Firefox 18

Status

()

RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: BenWa, Assigned: cwiiis)

Tracking

(Blocks: 1 bug)

unspecified
Firefox 18
x86
Mac OS X
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 4 obsolete attachments)

(Reporter)

Description

6 years ago
Currently when we start a draw continue until the very end. As we can see in eideticker frame synced profiles under GalaxyNexus Task.js on some runs we spend 600ms drawing a paint that is completely outside the viewport.

With progressive tile drawing we can abort a paint for very minimal cost. That is everything we have painted is already sent to the compositor. We can check after each tile if we should abort the paint (or rather have the Java frontend tell us) after each completed tile.

The overhead is the work needed to prepare a paint (viewport update, build a display list) so we need to keep that in mind when updating our strategy that aborting a paint isn't free.
(Assignee)

Comment 1

6 years ago
Have been discussing with Benoit, will be taking a look at this.
Assignee: nobody → chrislord.net
Status: NEW → ASSIGNED
(Reporter)

Comment 2

6 years ago
Also we should also abort if the resolution has changed.
(Assignee)

Comment 3

6 years ago
Created attachment 667407 [details] [diff] [review]
Abort progressive updates under some circumstances

The abort circumstances are documented in the code. This adds a function to BasicLayerManager to ask whether to abort draws and some Android-specific code behind an #ifdef that uses JNI to call into Java and ask if we should abort.

This helps us recover more quickly in pathologically bad cases like taskjs.org (but doesn't make taskjs.org look good, mind).

I played around with the aborting circumstances and found the second method implemented here to be the most effective without causing bad behaviour.

This also fixes a bug I introduced to BasicTiledThebesLayer that marked regions as valid before they were (which would short-circuit the 'new-content-before-stale-content' code). Sorry about that :)
Attachment #667407 - Flags: review?(bgirard)
(Assignee)

Comment 4

6 years ago
Comment on attachment 667407 [details] [diff] [review]
Abort progressive updates under some circumstances

Asking for additional review from kats for the Java/JNI bits.
Attachment #667407 - Flags: review?(bugmail.mozilla)
(Assignee)

Comment 5

6 years ago
Comment on attachment 667407 [details] [diff] [review]
Abort progressive updates under some circumstances

Review of attachment 667407 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/basic/BasicTiledThebesLayer.cpp
@@ +338,5 @@
>      mTiledBuffer.SetResolution(resolution);
>      mValidRegion = mVisibleRegion;
>    }
>  
> +  mTiledBuffer.PaintThebes(this, mValidRegion, regionToPaint, aCallback, aCallbackData);

This is the bug-fix for the bad valid region marking. Previously, I was immediately setting the buffer's valid region to the visible region (this is fine in the non-progressive case), but it's mValidRegion that contains the progressively updated valid region.
(Assignee)

Comment 6

6 years ago
hmm, a note - this patch seems to have some issues when zooming in far (almost certainly due to rounding) - I'll figure these out and post an update, but I'm  still interested in the review. I'll likely work around these rounding issues in the Java code.
Status: ASSIGNED → NEW
(Assignee)

Comment 7

6 years ago
* shakes fist at bugzilla
Status: NEW → ASSIGNED
(Assignee)

Comment 8

6 years ago
Created attachment 667440 [details] [diff] [review]
Abort progressive updates under some circumstances

Work around the rounding issue when zoomed. Not that pleased about the work-around, but I don't think there's any reason to put in anything more complicated.

I don't think there's ever an occasion where we're more than a pixel out and as the comment explains, we align to tile boundaries anyway - it's better this code doesn't trigger than if it does.
Attachment #667407 - Attachment is obsolete: true
Attachment #667407 - Flags: review?(bugmail.mozilla)
Attachment #667407 - Flags: review?(bgirard)
Attachment #667440 - Flags: review?(blassey.bugs)
Attachment #667440 - Flags: review?(bgirard)
(Assignee)

Comment 9

6 years ago
Created attachment 667525 [details] [diff] [review]
Abort progressive updates under some circumstances

Rebased, no major changes.
Attachment #667440 - Attachment is obsolete: true
Attachment #667440 - Flags: review?(blassey.bugs)
Attachment #667440 - Flags: review?(bgirard)
Attachment #667525 - Flags: review?(blassey.bugs)
Attachment #667525 - Flags: review?(bgirard)
(Reporter)

Comment 10

6 years ago
Comment on attachment 667525 [details] [diff] [review]
Abort progressive updates under some circumstances

Review of attachment 667525 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/gfx/DisplayPortMetrics.java
@@ +19,5 @@
>   */
>  public final class DisplayPortMetrics {
> +    // We need to flatten the RectF structure as Java doesn't have the concept
> +    // of const classes
> +    public final float left;

I don't get the point of storing l,t,r,b if we keep a rect. It's not like Java is reading these directly?

@@ +56,5 @@
> +        sb.append("{ \"left\": ").append(left)
> +          .append(", \"top\": ").append(top)
> +          .append(", \"right\": ").append(right)
> +          .append(", \"bottom\": ").append(bottom)
> +          .append(", \"resolution\": ").append(resolution)

Arggg, last time I looked JSON operation where causing huge performance hits :(. Looks like we're not using to toJSON for every draw so not a huge deal.

::: mobile/android/base/gfx/GeckoLayerClient.java
@@ +360,5 @@
> +    public boolean shouldAbortProgressiveUpdate(boolean aHasPendingNewThebesContent,
> +                                                float x, float y, float width, float height, float resolution) {
> +        // Abort drawing if;
> +        // The resolution has changed.
> +        if (!FloatUtils.fuzzyEquals(resolution, mDisplayPort.resolution)) {

I don't see any locks for reading mDisplayPort. This is being called from the Gecko thread but being modified from the Java UI (or is it the compositor).

@@ +372,5 @@
> +        //     outside of the display-port.
> +        // XXX All sorts of rounding happens inside Gecko that becomes hard to
> +        //     account exactly for. Given we align the display-port to tile
> +        //     boundaries, just check if the bounds are within a pixel by
> +        //     shrinking the viewport rect by a pixel on all sides.

Shouldn't we round OUT the displayport? Aren't we better to not abort rather then abort a paint that we shouldn't and risk a loop?

@@ +373,5 @@
> +        // XXX All sorts of rounding happens inside Gecko that becomes hard to
> +        //     account exactly for. Given we align the display-port to tile
> +        //     boundaries, just check if the bounds are within a pixel by
> +        //     shrinking the viewport rect by a pixel on all sides.
> +        // Grab a local copy of the viewport metrics to avoid races.

We should add that if we're not sure about the dimension we should prefer to not abort to prevent an abort loop.

@@ +387,5 @@
> +        }
> +
> +        // There's no new content and we've sent a more recent display-port.
> +        // XXX Same as above with respect to rounding.
> +        if (!aHasPendingNewThebesContent &&

Wont this cause us to always abort painting if the display port is moving slightly but we have only stale content? I'm not convinced new vs. stale is a good metric. Maybe you can convince me :). I think the one above is what we want.

What about aborting if the resolution changes? Once I fix tile store we will start doing progressive paint for them once again.
(Reporter)

Comment 11

6 years ago
Comment on attachment 667525 [details] [diff] [review]
Abort progressive updates under some circumstances

Review of attachment 667525 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/gfx/GeckoLayerClient.java
@@ +387,5 @@
> +        }
> +
> +        // There's no new content and we've sent a more recent display-port.
> +        // XXX Same as above with respect to rounding.
> +        if (!aHasPendingNewThebesContent &&

Nevermind about the resolution change. I see it above and it works well!
Comment on attachment 667525 [details] [diff] [review]
Abort progressive updates under some circumstances

Review of attachment 667525 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/gfx/DisplayPortMetrics.java
@@ +18,5 @@
>   * subsection of that with compositor scaling.
>   */
>  public final class DisplayPortMetrics {
> +    // We need to flatten the RectF structure as Java doesn't have the concept
> +    // of const classes

why do we need a const class? I'm not seeing much point of changing this class at all.

@@ +56,5 @@
> +        sb.append("{ \"left\": ").append(left)
> +          .append(", \"top\": ").append(top)
> +          .append(", \"right\": ").append(right)
> +          .append(", \"bottom\": ").append(bottom)
> +          .append(", \"resolution\": ").append(resolution)

Is the only caller for this https://mxr.mozilla.org/mozilla-central/source/mobile/android/base/gfx/GeckoLayerClient.java#381? If so, I feel like we can drop it. Let's get a follow up bug filed.
Attachment #667525 - Flags: review?(bgirard) → review+
Comment on attachment 667525 [details] [diff] [review]
Abort progressive updates under some circumstances

Review of attachment 667525 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/gfx/DisplayPortMetrics.java
@@ +18,5 @@
>   * subsection of that with compositor scaling.
>   */
>  public final class DisplayPortMetrics {
> +    // We need to flatten the RectF structure as Java doesn't have the concept
> +    // of const classes

why do we need a const class? I'm not seeing much point of changing this class at all.

@@ +56,5 @@
> +        sb.append("{ \"left\": ").append(left)
> +          .append(", \"top\": ").append(top)
> +          .append(", \"right\": ").append(right)
> +          .append(", \"bottom\": ").append(bottom)
> +          .append(", \"resolution\": ").append(resolution)

Is the only caller for this https://mxr.mozilla.org/mozilla-central/source/mobile/android/base/gfx/GeckoLayerClient.java#381? If so, I feel like we can drop it. Let's get a follow up bug filed.
Attachment #667525 - Flags: review?(blassey.bugs) → review?(bgirard)
(Assignee)

Comment 14

6 years ago
(In reply to Benoit Girard (:BenWa) from comment #10)
> Comment on attachment 667525 [details] [diff] [review]
> Abort progressive updates under some circumstances
> 
> Review of attachment 667525 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mobile/android/base/gfx/DisplayPortMetrics.java
> @@ +19,5 @@
> >   */
> >  public final class DisplayPortMetrics {
> > +    // We need to flatten the RectF structure as Java doesn't have the concept
> > +    // of const classes
> > +    public final float left;
> 
> I don't get the point of storing l,t,r,b if we keep a rect. It's not like
> Java is reading these directly?

I need to read these values, but making the Rect class public makes it modifiable, which breaks the idea of this class being immutable. I could've added more comparison methods, but that would probably involve creating and destroying objects in a critical section of code - I assumed we want this to be fast.

> @@ +56,5 @@
> > +        sb.append("{ \"left\": ").append(left)
> > +          .append(", \"top\": ").append(top)
> > +          .append(", \"right\": ").append(right)
> > +          .append(", \"bottom\": ").append(bottom)
> > +          .append(", \"resolution\": ").append(resolution)
> 
> Arggg, last time I looked JSON operation where causing huge performance hits
> :(. Looks like we're not using to toJSON for every draw so not a huge deal.

Not really sure how this gets used anymore... Like blassey says, we can file a follow-up bug.

> ::: mobile/android/base/gfx/GeckoLayerClient.java
> @@ +360,5 @@
> > +    public boolean shouldAbortProgressiveUpdate(boolean aHasPendingNewThebesContent,
> > +                                                float x, float y, float width, float height, float resolution) {
> > +        // Abort drawing if;
> > +        // The resolution has changed.
> > +        if (!FloatUtils.fuzzyEquals(resolution, mDisplayPort.resolution)) {
> 
> I don't see any locks for reading mDisplayPort. This is being called from
> the Gecko thread but being modified from the Java UI (or is it the
> compositor).

Whoops, I should've taken a local reference for this - nice catch. There is the chance that we end up reading an old version, which is ok - this just means we've sent a newer display-port, and so aborting or not aborting doesn't have much ill consequence.

> @@ +372,5 @@
> > +        //     outside of the display-port.
> > +        // XXX All sorts of rounding happens inside Gecko that becomes hard to
> > +        //     account exactly for. Given we align the display-port to tile
> > +        //     boundaries, just check if the bounds are within a pixel by
> > +        //     shrinking the viewport rect by a pixel on all sides.
> 
> Shouldn't we round OUT the displayport? Aren't we better to not abort rather
> then abort a paint that we shouldn't and risk a loop?

We're shrinking the local viewport rect, which would be the same as rounding out the received display-port.

> @@ +373,5 @@
> > +        // XXX All sorts of rounding happens inside Gecko that becomes hard to
> > +        //     account exactly for. Given we align the display-port to tile
> > +        //     boundaries, just check if the bounds are within a pixel by
> > +        //     shrinking the viewport rect by a pixel on all sides.
> > +        // Grab a local copy of the viewport metrics to avoid races.
> 
> We should add that if we're not sure about the dimension we should prefer to
> not abort to prevent an abort loop.

Yes - I think I'll restructure this method so that it's less likely we'll accidentally abort when we shouldn't. We should never abort when we're not sure we've sent a new display-port, for example - that should be the first test.

> @@ +387,5 @@
> > +        }
> > +
> > +        // There's no new content and we've sent a more recent display-port.
> > +        // XXX Same as above with respect to rounding.
> > +        if (!aHasPendingNewThebesContent &&
> 
> Wont this cause us to always abort painting if the display port is moving
> slightly but we have only stale content? I'm not convinced new vs. stale is
> a good metric. Maybe you can convince me :). I think the one above is what
> we want.

This case is so rarely hit that I really don't mind just removing it. The one above it is the important one. If the display-port is moving, we will have new content, that's a guarantee - what this clause means is if we have a large animated gif, for example, then we start scrolling, it'll abort drawing that animated gif sooner so that it can start drawing the new content from the moved display-port.

I'll document this better, but I also don't mind removing it if you think this isn't a common enough case to bother catering for.
(Assignee)

Comment 15

6 years ago
Created attachment 667966 [details] [diff] [review]
Abort progressive updates under some circumstances

Fix not using local copies of variables that are changed in other threads, make the changes I discussed in the previous comment and better document shouldAbortProgressiveUpdate in GeckoLayerClient.java. Also removed the debug spew that relied on possibly-slow toString methods.
Attachment #667525 - Attachment is obsolete: true
Attachment #667525 - Flags: review?(bgirard)
Attachment #667966 - Flags: review?(bgirard)
(Assignee)

Comment 16

6 years ago
Comment on attachment 667966 [details] [diff] [review]
Abort progressive updates under some circumstances

Carrying r=blassey
Attachment #667966 - Flags: review+
(Reporter)

Comment 17

6 years ago
Comment on attachment 667966 [details] [diff] [review]
Abort progressive updates under some circumstances

Review of attachment 667966 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/gfx/GeckoLayerClient.java
@@ +391,5 @@
> +            return false;
> +        }
> +
> +        // Abort updates when the display-port no longer contains the viewport.
> +        // XXX This makes the assumption that we never let our viewport fall

r+ with this comment clarified.
Attachment #667966 - Flags: review?(bgirard) → review+
(Assignee)

Comment 18

6 years ago
Created attachment 668077 [details] [diff] [review]
Abort progressive updates under some circumstances

Final version with comments addressed and incorrect use of nsRect replaced by gfx::Rect (whoops.)
Attachment #667966 - Attachment is obsolete: true
Attachment #668077 - Flags: review+
Attachment #668077 - Flags: checkin?(bgirard)
(Reporter)

Comment 19

6 years ago
Comment on attachment 668077 [details] [diff] [review]
Abort progressive updates under some circumstances

https://hg.mozilla.org/integration/mozilla-inbound/rev/ef54c5cced7f
Attachment #668077 - Flags: checkin?(bgirard) → checkin+
(Reporter)

Updated

6 years ago
Target Milestone: --- → Firefox 18
(Assignee)

Comment 20

6 years ago
Ugh, this causes intermittent crashes with progressive updates enabled. Leave open, will submit and land a follow-up fix. This is non-critical, as progressive updates are disabled by default.
Whiteboard: [leave-open]
Target Milestone: Firefox 18 → ---
(Assignee)

Comment 22

6 years ago
Created attachment 668402 [details] [diff] [review]
Fix crash and actually abort

Whoops - so I was calling CallObjectMethod instead of CallBooleanMethod, so this was never actually aborting. Now that the code is actually working, taskjs.org scrolling is very different and it highlights the need for more intelligent picking of which tiles to update.

Not entirely sure if this fixes the crash (waiting on try), but given the logs, I'd be surprised if it didn't. Submitting for early review, won't push until try results come back green.
Attachment #668402 - Flags: review?(blassey.bugs)
Attachment #668402 - Flags: review?(blassey.bugs) → review+
(Assignee)

Comment 23

6 years ago
Pushed to inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/a31b3a3b7954
Whiteboard: [leave-open]
https://hg.mozilla.org/mozilla-central/rev/a31b3a3b7954
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 18
(Reporter)

Comment 25

6 years ago
I added an abort marker so we can see very easily when we abort in eideticker:
http://people.mozilla.com/~bgirard/cleopatra/?zippedProfile=profiles/sps-profile-1351419924.11.zip&videoCapture=videos/video-1351420170.83.webm

It looks like we abort a bit early. On task.js 'retargeting' a paint takes about 15ms-40ms and we abort while we still have invalid content in the display port that we can finish painting. I think we should continue to paint if part of the screen is still not updated.
(Reporter)

Updated

6 years ago
Blocks: 808704
You need to log in before you can comment on or make changes to this bug.