Closed Bug 1208829 Opened 9 years ago Closed 8 years ago

[APZ] Fixed background image disappears after scrolling to the bottom of the page

Categories

(Core :: Layout, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
firefox44 --- affected

People

(Reporter: mstange, Assigned: botond)

References

Details

Attachments

(9 files)

418 bytes, text/html
Details
40 bytes, text/x-review-board-request
mstange
: review+
Details
40 bytes, text/x-review-board-request
mstange
: review+
Details
40 bytes, text/x-review-board-request
kats
: review+
Details
40 bytes, text/x-review-board-request
mstange
: review+
Details
40 bytes, text/x-review-board-request
mstange
: review+
Details
40 bytes, text/x-review-board-request
mstange
: review+
Details
40 bytes, text/x-review-board-request
mstange
: review+
Details
3.32 KB, patch
botond
: review+
Details | Diff | Splinter Review
Attached file testcase
In this testcase, if you scroll all the way to the bottom, the blue square disappears. If you refresh the page while you're scrolled to the bottom, the square reappears, and if you then scroll all the way up, it briefly disappears and then reappears once you're at the top.
I don't experience the exact STR you describe, but I do see badness when scrolling the testcase, specifically the square being clipped as you scroll (and then repainted in full when you're done scrolling). Will look into it.
Assignee: nobody → botond
I can't reproduce it disappearing any more either. It looks like part of it got fixed at least.
The clipping issue is probably bug 1211487.
It appears that the async scroll offset being fractional is necessary to reproduce the bug. The reftest harness currently only supports specifying integer async scroll offsets - we may want to change that so we can write a reftest for this bug.
(In reply to Botond Ballo [:botond] from comment #3)
> It appears that the async scroll offset being fractional is necessary to
> reproduce the bug.

I now understand that's because the compositor only creates an intermediate surface under those conditions (for this page and in my setup, anyways). This supports Markus' suggestion that this is related to bug 1211487, although I'm not 100% convinced yet that the solution proposed there (async-manipulating the visible regions of layers in the compositor) will solve this problem. I'm continuing to investigate.
Further investigation reveals that the intermediate surface itself is incorrectly rendered, so the clipping problem occurs when we are drawing to the intermediate surface.
(In reply to Botond Ballo [:botond] from comment #4)
> I now understand that's because the compositor only creates an intermediate
> surface under those conditions (for this page and in my setup, anyways).
> This supports Markus' suggestion that this is related to bug 1211487,
> although I'm not 100% convinced yet that the solution proposed there
> (async-manipulating the visible regions of layers in the compositor) will
> solve this problem. I'm continuing to investigate.

After debugging this further and thinking about it some more, I think async-adjusting the visible regions is at least necessary to fix this. (It may not be sufficient, but let's cross that bridge when we get there.)
Posting what I have so far; still needs cleanup and testing.
Bug 1208829 - Rename LayerManagerComposite::ApplyOcclusionCulling to PostProcessLayers, and clean it up a bit. r=mstange
Bug 1208829 - static_assert that IntRegionTyped is instantiated with a proper unit type. r=mstange
Bug 1208829 - Make FromUnknownRegion() actually work, and add ToUnknownRegion(). r=mstange
Bug 1208829 - Add utilities for converting between typed and untyped regions. r=mstange
Bug 1208829 - Recompute visible regions during composition. r=mstange
The reftest as written doesn't actually test anything, because of the limitation described in comment 3:

(In reply to Botond Ballo [:botond] from comment #3)
> It appears that the async scroll offset being fractional is necessary to
> reproduce the bug. The reftest harness currently only supports specifying
> integer async scroll offsets - we may want to change that so we can write a
> reftest for this bug.

However, I have verified using a local hack that the patch does fix the problem the test is trying to capture.
Depends on: 1224021
(In reply to Botond Ballo [:botond] from comment #14)
> The reftest as written doesn't actually test anything, because of the
> limitation described in comment 3:
> 
> (In reply to Botond Ballo [:botond] from comment #3)
> > It appears that the async scroll offset being fractional is necessary to
> > reproduce the bug. The reftest harness currently only supports specifying
> > integer async scroll offsets - we may want to change that so we can write a
> > reftest for this bug.

I have a fix for this up at bug 1224021. With that, the reftest is meaningful.
When scrolling the original testcase in this bug, the clipping of arbitrary amounts of the square no longer happens, but there is slight a slight flicker at the edges, suggesting that there may be a rounding problem somewhere.
(In reply to Botond Ballo [:botond] from comment #16)
> When scrolling the original testcase in this bug, the clipping of arbitrary
> amounts of the square no longer happens, but there is slight a slight
> flicker at the edges, suggesting that there may be a rounding problem
> somewhere.

The rounding error arises during recomputation of the visible region of the ContainerLayer that uses the intermediate surface. The child's visible region, which is an nsIntRegion, is translated by the fractional async scroll, and the result is then represented as an nsIntRegion; in this process, the region's rects are rounded out [1].

Changing [1] to call Round() instead of RoundOut() seems to fix the problem. Markus, do you think rounding to nearest pixels is a reasonable solution, and if so, should we allow callers of nsIntRegion::Transform() to specify the rounding behaviour on a per-call-site basis?

[1] https://dxr.mozilla.org/mozilla-central/rev/84a7cf29f4f14c9b359db2f7f19c0abd6a8e178e/gfx/src/nsRegion.cpp#618
Flags: needinfo?(mstange)
I implemented the approach described in comment 17 (rounding to nearest pixels when recomputing the visible regions). Try push:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e2602dd11cbb

I'm still interested to hear Markus' opinion on whether this is a good idea.
(In reply to Botond Ballo [:botond] from comment #18)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=e2602dd11cbb

Some reftests are failing, need to investigate.
(In reply to Botond Ballo [:botond] from comment #17)
> Changing [1] to call Round() instead of RoundOut() seems to fix the problem.

Who is the caller of TransformRect in that case?

> Markus, do you think rounding to nearest pixels is a reasonable solution,

Probably not, no. I can't really think of use cases where the caller would something other than "give me the bounds of all the pixels that the transformed region touches".


It's unfortunate that we end up with fractional transforms in the first place. Main thread scrolling doesn't set fractional transforms on scrolled layers, so APZ probably shouldn't, either.
Flags: needinfo?(mstange)
Comment on attachment 8685214 [details]
MozReview Request: Bug 1208829 - Make FromUnknownRegion() actually work, and add ToUnknownRegion(). r=kats

https://reviewboard.mozilla.org/r/24755/#review22747

Stealing review so that njn can use this patch for his patchset
Attachment #8685214 - Flags: review+
Landed the "Make FromUnknownRegion() actually work, and add ToUnknownRegion()" patch as per comment 21. Marking bug leave-open until the remaining patches land.
Keywords: leave-open
(In reply to Markus Stange [:mstange] from comment #20)
> > Markus, do you think rounding to nearest pixels is a reasonable solution,
> 
> Probably not, no. I can't really think of use cases where the caller would
> something other than "give me the bounds of all the pixels that the
> transformed region touches".

After much debugging, I discovered the source of the problem. You are correct that rounding the transformed region is not the correct solution.

Here's what's happening:

  - Rounding out increased the size of the shadow visible region by 1
    compared to the size of the visible region.

  - Layer::GetIntermediateSurfaceRect() uses the size visible region
    (as opposed to the shadow visible region) to determine the size of
    the intermediate surface rect. This function is used by
    CalculateScissorRect(), which is called in ContainerPrepare() to
    calculate the clip rect for the painted layer being rendered into
    the intermediate surface.

  - In CompositorOGL::DrawQuad(), the size of clip rect is combined with 
    the render target origin, which is derived from the origin of the
    shadow visible region. Since the shadow visible region has been
    rounded out, this clips away 1 row of pixels from it. 
    
My solution is to change GetIntermediateSurfaceRect() to use the shadow visible region for composite layers.
(In reply to Botond Ballo [:botond] from comment #25)
> My solution is to change GetIntermediateSurfaceRect() to use the shadow
> visible region for composite layers.

Great! With that fix, your patches also fix attachment 8669711 [details].

I looked at the preserves3d-nested.html test failure on your try push a little. It seems like we need to respect the layer's prescale somehow. Replacing composite->GetShadowTransform() in the visibleParentSpace calculation with aLayer->GetLocalTransform() (which applies pre and post scales) seems to fix that test, but I don't know if that's the right thing to do.
(In reply to Markus Stange [:mstange] from comment #27)
> I looked at the preserves3d-nested.html test failure on your try push a
> little. It seems like we need to respect the layer's prescale somehow.
> Replacing composite->GetShadowTransform() in the visibleParentSpace
> calculation with aLayer->GetLocalTransform() (which applies pre and post
> scales) seems to fix that test, but I don't know if that's the right thing
> to do.

Good catch! Using GetLocalTransform() is indeed the right thing to do.

Here's the way the various transform accessors relate to each other:

           |    base transform        |    including scales
------------------------------------------------------------------
client     |    GetBaseTransform()    |    GetTransform()
------------------------------------------------------------------
compositor |    GetShadowTransform()  |    GetLocalTransform()

(Except that GetLocalTransform() is also defined for client layers, and returns the same thing as GetTransform()).

I think the names are misleading, because it's easy to make the mistake (as I did) of thinking that GetShadowTransform() is the shadow equivalent of GetTransform().
And that fix seems to have done the trick!

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4ef9a22326b9

Posting patches for review.
Attachment #8685214 - Attachment description: MozReview Request: Bug 1208829 - Make FromUnknownRegion() actually work, and add ToUnknownRegion(). r=mstange → MozReview Request: Bug 1208829 - Make FromUnknownRegion() actually work, and add ToUnknownRegion(). r=kats
Comment on attachment 8685214 [details]
MozReview Request: Bug 1208829 - Make FromUnknownRegion() actually work, and add ToUnknownRegion(). r=kats

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/24755/diff/1-2/
Comment on attachment 8685213 [details]
MozReview Request: Bug 1208829 - static_assert that IntRegionTyped is instantiated with a proper unit type. r=mstange

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/24753/diff/1-2/
Attachment #8685213 - Flags: review?(mstange)
Comment on attachment 8685215 [details]
MozReview Request: Bug 1208829 - Add utilities for converting between typed and untyped regions. r=mstange

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/24757/diff/1-2/
Attachment #8685215 - Flags: review?(mstange)
Comment on attachment 8685212 [details]
MozReview Request: Bug 1208829 - Rename LayerManagerComposite::ApplyOcclusionCulling to PostProcessLayers, and clean it up a bit. r=mstange

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/24751/diff/1-2/
Attachment #8685212 - Flags: review?(mstange)
Comment on attachment 8685216 [details]
MozReview Request: Bug 1208829 - Recompute visible regions during composition. r=mstange

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/24759/diff/1-2/
Attachment #8685216 - Flags: review?(mstange)
Attachment #8685217 - Attachment description: MozReview Request: Bug 1208829 - Reftest. r=mstange → MozReview Request: Bug 1208829 - Use the effective visible region to calculate the intermediate surface size. r=mstange
Attachment #8685217 - Flags: review?(mstange)
Comment on attachment 8685217 [details]
MozReview Request: Bug 1208829 - Use the effective visible region to calculate the intermediate surface size. r=mstange

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/24761/diff/1-2/
Bug 1208829 - Reftest. r=mstange
Attachment #8691063 - Flags: review?(mstange)
Comment on attachment 8685213 [details]
MozReview Request: Bug 1208829 - static_assert that IntRegionTyped is instantiated with a proper unit type. r=mstange

https://reviewboard.mozilla.org/r/24753/#review23489
Attachment #8685213 - Flags: review?(mstange) → review+
Comment on attachment 8685215 [details]
MozReview Request: Bug 1208829 - Add utilities for converting between typed and untyped regions. r=mstange

https://reviewboard.mozilla.org/r/24757/#review23491
Attachment #8685215 - Flags: review?(mstange) → review+
Comment on attachment 8685212 [details]
MozReview Request: Bug 1208829 - Rename LayerManagerComposite::ApplyOcclusionCulling to PostProcessLayers, and clean it up a bit. r=mstange

https://reviewboard.mozilla.org/r/24751/#review23493
Attachment #8685212 - Flags: review?(mstange) → review+
Comment on attachment 8685216 [details]
MozReview Request: Bug 1208829 - Recompute visible regions during composition. r=mstange

https://reviewboard.mozilla.org/r/24759/#review23495

::: gfx/layers/composite/LayerManagerComposite.cpp:245
(Diff revision 2)
> +  if (!descendantsVisibleRegion.IsEmpty()) {

Let's make this if condition match what the comment says, e.g by changing it to: if (aLayer->GetFirstChild()).
It's more explicit, and it doesn't accidentally hide bugs if we compute an empty visible region when it shouldn't be empty.
Attachment #8685216 - Flags: review?(mstange) → review+
Comment on attachment 8685217 [details]
MozReview Request: Bug 1208829 - Use the effective visible region to calculate the intermediate surface size. r=mstange

https://reviewboard.mozilla.org/r/24761/#review23497
Attachment #8685217 - Flags: review?(mstange) → review+
Attachment #8691063 - Flags: review?(mstange) → review+
Comment on attachment 8691063 [details]
MozReview Request: Bug 1208829 - Reftest. r=mstange

https://reviewboard.mozilla.org/r/26055/#review23499

This reftest is ok, but I'd like to have another reftest that doesn't require fractional async scroll offsets. I'll attach one.
https://reviewboard.mozilla.org/r/24759/#review23511

::: gfx/layers/composite/LayerManagerComposite.cpp:230
(Diff revision 2)
> +  // Recurse on our descendants, in fron-to-back order. In this process:

typo: "fron"
Attached patch another reftestSplinter Review
Comment on attachment 8692590 [details] [diff] [review]
another reftest

Review of attachment 8692590 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, Markus!

::: layout/reftests/async-scrolling/reftest.list
@@ +26,5 @@
>  skip-if(!asyncPan) == culling-1.html culling-1-ref.html
>  skip-if(!asyncPan) == position-fixed-iframe-1.html position-fixed-iframe-1-ref.html
>  skip-if(!asyncPan) == position-fixed-iframe-2.html position-fixed-iframe-2-ref.html
>  skip-if(!asyncPan) == position-fixed-in-scroll-container.html position-fixed-in-scroll-container-ref.html
> +fuzzy(1,60000) skip-if(!asyncPan) == group-opacity-surface-size-1.html group-opacity-surface-size-1-ref.html

Out of curiosity, what platform(s) did you need the fuzz for?
Attachment #8692590 - Flags: review+
(In reply to Botond Ballo [:botond] from comment #45)
> ::: layout/reftests/async-scrolling/reftest.list
> @@ +26,5 @@
> >  skip-if(!asyncPan) == culling-1.html culling-1-ref.html
> >  skip-if(!asyncPan) == position-fixed-iframe-1.html position-fixed-iframe-1-ref.html
> >  skip-if(!asyncPan) == position-fixed-iframe-2.html position-fixed-iframe-2-ref.html
> >  skip-if(!asyncPan) == position-fixed-in-scroll-container.html position-fixed-in-scroll-container-ref.html
> > +fuzzy(1,60000) skip-if(!asyncPan) == group-opacity-surface-size-1.html group-opacity-surface-size-1-ref.html
> 
> Out of curiosity, what platform(s) did you need the fuzz for?

OS X. (Which doesn't run e10s tests on treeherder yet.)
But having a fuzz of 1 is never a bad idea when comparing opacity rendering between active and inactive opacities, since we don't guarantee consistent rounding between compositor rendering and main thread layers.
Whoops, I meant to clear the leave-open flag when landing the remaining patches.
Status: NEW → RESOLVED
Closed: 8 years ago
Keywords: leave-open
Resolution: --- → FIXED
Did this fix possibly cause bug 1231004 ? It regressed either because of this or one of the other changes going into the same nightly.
(In reply to Hallvord R. M. Steen [:hallvors] from comment #51)
> Did this fix possibly cause bug 1231004 ? It regressed either because of
> this or one of the other changes going into the same nightly.

(Just to close the loop here: bug 1231004 ended up being a regression from bug 1208673, not this bug.)
You need to log in before you can comment on or make changes to this bug.