Closed Bug 943846 Opened 11 years ago Closed 10 years ago

Scrolling quickly sometimes results into a white screen

Categories

(Core :: Panning and Zooming, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
blocking-b2g 1.3+
Tracking Status
firefox28 --- wontfix
firefox29 --- wontfix
b2g-v1.3 --- wontfix

People

(Reporter: vingtetun, Assigned: cwiiis)

References

Details

Attachments

(2 files, 1 obsolete file)

With APZ turned on, scrolling quickly in contacts sometimes results into a white rendering area. I kind of suspect that it is because we use visibility_monitor.js which is out of sync at some point.
How do you turn on APZ?  Not sure I have ever tested this configuration.  What release are we trying to enable it for?
Any logcat available?
I first thought it was because of visibility_monitor.js but it seems like I can reproduce the same bug in the call log, which does not use visibility_monitor.js...

The steps to reproduce are:
 - Populate the device by |cd $GAIA; make reference-workload-heavy| while the device is plugged.
 - Open the dialer app
 - Go to the call log (bottom-left tab)
 - Wait for it to be fully loaded
 - Once it is loaded, pan continuously to the bottom with very rapid gestures

Expected result:
 - The call log pan and the call log rows are visible

Actual result:
 - At some point the call log rows are not rendered anymore. The only thing I can see if the fixed header containing the date. If I keep scrolling sometimes the content re-render, or sometimes the fixed header seems to be updated, sometimes it is not, but the content does not rendered anymore.
Blocks: gaia-apzc
No longer blocks: 943841
Component: Gaia::Contacts → Panning and Zooming
Product: Firefox OS → Core
Editing the title to reflect that it does not happens only on Contacts.
Summary: [Contacts] Scrolling quickly sometimes results into a white screen → Scrolling quickly sometimes results into a white screen
(In reply to Francisco Jordano [:arcturus] from comment #2)
> Any logcat available?

This sounds a platform issue finally. It is hard to really see if there is any issue with visibility_monitor.js until this one is fixed.

(In reply to Ben Kelly [:bkelly] from comment #1)
> How do you turn on APZ?  Not sure I have ever tested this configuration. 
> What release are we trying to enable it for?

The target is 1.3 as I mentionned on IRC yesterday. There is an option in the developer settings.
Looks like a dupe of bug 944047.
(In reply to Benoit Girard (:BenWa) from comment #6)
> Looks like a dupe of bug 944047.

Could be. Does the original bug is about scrolling quickly or is it about scrolling near the top/bottom edges ?
No, but it could still be the same underlying problem. But it's too early to dupe.
(In reply to Benoit Girard (:BenWa) from comment #8)
> No, but it could still be the same underlying problem. But it's too early to
> dupe.

I tried the patches in bug 944047 and those does not fixed the issue I'm seeing here :(
I'll take a look.
Assignee: nobody → chrislord.net
Status: NEW → ASSIGNED
So bug 944047 was one problem, it seems there's still another if you scroll with fast, frequent movements, it seems some of them can escape the sub-frame and end up on the root frame (which shouldn't be scrollable...)

I think this problem accounts for the entire screen going blank, or the application shifting upwards in the display (I've seen both, though they can be tricky to reproduce).

On the other hand, I'm also seeing the situation where the scrolled list goes blank - the letter headers are still there in the Contacts list case, but not in the call log case... Both situations coincide with a (valid, but erroneously set) display port on the root frame.

These are quite likely related issues.
So it seems that as soon as the APZC gets so far ahead that the displayport doesn't intersect with the viewport, bad things happen. (i.e. the displayport describes a rectangle that's entirely outside of the element it's being set on)

Seems like some kind of displayport prediction code is going awry. Will look into. I think that's a separate issue from scrolling the root frame, unfortunately, which I can't easily reproduce...
(In reply to Chris Lord [:cwiiis] from comment #12)
> So it seems that as soon as the APZC gets so far ahead that the displayport
> doesn't intersect with the viewport, bad things happen. (i.e. the
> displayport describes a rectangle that's entirely outside of the element
> it's being set on)

That's bug 936500.
Depends on: 936500
There's a comment in AsyncPanZoomController to this effect:

  // If we go over the bounds when trying to predict where we will be when this
  // paint finishes, move it back into the range of the CSS content rect.
  // FIXME/bug 780395: Generalize this. This code is pretty hacky as it will
  // probably not work at all for RTL content. This is not intended to be
  // incredibly accurate; it'll just prevent the entire displayport from being
  // outside the content rect (which causes bad things to happen).

But it looks like the block that follows it doesn't work correctly(?)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #13)
> (In reply to Chris Lord [:cwiiis] from comment #12)
> > So it seems that as soon as the APZC gets so far ahead that the displayport
> > doesn't intersect with the viewport, bad things happen. (i.e. the
> > displayport describes a rectangle that's entirely outside of the element
> > it's being set on)
> 
> That's bug 936500.

ah, that's handy :) I'll take this to IRC.
(In reply to Chris Lord [:cwiiis] from comment #11)
> So bug 944047 was one problem, it seems there's still another if you scroll
> with fast, frequent movements, it seems some of them can escape the
> sub-frame and end up on the root frame (which shouldn't be scrollable...)
> 

The events out of the subframe is likely what I'm seeing in bug 942856. So definitively some of them escape the sub-frame.

> I think this problem accounts for the entire screen going blank, or the
> application shifting upwards in the display (I've seen both, though they can
> be tricky to reproduce).
> 
> On the other hand, I'm also seeing the situation where the scrolled list
> goes blank - the letter headers are still there in the Contacts list case,
> but not in the call log case... Both situations coincide with a (valid, but
> erroneously set) display port on the root frame.
> 

I have seen the letters in the call log a few time as well. Once I have them I can sometimes continue to scroll and they seems to change in random orders instead of the logical day by day I would expect.
(In reply to Vivien Nicolas (:vingtetun) (:21) (RTO - Review Time Off until 09/12/13) from comment #16)
> (In reply to Chris Lord [:cwiiis] from comment #11)
> > So bug 944047 was one problem, it seems there's still another if you scroll
> > with fast, frequent movements, it seems some of them can escape the
> > sub-frame and end up on the root frame (which shouldn't be scrollable...)
> > 
> 
> The events out of the subframe is likely what I'm seeing in bug 942856. So
> definitively some of them escape the sub-frame.
>

Sorry I meant bug 942752
Maybe it was obvious for you but since I'm pretty new to this code I need to add comment everywhere to remember stuffs and have others over my shoulder.
So while playing with  http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/AsyncPanZoomController.cpp#1127 I realized that I can not enter this state anymore if aVelocity.y and aAcceleration.y are both equal to 1.0. Not saying that's something since it results in a lot of white screens.
(In reply to Vivien Nicolas (:vingtetun) (:21) (RTO - Review Time Off until 09/12/13) from comment #19)
> Maybe it was obvious for you but since I'm pretty new to this code I need to
> add comment everywhere to remember stuffs and have others over my shoulder.
> So while playing with 
> http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/
> AsyncPanZoomController.cpp#1127 I realized that I can not enter this state
> anymore if aVelocity.y and aAcceleration.y are both equal to 1.0. Not saying
> that's something since it results in a lot of white screens.

On some apps like Gallery performance seems also much better.
So I'm pretty certain this is just a dupe of bug 936500 - we could quickly fix it by restricting the displayport to contain the content area of the element, but I'm trying to fix the underlying issue in nsDisplayList, as this is a reasonable thing to do (have a displayport that doesn't contain the content area).
Here's the work-around for now.
Attachment #8343243 - Flags: review?(bugmail.mozilla)
Comment on attachment 8343243 [details] [diff] [review]
Work around layout issue, make sure display port contains composition bounds

Sorry, this isn't quite right. Let me do it properly...
Attachment #8343243 - Flags: review?(bugmail.mozilla)
Sorry about that, here's the real thing.
Attachment #8343243 - Attachment is obsolete: true
Attachment #8343250 - Flags: review?(bugmail.mozilla)
Comment on attachment 8343250 [details] [diff] [review]
Work around layout issue, make sure display port contains composition bounds (v2)

Review of attachment 8343250 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/ipc/AsyncPanZoomController.cpp
@@ +1154,5 @@
> +  // bounds. This is to work around a layout bug that means if a display item's
> +  // corresponding displayport doesn't contain its frame's bounds, it may get
> +  // optimised out and the layer won't get created.
> +  if (displayPort.x + displayPort.width < compositionBounds.width) {
> +    displayPort.x = -(displayPort.width - compositionBounds.width);

Is it necessary for the displayport to entirely contain the composition bounds? I thought that as long as it intersected it should be fine. I would have done

if (displayPort.XMost() <= compositionBounds.x) {
  displayPort.x = compositionBounds.x - displayPort.width + 1;
}

which simplifies to:
displayPort.x = max(displayPort.x, compositionBounds.x - displayPort.width + 1);

@@ +1156,5 @@
> +  // optimised out and the layer won't get created.
> +  if (displayPort.x + displayPort.width < compositionBounds.width) {
> +    displayPort.x = -(displayPort.width - compositionBounds.width);
> +  } else if (displayPort.x > 0) {
> +    displayPort.x = 0;

Ditto here. I would think

if (displayPort.x >= compositionBounds.XMost()) {
  displayPort.x = compositionBounds.XMost() - 1;
}

would be enough. Maybe increase the "1" to "10" or some other minimum overlap threshold.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #25)
> Comment on attachment 8343250 [details] [diff] [review]
> Work around layout issue, make sure display port contains composition bounds
> (v2)
> 
> Review of attachment 8343250 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/layers/ipc/AsyncPanZoomController.cpp
> @@ +1154,5 @@
> > +  // bounds. This is to work around a layout bug that means if a display item's
> > +  // corresponding displayport doesn't contain its frame's bounds, it may get
> > +  // optimised out and the layer won't get created.
> > +  if (displayPort.x + displayPort.width < compositionBounds.width) {
> > +    displayPort.x = -(displayPort.width - compositionBounds.width);
> 
> Is it necessary for the displayport to entirely contain the composition
> bounds? I thought that as long as it intersected it should be fine.

Yes, I think it is necessary - there's the possibility that the element in question will have an opaque display item over only part of its bounds, and if that part happens to occlude the intersection with the display-port, none of it will draw. To be certain that we draw correctly, I think it needs to contain the entire bounds. This is only temporary until the underlying layout issue is fixed, of course.

With that in mind, are there still changes you'd like to see to this patch?
Flags: needinfo?(bugmail.mozilla)
Comment on attachment 8343250 [details] [diff] [review]
Work around layout issue, make sure display port contains composition bounds (v2)

Review of attachment 8343250 [details] [diff] [review]:
-----------------------------------------------------------------

Ah that makes sense. In that case this patch is fine but I *really* would prefer fixing the underlying issue because this defeats most of the displayport prediction code by constraining it so heavily.
Attachment #8343250 - Flags: review?(bugmail.mozilla) → review+
Flags: needinfo?(bugmail.mozilla)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #27)
> Comment on attachment 8343250 [details] [diff] [review]
> Work around layout issue, make sure display port contains composition bounds
> (v2)
> 
> Review of attachment 8343250 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Ah that makes sense. In that case this patch is fine but I *really* would
> prefer fixing the underlying issue because this defeats most of the
> displayport prediction code by constraining it so heavily.

Agreed - I'm going to push this because the effects of this bug are pretty awful, but I'll go back to looking at bug 936500 unless tn gets there first. I think I was on the right track, but my limited layout knowledge was making it very slow getting to a fix...
https://hg.mozilla.org/mozilla-central/rev/adcc42bc9430
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Attached patch Backout patchSplinter Review
[Approval Request Comment]
Bug caused by (feature/regressing bug #): Our display ports will be much less useful and cause blank area to display more frequently than is necessary
User impact if declined: The above
Testing completed (on m-c, etc.): Tested locally, I'm confident in backing this out.
Risk to taking this patch (and alternatives if risky): Risk that the bug that this was working around still exists (it doesn't, so low risk)
String or IDL/UUID changes made by this patch: None
Attachment #8348063 - Flags: review?(bugmail.mozilla)
Attachment #8348063 - Flags: approval-mozilla-aurora?
Do I need to request mozilla-b2g28 too, or is Aurora still being merged?
Attachment #8348063 - Flags: review?(bugmail.mozilla) → review+
Requesting the blocking-1.3 flag.
blocking-b2g: --- → 1.3?
blocking-b2g: 1.3? → 1.3+
Attachment #8348063 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/7610a39b29b4

In my queue to push to b-i as well.
Resolution: FIXED → WONTFIX
Target Milestone: mozilla28 → ---
Thanks for handling this! Has this been backed out in b2g1.3 as well?
Flags: needinfo?(ryanvm)
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #34)
> https://hg.mozilla.org/releases/mozilla-aurora/rev/7610a39b29b4

Aurora = 1.3
Flags: needinfo?(ryanvm)
You need to log in before you can comment on or make changes to this bug.