Closed Bug 1087453 Opened 10 years ago Closed 10 years ago

[Email] Scrolling is disabled on email settings page

Categories

(Core :: Layout, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla36
blocking-b2g 2.1+
Tracking Status
firefox34 --- wontfix
firefox35 --- wontfix
firefox36 --- fixed
b2g-v2.0 --- unaffected
b2g-v2.1 --- verified
b2g-v2.2 --- verified

People

(Reporter: smiko, Assigned: tnikkel)

References

()

Details

(Keywords: regression, Whiteboard: [caf priority: p2][2.1-Daily-Testing][CR 746850])

Attachments

(7 files, 3 obsolete files)

Attached file Email.txt
Description: In the email application, the user is unable to scroll on the settings page. This blocks the ability to delete email accounts.

Setup: Have an active email account signed in.

Repro Steps:
1: Update a Flame to 20141022001201.
2: Open Email > Menu Icon (top left) > Settings Icon (gear) > Chose an account.
3: Scroll to the bottom of the page.

Actual: Scrolling is disabled.

Expected: The user is able to scroll down on the page.

Device: Flame 2.1 (319mb/full flash)
BuildID: 20141022001201
Gaia: 3d9cc667f4e929861a9a77c41096bbf5a9c1bde0
Gecko: 928b18f7d8ff
Gonk: 05aa7b98d3f891b334031dc710d48d0d6b82ec1d
Version: 34.0 (2.1)
Firmware: V188
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Repro frequency: 100%

See attached: logcat

Video clip:http://youtu.be/j4BF0kFgDxU
This issue does NOT repro on Flame 2.2 (319mb/full flash)

Actual result: The user is able to scroll down.

Device: Flame 2.2
BuildID: 20141022040201
Gaia: 4d7f051cede6544f4c83580253c743c22b0cb279
Gecko: ae4d9b4ff2ee
Gonk: 05aa7b98d3f891b334031dc710d48d0d6b82ec1d
Version: 36.0a1 (2.2)
Firmware: V188
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
Whiteboard: [2.1-Daily-Testing] [systemsfe]
On Flame 2.2 (319mb/full flash) the user is unable to scroll on the initial try. After that, the user is able to scroll down. 

Device: Flame 2.2
BuildID: 20141022040201
Gaia: 4d7f051cede6544f4c83580253c743c22b0cb279
Gecko: ae4d9b4ff2ee
Gonk: 05aa7b98d3f891b334031dc710d48d0d6b82ec1d
Version: 36.0a1 (2.2)
Firmware: V188
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0
[Blocking Requested - why for this release]:
Functional regression of a core feature.

Requesting a window.
blocking-b2g: --- → 2.1?
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
QA Contact: pcheng
Ni :doliver here given this needs urgent attention as it breaks smoketests.
Flags: needinfo?(doliver)
I flashed a 2.1 device, and I see the bug, but I am at a loss to see something that email has done specifically to cause this. It is either a change in gecko or maybe something in app window system management? Or maybe in a building block?

Here is what I have checked so far:

The core email app on 2.1 has not had any changes to it in over a week:

https://github.com/mozilla-b2g/gaia/commits/v2.1/apps/email

I first thought that maybe the email cards event interception went awry, which would result in scroll or clicks not being allowed. However, Cards._eatingEventsUntilNextCard is correctly set as false, and taps in the card still work. So for example, tapping on the signature button to edit it, works, and transitions to a new card.

Interestingly, after opening the "Sent Using Firefox OS" signature edit card, then going back to account settings, the account settings can then be scrolled.

I checked the scrollHeight of the scrollregion-below-header element, the scrollHeight of 905 is larger than the getBoundingClientRect height of 489, so it implies it thinks there is content that can be scrolled.

Explicitly adding an overflow-y via WebIDE does not affect the issue.

Looking at the changes in /shared:
https://github.com/mozilla-b2g/gaia/commits/v2.1/shared/

This change sounded vaguely scroll related:
https://github.com/mozilla-b2g/gaia/commit/c91588fbddfe4a15418b2cf51e0792092eaed71c

I tried reverting that locally, as well as this system app change:
https://github.com/mozilla-b2g/gaia/commit/9f1b84c3655128aec5b50523878b7892f69430a0

But it did not fix the issue. However, perhaps code related/around those changes, maybe in the platform too, contribute to the issue?

I am out of time for it tonight, but so far I do not expect an email change caused the issue, perhaps just an interaction with how it does the card work and a recent gecko or system change. A regression range would help.
Flags: needinfo?(doliver)
Whiteboard: [2.1-Daily-Testing] [systemsfe] → [2.1-Daily-Testing]
mozilla-inbound regression window:

Last Working Environmental Variables:
Device: Flame
BuildID: 20141020174936
Gaia: dc496d04907dd314f9736ff78bab3bd27156f79a
Gecko: 33f348ff0417
Version: 36.0a1 (2.2 Master)
Firmware: V188
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0

First Broken Environmental Variables:
Device: Flame
BuildID: 20141020182200
Gaia: dc496d04907dd314f9736ff78bab3bd27156f79a
Gecko: 2d4f4862589d
Version: 36.0a1 (2.2 Master)
Firmware: V188
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0

Gaia is the same so it's a Gecko issue.

Gecko pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=33f348ff0417&tochange=2d4f4862589d

Caused by Bug 1076783.

Note that as comment 2 mentioned, this bug is much LESS severe in trunk. On 2.2 this issue only occurs on the initial swipe on the said page, the issue would no longer occur on the 2nd swipe. Whereas in 2.1 the page doesn't respond to all swipes. I'm not sure what makes the discrepancy between these two branches.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmitchell)
Moving to Core/Layout to match component on bug 1076783.
Component: Gaia::E-Mail → Layout
Product: Firefox OS → Core
Version: unspecified → Trunk
Blocks: 1076783
No longer blocks: 1076783
If this is a smoketest blocker please ensure that the keywords indicate that.
Blocks: 1076783
Caused by Bug 1076783. Can you take a look Timothy
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(jmitchell) → needinfo?(tnikkel)
QA Contact: pcheng
Layer dumps from B2G on master. In the first layer dump, the layer 0xae57a400 is the scrollinfo container layer that represents the scrollable area in question. It is (almost entirely) obscured by the PaintedLayerComposite just after it, 0xae57a800. The compbounds height on the scrollinfo layer is 734, but the height of the visible region of the PaintedLayer is 710, so there's a 24-pixel gap at the bottom. If I initiate a scroll in that 24 pixel gap it works fine. If I try to scroll from higher up on the page it doesn't work the first time.

However, the failed attempt at scrolling somehow seems to activate the subframe, so that the PaintedLayerComposite now gets a frame metrics. This allows it to scroll normally on the second scroll attempt.

I'm not sure why the layer tree ends up exactly this way, but this is kind of what I was worried about with bug 1076783 and what I was doing sanity tests for. I wasn't able to construct a case where this happened but clearly it does.
I'm assuming that on 2.1 the failed attempt at scrolling doesn't activate the subframe, for whatever reason, and so the subsequent scroll attempts fare no better than the first.
This is the layer dump with layout.async-containerless-scrolling.enabled set to false.
Same as before, but now also includes the layer dump after doing the scroll. There is an extra APZ in this one.
Attachment #8510554 - Attachment is obsolete: true
This is the last bit of output after I clicked on the email account and it painted the email settings screen. I full log was a >1meg so I just kept the last few layer/displaylist dump iterations.
Ok, I don't know what's going on here. Now when I load it with multi-layer-apz disabled I still see the scrollinfo layer. I'll attach the full dumps of everything that gets output between my clicking on the email account and the final render of the settings for that email account.
A brief history of the relevant layer subtree...

(A) In the olden days (2.0) we had:

ContainerLayerComposite
  ThebesLayerComposite
  ContainerLayerComposite    APZC (4,2,7)    cb=Rect(0,75,480,743) (scrollinfo)
  ThebesLayerComposite

In this case the scrollinfo layer was obscured by another layer, but this was before bug 973096 landed so it didn't matter because we didn't check for obscured-ness.

(B) On a central build from 2014-07-29, just after bug 973096 landed, I see a layer tree that looks like this: (dimensions are a little different because I used a hamachi for this one)

ContainerLayerComposite
  ContainerLayerComposite    APZC (3,2,7)    cb=Rect(0,50,320,406) (scrollinfo)
  ThebesLayerComposite
  ThebesLayerComposite

That is, the scrollinfo layer magically moved to the top, and it is no longer obscured, so hit-testing worked fine.

(C) On master with multi-layer-apz DISABLED, the layer tree looks like this:

ContainerLayerComposite
  ContainerLayerComposite  APZC (5,2,8)    cb=Rect(0,75,480,734) (scrollinfo)
  PaintedLayerComposite
  PaintedLayerComposite

Same as before, so again hit-testing works fine since the scrollinfo layer is not obscured.

(D) On master with multi-layer-apz ENABLED, the layer tree looks like this:

ContainerLayerComposite
  PaintedLayerComposite
  ContainerLayerComposite  APZC (5,2,7)    cb=Rect(0,75,480,734)   (scrollinfo)
  PaintedLayerComposite

and this fails because we check for obscured-ness and the PaintedLayerComposite on top of the scrollinfo layer obscures it.

(E) On master, prior to landing bug 1076783 I assume the scrollinfo layer was still on top of everything, and so again it worked fine.

---
It's an open question as to what changed between A and B that caused the scrollinfo layer to move up.

It's also an open question which is the more "correct" layer tree structure. Thinking about what the APZ code does at the moment, it expects the scrollinfo layer to be not necessarily at the "top" or the "bottom", but in the correct position in z-order so that it sits just on top of the layer with the content. Perhaps the real problem in bug 1076783 was that the root process scrollinfo layer was in the wrong position. That is, the "fullscreen" layer tree from attachment 8507380 [details], looked like this:

[start]
ContainerLayerComposite
  ContainerLayerComposite
    ContainerLayerComposite   APZC (1,3,3) (scrollinfo)
    RefLayerComposite
      ContainerLayerComposite APZC (5,2,3) (child process)
        ImageLayerComposite
        ColorLayerComposite
        PaintedLayerComposite
        PaintedLayerComposite
    ContainerLayerComposite
      ColorLayerComposite
      PaintedLayerComposite
    PaintedLayerComposite
  ColorLayerComposite
  PaintedLayerComposite
[end]

The scrollinfo layer is generated because of the scrollgrab attribute set on an element in the parent process, but it's not clear to me why that layer is positioned where it is. In particular, I would expect a scrollgrabbing layer to be an ancestor of the thing it's grabbing, and that's not the case here.

---
Putting all that aside for the moment, tn suggested a solution for now which involves modifying the APZ hit test code so that it treats scrollinfo layers (with no displayport) as always unobscured. So based on the layer trees above, I think it would fix this bug but quite possibly regress bug 1076783, because the (1,3,3) scrollinfo layer above would not be obscured by the reflayer and its contents. I'll give that a shot though, it should be pretty easy to hack up.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #19)
> So based on the layer trees
> above, I think it would fix this bug but quite possibly regress bug 1076783,
> because the (1,3,3) scrollinfo layer above would not be obscured by the
> reflayer and its contents. I'll give that a shot though, it should be pretty
> easy to hack up.

Actually it turns out that does work, because the (1,3,3) APZ ends up as a sibling of the (5,2,3) APZ and the (5,2,3) one is "on top". So I guess this might be a viable solution if we don't find anything better.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #19)
> It's an open question as to what changed between A and B that caused the
> scrollinfo layer to move up.

Bug 1041510. It was a regression fix for bug 1022612 that came shortly after.
The reason that bug 1076783 happened was because the scroll info layer was on top of fixed position content because when placing the scroll info layer we just checked if there was anything on the positioned descendants list and then placed it on top of those. This is a bad idea when those positioned descendants aren't scrolled by our scroll frame.

I think we should also ignore all scroll info layers for the purposes of hit testing if there is a non-scroll info layer for the same scroll id. Because if we have the actual scrolled content we should hit that, not the scroll info layer, which just blindly covers the whole scroll frame area.
Flags: needinfo?(tnikkel)
So your patch to reposition the scrollinfo layer works great, it fixes both bugs (this one and the youtube one) and your description of it makes sense to me conceptually. As discussed I think we should get it landed. If we need to I can also write a patch to ignore the scroll info layers for hit testing if there is a non-scroll info layer for the same scroll id, but it should really only matter in the case where the scrollinfo layer is in the root process (such as it was in the youtube case).
Comment on attachment 8510982 [details] [diff] [review]
put scroll info layers belows positioned descendants that aren't scrolled by the associated scroll frame

Turns out we need scroll info layers on top because they don't get hit otherwise. The original problem was a scroll info layer being put on top of fixed pos content, which doesn't make sense since our scroll frame does not scroll it. In this patch we only put the scroll info layer on top of positioned descendants that we scroll. There are still cases that will not get the ideal behaviour, but given that the bug 1076783 was the only problem we've found since multi-layer-apz landed with having scroll info layers on top (and that was caused by/complicated by the fact that the scroll info layer was in the parent process), it should be pretty rare.
Attachment #8510982 - Flags: review?(roc)
blocking-b2g: 2.1? → 2.1+
https://hg.mozilla.org/mozilla-central/rev/853cd6236161
Assignee: nobody → tnikkel
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Please request b2g34 approval on this when you get a chance (though it sounds like we may want to wait on bug 1089620).
Flags: needinfo?(tnikkel)
I think the state is that uplifting this should fix bug 1089620 (which so far has only been seen on 2.1) so we should go ahead and request uplift here.
Comment on attachment 8510982 [details] [diff] [review]
put scroll info layers belows positioned descendants that aren't scrolled by the associated scroll frame

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 1076783
User impact if declined: can't scroll in some situations
Testing completed: on m-c, tested locally by kats
Risk to taking this patch (and alternatives if risky): this patch moves scroll info layers up. before bug 1076783 they used to be even higher. but that caused bug 1076783. so this patch puts them near the top, but slightly lower (below out of flow stuff that isn't scrolled by the scroll frame). Since we've had this behaviour for quite a while and only found bug 1076783 this should be the best compromise.
String or UUID changes made by this patch: none
Flags: needinfo?(tnikkel)
Attachment #8510982 - Flags: approval-mozilla-b2g34?
Keywords: verifyme
Comment on attachment 8510982 [details] [diff] [review]
put scroll info layers belows positioned descendants that aren't scrolled by the associated scroll frame

:njpark, can you please help or assign someone to do some exploratory testing here around scrolling. Thanks!
Attachment #8510982 - Flags: approval-mozilla-b2g34? → approval-mozilla-b2g34+
Flags: needinfo?(npark)
[Environment]
Gaia-Rev        0dfc1996eb583c8b507a82bf6b8319624bba23ea
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/80e18ff7c7b2
Build-ID        20141029160202
Version         36.0a1
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  39
FW-Date         Thu Oct 16 18:19:14 CST 2014
Bootloader      L1TC00011880


[Result]
PASS
Status: RESOLVED → VERIFIED
(In reply to bhavana bajaj [:bajaj] from comment #34)
> Comment on attachment 8510982 [details] [diff] [review]
> put scroll info layers belows positioned descendants that aren't scrolled by
> the associated scroll frame
> 
> :njpark, can you please help or assign someone to do some exploratory
> testing here around scrolling. Thanks!

I did a quick check this morning on the mail app on master branch, and graphically things looks fine.
Flags: needinfo?(npark)
Whiteboard: [2.1-Daily-Testing] → [2.1-Daily-Testing][CR 746850]
Whiteboard: [2.1-Daily-Testing][CR 746850] → [caf priority: p2][2.1-Daily-Testing][CR 746850]
This issue is verified fixed on Flame 2.1 and 2.2.

Result: Scrolling on the email settings screen functions properly.

Device: Flame 2.1 (319mb)(Kitkat Base)(Full Flash)
BuildID: 20141104001202
Gaia: 8b0cf889ae0d48a9eb7ecdcb9b67590de45cc5e5
Gecko: 388b03efe92d
Gonk: 48835395daa6a49b281db62c50805bd6ca24077e
Version: 34.0 (2.1)
Firmware: V188
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Device: Flame 2.2 Master (319mb)(Kitkat Base)(Full Flash)
BuildID: 20141104040207
Gaia: 3c50520982560ccba301474d1ac43706138fc851
Gecko: 54d05732f29b
Gonk: 48835395daa6a49b281db62c50805bd6ca24077e
Version: 36.0a1 (2.2) 
Firmware Version: v188
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Keywords: verifyme
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: