Closed Bug 969483 Opened 6 years ago Closed 6 years ago

[B2G][Contacts]User is unable to scroll in the Add Contact menu after opening phone label menu and pressing Done

Categories

(Core :: Graphics: Layers, defect)

28 Branch
ARM
Gonk (Firefox OS)
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla30
blocking-b2g 1.3+
Tracking Status
firefox28 --- fixed
firefox29 --- fixed
firefox30 --- fixed
b2g-v1.2 --- unaffected
b2g-v1.3 --- fixed
b2g-v1.3T --- fixed
b2g-v1.4 --- fixed

People

(Reporter: astole, Assigned: kats)

References

()

Details

(Keywords: regression)

Attachments

(3 files, 2 obsolete files)

User is unable to scroll in the Add Contact menu after opening the keyboard and scrolling the page, then opening the phone label menu and pressing 'Done'.

Repro Steps:
1) Updated Buri to BuildID: 20140207004002
2) Open Contacts app and add a new contact
3) Select a field (ex. Name) to open the keyboard
4) Scroll down and select Mobile
5) Scroll down then up in Label menu
6) Press done and attempt to scroll

Actual:
User is unable to scroll in the Add contact menu after opening and closing the Label menu.

Expected:
User is able to scroll after opening and closing the Label menu.

Environmental Variables:
Device: Buri v1.3 Mozilla RIL
BuildID: 20140207004002
Gaia: 1527d1e450364e383eeb95ff898dca2042e2b4b5
Gecko: 0a6d83aabb02
Version: 28.0
Firmware Version: V1.2-device.cfg

Repro frequency: 100%
Link to failed test case:
See attached: Video
Attached video Video
Summary: [B2G][Contacts]User is unable to scroll in the Add Contact menu after opening phone label menu and pressing Done. → [B2G][Contacts]User is unable to scroll in the Add Contact menu after opening phone label menu and pressing Done
This issue does not occur on the latest 1.2.

Environmental Variables:
Device: Buri v1.2 Mozilla RIL
BuildID: 20140207004002
Gaia: 539a25e1887b902b8b25038c547048e691bd97f6
Gecko: 1592ce49929c
Version: 26.0
Firmware Version: V1.2-device.cfg
Blocks: gaia-apzc
blocking-b2g: --- → 1.3?
Component: Gaia::Contacts → Panning and Zooming
Product: Firefox OS → Core
Version: unspecified → 28 Branch
Regression Window:

Last Working Environmental Variables:
Device: Buri 1.3
BuildID: 20140122004001
Gaia: 744fb691c2b2a25a07c5d19fabf5748ae9aba4d9
Gecko: 71a8786c3815
Version: 28.0a2
Base Image: V1.2-device.cfg

First Broken Environmental Variables:
Device: Buri 1.3
BuildID: 20140123004001
Gaia: 6fbeac2415f07f10de181f0877ddf67ee299b885
Gecko: e8f6bdf8db3d
Version: 28.0a2
Base Image: V1.2-device.cfg
QA Contact: jzimbrick
Milan

Please take a look for APZC
Flags: needinfo?(milan)
Can we get a narrower range? The pvtbuilds server still has per-commit builds from 21 January onwards so this should still be possible in the next day or two.
Narrower range with tinderbox aurora builds:

Last Working Environmental Variables:
Device: Buri
BuildID: 20140122014610
Gaia: 15ad9c79120b3391d4b8c206417e0658d83fb84b
Gecko: 86b2dfa26b6e
Version: 28.0a2
Base Image: V1.2-device.cfg

First Broken Environmental Variables:
Device: Buri
BuildID: 20140122015710
Gaia: 15ad9c79120b3391d4b8c206417e0658d83fb84b
Gecko: 590e76b01565
Version: 28.0a2
Base Image: V1.2-device.cfg

Note: This issue only seems to occur when APZ is enabled.
Thanks!

If I'm doing this right, the only gecko change in that range is a gaia pushbot bump:

http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=86b2dfa26b6e&tochange=590e76b01565
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #7)
> Thanks!
> 
> If I'm doing this right, the only gecko change in that range is a gaia
> pushbot bump:
> 
> http://hg.mozilla.org/releases/mozilla-aurora/
> pushloghtml?fromchange=86b2dfa26b6e&tochange=590e76b01565

That range can't be right. That patch points to a gaia ui test change modifying a comment, which definitely can't cause this problem.
Can we get the range done again then?
The range is correct unless those tinderbox builds display completely wrong gecko and gaia variables. 

I've double checked the results in comment 6, the last working does not reproduce this issue, and the first broken does. The environmental variables were the same.
Could be a Gaia change that caused it then.  Either way, let's wait for this to become 1.3+ before we spend more time on it.
Flags: needinfo?(milan)
I triple checked this just for sanity's sake, and apparently the repro rate drops to less than 100% on the build listed in "Last Working" build listed in comment 6, but DOES reproduce. 

Each time I had checked it previously (and had another tester check) the issue did not reproduce, but after trying a few more times I was able to repro the issue.

On all the other builds, this would happen on the first try 100% of the time. This is not the case on build 20140122014610, the rate seems closer to 30% or so. 

So the regression range is incorrect after all, my apologies for the bad info. This could also indicate the original broader range listed in comment 3 was incorrect as well.

I'll see if I can get a better range with the lowered repro rate in mind on builds from before 01/22. Unfortunately, there's only a handful of builds left from the 21st still available in the tinderbox builds.
Okay, so the repro seemed to drop from 100% in more recent builds to something like 10-20% at some point the further back I went, but I believe this is the correct window. I retested this last working several times and could not get a repro.

Last Working Environmental Variables:
Device: Buri
BuildID: 20140109004002
Gaia: 22bc6be5b76cdc6d4e9667ff070979041a20ce2f
Gecko: 2c8f8683bd0d
Version: 28.0a2
Base Image: V1.2-device.cfg

First Broken Environmental Variables:
Device: Buri 
BuildID: 20140110004009
Gaia: c606b129a2d1647c0fc7bfb197555026e9b27f9e
Gecko: c5109884ae3a
Version: 28.0a2
Base Image: V1.2-device.cfg

Unfortunately, this range is before there are tinderbox builds available, so this is the closest I can get for now.
Hopefully there's a useful commit in this range.
To clarify - repro rate is now 10% - 20% on the latest 1.3, right? Also, can I a working video of the bug on the latest build? The video link here doesn't work.
Keywords: qawanted
The attached video and original URL didn't work so the video URL was changed in comment 2 to a working URL.
(In reply to Jason Smith [:jsmith] from comment #14)
> To clarify - repro rate is now 10% - 20% on the latest 1.3, right? Also, can
> I a working video of the bug on the latest build? The video link here
> doesn't work.

The repro rate seemed to be higher when the bug was reported  on 02/07, close to 100%.

Then the repro rate seemed to lower down to something like 10-20% going further back in time with builds from 01/22 and before, which is why the initial regression range posted was incorrect. It seemed that those builds were working fine at a glance, but after several retries I would get this to reproduce.

I just tested this on today's 1.3 build and it seems the repro rate is something around 80%, it's quite easy to reproduce.

It seems that Andrew addressed the video request in the URL field of this bug.

Also, I just noticed that in comment 13 the range I listed is around or on the date when APZ was enabled by default, so that range may not be correct either, as this issue only occurs when APZ is enabled. I can do some more investigation around that.
Keywords: qawanted
So, I did some further investigation with APZ manually turned on, it seems the regression window was further back after all. I attempted to repro this issue on the 12/17 build something like 35 times and did not see a repro, so unless this issue's repro rate lowered considerably even more, I am pretty confident this is finally the true regression window.

Last Working Environmental Variables:
Device: Buri
BuildID: 20131217004001
Gaia: dca0a3dcf062ce3e422a9c56d141c14543c816fb
Gecko: 1f7db4cc788e
Version: 28.0a2
Base Image: V1.2-device.cfg

First Broken Environmental Variables:
Device: Buri 
BuildID: 20131218004002
Gaia: a99b23e73fe5630a877004746d0e7fcec1b6d653
Gecko: 369bdbff6c38
Version: 28.0a2
Base Image: V1.2-device.cfg
Here is the pushlog that corresponds to the range in comment 17:

https://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=1f7db4cc788e&tochange=369bdbff6c38

A couple of APZ changes in here but based on my investigation so far I don't have a good idea of what's happening yet.

As far as I can tell, when this reproduces, the content is still scrollable (because the scrollHeight is greater than than the clientHeight) and I can make it scroll by setting the scrollTop values through the app manager console. Also if I press the power button to turn off the screen and then turn it back on, it seems to scroll normally. So I suspect something in the layer tree or APZ tree just isn't getting updated properly. I am rebuilding now to dump those trees so I can investigate further, but I didn't see anything in the regression range that stood out to me as obviously causing this.
Oddly enough I'm having more and more trouble reproducing this problem as I continue investigation. What I have determined so far is that when the scrolling is busted, what's actually happening is that on an ancestor layer of the scrollable thing, there is a CSS transform of x:100%. This means that when we do hit detection for the touch events, the untransform puts it a negative x-coordinate, and so it doesn't hit the right layer. It actually ends up hitting the layer underneath (the contacts list) and so that ends up scrolling instead. When you dismiss the "add contact" form you can see that the contact list has moved by however much you scrolled.

I'm not yet sure which CSS transform is causing this, why it only happens sometimes, and where the actual bug is.
So... on the times that I can reproduce this bug, the layer tree is broken (backwards iteration results in a different tree than forward iteration) but on times that I cannot reproduce this bug, the layer tree is fine. I'm not sure that the broken layer tree is directly causing this, but if they don't then they probably share the same root cause.
Component: Panning and Zooming → Graphics: Layers
Whiteboard: [ETA:2/18]
Comment on attachment 8376476 [details] [diff] [review]
Fix treebuilding logic

Discussed this with bjacob. My fix leaves behind another bug where mFirstChild isn't updated either. I will update the patch, maybe write some tests to with it.
Attachment #8376476 - Flags: review?(bjacob)
Whiteboard: [ETA:2/18] → [ETA:2/21]
Attachment #8377730 - Flags: review?(bgirard) → review+
Comment on attachment 8377731 [details] [diff] [review]
Part 2 - Fix the RepositionChild function and add a gtest for it

Review of attachment 8377731 [details] [diff] [review]:
-----------------------------------------------------------------

I'm happy to review this, feel free to review it again matt

::: gfx/tests/gtest/TestLayers.cpp
@@ +308,5 @@
> +
> +  //   0
> +  // 3 2 1
> +
> +  ASSERT_EQ(layers[2], layers[3]->GetNextSibling());

Why not check this after each operation?
Attachment #8377731 - Flags: review?(matt.woodrow) → review+
(In reply to Benoit Girard (:BenWa) from comment #28)
> Why not check this after each operation?

Sure, I can do that. I just figured the test at the end would only work correctly if all the previous tests worked as well.
Update to test the tree state after each change. Carrying r+.

Will land once the try is green. I just want to make sure that removing the stubs in part 1 doesn't break things somewhere because I believe it used to before (which is why they were added in the first place).
Attachment #8377731 - Attachment is obsolete: true
Attachment #8377787 - Flags: review+
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #29)
> (In reply to Benoit Girard (:BenWa) from comment #28)
> > Why not check this after each operation?
> 
> Sure, I can do that. I just figured the test at the end would only work
> correctly if all the previous tests worked as well.

I'll leave that up to you if its a good use of your time. It's possible that a mistake could cancel out.
Comment on attachment 8377730 [details] [diff] [review]
Part 1 - Don't replace good production code with bad test stub

Note that I'm requesting all sorts of uplift approval here because these patches are a pretty safe fix for what is obviously a bug in our layer tree code. I would like to get this uplifted to aurora and beta and thereby also get into b2g28, but if it gets minused for aurora and/or beta it should still get uplifted to b2g28 because this bug is a 1.3 blocker.

NOTE: This flag is now for security issues only. Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug in prehistoric code now tickled by APZ
User impact if declined: as described in comment 0 of this bug, but it could also manifest in many many other ways, depending on the structure of the layer tree.
Testing completed: locally
Risk to taking this patch (and alternatives if risky): low-risk. the actual patch is only two lines; the rest of it is testing code
String or UUID changes made by this patch: none
Attachment #8377730 - Flags: approval-mozilla-beta?
Attachment #8377730 - Flags: approval-mozilla-b2g28?
Attachment #8377730 - Flags: approval-mozilla-aurora?
Comment on attachment 8377787 [details] [diff] [review]
Part 2 - Fix the RepositionChild function and add a gtest for it (v2)

Same as above
Attachment #8377787 - Flags: approval-mozilla-beta?
Attachment #8377787 - Flags: approval-mozilla-b2g28?
Attachment #8377787 - Flags: approval-mozilla-aurora?
Attachment #8377730 - Flags: approval-mozilla-b2g28? → approval-mozilla-b2g28+
Attachment #8377787 - Flags: approval-mozilla-b2g28? → approval-mozilla-b2g28+
Comment on attachment 8377730 [details] [diff] [review]
Part 1 - Don't replace good production code with bad test stub

I will leave the b2g RM to approve (or not) the uplift
Attachment #8377730 - Flags: approval-mozilla-beta?
Attachment #8377730 - Flags: approval-mozilla-beta+
Attachment #8377730 - Flags: approval-mozilla-aurora?
Attachment #8377730 - Flags: approval-mozilla-aurora+
Attachment #8377787 - Flags: approval-mozilla-beta?
Attachment #8377787 - Flags: approval-mozilla-beta+
Attachment #8377787 - Flags: approval-mozilla-aurora?
Attachment #8377787 - Flags: approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.