Closed Bug 969483 Opened 6 years ago Closed 6 years ago
[B2G][Contacts]User is unable to scroll in the Add Contact menu after opening phone label menu and pressing Done
1.35 MB, video/quicktime
1.96 KB, patch
|Details | Diff | Splinter Review|
4.91 KB, patch
|Details | Diff | Splinter Review|
User is unable to scroll in the Add Contact menu after opening the keyboard and scrolling the page, then opening the phone label menu and pressing 'Done'. Repro Steps: 1) Updated Buri to BuildID: 20140207004002 2) Open Contacts app and add a new contact 3) Select a field (ex. Name) to open the keyboard 4) Scroll down and select Mobile 5) Scroll down then up in Label menu 6) Press done and attempt to scroll Actual: User is unable to scroll in the Add contact menu after opening and closing the Label menu. Expected: User is able to scroll after opening and closing the Label menu. Environmental Variables: Device: Buri v1.3 Mozilla RIL BuildID: 20140207004002 Gaia: 1527d1e450364e383eeb95ff898dca2042e2b4b5 Gecko: 0a6d83aabb02 Version: 28.0 Firmware Version: V1.2-device.cfg Repro frequency: 100% Link to failed test case: See attached: Video
Summary: [B2G][Contacts]User is unable to scroll in the Add Contact menu after opening phone label menu and pressing Done. → [B2G][Contacts]User is unable to scroll in the Add Contact menu after opening phone label menu and pressing Done
This issue does not occur on the latest 1.2. Environmental Variables: Device: Buri v1.2 Mozilla RIL BuildID: 20140207004002 Gaia: 539a25e1887b902b8b25038c547048e691bd97f6 Gecko: 1592ce49929c Version: 26.0 Firmware Version: V1.2-device.cfg
blocking-b2g: --- → 1.3?
Component: Gaia::Contacts → Panning and Zooming
Product: Firefox OS → Core
Version: unspecified → 28 Branch
Regression Window: Last Working Environmental Variables: Device: Buri 1.3 BuildID: 20140122004001 Gaia: 744fb691c2b2a25a07c5d19fabf5748ae9aba4d9 Gecko: 71a8786c3815 Version: 28.0a2 Base Image: V1.2-device.cfg First Broken Environmental Variables: Device: Buri 1.3 BuildID: 20140123004001 Gaia: 6fbeac2415f07f10de181f0877ddf67ee299b885 Gecko: e8f6bdf8db3d Version: 28.0a2 Base Image: V1.2-device.cfg
QA Contact: jzimbrick
Milan Please take a look for APZC
Can we get a narrower range? The pvtbuilds server still has per-commit builds from 21 January onwards so this should still be possible in the next day or two.
Narrower range with tinderbox aurora builds: Last Working Environmental Variables: Device: Buri BuildID: 20140122014610 Gaia: 15ad9c79120b3391d4b8c206417e0658d83fb84b Gecko: 86b2dfa26b6e Version: 28.0a2 Base Image: V1.2-device.cfg First Broken Environmental Variables: Device: Buri BuildID: 20140122015710 Gaia: 15ad9c79120b3391d4b8c206417e0658d83fb84b Gecko: 590e76b01565 Version: 28.0a2 Base Image: V1.2-device.cfg Note: This issue only seems to occur when APZ is enabled.
Thanks! If I'm doing this right, the only gecko change in that range is a gaia pushbot bump: http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=86b2dfa26b6e&tochange=590e76b01565
(In reply to Kartikaya Gupta (email:email@example.com) from comment #7) > Thanks! > > If I'm doing this right, the only gecko change in that range is a gaia > pushbot bump: > > http://hg.mozilla.org/releases/mozilla-aurora/ > pushloghtml?fromchange=86b2dfa26b6e&tochange=590e76b01565 That range can't be right. That patch points to a gaia ui test change modifying a comment, which definitely can't cause this problem.
Can we get the range done again then?
The range is correct unless those tinderbox builds display completely wrong gecko and gaia variables. I've double checked the results in comment 6, the last working does not reproduce this issue, and the first broken does. The environmental variables were the same.
Could be a Gaia change that caused it then. Either way, let's wait for this to become 1.3+ before we spend more time on it.
I triple checked this just for sanity's sake, and apparently the repro rate drops to less than 100% on the build listed in "Last Working" build listed in comment 6, but DOES reproduce. Each time I had checked it previously (and had another tester check) the issue did not reproduce, but after trying a few more times I was able to repro the issue. On all the other builds, this would happen on the first try 100% of the time. This is not the case on build 20140122014610, the rate seems closer to 30% or so. So the regression range is incorrect after all, my apologies for the bad info. This could also indicate the original broader range listed in comment 3 was incorrect as well. I'll see if I can get a better range with the lowered repro rate in mind on builds from before 01/22. Unfortunately, there's only a handful of builds left from the 21st still available in the tinderbox builds.
Okay, so the repro seemed to drop from 100% in more recent builds to something like 10-20% at some point the further back I went, but I believe this is the correct window. I retested this last working several times and could not get a repro. Last Working Environmental Variables: Device: Buri BuildID: 20140109004002 Gaia: 22bc6be5b76cdc6d4e9667ff070979041a20ce2f Gecko: 2c8f8683bd0d Version: 28.0a2 Base Image: V1.2-device.cfg First Broken Environmental Variables: Device: Buri BuildID: 20140110004009 Gaia: c606b129a2d1647c0fc7bfb197555026e9b27f9e Gecko: c5109884ae3a Version: 28.0a2 Base Image: V1.2-device.cfg Unfortunately, this range is before there are tinderbox builds available, so this is the closest I can get for now. Hopefully there's a useful commit in this range.
To clarify - repro rate is now 10% - 20% on the latest 1.3, right? Also, can I a working video of the bug on the latest build? The video link here doesn't work.
The attached video and original URL didn't work so the video URL was changed in comment 2 to a working URL.
(In reply to Jason Smith [:jsmith] from comment #14) > To clarify - repro rate is now 10% - 20% on the latest 1.3, right? Also, can > I a working video of the bug on the latest build? The video link here > doesn't work. The repro rate seemed to be higher when the bug was reported on 02/07, close to 100%. Then the repro rate seemed to lower down to something like 10-20% going further back in time with builds from 01/22 and before, which is why the initial regression range posted was incorrect. It seemed that those builds were working fine at a glance, but after several retries I would get this to reproduce. I just tested this on today's 1.3 build and it seems the repro rate is something around 80%, it's quite easy to reproduce. It seems that Andrew addressed the video request in the URL field of this bug. Also, I just noticed that in comment 13 the range I listed is around or on the date when APZ was enabled by default, so that range may not be correct either, as this issue only occurs when APZ is enabled. I can do some more investigation around that.
So, I did some further investigation with APZ manually turned on, it seems the regression window was further back after all. I attempted to repro this issue on the 12/17 build something like 35 times and did not see a repro, so unless this issue's repro rate lowered considerably even more, I am pretty confident this is finally the true regression window. Last Working Environmental Variables: Device: Buri BuildID: 20131217004001 Gaia: dca0a3dcf062ce3e422a9c56d141c14543c816fb Gecko: 1f7db4cc788e Version: 28.0a2 Base Image: V1.2-device.cfg First Broken Environmental Variables: Device: Buri BuildID: 20131218004002 Gaia: a99b23e73fe5630a877004746d0e7fcec1b6d653 Gecko: 369bdbff6c38 Version: 28.0a2 Base Image: V1.2-device.cfg
Here is the pushlog that corresponds to the range in comment 17: https://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=1f7db4cc788e&tochange=369bdbff6c38 A couple of APZ changes in here but based on my investigation so far I don't have a good idea of what's happening yet. As far as I can tell, when this reproduces, the content is still scrollable (because the scrollHeight is greater than than the clientHeight) and I can make it scroll by setting the scrollTop values through the app manager console. Also if I press the power button to turn off the screen and then turn it back on, it seems to scroll normally. So I suspect something in the layer tree or APZ tree just isn't getting updated properly. I am rebuilding now to dump those trees so I can investigate further, but I didn't see anything in the regression range that stood out to me as obviously causing this.
Oddly enough I'm having more and more trouble reproducing this problem as I continue investigation. What I have determined so far is that when the scrolling is busted, what's actually happening is that on an ancestor layer of the scrollable thing, there is a CSS transform of x:100%. This means that when we do hit detection for the touch events, the untransform puts it a negative x-coordinate, and so it doesn't hit the right layer. It actually ends up hitting the layer underneath (the contacts list) and so that ends up scrolling instead. When you dismiss the "add contact" form you can see that the contact list has moved by however much you scrolled. I'm not yet sure which CSS transform is causing this, why it only happens sometimes, and where the actual bug is.
So... on the times that I can reproduce this bug, the layer tree is broken (backwards iteration results in a different tree than forward iteration) but on times that I cannot reproduce this bug, the layer tree is fine. I'm not sure that the broken layer tree is directly causing this, but if they don't then they probably share the same root cause.
Component: Panning and Zooming → Graphics: Layers
Depends on: 970747
No longer depends on: 970747
Comment on attachment 8376476 [details] [diff] [review] Fix treebuilding logic Discussed this with bjacob. My fix leaves behind another bug where mFirstChild isn't updated either. I will update the patch, maybe write some tests to with it.
Whiteboard: [ETA:2/18] → [ETA:2/21]
Attachment #8377730 - Flags: review?(bgirard) → review+
Comment on attachment 8377731 [details] [diff] [review] Part 2 - Fix the RepositionChild function and add a gtest for it Review of attachment 8377731 [details] [diff] [review]: ----------------------------------------------------------------- I'm happy to review this, feel free to review it again matt ::: gfx/tests/gtest/TestLayers.cpp @@ +308,5 @@ > + > + // 0 > + // 3 2 1 > + > + ASSERT_EQ(layers, layers->GetNextSibling()); Why not check this after each operation?
Attachment #8377731 - Flags: review?(matt.woodrow) → review+
(In reply to Benoit Girard (:BenWa) from comment #28) > Why not check this after each operation? Sure, I can do that. I just figured the test at the end would only work correctly if all the previous tests worked as well.
Update to test the tree state after each change. Carrying r+. Will land once the try is green. I just want to make sure that removing the stubs in part 1 doesn't break things somewhere because I believe it used to before (which is why they were added in the first place).
(In reply to Kartikaya Gupta (email:firstname.lastname@example.org) from comment #29) > (In reply to Benoit Girard (:BenWa) from comment #28) > > Why not check this after each operation? > > Sure, I can do that. I just figured the test at the end would only work > correctly if all the previous tests worked as well. I'll leave that up to you if its a good use of your time. It's possible that a mistake could cancel out.
Saw enough green on the try run to convince me. remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/b8ee6ec3af98 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/e1c8174ce97b
Comment on attachment 8377730 [details] [diff] [review] Part 1 - Don't replace good production code with bad test stub Note that I'm requesting all sorts of uplift approval here because these patches are a pretty safe fix for what is obviously a bug in our layer tree code. I would like to get this uplifted to aurora and beta and thereby also get into b2g28, but if it gets minused for aurora and/or beta it should still get uplifted to b2g28 because this bug is a 1.3 blocker. NOTE: This flag is now for security issues only. Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings. [Approval Request Comment] Bug caused by (feature/regressing bug #): bug in prehistoric code now tickled by APZ User impact if declined: as described in comment 0 of this bug, but it could also manifest in many many other ways, depending on the structure of the layer tree. Testing completed: locally Risk to taking this patch (and alternatives if risky): low-risk. the actual patch is only two lines; the rest of it is testing code String or UUID changes made by this patch: none
Comment on attachment 8377787 [details] [diff] [review] Part 2 - Fix the RepositionChild function and add a gtest for it (v2) Same as above
Attachment #8377730 - Flags: approval-mozilla-b2g28? → approval-mozilla-b2g28+
Attachment #8377787 - Flags: approval-mozilla-b2g28? → approval-mozilla-b2g28+
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Comment on attachment 8377730 [details] [diff] [review] Part 1 - Don't replace good production code with bad test stub I will leave the b2g RM to approve (or not) the uplift
remote: https://hg.mozilla.org/releases/mozilla-beta/rev/b24e6c62e64a remote: https://hg.mozilla.org/releases/mozilla-beta/rev/175f7f1ac03f Based on https://wiki.mozilla.org/Release_Management/B2G_Landing#v1.3.0 this should get merged to b2g28 automatically.
You need to log in before you can comment on or make changes to this bug.