Closed Bug 1187056 Opened 4 years ago Closed 4 years ago
_gallery _edit _photo .py: "Timed out after 10 .3 seconds"
Description: test_gallery_edit_photo test is failing after opening the gallery app and selecting a photo to embiggen. While on large photo view, test taps the edit button but no action resolves, leaving the test to timeout. Saw this issue reproduce once manually but have not discovered STR at this time. * http://jenkins1.qa.scl3.mozilla.com/view/UI/job/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.2.bitbar/181/HTML_Report/ * http://jenkins1.qa.scl3.mozilla.com/job/flame-kk.ui.adhoc.bitbar/146/HTML_Report/ PreReq: * picture in gallery app Repro Steps: 1) Update phone to 20150723010205 2) Open Gallery app 3) Tap picture 4) Tap Edit button Actual: Phone does not transition to edit view; test times out Expected: Phone transitions to edit view; test continues editting Traceback (most recent call last): File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.2.bitbar/workspace/.env/lib/python2.7/site-packages/marionette_client-0.16-py2.7.egg/marionette/marionette_test.py", line 296, in run testMethod() File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.2.bitbar/workspace/tests/python/gaia-ui-tests/gaiatest/tests/functional/gallery/test_gallery_edit_photo.py", line 28, in test_gallery_edit_photo edit_image = image.tap_edit_button() File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.2.bitbar/workspace/tests/python/gaia-ui-tests/gaiatest/apps/gallery/regions/fullscreen_image.py", line 92, in tap_edit_button return EditPhoto(self.marionette) File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.2.bitbar/workspace/tests/python/gaia-ui-tests/gaiatest/apps/gallery/regions/edit_photo.py", line 27, in __init__ *self._canvas_locator)))) File "/var/lib/jenkins/jobs/flame-kk-319.mozilla-central.nightly.ui.functional.non-smoke.2.bitbar/workspace/.env/lib/python2.7/site-packages/marionette_driver-0.9-py2.7.egg/marionette_driver/wait.py", line 143, in until cause=last_exc) TimeoutException: TimeoutException: Timed out after 10.3 seconds Environmental Variables: Device firmware (base) L1TC000118D0 Device firmware (date) 23 Jul 2015 08:24:55 Device firmware (incremental) eng.cltbld.20150723.042443 Device firmware (release) 4.4.2 Device identifier flame Device memory 219772 kB Device serial 356cd044 Device uptime 0 days 0 hours 2 minutes 8 seconds Gaia date 22 Jul 2015 18:44:09 Gaia revision f04fdbfa1943 Gecko build 20150723010205 Gecko revision 2ddec2dedced Gecko version 42.0a1 Reproducible manually: Once, cannot reproduce at this time Repro frequency: 3/6
Ok, I can reproduce this manually too, when you after the Gallery is open, you quickly tap on the image to open it as a fullscreen image. So this looks like a regression to me. We should find out when this regressed.
Confirmed comment 1. It seems that the bug will only reproduce if I tap into a picture while Gallery is still loading. Flame 2.2 prevented this behavior. If a certain picture is not ready to enter Edit mode, Edit button is grayed out until it's ready. Working on getting the window.
I cannot reproduce this issue using shallow flashing whatsoever. That means I'll have to full flash to find the window and won't be able to do swaps to find out whether it's Gecko or Gaia issue.
This is also a problem with test_gallery_delete_image.py.
I don't understand why ./flash_pvt.py makes this reproduce, but when I do ./flash_pvt.py -d flame-kk -v mozilla-central --eng -g -G -b 20150624080416 it doesn't.
Fortunately I don't have to do swaps for this one. mozilla-inbound regression window: Last Working Device: Flame BuildID: 20150721075041 Gaia: 4fe0507781f3ed56c8ae5e66dd9489165d1ff68e Gecko: 49ae0961591e Gonk: 41d3e221039d1c4486fc13ff26793a7a39226423 Version: 42.0a1 (2.5 Master) Firmware Version: v18D User Agent: Mozilla/5.0 (Mobile; rv:42.0) Gecko/42.0 Firefox/42.0 First Broken Device: Flame BuildID: 20150721075241 Gaia: 4fe0507781f3ed56c8ae5e66dd9489165d1ff68e Gecko: fee45cd9a4d9 Gonk: 41d3e221039d1c4486fc13ff26793a7a39226423 Version: 42.0a1 (2.5 Master) Firmware Version: v18D User Agent: Mozilla/5.0 (Mobile; rv:42.0) Gecko/42.0 Firefox/42.0 Gaia is the same so it's a Gecko issue. Gecko pushlog: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=49ae0961591e&tochange=fee45cd9a4d9 This issue is likely caused by changes made in Bug 1178847.
[Blocking Requested - why for this release]: Broken functionality and a regression so nominating this 2.5? Kartikaya, can you take a look at this please? This looks to have been caused by the changes made for bug 1178847.
blocking-b2g: --- → 2.5?
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker) → needinfo?(bugmail.mozilla)
Thanks, I'll take a look.
Assignee: nobody → bugmail.mozilla
I can reproduce this behaviour very intermittently. When I do, I see in WebIDE that the buttons in question don't have their onclick listeners registered, which seems very odd. I don't see how my change could have caused that behaviour, since the onclick listeners are registered unconditionally in frames.js. I'll keep digging.
So I'm unable to reproduce the problem manually any more. I also can't reproduce it while running test_gallery_edit_photo.py; instead it just times out while waiting for files to load, even though visually I can see on the device that the files are done loading. This is the command I tried: GAIATEST_SKIP_WARNING=1 GAIATEST_ACKNOWLEDGED_RISKS=1 $MARIONETTE_ENV/bin/gaiatest --address=localhost:2828 --testvars gaiatest/testvars_template.json --restart --timeout 120000 gaiatest/tests/functional/gallery/test_gallery_edit_photo.py Without a way to reproduce this problem I can't really do much else. I don't think this test/suite has enough documentation on how to properly run it on a local dev machine.
Assignee: bugmail.mozilla → nobody
(In reply to Kartikaya Gupta (email:email@example.com) from comment #13) > Without a way to reproduce this problem I can't really do much else. I don't > think this test/suite has enough documentation on how to properly run it on > a local dev machine. What do you mean? You were able to run it, no?
So I locally ran the test using today's nightly build, and I can repro the error. Basically it times out after tapping the edit button, and waiting for the canvas view. (Because canvas view doesn't come up). I tapped the edit button manually after the test failed, and it was still unresponsive. Kats, if you need me to try something out here, let me know. It's strange that this happens with low repro rate when I do it manually.
(In reply to Martijn Wargers [:mwargers] (QA) from comment #14) > What do you mean? You were able to run it, no? Yes, but the environment on the device is not specified in enough detail. It loads pictures off the SD card and in my case I have around 80-100 pictures that I took yesterday while trying to reproduce this bug. It seems to me that the test is not expecting that, which is why it's timing out for me. I don't know what set of pictures the test is expecting, and it doesn't appear to be documented anywhere.
(In reply to Kartikaya Gupta (email:firstname.lastname@example.org) from comment #16) > (In reply to Martijn Wargers [:mwargers] (QA) from comment #14) > > What do you mean? You were able to run it, no? > > Yes, but the environment on the device is not specified in enough detail. It > loads pictures off the SD card and in my case I have around 80-100 pictures > that I took yesterday while trying to reproduce this bug. It seems to me > that the test is not expecting that, which is why it's timing out for me. I > don't know what set of pictures the test is expecting, and it doesn't appear > to be documented anywhere. Oh, I see. That sounds like a bug in the framework. It should actually delete those pictures and only use the picture that was added in the test.
So what I found out was that I can only repro the bug once for each cold startup. I have to shutdown and power up the phone and run the script to repro the bug, and after that, I cannot reproduce this unless I turn off the phone again. Also I am on 319MB config as well. mwargers is investigating 512MB setup.
(In reply to No-Jun Park [:njpark] from comment #18) > Also I am on 319MB config as well. mwargers is investigating > 512MB setup. Yeah, I couldn't reproduce after 20 retries, but I could reproduce it 2 times out of 10 using 319MB. If one would want to reproduce manually, I would recommend trying with 256MB.
(In reply to Martijn Wargers [:mwargers] (QA) from comment #17) > Oh, I see. That sounds like a bug in the framework. It should actually > delete those pictures and only use the picture that was added in the test. It's actually strange that it doesn't do that, because in bug 1044019 this was done supposedly.
blocking-b2g: 2.5? → 2.5+
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage+][memory-failure]
Ok, I tried on a flame-kk and I can reproduce the test failure there. For some reason it doesn't repro on Aries. Anyway I can flash a local gecko build on top of a stock pvtbuild and still repro so I do have a chance at debugging this now. I tried flashing a full local build but that made the issue go away. Haven't tried flashing just gaia yet.
So what appears to be happening is that the gallery.js code is loaded before the meta-viewport tag has taken effect. This results in ScreenLayout.getCurrentLayout('tiny') returning false (because the innerWidth is 980) and so the fullscreenButtons array gets populated with empty things. This causes the test to fail later on. I'm not really sure why the meta-viewport tag is getting processed so late, since it's right at the top of the HTML file, before the <script> tags. Need to dig some more.
Assignee: nobody → bugmail.mozilla
Ok, so the problem is the async-ness of the DOMMetaAdded dispatch at . The meta viewport tag is parsed before the JS is run, but the DOMMetaAdded event itself isn't dispatched until after, so the MobileViewportManager doesn't update the viewport until after. I'm not entirely sure how this was working prior to bug 1178847 since it seems like it should have had the same problem. Nonetheless I think the correct fix is to add a synchronous notification of the meta-viewport tag so that we can update the screen properties immediately. The patch also has the side benefit of not running MobileViewportManager code with random non-viewport meta tags are added, which is nice. However I'm not entirely sure it's ok to be running a reflow synchronously from the DOM building code, if it isn't then I don't see a way to fix this in gecko and it will have to be fixed in the gallery app. Smaug, do you know if this is ok? Try push in progress at https://treeherder.mozilla.org/#/jobs?repo=try&revision=341db37de415  http://mxr.mozilla.org/mozilla-central/source/dom/html/HTMLMetaElement.cpp?rev=bd079aadd3fe#97
Attachment #8641169 - Flags: feedback?(bugs)
Comment on attachment 8641169 [details] [diff] [review] Patch This is unfortunately most probably not safe. We won't reflow at that point, since there are script blockers on stack, but doing anything like MetaViewportChanged() during BindToTree looks scary to me. One would need to explicitly prove that MetaViewportChanged call may not run scripts. So either prove that and assert somewhere hard (RELEASE_ASSERT) that script aren't run, nor events dispatched etc. or Would it help if HTMLMetaElement::CreateAndDispatchEvent didn't use asyncDispatcher->PostDOMEvent(); but asyncDispatcher->RunDOMEventWhenSafe() ?
Attachment #8641169 - Flags: feedback?(bugs) → feedback-
Yeah using RunDOMEventWhenSafe also fixes it, thanks for the tip! Try push at https://treeherder.mozilla.org/#/jobs?repo=try&revision=fe342ec1f1f9
Comment on attachment 8641210 [details] [diff] [review] Patch v2 r+, naturally assuming tryserver results look good
Attachment #8641210 - Flags: review?(bugs) → review+
Try looks good. There's a frequent M-3 failure on B2G but looking at the "similar jobs" pane shows a lot of other pushes from the same time hit that too. More retriggers on it are coming up green, so I assume some glitch in the environment caused those. I'll land it once inbound reopens.
Component: Gaia::Gallery → Layout
Product: Firefox OS → Core
adding qawanted to test out with gaia ui test once it lands.
I just tested locally test_gallery_edit_photo.py and repeated 10 times. I didn't get any failure. Normally, I would definitely see this failure within 5 times, so I think this is really fixed. Thanks for fixing this, Kats!
This issue is verified as fixed on Flame and Aries. Ran the 'test_gallery_edit_photo.py' locally 10 times on each device/branch and they all passed without issue. Device: Aries 2.6 BuildID: 20151210122424 Gaia: 7e962276913bd4da7ce5fa7540767107ce322c78 Gecko: 412e4d7ce98ca4dbc37de133d0f26d7e1a59946f Gonk: a19052e4389c3ae2d8fc3e7a74a475401baacc56 Version: 45.0a1 (2.6) Firmware Version: D5803_23.1.A.1.28_NCB.ftf User Agent: Mozilla/5.0 (Mobile; rv:45.0) Gecko/45.0 Firefox/45.0 Device: Flame 2.6 BuildID: 20151210030222 Gaia: 961528f4391668bc89ec0be14fa367cea099b588 Gecko: b40ba117fa757861c9caa660ae989122718b494b Gonk: 205ac4204bbbb2098a8046444acba551ba5dc75a Version: 45.0a1 (2.6) Firmware Version: v18Dv4 User Agent: Mozilla/5.0 (Mobile; rv:45.0) Gecko/45.0 Firefox/45.0 Device: Flame 2.5 BuildID: 20151209170211 Gaia: 7ca639a7bb0bacf27f548841c52617bfc0e3b21f Gecko: a35e8eb98969970d1af28b265bf99a9edd11e9c2 Gonk: 205ac4204bbbb2098a8046444acba551ba5dc75a Version: 44.0a2 (2.5) Firmware Version: v18Dv4 User Agent: Mozilla/5.0 (Mobile; rv:44.0) Gecko/44.0 Firefox/44.0
QA Whiteboard: [QAnalyst-Triage?][memory-failure] → [QAnalyst-Triage+][memory-failure]
You need to log in before you can comment on or make changes to this bug.