Closed
Bug 1361983
Opened 7 years ago
Closed 7 years ago
Intermittent test_click.py TestClick.test_click_number_link | NoSuchElementException: Unable to locate element: username (pageunload timer canceled because flushEventLoop returns after beforeunload event)
Categories
(Testing :: Marionette Client and Harness, defect)
Tracking
(firefox54 fixed, firefox55 fixed)
RESOLVED
FIXED
mozilla55
People
(Reporter: intermittent-bug-filer, Assigned: whimboo)
References
Details
(Keywords: intermittent-failure)
Attachments
(1 file)
Filed by: hskupin [at] mozilla.com https://treeherder.mozilla.org/logviewer.html#?job_id=96498981&repo=autoland https://queue.taskcluster.net/v1/task/NShzdoExTlaZQlCoUvgJbQ/runs/0/artifacts/public/logs/live_backing.log
Assignee | ||
Comment 1•7 years ago
|
||
The fix on bug 1357634 may miss something. So this failure happens because we cancel the page unload timer too early. And that even after 200ms whereby we use an extended timeout of 5000ms: 1493878218361 Marionette TRACE 374 -> [0,4,"clickElement",{"id":"e31a38ae-c0ef-4939-bf22-3ca7b52c247b"}] 1493878219108 Marionette DEBUG Received DOM event "beforeunload" for "http://127.0.0.1:38163/clicks.html" 1493878219318 Marionette DEBUG Canceled page load listener because no navigation has been detected 1493878219396 Marionette TRACE 374 <- [1,4,null,{}]
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Summary: Intermittent test_click.py TestClick.test_click_number_link | NoSuchElementException: Unable to locate element: username → Intermittent test_click.py TestClick.test_click_number_link | NoSuchElementException: Unable to locate element: username (pageunload timer canceled too early)
Assignee | ||
Comment 2•7 years ago
|
||
So while deeply inspecting this code I can only see a single code path which could trigger this problem: We setup the page unload timer right after the trigger callback. But the event handlers are registered before the trigger callback is executed. In the case of `clickElement` there is the `flushEventLoop` call. It currently escapes for the next animation frame, or for the `beforeunload` event. So if the latter is happening, we haven't setup the page unload timer yet, and as such the if condition in `handleEvent` will not re-initialize the timer. But it got set by our code in `navigate`, right after the trigger callback returns. It means we have an overlap here. Sadly nothing on try reproduces this problem yet, so it got unnoticed and my above idea is just an assumption: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a7b55c37aa39f9960a6783dcab7287633907e403
Assignee | ||
Comment 3•7 years ago
|
||
The try jobs finally hit a single instance of this problem. The signature of the failure is not identical but caused by the same underlying issue: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a7b55c37aa39f9960a6783dcab7287633907e403&selectedJob=96941953 The gecko.log exactly shows what is wrong: 1494001827731 Marionette TRACE 447 -> [0,20,"clickElement",{"id":"13ec7301-33e4-4284-a5dd-d78d9c6bbf12"}] 1494001828035 Marionette DEBUG Received DOM event "beforeunload" for "http://127.0.0.1:49365/clicks.html" 1494001828035 Marionette DEBUG Having a page unload timer: null 1494001828045 Marionette DEBUG Set page unload timer to 200ms 1494001828246 Marionette DEBUG Canceled page load listener because no navigation has been detected 1494001828368 Marionette TRACE 447 <- [1,20,null,{}] It means my assumption of that a late return from `flushEventLoop` is causing this issue, was correct. It's not that clear to me right now how to fix it. The easiest would be to have a general page unload timer, which stops the navigation after the 5s and no page unload being detected. But I'm not sure if we can/should do it for commands like `get`, `back`, `forward`, or `refresh`. Those haven't opted-in for the page unload timer. David, what do you think? Is that a possibility, or should we better find another solution?
Flags: needinfo?(dburns)
Assignee | ||
Comment 4•7 years ago
|
||
Thanks to Dave who provided a minimized testcase (attachment 8863819 [details]) on bug 1335778, I can now perfectly replicate this issue locally. Testing a bit I now know how to fix it. It's actually obvious when looking at the current code.
Flags: needinfo?(dburns)
Comment hidden (mozreview-request) |
Assignee | ||
Comment 6•7 years ago
|
||
The attached patch fixes the issue, but I would like to have a test for this problem. Not sure yet if I can easily minimize the crashstats example, but lets see. I should have the patch ready by Monday.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 8•7 years ago
|
||
So it looks like that using an onclick event handler on a select/option element, which causes a page load can trigger such a behavior consistently. That's what I got minimized from the crash-stats example. With it I can reproduce the failure 100% of the time, and with my patch attached it just works. Here some timing: 1494238546361 Marionette TRACE 6 -> [0,9,"clickElement",{"id":"7c13c82b-b46e-5b46-8dc8-04a4c76aa5f5"}] 1494238546619 Marionette DEBUG Received DOM event "mousedown" for "[object HTMLSelectElement]" 1494238546633 Marionette DEBUG Received DOM event "mouseup" for "[object HTMLSelectElement]" 1494238546634 Marionette DEBUG Received DOM event "beforeunload" for "http://127.0.0.1:61690/clicks.html" 1494238546645 Marionette DEBUG Received DOM event "click" for "[object HTMLSelectElement]" 1494238546723 Marionette DEBUG Received DOM event "mouseout" for "[object HTMLBodyElement]" 1494238547663 Marionette DEBUG Received DOM event "pagehide" for "http://127.0.0.1:61690/clicks.html" 1494238547665 Marionette DEBUG Received DOM event "unload" for "http://127.0.0.1:61690/clicks.html" So the page load already gets started with the `mouseup` event, while the `click` event hasn't been propagated yet.
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8865009 -
Flags: review?(ato)
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Summary: Intermittent test_click.py TestClick.test_click_number_link | NoSuchElementException: Unable to locate element: username (pageunload timer canceled too early) → Intermittent test_click.py TestClick.test_click_number_link | NoSuchElementException: Unable to locate element: username (pageunload timer canceled because flushEventLoop returns after beforeunload event)
Comment 11•7 years ago
|
||
mozreview-review |
Comment on attachment 8865009 [details] Bug 1361983 - Extend page unload timer if flushEventLoop returns after beforeunload event. https://reviewboard.mozilla.org/r/136666/#review140032 ::: commit-message-7c902:8 (Diff revision 3) > +In cases when the navigation trigger function returns too late and the > +beforeunload event has already been fired, the page load currently gets > +canceled. This happens because the page unload timer hasn't been set at > +this time and the unbeforeunload handler doesn't extend its time. > + > +With this patch a flag is used which indicates if an unbeforeunload unbeforeunload → beforeunload? ::: testing/marionette/harness/marionette_harness/tests/unit/test_click.py:340 (Diff revision 3) > + def test_click_option_navigate(self): > + self.marionette.find_element(By.ID, 'option').click() > + self.marionette.find_element(By.ID, 'delay') Stay consistent with existing code style in the file by using double quotes. ::: testing/marionette/listener.js:146 (Diff revision 3) > + this.seenBeforeUnload = false; > this.seenUnload = false; You could use a Set here for seen events: > this.seen = new Set(); > > … > > this.seen.add(event.type); > > … > > // Set#delete returns true if item existed > if (this.seen.delete("beforeunload")) { > this.timerPageUnload.initWithCallback(…); > … > } This would save you from having to clear the seen events manually, but it’s also fine to do what you do here. ::: testing/marionette/listener.js:289 (Diff revision 3) > + // If no page unload has been detected, ensure to properly stop the load > + // listener, and return from the currently active command. > + } else if (!this.seenUnload) { > logger.debug("Canceled page load listener because no navigation " + > "has been detected"); > this.stop(); > sendOk(this.command_id); > } this.seenUnload is not reset like this.seenBeforeUnload?
Attachment #8865009 -
Flags: review?(ato) → review+
Assignee | ||
Comment 12•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8865009 [details] Bug 1361983 - Extend page unload timer if flushEventLoop returns after beforeunload event. https://reviewboard.mozilla.org/r/136666/#review140032 > You could use a Set here for seen events: > > > this.seen = new Set(); > > > > … > > > > this.seen.add(event.type); > > > > … > > > > // Set#delete returns true if item existed > > if (this.seen.delete("beforeunload")) { > > this.timerPageUnload.initWithCallback(…); > > … > > } > > This would save you from having to clear the seen events manually, but it’s also fine to do what you do here. I will keep this in mind. If we have to make another change for that code we can get this implemented. Right now I would leave it as is. > this.seenUnload is not reset like this.seenBeforeUnload? `this.seenBeforeUnload` is only reset so that we do not run into the same if block again. If we have processed it once, we don't need it anymore. Anything else gets reset when `loadListener.start()` is called.
Comment hidden (mozreview-request) |
Comment 14•7 years ago
|
||
Pushed by hskupin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/7761907c8b70 Extend page unload timer if flushEventLoop returns after beforeunload event. r=ato
Comment 15•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/7761907c8b70
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment hidden (Intermittent Failures Robot) |
Comment 17•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/e2018fdeb7fa
status-firefox54:
--- → fixed
Updated•1 year ago
|
Product: Testing → Remote Protocol
Comment 18•1 year ago
|
||
Moving bug to Testing::Marionette Client and Harness component per bug 1815831.
Component: Marionette → Marionette Client and Harness
Product: Remote Protocol → Testing
You need to log in
before you can comment on or make changes to this bug.
Description
•