WebDriver:ElementClick sometimes does not trigger a click
Categories
(Remote Protocol :: Marionette, defect, P1)
Tracking
(firefox82 fixed, firefox83 fixed)
People
(Reporter: whimboo, Assigned: whimboo)
References
(Blocks 3 open bugs)
Details
(Keywords: testcase-wanted, Whiteboard: [marionette-fission-mvp][simple])
Attachments
(3 files)
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta-
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
|
Details | Review |
Updated•7 years ago
|
Assignee | ||
Comment 2•7 years ago
|
||
Comment 4•7 years ago
|
||
Assignee | ||
Comment 5•7 years ago
|
||
Updated•7 years ago
|
Assignee | ||
Comment 6•4 years ago
|
||
As noticed on bug 1665209 we sometimes may hit the non-spec conforming click code but the one for Selenium:
https://searchfox.org/mozilla-release/rev/7ac70cbbbe2f44d1140535d2909458d04212b32f/testing/marionette/interaction.js#130-133
I would suggest that we add a trace log entry if that code path is used. It would give us an understanding if my assumption is correct.
Assignee | ||
Comment 7•4 years ago
|
||
Updated•4 years ago
|
Assignee | ||
Comment 8•4 years ago
|
||
I'm only going to land the trace log patch but not going to fix this bug at the moment.
Updated•4 years ago
|
Assignee | ||
Updated•4 years ago
|
Comment 10•4 years ago
|
||
bugherder |
Assignee | ||
Comment 11•4 years ago
|
||
Now that we know if we call the legacy click by accident and that this doesn't actually happen in CI these days, we should make sure that the click definitely gets synthesized on the target element. For that I would propose that we add a new trace logger entry here:
While I was adding that I noticed that when we call the flushEventLoop()
method, the element as passed-in is NOT the target element but the container element. As such the click
event is not detected if the event doesn't bubble up. As such I think we have to pass-in the real element that actually receives the click.
Assignee | ||
Comment 12•4 years ago
|
||
When synthesizing a click on an element the flushing the event
loop step right after awaits a click event on the element's
container but not the element itself. This could fail if
events are not bubbling up the DOM. As such the click event
of the actual target element needs to be awaited.
Updated•4 years ago
|
Assignee | ||
Comment 13•4 years ago
|
||
Actually the try build perfectly shows such an example:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=316360560&repo=try&lineNumber=5820-5827
[task 2020-09-22T10:56:25.152Z] 10:56:25 INFO - 1600772185142 Marionette DEBUG 210 -> [0,6,"WebDriver:FindElement",{"using":"link text","value":"333333"}]
[task 2020-09-22T10:56:25.212Z] 10:56:25 INFO - 1600772185206 Marionette DEBUG 210 <- [1,6,null,{"value":{"element-6066-11e4-a52e-4f735466cecf":"ec06f7e3-8ee4-4f25-86c0-b7485e06a837"}}]
[task 2020-09-22T10:56:25.212Z] 10:56:25 INFO - 1600772185212 Marionette DEBUG 210 -> [0,7,"WebDriver:ElementClick",{"id":"ec06f7e3-8ee4-4f25-86c0-b7485e06a837"}]
[task 2020-09-22T10:56:25.435Z] 10:56:25 INFO - 1600772185422 Marionette TRACE Canceled page load listener because no navigation has been detected
[task 2020-09-22T10:56:25.435Z] 10:56:25 INFO - 1600772185423 Marionette DEBUG 210 <- [1,7,null,{"value":null}]
[task 2020-09-22T10:56:25.453Z] 10:56:25 INFO - 1600772185436 Marionette DEBUG 210 -> [0,8,"WebDriver:GetCurrentURL",{}]
[task 2020-09-22T10:56:25.453Z] 10:56:25 INFO - 1600772185447 Marionette TRACE Received DOM event click for http://127.0.0.1:49983/test.html
[task 2020-09-22T10:56:25.455Z] 10:56:25 INFO - 1600772185452 Marionette DEBUG 210 <- [1,8,null,{"value":"http://127.0.0.1:49983/clicks.html"}]
As it can be seen WebDriver:ElementClick
returns way too early, and doesn't await the click
event at all. As such it's clear why no navigation gets detected.
The reason for that seems to be that synthesizing the click is actually delayed, and the appropriate event is getting fired about 250ms later! Given that our check for beforeunload
only allows a 200ms delay, its Promise gets resolved before the click
event, and causes the command to return.
Olli, are those delays something which could happen once in a while? Or unexpected and should be investigated / fixed? Not sure if that might be only related to synthesizing the click via event utils, or if real clicks could also be affected.
Comment 14•4 years ago
|
||
How is the click synthesized? In which process and how? If in different process, then nothing of course guarantees messages get through
in a timely manner if OS decides to not run IPC thread or something (if the system is otherwise very busy).
Assignee | ||
Comment 15•4 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #14)
How is the click synthesized? In which process and how? If in different process, then nothing of course guarantees messages get through
in a timely manner if OS decides to not run IPC thread or something (if the system is otherwise very busy).
Yes, the click is synthesized in the appropriate content process here:
https://searchfox.org/mozilla-central/rev/b58ca45005fe02077c92779483d1b60e9a49687c/testing/marionette/event.js#164
That will make things tricky given that we have to setup a Promise first that checks for a possible navigation. It's part of the parent process and will timeout after 200ms. Maybe we have to allow a longer timeout, and somehow modify its existing timer instance for the click event delay.
Comment 16•4 years ago
|
||
So I assume something which calls synthesizeMouseAtPoint is delayed for whatever reason.
Assignee | ||
Comment 17•4 years ago
|
||
I will add some trace log output for timing. So lets see if that is for the IPC between the parent and content process, or indeed for synthesizeMouseAtPoint
.
Assignee | ||
Comment 18•4 years ago
|
||
Here a try build with trace logging for synthesizing the click and the event:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=803a5525f9cc62f14e00d14f6a5579a8e163df1a
I will have a look a the results tomorrow.
Assignee | ||
Comment 19•4 years ago
|
||
Here an example of such a delayed behavior:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=316408344&repo=try&lineNumber=7847-7863
** 1600811489009: 195 -> [0,16,"WebDriver:ElementClick",{"id":"19073fbb-393f-4835-87e9-f68ca5dfab31"}]
** 1600811489020: navigation set beforeunload timer
** 1600811489020: driver calling clickElement in listener
** 1600811489034: listener calling clickElement in interaction
** 1600811489291: navigation beforeunload not seen
** 1600811489291: Canceled page load listener because no navigation has been detected
** 1600811489291: driver after click + navigation?
** 1600811489292: 195 <- [1,16,null,{"value":null}]
** 1600811489336: 195 -> [0,17,"WebDriver:GetCurrentURL",{}]
** 1600811489335: interaction added click event listener
** 1600811489336: interaction calling synthesize mouse
** 1600811489344: interaction click event fired
** 1600811489345: Received DOM event click for http://127.0.0.1:44898/clicks.html#
** 1600811489345: interaction called synthesize mouse
** 1600811489363: interaction after click event
It shows that the IPC call from driver -> listener is done within 14ms and as such is fine. But the problem is when calling interaction.clickElement()
from within listener here:
https://searchfox.org/mozilla-central/rev/35d927df97900a57ecb562ad13909e392440b0fb/testing/marionette/listener.js#753-757
https://searchfox.org/mozilla-central/rev/35d927df97900a57ecb562ad13909e392440b0fb/testing/marionette/interaction.js#116
https://searchfox.org/mozilla-central/rev/35d927df97900a57ecb562ad13909e392440b0fb/testing/marionette/interaction.js#180
It means that it's not the click that is delayed here but the code related to these checks:
https://searchfox.org/mozilla-central/rev/35d927df97900a57ecb562ad13909e392440b0fb/testing/marionette/interaction.js#132-172
I will push another try build with more logger entries there. But anyway, the wait for beforeunload
clearly has to wait until clickElement
has been returned from the child process.
Assignee | ||
Comment 20•4 years ago
•
|
||
That's very strange. The delay indeed happens when calling into interaction.js:
Marionette TRACE ** 1600845279202: driver calling clickElement in listener
Marionette TRACE ** 1600845279203: listener calling clickElement in interaction
Marionette TRACE ** 1600845279402: navigation beforeunload not seen
Marionette TRACE Canceled page load listener because no navigation has been detected
Marionette TRACE ** 1600845279402: driver after click + navigation?
Marionette DEBUG 114 <- [1,20,null,{"value":null}]
Marionette TRACE ** 1600845279403: interaction: file check
Assignee | ||
Comment 21•4 years ago
|
||
Actually, we have the situation that current capabilities have to be retrieved from the parent process, and that actually triggers a sync message to the parent process. And by its implementation it's even done twice for both moz:accessibilityChecks
and moz:webdriverClick
:
I don't think that we should do that, and instead directly pass-over the capabilities as options to the appropriate command.
Assignee | ||
Comment 22•4 years ago
|
||
The new try build with the considered patch applied looks great:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1329cfec785e11a9a119bf8aa40e714770e33dbf
So I'm taking the way in passing the capabilities over to the framescript commands whose need these, and by that also getting rid of the capability getter in the framescript.
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Comment 23•4 years ago
|
||
For fixing the race condition I filed the follow-up bug 1666755.
Assignee | ||
Comment 24•4 years ago
|
||
To not cause extra synchronous IPC round-trips with the parent process
each method in the framescript that needs capabilities would have to
get those passed along with its call.
This will reduce code complexity and safety because there is only
a single object of truth. Also performance will be greatly improved
for the affected commands.
Depends on D90987
Assignee | ||
Comment 25•4 years ago
|
||
Once landed I will make sure that we get this patch uplifted to beta.
Updated•4 years ago
|
Comment 26•4 years ago
|
||
Comment 28•4 years ago
|
||
bugherder |
Assignee | ||
Comment 29•4 years ago
|
||
This bug is fixed now.
Updated•4 years ago
|
Assignee | ||
Comment 30•4 years ago
|
||
Comment on attachment 9177359 [details]
Bug 1394354 - [marionette] Don't update capabilities synchronously in the framescript.
Beta/Release Uplift Approval Request
- User impact if declined: The Webdriver command "Element click" can fail for users of Selenium when the machine is under high cpu load. Under such a condition the click might be synthesized with a delay >200ms so that the command returns without actually doing the click.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Extra IPC roundtrips to fetch Webdriver capabilities are removed to not cause unexpected delays in processing the click. Instead capabilities are directly send as arguments to the framescript.
- String changes made/needed: no
Assignee | ||
Updated•4 years ago
|
Comment 31•4 years ago
|
||
This bug is 3 years old, what makes it worth uplifting to 82 vs riding to 83?
Assignee | ||
Comment 32•4 years ago
|
||
(In reply to Julien Cristau [:jcristau] from comment #31)
This bug is 3 years old, what makes it worth uplifting to 82 vs riding to 83?
Recent work for Fission support in Marionette caused a spike of test failures for various tests, and that also with Fission disabled. Only that work actually allowed us to somewhat reproduce the failure, get it investigated, and fixed. Before that the chance to hit this failure was extremely low.
That said users of Selenium are likely to also see this spike, with the result that their test cases would be broken. As such I highly recommend to uplift this feature to 82 beta.
Comment 33•4 years ago
|
||
Comment on attachment 9175980 [details]
Bug 1394354 - [marionette] Add trace log output when using non spec-compatible element click.
this one's already on beta
Comment 34•4 years ago
|
||
Comment on attachment 9177359 [details]
Bug 1394354 - [marionette] Don't update capabilities synchronously in the framescript.
approved for 82.0b4
Updated•4 years ago
|
Comment 35•4 years ago
|
||
bugherder uplift |
Updated•2 years ago
|
Description
•