Closed Bug 1345653 Opened 3 years ago Closed 3 years ago

Pointer actions execution hangs geckodriver/marionette after repeated execution

Categories

(Testing :: Marionette, defect, P1)

55 Branch
defect

Tracking

(firefox53 fixed, firefox54 fixed, firefox55 fixed)

RESOLVED FIXED
mozilla55
Tracking Status
firefox53 --- fixed
firefox54 --- fixed
firefox55 --- fixed

People

(Reporter: james.h.evans.jr, Assigned: maja_zf)

References

(Blocks 2 open bugs)

Details

Attachments

(8 files, 1 obsolete file)

Attached file start.html
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:51.0) Gecko/20100101 Firefox/51.0
Build ID: 20170125094131

Steps to reproduce:

Host the attached start.html and destination.html files on a web server.

Run the following code (provided code is in C# using Selenium, but the wire protocol commands are provided to make it easy to translate into any language and toolset equivalent):

public void TestAction()
{
    // POST http://<geckodriver location and port>/session
    IWebDriver driver = new FirefoxDriver();
    
    // N.B. Update to location of actual file.
    // POST http://<geckodriver location and port>/session/<session id>/url
    // Payload: {"url": "http://localhost:8080/start.html"}
    driver.Url = "http://localhost:8080/start.html";
    
    // Lazy person's wait. Don't do this in production.
    System.Threading.Thread.Sleep(2000);

    // POST http://<geckodriver location and port>/session/<session id>/element
    // Payload: {"using": "css selector", "value": "#normal"}
    IWebElement element = driver.FindElement(By.CssSelector("#normal"));
    
    // POST http://<geckodriver location and port>/session/<session id>/actions
    // The below action produces the JSON payload documented in actions.txt.
    new Actions(driver).Click(element).Perform();
    
    // N.B. Update to location of actual file.
    // POST http://<geckodriver location and port>/session/<session id>/url
    // Payload: {"url": "http://localhost:8080/start.html"}
    driver.Url = "http://localhost:8080/start.html";
    
    // Lazy person's wait. Don't do this in production.
    System.Threading.Thread.Sleep(2000);   

    // POST http://<geckodriver location and port>/session/<session id>/element
    // Payload: {"using": "css selector", "value": "#normal"}
    IWebElement element = driver.FindElement(By.CssSelector("#normal"));
    
    // POST http://<geckodriver location and port>/session/<session id>/actions
    // The below action produces the JSON payload documented in payload.txt.
    new Actions(driver).Click(element).Perform();
    
    // DELETE http://<geckodriver location and port>/session/<session id>
    driver.Quit();
}



Actual results:

Under ordinary execution, the code hangs on the second call to the `action` end point. Note that there were times during debugging that the code was able to proceed after several seconds' delay, suggesting a race condition of some sort.


Expected results:

The code should have run to completion without hanging.
Attached file destination.html
Attached file payload.txt
Reference to "actions.txt" in code comment is incorrect. The actual attached file name is "payload.txt".
Maja, can you have a look into this please?
Flags: needinfo?(mjzffr)
Thanks for the report. I haven't had a chance to reproduce yet, but it's on my to-do list.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I've reproduced this and have been trying out different conditions for the hang. What I've observed so far:
* no useful info in trace-level logs,
* specific to Marionette, not geckodriver,
* not specific to pointerMove, 
* the hang is tied to a click that results in navigation, (when clicking a link whose default behaviour is prevented, I don't encounter a hang),
* the hang doesn't occur with "Marionette:clickElement" only "performActions". A quick comparison of the two makes me wonder whether the performActions implementation is missing something like [1]
* the hang only occurs on the _second_ navigation if that second navigation is done with performActions, regardless of how the first navigation is done (get_url, clickElement, performActions)


[1] http://searchfox.org/mozilla-central/rev/7419b368156a6efa24777b21b0e5706be89a9c2f/testing/marionette/interaction.js#188
Flags: needinfo?(mjzffr)
Assignee: nobody → mjzffr
[mass] Setting priority
Priority: -- → P1
Comment on attachment 8854299 [details]
Bug 1345653 - Handle document unload when dispatching actions;

https://reviewboard.mozilla.org/r/126230/#review128856

::: commit-message-4c987:5
(Diff revision 2)
> +Bug 1345653 - Handle document unload when dispatching actions; r?ato
> +
> +This fixes the reported hang that occurs after a pointer click
> +action resulting in navigation.
> +

Superfluous newline in commit message here.
Attachment #8854299 - Flags: review?(ato) → review+
Comment on attachment 8854301 [details]
discard Bug 1345653 - Move creation of data URIs to shared util module;

https://reviewboard.mozilla.org/r/126234/#review128860

::: testing/web-platform/tests/webdriver/util/document.py:3
(Diff revision 3)
> +def inline(doc):
> +    return "data:text/html;charset=utf-8,%s" % urllib.quote(doc)

Unfortunately this already exists in on WPT master in https://github.com/w3c/web-platform-tests/blob/master/webdriver/support/inline.py, but this change does not seem to have propagated to mozilla-central yet.

I guess we go ahead with adding this, and then do a follow-up to remove it?
Attachment #8854301 - Flags: review?(ato) → review+
Comment on attachment 8854300 [details]
Bug 1345653 - Add pause and click to actions client API;

https://reviewboard.mozilla.org/r/126232/#review128858

::: testing/web-platform/tests/tools/webdriver/webdriver/client.py:117
(Diff revision 3)
>          }
>          if self._pointer_params is not None:
>              d["parameters"] = self._pointer_params
>          return d
>  
> +    def clear(self):

Why would you not simply construct a new chain?  It some more sensible to throw away the current chain than to add mutability methods for manipulating its state.

::: testing/web-platform/tests/tools/webdriver/webdriver/client.py:181
(Diff revision 3)
> +
> +        :param element: Element to click.
> +        :param button: Integer representing pointer button to perform action
> +                       with. Default: 0, which represents main device button.
> +        """
> +        return self.pointer_move(0, 0, duration=250, origin=element).click(button)

Is the duration something picked out of the air?  Do we want the client API to click an element to imply a delay, but not the arbitrary click on location?
Attachment #8854300 - Flags: review?(ato) → review-
Comment on attachment 8854302 [details]
Bug 1345653 - Test pointer move with element origin and click with navigation;

https://reviewboard.mozilla.org/r/126236/#review128862

::: testing/web-platform/tests/webdriver/actions/mouse.py:6
(Diff revision 4)
>  from support.refine import get_events, filter_dict
> +from util.document import inline
> +
> +
> +def link_doc(dest):
> +    content = '<a href="{}" id="link">destination</a>'.format(dest)

Double quotes, please.

::: testing/web-platform/tests/webdriver/actions/mouse.py:46
(Diff revision 4)
> +    assert "mousedown" in [e["type"] for e in events]
> +    for e in events:
> +        if e["type"] == "mousedown":

You’re already testing the type on :46, why the if condition on :48?
Comment on attachment 8854302 [details]
Bug 1345653 - Test pointer move with element origin and click with navigation;

https://reviewboard.mozilla.org/r/126236/#review128864
Attachment #8854302 - Flags: review?(ato) → review+
Comment on attachment 8854300 [details]
Bug 1345653 - Add pause and click to actions client API;

https://reviewboard.mozilla.org/r/126232/#review128858

> Why would you not simply construct a new chain?  It some more sensible to throw away the current chain than to add mutability methods for manipulating its state.

Only because creating a new chain with the same characteristics is mildly cumbersome, but I don't feel strongly about it. I agree: breaking the invariant that methods only add to the chain is not ideal.

> Is the duration something picked out of the air?  Do we want the client API to click an element to imply a delay, but not the arbitrary click on location?

I'm matching what I've seen in the payload produced by Selenium clients for "click element". Happy to remove. In fact, I'll combine click and click_element.
Comment on attachment 8854302 [details]
Bug 1345653 - Test pointer move with element origin and click with navigation;

https://reviewboard.mozilla.org/r/126236/#review128862

> You’re already testing the type on :46, why the if condition on :48?

I've added more detail to this test, so the question doesn't quite apply anymore. In any case, if the first step I'm checking that the mousedown event took place. In the second step I'm checking other event parameters, but only for the mousedown.
Comment on attachment 8854300 [details]
Bug 1345653 - Add pause and click to actions client API;

https://reviewboard.mozilla.org/r/126232/#review128858

> I'm matching what I've seen in the payload produced by Selenium clients for "click element". Happy to remove. In fact, I'll combine click and click_element.

With Selenium I assume it “teleports” the mouse to the targetted location because it doesn’t generate mousemove DOM events for the path from origin to destination, like W3C WebDriver does.  So giving this a duration of 250 ms will have different behaviour than giving the equivalent Selenium API the same duration.

That said, click_element is a fairly high level API so maybe this is what we expect it to do.  Maybe the sensible thing to do here is _not_ to define click_element at all, considering wdclient is solely being used for low-level testing of the WebDriver protocol?
Comment on attachment 8854300 [details]
Bug 1345653 - Add pause and click to actions client API;

https://reviewboard.mozilla.org/r/126232/#review128898

I guess the latest iteration is good enough.  Fine for it land as it is now.
Attachment #8854300 - Flags: review?(ato) → review+
We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again.

hg error in cmd: hg rebase -s e6df13e259ec -d 9fee9ccec3c9: rebasing 386669:e6df13e259ec "Bug 1345653 - Handle document unload when dispatching actions; r=ato"
merging testing/marionette/action.js
rebasing 386670:71d85e43e7a2 "Bug 1345653 - Add pause and click to actions client API; r=ato"
merging testing/web-platform/tests/tools/webdriver/webdriver/client.py
rebasing 386671:88f2f835e120 "discard Bug 1345653 - Move creation of data URIs to shared util module; r=ato"
merging testing/web-platform/meta/MANIFEST.json
warning: conflicts while merging testing/web-platform/meta/MANIFEST.json! (edit, then use 'hg resolve --mark')
unresolved conflicts (see hg resolve, then hg rebase --continue)
Attachment #8854301 - Attachment is obsolete: true
Pushed by mjzffr@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/9e8dd49499e1
Handle document unload when dispatching actions; r=ato
https://hg.mozilla.org/integration/autoland/rev/ec7d217640a2
Add pause and click to actions client API; r=ato
https://hg.mozilla.org/integration/autoland/rev/a9d111f06a1a
Test pointer move with element origin and click with navigation; r=ato
Please uplift to 54 and 53, a=testonly. I've attached a patch file for each. 

Similarly to https://bugzilla.mozilla.org/show_bug.cgi?id=1332279#c20, the uplift consists only of the first commit in the series.
Whiteboard: [checkin-needed-aurora][checkin-needed-beta]
https://hg.mozilla.org/releases/mozilla-aurora/rev/47c3ec42917e
Whiteboard: [checkin-needed-aurora][checkin-needed-beta] → [checkin-needed-beta]
Pushed by james@hoppipolla.co.uk:
https://hg.mozilla.org/integration/mozilla-inbound/rev/cf9c01ad600d
Add pause and click to actions client API; r=ato
You need to log in before you can comment on or make changes to this bug.