Closed Bug 1492499 Opened 6 years ago Closed 6 years ago

Make window manipulation more reliable

Categories

(Remote Protocol :: Marionette, enhancement, P1)

enhancement

Tracking

(firefox65 fixed)

RESOLVED FIXED
mozilla65
Tracking Status
firefox65 --- fixed

People

(Reporter: ato, Assigned: ato)

References

(Blocks 1 open bug, Regressed 1 open bug)

Details

Attachments

(20 files, 16 obsolete files)

46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
Marionette has long had a problem with unreliable window manipulation. Subsequent invocations of WebDriver:SetWindowRect, WebDriver:FullscreenWindow followed by a call to SetWindowRect, &c. are examples of when the window rect might be returned with the correct values but the subsequent call to GetWindowRect will return the wrong values, or of when our throtteling code waiting for the WM to process and for the outerWidth/outerHeight DOM properties to be propagated doesn’t work. The aim is to provide synchronous APIs for window manipulation that does not return control to the user before the window has been successfully moved/resized/minimized/maximized/fullscreen’ed and the relevant internal properties, such as ChromeWindow.{outerWidth,outerHeight} have received the new values.
whimboo helpfully suggested that we need to give the main thread time enough to process at least one event before we start querying the updated properties. I then discovered via an experiment that if we delegate the entire window resizing code (the implementation of WebDriver:SetWindowRect) to the main thread, we could see that the callback resolved much later than when we currently return from WebDriver:SetWindowRect. It is not exactly news that we return too early, but the good news from this finding was that the outerWidth/outerHeight properties were correctly reported inside the callback that ran on the main thread. The working theory we have at the moment is that we need to expand the use of IdlePromise to encompass all the synchronisation points checking outerWidth/outerHeight.
Blocks: webdriver
Priority: -- → P2
Blocks: 1471288
Assignee: nobody → ato
Status: NEW → ASSIGNED
Priority: P2 → P1
Depends on: 1493660
Blocks: 1445212
I think we concluded last week that delegation to the main thread helped for _some_ of the window manipulation commands, but that it in itself did not solve all the edge cases. I have since rewritten the way we put a window into fullscreen mode, how we iconify it, and how we synchronously wait for it to be restored. This appears to have solved a few problems with headless mode and generally seems to make a subset of the cases more reliable. It is hard to tell conclusively if these changes will indeed be more reliable in the long run, since changes to this code is notoriously hard to test due to different configurations of systems, window managers, and setups.
Summary: Delegate window manipulation calls to main thread → Make window manipulation more reliable
My latest try runs show that the changes I have made are pretty stable: https://treeherder.mozilla.org/#/jobs?repo=try&revision=cdacb0bcc955db4cdd9a1a4c9bc0a45ce61a5a64&group_state=expanded Except very occasionally the ChromeWindow is reported as minimized on the main thread, but that document.hidden in web content will report false: > 1538052289305 webdriver::server DEBUG -> POST /session/9aea82ec-d5fa-4b3f-9358-bcb00d52716a/window/minimize {} > 1538052289307 Marionette TRACE 0 -> [0,49,"WebDriver:MinimizeWindow",{}] > 1538052289324 Marionette TRACE 0 <- [1,49,null,{"x":0,"y":0,"width":800,"height":600,"state":"minimized"}] > 1538052289328 webdriver::server DEBUG <- 200 OK {"value":{"x":0,"y":0,"width":800,"height":600}} > 1538052289331 webdriver::server DEBUG -> POST /session/9aea82ec-d5fa-4b3f-9358-bcb00d52716a/execute/sync {"args": [], "script": "return document.hidden"} > 1538052289333 Marionette TRACE 0 -> [0,50,"WebDriver:ExecuteScript",{"args":[],"newSandbox":false,"script":"return document.hidden","scriptTimeout":null,"specialPowers":false}] > 1538052289419 Marionette TRACE 0 <- [1,50,null,{"value":false}] > 1538052289420 webdriver::server DEBUG <- 200 OK {"value":false} My educated guess is that DOM properties take some time to propagate from the main process to the content child process, and that the best way to fix this is to add an explicit wait in the test.
It would be good to know how long this delay is until the content process is aware of the hidden state. Maybe we should better have an internal wait for that, so that tests can always trust that the window is minimized when the command returns. Similar to any other command.
We can’t currently wait on document.hidden because of a bug in headless mode. Because it appears to happen so infrequently, I would like to apply the fixes we know work first, file the bug about document.hidden and get this fixed, and then consider using an internal wait for document.hidden in web content.
I had a look at the try build and lots of Wd tests still show a hang in SetWindowRect. Didn't we also wanted to use the timed promise for the changes on that bug? Or did I mix-up bugs?
(In reply to Henrik Skupin (:whimboo) from comment #8) > I had a look at the try build and lots of Wd tests still show a > hang in SetWindowRect. I didn’t request review on these changes yet. I must have made a mistake when splitting the changes into multiple patches. > Didn't we also wanted to use the timed promise for the changes on > that bug? Or did I mix-up bugs? The problematic promises that would previously hanged have been replaced with PollPromise, which eventually will reject after a a timeout.
Depends on: 1496432
Blocks: 1496489
This adds two new synchronisation primitives to the Marionette sync module. MainThreadPromise mimics a Promise and dispatches a callback function to the main thread of the current process which awaits until it is resolved. MainThreadTask is similar, except it is resolved automatically when the callback function returns, along with passing on its return value. Both are built around Services.tm.dispatchToMainThread.
Depends on D7761
Instead of waiting for the visibilitychange event to fire, which does not work in headless mode, we can poll ChromeWindow.windowState which appears to be a more reliable way of telling the window's current state. In addition to this we delegate the polling to the main thread, and don't return from there until the state has changed. This will resolve the problem where Marionette never returns from minimization or restoration due to visibilitychange never firing, and will additionally make it more reliable. Depends on D7762
This delegates moving the window to the main thread, and additionally synchronises the state using an IdlePromise after it has returned to ensure DOM properties are propagated. Depends on D7763
Should the window already be at the target X/Y position, we can skip having to repositon it. Depends on D7764
This adds a new synchronisation primitive to Marionette which will allow us to wait for the last in a sequence of events to fire. This is especially useful for high-frequency events such as the DOM resize event, where multiple resize events may fire as the window dimensions are being changed. Depends on D7765
Depends on D7766
My delegating to the main thread, waiting for the last DOM resize event to fire, and requesting an animation frame from the ChromeWindow, we can ensure the window is (more) synchronously resized. This ensures better reliability when setting a window's dimensions. All this means we can get rid of the heuristics we use for "waiting for a window resize" to occur by checking if the outerWidth/outerHeight has changed. These were obviously bug ridden. Depends on D7767
For the time being we need to poll for document.hidden to become true because certain driver implementations, such as geckodriver, occasionally does not wait until the DOM property is propagated to the child process. Depends on D7768
The "is True" and "is False" style is unnecessary as the return values are boolean, which means "assert is_fullscreen(session)" and "assert not is_fullscreen(session)" is sufficient. Depends on D8396
The stack is joined with "\n" causing an extra carriage return line feed to appear at the end of the string. Depends on D8397
We often use TimedPromise to ensure Marionette does not unexpectedly block on a promise that, for whatever reason, does not resolve. It can however be useful to be alerted when they don't, as it quite often means there is an underlying problem. Depends on D8398
To be able to reuse is_fullscreen() in other tests, we need to uplift it to the support.helpers package. Depends on D8399
On some systems and window managers, such as macOS and when X11 forwarding an application across systems, there exists a 22px window border that we cannot detect or do anything about. As this test is to verify that the width/height changed beyond 800x600, this assertion change should make the tests pass on more configurations. Depends on D7769
Attachment #9014511 - Attachment is obsolete: true
Attachment #9014503 - Attachment is obsolete: true
Attachment #9014504 - Attachment is obsolete: true
Attachment #9014505 - Attachment is obsolete: true
Attachment #9014506 - Attachment is obsolete: true
Attachment #9014507 - Attachment is obsolete: true
Attachment #9014508 - Attachment is obsolete: true
Attachment #9014509 - Attachment is obsolete: true
Attachment #9014510 - Attachment is obsolete: true
Attachment #9016315 - Attachment is obsolete: true
Attachment #9016316 - Attachment is obsolete: true
Attachment #9016317 - Attachment is obsolete: true
Attachment #9016318 - Attachment is obsolete: true
Attachment #9016319 - Attachment is obsolete: true
Attachment #9016320 - Attachment is obsolete: true
Attachment #9016321 - Attachment is obsolete: true
The "is True" and "is False" style is unnecessary as the return values are boolean, which means "assert is_fullscreen(session)" and "assert not is_fullscreen(session)" is sufficient. Depends on D8404
The stack is joined with "\n" causing an extra carriage return line feed to appear at the end of the string. Depends on D8405
We often use TimedPromise to ensure Marionette does not unexpectedly block on a promise that, for whatever reason, does not resolve. It can however be useful to be alerted when they don't, as it quite often means there is an underlying problem. Depends on D8406
To be able to reuse is_fullscreen() in other tests, we need to uplift it to the support.helpers package. Depends on D8407
For the time being we need to poll for document.hidden to become true because certain driver implementations, such as geckodriver, occasionally does not wait until the DOM property is propagated to the child process. Depends on D8408
On some systems and window managers, such as macOS and when X11 forwarding an application across systems, there exists a 22px window border that we cannot detect or do anything about. As this test is to verify that the width/height changed beyond 800x600, this assertion change should make the tests pass on more configurations. Depends on D8409
This adds a new synchronisation primitive to Marionette which will allow us to wait for the last in a sequence of events to fire. This is especially useful for high-frequency events such as the DOM resize event, where multiple resize events may fire as the window dimensions are being changed. Depends on D8411
Instead of waiting for the visibilitychange event to fire, which does not work in headless mode, we can poll ChromeWindow.windowState which appears to be a more reliable way of telling the window's current state. This will resolve the problem where Marionette never returns from restoration due to visibilitychange never firing, and will additionally make it more reliable. Depends on D8412
The visibilitychange DOM event does not fire reliably in all configurations when retrieved from web content. It appears to fire more reliably in chrome, but to ensure a call to WebDriver:MinimizeWindow never hangs, we additionally change it to be a TimedPromise. There is an assumption here that if the iconification process does not complete within this duration, there is nothing we can do. Depends on D8413
Unlike the visibilitychange event, sizemodechange appears to fire in a mostly reliable way. However, in the off-chance that sizemodechange should not fire, we need an escape path so that Marionette returns within a timely fashion. Depends on D8414
win.Maximized does not exist. This should be WindowState.Maximized. Depends on D8415
When the ChromeWindow is already in the desired x/y position, WebDriver:SetWindowRect should act as a no-op. This avoids a superfluous call to ChromeWindow.moveTo. Depends on D8416
My delegating to the main thread, waiting for the last DOM resize event to fire, and requesting an animation frame from the ChromeWindow, we can ensure the window is (more) synchronously resized. This ensures better reliability when setting a window's dimensions. All this means we can get rid of the heuristics we use for "waiting for a window resize" to occur by checking if the outerWidth/outerHeight has changed. These were obviously bug ridden. Depends on D8417
This requests an animation frame off ChromeWindow and waits for the main thread's event queue to become idle before returning from window manipulation commands. This ensures all potential synchronisation code between e.g. the parent process and the child processes have had time to run before we return from WebDriver:{MinimizeWindow,MaximizeWindow,FullscreenWindow}. Depends on D8418
Attachment #9016321 - Attachment is obsolete: false
Attachment #9016320 - Attachment is obsolete: false
Keywords: leave-open
Pushed by atolfsen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/b95d40e31251 marionette: fix GeckoDriver#setWindowRect docs; r=automatedtester https://hg.mozilla.org/integration/autoland/rev/246b45eae45c webdriver: fix is_fullscreen assertions; r=automatedtester https://hg.mozilla.org/integration/autoland/rev/0e4677eff22b marionette: trim crlf off produced stack; r=automatedtester https://hg.mozilla.org/integration/autoland/rev/9ca2c179e63e marionette: warn on TimedPromise bailing; r=automatedtester
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/13804 for changes under testing/web-platform/tests
Pushed by atolfsen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/cbca19a1e3f4 webdriver: uplift is_fullscreen as helper; r=automatedtester https://hg.mozilla.org/integration/autoland/rev/bfe0c0f39dd3 webdriver: poll for document.hidden; r=automatedtester
Lots of changes here. It's great to see those landed! Andreas, can this bug be closed now, or are there more fixes to be landed?
(In reply to Henrik Skupin (:whimboo) from comment #49) > Lots of changes here. It's great to see those landed! Andreas, can this bug > be closed now, or are there more fixes to be landed? There’s still a lot more changes to land. You can see the full list of open revisions here: https://phabricator.services.mozilla.com/D8419
Attachment #9016341 - Attachment description: bug 1492499: marionette: empty event queue before returning from window manipulation; → bug 1492499: marionette: empty event queue on window manipulation;
On certain window manager configurations a window may be resized to fit the full available dimensions of the screen without going into the maximised state. Similarily, a maximised window may be the exact dimensions of the screen. If the window outerWidth/outerHeight is 800x600 and in maximised state, resizing it to 800x600 through WebDriver:SetWindowRect will not work because the window has already reached its requested dimensions. This patch ensures to wait for the resizeEnd event when the window state is not normal. Depends on D8419
The Marionette test_window_maximize.py test is now duplicated in WPT tests. Depends on D10568
Attachment #9021836 - Attachment description: bug 1492499: marionette: deduplicate maximize window tests; → bug 1492499: marionette: deduplicate window manipulation tests;
Attachment #9016319 - Attachment is obsolete: false
Attachment #9016318 - Attachment is obsolete: false
Attachment #9021836 - Attachment description: bug 1492499: marionette: deduplicate window manipulation tests; → bug 1492499: marionette: drop broken window maximization test;
Attachment #9016316 - Attachment is obsolete: false
Attachment #9016317 - Attachment is obsolete: false
Attachment #9016315 - Attachment is obsolete: false
The things that rely on TimedPromise, such as awaiting a sizemodechange event on exiting fullscreen, takes a lot longer to perform in debug builds than in optimised builds. This patch increases the TimedPromise timeout duration by three times in debug builds, which is the same amount WPT uses. Depends on D10569
Pushed by atolfsen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/83ee1c23d31a webdriver: take 22px window border into account on maximizing; r=automatedtester,whimboo https://hg.mozilla.org/integration/autoland/rev/8490a7301d92 webdriver: add stress tests for window manipulation; r=automatedtester https://hg.mozilla.org/integration/autoland/rev/1cceee950d62 marionette: add debounce sync primitive; r=automatedtester https://hg.mozilla.org/integration/autoland/rev/33fd5380d4bc marionette: poll on restoring window; r=whimboo https://hg.mozilla.org/integration/autoland/rev/f1b2efff0c44 marionette: bail if visibilitychange does not fire on minimization; r=whimboo https://hg.mozilla.org/integration/autoland/rev/1d9d159a04b7 marionette: accept sizemodechange may not always fire; r=automatedtester,whimboo https://hg.mozilla.org/integration/autoland/rev/15b378cd5a24 marionette: fix buggy window state comparison; r=whimboo https://hg.mozilla.org/integration/autoland/rev/95700d2eb9c5 marionette: make window positioning idempotent; r=whimboo https://hg.mozilla.org/integration/autoland/rev/2aaa43050c7a marionette: wait for last event on resizing window; r=automatedtester,whimboo https://hg.mozilla.org/integration/autoland/rev/05b349eaba54 marionette: empty event queue on window manipulation; r=automatedtester,whimboo https://hg.mozilla.org/integration/autoland/rev/81ac8f7d11b1 marionette: restore from maximized before setting rect; r=whimboo https://hg.mozilla.org/integration/autoland/rev/a4fcddffde41 marionette: drop broken window maximization test; r=automatedtester https://hg.mozilla.org/integration/autoland/rev/2c1b00a6a97f marionette: increase TimedPromise timeout on debug; r=whimboo
The Maximize Window command's stress test is broken on Windows due to a socket problem on Windows specifically. I think the problem is that the socket buffer is too small because of a very intensive check for the window size not changing. See https://bugzilla.mozilla.org/show_bug.cgi?id=1505806 for more details.
Pushed by atolfsen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c812e3810b94 webdriver: disable maximize_window/stress.py on windows; a=stefan_hindli r=jgraham
Blocks: 1505806
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/13983 for changes under testing/web-platform/tests
Keywords: leave-open
Andreas, a big thank you for working on that bug! It will make use way more stable now for all the window manipulation commands, which cause lesser hangs and timeouts for people using Selenium via geckodriver/Marionette. Also 5 more commands are no longer listed as timeout (complete red) on wpt.fyi!
Attachment #9016315 - Attachment is obsolete: true
Attachment #9016316 - Attachment is obsolete: true
Attachment #9016317 - Attachment is obsolete: true
Attachment #9016318 - Attachment is obsolete: true
Attachment #9016319 - Attachment is obsolete: true
Attachment #9016320 - Attachment is obsolete: true
Attachment #9016321 - Attachment is obsolete: true
Depends on: 1509557
Blocks: 1509557
No longer depends on: 1509557
Blocks: 1512593
No longer depends on: 1520284
Regressions: 1520284
Product: Testing → Remote Protocol
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: