Closed Bug 1364319 Opened 5 years ago Closed 5 years ago

Intermittent /webdriver/contexts.py | contexts.py::test_window_resize

Categories

(Testing :: web-platform-tests, defect, P1)

Version 3
x86_64
Linux
defect

Tracking

(firefox56 fixed)

RESOLVED FIXED
mozilla56
Tracking Status
firefox56 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: ato)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell fixed:other])

Attachments

(2 files)

:jgraham, this looks to be a recent failure starting on may 12th and becoming more frequent.  Specifically on linux64-opt wdspec (both e10s and non-e10s), can you look into this intermittent sometime in the next 2 weeks?
Flags: needinfo?(james)
Whiteboard: [stockwell needswork]
Flags: needinfo?(james) → needinfo?(ato)
I’m just acknowledging that I’ve seen this and that I’ll look at this in more detail hopefully tomorrow.
Flags: needinfo?(ato)
Did not mean to reset my needinfo.
Flags: needinfo?(ato)
:ato, can you look into this next week?  Otherwise I will be inclined to disable this test temporarily to clean up our intermittent failures until you have more time to look at this.
(In reply to Joel Maher ( :jmaher) from comment #8)

> :ato, can you look into this next week?  Otherwise I will be inclined
> to disable this test temporarily to clean up our intermittent failures
> until you have more time to look at this.

Sorry for the delayed response here.  I was waiting for a few crucial
patches to land in-tree so it would be easier to investigate this.  They
have now landed, and I have no trouble reproducing the instability of
this test locally.

The source of the issue is that we’re returning too soon from
GeckoDriver#setWindowRect after calling ChromeWindow.resizeTo(width,
height).  Unfortunately the DOM resize event is not synchronous and we
can’t rely on that alone to make the Marionette command blocking.

In the past I had problems using ChromeWindow.requestAnimationFrame,
but my current investigation shows it is now working and in fact more
reliable that throttling by 15 fps and setTimeout which we do currently.
Assignee: nobody → ato
Status: NEW → ASSIGNED
Flags: needinfo?(ato)
Comment on attachment 8876713 [details]
Bug 1364319 - Make setWindowRect deterministic;

https://reviewboard.mozilla.org/r/148042/#review152942
Attachment #8876713 - Flags: review?(mjzffr) → review+
Pushed by atolfsen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7492cd4b4cb3
Throttle setWindowRect with requestAnmationFrame; r=maja_zf
(In reply to Wes Kocher (:KWierso) from comment #16)
> I had to back this out for failures like
> https://treeherder.mozilla.org/logviewer.html#?job_id=106844272&repo=autoland
> 
> https://hg.mozilla.org/integration/autoland/rev/c5d73e83adc5

I’m not able to reproduce this, and I suspect that this is an unrelated
intermittent.  I’m going to push this again to see if I’m in better luck
this time.
Flags: needinfo?(ato)
Pushed by atolfsen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5afb0e2ec86c
Throttle setWindowRect with requestAnmationFrame; r=maja_zf
Backed out for failing Marionette's test_window_rect.py TestSize.test_resize_while_fullscreen on Linux opt:

https://hg.mozilla.org/integration/autoland/rev/d3c353a9fdd21c9a8dc83edea233a343ef54f8ab

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=5afb0e2ec86cc9ad92f332122ed8a390295a56ed&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=107116182&repo=autoland

[task 2017-06-14T19:02:57.074860Z] 19:02:57     INFO - TEST-START | test_window_rect.py TestSize.test_resize_while_fullscreen
[task 2017-06-14T19:02:57.326426Z] 19:02:57     INFO - TEST-UNEXPECTED-FAIL | test_window_rect.py TestSize.test_resize_while_fullscreen | AssertionError: 1280 != 1500
[task 2017-06-14T19:02:57.327323Z] 19:02:57     INFO - Traceback (most recent call last):
[task 2017-06-14T19:02:57.327414Z] 19:02:57     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_harness/marionette_test/testcases.py", line 157, in run
[task 2017-06-14T19:02:57.327486Z] 19:02:57     INFO -     testMethod()
[task 2017-06-14T19:02:57.327581Z] 19:02:57     INFO -   File "/home/worker/workspace/build/tests/marionette/tests/testing/marionette/harness/marionette_harness/tests/unit/test_window_rect.py", line 204, in test_resize_while_fullscreen
[task 2017-06-14T19:02:57.327667Z] 19:02:57     INFO -     self.assertEqual(result["width"], self.max["width"] - 100)
[task 2017-06-14T19:02:57.327712Z] 19:02:57     INFO - TEST-INFO took 235ms
Flags: needinfo?(ato)
I may wait until https://bugzil.la/1358978 lands before investigating this further.
Flags: needinfo?(ato)
Thanks to --run-until-failure I was able to fix the second fallout,
which was due to the following code segment:

>   if (win.windowState == win.STATE_FULLSCREEN) {
>     yield new Promise(resolve => {
>       win.addEventListener("sizemodechange", resolve, {once: true});
>       win.fullScreen = false;
>     });
>   }

After moving out of fullscreen before resizing the window, the window
doesn’t have time to fully resize to its original size before we
try to resize it again.  This is solved by requesting an additional
animation frame as part of resolving the promise when the sizemodechange
event fires.

This makes the final patch look like this:

> diff --git a/testing/marionette/driver.js b/testing/marionette/driver.js
> index 88c5cd47581b..5111122b1a97 100644
> --- a/testing/marionette/driver.js
> +++ b/testing/marionette/driver.js
> @@ -1306,10 +1306,11 @@ GeckoDriver.prototype.setWindowRect = function* (cmd, resp) {
>    assert.noUserPrompt(this.dialog);
>  
>    let {x, y, width, height} = cmd.parameters;
> +  let optimisedResize = resolve => () => win.requestAnimationFrame(resolve);
>  
>    if (win.windowState == win.STATE_FULLSCREEN) {
>      yield new Promise(resolve => {
> -      win.addEventListener("sizemodechange", resolve, {once: true});
> +      win.addEventListener("sizemodechange", optimisedResize(resolve), {once: true});
>        win.fullScreen = false;
>      });
>    }
> @@ -1320,8 +1321,8 @@ GeckoDriver.prototype.setWindowRect = function* (cmd, resp) {
>  
>      if (win.outerWidth != width || win.outerHeight != height) {
>        yield new Promise(resolve => {
> -        const optimisedResize = () => win.requestAnimationFrame(resolve);
> -        win.addEventListener("resize", optimisedResize, {once: true});
> +        //const optimisedResize = () => win.requestAnimationFrame(resolve);
> +        win.addEventListener("resize", optimisedResize(resolve), {once: true});
>          win.resizeTo(width, height);
>        });
>      }
Pushed by atolfsen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b698ac3feeaf
Throttle setWindowRect with requestAnmationFrame; r=maja_zf
https://hg.mozilla.org/mozilla-central/rev/b698ac3feeaf
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on
intermittent or backout) from comment #26)

> Still active:
> https://treeherder.mozilla.org/logviewer.html#?job_id=107893373&repo=autoland

I found something highly suspicious in the log:

> 1497796541664  webdriver::server       DEBUG   -> POST /session/a188a7e0-48ea-4f01-a18c-12f1f4b88626/window/rect {"width": 500, "height": 600}
> 1497796541664  geckodriver::marionette TRACE   -> 67:[0,16,"setWindowRect",{"height":600,"width":500,"x":null,"y":null}]
> 1497796541666  Marionette      TRACE   0 -> [0,16,"setWindowRect",{"height":600,"width":500,"x":null,"y":null}]
> 1497796541714  Marionette      TRACE   0 <- [1,16,null,{"x":4,"y":8,"width":500,"height":600}]
> 1497796541715  geckodriver::marionette TRACE   <- [1,16,null,{"x":4,"y":8,"width":500,"height":600}]
> 1497796541715  webdriver::server       DEBUG   <- 200 OK {"value": {"x":4.0,"y":8.0,"width":500.0,"height":600.0}}
> 1497796541716  webdriver::server       DEBUG   -> GET /session/a188a7e0-48ea-4f01-a18c-12f1f4b88626/window/rect
> 1497796541716  geckodriver::marionette TRACE   -> 25:[0,17,"getWindowRect",{}]
> 1497796541730  Marionette      TRACE   0 -> [0,17,"getWindowRect",{}]
> 1497796541756  Marionette      TRACE   0 <- [1,17,null,{"x":4,"y":8,"width":400,"height":500}]
> 1497796541756  geckodriver::marionette TRACE   <- [1,17,null,{"x":4,"y":8,"width":400,"height":500}]
> 1497796541756  webdriver::server       DEBUG   <- 200 OK {"value": {"x":4,"y":8,"width":400,"height":500}}

It would appear that the return value from _setting_ the window rect
to (500,600) returns the correct (500.0/600.0) value, but that the
subsequent getWindowRect command returns (400,500), as if it hasn’t
changed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
with >25% failure rate and no action on almost a week, we need to disable this until there is time to fix it properly.
Attachment #8879895 - Flags: review?(gbrown)
here is a try push showing the fix:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=79aa90eb090d7a056cb7544657dbb00262df6488

you can see Wd failing on debug (that is different issues, and looks to be perma fail).  This is fixing it for opt.
Comment on attachment 8879895 [details] [diff] [review]
temporarily disable test on linux

I’ve spent a considerable amount of time trying to decipher what the
problem is.  I think I have nailed it down to two distinct issues.

The main issue in providing a blocking API for resizing the window is
obviously that the DOM resize event fires at a high rate and needs
to be throttled.  We are able to throttle this successfully with
requestAnimationFrame and/or a hard-coded 60 fps setTimeout delay.
To the naked eye, this appears to synchronise window resizing before
returning a response to the client.

The second, and more worrying issue, is that said response occasionally
returns the wrong window size.  window.outerWidth and window.outerHeight
does not appear to be deterministic as DOM properties are not
synchronously populated.

The third issue, and perhaps even more troubling, is that calls to
ChromeWindow.{outerWidth,outerHeight} sometimes returns inconsistent
values.  In the log I showed in comment #27, you will see that the
returned window size from the setWindowRect command is correct, but
that the size returned from the subsequent getWindowRect command is
different.  You can tell from the timestamps that this happens in
sequence and reproduces quite frequently locally.

The command I have used to run these tests, and which has been quite
helpful, is:

> % ./mach wpt --webdriver-arg='-vv' testing/web-platform/tests/webdriver/contexts.py --repeat-until-unexpected

By using a combination of Services.tm.mainThread.idleDispatch and a
blocking promise on not returning until the window size has changed,
I’m able to reduce the intermittent rate significantly (by over an
order of magnitude).  As one would expect, delaying computation allows
DOM property values to populate, and as such this does not address the
underlying problem or make it inconceivable that a race could appear,
but it seems to be more reliable than the current approach.
Flags: needinfo?(ato)
Attachment #8879895 - Flags: review?(gbrown)
Blocks: webdriver
OS: Unspecified → Linux
Priority: -- → P1
Hardware: Unspecified → x86_64
Comment on attachment 8876713 [details]
Bug 1364319 - Make setWindowRect deterministic;

https://reviewboard.mozilla.org/r/148042/#review156318
Attachment #8876713 - Flags: review?(dburns) → review+
Pushed by atolfsen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5dc0f2cd4bc7
Make setWindowRect deterministic; r=automatedtester,maja_zf
Duplicate of this bug: 1373879
https://hg.mozilla.org/mozilla-central/rev/5dc0f2cd4bc7
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Whiteboard: [stockwell needswork] → [stockwell fixed:other]
You need to log in before you can comment on or make changes to this bug.