Frequent WD macosx1470-64 debug /webdriver/tests/classic/set_window_rect/set.py | single tracking bug
Categories
(Remote Protocol :: Marionette, defect, P2)
Tracking
(firefox-esr115 unaffected, firefox-esr128 unaffected, firefox135 unaffected, firefox136 unaffected, firefox137 fixed)
| Tracking | Status | |
|---|---|---|
| firefox-esr115 | --- | unaffected |
| firefox-esr128 | --- | unaffected |
| firefox135 | --- | unaffected |
| firefox136 | --- | unaffected |
| firefox137 | --- | fixed |
People
(Reporter: whimboo, Assigned: emilio)
References
(Regression)
Details
(Keywords: intermittent-failure, regression, Whiteboard: [webdriver:m15][webdriver:external])
Attachments
(2 files)
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
pascalc
:
approval-mozilla-beta-
|
Details | Review |
Since bug 581863 landed we can see hangs in WebDriver tests that manipulate the window position and size. Here an example of a try build from earlier today as built from most recent mozilla-central:
https://treeherder.mozilla.org/logviewer?job_id=494342940&repo=try&lineNumber=281343
[task 2025-02-11T11:17:38.859Z] 11:17:38 INFO - PID 4569 | 1739272658857 Marionette DEBUG 0 -> [0,1236,"WebDriver:SetWindowRect",{"height":1080,"width":1920,"x":0,"y":25}]
[task 2025-02-11T11:17:38.860Z] 11:17:38 INFO - PID 4569 | 1739272658857 Marionette TRACE Checking window geometry 800x600 @ (100, 100)
[task 2025-02-11T11:17:38.862Z] 11:17:38 INFO - PID 4569 | 1739272658860 RemoteAgent TRACE Received DOM event MozUpdateWindowPos for [object WindowRoot]
[task 2025-02-11T11:17:38.863Z] 11:17:38 INFO - PID 4569 | 1739272658861 Marionette TRACE Checking window geometry 1920x1055 @ (0, 25)
[task 2025-02-11T11:17:39.285Z] 11:17:39 INFO - PID 4569 | [Parent 4589, Main Thread] ###!!! ASSERTION: mBounds out of sync!: 'mWindow && mBounds == r', file /builds/worker/checkouts/gecko/widget/cocoa/nsCocoaWindow.mm:1892
[task 2025-02-11T11:17:39.298Z] 11:17:39 INFO - PID 4569 | [Parent 4589, Main Thread] ###!!! ASSERTION: mBounds out of sync!: 'mWindow && mBounds == r', file /builds/worker/checkouts/gecko/widget/cocoa/nsCocoaWindow.mm:1892
[task 2025-02-11T11:17:39.299Z] 11:17:39 INFO - PID 4569 | 1739272659298 RemoteAgent TRACE Received DOM event resize for [object Window]
..
[task 2025-02-11T11:29:41.391Z] 11:29:41 INFO - PID 4569 | 1739273381390 RemoteAgent TRACE Received DOM event MozUpdateWindowPos for [object WindowRoot]
[task 2025-02-11T11:29:41.391Z] 11:29:41 INFO - PID 4569 | 1739273381390 Marionette TRACE Checking window geometry 1920x1055 @ (0, 25)
[task 2025-02-11T11:29:41.641Z] 11:29:41 WARNING - Forcibly terminating runner process
[task 2025-02-11T11:29:41.676Z] 11:29:41 INFO - STDERR: Could not terminate process: 4569
[task 2025-02-11T11:29:41.682Z] 11:29:41 ERROR - Failed to stop either the runner or the browser process
The manifest files for the related wdspec tests have multiple statuses included since then. Here an example:
https://searchfox.org/mozilla-central/source/testing/web-platform/meta/webdriver/tests/classic/set_window_rect/set.py.ini
| Reporter | ||
Comment 1•1 year ago
|
||
Oh and we usually had a timeout of 500ms which should have caused the command to return early if the target destination and size cannot be reached:
| Assignee | ||
Comment 2•1 year ago
|
||
Is this on macOS only? If so, it seems likely something about the remote changes is upsetting macOS? Also, is this -debug only?
The timeout seems to still be there unless I'm missing something, it's probably something hanging on the widget side...
| Reporter | ||
Comment 3•1 year ago
|
||
Note that I cannot reproduce the hangs locally on MacOS 15.2. Maybe it's related to the configuration in CI? Also I noticed that the meta file got updated on Monday last week so that for OS X 14.7.1 all tests are marked as [TIMEOUT, OK] now. I've reverted those entries and pushed a try build. I'll re-trigger the specific jobs once it's clear in which chunk the tests are running:
https://treeherder.mozilla.org/jobs?repo=try&revision=a2df83245da3876ea684fb144492d6257d1b9000
| Reporter | ||
Comment 4•1 year ago
|
||
Updated•1 year ago
|
| Reporter | ||
Comment 5•1 year ago
|
||
It looks like there are no more such failures. Lets get the meta data updated.
Comment 7•1 year ago
|
||
| bugherder | ||
| Reporter | ||
Updated•1 year ago
|
| Reporter | ||
Comment 8•1 year ago
|
||
Actually this is still a problem even through on try it didn't fail:
https://treeherder.mozilla.org/logviewer?job_id=496615509&repo=autoland&lineNumber=180883
And yes this is debug only because that assertion is only run for such builds:
https://searchfox.org/mozilla-central/rev/f256e50e068136275c1a2aff827aeddc92a75e07/widget/cocoa/nsCocoaWindow.mm#1889-1893
#ifdef DEBUG
LayoutDeviceIntRect r = nsCocoaUtils::CocoaRectToGeckoRectDevPix(
mWindow.frame, BackingScaleFactor());
NS_ASSERTION(mWindow && mBounds == r, "mBounds out of sync!");
#endif
| Assignee | ||
Comment 9•1 year ago
|
||
Does removing the "retry" code-path on macOS fix it?
diff --git a/remote/shared/WindowManager.sys.mjs b/remote/shared/WindowManager.sys.mjs
index af1da9d25b452..ea9a825d6833e 100644
--- a/remote/shared/WindowManager.sys.mjs
+++ b/remote/shared/WindowManager.sys.mjs
@@ -251,7 +251,7 @@ class WindowManager {
// wait longer if window geometry does not match.
const options = {
checkFn() {
- return geometryMatches(/* retry = */ true);
+ return geometryMatches();
},
timeout: 500,
};
I don't know how many retriggers am I expected to do to confirm that so if you could that'd be great :)
| Reporter | ||
Comment 10•1 year ago
|
||
I've a dirty working tree with changes I cannot commit right now. So if you could push that would be great. But not sure how reliable the results are given that my last try build also didn't fail for quite a few pushes.
Maybe we should not retry when the MozUpdateWindowPos event reports the exact same data as the one before? It probably means we cannot further move the window?
| Assignee | ||
Comment 11•1 year ago
|
||
Here's a patch: https://treeherder.mozilla.org/jobs?repo=try&revision=a4dd6b96694b55a3011cec9e2c65075a25feaf9f
But no, I think there's a cocoa widget bug, but without reproing it is a bit hard for me to justify sinking a lot of time into it :(
| Reporter | ||
Comment 12•1 year ago
|
||
Emilio, so the trick here to get it reproduced seem to be a full build. While an artifact build on try didn't reproduce the failure I can see it here multiple times:
https://treeherder.mozilla.org/jobs?repo=try&revision=1ee7423804f22531e6094c4083653d61b7d9533f
You pushed your try build for an artifact build, so maybe you try it locally with a full build as well first, and then push again to try?
Given that all the tests are passing in your try build also for other platforms do we really need the retry logic?
| Reporter | ||
Comment 13•1 year ago
|
||
To allow sheriffs to better classify the failures we need to update the bug's summary.
Updated•1 year ago
|
| Assignee | ||
Updated•1 year ago
|
| Assignee | ||
Comment 14•1 year ago
|
||
It seems to confuse macOS, doesn't seem needed anymore on Linux after
bug 1946184, where we started to look at the X server position rather
than the last configure position.
Updated•1 year ago
|
Comment 15•1 year ago
|
||
Comment 16•1 year ago
|
||
| bugherder | ||
| Comment hidden (Intermittent Failures Robot) |
| Assignee | ||
Comment 18•1 year ago
|
||
Comment on attachment 9469207 [details]
Bug 1947453 - Remove retry logic from geometryMatches. r=whimboo
Beta/Release Uplift Approval Request
- User impact if declined/Reason for urgency: Reduces failure rate of bug 1785367
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: none
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): simple tweak
- String changes made/needed: none
- Is Android affected?: No
Comment 19•1 year ago
|
||
Comment on attachment 9469207 [details]
Bug 1947453 - Remove retry logic from geometryMatches. r=whimboo
The patch landed during the 137 nightly cycle so it's already in beta.
| Reporter | ||
Comment 20•1 year ago
|
||
This is strange. I've checked some of these failures and it turns out that those are for opt builds and we run into a timeout already after 13s running the test. Isn't the default timeout 25s for wdspec? Here an example:
https://treeherder.mozilla.org/logviewer?job_id=498600907&repo=mozilla-beta&lineNumber=163036
[task 2025-03-11T17:03:44.727Z] 17:03:44 INFO - TEST-START | /webdriver/tests/classic/set_window_rect/set.py
[task 2025-03-11T17:03:44.852Z] 17:03:44 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/classic/set_window_rect/set.py::test_null_parameter_value
[task 2025-03-11T17:03:48.997Z] 17:03:48 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/classic/set_window_rect/set.py::test_no_top_browsing_context
[task 2025-03-11T17:03:49.500Z] 17:03:49 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/classic/set_window_rect/set.py::test_no_browsing_context
[..]
[task 2025-03-11T17:03:56.489Z] 17:03:56 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/classic/set_window_rect/set.py::test_y_as_current
[task 2025-03-11T17:03:56.518Z] 17:03:56 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/classic/set_window_rect/set.py::test_negative_x_y
[task 2025-03-11T17:03:57.622Z] 17:03:57 INFO - TEST-UNEXPECTED-OK | /webdriver/tests/classic/set_window_rect/set.py | expected TIMEOUT
Maybe we need to file a new bug for this particular issue in case we do not have an immediate answer to that. Sasha, maybe something changed in wpt?
| Reporter | ||
Comment 21•1 year ago
|
||
Oh wait. The problem here is not that we timeout, but that we do NOT timeout. Looks like the following entry in the manifest file is causing it:
https://searchfox.org/mozilla-beta/source/testing/web-platform/meta/webdriver/tests/classic/set_window_rect/set.py.ini#5
| Reporter | ||
Comment 22•1 year ago
|
||
On mozilla-central we have [OK, TIMEOUT] but shouldn't we no longer expect timeout since the fix landed?
https://searchfox.org/mozilla-central/source/testing/web-platform/meta/webdriver/tests/classic/set_window_rect/set.py.ini#5
Comment 23•1 year ago
|
||
I've pushed to try: https://treeherder.mozilla.org/jobs?repo=try&revision=95eaab54270cc9d4db6d2056d9914a1c61ce5ba5, to see if we can try to remove the metadata.
| Comment hidden (Intermittent Failures Robot) |
| Reporter | ||
Updated•1 year ago
|
Description
•