Closed Bug 1673851 Opened 4 years ago Closed 4 years ago

Intermittent Fis /webdriver/tests/new_window/<test-name>| test_sets_no_window_name - AssertionError: assert None == ''

Categories

(Remote Protocol :: Marionette, defect, P5)

Default
defect

Tracking

(Fission Milestone:M7, firefox-esr78 unaffected, firefox82 unaffected, firefox83 unaffected, firefox84 fixed)

RESOLVED FIXED
84 Branch
Fission Milestone M7
Tracking Status
firefox-esr78 --- unaffected
firefox82 --- unaffected
firefox83 --- unaffected
firefox84 --- fixed

People

(Reporter: whimboo, Assigned: jdescottes)

References

(Regression, )

Details

(Keywords: intermittent-failure, regression)

No description provided.

Here the failure details:

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=320009541&repo=autoland&lineNumber=3704-3715

[task 2020-10-28T12:06:10.341Z] 12:06:10     INFO - PID 912 | 1603886770330	Marionette	DEBUG	0 -> [0,404,"WebDriver:NewWindow",{"type":"window"}]
[task 2020-10-28T12:06:10.342Z] 12:06:10     INFO - PID 912 | 1603886770334	Marionette	TRACE	Received observer notification toplevel-window-ready
[task 2020-10-28T12:06:10.342Z] 12:06:10     INFO - PID 912 | 1603886770334	Marionette	TRACE	Received observer notification toplevel-window-ready
[...]
[task 2020-10-28T12:06:10.363Z] 12:06:10     INFO - PID 912 | 1603886770335	Marionette	TRACE	Received observer notification toplevel-window-ready
[task 2020-10-28T12:06:10.363Z] 12:06:10     INFO - PID 912 | 1603886770335	Marionette	TRACE	Received observer notification toplevel-window-ready
[task 2020-10-28T12:06:10.364Z] 12:06:10     INFO - PID 912 | 1603886770335	Marionette	TRACE	Received observer notification toplevel-window-ready
[task 2020-10-28T12:06:10.365Z] 12:06:10     INFO - PID 912 | 1603886770335	Marionette	TRACE	Received observer notification toplevel-window-ready
[task 2020-10-28T12:06:10.365Z] 12:06:10     INFO - PID 912 | 1603886770335	Marionette	TRACE	Received observer notification toplevel-window-ready
[task 2020-10-28T12:06:10.366Z] 12:06:10     INFO - PID 912 | 1603886770335	Marionette	TRACE	Received observer notification toplevel-window-ready
[task 2020-10-28T12:06:10.367Z] 12:06:10     INFO - PID 912 | 1603886770338	Marionette	TRACE	Received DOM event activate for [object Window]
[task 2020-10-28T12:06:10.368Z] 12:06:10     INFO - PID 912 | 1603886770339	Marionette	TRACE	Received DOM event focus for [object HTMLDocument]
[task 2020-10-28T12:06:10.505Z] 12:06:10     INFO - PID 912 | DEBUG: Adding blocker ContentParent: id=7f0d60ed5000 for phase xpcom-will-shutdown
[task 2020-10-28T12:06:10.505Z] 12:06:10     INFO - PID 912 | DEBUG: Adding blocker ContentParent: id=7f0d60ed5000 for phase profile-before-change
[task 2020-10-28T12:06:10.507Z] 12:06:10     INFO - PID 912 | 1603886770501	Marionette	TRACE	Received observer notification browser-delayed-startup-finished
[task 2020-10-28T12:06:10.528Z] 12:06:10     INFO - PID 912 | 1603886770522	Marionette	TRACE	Received DOM event activate for [object Window]
[task 2020-10-28T12:06:10.529Z] 12:06:10     INFO - PID 912 | 1603886770522	Marionette	TRACE	Received DOM event focus for [object HTMLDocument]
[task 2020-10-28T12:06:10.531Z] 12:06:10     INFO - PID 912 | 1603886770524	Marionette	DEBUG	0 <- [1,404,null,{"handle":"70","type":"window"}]
[task 2020-10-28T12:06:10.532Z] 12:06:10     INFO - PID 912 | 1603886770524	webdriver::server	DEBUG	<- 200 OK {"value":{"handle":"70","type":"window"}}
[task 2020-10-28T12:06:10.533Z] 12:06:10     INFO - PID 912 | 1603886770526	webdriver::server	DEBUG	-> POST /session/2e8b1f62-02a1-4c43-8473-eb0a829735bc/window {"handle": "70"}
[task 2020-10-28T12:06:10.534Z] 12:06:10     INFO - PID 912 | 1603886770527	Marionette	DEBUG	0 -> [0,405,"WebDriver:SwitchToWindow",{"handle":"70","name":"70"}]
[task 2020-10-28T12:06:10.570Z] 12:06:10     INFO - PID 912 | 1603886770564	Marionette	TRACE	[70] Child actor created for window id 71
[task 2020-10-28T12:06:10.578Z] 12:06:10     INFO - PID 912 | 1603886770569	Marionette	TRACE	[70] Parent actor created
[task 2020-10-28T12:06:10.593Z] 12:06:10     INFO - PID 912 | 1603886770583	Marionette	TRACE	[70] Frame script loaded
[task 2020-10-28T12:06:10.601Z] 12:06:10     INFO - PID 912 | 1603886770595	Marionette	DEBUG	0 <- [1,405,null,{"value":null}]
[task 2020-10-28T12:06:10.601Z] 12:06:10     INFO - PID 912 | 1603886770596	webdriver::server	DEBUG	<- 200 OK {"value":null}
[task 2020-10-28T12:06:10.603Z] 12:06:10     INFO - PID 912 | 1603886770598	webdriver::server	DEBUG	-> POST /session/2e8b1f62-02a1-4c43-8473-eb0a829735bc/execute/sync {"args": [], "script": "\n        return window.name;\n        "}
[task 2020-10-28T12:06:10.605Z] 12:06:10     INFO - PID 912 | 1603886770598	Marionette	DEBUG	0 -> [0,406,"WebDriver:ExecuteScript",{"args":[],"script":"\n        return window.name;\n        "}]
[task 2020-10-28T12:06:10.605Z] 12:06:10     INFO - PID 912 | 1603886770601	Marionette	DEBUG	0 <- [1,406,null,{"value":null}]
[task 2020-10-28T12:06:10.610Z] 12:06:10     INFO - PID 912 | 1603886770604	webdriver::server	DEBUG	<- 200 OK {"value":null}
[task 2020-10-28T12:06:10.612Z] 12:06:10     INFO - PID 912 | 1603886770608	Marionette	TRACE	[70] Child actor created for window id 40802189313
[task 2020-10-28T12:06:10.615Z] 12:06:10     INFO - PID 912 | 1603886770612	Marionette	TRACE	[70] Parent actor created
Blocks: 1669169
Keywords: regression
Regressed by: 1660168
Whiteboard: [marionette-fission-mvp]

Not sure why we are getting that many toplevel-window-ready notifications, and if those are related to the failure. Maybe we miss to unregister from an observer.

But trying to switch to the window with the id 70 a parent actor for a browsing context with id 71 is getting created.

Summary: Intermittent Fis TEST-UNEXPECTED-FAIL | /webdriver/tests/new_window/new_window.py | test_sets_no_window_name - AssertionError: assert None == '' → Intermittent Fis /webdriver/tests/new_window/new_window.py | test_sets_no_window_name - AssertionError: assert None == ''
Assignee: nobody → jdescottes
Status: NEW → ASSIGNED
Priority: P5 → P1

I suspect that we get None here instead of an empty string because the executeScript command which is supposed to provide the window's name fails on an AbortError and the proxy will swallow the error and return null (which will be turned to None in python).

I tried to modify the proxy to forward AbortError(s) thrown when calling a no-retry command such as execute script.
Try is at https://treeherder.mozilla.org/#/jobs?repo=try&revision=e238dddfaceb0564d999817acade784ce8cd557a
This change triggers a lot of intermittents, but the point is to confirm if the theory above is correct.

I didn't manage to get the exact same error yet, but I get a very similar one: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=320059289&repo=try&lineNumber=4050

The test is

def test_sets_no_opener(session):
    response = new_window(session, type_hint="window")
    value = assert_success(response)
    assert value["type"] == "window"

    session.window_handle = value["handle"]
    assert opener(session) is None

opener(session) relies on executeScript:

def opener(session):
    return session.execute_script("""
        return window.opener;
        """)

Which here fails with an AbortError.

Tracking marionette-fission-mvp bugs for Fission Beta milestone (M7).

Fission Milestone: --- → M7
Summary: Intermittent Fis /webdriver/tests/new_window/new_window.py | test_sets_no_window_name - AssertionError: assert None == '' → Intermittent Fis /webdriver/tests/new_window/<test-name>| test_sets_no_window_name - AssertionError: assert None == ''

Not making a lot of progress for now. I haven't been able to reproduce this locally, and on Try it seems to happen only on Linux.

From the investigations mentioned in comment 3, I still believe this is related to AbortError that occur during an executeScript command. So far I've tried a few approaches (that I initially planned to use for https://bugzilla.mozilla.org/show_bug.cgi?id=1673478).

It's a bit difficult to make progress here because for now I can't reproduce it locally, so I'm not exactly sure what triggers the destruction of the initial JSWindowActor pair used for the executeScript command. Normally the tests properly wait for the new windows to be ready. I will try to add more logs to understand better what happens here.

Would you mind adding some delays in between the steps of the test? It would be good to know if that is a race or not. By removing timeouts step by step we could figure out where exactly the code is racy.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #8)

Would you mind adding some delays in between the steps of the test? It would be good to know if that is a race or not. By removing timeouts step by step we could figure out where exactly the code is racy.

Here is a try push with a 1 second delay added at the end of each new_window helper under testing/web-platform/tests/webdriver/tests/new_window/

https://treeherder.mozilla.org/jobs?repo=try&revision=8cb72695faa2884bf738c842742d0e59b86fa218&selectedTaskRun=LpkEREEnT8KolVPN4z41UQ.1

So far, no failure over 15 retriggers of the W-Fis Wd1 job on Linux 18.04 x64 WebRender opt. Started more retriggers right now.

That's good to hear. So do we know if that only happens for type tabs, windows, or both?

As it looks like the problem appears for both tabs and windows. So my assumption is that this happens because when opening one of these we wait until some window / tab specific events have happened, but not until the contained page (in case of a browser window) has been finished loading.

Julian, to reproduce the failure maybe force loading of a specific website instead of about:blank as what we do by default.

I wonder if the WebDriver spec around New Window should be updated so that the command actually waits for the contained page to be loaded. As we know now loading about:blank is a beast... James, what do you think?

Flags: needinfo?(james)

Yes, I think the spec assumes that the about:blank load is synchronous, which at least in olden times is nearly true for tabs and might be less true for windows (c.f. [1]). Certainly we shouldn't return until about:blank has finished loading, but whether or not the spec needs to say anything here depends on whether the initial about:blank loading is now specified in HTML or not.

[1] https://hsivonen.fi/about-blank/

Flags: needinfo?(james)
Depends on: 1675320

Nothing we can fix in geckodriver. I filed bug 1675320 for Marionette to get it implemented.

Assignee: jdescottes → nobody
Status: ASSIGNED → NEW
Priority: P1 → P5
No longer blocks: 1669169

Note that for non-Fission builds this failure seems to be exclusive for ASAN builds:
https://treeherder.mozilla.org/jobs?repo=try&revision=7b5f6e645275bb7cc1cd6c369621c1678ac43d2f&searchStr=wd

Blocks: 1675248
No longer blocks: 1675248

Now that Bug 1675320 has landed, I expect this intermittent to be resolved. Let's wait for a few days to see if we get new occurrences.

No more failures since bug 1675320 has been fixed. We can close it.

Assignee: nobody → jdescottes
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Whiteboard: [marionette-fission-mvp]
Target Milestone: --- → 84 Branch
See Also: → 1749106
See Also: → 1810987
Moving bug to Remote Protocol::Marionette component per bug 1815831.
Component: geckodriver → Marionette
Product: Testing → Remote Protocol
You need to log in before you can comment on or make changes to this bug.