Open Bug 1792688 Opened 2 years ago Updated 10 months ago

[wdspec] WebDriver BiDi client hangs when asyncio.wait_for cancels an underlying command

Categories

(Remote Protocol :: WebDriver BiDi, defect, P3)

Firefox 107
defect

Tracking

(Not tracked)

People

(Reporter: whimboo, Unassigned)

References

Details

(Whiteboard: [webdriver:backlog])

With my upcoming patch on bug 1789659 to make the browsingContext.close command async and to wait for the tab to really be closed the following perma failure exists:

https://treeherder.mozilla.org/logviewer?job_id=391658675&repo=try&lineNumber=19713-19751

[task 2022-09-28T08:33:53.470Z] 08:33:53     INFO - PID 1421 | 1664354033467	RemoteAgent	TRACE	WebDriverBiDiConnection f5d1c235-db0c-4516-bbdb-f1bf960db1ed -> {"id":2,"method":"browsingContext.navigate","params":{"context":"498ea6d0-9efa-4cb5-95f9-7d6c1de287ae","url":"http://web-platform.test:8000/webdriver/tests/support/inline.py?doc=%3C%21doctype+html%3E%0A%3Cmeta+charset%3DUTF-8%3E%0A%3Ci ... sts%2Fbidi%2Fbrowsing_context%2Fnavigate%2Fsupport%2Fempty.svg%3Fpipe%3Dtrickle%28d10%29%27%3E&mime=text%2Fhtml&charset=UTF-8","wait":"complete"}}
[task 2022-09-28T08:33:53.471Z] 08:33:53     INFO - PID 1421 | 1664354033468	RemoteAgent	TRACE	Received command browsingContext.navigate for destination ROOT
[task 2022-09-28T08:33:53.472Z] 08:33:53     INFO - PID 1421 | 1664354033469	RemoteAgent	TRACE	Received command browsingContext._getBaseURL for destination WINDOW_GLOBAL
[task 2022-09-28T08:33:53.473Z] 08:33:53     INFO - PID 1421 | 1664354033472	RemoteAgent	TRACE	Module windowglobal-in-root/browsingContext.jsm found for WINDOW_GLOBAL
[task 2022-09-28T08:33:53.479Z] 08:33:53     INFO - PID 1421 | 1664354033478	RemoteAgent	TRACE	Created MessageHandler WINDOW_GLOBAL for session 77d3703c-c928-4511-9b82-1e3ec51dc70d
[task 2022-09-28T08:33:53.480Z] 08:33:53     INFO - PID 1421 | 1664354033479	RemoteAgent	TRACE	Received command browsingContext._getBaseURL for destination WINDOW_GLOBAL
[task 2022-09-28T08:33:53.485Z] 08:33:53     INFO - PID 1421 | 1664354033484	RemoteAgent	TRACE	Module windowglobal/browsingContext.jsm found for WINDOW_GLOBAL
[task 2022-09-28T08:33:53.501Z] 08:33:53     INFO - PID 1421 | 1664354033499	RemoteAgent	TRACE	[11] ProgressListener state=start: http://web-platform.test:8000/webdriver/tests/support/inline.py?doc=%3C%21doctype+html%3E%0A%3Cmeta+charset%3DUTF-8%3E%0A%3Ci ... sts%2Fbidi%2Fbrowsing_context%2Fnavigate%2Fsupport%2Fempty.svg%3Fpipe%3Dtrickle%28d10%29%27%3E&mime=text%2Fhtml&charset=UTF-8
[task 2022-09-28T08:33:54.432Z] 08:33:54     INFO - PID 1421 | ### XPCOM_MEM_BLOAT_LOG defined -- logging bloat/leaks to /tmp/tmpqa75c4n8.mozrunner/runtests_leaks_915_tab_pid8292.log
[task 2022-09-28T08:33:54.433Z] 08:33:54     INFO - PID 1421 | [8292, Main Thread] WARNING: XPCOM_MEM_BLOAT_LOG is set, disabling native allocations.: file /builds/worker/checkouts/gecko/tools/profiler/core/platform.cpp:339
[task 2022-09-28T08:33:54.452Z] 08:33:54     INFO - PID 1421 | ### XPCOM_MEM_BLOAT_LOG defined -- logging bloat/leaks to /tmp/tmpqa75c4n8.mozrunner/runtests_leaks_915_tab_pid8296.log
[task 2022-09-28T08:33:54.453Z] 08:33:54     INFO - PID 1421 | [8296, Main Thread] WARNING: XPCOM_MEM_BLOAT_LOG is set, disabling native allocations.: file /builds/worker/checkouts/gecko/tools/profiler/core/platform.cpp:339
[task 2022-09-28T08:33:54.473Z] 08:33:54     INFO - STDOUT: *** after wait for navigation ***
[task 2022-09-28T08:33:54.475Z] 08:33:54     INFO - PID 1421 | 1664354034473	RemoteAgent	TRACE	WebDriverBiDiConnection f5d1c235-db0c-4516-bbdb-f1bf960db1ed -> {"id":3,"method":"browsingContext.getTree","params":{"maxDepth":0,"root":"498ea6d0-9efa-4cb5-95f9-7d6c1de287ae"}}
[task 2022-09-28T08:33:54.476Z] 08:33:54     INFO - PID 1421 | 1664354034474	RemoteAgent	TRACE	Received command browsingContext.getTree for destination ROOT
[task 2022-09-28T08:33:54.478Z] 08:33:54     INFO - PID 1421 | 1664354034476	RemoteAgent	TRACE	WebDriverBiDiConnection f5d1c235-db0c-4516-bbdb-f1bf960db1ed <- {"id":3,"result":{"contexts":[{"context":"498ea6d0-9efa-4cb5-95f9-7d6c1de287ae","url":"http://web-platform.test:8000/webdriver/tests/support/inline.py?doc=%3C%21doctype+html%3E%0A%3Cmeta+charset%3DUTF-8%3E%0A%3Ci ... sts%2Fbidi%2Fbrowsing_context%2Fnavigate%2Fsupport%2Fempty.svg%3Fpipe%3Dtrickle%28d10%29%27%3E&mime=text%2Fhtml&charset=UTF-8","children":null,"parent":null}]}}
[task 2022-09-28T08:33:54.484Z] 08:33:54     INFO - PID 1421 | [Child 8296, Main Thread] WARNING: could not set real-time limit in CubebUtils::InitLibrary: file /builds/worker/checkouts/gecko/dom/media/CubebUtils.cpp:720
[task 2022-09-28T08:33:54.485Z] 08:33:54     INFO - PID 1421 | [Child 8292, Main Thread] WARNING: could not set real-time limit in CubebUtils::InitLibrary: file /builds/worker/checkouts/gecko/dom/media/CubebUtils.cpp:720
[task 2022-09-28T08:33:54.486Z] 08:33:54     INFO - PID 1421 | 1664354034480	RemoteAgent	TRACE	WebDriverBiDiConnection f5d1c235-db0c-4516-bbdb-f1bf960db1ed -> {"id":4,"method":"browsingContext.close","params":{"context":"498ea6d0-9efa-4cb5-95f9-7d6c1de287ae"}}
[task 2022-09-28T08:33:54.491Z] 08:33:54     INFO - PID 1421 | 1664354034481	RemoteAgent	TRACE	Received command browsingContext.close for destination ROOT
[task 2022-09-28T08:33:54.504Z] 08:33:54     INFO - PID 1421 | 1664354034503	RemoteAgent	TRACE	[12] ProgressListener state=stop: error=0x80004004 (NS_ERROR_ABORT)
[task 2022-09-28T08:33:54.513Z] 08:33:54     INFO - PID 1421 | 1664354034504	RemoteAgent	TRACE	MessageHandler WINDOW_GLOBAL for session 77d3703c-c928-4511-9b82-1e3ec51dc70d is being destroyed
[task 2022-09-28T08:33:54.517Z] 08:33:54     INFO - PID 1421 | 1664354034505	RemoteAgent	TRACE	Unregistered MessageHandler WINDOW_GLOBAL for session 77d3703c-c928-4511-9b82-1e3ec51dc70d
[task 2022-09-28T08:33:54.521Z] 08:33:54     INFO - PID 1421 | [Child 8176, Main Thread] WARNING: NS_ENSURE_TRUE(GetOwner() && !mCanceled) failed: file /builds/worker/checkouts/gecko/image/imgRequestProxy.cpp:955
[task 2022-09-28T08:33:54.524Z] 08:33:54     INFO - PID 1421 | 1664354034523	RemoteAgent	TRACE	WebDriverBiDiConnection f5d1c235-db0c-4516-bbdb-f1bf960db1ed <- {"id":2,"error":"unknown error","message":"Error: NS_ERROR_ABORT","stacktrace":"#checkLoadingState@chrome://remote/content/shared/Navigate.jsm:239:28\nonStateChange@chrome://remote/content/shared/Navigate.j ... //remote/content/server/WebSocketTransport.jsm:91:18\nhandleEvent@chrome://remote/content/server/WebSocketTransport.jsm:73:14\n"}
[task 2022-09-28T08:33:54.531Z] 08:33:54     INFO - PID 1421 | 1664354034530	RemoteAgent	TRACE	WebDriverBiDiConnection f5d1c235-db0c-4516-bbdb-f1bf960db1ed <- {"id":4,"result":{}}
[task 2022-09-28T08:33:54.589Z] 08:33:54     INFO - PID 1421 | JavaScript error: chrome://remote/content/shared/Navigate.jsm, line 239: Error: NS_ERROR_ABORT
[task 2022-09-28T08:33:54.598Z] 08:33:54     INFO - PID 1421 | [Child 8176, Main Thread] WARNING: IPC message 'PNecko::Msg_RemoveRequestContext' discarded: actor cannot send: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:504
[task 2022-09-28T08:33:54.601Z] 08:33:54     INFO - PID 1421 | [Parent 8061, IPDL Background] WARNING: IPC Connection Error: [Parent][PBackgroundParent] RunMessage(msgname=PRemoteWorkerService::Msg___delete__) Channel closing: too late to send/recv, messages will be lost: file /builds/worker/checkouts/gecko/ipc/glue/MessageChannel.cpp:1883
[task 2022-09-28T08:33:54.606Z] 08:33:54     INFO - PID 1421 | [Parent 8061, Compositor] WARNING: IPC Connection Error: [Parent][PCompositorManagerParent] RunMessage(msgname=PCompositorBridge::Msg___delete__) Channel closing: too late to send/recv, messages will be lost: file /builds/worker/checkouts/gecko/ipc/glue/MessageChannel.cpp:1883
[task 2022-09-28T08:33:54.624Z] 08:33:54     INFO - PID 1421 | [Child 8167, Main Thread] WARNING: Extra shutdown CC: 'i < NORMAL_SHUTDOWN_COLLECTIONS', file /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp:3385
[task 2022-09-28T08:33:54.626Z] 08:33:54     INFO - PID 1421 | [Child 8167, Main Thread] WARNING: NS_ENSURE_TRUE(InitStaticMembers()) failed: file /builds/worker/workspace/obj-build/dist/include/mozilla/Preferences.h:129
[task 2022-09-28T08:33:54.631Z] 08:33:54     INFO - PID 1421 | [Child 8167, Main Thread] WARNING: NS_ENSURE_TRUE(Preferences::InitStaticMembers()) failed: file /builds/worker/checkouts/gecko/modules/libpref/Preferences.cpp:4630
[task 2022-09-28T08:33:54.635Z] 08:33:54     INFO - PID 1421 | [Child 8176, Main Thread] WARNING: Extra shutdown CC: 'i < NORMAL_SHUTDOWN_COLLECTIONS', file /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp:3385
[task 2022-09-28T08:33:54.637Z] 08:33:54     INFO - PID 1421 | [Child 8176, Main Thread] WARNING: NS_ENSURE_TRUE(InitStaticMembers()) failed: file /builds/worker/workspace/obj-build/dist/include/mozilla/Preferences.h:129
[task 2022-09-28T08:33:54.642Z] 08:33:54     INFO - PID 1421 | [Child 8176, Main Thread] WARNING: NS_ENSURE_TRUE(Preferences::InitStaticMembers()) failed: file /builds/worker/checkouts/gecko/modules/libpref/Preferences.cpp:4630
[task 2022-09-28T08:35:08.036Z] 08:35:08     INFO - TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/bidi/browsing_context/navigate/wait.py | expected OK

The test aborts awaiting for the browsingContext.navigate command after 1s while the page takes 10s to load. Then a browsingContext.getTree command succeeds and the test finishes. But then the new_tab fixture hangs because of the tab being closed the navigation is aborted and returns a failure. By adding some print calls to the code the payload of the browsingContext.navigate command is the last that our Python client receives and then hangs for the result for browsingContext.close.

A simple test that can be used to reproduce this problem with the above mentioned patch landed first is the following:

async def test_slow_image(bidi_session, inline, new_tab):
    script_url = "/webdriver/tests/bidi/browsing_context/navigate/support/empty.svg"
    url = inline(f"<img src='{script_url}?pipe=trickle(d10)'>")

    with pytest.raises(asyncio.TimeoutError):
        await asyncio.wait_for(
            bidi_session.browsing_context.navigate(
                context=new_tab["context"], url=url, wait="complete"
            ),
            timeout=1,
        )

Currently we workaround that by wrapping the command inside asyncio.shield:

        with pytest.raises(asyncio.TimeoutError):
            await asyncio.wait_for(
                asyncio.shield(bidi_session.browsing_context.navigate(
                    context=context, url=url, wait=wait
                )),
                timeout=1,
            )

But it should basically work without this wrapper.

Summary: Perma TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/bidi/browsing_context/navigate/wait.py | test_slow_image[complete-True] - expected OK (hang in browsingContext.close) → Python WebDriver BiDi client hangs when asyncio.wait_for cancels an underlying command
No longer blocks: 1743116

The severity field is not set for this bug.
:whimboo, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(hskupin)
Severity: -- → S3
Flags: needinfo?(hskupin)
Priority: -- → P3
Whiteboard: [webdriver:backlog]
Component: geckodriver → WebDriver BiDi
Product: Testing → Remote Protocol
Summary: Python WebDriver BiDi client hangs when asyncio.wait_for cancels an underlying command → [wdspec] WebDriver BiDi client hangs when asyncio.wait_for cancels an underlying command
You need to log in before you can comment on or make changes to this bug.