Open Bug 1894213 Opened 6 months ago Updated 10 hours ago

Intermittent Android Wd | unknown error (500): Failed to decode response from marionette

Categories

(Remote Protocol :: WebDriver BiDi, defect, P5)

All
Android
defect

Tracking

(firefox-esr115 unaffected, firefox-esr128 unaffected, firefox131 unaffected, firefox132 affected, firefox133 affected)

Tracking Status
firefox-esr115 --- unaffected
firefox-esr128 --- unaffected
firefox131 --- unaffected
firefox132 --- affected
firefox133 --- affected

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Depends on 1 open bug, Regression)

Details

(4 keywords)

Attachments

(1 file)

Filed by: tszentpeteri [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=456393301&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/EnvWa6OeQlq_OHbDC-95HA/runs/1/artifacts/public/logs/live_backing.log


[task 2024-04-30T10:27:00.238Z] 10:27:00     INFO - STDOUT: PASSED
[task 2024-04-30T10:27:00.243Z] 10:27:00     INFO - PID 15080 | 1714472820238	RemoteAgent	DEBUG	WebDriverBiDiConnection 8ce37f78-4d09-4695-b7ef-520e7a53d4af -> {"id":24,"method":"session.unsubscribe","params":{"events":["script.message"]}}
[task 2024-04-30T10:27:00.244Z] 10:27:00     INFO - PID 15080 | 1714472820239	RemoteAgent	TRACE	Module root/session.sys.mjs found for ROOT
[task 2024-04-30T10:27:00.245Z] 10:27:00     INFO - PID 15080 | 1714472820239	RemoteAgent	TRACE	Received command session.unsubscribe for destination ROOT
[task 2024-04-30T10:27:00.245Z] 10:27:00     INFO - PID 15080 | 1714472820239	RemoteAgent	TRACE	Module root/session.sys.mjs found for ROOT
[task 2024-04-30T10:27:00.247Z] 10:27:00     INFO - PID 15080 | 1714472820240	RemoteAgent	TRACE	Module root/script.sys.mjs found for ROOT
[task 2024-04-30T10:27:00.247Z] 10:27:00     INFO - PID 15080 | 1714472820242	RemoteAgent	TRACE	Module windowglobal-in-root/script.sys.mjs found for WINDOW_GLOBAL
[task 2024-04-30T10:27:00.247Z] 10:27:00     INFO - PID 15080 | 1714472820242	RemoteAgent	TRACE	Module windowglobal/script.sys.mjs found for WINDOW_GLOBAL
[task 2024-04-30T10:27:00.247Z] 10:27:00     INFO - PID 15080 | 1714472820242	RemoteAgent	TRACE	Received command script._applySessionData for destination WINDOW_GLOBAL
[task 2024-04-30T10:27:00.248Z] 10:27:00     INFO - PID 15080 | 1714472820242	RemoteAgent	TRACE	Module windowglobal-in-root/script.sys.mjs found for WINDOW_GLOBAL
[task 2024-04-30T10:27:00.250Z] 10:27:00     INFO - PID 15080 | 1714472820242	RemoteAgent	TRACE	Module windowglobal/script.sys.mjs found for WINDOW_GLOBAL
[task 2024-04-30T10:27:00.250Z] 10:27:00     INFO - PID 15080 | 1714472820244	RemoteAgent	TRACE	Module root/script.sys.mjs found for ROOT
[task 2024-04-30T10:27:00.251Z] 10:27:00     INFO - PID 15080 | 1714472820244	RemoteAgent	TRACE	Received command script._applySessionData for destination ROOT
[task 2024-04-30T10:27:00.251Z] 10:27:00     INFO - PID 15080 | 1714472820244	RemoteAgent	TRACE	Module root/script.sys.mjs found for ROOT
[task 2024-04-30T10:27:00.251Z] 10:27:00     INFO - PID 15080 | 1714472820244	RemoteAgent	TRACE	Received command script._applySessionData for destination WINDOW_GLOBAL
[task 2024-04-30T10:27:00.251Z] 10:27:00     INFO - PID 15080 | 1714472820244	RemoteAgent	TRACE	Module windowglobal/script.sys.mjs found for WINDOW_GLOBAL
[task 2024-04-30T10:27:00.251Z] 10:27:00     INFO - PID 15080 | 1714472820246	RemoteAgent	DEBUG	WebDriverBiDiConnection 8ce37f78-4d09-4695-b7ef-520e7a53d4af <- {"type":"success","id":24,"result":{}}
[task 2024-04-30T10:27:00.251Z] 10:27:00     INFO - PID 15080 | 1714472820249	RemoteAgent	DEBUG	WebDriverBiDiConnection 8ce37f78-4d09-4695-b7ef-520e7a53d4af closed
[task 2024-04-30T10:27:00.252Z] 10:27:00     INFO - PID 15080 | 1714472820250	webdriver::server	DEBUG	-> POST /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/timeouts {"implicit": 0}
[task 2024-04-30T10:27:00.253Z] 10:27:00     INFO - PID 15080 | 1714472820252	Marionette	DEBUG	0 -> [0,85,"WebDriver:SetTimeouts",{"implicit":0}]
[task 2024-04-30T10:27:00.254Z] 10:27:00     INFO - PID 15080 | 1714472820253	Marionette	DEBUG	0 <- [1,85,null,{"value":null}]
[task 2024-04-30T10:27:00.255Z] 10:27:00     INFO - PID 15080 | 1714472820253	webdriver::server	DEBUG	<- 200 OK {"value":null}
[task 2024-04-30T10:27:00.256Z] 10:27:00     INFO - PID 15080 | 1714472820254	webdriver::server	DEBUG	-> POST /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/timeouts {"pageLoad": 300000}
[task 2024-04-30T10:27:00.256Z] 10:27:00     INFO - PID 15080 | 1714472820255	Marionette	DEBUG	0 -> [0,86,"WebDriver:SetTimeouts",{"pageLoad":300000}]
[task 2024-04-30T10:27:00.258Z] 10:27:00     INFO - PID 15080 | 1714472820256	Marionette	DEBUG	0 <- [1,86,null,{"value":null}]
[task 2024-04-30T10:27:00.258Z] 10:27:00     INFO - PID 15080 | 1714472820257	webdriver::server	DEBUG	<- 200 OK {"value":null}
[task 2024-04-30T10:27:00.259Z] 10:27:00     INFO - PID 15080 | 1714472820257	webdriver::server	DEBUG	-> POST /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/timeouts {"script": 30000}
[task 2024-04-30T10:27:00.259Z] 10:27:00     INFO - PID 15080 | 1714472820258	Marionette	DEBUG	0 -> [0,87,"WebDriver:SetTimeouts",{"script":30000}]
[task 2024-04-30T10:27:00.260Z] 10:27:00     INFO - PID 15080 | 1714472820259	Marionette	DEBUG	0 <- [1,87,null,{"value":null}]
[task 2024-04-30T10:27:00.261Z] 10:27:00     INFO - PID 15080 | 1714472820260	webdriver::server	DEBUG	<- 200 OK {"value":null}
[task 2024-04-30T10:27:00.262Z] 10:27:00     INFO - PID 15080 | 1714472820261	webdriver::server	DEBUG	-> GET /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/window
[task 2024-04-30T10:27:00.262Z] 10:27:00     INFO - PID 15080 | 1714472820262	Marionette	DEBUG	0 -> [0,88,"WebDriver:GetWindowHandle",{}]
[task 2024-04-30T10:27:00.263Z] 10:27:00     INFO - PID 15080 | 1714472820262	Marionette	DEBUG	0 <- [1,88,null,{"value":"94cb48a0-9423-4f6d-801a-c1dd37d9c254"}]
[task 2024-04-30T10:27:00.264Z] 10:27:00     INFO - PID 15080 | 1714472820262	webdriver::server	DEBUG	<- 200 OK {"value":"94cb48a0-9423-4f6d-801a-c1dd37d9c254"}
[task 2024-04-30T10:27:00.264Z] 10:27:00     INFO - PID 15080 | 1714472820264	webdriver::server	DEBUG	-> GET /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/window
[task 2024-04-30T10:27:00.265Z] 10:27:00     INFO - PID 15080 | 1714472820265	Marionette	DEBUG	0 -> [0,89,"WebDriver:GetWindowHandle",{}]
[task 2024-04-30T10:27:00.266Z] 10:27:00     INFO - PID 15080 | 1714472820265	Marionette	DEBUG	0 <- [1,89,null,{"value":"94cb48a0-9423-4f6d-801a-c1dd37d9c254"}]
[task 2024-04-30T10:27:00.266Z] 10:27:00     INFO - PID 15080 | 1714472820265	webdriver::server	DEBUG	<- 200 OK {"value":"94cb48a0-9423-4f6d-801a-c1dd37d9c254"}
[task 2024-04-30T10:27:00.267Z] 10:27:00     INFO - PID 15080 | 1714472820267	webdriver::server	DEBUG	-> GET /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/window/handles
[task 2024-04-30T10:27:00.268Z] 10:27:00     INFO - PID 15080 | 1714472820268	Marionette	DEBUG	0 -> [0,90,"WebDriver:GetWindowHandles",{}]
[task 2024-04-30T10:27:00.269Z] 10:27:00     INFO - PID 15080 | 1714472820268	Marionette	DEBUG	0 <- [1,90,null,["94cb48a0-9423-4f6d-801a-c1dd37d9c254"]]
[task 2024-04-30T10:27:00.270Z] 10:27:00     INFO - PID 15080 | 1714472820269	webdriver::server	DEBUG	<- 200 OK {"value":["94cb48a0-9423-4f6d-801a-c1dd37d9c254"]}
[task 2024-04-30T10:27:00.270Z] 10:27:00     INFO - PID 15080 | 1714472820270	webdriver::server	DEBUG	-> POST /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/window {"handle": "94cb48a0-9423-4f6d-801a-c1dd37d9c254"}
[task 2024-04-30T10:27:00.271Z] 10:27:00     INFO - PID 15080 | 1714472820270	Marionette	DEBUG	0 -> [0,91,"WebDriver:SwitchToWindow",{"handle":"94cb48a0-9423-4f6d-801a-c1dd37d9c254"}]
[task 2024-04-30T10:27:00.272Z] 10:27:00     INFO - PID 15080 | 1714472820271	Marionette	DEBUG	0 <- [1,91,null,{"value":null}]
[task 2024-04-30T10:27:00.272Z] 10:27:00     INFO - PID 15080 | 1714472820272	webdriver::server	DEBUG	<- 200 OK {"value":null}
[task 2024-04-30T10:27:00.274Z] 10:27:00     INFO - PID 15080 | 1714472820273	webdriver::server	DEBUG	-> POST /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/alert/dismiss {}
[task 2024-04-30T10:27:00.275Z] 10:27:00     INFO - PID 15080 | 1714472820274	Marionette	DEBUG	0 -> [0,92,"WebDriver:DismissAlert",{}]
[task 2024-04-30T10:27:00.276Z] 10:27:00     INFO - PID 15080 | 1714472820275	Marionette	DEBUG	0 <- [1,92,{"error":"no such alert","message":"","stacktrace":"RemoteError@chrome://remote/content/shared/RemoteError.sys.mjs:8:8\nWebDriverError@chrome://remote/content/shared/webdriver/Er ... ote/content/marionette/server.sys.mjs:263:9\n_onJSONObjectReady/<@chrome://remote/content/marionette/transport.sys.mjs:494:20\n"},null]
[task 2024-04-30T10:27:00.277Z] 10:27:00     INFO - PID 15080 | 1714472820276	webdriver::server	DEBUG	<- 404 Not Found {"value":{"error":"no such alert","message":"","stacktrace":"RemoteError@chrome://remote/content/shared/Remo ... ontent/marionette/server.sys.mjs:263:9\n_onJSONObjectReady/<@chrome://remote/content/marionette/transport.sys.mjs:494:20\n"}}
[task 2024-04-30T10:27:00.277Z] 10:27:00     INFO - PID 15080 | 1714472820277	webdriver::server	DEBUG	-> POST /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/window {"handle": "94cb48a0-9423-4f6d-801a-c1dd37d9c254"}
[task 2024-04-30T10:27:00.278Z] 10:27:00     INFO - PID 15080 | 1714472820278	Marionette	DEBUG	0 -> [0,93,"WebDriver:SwitchToWindow",{"handle":"94cb48a0-9423-4f6d-801a-c1dd37d9c254"}]
[task 2024-04-30T10:27:00.280Z] 10:27:00     INFO - PID 15080 | 1714472820279	Marionette	DEBUG	0 <- [1,93,null,{"value":null}]
[task 2024-04-30T10:27:00.280Z] 10:27:00     INFO - PID 15080 | 1714472820279	webdriver::server	DEBUG	<- 200 OK {"value":null}
[task 2024-04-30T10:27:00.281Z] 10:27:00     INFO - PID 15080 | 1714472820281	webdriver::server	DEBUG	-> GET /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/window
[task 2024-04-30T10:27:00.282Z] 10:27:00     INFO - PID 15080 | 1714472820281	Marionette	DEBUG	0 -> [0,94,"WebDriver:GetWindowHandle",{}]
[task 2024-04-30T10:27:00.283Z] 10:27:00     INFO - PID 15080 | 1714472820282	Marionette	DEBUG	0 <- [1,94,null,{"value":"94cb48a0-9423-4f6d-801a-c1dd37d9c254"}]
[task 2024-04-30T10:27:00.283Z] 10:27:00     INFO - PID 15080 | 1714472820282	webdriver::server	DEBUG	<- 200 OK {"value":"94cb48a0-9423-4f6d-801a-c1dd37d9c254"}
[task 2024-04-30T10:27:00.284Z] 10:27:00     INFO - PID 15080 | 1714472820283	webdriver::server	DEBUG	-> GET /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/window/handles
[task 2024-04-30T10:27:00.285Z] 10:27:00     INFO - PID 15080 | 1714472820284	Marionette	DEBUG	0 -> [0,95,"WebDriver:GetWindowHandles",{}]
[task 2024-04-30T10:27:00.286Z] 10:27:00     INFO - PID 15080 | 1714472820285	Marionette	DEBUG	0 <- [1,95,null,["94cb48a0-9423-4f6d-801a-c1dd37d9c254"]]
[task 2024-04-30T10:27:00.286Z] 10:27:00     INFO - PID 15080 | 1714472820286	webdriver::server	DEBUG	<- 200 OK {"value":["94cb48a0-9423-4f6d-801a-c1dd37d9c254"]}
[task 2024-04-30T10:27:00.288Z] 10:27:00     INFO - PID 15080 | 1714472820287	webdriver::server	DEBUG	-> POST /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/window {"handle": "94cb48a0-9423-4f6d-801a-c1dd37d9c254"}
[task 2024-04-30T10:27:00.289Z] 10:27:00     INFO - PID 15080 | 1714472820288	Marionette	DEBUG	0 -> [0,96,"WebDriver:SwitchToWindow",{"handle":"94cb48a0-9423-4f6d-801a-c1dd37d9c254"}]
[task 2024-04-30T10:27:00.290Z] 10:27:00     INFO - PID 15080 | 1714472820289	Marionette	DEBUG	0 <- [1,96,null,{"value":null}]
[task 2024-04-30T10:27:00.291Z] 10:27:00     INFO - PID 15080 | 1714472820290	webdriver::server	DEBUG	<- 200 OK {"value":null}
[task 2024-04-30T10:27:00.292Z] 10:27:00     INFO - PID 15080 | 1714472820291	webdriver::server	DEBUG	-> POST /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/window/rect {"width": 800, "height": 600}
[task 2024-04-30T10:27:00.293Z] 10:27:00     INFO - PID 15080 | 1714472820292	Marionette	DEBUG	0 -> [0,97,"WebDriver:SetWindowRect",{"height":600,"width":800}]
[task 2024-04-30T10:27:00.294Z] 10:27:00     INFO - PID 15080 | 1714472820293	Marionette	TRACE	Requested window geometry matches
[task 2024-04-30T10:27:00.295Z] 10:27:00     INFO - PID 15080 | 1714472820293	Marionette	DEBUG	0 <- [1,97,null,{"x":100,"y":100,"width":800,"height":600}]
[task 2024-04-30T10:27:00.295Z] 10:27:00     INFO - PID 15080 | 1714472820294	webdriver::server	DEBUG	<- 200 OK {"value":{"x":100,"y":100,"width":800,"height":600}}
[task 2024-04-30T10:27:00.296Z] 10:27:00     INFO - PID 15080 | 1714472820295	webdriver::server	DEBUG	-> POST /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19/frame {"id": null}
[task 2024-04-30T10:27:00.297Z] 10:27:00     INFO - PID 15080 | 1714472820296	Marionette	DEBUG	0 -> [0,98,"WebDriver:SwitchToFrame",{"id":null}]
[task 2024-04-30T10:27:00.302Z] 10:27:00     INFO - PID 15080 | 1714472820300	Marionette	DEBUG	0 <- [1,98,null,{"value":null}]
[task 2024-04-30T10:27:00.302Z] 10:27:00     INFO - PID 15080 | 1714472820300	webdriver::server	DEBUG	<- 200 OK {"value":null}
[task 2024-04-30T10:27:00.303Z] 10:27:00     INFO - PID 15080 | 1714472820303	webdriver::server	DEBUG	-> DELETE /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19
[task 2024-04-30T10:27:00.305Z] 10:27:00     INFO - PID 15080 | 1714472820304	Marionette	DEBUG	0 -> [0,99,"Marionette:Quit",{"flags":["eForceQuit"]}]
[task 2024-04-30T10:27:00.306Z] 10:27:00     INFO - PID 15080 | 1714472820306	Marionette	INFO	Stopped listening on port 44880
[task 2024-04-30T10:27:00.340Z] 10:27:00     INFO - PID 15080 | 1714472820332	Marionette	TRACE	[3] MarionetteCommands actor destroyed for window id 4294967297
[task 2024-04-30T10:27:00.340Z] 10:27:00     INFO - PID 15080 | 1714472820333	WebDriver BiDi	DEBUG	Unregistered session handler: /session/9f8b0e88-81c5-4925-a32d-1bd9587c2b19
[task 2024-04-30T10:27:00.342Z] 10:27:00     INFO - PID 15080 | 1714472820337	RemoteAgent	TRACE	MessageHandler ROOT for session 9f8b0e88-81c5-4925-a32d-1bd9587c2b19 is being destroyed
[task 2024-04-30T10:27:00.342Z] 10:27:00     INFO - PID 15080 | 1714472820337	RemoteAgent	TRACE	Unregistered MessageHandler ROOT for session 9f8b0e88-81c5-4925-a32d-1bd9587c2b19
[task 2024-04-30T10:27:00.365Z] 10:27:00     INFO - PID 15080 | [Utility 15917, IPC I/O Child] WARNING: [2078F4AE4E661B7E.A48A851C203B6B42]: Dropping message '<null>'; no connection to unknown peer 1.1: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:365
[task 2024-04-30T10:27:00.366Z] 10:27:00     INFO - PID 15080 | [Child 15926, IPC I/O Child] WARNING: [D9E12F578253617E.A90C2CDD8AEF678A]: Dropping message '<null>'; no connection to unknown peer 1.1: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:365
[task 2024-04-30T10:27:00.366Z] 10:27:00     INFO - PID 15080 | [Child 15881, IPC I/O Child] WARNING: [82284782C1650622.14137403A07751B6]: Dropping message '<null>'; no connection to unknown peer 1.1: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:365
[task 2024-04-30T10:27:00.367Z] 10:27:00     INFO - PID 15080 | [Child 15827, IPC I/O Child] WARNING: [B7BF54E419F325D8.1CF5B8DBC809334]: Dropping message '<null>'; no connection to unknown peer 1.1: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:365
[task 2024-04-30T10:27:00.371Z] 10:27:00     INFO - PID 15080 | Exiting due to channel error.
[task 2024-04-30T10:27:00.371Z] 10:27:00     INFO - PID 15080 | Exiting due to channel error.
[task 2024-04-30T10:27:00.375Z] 10:27:00     INFO - TEST-UNEXPECTED-ERROR | /webdriver/tests/bidi/script/call_function/channel.py | unknown error (500): Failed to decode response from marionette
[task 2024-04-30T10:27:00.376Z] 10:27:00     INFO - TEST-INFO took 7116ms
[task 2024-04-30T10:27:00.376Z] 10:27:00     INFO - PID 15080 | Exiting due to channel error.
[task 2024-04-30T10:27:00.376Z] 10:27:00     INFO - PID 15080 | [Utility 15917, Main Thread] WARNING: Shutting down Utility process early due to a crash!: file /builds/worker/checkouts/gecko/ipc/glue/UtilityProcessChild.cpp:352
[task 2024-04-30T10:27:00.376Z] 10:27:00     INFO - PID 15080 | [Child 15816, IPC I/O Child] WARNING: [D190BC62C10BEB4C.D73EFB8F6056A146]: Dropping message '<null>'; no connection to unknown peer 1.1: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:365
[task 2024-04-30T10:27:00.376Z] 10:27:00     INFO - PID 15080 | Exiting due to channel error.
[task 2024-04-30T10:27:00.377Z] 10:27:00     INFO - PID 15080 | 1714472820360	webdriver::server	DEBUG	Teardown session
[task 2024-04-30T10:27:00.377Z] 10:27:00     INFO - PID 15080 | [Child 15812, IPC I/O Child] WARNING: [517EDAF21736917D.C1AD7DC62D255F8C]: Dropping message '<null>'; no connection to unknown peer 1.1: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:365
[task 2024-04-30T10:27:00.377Z] 10:27:00     INFO - PID 15080 | 1714472820360	geckodriver::marionette	ERROR	Failed to close browser connection: unknown error
[task 2024-04-30T10:27:00.377Z] 10:27:00     INFO - PID 15080 | Exiting due to channel error.
[task 2024-04-30T10:27:00.377Z] 10:27:00     INFO - PID 15080 | 1714472820360	webdriver::server	DEBUG	<- 500 Internal Server Error {"value":{"error":"unknown error","message":"Failed to decode response from marionette","stacktrace":""}}
[task 2024-04-30T10:27:00.378Z] 10:27:00     INFO - PID 15080 | [Child 15779, IPC I/O Child] WARNING: [3389D6213DCE0DAB.8692B38B66D041F5]: Dropping message '<null>'; no connection to unknown peer 1.1: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:365
[task 2024-04-30T10:27:00.378Z] 10:27:00     INFO - PID 15080 | Exiting due to channel error.
[task 2024-04-30T10:27:00.378Z] 10:27:00     INFO - PID 15080 | 1714472820375	webdriver::server	DEBUG	-> GET /status
[task 2024-04-30T10:27:00.378Z] 10:27:00     INFO - PID 15080 | 1714472820375	webdriver::server	DEBUG	<- 200 OK {"value":{"message":"","ready":true}}
[task 2024-04-30T10:27:00.397Z] 10:27:00     INFO - PROCESS LEAKS /tmp/tmpynq1x1pd.mozrunner/runtests_leaks_907.log
[task 2024-04-30T10:27:00.398Z] 10:27:00     INFO - leakcheck | Processing log file /tmp/tmpynq1x1pd.mozrunner/runtests_leaks_907.log for scope /webdriver/tests/bidi/script/call_function
[task 2024-04-30T10:27:00.399Z] 10:27:00     INFO - TEST-INFO | leakcheck | default process: leak threshold set at 2867200 bytes
[task 2024-04-30T10:27:00.399Z] 10:27:00     INFO - TEST-INFO | leakcheck | tab process: leak threshold set at 358400 bytes
[task 2024-04-30T10:27:00.399Z] 10:27:00     INFO - TEST-INFO | leakcheck | gmplugin process: leak threshold set at 20000 bytes
[task 2024-04-30T10:27:00.400Z] 10:27:00     INFO - TEST-INFO | leakcheck | gpu process: leak threshold set at 10000 bytes
[task 2024-04-30T10:27:00.400Z] 10:27:00     INFO - TEST-INFO | leakcheck | vr process: leak threshold set at 0 bytes
[task 2024-04-30T10:27:00.401Z] 10:27:00     INFO - TEST-INFO | leakcheck | rdd process: leak threshold set at 400 bytes
[task 2024-04-30T10:27:00.402Z] 10:27:00     INFO - TEST-INFO | leakcheck | socket process: leak threshold set at 0 bytes
[task 2024-04-30T10:27:00.402Z] 10:27:00     INFO - TEST-INFO | leakcheck | sandboxbroker process: leak threshold set at 0 bytes
[task 2024-04-30T10:27:00.403Z] 10:27:00     INFO - TEST-INFO | leakcheck | forkserver process: leak threshold set at 0 bytes
[task 2024-04-30T10:27:00.403Z] 10:27:00     INFO - TEST-INFO | leakcheck | utility process: leak threshold set at 0 bytes
[task 2024-04-30T10:27:00.404Z] 10:27:00     INFO - leakcheck | Processing leak log file /tmp/tmpynq1x1pd.mozrunner/runtests_leaks_907.log
[task 2024-04-30T10:27:00.404Z] 10:27:00     INFO - TEST-UNEXPECTED-FAIL | leakcheck | default missing output line for total leaks!
[task 2024-04-30T10:27:00.405Z] 10:27:00     INFO - leakcheck | Processing leak log file /tmp/tmpynq1x1pd.mozrunner/runtests_leaks_907_tab_pid15294.log
[task 2024-04-30T10:27:00.405Z] 10:27:00     INFO - TEST-PASS | leakcheck | tab no leaks detected!

There doesn't seem to be a crash stack but as it looks like the crash is due to:

[task 2024-04-30T10:27:00.376Z] 10:27:00     INFO - PID 15080 | [Utility 15917, Main Thread] WARNING: Shutting down Utility process early due to a crash!: file /builds/worker/checkouts/gecko/ipc/glue/UtilityProcessChild.cpp:352
[task 2024-04-30T10:27:00.376Z] 10:27:00     INFO - PID 15080 | [Child 15816, IPC I/O Child] WARNING: [D190BC62C10BEB4C.D73EFB8F6056A146]: Dropping message '<null>'; no connection to unknown peer 1.1: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:365
[task 2024-04-30T10:27:00.376Z] 10:27:00     INFO - PID 15080 | Exiting due to channel error.

Component: WebDriver BiDi → IPC
Product: Remote Protocol → Core
Summary: Intermittent /webdriver/tests/bidi/script/call_function/channel.py | unknown error (500): Failed to decode response from marionette → Intermittent Wd | unknown error (500): Failed to decode response from marionette (Shutting down Utility process early due to a crash!)

IPDL produces messages like that when the other side of a IPC connection crashes. Unfortunately there's no information in the log about what the crash actually is, as far as I can see.

Maybe there's no crash report because the crash reporter is enabled but no reports are being produced? I see this in the log:

MOZ_CRASHREPORTER="1" MOZ_CRASHREPORTER_NO_REPORT="1" MOZ_CRASHREPORTER_SHUTDOWN="1" 
Component: IPC → WebDriver BiDi
Product: Core → Remote Protocol

I also see this in the log: "Marionette DEBUG 0 -> [0,99,"Marionette:Quit",{"flags":["eForceQuit"]}]"

Maybe this killed off the Firefox parent process before anybody else (the child processes or whatever was supposed to send the response) was ready?

Summary: Intermittent Wd | unknown error (500): Failed to decode response from marionette (Shutting down Utility process early due to a crash!) → Intermittent Wd | unknown error (500): Failed to decode response from marionette

(In reply to Andrew McCreight [:mccr8] from comment #2)

Maybe there's no crash report because the crash reporter is enabled but no reports are being produced? I see this in the log:

MOZ_CRASHREPORTER="1" MOZ_CRASHREPORTER_NO_REPORT="1" MOZ_CRASHREPORTER_SHUTDOWN="1" 

This is still producing minidump files which are processed if written to the minidump folder of the profile. But there is none written and as such no crash stack is available. Maybe that is because the crash happened during shutdown of Firefox? A normal shutdown sequence looks like:

https://treeherder.mozilla.org/logviewer?job_id=456443806&repo=mozilla-central&lineNumber=2152

There the quit-application observer notification is sent out, which is not the case here.

Gabriele, maybe you have an idea?

Flags: needinfo?(gsvelto)

Looks like the following failure might be related? It has a stack:
https://treeherder.mozilla.org/logviewer?job_id=456401480&repo=try&lineNumber=165209

[task 2024-04-30T13:17:15.434Z] 13:17:15     INFO - STDOUT: Initializing stack-fixing for the first stack frame, this may take a while...
[task 2024-04-30T13:17:39.525Z] 13:17:39     INFO - PID 1639 | #01: NS_DebugBreak [xpcom/base/nsDebugImpl.cpp:469]
[task 2024-04-30T13:17:39.526Z] 13:17:39     INFO - PID 1639 | #02: base::SharedMemory::CreateInternal(unsigned long, bool) [ipc/chromium/src/base/shared_memory_posix.cc:0]
[task 2024-04-30T13:17:39.526Z] 13:17:39     INFO - PID 1639 | #03: mozilla::ipc::MemMapSnapshot::Init(unsigned long) [dom/ipc/MemMapSnapshot.cpp:19]
[task 2024-04-30T13:17:39.527Z] 13:17:39     INFO - PID 1639 | #04: mozilla::dom::ipc::WritableSharedMap::Serialize() [dom/ipc/SharedMap.cpp:307]
[task 2024-04-30T13:17:39.527Z] 13:17:39     INFO - PID 1639 | #05: mozilla::dom::ipc::WritableSharedMap::BroadcastChanges() [dom/ipc/SharedMap.cpp:364]
[task 2024-04-30T13:17:39.528Z] 13:17:39     INFO - PID 1639 | #06: mozilla::dom::MozWritableSharedMap_Binding::flush(JSContext*, JS::Handle<JSObject*>, void*, JSJitMethodCallArgs const&) [s3:gecko-generated-sources-l1:bed741272d2a5ef9388e5115569040659076c5f30a91c888ffb3fd0454fdd08fe4ffe7f8c290e62115bb159d6db35bd090d8191b4e944fc83164924163ea16fd/dom/bindings/MozSharedMapBinding.cpp::1535]
[task 2024-04-30T13:17:39.529Z] 13:17:39     INFO - PID 1639 | #07: mozilla::dom::binding_detail::GenericMethod<mozilla::dom::binding_detail::NormalThisPolicy, mozilla::dom::binding_detail::ThrowExceptions>(JSContext*, unsigned int, JS::Value*) [dom/bindings/BindingUtils.cpp:3270]
[task 2024-04-30T13:17:39.529Z] 13:17:39     INFO - PID 1639 | XPCOM_MEM_BLOAT_LOG: /var/folders/sv/bm84dg856jddr1v4fmfyp054000014/T/tmp2forxex9.mozrunner/runtests_leaks_727.log
[task 2024-04-30T13:17:39.530Z] 13:17:39     INFO - PID 1639 | Writing to log: /var/folders/sv/bm84dg856jddr1v4fmfyp054000014/T/tmp2forxex9.mozrunner/runtests_leaks_727.log

Andrew, what do you think? Given that I haven't seen this failure and the one above before, they both started at the same time.

Flags: needinfo?(continuation)

Looking a few lines up:

###!!! ABORT: failed to re-open freezeable shm: Too many open files

That sounds like file descriptor exhaustion. I'm not sure what could cause that issue. Maybe files aren't being closed somewhere? Jed might have some ideas.

Flags: needinfo?(continuation)

If it's file descriptor exhaustion then it's normal we can't write a minidump, it's a known limitation we have.

Flags: needinfo?(gsvelto)

(In reply to Andrew McCreight [:mccr8] from comment #7)

Looking a few lines up:

###!!! ABORT: failed to re-open freezeable shm: Too many open files

That sounds like file descriptor exhaustion. I'm not sure what could cause that issue. Maybe files aren't being closed somewhere? Jed might have some ideas.

Here we have a case for Marionette which doesn't show up this Too many open files issue:
https://treeherder.mozilla.org/logviewer?job_id=457517859&repo=autoland&lineNumber=60233

So maybe the issue is not triggered by an OOM situation?

Flags: needinfo?(continuation)

That log doesn't contain the string "Failed to decode response from marionette". Instead it has "KeyError: 'in_app'".

Flags: needinfo?(continuation)

remote/shared/Browser.sys.mjs returns after it calls Services.startup.quit() which seems like it would be inherently racey.

(In reply to Andrew McCreight [:mccr8] from comment #12)

remote/shared/Browser.sys.mjs returns after it calls Services.startup.quit() which seems like it would be inherently racey.

No, it waits for the quit-application notification before it returns. But that one is not getting send in this case because of a crash happening before. As such Marionette is not able to send any response to the client which eventually ends-up in this other test failure.

Duplicate of this bug: 1904167

Here is how a crash looks like on Android (adb log:

--------- beginning of crash
08-07 21:55:55.856 30977 30992 F libc    : Fatal signal 11 (SIGSEGV), code 1, fault addr 0x0 in tid 30992 (Gecko)
08-07 21:55:55.857  1293  1293 W         : debuggerd: handling request: pid=30977 uid=10062 gid=10062 tid=30992
08-07 21:55:55.914 31141 31141 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
08-07 21:55:55.914 31141 31141 F DEBUG   : Build fingerprint: 'Android/sdk_phone_x86_64/generic_x86_64:7.0/NYC/4174735:userdebug/test-keys'
08-07 21:55:55.914 31141 31141 F DEBUG   : Revision: '0'
08-07 21:55:55.914 31141 31141 F DEBUG   : ABI: 'x86_64'
08-07 21:55:55.914 31141 31141 F DEBUG   : pid: 30977, tid: 30992, name: Gecko  >>> org.mozilla.geckoview.test_runner <<<
08-07 21:55:55.914 31141 31141 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
08-07 21:55:55.914 31141 31141 F DEBUG   :     rax 23ba614d02f179be  rbx 00007f5e14a80c10  rcx 0000000000000320  rdx 0000000000000090
08-07 21:55:55.914 31141 31141 F DEBUG   :     rsi 0000000000000000  rdi 0000000000000000
08-07 21:55:55.914 31141 31141 F DEBUG   :     r8  0000000000000230  r9  0000000000000006  r10 00000000ffffffe0  r11 00007f5e44787808
08-07 21:55:55.914 31141 31141 F DEBUG   :     r12 00007f5e16c73088  r13 00007f5e3b89b088  r14 00007f5e16c89970  r15 00007f5e299fdae0
08-07 21:55:55.914 31141 31141 F DEBUG   :     cs  0000000000000033  ss  000000000000002b
08-07 21:55:55.914 31141 31141 F DEBUG   :     rip 00007f5e25ffd98e  rbp 00007f5e299fdaa0  rsp 00007f5e299fda60  eflags 0000000000000206
08-07 21:55:55.917 31141 31141 F DEBUG   : 
08-07 21:55:55.917 31141 31141 F DEBUG   : backtrace:
08-07 21:55:55.917 31141 31141 F DEBUG   :     #00 pc 000000000362098e  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.917 31141 31141 F DEBUG   :     #01 pc 0000000003620c0f  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.917 31141 31141 F DEBUG   :     #02 pc 0000000003620a75  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.917 31141 31141 F DEBUG   :     #03 pc 000000000057f1a2  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.917 31141 31141 F DEBUG   :     #04 pc 0000000000576ced  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #05 pc 0000000000575e38  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #06 pc 0000000000576058  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #07 pc 0000000000581081  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #08 pc 000000000058c3c8  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #09 pc 000000000059053b  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #10 pc 0000000000c45296  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #11 pc 0000000000bfa454  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #12 pc 00000000035a6928  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #13 pc 0000000003e9e395  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #14 pc 0000000003f56f48  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #15 pc 0000000003f57808  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #16 pc 0000000003f57c79  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #17 pc 0000000003f5a750  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2727000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #18 pc 00000000000139ac  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libmozglue.so (offset 0x7a000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #19 pc 000000000121bac3  /data/app/org.mozilla.geckoview.test_runner-1/oat/x86_64/base.odex (offset 0x110b000)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #20 pc 0000000000901c0f  <anonymous:00007f5e290fd000>
08-07 21:55:55.918 31141 31141 F DEBUG   :     #21 pc 00000000002e7259  /system/lib64/libart.so (_ZN3art11interpreter30EnterInterpreterFromEntryPointEPNS_6ThreadEPKNS_7DexFile8CodeItemEPNS_11ShadowFrameE+105)
08-07 21:55:55.918 31141 31141 F DEBUG   :     #22 pc 12c810e00000000b  <unknown>
08-07 21:55:56.337  1656 31146 W ActivityManager:   Force finishing activity org.mozilla.geckoview.test_runner/.App
08-07 21:55:56.338  1656  1674 I BootReceiver: Copying /data/tombstones/tombstone_00 to DropBox (SYSTEM_TOMBSTONE)
08-07 21:55:56.343  1293  1293 W         : debuggerd: resuming target 30977
08-07 21:55:56.358  1656  1694 W InputDispatcher: channel '1d9bf9e org.mozilla.geckoview.test_runner/org.mozilla.geckoview.test_runner.App (server)' ~ Consumer closed input channel or an error occurred.  events=0x9
08-07 21:55:56.358  1656  1694 E InputDispatcher: channel '1d9bf9e org.mozilla.geckoview.test_runner/org.mozilla.geckoview.test_runner.App (server)' ~ Channel is unrecoverably broken and will be disposed!
08-07 21:55:56.360 31010 31098 I Gecko   : Exiting due to channel error.
08-07 21:55:56.373  1656  1668 D GraphicsStats: Buffer count: 4
08-07 21:55:56.374  1656  1891 I ActivityManager: Process org.mozilla.geckoview.test_runner (pid 30977) has died

Note that the crash is not processed and maybe doing it manually would reveal more information. At least it can be seen that it is happening during the shutdown of the browser.

Note that there is a spike of this issue - most likely because of such crashes - in the last days. We should observe that and most likely should inform the GeckoView team about this issue. Let me check back this evening and escalate if needed.

The spike is gone. It could mean that the potential regressor got backed out or a follow-up fix landed.

So the problem here might also be related to the Exiting due to channel error issue that is as well covered on bug 1906583 and requires some more investigation on bug 1908293.

08-08 22:43:02.452 11105 11120 W google-breakpad: ExceptionHandler::GenerateDump minidump generation 
08-08 22:43:02.452 11105 11120 W google-breakpad: succeeded
08-08 22:43:02.476  1658  1695 W InputDispatcher: channel 'bd03f8c org.mozilla.geckoview.test_runner/org.mozilla.geckoview.test_runner.App (server)' ~ Consumer closed input channel or an error occurred.  events=0x9
08-08 22:43:02.476  1658  1695 E InputDispatcher: channel 'bd03f8c org.mozilla.geckoview.test_runner/org.mozilla.geckoview.test_runner.App (server)' ~ Channel is unrecoverably broken and will be disposed!
08-08 22:43:02.478 11173 11197 I Gecko   : [GPU 11173, IPC I/O Child] WARNING: [5ED1778DCE92BE12.EE58521795472396]: Dropping message '<null>'; no connection to unknown peer 1.1: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:364
08-08 22:43:02.479 11173 11190 I Gecko   : [GPU 11173, Main Thread] WARNING: Shutting down GPU process early due to a crash!: file /builds/worker/checkouts/gecko/gfx/ipc/GPUParent.cpp:734
08-08 22:43:02.479 11138 11220 I Gecko   : [Child 11138, IPC I/O Child] WARNING: [D29AE49CDA952F5D.1E21F6D4F7E1FA31]: Dropping message '<null>'; no connection to unknown peer 1.1: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:364
08-08 22:43:02.480 11138 11220 I Gecko   : Exiting due to channel error.
08-08 22:43:02.482  1658  1886 I WindowManager: WIN DEATH: Window{6c6383e u0 SurfaceView - org.mozilla.geckoview.test_runner/org.mozilla.geckoview.test_runner.App}
Depends on: 1908293
See Also: → 1906583

There is a huge increase of failures reported for this bug starting on September 24th (Tuesday last week). These are all happening on Android and are related to a shutdown crash of GeckoView and show a Channel is unrecoverably broken and will be disposed message.

Take the following job as example:
https://treeherder.mozilla.org/logviewer?job_id=476193393&repo=autoland&lineNumber=93010

In the Android logcat you can see the following:

09-29 22:27:21.761 14774 14789 I Gecko   : 1727648841761	Marionette	DEBUG	0 <- [1,26,null,{"cause":"shutdown","forced":false,"in_app":true}]
09-29 22:27:21.770 14774 14789 E GeckoConsole: [JavaScript Error: "TypeError: can't access property "window", this is undefined" {file: "resource://gre/modules/GeckoViewContent.sys.mjs" line: 211}]
09-29 22:27:21.770 14774 14789 E GeckoConsole: onResize@resource://gre/modules/GeckoViewContent.sys.mjs:211:15
09-29 22:27:21.770 14774 14789 E GeckoConsole: onEvent/<@resource://gre/modules/GeckoViewContent.sys.mjs:233:13
09-29 22:27:21.771 14774 14789 E GeckoConsole: [JavaScript Error: "TypeError: can't access property "window", this is undefined" {file: "resource://gre/modules/GeckoViewContent.sys.mjs" line: 211}]
09-29 22:27:21.771 14774 14789 E GeckoConsole: onResize@resource://gre/modules/GeckoViewContent.sys.mjs:211:15
09-29 22:27:21.771 14774 14789 E GeckoConsole: onEvent/<@resource://gre/modules/GeckoViewContent.sys.mjs:233:13
09-29 22:27:21.777 14774 14789 I Gecko   : nsWindow[0x7ec8e0180900]::Show 0
09-29 22:27:21.782 14774 14789 D GeckoViewNavigation: onDisable
09-29 22:27:21.782 14774 14789 D GeckoViewProcessHangMonitor: onDisable
09-29 22:27:21.782 14774 14789 D GeckoViewModule: unregisterListener
09-29 22:27:21.783 14774 14789 D GeckoViewProcessHangMonitor: onDestroy
09-29 22:27:21.784 14774 14789 D GeckoViewSelectionAction: onDisable
09-29 22:27:21.784 14774 14789 D GeckoViewModule: unregisterListener
09-29 22:27:21.784 14919 14933 D GeckoViewSelectionActionDelegate[C]: handleEvent: visibilitychange
09-29 22:27:21.788 14774 14789 D GeckoViewSessionStore: observe browsing-context-discarded
09-29 22:27:21.791 14774 14789 D GeckoViewSessionStore: observe browsing-context-discarded
09-29 22:27:21.792 14919 14933 D GeckoViewSelectionActionDelegate[C]: handleEvent: visibilitychange
09-29 22:27:21.793 14919 14933 D GeckoViewAutoFill[C]: handleEvent: pagehide
09-29 22:27:21.795 14919 14933 D GeckoViewAutoFill[C]: handleEvent: focusout
09-29 22:27:21.795 14919 14933 D GeckoViewAutoFill[C]: Auto-fill focus on null
09-29 22:27:21.818 14843 14871 D EGL_emulation: eglMakeCurrent: 0x7ec909dba080: ver 3 0 (tinfo 0x7ec909dbd000)
09-29 22:27:21.823 14774 14774 W IInputConnectionWrapper: getTextBeforeCursor on inactive InputConnection
09-29 22:27:21.823  1732  1732 E RichInputConnection: Unable to connect to the editor to retrieve text.
09-29 22:27:21.823  1732  1732 D RichInputConnection: Will try to retrieve text later.
09-29 22:27:21.824 14843 14871 D EGL_emulation: eglMakeCurrent: 0x7ec909dba080: ver 3 0 (tinfo 0x7ec909dbd000)
09-29 22:27:21.824 14774 14774 W IInputConnectionWrapper: getTextBeforeCursor on inactive InputConnection
09-29 22:27:21.824  1732  1732 E RichInputConnection: Unable to connect to the editor to retrieve text.
09-29 22:27:21.824 14843 14871 E eglCodecCommon: glUtilsParamSize: unknow param 0x00008caa
09-29 22:27:21.824  1732  1732 W RichInputConnection: Unable to connect to the editor. Setting caps mode without knowing text.
09-29 22:27:21.827  1732  1732 I LatinIME: Starting input. Cursor position = -1,-1
09-29 22:27:21.835  1732  1732 W InputMethodService: Window size has been changed. This may cause jankiness of resizing window: -1 -> -2
09-29 22:27:21.852  1732  8716 D EGL_emulation: eglMakeCurrent: 0x7ec91447e880: ver 3 0 (tinfo 0x7ec8fb8e65e0)
09-29 22:27:21.886 14774 14789 W google-breakpad: ExceptionHandler::GenerateDump attempting to generate:
09-29 22:27:21.886 14774 14789 W google-breakpad: /storage/emulated/0/Android/data/org.mozilla.geckoview.test_runner/files/test_root/org.mozilla.geckoview.test_runner-geckodriver-profile/minidumps/547cffe9-cf77-29fd-6b6e-d558df509c17.dmp
09-29 22:27:21.886 14774 14789 W google-breakpad: 
09-29 22:27:21.890 14774 14789 W google-breakpad: ExceptionHandler::GenerateDump cloned child 
09-29 22:27:21.890 14774 14789 W google-breakpad: 14959
09-29 22:27:21.890 14774 14789 W google-breakpad: 
09-29 22:27:21.890 14774 14789 W google-breakpad: ExceptionHandler::SendContinueSignalToChild sent continue signal to child
09-29 22:27:21.890 14959 14789 W google-breakpad: ExceptionHandler::WaitForContinueSignal waiting for continue signal...
09-29 22:27:22.032  1659  1680 I WindowManager: Destroying surface Surface(name=InputMethod) called by com.android.server.wm.WindowStateAnimator.destroySurface:2014 com.android.server.wm.WindowStateAnimator.destroySurfaceLocked:881 com.android.server.wm.WindowState.destroyOrSaveSurface:2073 com.android.server.wm.WindowSurfacePlacer.performSurfacePlacementInner:429 com.android.server.wm.WindowSurfacePlacer.performSurfacePlacementLoop:232 com.android.server.wm.WindowSurfacePlacer.performSurfacePlacement:180 com.android.server.wm.WindowManagerService$H.handleMessage:8079 android.os.Handler.dispatchMessage:102 
09-29 22:27:24.007  1732  1732 W InputConnectionWrapper.ICC: Timed out waiting on IInputContextCallback
09-29 22:27:24.007  1732  1732 W RichInputConnection: Slow InputConnection: GET_TEXT_AFTER_CURSOR took 2040 ms.
09-29 22:27:29.463 14774 14789 W google-breakpad: ExceptionHandler::GenerateDump minidump generation 
09-29 22:27:29.463 14774 14789 W google-breakpad: succeeded
09-29 22:27:29.468 14774 14962 D ProfileInstaller: Installing profile for org.mozilla.geckoview.test_runner
09-29 22:27:29.487  1659  1696 W InputDispatcher: channel '51e2207 org.mozilla.geckoview.test_runner/org.mozilla.geckoview.test_runner.App (server)' ~ Consumer closed input channel or an error occurred.  events=0x9
09-29 22:27:29.487  1659  1696 E InputDispatcher: channel '51e2207 org.mozilla.geckoview.test_runner/org.mozilla.geckoview.test_runner.App (server)' ~ Channel is unrecoverably broken and will be disposed!
09-29 22:27:29.491  1659  1769 D GraphicsStats: Buffer count: 4
09-29 22:27:29.492  1659  1670 I WindowManager: WIN DEATH: Window{51e2207 u0 org.mozilla.geckoview.test_runner/org.mozilla.geckoview.test_runner.App}
09-29 22:27:29.492  1659  1670 W InputDispatcher: Attempted to unregister already unregistered input channel '51e2207 org.mozilla.geckoview.test_runner/org.mozilla.geckoview.test_runner.App (server)'

The affected tests seems to be all around classic's Perform Actions and most likely are caused by my metadata update from bug 1920573.

I probably have to mark the erroring tests as expected [OK, ERROR] for now but it would be good to know why GeckoView is failing here.

Chris, are you aware of such a shutdown crash yet?

Flags: needinfo?(cpeterson)

Chris, are you aware of such a shutdown crash yet?

No, but I'll follow up with the engineers. Thanks for the heads up!

Flags: needinfo?(cpeterson)

(In reply to Henrik Skupin [:whimboo][⌚️UTC+2] from comment #29)

The spike is gone. It could mean that the potential regressor got backed out or a follow-up fix landed.

So the problem here might also be related to the Exiting due to channel error issue that is as well covered on bug 1906583 and requires some more investigation on bug 1908293.

Probably what we all see here is basically bug 1906583. Given that it is Android it's just harder to see in the Android adb log. So most likely the parent process exits too early while child processes are still open?

Chris, I can revert the meta data changes for the tests in the https://searchfox.org/mozilla-central/source/testing/web-platform/mozilla/tests/webdriver/classic/perform_actions/ folder so that the failures are gone. It looks like the issue is already around for a while. Not sure how long it would take to investigate and fix.

Flags: needinfo?(cpeterson)

(In reply to Henrik Skupin [:whimboo][⌚️UTC+2] from comment #39)

Probably what we all see here is basically bug 1906583. Given that it is Android it's just harder to see in the Android adb log. So most likely the parent process exits too early while child processes are still open?

Chris, I can revert the meta data changes for the tests in the https://searchfox.org/mozilla-central/source/testing/web-platform/mozilla/tests/webdriver/classic/perform_actions/ folder so that the failures are gone. It looks like the issue is already around for a while. Not sure how long it would take to investigate and fix.

In that case, restoring the [OK, ERROR] test expectation for these Android tests sounds good.

Do you recommend we use this bug or UnknownErrorException bug 1906583 to track this intermittent test failure? Should we resolve on of these bugs as a duplicate of the other?

Flags: needinfo?(cpeterson) → needinfo?(hskupin)

(In reply to Chris Peterson [:cpeterson] from comment #40)

In that case, restoring the [OK, ERROR] test expectation for these Android tests sounds good.

Ok, so I'll add those items back for now to get the tests no longer erroring out.

Do you recommend we use this bug or UnknownErrorException bug 1906583 to track this intermittent test failure? Should we resolve on of these bugs as a duplicate of the other?

I would suggest that we make that bug specific to Android as a regression from my work on bug 1920573. With the test expectation updates we could mark it as fixed. Do you have a GeckoView specific bug that I can reference in the wpt manifest?

Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Depends on: 1906583
Flags: needinfo?(hskupin) → needinfo?(cpeterson)
OS: Unspecified → Android
Hardware: Unspecified → All
Summary: Intermittent Wd | unknown error (500): Failed to decode response from marionette → Intermittent Wd Android | unknown error (500): Failed to decode response from marionette

Tests affected in order of appearance:

  • /webdriver/tests/classic/perform_actions/sequence.py (most often)
  • /webdriver/tests/classic/element_click/interactability.py
  • /webdriver/tests/classic/perform_actions/key_special_keys.py
  • /webdriver/tests/classic/perform_actions/key_shortcuts.py

The both last tests do fail very rarely so I might only update the expectations for the first two ones.

I saw one crash report in this Android logcat (CI job):

09-25 17:28:19.298  3743  3758 I Gecko   : 1727285299298	Marionette	DEBUG	0 <- [1,268,null,{"cause":"shutdown","forced":false,"in_app":true}]
09-25 17:28:19.309  3743  3758 E GeckoConsole: [JavaScript Error: "TypeError: can't access property "window", this is undefined" {file: "resource://gre/modules/GeckoViewContent.sys.mjs" line: 211}]
09-25 17:28:19.309  3743  3758 E GeckoConsole: onResize@resource://gre/modules/GeckoViewContent.sys.mjs:211:15
09-25 17:28:19.309  3743  3758 E GeckoConsole: onEvent/<@resource://gre/modules/GeckoViewContent.sys.mjs:233:13
09-25 17:28:19.309  3743  3758 E GeckoConsole: [JavaScript Error: "TypeError: can't access property "window", this is undefined" {file: "resource://gre/modules/GeckoViewContent.sys.mjs" line: 211}]
09-25 17:28:19.309  3743  3758 E GeckoConsole: onResize@resource://gre/modules/GeckoViewContent.sys.mjs:211:15
09-25 17:28:19.309  3743  3758 E GeckoConsole: onEvent/<@resource://gre/modules/GeckoViewContent.sys.mjs:233:13
09-25 17:28:19.333  3812  3840 D EGL_emulation: eglMakeCurrent: 0x79f1e764b320: ver 3 0 (tinfo 0x79f1e760c480)
09-25 17:28:19.346  3812  3840 D EGL_emulation: eglMakeCurrent: 0x79f1e764b320: ver 3 0 (tinfo 0x79f1e760c480)
09-25 17:28:19.347  3812  3840 E eglCodecCommon: glUtilsParamSize: unknow param 0x00008caa
--------- beginning of crash
09-25 17:28:19.360  3743  3758 F libc    : Fatal signal 11 (SIGSEGV), code 1, fault addr 0x0 in tid 3758 (Gecko)
09-25 17:28:19.361  3743  3743 W IInputConnectionWrapper: getTextBeforeCursor on inactive InputConnection
09-25 17:28:19.361  1732  1732 E RichInputConnection: Unable to connect to the editor to retrieve text.
09-25 17:28:19.361  1293  1293 W         : debuggerd: handling request: pid=3743 uid=10062 gid=10062 tid=3758
09-25 17:28:19.362  1732  1732 D RichInputConnection: Will try to retrieve text later.
09-25 17:28:19.422  3944  3944 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
09-25 17:28:19.422  3944  3944 F DEBUG   : Build fingerprint: 'Android/sdk_phone_x86_64/generic_x86_64:7.0/NYC/4174735:userdebug/test-keys'
09-25 17:28:19.422  3944  3944 F DEBUG   : Revision: '0'
09-25 17:28:19.423  3944  3944 F DEBUG   : ABI: 'x86_64'
09-25 17:28:19.423  3944  3944 F DEBUG   : pid: 3743, tid: 3758, name: Gecko  >>> org.mozilla.geckoview.test_runner <<<
09-25 17:28:19.423  3944  3944 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
09-25 17:28:19.423  3944  3944 F DEBUG   :     rax 000079f1cf2239d0  rbx 000079f1ef5bd240  rcx 0000000000000320  rdx 0000000000000090
09-25 17:28:19.423  3944  3944 F DEBUG   :     rsi 0000000000000000  rdi 0000000000000000
09-25 17:28:19.423  3944  3944 F DEBUG   :     r8  0000000000000230  r9  0000000000000006  r10 00000000ffffff80  r11 000079f1f17bc808
09-25 17:28:19.423  3944  3944 F DEBUG   :     r12 000079f1bf925e20  r13 000079f1d57fda78  r14 000079f1bdaf5fb0  r15 0000000000200005
09-25 17:28:19.423  3944  3944 F DEBUG   :     cs  0000000000000033  ss  000000000000002b
09-25 17:28:19.423  3944  3944 F DEBUG   :     rip 000079f1cf2239de  rbp 000079f1d57fd940  rsp 000079f1d57fd910  eflags 0000000000000246
09-25 17:28:19.426  3944  3944 F DEBUG   : 
09-25 17:28:19.426  3944  3944 F DEBUG   : backtrace:
09-25 17:28:19.426  3944  3944 F DEBUG   :     #00 pc 0000000003d059de  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #01 pc 0000000003d05b0a  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #02 pc 00000000000e17e2  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #03 pc 00000000000d4b8c  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #04 pc 00000000000b124d  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #05 pc 00000000000d3591  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #06 pc 000000000003593e  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #07 pc 00000000000e0d78  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #08 pc 0000000004522d24  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #09 pc 000000000026b09a  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #10 pc 000000000000e3bc  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #11 pc 000000000000df67  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #12 pc 000000000000ddc3  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libxul.so (offset 0x2d92000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #13 pc 0000000000086e5d  /data/app/org.mozilla.geckoview.test_runner-1/lib/x86_64/libmozglue.so (offset 0x76000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #14 pc 0000000001248f73  /data/app/org.mozilla.geckoview.test_runner-1/oat/x86_64/base.odex (offset 0x1131000)
09-25 17:28:19.427  3944  3944 F DEBUG   :     #15 pc 0000000000901c0f  <anonymous:000079f1d4efd000>
09-25 17:28:19.428  3944  3944 F DEBUG   :     #16 pc 00000000002e7259  /system/lib64/libart.so (_ZN3art11interpreter30EnterInterpreterFromEntryPointEPNS_6ThreadEPKNS_7DexFile8CodeItemEPNS_11ShadowFrameE+105)
09-25 17:28:19.428  3944  3944 F DEBUG   :     #17 pc 12c410e00000000b  <unknown>

Maybe that's related to ShadowRoots? See frame 16.

No longer depends on: 1906583, 1908293

Adding leave-open so the bug doesn't get closed while we don't have a proper fix in place.

Keywords: leave-open
Pushed by hskupin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5f5ca231643e [wdspec] Update expecations for tests causing "Failed to decode response from marionette" errors on Android. r=webdriver-reviewers,jdescottes
Assignee: hskupin → nobody
Status: ASSIGNED → NEW
Blocks: 1906583
See Also: 1906583

I would suggest that we make that bug specific to Android as a regression from my work on bug 1920573. With the test expectation updates we could mark it as fixed.

SGTM. I'll update bug 1906583 as you recommend and then we can resolve it as fixed by this bug's rollback of the test expectations.

Do you have a GeckoView specific bug that I can reference in the wpt manifest?

Can we use Marionette bug 1906583? I'm not familiar with this test failure, so I don't know if there is a related GeckoView or Fenix bug.

Flags: needinfo?(cpeterson) → needinfo?(hskupin)
No longer blocks: 1906583
See Also: → 1906583

(In reply to Chris Peterson [:cpeterson] from comment #48)

I would suggest that we make that bug specific to Android as a regression from my work on bug 1920573. With the test expectation updates we could mark it as fixed.

SGTM. I'll update bug 1906583 as you recommend and then we can resolve it as fixed by this bug's rollback of the test expectations.

Given that sheriffs always use this bug for classification we should keep using it for the regression on Android.

Do you have a GeckoView specific bug that I can reference in the wpt manifest?

Can we use Marionette bug 1906583? I'm not familiar with this test failure, so I don't know if there is a related GeckoView or Fenix bug.

Marionette client doesn't support Android so we cannot use it but have to run web-platform tests to get it into such a state. Someone could try to nail down the number of tests to run by using the test that were mostly failing in CI and that was marked with a multiple status (see landed patch on this bug). Maybe getting the crash reports from the device might be good but geckodriver doesn't have the capability yet.

Note that the issue might be already around for quite some time, and maybe we should check how SHIP and Fission behaves with the tests enabled again and run on try. If it only fails for no-fis maybe we could ignore it?

Please note that I'm off until Oct 13th, but someone else from the team like James, Julian, or Sasha can help. For questions feel free to ask in the #webdriver Matrix channel. Thanks!

Flags: needinfo?(hskupin)

ni? myself to keep track of this bug.

If I find time, I'll try to look into:

Note that the issue might be already around for quite some time, and maybe we should check how SHIP and Fission behaves with the tests enabled again and run on try. If it only fails for no-fis maybe we could ignore it?

Flags: needinfo?(ohall)

Comment on attachment 9428365 [details]
Bug 1894213 - [wdspec] Update expecations for tests causing "Failed to decode response from marionette" errors on Android.

Beta/Release Uplift Approval Request

  • User impact if declined: Tests are still failing on beta because this patch rolled back expectation changes that initially landed in 132. It would be good to uplift it to reduce the sheriffing noise
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Test expectation change, no actual code change.
  • String changes made/needed:
  • Is Android affected?: No
Attachment #9428365 - Flags: approval-mozilla-beta?

Comment on attachment 9428365 [details]
Bug 1894213 - [wdspec] Update expecations for tests causing "Failed to decode response from marionette" errors on Android.

a=test-only

Attachment #9428365 - Flags: approval-mozilla-beta?
Summary: Intermittent Wd Android | unknown error (500): Failed to decode response from marionette → Intermittent Android Wd | unknown error (500): Failed to decode response from marionette

We would need to pull the generated minidump files off the device and process with mozcrash. That means we need bug 1859377 fixed first.

Depends on: 1859377
Flags: needinfo?(ohall)

Note that the issue might be already around for quite some time, and maybe we should check how SHIP and Fission behaves with the tests enabled again and run on try. If it only fails for no-fis maybe we could ignore it?

Sent a try up with the metadata reverted just to follow up on this, in case there is something interesting there. Will wait for bug 1859377 before looking further.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: