Open Bug 1630162 Opened 1 year ago Updated 15 days ago

Intermittent Wd TEST-UNEXPECTED-TIMEOUT | expected OK [shutdown hang]

Categories

(Testing :: geckodriver, defect, P5)

Version 3
defect

Tracking

(Not tracked)

REOPENED

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Depends on 2 open bugs)

Details

(Keywords: intermittent-failure)

Filed by: ccoroiu [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=297631286&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/OIV_AGuRTDeuoSBzs4niRQ/runs/0/artifacts/public/logs/live_backing.log


[task 2020-04-15T00:29:36.827Z] 00:29:36 INFO - PID 1496 | 1586910576818 Marionette DEBUG Resetting recommended pref dom.file.createInChild
[task 2020-04-15T00:29:36.827Z] 00:29:36 INFO - PID 1496 | 1586910576818 Marionette DEBUG Resetting recommended pref dom.max_chrome_script_run_time
[task 2020-04-15T00:29:36.828Z] 00:29:36 INFO - PID 1496 | 1586910576819 Marionette DEBUG Resetting recommended pref dom.max_script_run_time
[task 2020-04-15T00:29:36.828Z] 00:29:36 INFO - PID 1496 | 1586910576820 Marionette DEBUG Resetting recommended pref dom.push.connection.enabled
[task 2020-04-15T00:29:36.829Z] 00:29:36 INFO - PID 1496 | 1586910576820 Marionette DEBUG Resetting recommended pref extensions.getAddons.cache.enabled
[task 2020-04-15T00:29:36.830Z] 00:29:36 INFO - PID 1496 | 1586910576820 Marionette DEBUG Resetting recommended pref extensions.webservice.discoverURL
[task 2020-04-15T00:29:36.830Z] 00:29:36 INFO - PID 1496 | 1586910576821 Marionette DEBUG Resetting recommended pref network.http.prompt-temp-redirect
[task 2020-04-15T00:29:36.831Z] 00:29:36 INFO - PID 1496 | 1586910576821 Marionette DEBUG Resetting recommended pref network.http.speculative-parallel-limit
[task 2020-04-15T00:29:36.831Z] 00:29:36 INFO - PID 1496 | 1586910576821 Marionette DEBUG Resetting recommended pref privacy.trackingprotection.enabled
[task 2020-04-15T00:29:36.832Z] 00:29:36 INFO - PID 1496 | 1586910576821 Marionette DEBUG Resetting recommended pref security.fileuri.strict_origin_policy
[task 2020-04-15T00:29:36.832Z] 00:29:36 INFO - PID 1496 | 1586910576821 Marionette DEBUG Resetting recommended pref security.notification_enable_delay
[task 2020-04-15T00:29:36.832Z] 00:29:36 INFO - PID 1496 | 1586910576822 Marionette DEBUG Resetting recommended pref signon.autofillForms
[task 2020-04-15T00:29:36.832Z] 00:29:36 INFO - PID 1496 | 1586910576823 Marionette DEBUG Resetting recommended pref signon.rememberSignons
[task 2020-04-15T00:29:36.832Z] 00:29:36 INFO - PID 1496 | 1586910576824 Marionette DEBUG Resetting recommended pref toolkit.cosmeticAnimations.enabled
[task 2020-04-15T00:29:36.832Z] 00:29:36 INFO - PID 1496 | 1586910576826 Marionette DEBUG Marionette stopped listening
[task 2020-04-15T00:29:36.832Z] 00:29:36 INFO - PID 1496 | DEBUG: Starting phase xpcom-will-shutdown
[task 2020-04-15T00:29:36.832Z] 00:29:36 INFO - PID 1496 | DEBUG: Spinning the event loop
[task 2020-04-15T00:29:36.922Z] 00:29:36 INFO - PID 1496 | DEBUG: Finished phase xpcom-will-shutdown
[task 2020-04-15T00:29:36.923Z] 00:29:36 INFO - PID 1496 |
[task 2020-04-15T00:29:36.924Z] 00:29:36 INFO - PID 1496 | ###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
[task 2020-04-15T00:29:36.925Z] 00:29:36 INFO - PID 1496 |
[task 2020-04-15T00:29:36.925Z] 00:29:36 INFO - PID 1496 | DEBUG: Starting phase web-workers-shutdown
[task 2020-04-15T00:29:36.926Z] 00:29:36 INFO - PID 1496 | DEBUG: Spinning the event loop
[task 2020-04-15T00:29:36.926Z] 00:29:36 INFO - PID 1496 | DEBUG: Completed blocker OS.File: flush pending requests, warn about unclosed files, shut down service. for phase web-workers-shutdown
[task 2020-04-15T00:29:36.926Z] 00:29:36 INFO - PID 1496 | DEBUG: Finished phase web-workers-shutdown
[task 2020-04-15T00:30:09.508Z] 00:30:09 INFO - TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/set_window_rect/user_prompts.py | expected OK
[task 2020-04-15T00:30:09.508Z] 00:30:09 INFO - TEST-INFO took 365004ms
[task 2020-04-15T00:30:09.589Z] 00:30:09 INFO - Closing logging queue
[task 2020-04-15T00:30:09.589Z] 00:30:09 INFO - queue closed
[task 2020-04-15T00:30:09.589Z] 00:30:09 INFO - Starting runner
[task 2020-04-15T00:30:10.019Z] 00:30:10 INFO - PID 2596 | 1586910610002 geckodriver DEBUG Listening on 127.0.0.1:49240
[task 2020-04-15T00:30:10.505Z] 00:30:10 INFO - WebDriver HTTP server listening at http://127.0.0.1:49240/
[task 2020-04-15T00:30:10.505Z] 00:30:10 INFO - TEST-START | /webdriver/tests/status/status.py
[task 2020-04-15T00:30:10.796Z] 00:30:10 INFO - STDOUT: ============================= test session starts =============================
[task 2020-04-15T00:30:10.796Z] 00:30:10 INFO - STDOUT: platform win32 -- Python 2.7.15, pytest-unknown, py-1.5.2, pluggy-0.5.3.dev -- Z:\task_1586905291\build\venv\Scripts\python.exe
[task 2020-04-15T00:30:10.796Z] 00:30:10 INFO - STDOUT: rootdir: Z:\task_1586905291\build, inifile:
[task 2020-04-15T00:30:10.796Z] 00:30:10 INFO - STDOUT: collecting ...
[task 2020-04-15T00:30:10.796Z] 00:30:10 INFO - STDOUT: collected 2 items
[task 2020-04-15T00:30:10.797Z] 00:30:10 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/status/status.py::test_get_status_no_session
[task 2020-04-15T00:30:10.798Z] 00:30:10 INFO - PID 2596 | 1586910610795 webdriver::server DEBUG -> GET /status
[task 2020-04-15T00:30:10.799Z] 00:30:10 INFO - PID 2596 | 1586910610795 webdriver::server DEBUG <- 200 OK {"value":{"message":"","ready":true}}
[task 2020-04-15T00:30:10.799Z] 00:30:10 INFO - STDOUT: PASSED
[task 2020-04-15T00:30:10.799Z] 00:30:10 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/status/status.py::test_status_with_session_running_on_endpoint_node
[task 2020-04-15T00:30:10.806Z] 00:30:10 INFO - PID 2596 | 1586910610798 webdriver::server DEBUG -> POST /session {"capabilities": {"alwaysMatch": {"moz:firefoxOptions": {"binary": "Z:\task_1586905291\build\application\firefox\firefox.exe", "prefs": {"network.process.enabled": false, "toolkit.asyncshutdown.log": true, "network.dns.localDomains": "www1.xn--n8j6ds53lwwkrqhv28a.web-platform.test,op88.web-platform.test,op36.not-web-platform.test,op53.not-web-platform.test,op50.not-web-platform.test,xn--lve-6lad.www.web-platform.test,op98.web-platform.test,op24.not-web-platform.test,op31.not-web-platform.test,op95.not-web-platform.test,op85.web-platform.test,op83.not-web-platform.test,www2.not-web-platform.test,xn--lve-6lad.www.not-web-platform.test,op73.not-web-platform.test,op8.web-platform.test,www2.www2.not-web-platform.test,op89.web-platform.test,op66.web-platform.test,xn--lve-6lad.web-platform.test,op19.not-web-platform.test,www1.www2.web-platform.test,op72.web-platform.test,op24.web-platform.test,op21.not-web-platform.test,xn--lve-6lad.not-web-platform.test,op41.web-platform.test,op79.web-platform.test,op81.not-web-platform.test,op70.not-web-platform.test,xn--n8j6ds53lwwkrqhv28a.xn--lve-6lad.not-web-platform.test,op78.not-web-platform.test,op6.not-web-platform.test,www1.www.not-web-platform.test,op40.not-web-platform.test,op25.not-web-platform.test,op3.not-web-platform.test,op65.not-web-platform.test,op91.web-platform.test,www.www2.web-platform.test,op80.not-web-platform.test,op59.web-platform.test,op52.not-web-platform.test,xn--lve-6lad.xn--lve-6lad.web-platform.test,op68.not-web-platform.test,op45.not-web-platform.test,op71.not-web-platform.test,op72.not-web-platform.test,xn--n8j6ds53lwwkrqhv28a.www2.web-platform.test,op39.web-platform.test,op90.not-web-platform.test,op60.web-platform.test,op58.web-platform.test,op28.web-platform.test,www1.web-platform.test,xn--n8j6ds53lwwkrqhv28a.xn--lve-6lad.web-platform.test,op14.web-platform.test,op89.not-web-platform.test,op69.web-platform.test,op49.not-web-platform.test,op40.web-platform.test,op2.not-web-platform.test,op5.not-web-platform.test,www.www2.not-web-platform.test,op77.not-web-platform.test,www.xn--n8j6ds53lwwkrqhv28a.web-platform.test,op7.web-platform.test,op74.web-platform.test,op79.not-web-platform.test,op82.not-web-platform.test,www.www1.web-platform.test,op12.not-web-platform.test,op39.not-web-platform.test,op31.web-platform.test,www.not-web-platform.test,www.www.not-web-platform.test,op44.not-web-platform.test,www1.not-web-platform.test,xn--n8j6ds53lwwkrqhv28a.www1.web-platform.test,op58.not-web-platform.test,op14.not-web-platform.test,op30.not-web-platform.test,op62.not-web-platform.test,op61.not-web-platform.test,op92.not-web-platform.test,www2.xn--lve-6lad.web-platform.test,op29.not-web-platform.test,op18.web-platform.test,op73.web-platform.test,xn--n8j6ds53lwwkrqhv28a.xn--n8j6ds53lwwkrqhv28a.web-platform.test,op77.web-platform.test,op12.web-platform.test,op54.web-platform.test,op63.web-platform.test,op71.web-platform.test,www2.www1.not-web-platform.test,op95.web-platform.test,op16.web-platform.test,op36.web-platform.test,op27.web-platform.test,www.www.web-platform.test,op98.not-web-platform.test,op64.not-web-platform.test,op29.web-platform.test,op9.web-platform.test,op26.not-web-platform.test,op94.web-platform.test,xn--n8j6ds53lwwkrqhv28a.www2.not-web-platform.test,op44.web-platform.test,op94.not-web-platform.test,op33.web-platform.test,op38.not-web-platform.test,op33.not-web-platform.test,op84.web-platform.test,www1.www1.not-web-platform.test,op23.not-web-platform.test,op57.not-web-platform.test,op54.not-web-platform.test,op85.not-web-platform.test,www2.www2.web-platform.test,op46.not-web-platform.test,op97.not-web-platform.test,op32.web-platform.test,op61.web-platform.test,op70.web-platform.test,www2.web-platform.test,op32.not-web-platform.test,op60.not-web-platform.test,op4.web-platform.test,op43.web-platform.test,op7.not-web-platform.test,op78.web-platform.test,op26.web-platform.test,xn--lve-6lad.xn--n8j6ds53lwwkrqhv28a.web-platform.test,op96.not-web-platform.test,op51.not-web-platform.test,op41.not-web-platform.test,op76.web-platform.test,op52.web-platform.test,op99.web-platform.test,op35.not-web-platform.test,op99.not-web-platform.test,op86.web-platform.test,not-web-platform.test,op42.not-web-platform.test,op46.web-platform.test,op67.not-web-platform.test,op17.web-platform.test,op90.web-platform.test,op93.web-platform.test,op37.not-web-platform.test,op48.not-web-platform.test,op10.web-platform.test,op55.not-web-platform.test,op4.not-web-platform.test,www1.xn--n8j6ds53lwwkrqhv28a.not-web-platform.test,op55.web-platform.test,xn--lve-6lad.www2.web-platform.test,op47.web-platform.test,op51.web-platform.test,op45.web-platform.test,op80.web-platform.test,op68.web-platform.test,op49.web-platform.test,op57.web-platform.test,www2.xn--n8j6ds53lwwkrqhv28a.web-platform.test,www.xn--n8j6ds53lwwkrqhv28a.not-web-platform.test,op56.not-web-platform.test,op35.web-platform.test,op84.not-web-platform.test,xn--n8j6ds53lwwkrqhv28a.not-web-platform.test,xn--lve-6lad.xn--n8j6ds53lwwkrqhv28a.not-web-platform.test,op34.not-web-platform.test,op6.web-platform.test,web-platform.test,op67.web-platform.test,op69.not-web-platform.test,op11.not-web-platform.test,op93.not-web-platform.test,www1.www.web-platform.test,op86.not-web-platform.test,op8.not-web-platform.test,www2.xn--n8j6ds53lwwkrqhv28a.not-web-platform.test,op92.web-platform.test,xn--lve-6lad.www1.not-web-platform.test,op15.web-platform.test,op13.not-web-platform.test,op13.web-platform.test,xn--n8j6ds53lwwkrqhv28a.web-platform.test,xn--n8j6ds53lwwkrqhv28a.www.web-platform.test,op75.web-platform.test,op20.not-web-platform.test,op76.not-web-platform.test,op64.web-platform.test,op97.web-platform.test,op37.web-platform.test,op56.web-platform.test,op62.web-platform.test,op82.web-platform.test,op25.web-platform.test,op11.web-platform.test,www.xn--lve-6lad.not-web-platform.test,www2.www1.web-platform.test,op27.not-web-platform.test,op50.web-platform.test,op17.not-web-platform.test,op38.web-platform.test,www2.www.not-web-platform.test,xn--lve-6lad.www1.web-platform.test,op75.not-web-platform.test,op83.web-platform.test,op81.web-platform.test,op15.not-web-platform.test,xn--n8j6ds53lwwkrqhv28a.www.not-web-platform.test,op20.web-platform.test,op3.web-platform.test,www1.www2.not-web-platform.test,xn--n8j6ds53lwwkrqhv28a.xn--n8j6ds53lwwkrqhv28a.not-web-platform.test,op2.web-platform.test,op21.web-platform.test,op23.web-platform.test,op42.web-platform.test,op47.not-web-platform.test,www1.www1.web-platform.test,op18.not-web-platform.test,op22.web-platform.test,xn--lve-6lad.xn--lve-6lad.not-web-platform.test,op63.not-web-platform.test,op28.not-web-platform.test,op22.not-web-platform.test,op65.web-platform.test,www.www1.not-web-platform.test,www1.xn--lve-6lad.web-platform.test,op43.not-web-platform.test,op66.not-web-platform.test,www2.www.web-platform.test,op96.web-platform.test,op91.not-web-platform.test,www.xn--lve-6lad.web-platform.test,op1.web-platform.test,op74.not-web-platform.test,op87.web-platform.test,op59.not-web-platform.test,op19.web-platform.test,xn--n8j6ds53lwwkrqhv28a.www1.not-web-platform.test,op9.not-web-platform.test,op88.not-web-platform.test,op53.web-platform.test,www2.xn--lve-6lad.not-web-platform.test,op87.not-web-platform.test,op30.web-platform.test,op10.not-web-platform.test,op48.web-platform.test,op16.not-web-platform.test,op34.web-platform.test,op1.not-web-platform.test,www.web-platform.test,op5.web-platform.test,www1.xn--lve-6lad.not-web-platform.test,xn--lve-6lad.www2.not-web-platform.test", "media.peerconnection.mtransport_process": false}}}}}
[task 2020-04-15T00:30:10.811Z] 00:30:10 INFO - PID 2596 | 1586910610803 mozrunner::runner INFO Running command: "Z:\task_1586905291\build\application\firefox\firefox.exe" "-marionette" "-foreground" "-no-remote" "-profile" "C:\Users\task_1586905291\AppData\Local\Temp\rust_mozprofileCej2NI"
[task 2020-04-15T00:30:10.811Z] 00:30:10 INFO - PID 2596 | 1586910610804 geckodriver::marionette DEBUG Waiting 60s to connect to browser on 127.0.0.1:2828
[task 2020-04-15T00:30:11.220Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker OS.File: flush pending requests, warn about unclosed files, shut down service. for phase web-workers-shutdown
[task 2020-04-15T00:30:11.220Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker OS.File: flush I/O queued before profileBeforeChange for phase profile-before-change
[task 2020-04-15T00:30:11.225Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker AddonManager: shutting down. for phase profile-before-change
[task 2020-04-15T00:30:11.312Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker ClientManagerService: start destroying IPC actors early for phase xpcom-will-shutdown
[task 2020-04-15T00:30:11.497Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker Flush WebExtension StartupCache for phase profile-before-change
[task 2020-04-15T00:30:11.785Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker JSON store: writing data for phase profile-before-change
[task 2020-04-15T00:30:11.919Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker JSON store: writing data for phase AddonManager: Waiting for providers to shut down.
[task 2020-04-15T00:30:11.919Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker Update add-on blocklist state into add-on DB for phase AddonManager: Waiting to start provider shutdown.
[task 2020-04-15T00:30:11.919Z] 00:30:11 INFO - PID 2596 | DEBUG: Completed blocker Update add-on blocklist state into add-on DB for phase AddonManager: Waiting to start provider shutdown.
[task 2020-04-15T00:30:11.941Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker XPIProvider shutdown for phase quit-application-granted
[task 2020-04-15T00:30:11.941Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker XPIProvider for phase AddonManager: Waiting for providers to shut down.
[task 2020-04-15T00:30:11.941Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker PluginProvider for phase AddonManager: Waiting for providers to shut down.
[task 2020-04-15T00:30:11.960Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker ServiceWorkerRegistrar: Flushing data for phase profile-before-change
[task 2020-04-15T00:30:11.960Z] 00:30:11 INFO - PID 2596 | DEBUG: Adding blocker CrashMonitor: Writing notifications to file after receiving profile-before-change for phase OS.File: Waiting for clients before profileBeforeChange
[task 2020-04-15T00:30:12.087Z] 00:30:12 INFO - PID 2596 | DEBUG: Adding blocker TelemetryController: shutting down for phase profile-before-change-telemetry
[task 2020-04-15T00:30:12.117Z] 00:30:12 INFO - PID 2596 | DEBUG: Adding blocker EnvironmentAddonBuilder for phase AddonManager: Waiting to start provider shutdown.
[task 2020-04-15T00:30:12.156Z] 00:30:12 INFO - PID 2596 | DEBUG: Adding blocker RemoteSettingsClient - finish IDB access. for phase profile-before-change
[task 2020-04-15T00:30:12.156Z] 00:30:12 INFO - PID 2596 | 1586910612149 Marionette TRACE Marionette enabled
[task 2020-04-15T00:30:12.322Z] 00:30:12 INFO - PID 2596 | 1586910612319 Marionette TRACE Received observer notification toplevel-window-ready
[task 2020-04-15T00:30:12.711Z] 00:30:12 INFO - PID 2596 | DEBUG: Adding blocker MediaShutdownManager: shutdown for phase profile-before-change
[task 2020-04-15T00:30:12.957Z] 00:30:12 INFO - PID 2596 | DEBUG: Adding blocker ServiceWorkerShutdownBlocker: shutting down Service Workers for phase profile-change-teardown
[task 2020-04-15T00:30:13.059Z] 00:30:13 INFO - PID 2596 | DEBUG: Adding blocker JSON store: writing data for phase profile-before-change
[task 2020-04-15T00:30:13.102Z] 00:30:13 INFO - PID 2596 | DEBUG: Adding blocker JSON store: writing data for phase profile-before-change
[task 2020-04-15T00:30:13.102Z] 00:30:13 INFO - PID 2596 | DEBUG: Adding blocker Flush AddonRepository for phase profile-before-change
[task 2020-04-15T00:30:13.145Z] 00:30:13 INFO - PID 2596 | DEBUG: Adding blocker GMPProvider for phase AddonManager: Waiting for providers to shut down.
[task 2020-04-15T00:30:13.145Z] 00:30:13 INFO - PID 2596 | DEBUG: Adding blocker PluginFinder: finding plugins to load for phase profile-change-teardown
[task 2020-04-15T00:30:13.296Z] 00:30:13 INFO - PID 2596 | DEBUG: Adding blocker Remote Settings profile-before-change for phase profile-before-change
[task 2020-04-15T00:30:13.785Z] 00:30:13 INFO - PID 2596 | DEBUG: Adding blocker MediaShutdownManager: shutdown for phase xpcom-will-shutdown
[task 2020-04-15T00:30:14.172Z] 00:30:14 INFO - PID 2596 | DEBUG: Adding blocker Places Clients shutdown for phase profile-change-teardown
[task 2020-04-15T00:30:14.172Z] 00:30:14 INFO - PID 2596 | DEBUG: Adding blocker Places Connection shutdown for phase profile-before-change
[task 2020-04-15T00:30:14.231Z] 00:30:14 INFO - PID 2596 | DEBUG: Adding blocker PageActions: purging unregistered actions from cache for phase profile-before-change
[task 2020-04-15T00:30:14.453Z] 00:30:14 INFO - PID 2596 | DEBUG: Adding blocker Sqlite.jsm shutdown blocker for phase profile-before-change

There is a 1 minute shutdown delay after phase web-workers-shutdown:

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=297631286&repo=mozilla-central&lineNumber=78424-78425

[task 2020-04-15T00:28:28.248Z] 00:28:28 INFO - PID 1496 | DEBUG: Finished phase web-workers-shutdown
[task 2020-04-15T00:29:28.890Z] 00:29:28 INFO - PID 1496 | 1586910568880 geckodriver::marionette DEBUG Browser process stopped: exit code: 0

Andrea, any idea what this could be?

Flags: needinfo?(amarchesini)

asuth do you have time to take a look? Thanks!

Flags: needinfo?(amarchesini) → needinfo?(bugmail)

I just noticed that this failure is for a ccov build. Maybe there is some special logic, and Marco might know.

Summary: Intermittent TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/set_window_rect/user_prompts.py | expected OK → Intermittent ccov TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/set_window_rect/user_prompts.py | expected OK [hang after finished phase web-workers-shutdown]

The web workers (sub-)phase happens during RuntimeService::Shutdown being invoked by the "xpcom-shutdown" notification. This suggests something else in "xpcom-shutdown" is still doing things? Ideally a crash would be induced in the browser to see what was on the stack.

Flags: needinfo?(bugmail)

Sadly we cannot detect crashes yet for these type of tests due to bug 1490906. That means we might not have a chance to continue investigation for now.

Depends on: 1490906

Ccov is pretty slow at shutdown, as that's when it writes out the coverage counters. When we'll be able to induce a crash we can check if the process was in the process of writing coverage files.

See Also: → 1627394
Duplicate of this bug: 1637519

This is not only happening for ccov builds but also shippable opt. See the recently duped bug.

Summary: Intermittent ccov TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/set_window_rect/user_prompts.py | expected OK [hang after finished phase web-workers-shutdown] → Intermittent Wd TEST-UNEXPECTED-TIMEOUT | expected OK [hang after finished phase web-workers-shutdown]
Status: NEW → RESOLVED
Closed: 10 months ago
Resolution: --- → INCOMPLETE
Duplicate of this bug: 1666527
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Duplicate of this bug: 1698259

The shutdown hang as port of the failure logged as bug 1698259 is even longer and takes nearly 3 minutes.

Gabriele, isn't the shutdown hang monitor to kick in and crash Firefox eventually? Why doesn't that happen here? If you are the wrong person to ask please forward appropriately.

Flags: needinfo?(gsvelto)
Summary: Intermittent Wd TEST-UNEXPECTED-TIMEOUT | expected OK [hang after finished phase web-workers-shutdown] → Intermittent Wd TEST-UNEXPECTED-TIMEOUT | expected OK [shutdown hang]

It depends on the build. The normal timeout is 60 seconds but it can be bumped to 180 or even 360 depending on the build, see here. However IIRC every time we start a new shutdown phase the timer is reset to 0 and we have quite a few phases, so if they're all slow the over shutdown may last for significantly longer than the timeout w/o causing a crash.

Flags: needinfo?(gsvelto)

(In reply to Gabriele Svelto [:gsvelto] from comment #17)

However IIRC every time we start a new shutdown phase the timer is reset to 0 and we have quite a few phases, so if they're all slow the over shutdown may last for significantly longer than the timeout w/o causing a crash.

Huh, is that really the case? We are using toolkit.asyncshutdown.crash_timeout in Marionette/geckodriver to determine when to actually force-kill the application if time exceeded by extra 10s on top. Do you know who can say for sure how that timer is working?

Flags: needinfo?(gsvelto)

Looking at the code we compare gHeartbeat to the running timer. Whenever we advance to a new shutdown phase we invoke UpdateHeartbeat() that resets the heartbeat to zero.

Or at least that's how I understand this code. BTW I just noticed that this code was recently changed in bug 1689953.

Flags: needinfo?(gsvelto)

Hm ok. Thinking a bit more of it I believe we should keep that force-kill after the 60s. If there is indeed a long shutdown due to some hanging phases we would like to know about it. And killing the process would at least give us a chance to see how often it happens.

Would there be a way to kill the process and get a proper crash stack with the hanging phase? If yes, that would be perfect.

Maybe we need mozcrash.kill_and_get_minidump() under such a condition and not just call mozrunner.stop().

That would do the trick.

Thanks! I filed bug 1698937 to get this implemented.

You need to log in before you can comment on or make changes to this bug.