Closed Bug 1440719 Opened 7 years ago Closed 7 years ago

"application timed out after 370 seconds with no output" sometimes caused by "Gah. Your tab just crashed." (MOZ_CRASHREPORTER_SHUTDOWN not set)

Categories

(Testing :: General, enhancement, P2)

Version 3
enhancement

Tracking

(firefox61 fixed)

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: gbrown, Assigned: gbrown)

References

Details

Attachments

(1 file)

Bug 1290856 and others collect test failure reports for "application timed out after 370 seconds with no output" -- when the harness notices that a test has apparently hung. Many current failures in bug 1290856 have a screenshot showing a tab crash "Gah. Your tab just crashed." Tests hang because the harness is waiting for that tab crash dialog. :whimboo points out that we can likely handle this much better by setting MOZ_CRASHREPORTER_SHUTDOWN: https://bugzilla.mozilla.org/show_bug.cgi?id=1339568#c71
See Also: → 1388249
Please note that mozcrash should usually also pick up the minidumps from those content crashes.
Setting MOZ_CRASHREPORTER_SHUTDOWN produces good crash reports and avoids the timeouts in linux mochitest-media (bug 1290856 and others): https://treeherder.mozilla.org/#/jobs?repo=try&revision=e776f29b7fda88dd744b54df192e6b8a3a01d749
See Also: → 1378743
Each of those cases has actually two assertions logged before that leak warning/error. So it might have revealed an underlying problem.
Priority: -- → P2
There are mochitest-browser-chrome tests that actually test content process crashes, so I don't know that we can use it there. I'd say enable it for other suites and we can file a followup to do something smarter for browser-chrome (like create an API for those tests to disable the behavior for the duration of the test).
(In reply to Henrik Skupin (:whimboo) from comment #1) > Please note that mozcrash should usually also pick up the minidumps from > those content crashes. This does seem to be working as designed. Looking at a random log from the latest orangefactor link in that bug: https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1290856&startday=2018-02-26&endday=2018-03-04&tree=all shows: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=164256106&lineNumber=5262 [task 2018-02-26T02:25:32.321Z] 02:25:32 INFO - PROCESS-CRASH | Main app process exited normally | application crashed [@ mozilla::NrUdpSocketIpc::create]
Blocks: 1443327
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #5) > I'd say enable it for > other suites and we can file a followup to do something smarter for > browser-chrome That seems very sensible -- thanks! Follow-up bug 1443327 filed and referenced by this patch. Seems okay on try now: https://treeherder.mozilla.org/#/jobs?repo=try&revision=561f1ec2d0914f8b9ae32a14c5f613cef8fe99d8
Attachment #8956250 - Flags: review?(ted)
Attachment #8956250 - Flags: review?(ted) → review+
Pushed by gbrown@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/1383d36fdabf Set MOZ_CRASHREPORTER_SHUTDOWN during tests, other than browser-chrome; r=ted
Summary: "application timed out after 370 seconds with no output" sometimes caused by "Gah. Your tab just crashed." → "application timed out after 370 seconds with no output" sometimes caused by "Gah. Your tab just crashed." (MOZ_CRASHREPORTER_SHUTDOWN not set)
Thanks. The failure is a plain mochitest, dom/ipc/tests/test_CrashService_crash.html and the log contains: A content process crashed and MOZ_CRASHREPORTER_SHUTDOWN is set, shutting down I guess that makes sense. :(
Flags: needinfo?(gbrown)
Excluding both plain and browser-chrome mochitests from MOZ_CRASHREPORTER_SHUTDOWN seems a bit much. Perhaps: - move this test to browser-chrome? - move this test and browser-chrome crash-service tests to a separate subsuite?
With this patch, I get an error every time I try to run mochitests locally: The details of the failure are as follows: KeyError: 'MOZ_CRASHREPORTER_SHUTDOWN' File "/home/kris/code/mozilla-central/testing/mochitest/mach_commands.py", line 422, in run_mochitest_general **harness_args) File "/home/kris/code/mozilla-central/testing/mochitest/mach_commands.py", line 152, in run_desktop_test result = mochitest.run_test_harness(parser, options) File "/home/kris/code/mozilla-central/obj-package/_tests/testing/mochitest/runtests.py", line 3074, in run_test_harness result = runner.runTests(options) File "/home/kris/code/mozilla-central/obj-package/_tests/testing/mochitest/runtests.py", line 2605, in runTests res = self.runMochitests(options, tests_in_manifest) File "/home/kris/code/mozilla-central/obj-package/_tests/testing/mochitest/runtests.py", line 2400, in runMochitests result = self.doTests(options, testsToRun) File "/home/kris/code/mozilla-central/obj-package/_tests/testing/mochitest/runtests.py", line 2705, in doTests debuggerInfo is not None) File "/home/kris/code/mozilla-central/obj-package/_tests/testing/mochitest/runtests.py", line 1655, in buildBrowserEnv del browserEnv["MOZ_CRASHREPORTER_SHUTDOWN"] Command exited with status 1
Thanks :kmag. I'll be sure to fix that before pushing again.
We should move that test to browser-chrome, preferably. Plain Mochitests shouldn't be trying to test things like that. It looks like we'll also have problems with at least a few mochitest-chrome tests: https://dxr.mozilla.org/mozilla-central/rev/bccdc684210431c233622650a91454c09f6af9eb/dom/plugins/test/mochitest/chrome.ini
Honestly I think this is important enough to fix that I would sign off on disabling the tests that break with this enabled for now, re-landing this patch, and moving the tests to browser-chrome or whatever afterwards (as long as someone signs up to do the work and we don't wind up with the tests permanently disabled).
The only plain-mochitest that fails with MOZ_CRASHREPORTER_SHUTDOWN is dom/ipc/tests/test_CrashService_crash.html. This test only runs on non-e10s. These days we only run non-e10s plain-mochitest on linux32/debug. I cannot move the test to browser-chrome (where we have other crash service tests) because it wouldn't run at all: we do not run non-e10s browser-chrome anywhere. :adw -- It has been a few years, but it looks like you wrote test_CrashService_crash.html. :jimm -- Checking in with you as you are the triage owner for DOM Content Processes. :jmaher -- fyi, in case you have an interest or a bright idea. Can anyone suggest what we should do with this test? Since it is non-e10s only, I find myself wondering if test_CrashService_crash.html could be retired (disabled or deleted). If it still needs to run, on non-e10s, I think it needs to be moved to a new mochitest-plain subsuite, run without MOZ_CRASHREPORTER_SHUTDOWN, only on non-e10s; I'm pretty sure it would be the only such test in the new subsuite.
Flags: needinfo?(jmathies)
Flags: needinfo?(jmaher)
Flags: needinfo?(adw)
Shouldn't MOZ_CRASHREPORTER_SHUTDOWN only have an effect for content processes? We use it only here: https://dxr.mozilla.org/mozilla-central/source/browser/modules/ContentCrashHandlers.jsm#152 So how could a non-e10s test job fail when no content process is around? Or do I miss something?
Wait, surely you're reading that wrong and it's disabled for *non*-e10s, right? https://hg.mozilla.org/mozilla-central/annotate/d6957f004e9cc3d7408ac3a8f2b49ff97556e27f/dom/ipc/tests/mochitest.ini#l18 That would make way more sense, given that it's testing content crashes. Also, this should definitely be a chrome or browser-chrome test, it's doing all sorts of chrome things.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #18) > Wait, surely you're reading that wrong and it's disabled for *non*-e10s, > right? > https://hg.mozilla.org/mozilla-central/annotate/ > d6957f004e9cc3d7408ac3a8f2b49ff97556e27f/dom/ipc/tests/mochitest.ini#l18 It's not easy to parse: skip-if: !(crashreporter && !e10s && (toolkit == 'gtk3' || toolkit == 'cocoa' || toolkit == 'windows')) but I read that as: run-if: crashreporter && non-e10s && any-desktop-platform I checked a sampling of test logs and found it is skipped on e10s test tasks. For example: https://taskcluster-artifacts.net/JsK3lE2pQ4uAoj8DMVVxig/0/public/logs/live_backing.log 03:46:27 INFO - TEST-START | dom/ipc/tests/test_CrashService_crash.html 03:46:27 INFO - TEST-SKIP | dom/ipc/tests/test_CrashService_crash.html | took 0ms Also, the failure that caused the backout - comment 9 - was linux32/debug non-e10s mochitest-5...and I think that was the only (non-intermittent) failure on that push.
You are correct and that conditional is extremely hard to parse. It looks like this became confusing when we removed `run-if`--this was originally a `run-if` condition that got rewritten to `skip-if`: https://hg.mozilla.org/mozilla-central/diff/451dda813637/dom/ipc/tests/mochitest.ini I guess that makes sense--it's a plain mochitest that explicitly spawns a content process, so that wouldn't work in e10s where the test itself would be running in a content process. It should be possible to rewrite this as a browser-chrome test though, since we have helpers to run things in tabs etc there, like: https://dxr.mozilla.org/mozilla-central/rev/d6957f004e9cc3d7408ac3a8f2b49ff97556e27f/browser/components/sessionstore/test/browser_crashedTabs.js#142 Running it in e10s mode in browser-chrome is fine, since it actually does want a content process.
I could get on board with converting this to browser-chrome.
Flags: needinfo?(jmaher)
(In reply to Geoff Brown [:gbrown] from comment #16) > :adw -- It has been a few years, but it looks like you wrote Yes, IIRC Ted is right: (In reply to Ted Mielczarek [:ted.mielczarek] from comment #20) > I guess that makes sense--it's a plain mochitest that explicitly spawns a > content process, so that wouldn't work in e10s where the test itself would > be running in a content process.
Flags: needinfo?(adw)
I'm going to push ahead with the non-problematic cases: everything except browser-chrome and non-e10s. That's good enough for now. https://treeherder.mozilla.org/#/jobs?repo=try&revision=17bfa3ab85b2c6e9df5fa9280f0cfa5c1883eace
Flags: needinfo?(jmathies)
Pushed by gbrown@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/1681780e0715 Set MOZ_CRASHREPORTER_SHUTDOWN during tests, other than browser-chrome and non-e10s; r=ted
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: