Open Bug 1229549 Opened 4 years ago Updated 4 years ago
a try run of a change that crashes on startup doesn't produce useful diagnostics
So I just did a try run of a change that makes Firefox crash during startup. (I did this intentionally, because I wanted to see the crash report for *how* it crashed for a particular change, on the machines in automation.) In particular, the change I pushed was adding:

  const_cast<uint8_t*>(mFd->mFileData)[offset] = 0;

right before the return at the end of nsZipArchive::GetData in modules/libjar/nsZipArchive.cpp.

When doing this, I would expect that the diagnostics for the failed test runs should indicate that we crashed, and how. I would expect a "PROCESS-CRASH" with a stack signature to show up in every single build's details, visible in treeherder.

However, of our many test harness results:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a7878dcf8553
only a single one (reftest, including crashtest and jsreftest) produces the correct result:

  TEST-UNEXPECTED-FAIL | reftest | application terminated with exit code 1
  PROCESS-CRASH | reftest | application crashed [@ nsZipArchive::GetData(nsZipItem *)]

And all the rest fail in different ways -- many of which are ways that are common intermittent failures that we typically can't do anything about because there aren't any useful diagnostics.

Mochitests report:
  TEST-UNEXPECTED-FAIL | runtests.py | Timed out while waiting for server startup.

Web platform tests report:
  Test runner failed to initialise correctly; shutting down

Cpp unit tests report:
  TEST-UNEXPECTED-TIMEOUT | TestAudioEventTimeline.exe | timed out after 900 seconds

xpcshell tests report:
  TEST-UNEXPECTED-FAIL | dom/base/test/unit/test_error_codes.js | xpcshell return code: 1

The builder even reported a cryptic failure and turned purple, despite submitting a build to be tested:
  command timed out: 10800 seconds without output running ['c:/mozilla-build/python27/python', '-u', 'scripts/scripts/fx_desktop_build.py', '--config', 'builds/releng_base_windows_32_builds.py', '--config', 'balrog/production.py', '--branch', 'try', '--build-pool', 'production'], attempting to kill

Marionette reports:
  AssertionError: Timed out waiting for port!

Talos reports:
  TalosError: browser failed to close after being initialized

b-m (VideoPuppeteer, whatever that is), reports:
  1:00.71 LOG: MainThread ERROR Failure during execution of the playback test.
  AssertionError: Timed out waiting for port!

So it seems like testing that things like a simple startup crash give a correct diagnostic on automation is something that needs to be done, and that these cases need to be fixed. This bug probably needs dependent bugs filed on fixing the actual problems.
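One way to test the expectation above ("every harness should report a PROCESS-CRASH with a signature") might be a small log check along these lines. This is only a sketch: the regular expression is modelled on the single reftest line quoted in this bug, and the name find_crash_signatures is made up for illustration, not something that exists in the tree.

```python
import re

# Modelled on the one correct line seen in this bug:
#   PROCESS-CRASH | reftest | application crashed [@ nsZipArchive::GetData(nsZipItem *)]
# Assumption: other harnesses would use the same "application crashed [@ ...]" shape.
CRASH_LINE = re.compile(
    r"PROCESS-CRASH \| (?P<harness>[^|]+?) \| application crashed \[@ (?P<signature>.+)\]"
)


def find_crash_signatures(log_text):
    """Return a list of (harness, signature) pairs, one per PROCESS-CRASH line."""
    return [
        (match.group("harness"), match.group("signature"))
        for match in CRASH_LINE.finditer(log_text)
    ]
```

A job log that yields an empty list after a deliberate startup crash would be one of the cases that needs a dependent bug.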
For the mochitests, I suspect that xpcshell is crashing too:

  11:55:26 INFO - MochitestServer : launching [u'C:\\slave\\test\\build\\tests\\bin\\xpcshell.exe', '-g', 'C:\\slave\\test\\build\\application\\firefox', '-v', '170', '-f', 'C:\\slave\\test\\build\\tests\\bin\\components\\httpd.js', '-e', "const _PROFILE_PATH = 'c:\\\\users\\\\cltbld\\\\appdata\\\\local\\\\temp\\\\tmpj0bw6x.mozrunner'; const _SERVER_PORT = '8888'; const _SERVER_ADDR = '127.0.0.1'; const _TEST_PREFIX = undefined; const _DISPLAY_RESULTS = false;", '-f', 'C:\\slave\\test\\build\\tests\\mochitest\\server.js']
  11:55:26 INFO - runtests.py | Server pid: 1200
  11:55:26 INFO - runtests.py | Websocket server pid: 3312
  11:55:26 INFO - runtests.py | SSL tunnel pid: 1960
  11:56:56 WARNING - TEST-UNEXPECTED-FAIL | runtests.py | Timed out while waiting for server startup.

You introduced a browser crash, but you also introduced a web-server crash, and because xpcshell won't start, the harness never gets around to running the browser. It would be nice if server crashes had more diagnostics (a full crash report?).
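As a rough illustration of what "more diagnostics" could mean here: instead of only waiting out the timeout, the harness could check whether the server process has already exited. The sketch below is not the runtests.py code; wait_for_server and is_ready are hypothetical names, and the 90 second default is only inferred from the timestamps in the log above (11:55:26 to 11:56:56).

```python
import subprocess
import sys
import time


def wait_for_server(proc, is_ready, timeout=90.0, poll_interval=0.1):
    """Wait for a server process to become ready.

    proc is a subprocess.Popen object; is_ready is a caller-supplied callable
    (hypothetical) that returns True once the server answers requests.
    Returns "ready", "timed out", or "exited with code N", so that a dead
    server is reported as such rather than as a generic timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        code = proc.poll()  # None while the process is still running
        if code is not None:
            return "exited with code %d" % code
        if is_ready():
            return "ready"
        time.sleep(poll_interval)
    return "timed out"
```

For example, a child started with subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(1)"]) would be reported as having exited with code 1 well before the timeout, which is the point at which the harness could go looking for a minidump.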
Thanks for doing this, this is good information. Good point about xpcshell, Geoff. We should be able to get crash stacks out of there (we do out of xpcshell tests, so why not here?). Maybe we just need to pass MOZ_CRASHREPORTER=1 and MOZ_CRASHREPORTER_NO_REPORT=1 into its environment.
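A minimal sketch of that idea, assuming the server is launched from Python with an explicit environment. The helper name crash_reporter_env is made up for illustration; only the two variable names come from this bug.

```python
import os


def crash_reporter_env(base_env=None):
    """Return a copy of the environment with the crash reporter enabled.

    base_env defaults to the current process environment. The input mapping
    is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_CRASHREPORTER"] = "1"
    env["MOZ_CRASHREPORTER_NO_REPORT"] = "1"
    return env
```

The result could then be passed as the env argument of subprocess.Popen when starting xpcshell for the mochitest server.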
Yeah, we can get crash reports out of xpcshell no problem:
http://hg.mozilla.org/mozilla-central/annotate/f6ac392322b3/testing/xpcshell/runxpcshelltests.py#l899
is how the xpcshell test harness does it.

Note that the xpcshell harness runs some JS to set the minidump path to its temp dir:
http://hg.mozilla.org/mozilla-central/annotate/f6ac392322b3/testing/xpcshell/head.js#l108

A startup crash that happens early enough could fail to run this, meaning the minidump would wind up in the temp dir (this is where the crash reporter writes minidumps before we have a profile). We don't actually handle that properly in the xpcshell harness right now.
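Handling that case might look roughly like the following: after a crash, look in the harness's own minidump directory first, then fall back to the temp dir. This is a sketch, not the harness code. find_minidumps is a hypothetical name, and using tempfile.gettempdir() as the fallback location is an assumption about where the crash reporter writes before a profile exists.

```python
import os
import tempfile


def find_minidumps(harness_dump_dir, fallback_dir=None):
    """Return paths of *.dmp files, harness directory first, then the fallback.

    fallback_dir defaults to the system temp dir (an assumption, see above).
    Directories that do not exist are skipped, and a path is listed only once
    even if both arguments name the same directory.
    """
    if fallback_dir is None:
        fallback_dir = tempfile.gettempdir()
    found = []
    for directory in (harness_dump_dir, fallback_dir):
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if name.endswith(".dmp") and path not in found:
                found.append(path)
    return found
```

A real fix would presumably also need to tell dumps from this run apart from stale ones left in the shared temp dir, which this sketch does not attempt.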