Mochitest shows green on try even with a crash in content process
Categories
(Testing :: Mochitest, defect, P2)
Tracking
(firefox69 fixed)
Tracking | Status | |
---|---|---|
firefox69 | --- | fixed |
People
(Reporter: mjf, Assigned: ahal)
References
(Depends on 1 open bug)
Details
Attachments
(3 files)
I was investigating a crash in the RDD process while decoding AV1 with the DAV1D decoder. On a Windows ASan build, the crash turned the test orange. After a lot of debugging into why it was failing in RDD, I tried decoding in the content process instead, but the test stayed green.
Then I looked at the logs and found that the decoder was crashing the content process, but not turning the test orange. Try run here[1]. Log for content crash here[2].
[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=a3ec9590a68febee64fb38ad4ed4f86cad1ede88&selectedJob=236277480
[2] https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=236277480&repo=try&lineNumber=2551
Comment 1•6 years ago
:ahal, any thoughts on this? Maybe something to pick up next month? I marked this as a P3; feel free to adjust!
Assignee
Comment 2•6 years ago
Crashes not turning jobs orange is very serious, and other recently filed bugs involve content crashes not producing stack traces (e.g. bug 1545856). So there might be something deeper afoot.
Unfortunately the logs here have expired. Michael, was this reproducible and do you think you could craft a STR?
Reporter
Comment 3•6 years ago
The bug that was causing this crash has been fixed, but prior to the fix it was 100% reproducible. I will talk with :achronop to see if we can provide a patch that will cause the crash again.
The bug that fixed the crash was Bug 1535631.
Reporter
Comment 4•6 years ago
Andrew,
I re-pushed to try with something close to my original patch applied (decoding av1 on the content process instead of the RDD process), and it reproduced the crash, but the test stays green. I will bundle the patches I applied and attach them here later this evening. Note that these patches must be built and run on an older base of mozilla-central (5edbe9b1b822), from before the DAV1D decoder was fixed.
Here is the try run[1]. Here is pointer into the log[2].
[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=c355dadb8ce58fd0424a5531b57a12e478dabe80&selectedJob=246689416
[2] https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=246689416&repo=try&lineNumber=1460
Reporter
Comment 5•6 years ago
Here is the patch to show a DAV1D decoder crash on the content process that does not turn the mochitest orange.
Apply onto 5edbe9b1b822.
Assignee
Comment 6•6 years ago
Thanks! This looks pretty similar to bug 1545856. In there Gabriel's theory was that the crash happened before the crashreporter was able to set itself up. We can test this theory here by adding a sleep to the test harness and seeing if that causes us to get a traceback.
I think there are two issues that need to be fixed here:
1. Make sure that content crashes turn the task orange even if there is no stack trace
2. Make sure there are stack traces
Problem 1 is unique to this bug (in the other one, reftest happens to error out in a different way).
Reporter
Comment 7•6 years ago
If you want to run a test locally with my patch applied, you can do:
./mach mochitest dom/media/test/test_playback.html
Assignee
Comment 8•6 years ago
Thanks, I can reproduce on try. This will be very helpful. Here are my STR:
$ hg up 5edbe9b1b822
$ curl "https://bug1539449.bmoattachments.org/attachment.cgi?id=9065272" | hg import -
$ ./mach try fuzzy -q "win64asan mochimedia !spi"
Reporter
Updated•6 years ago
Assignee
Comment 9•6 years ago
There's a failure case where content processes crash but no stack trace is printed. Test harnesses often rely on the presence of a minidump to determine whether there was a crash, so in this failure mode they report success and tasks stay green.
This adds a new string to mozharness' error logs to make sure the task turns orange. Note: it does not fix the lack of stack traces.
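The approach can be illustrated with a minimal sketch. It is not the actual patch: it assumes a mozharness-style error list in which each entry pairs a substring with a severity level. The names `ERROR_LIST` and `worst_level`, the level constants, and the sample log lines are all hypothetical; only the crash message text is taken from this bug.

```python
# Hypothetical severity levels, ordered from least to most severe.
INFO, WARNING, ERROR = 0, 1, 2

# Hypothetical error list: each entry pairs a substring with a severity.
ERROR_LIST = [
    {"substr": "A content process has crashed", "level": ERROR},
]


def worst_level(log_lines, error_list=ERROR_LIST):
    """Return the highest severity matched by any line of the log."""
    worst = INFO
    for line in log_lines:
        for entry in error_list:
            if entry["substr"] in line:
                worst = max(worst, entry["level"])
    return worst


# Made-up log lines: the test itself reports a pass, but the crash
# message still raises the job's status.
log = [
    "TEST-PASS | dom/media/test/test_playback.html",
    "A content process has crashed",
]
print("orange" if worst_level(log) >= ERROR else "green")
```

Because the check works on the log text alone, it does not depend on a minidump or stack trace being present.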
Assignee
Comment 10•6 years ago
Not sure I'll be able to fix the missing stack traces, but this patch at least solves problem 1 (which is the more serious of the two). We can't land it yet, though, because it might reveal pre-existing content process crashes that weren't previously detected.
If any of these exist, we'll need to figure out a landing strategy to get this into the tree ASAP without being backed out.
Assignee
Comment 11•6 years ago
This patch breaks the Mn tasks because they have a unit test that purposefully triggers a content process crash (and therefore "A content process has crashed" gets dumped to the log and parsed by mozharness). I'll need to either hide this test output from the log or else skip checking for this substr in Mn.
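The second option, skipping the check for one suite, might look roughly like the sketch below. This is an illustration under assumptions, not the landed change: the suite name "marionette" (assumed to correspond to the Mn tasks), the set `SUITES_WITH_EXPECTED_CRASH`, and both helper functions are hypothetical.

```python
CONTENT_CRASH = "A content process has crashed"

# Hypothetical set of suites whose tests intentionally crash a content process.
SUITES_WITH_EXPECTED_CRASH = {"marionette"}


def error_substrings(suite):
    """Return the substrings that should fail a job for the given suite."""
    substrings = [CONTENT_CRASH]
    if suite in SUITES_WITH_EXPECTED_CRASH:
        # The crash message is expected output here, so do not treat it as an error.
        substrings.remove(CONTENT_CRASH)
    return substrings


def job_failed(suite, log_lines):
    """Return True if any log line contains a failing substring for this suite."""
    checks = error_substrings(suite)
    return any(s in line for line in log_lines for s in checks)
```

The trade-off is that a genuine, unintended content crash in an exempted suite would no longer be caught by this check, which is why hiding only the one test's output is the other option considered.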
Assignee
Comment 12•6 years ago
Depends on D31635
Comment 13•6 years ago
Comment 14•6 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/f9397b497e49
https://hg.mozilla.org/mozilla-central/rev/cfc637a0af12