Closed Bug 1391766 Opened 7 years ago Closed 7 years ago

Intermittent test_profile_management.py TestProfileManagement.test_preferences_are_set | application crashed [None]

Categories

(Testing :: Marionette Client and Harness, defect, P5)

Version 3
defect

Tracking

(Not tracked)

VERIFIED INCOMPLETE

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: bulk-close-intermittents, intermittent-failure)

The minidump file of the crashing content process was not readable:

16:13:17     INFO -  Crash dump filename: c:\users\genericworker\appdata\local\temp\tmp82aurd.mozrunner\minidumps\d5392090-e8ec-4672-a929-8442364e7e05.dmp
16:13:17     INFO -  stderr from minidump_stackwalk:
16:13:17     INFO -  2017-08-18 16:13:17: minidump.cc:4359: INFO: Minidump opened minidump c:\users\genericworker\appdata\local\temp\tmp82aurd.mozrunner\minidumps\d5392090-e8ec-4672-a929-8442364e7e05.dmp
16:13:17     INFO -  2017-08-18 16:13:17: minidump.cc:4808: ERROR: ReadBytes: read 0/32
16:13:17     INFO -  2017-08-18 16:13:17: minidump.cc:4453: ERROR: Minidump cannot read header
16:13:17     INFO -  2017-08-18 16:13:17: stackwalk.cc:133: ERROR: Minidump c:\users\genericworker\appdata\local\temp\tmp82aurd.mozrunner\minidumps\d5392090-e8ec-4672-a929-8442364e7e05.dmp could not be read
16:13:17     INFO -  2017-08-18 16:13:17: minidump.cc:4331: INFO: Minidump closing minidump

The content of the file can be found here:
https://queue.taskcluster.net/v1/task/eB0alvlnSkC7s5jaFE9rCQ/runs/0/artifacts/public/test_info/d5392090-e8ec-4672-a929-8442364e7e05.dmp
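
For reference, the "ReadBytes: read 0/32" line above is what a zero-byte file would produce: the minidump header (MINIDUMP_HEADER) is 32 bytes long and starts with the signature "MDMP". Below is a minimal sketch of a pre-check that could tell an empty or truncated dump apart from a readable one before handing it to minidump_stackwalk. The function names and return labels are hypothetical and not part of mozcrash or the Marionette harness.

```python
import struct

# "MDMP" interpreted as a little-endian uint32, per the MINIDUMP_HEADER layout.
MINIDUMP_SIGNATURE = 0x504D444D
# MINIDUMP_HEADER is 32 bytes, which matches the "read 0/32" in the log.
MINIDUMP_HEADER_SIZE = 32


def classify_minidump(data: bytes) -> str:
    """Classify the leading bytes of a dump file (hypothetical helper)."""
    if len(data) == 0:
        return "empty"
    if len(data) < MINIDUMP_HEADER_SIZE:
        return "truncated"
    (signature,) = struct.unpack_from("<I", data, 0)
    if signature != MINIDUMP_SIGNATURE:
        return "bad-signature"
    return "ok"


def classify_minidump_file(path: str) -> str:
    """Read only the header-sized prefix of the file and classify it."""
    with open(path, "rb") as f:
        return classify_minidump(f.read(MINIDUMP_HEADER_SIZE))
```

A file classified as "empty" here would point at the writer failing, not at minidump_stackwalk being unable to parse a valid dump.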

Gabriele, can you or someone else check why the above is failing? I'm not sure who currently works on the crash reporter itself and has the time to improve it. Thanks.
Flags: needinfo?(gsvelto)
From the looks of it there is a crash in breakpad itself:

https://treeherder.mozilla.org/logviewer.html#?job_id=124180395&repo=autoland&lineNumber=35323

... and we have a few of those already:

https://bugzilla.mozilla.org/buglist.cgi?quicksearch=writeminidump

I'll try to look at the dumps themselves and see if I can figure out what's going wrong. Ted, does this ring a bell? I've already seen a signature like this a few times, so it might be a general problem.
Flags: needinfo?(gsvelto) → needinfo?(ted)
I see those writeminidump crash stacks once in a while, and I have also filed bugs about bogus minidumps. But so far nobody has looked at them. So maybe this is a good opportunity to get one of breakpad's issues fixed.
Agreed, I'll have a look ASAP.
So yes, it looks like we're crashing while trying to write a minidump for the child process (which is what CreateMinidumpsAndPair is for). The top of the stack is here:
https://hg.mozilla.org/integration/autoland/file/32fe50044beace1245a6eddaf41126a6453c0e7f/toolkit/crashreporter/breakpad-client/windows/handler/exception_handler.cc#l764

and the second frame is here:
https://hg.mozilla.org/integration/autoland/file/32fe50044beace1245a6eddaf41126a6453c0e7f/toolkit/crashreporter/nsExceptionHandler.cpp#l4126

The second PROCESS-CRASH in the log, which shows only errors from minidump_stackwalk, is for the empty dump: this code created the file but then failed to actually write anything to it.
Interestingly, the disassembly for this dump shows:
5c3859d3 8d8d30ffffff    lea     ecx,[ebp-0D0h]
5c3859d9 e844f5ffff      call    xul!google_breakpad::ExceptionHandler::ExceptionHandler (5c384f22)
5c3859de 8d8d30ffffff    lea     ecx,[ebp-0D0h]
5c3859e4 e84affffff      call    xul!google_breakpad::ExceptionHandler::WriteMinidump (5c385933)

It's calling the ExceptionHandler constructor, then trying to call WriteMinidump, but ecx is a null pointer for the second call. I checked, and the stack memory at [ebp-0D0h] is a null pointer. I have no idea how that actually goes wrong.
Flags: needinfo?(ted)
I was wrong in comment 6: this is the chrome process trying to write a dump for itself and failing.
...dmajor pointed out that that's what our *normal* "write a dump of the current process" dumps look like, so this isn't actually a crash, it's just the stack in the chrome process from when it decided to write a dump of the child process.

The real issue here is that we apparently failed to write a dump for the child process somehow.
Ted or Gabriele, should we file a new bug (and if so, in which component) to cover the work needed in breakpad? There are actually a couple of tests depending on a fix here.
Flags: needinfo?(ted)
Flags: needinfo?(gsvelto)
Yes, you can file one in Toolkit: Crash Reporting.
Flags: needinfo?(ted)
Flags: needinfo?(gsvelto)
Actually, there is bug 1314885, which already covers that situation. I found it thanks to the list of suggested bugs shown when filing a new bug.
Depends on: 1314885
Status: RESOLVED → VERIFIED
Product: Testing → Remote Protocol
Moving bug to Testing::Marionette Client and Harness component per bug 1815831.
Component: Marionette → Marionette Client and Harness
Product: Remote Protocol → Testing