Make the WER runtime exception module inform the main browser about the minidumps it has generated
Categories
(Toolkit :: Crash Reporting, task)
Tracking
Release | Tracking | Status
---|---|---
firefox91 | --- | fixed
People
(Reporter: gsvelto, Assigned: gsvelto)
Attachments
(1 file)
When the exception module intercepts a child process crash it has to inform the main process about it and hand over the minidump it has generated. For child process crashes caught by the exception handler, the minidump is first stored in a pid-to-minidump mapping table once Breakpad is done writing it out; the main process then takes the minidump using TakeMinidumpForChild(). The flow with an externally generated minidump will have to be different.
One important thing to keep in mind is to avoid races between whatever process/thread is generating the minidump and the main process's main thread, which takes ownership of it. The main process never tries to grab a minidump until the child process has fully terminated, so if the minidump-writing code sets everything up before letting the child process die, all should be well (though multiple processes dying at the same time might make things a wee bit more complicated).
Another thing worth taking into account is that we write out some annotations in the Breakpad exception handler. Those will have to be handled by the module if possible.
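The existing hand-over described above can be pictured with a small sketch. This is only an illustration under assumptions: `MinidumpTable`, `Store` and `Take` are hypothetical names, and the real signature of TakeMinidumpForChild() is not shown in this bug. The point is that taking an entry removes it from the table under a lock, so ownership moves to the main thread exactly once.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the pid-to-minidump table. Names and types are
// assumptions made for illustration, not the Gecko implementation.
class MinidumpTable {
 public:
  // Called by the code that finished writing the minidump for `pid`,
  // before the child process is allowed to die.
  void Store(unsigned long pid, std::string path) {
    assert(!path.empty());
    std::lock_guard<std::mutex> lock(mMutex);
    mEntries[pid] = std::move(path);
  }

  // Called by the main thread after the child has terminated. On success the
  // entry is removed, so the caller becomes the only owner of the path.
  bool Take(unsigned long pid, std::string* outPath) {
    std::lock_guard<std::mutex> lock(mMutex);
    auto it = mEntries.find(pid);
    if (it == mEntries.end()) {
      return false;
    }
    *outPath = std::move(it->second);
    mEntries.erase(it);
    return true;
  }

 private:
  std::mutex mMutex;
  std::map<unsigned long, std::string> mEntries;
};
```

As long as `Store` runs before the child is allowed to exit, and `Take` only runs after the child is known to be dead, the two calls cannot race on the same entry.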
Comment 1•3 years ago (Assignee)
I've been working on this for at least three weeks already :-/
Comment 2•3 years ago (Assignee)
Finally making some real progress here. I've managed to pass data from the main process to child processes and then to WER; now I only need to fill it in after writing a minidump and send it back to the main process. I wish there was an easier way but ATM there isn't.
Comment 3•3 years ago (Assignee)
This also notifies the main process after the minidump has been generated.
I refactored the code a bit, so the patch is probably larger than it should be,
but the code should be a bit more readable overall.
With this change the minidump generation flow works like this:
- When the callback gets invoked in the WER process we read the structure that
  is stored in every process to figure out if it's the main process or a child
  one. This is done by reading said process' memory; the pointer was
  passed to the runtime exception module when it was registered.
- If the main process crashed everything works like it used to.
- If it was a child process then we first capture a minidump of it.
- Then we read the structure representing it in the main process:
  WindowsErrorReportingData. The address of this structure was passed on the
  child process' command line so we need to parse that first, then we read it
  from the main process memory.
- We fill the structure and write it back into the main process memory.
- At this point, if everything went fine, we create a new thread in the main
  process just to execute the WerNotifyProc function that will inform the main
  process of the presence of the new minidump.
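The command-line parsing step in the list above could look roughly like the following sketch. The argument format is an assumption: the patch is not quoted here, so this simply treats the address as an unsigned integer in decimal or `0x` hexadecimal form. `ParseAddressArg` is a hypothetical helper, not a function from the patch. The validation matters because the resulting value is later used as an address in another process.

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper: parses an address that was serialized as an unsigned
// integer (decimal or 0x-prefixed hex) in a command-line argument.
// Returns false on malformed input, on overflow, or on a null address.
bool ParseAddressArg(const std::string& arg, uintptr_t* outAddress) {
  assert(outAddress != nullptr);
  // Reject empty strings, signs and leading whitespace up front, since
  // strtoull would otherwise accept inputs such as "-1".
  if (arg.empty() || !std::isdigit(static_cast<unsigned char>(arg[0]))) {
    return false;
  }
  errno = 0;
  char* end = nullptr;
  unsigned long long value = std::strtoull(arg.c_str(), &end, 0);
  if (errno != 0 || end == arg.c_str() || *end != '\0') {
    return false;  // Out of range, nothing parsed, or trailing garbage.
  }
  if (value == 0 || value > UINTPTR_MAX) {
    return false;  // Null or not representable as a pointer-sized value.
  }
  *outAddress = static_cast<uintptr_t>(value);
  return true;
}
```

Once an address has been recovered, reading and writing the structure across processes would be done with the Win32 `ReadProcessMemory` and `WriteProcessMemory` calls, which this portable sketch leaves out.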
There's one important tidbit that's worth keeping in mind: the synchronization
between the main process and the WER process is implicit. The
WindowsErrorReportingData structure in the main process is kept alive until the
child process dies; the main process will destroy it only after that point. As
long as we're in the runtime exception module the crashed process is kept alive,
so this will prevent the main process from touching that structure.
We explicitly terminate the crashed process after we're done with the
structure so nothing bad can happen... unless someone makes a change to
Gecko that breaks the previous assumption.
Another important thing to keep in mind: we wait for the newly created thread
to inform the main process, but only for 5 seconds. We don't want to wait
indefinitely because the function that we're calling takes a lock, and if
it blocks for some reason WER will get stuck waiting for it, so it will never
kill the crashed process, which in turn will prevent the main process from
moving ahead. In principle this should never happen but better safe than
sorry.
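The bounded wait described above can be illustrated with a portable analogy. This is not the code from the patch: on Windows the module would presumably wait on the remote thread handle with a timeout (for example `WaitForSingleObject` with 5000 ms), whereas the sketch below uses a detached `std::thread` and a `std::future`. `NotifyWithTimeout` is a hypothetical name.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <thread>
#include <utility>

// Runs `notify` on a separate thread and waits at most `timeout` for it to
// finish. Returns true if it completed in time, false if the caller gave up.
// The thread is detached, so a stuck `notify` never blocks the caller.
bool NotifyWithTimeout(std::function<void()> notify,
                       std::chrono::milliseconds timeout) {
  assert(notify);
  std::packaged_task<void()> task(std::move(notify));
  std::future<void> done = task.get_future();
  std::thread(std::move(task)).detach();
  return done.wait_for(timeout) == std::future_status::ready;
}
```

Whether or not the wait succeeds, the caller can go on to terminate the crashed process, which is what keeps the main process from stalling behind a blocked notification.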
Depends on D115379
Pushed by gsvelto@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/43c12d3f5c4f Add minidump generation for child processes in the WER module r=KrisWright
Comment 5•3 years ago
Backed out 5 changesets (Bug 1697895, Bug 1682518, Bug 1703761, Bug 1711418) for causing Windows 2012 x64 asan build bustages.
https://hg.mozilla.org/integration/autoland/rev/f4046bbf47c7b1b2f596168b3ffa7f046c191524
Push with failures:
https://treeherder.mozilla.org/jobs?repo=autoland&revision=4cc2cb3653f2f17e0cdf220eb1714c0a2d0dba30&selectedTaskRun=NBgmowhvR0-HwPjbLja2eQ.0
Failure log:
https://treeherder.mozilla.org/logviewer?job_id=342378389&repo=autoland&lineNumber=50884
Pushed by gsvelto@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/345c36d8e46b Add minidump generation for child processes in the WER module r=KrisWright
Comment 7•3 years ago
Backed out for causing build bustages on nsEmbedFunctions.cpp.
Pushed by gsvelto@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9d3542c25986 Add minidump generation for child processes in the WER module r=KrisWright
Comment 10•3 years ago (bugherder)