Closed Bug 1682518 Opened 3 years ago Closed 3 years ago

Make the WER runtime exception module inform the main browser about the minidumps it has generated

Categories

(Toolkit :: Crash Reporting, task)

Unspecified
Windows
task

Tracking

()

RESOLVED FIXED
91 Branch
Tracking Status
firefox91 --- fixed

People

(Reporter: gsvelto, Assigned: gsvelto)

References

Details

Attachments

(1 file)

When the exception module intercepts a child process crash, it has to inform the main process and hand over the minidump it generated. For child process crashes caught by the exception handler, the minidump is first stored in a pid-to-minidump mapping table once Breakpad has finished writing it out; the main process then takes it using TakeMinidumpForChild(). The flow for an externally generated minidump will have to be different. The main thing to keep in mind is avoiding races between whichever process or thread generates the minidump and the main process' main thread, which takes ownership of it. The main process never tries to grab a minidump until the child process has fully terminated, so if the minidump-writing code sets everything up before letting the child process die, all should be well (cue multiple processes dying at the same time, which might make things a wee bit more complicated).
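To make the hand-over concrete, here is a minimal sketch of a pid-to-minidump table with "take" semantics. It is not the Gecko implementation: MinidumpTable, Record() and Take() are hypothetical names, and the minidump is represented by a plain path string. It only illustrates the ordering assumption above, namely that the writer records the entry before the child is allowed to die and the main thread removes it afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the pid-to-minidump mapping table.
class MinidumpTable {
 public:
  // Called by whoever finished writing the minidump, before the child
  // process is allowed to terminate.
  void Record(uint32_t pid, std::string path) {
    std::lock_guard<std::mutex> lock(mutex_);
    table_[pid] = std::move(path);
  }

  // Called by the main thread once the child has terminated. On success the
  // entry is removed, so ownership moves to the caller exactly once.
  bool Take(uint32_t pid, std::string* out_path) {
    assert(out_path != nullptr);
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = table_.find(pid);
    if (it == table_.end()) {
      return false;  // No minidump was recorded for this pid.
    }
    *out_path = std::move(it->second);
    table_.erase(it);
    return true;
  }

 private:
  std::mutex mutex_;
  std::map<uint32_t, std::string> table_;
};
```

The lock only protects the table itself. Correctness of the overall flow still depends on the recording step happening before the child process is gone.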

Another thing worth taking into account is that we write out some annotations in the Breakpad exception handler. The module will have to handle those too, if possible.

Depends on: 1697895

I've been working on this for at least three weeks already :-/

Assignee: nobody → gsvelto
Status: NEW → ASSIGNED

Finally making some real progress here. I've managed to pass data from the main process to child processes and then to WER; now I only need to fill it in after writing a minidump and send it back to the main process. I wish there was an easier way but ATM there isn't.

This also notifies the main process after the minidump has been generated.
I refactored the code a bit so the patch is probably larger than it should be
but the code should be a bit more readable overall.

With this change the minidump generation flow works like this:

  • When the callback gets invoked in the WER process we read the structure that
    is stored in every process to figure out if it's the main process or a child
    one. This is done by reading said process' memory; the pointer was passed to
    the runtime exception module when it was registered.
  • If the main process crashed everything works like it used to.
  • If it was a child process then we first capture a minidump of it.
  • Then we read the structure representing it in the main process:
    WindowsErrorReportingData. The address of this structure was passed on the
    child process' command line, so we need to parse that first; then we read
    the structure from the main process' memory.
  • We fill the structure and write it back into the main process memory.
  • At this point, if everything went fine, we create a new thread in the main
    process just to execute the WerNotifyProc function, which informs the main
    process of the presence of the new minidump.
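The read, fill and write-back steps above can be sketched in portable C++ by modelling the main process' memory as a byte buffer. Everything here is an assumption made for illustration: FakeWerData and its fields are invented (the real WindowsErrorReportingData layout is not described in this bug), and FakeProcessMemory stands in for cross-process access, which on Windows would presumably go through ReadProcessMemory() and WriteProcessMemory().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical layout; the real structure's fields are not listed here.
struct FakeWerData {
  uint32_t child_pid;
  char minidump_id[40];
};

// Stand-in for another process' address space.
struct FakeProcessMemory {
  std::vector<unsigned char> bytes;

  bool Read(size_t addr, void* out, size_t len) const {
    if (addr > bytes.size() || len > bytes.size() - addr) return false;
    std::memcpy(out, bytes.data() + addr, len);
    return true;
  }

  bool Write(size_t addr, const void* in, size_t len) {
    if (addr > bytes.size() || len > bytes.size() - addr) return false;
    std::memcpy(bytes.data() + addr, in, len);
    return true;
  }
};

// Reads the structure at `addr`, fills it in, and writes it back. Returns
// false if either access fails, in which case no notification should be sent.
bool PublishMinidump(FakeProcessMemory& main_process, size_t addr,
                     uint32_t child_pid, const char* minidump_id) {
  FakeWerData data;
  if (!main_process.Read(addr, &data, sizeof(data))) return false;

  data.child_pid = child_pid;
  std::strncpy(data.minidump_id, minidump_id, sizeof(data.minidump_id) - 1);
  data.minidump_id[sizeof(data.minidump_id) - 1] = '\0';

  return main_process.Write(addr, &data, sizeof(data));
}
```

Only once PublishMinidump() has succeeded would the notification step run, so the main process never sees a half-filled structure.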

There's one important tidbit that's worth keeping in mind: the synchronization
between the main process and the WER process is implicit. The
WindowsErrorReportingData structure in the main process is kept alive until the
child process dies; the main process will destroy it only after that point. As
long as we're in the runtime exception module the crashed process is kept alive,
so the main process won't touch that structure in the meantime.
We explicitly terminate the crashed process only after we're done with the
structure, so nothing bad can happen... unless someone makes a change to
Gecko that breaks the previous assumption.

Another important thing to keep in mind: we wait for the newly created thread
to inform the main process, but only for 5 seconds. We don't want to wait
indefinitely because the function that we're calling takes a lock, and if
it blocks for some reason WER will get stuck waiting for it. WER would then
never kill the crashed process, which in turn would prevent the main process
from moving ahead. In principle this should never happen, but better safe than
sorry.
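The bounded wait can be illustrated with a portable sketch. This is not the code from the patch: NotifyWithTimeout is a hypothetical helper, and a detached std::thread stands in for the thread created in the main process. On Windows the same idea would presumably be a WaitForSingleObject() call on the thread handle with a 5000 ms timeout.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>

// Runs `notify` on a separate thread and waits at most `timeout` for it to
// finish. Returns true if it completed in time, false if we gave up waiting.
bool NotifyWithTimeout(std::function<void()> notify,
                       std::chrono::milliseconds timeout) {
  struct State {
    std::mutex mutex;
    std::condition_variable cv;
    bool done = false;
  };
  // Shared ownership keeps the state alive even if we return before the
  // worker finishes.
  auto state = std::make_shared<State>();

  std::thread([state, notify]() {
    notify();
    {
      std::lock_guard<std::mutex> lock(state->mutex);
      state->done = true;
    }
    state->cv.notify_one();
  }).detach();

  std::unique_lock<std::mutex> lock(state->mutex);
  return state->cv.wait_for(lock, timeout, [&state] { return state->done; });
}
```

Whether the wait succeeds or times out, the caller moves on and terminates the crashed process, which is what keeps a stuck notification from blocking the main process.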

Depends on D115379

Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/43c12d3f5c4f
Add minidump generation for child processes in the WER module r=KrisWright
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/345c36d8e46b
Add minidump generation for child processes in the WER module r=KrisWright

Backed out for causing build bustages on nsEmbedFunctions.cpp.

Push with failures

Failure log

Backout link

Flags: needinfo?(gsvelto)

Updated the patches, trying again

Flags: needinfo?(gsvelto)
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9d3542c25986
Add minidump generation for child processes in the WER module r=KrisWright
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch
Regressions: 1716470