Closed Bug 1659185 Opened 4 years ago Closed 4 years ago

Crash annotations are read only from the parent process when generating paired minidumps

Categories

(Toolkit :: Crash Reporting, defect, P1)

defect

Tracking

()

RESOLVED FIXED
82 Branch
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- wontfix
firefox80 --- wontfix
firefox81 --- wontfix
firefox82 --- fixed

People

(Reporter: gsvelto, Assigned: gsvelto)

References

(Regression)

Details

(Keywords: regression)

Attachments

(2 files)

As per the title, this can be seen in crashes caused by IPC failure such as e56fbdc9-f230-459f-839f-0d9180200814. This is probably fallout from bug 1614933.

I've investigated this a bit further and can confirm that this is indeed a regression from bug 1614933. I hadn't considered this particular use-case when refactoring that code. This will require a bit of surgery because all the code that grabs the minidumps for this particular flow resides in the parent process, and the child's annotations are not accessible from it anymore. I'll probably have to send a signal to the child process (on Linux/macOS) or spawn a remote thread (on Windows) to force it to send the annotations back to the parent.

When generating a hang report we need to force a hung child process to return
its annotation via the pipe that connects it to the main process. Since the
child is not responding to regular IPC messages we force it to using
platform-specific code:

  • On Windows we launch a new thread using CreateRemoteThread() that will
    invoke the code used to send the crash annotations. The Windows-specific
    code also has another slight modification: it stored the pipe used to send
    the annotations in Breakpad's minidump callback context. This patch moves it
    into the gChildCrashAnnotationReportFd global which is what all the other
    platforms use.
  • On Linux/macOS we use a signal handler for the SIGTERM signal. It would be
    simpler to use SIGUSR1 or another signal that doesn't have special meanings,
    but those are all already in use in Gecko (either by jprof or the gcov
    machinery). We don't install SIGTERM handlers for anything else but
    obvioulsy its semantics get in the way since it's used to politely kill a
    process, and Chromium IPC also uses it for that purpose. So to tell apart
    our use of SIGTERM from a "regular" terminating SIGTERM we send a magic
    value along using sigqueue(). Our signal handler will look for it and only
    send annotations when it is set. If the magic number isn't there it will
    re-install the default SIGTERM handler and raise the signal once more which
    will lead to the process being terminated as it should be.

The patch refactors some of the code so that the PIDs and process/file handles
use platform-independent types to cut down on the amount of duplication we
have there.

Assignee: nobody → gsvelto
Status: NEW → ASSIGNED
Attached file Test patch

Applying this patch and then running with ./mach run --enable-crash-reporter is sufficient to test this. The patch blocks child process shutdown indefinitely so that shutdown hang reports will be generated every time one is shut down.

Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0a477b894943
Use child process crash annotations when generating hang reports; r=dmajor,froydnj
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 82 Branch
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: