Closed Bug 1153205 Opened 5 years ago Closed 5 years ago

[e10s] ContentParent should annotate crash reports prior to calling GeneratePairedMinidump

Categories

(Core :: DOM: Content Processes, defect)

x86_64
All
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
e10s + ---
firefox40 --- affected
firefox41 --- fixed

People

(Reporter: jimm, Assigned: jimm)

References

Details

Attachments

(1 file, 8 obsolete files)

Attached patch patch (obsolete) — Splinter Review
Attachment #8590803 - Flags: review?(ted)
I tested this locally and everything seems to be working. To test on try I pushed this plus a plug that returns false from ContentParent's LoadPlugin. That should trigger this quite a bit in mochitest.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=fc4f0fac71bc
Comment on attachment 8590803 [details] [diff] [review]
patch

Review of attachment 8590803 [details] [diff] [review]:
-----------------------------------------------------------------

Looking at how empty that .extra file was I feel like we're still missing some annotation bits that get called in the normal hang/crash case. I started looking into it this morning but got a little lost and don't have that much time to devote to sorting it out right now.
Attachment #8590803 - Flags: review?(ted) → review+
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #3)
> Comment on attachment 8590803 [details] [diff] [review]
> patch
> 
> Review of attachment 8590803 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looking at how empty that .extra file was I feel like we're still missing
> some annotation bits that get called in the normal hang/crash case. I
> started looking into it this morning but got a little lost and don't have
> that much time to devote to sorting it out right now.

I'll file a follow up on this. Eventually someone will need this data and fix it. :)
Attached patch wip (obsolete) — Splinter Review
Attachment #8590803 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Attachment #8590987 - Attachment is obsolete: true
Attached patch tmp.txt (obsolete) — Splinter Review
Attachment #8590989 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Attachment #8591037 - Attachment is obsolete: true
Using the latest wip, good meta data here. Going to push to try again and trigger a crash, see what we get.

https://crash-stats.mozilla.com/report/index/5b1fe74c-67f6-4710-92f5-852742150410
Attached patch patch v.2 (obsolete) — Splinter Review
Attachment #8591040 - Attachment is obsolete: true
Still getting a crash in google_breakpad::ExceptionHandler::WriteMinidump() in a really weird place - 

http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/google-breakpad/src/client/windows/handler/exception_handler.cc#741
Hmm, this top frame was found through "given as instruction pointer in context".. I wonder if that can be inaccurate. Working with a debug build might be better here.
(In reply to Jim Mathies [:jimm] from comment #13)
> Hmm, this top frame was found through "given as instruction pointer in
> context".. I wonder if that can be inaccurate. Working with a debug build
> might be better here.

reliable as in, sits at the very bottom of the "trust list" - 

http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/google-breakpad/src/google_breakpad/processor/stack_frame.h#53
Yeah, that's accurate. What did your try log look like?
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #15)
> Yeah, that's accurate. What did your try log look like?

It crashed in the same place - 

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a2e0187da8ef

I'm wondering if this has something to do with that WaitForSingleObject call ExceptionHandler makes when it yields to its worker thread. Maybe something is running out of band, through some sort of callback or something. Just a guess.

The big question I'm left with is how to debug it, since I can't reproduce locally.
So this isn't crashing. Note:
 PROCESS-CRASH | Main app process exited normally | application crashed [@ ntdll.dll + 0x470b4] 

This is just what a minidump from the browser process looks like when you call CreatePairedMinidumps. The problem is that something is calling this code path and either not calling SimpleTest.expectChildProcessCrash(), or something in your patch mucks up how expectChildProcessCrash cleans up the expected dumps.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #17)
> So this isn't crashing. Note:
>  PROCESS-CRASH | Main app process exited normally | application crashed [@
> ntdll.dll + 0x470b4] 
> 
> This is just what a minidump from the browser process looks like when you
> call CreatePairedMinidumps. The problem is that something is calling this
> code path and either not calling SimpleTest.expectChildProcessCrash(), or
> something in your patch mucks up how expectChildProcessCrash cleans up the
> expected dumps.

That's good to hear. Maybe I'm just running into some random test failures here.
(In reply to Jim Mathies [:jimm] from comment #18)
> (In reply to Ted Mielczarek [:ted.mielczarek] from comment #17)
> > So this isn't crashing. Note:
> >  PROCESS-CRASH | Main app process exited normally | application crashed [@
> > ntdll.dll + 0x470b4] 
> > 
> > This is just what a minidump from the browser process looks like when you
> > call CreatePairedMinidumps. The problem is that something is calling this
> > code path and either not calling SimpleTest.expectChildProcessCrash(), or
> > something in your patch mucks up how expectChildProcessCrash cleans up the
> > expected dumps.
> 
> That's good to hear. Maybe I'm just running into some random test failures
> here.

Ha! Forgot I was doing this on purpose so I could see the crash reports!

https://hg.mozilla.org/try/rev/4fcacd5134ba#l1.12
Attached patch patch v.3 (obsolete) — Splinter Review
Attachment #8591094 - Attachment is obsolete: true
Attachment #8591928 - Flags: review?(ted)
tracking-e10s: --- → +
Comment on attachment 8591928 [details] [diff] [review]
patch v.3

Review of attachment 8591928 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/ipc/ContentParent.cpp
@@ +3309,5 @@
>  
>  #if defined(MOZ_CRASHREPORTER) && !defined(MOZ_B2G)
> +    // We're about to kill the child process associated with this content.
> +    // Something has gone wrong to get us here, so we generate a minidump
> +    // of the parent and child for submittal to the crash server.

The word you're looking for is "submission". :)

::: dom/ipc/CrashReporterParent.cpp
@@ +145,5 @@
> +        // If we don't have a dump id yet, we haven't called
> +        // CrashReporter::CreatePairedMinidumps. Apply any annotations using the
> +        // generic crash reporter api. When we call CreatePairedMinidumps
> +        // these annotations will get written out to an extra file associated
> +        // with our report.

I'm not sure this is right. These annotations are from the child process, but calling AnnotateCrashReport sticks them into the chrome process' list of annotations. That means that if we later crashed the chrome process we'd get these annotations in the crash report which would be confusing.

@@ +180,5 @@
> +
> +    if (NS_IsMainThread()) {
> +        // Inline, this is the main thread. Get this done.
> +        NotifyCrashService();
> +        return;

It'd be nice if this pattern was easier to write with a one-liner that did the right thing for main-thread-or-not.
Attachment #8591928 - Flags: review?(ted)
Attached patch patch v.4 (obsolete) — Splinter Review
Attachment #8591928 - Attachment is obsolete: true
Attached patch patch v.4Splinter Review
Attachment #8602147 - Attachment is obsolete: true
Comment on attachment 8602150 [details] [diff] [review]
patch v.4

this adds the all in one call, and basically just bails if somebody does something in the wrong order.
Attachment #8602150 - Flags: review?(ted)
Blocks: 1160142
Comment on attachment 8602150 [details] [diff] [review]
patch v.4

Review of attachment 8602150 [details] [diff] [review]:
-----------------------------------------------------------------

This is all getting very complicated for my brain to handle, but I think I follow.
Attachment #8602150 - Flags: review?(ted) → review+
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #22)
> ...
> 
> ::: dom/ipc/CrashReporterParent.cpp
> @@ +145,5 @@
> > +        // If we don't have a dump id yet, we haven't called
> > +        // CrashReporter::CreatePairedMinidumps. Apply any annotations using the
> > +        // generic crash reporter api. When we call CreatePairedMinidumps
> > +        // these annotations will get written out to an extra file associated
> > +        // with our report.
> 
> I'm not sure this is right. These annotations are from the child process,
> but calling AnnotateCrashReport sticks them into the chrome process' list of
> annotations. That means that if we later crashed the chrome process we'd get
> these annotations in the crash report which would be confusing.

I'd actually want to get the annotations from all processes in the crash report, or did I misunderstand the objection?  I don't care if it's separated somehow, as long as the annotations from child processes are available in the crash report even if it was only the chrome process that crashed.
(In reply to Milan Sreckovic [:milan] from comment #27)
> I'd actually want to get the annotations from all processes in the crash
> report, or did I misunderstand the objection?  I don't care if it's
> separated somehow, as long as the annotations from child processes are
> available in the crash report even if it was only the chrome process that
> crashed.

That's not how it works, and I don't see how you could make that workable in the general sense. If we have >1 content process and they all AnnotateCrashReport("URL", <whatever>), you want those annotations to be per-process.

Currently the main process has a set of annotations, and each child process has its own set of annotations. When the main process crashes it uses its set. When a child process crashes it uses its own set along with a subset of the parent process annotations (specifically everything except the ones in the blacklist here: https://hg.mozilla.org/mozilla-central/annotate/baa9c64fea6f/toolkit/crashreporter/nsExceptionHandler.cpp#l308 ).
It may be how it works, but that makes the crash report less useful, as it may have been the child process that caused the parent process to crash, and we lose valuable information if we don't see the details of the child process annotations.

Child process crashing picking up a subset of parent annotations sounds good - why can't parent process crashing also pick up annotations from all the child processes?
And I understand that the child process is not "crashing" when the parent process crashes, but I'm arguing that there is enough information in the child process annotations that I'd want them picked up even though the child process isn't crashing, as such.
We have multiple child processes. How should we sort out annotations from each? Even in a single content process e10s case we can have that content process + multiple plugin (and GMP) processes.
https://hg.mozilla.org/mozilla-central/rev/ce6dbefadccd
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Depends on: 1360308
You need to log in before you can comment on or make changes to this bug.