1323097 - Crash reason does not appear in minidumps submitted via the infobar

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Description

•

8 years ago

I think we don't get the crash reason since it's attached via PrepareChildExceptionTimeAnnotations. That function writes its annotations into a temp file that (I believe) doesn't get picked up by the infobar code.

When we fix this, we should also make sure that the shmem code is updated as well (see bug 1278717).

Ryan VanderMeulen [:RyanVM]

Comment 1

•

7 years ago

Brad, can you help find an owner for this? This came up in bug 1322211 as a reason for why we're getting crash submissions without an abort reason.

Flags: needinfo?(blassey.bugs)

Mike Conley (:mconley) (:⚙️)

Comment 2

•

7 years ago

Having talked to aklotz, it seems like bug 1236108 made it so that the content process, when it crashes, creates both a minidump as well as its own .extra file (called GeckoChildCrash<pid>.extra). That .extra file is where the information we care about is.

Strangely, we've been apparently getting the information from that .extra file from about:tabcrashed, and both about:tabcrashed and the unsubmitted crash report notification bar both submit using the same mechanism.

Anyhow, I'll look into this.

Assignee: nobody → mconley

Status: NEW → ASSIGNED

Flags: needinfo?(blassey.bugs)

Mike Conley (:mconley) (:⚙️)

Updated

•

7 years ago

Flags: needinfo?(mconley)

Mike Conley (:mconley) (:⚙️)

Comment 3

•

7 years ago

Actually, more interesting is that with the function at [1], when constructing the .extra file for the main dump, the content process .extra information is merged into it.

I wonder if what's happening here is that sometimes this function isn't running - perhaps in the event that the parent process hasn't also had a minidump collected for it (ie, it wasn't a paired minidump).

[1]: http://searchfox.org/mozilla-central/rev/e95e4ed8b5229a29dacc32c0b90968df5e7a6ff3/toolkit/crashreporter/nsExceptionHandler.cpp#3298-3299

Mike Conley (:mconley) (:⚙️)

Comment 4

•

7 years ago

Hey billm, in bug 1322211 comment 2, you mentioned that the we're not picking up the extra values from the content process. It looks like bug 1236108 added code to sew the .extra file from the parent process together with one from the content process, and that's what gets sent. I just confirmed that this merging and sending works when sending from the infobar (which isn't super surprising, since they share the same crash report submission backend).

Do you know of a reliable way of creating the class of crash that caused this bug to be filed? Perhaps something in how we write the child process .extra file (or merge with the parent process) is falling over in certain conditions.

In any case, I suspect the notification bar code is innocent.

Flags: needinfo?(mconley) → needinfo?(wmccloskey)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Comment 5

•

7 years ago

If the parent process has already crashed or shut down, then we won't do the merging. I suspect that's what is happening. I'm not really sure what we could do about this then.

I think Chrome has a crash reporter process that just sits there and watches for crashes from all the other processes. That seems like a much better design, but we're probably a long way from that point.

Flags: needinfo?(wmccloskey)

Mike Conley (:mconley) (:⚙️)

Comment 6

•

7 years ago

(In reply to Bill McCloskey (:billm) from comment #5)
> If the parent process has already crashed or shut down, then we won't do the
> merging. I suspect that's what is happening. I'm not really sure what we
> could do about this then.
> 

Ah. In that case, I wonder if it'd be better to try sending the content process dump and its .extra file (which should still be on the disk) on their own.

Would that be useful? Is it even possible / supported to send a content process crash dump that isn't paired with a stack from the parent?

peterbe, do you happen to know the answer to that last question?

Flags: needinfo?(peterbe)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Comment 7

•

7 years ago

(In reply to Mike Conley (:mconley) from comment #6)
> (In reply to Bill McCloskey (:billm) from comment #5)
> > If the parent process has already crashed or shut down, then we won't do the
> > merging. I suspect that's what is happening. I'm not really sure what we
> > could do about this then.
> > 
> 
> Ah. In that case, I wonder if it'd be better to try sending the content
> process dump and its .extra file (which should still be on the disk) on
> their own.
> 
> Would that be useful? Is it even possible / supported to send a content
> process crash dump that isn't paired with a stack from the parent?

Yes, it's certainly possible. I thought it was even the typical case.

Mike Conley (:mconley) (:⚙️)

Comment 8

•

7 years ago

If I'm reading this right, then we create paired minidumps even when the content process goes down without the parent force-killing it: http://searchfox.org/mozilla-central/rev/f5077ad52f8b90183e73038869f6140f0afbf427/dom/ipc/ContentParent.cpp#1591-1593

In any case, perhaps the UnsubmittedCrashHandler can scan for those GeckoChildCrash<pid>.extra files and "do the right thing". I'll look at that today.

Mike Conley (:mconley) (:⚙️)

Comment 9

•

7 years ago

(In reply to Mike Conley (:mconley) from comment #8)
> If I'm reading this right, then we create paired minidumps even when the
> content process goes down without the parent force-killing it:
> http://searchfox.org/mozilla-central/rev/
> f5077ad52f8b90183e73038869f6140f0afbf427/dom/ipc/ContentParent.cpp#1591-1593

I'm now second-guessing that part. I don't think my mental model of how it works is exactly correct. It's quite possible we only submit the dump for the content process here (which makes sense - the parent process dump is unlikely useful when the content process goes down for reasons unrelated to what the parent is doing).

Mike Conley (:mconley) (:⚙️)

Comment 10

•

7 years ago

Hey aklotz, is there any way for the content process to know the UUID that the content process dump is going to be written to at the time when the GeckoChildCrash<pid>.extra file is written?

I ask because there's not currently a great way of associating loose GeckoChildCrash<pid>.extra files that haven't been merged with dumps in the directory. If we were able to make this, for example, <uuid>-GeckoChildCrash<pid>.extra, that'd make it much simpler. Or is this information not available when that file is written?

Flags: needinfo?(aklotz)

Mike Conley (:mconley) (:⚙️)

Updated

•

7 years ago

Depends on: 1338286

(not currently active) Ted Mielczarek

Comment 11

•

7 years ago

Yeah, in the normal case (where the content process simply crashes) we just send a single minidump. Paired minidumps are used for hangs where we force-kill the content process and some other situation that I've forgotten the details of. If the content process crash doesn't have the additional parent annotations merged in submission is not likely to work, since I believe that's where it gets things like the application name and whatnot.

Flags: needinfo?(peterbe)

(No longer employed by Mozilla) Aaron Klotz

Comment 12

•

7 years ago

(In reply to Mike Conley (:mconley) from comment #10)
> Hey aklotz, is there any way for the content process to know the UUID that
> the content process dump is going to be written to at the time when the
> GeckoChildCrash<pid>.extra file is written?

As discussed today on IRC:

11:38:35 AM - aklotz: mconley: what if we did it the other way around: each content generates its own uuid, registers that w/ parent. Parent records the child uuids as part of its .extra file. Then if we need to do the merge later on, we see those annotations in the parent .extra file and use them as pointers to the child .extra files to merge?
11:40:05 AM - mconley: That's not bad
11:41:17 AM - mconley: aklotz: although in this scenario, the parent .extra file has never been written
11:41:30 AM - mconley: but I like this line of thinking
11:42:34 AM - aklotz: mconley: yeah, that really only works in the "parent crashing too" case... but yeah I think this could be fleshed out in a way that could work
11:42:54 AM - aklotz: just need to find a nice way to persist that info

Then we discussed the bug 1338286 route.

Flags: needinfo?(aklotz)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Comment 13

•

7 years ago

Just FYI, David is changing a lot of stuff in bug 1338308. That will move everything over to the shmem approach.

Mike Conley (:mconley) (:⚙️)

Updated

•

7 years ago

Depends on: 1338308

Mike Conley (:mconley) (:⚙️)

Comment 14

•

7 years ago

Okay. It's quite possible in that case that bug 1338308 will take care of this particular case. Once that bug closes, I'll poke at this again to see if there's anything that the UnsubmittedCrashHandler needs to do.

Mike Conley (:mconley) (:⚙️)

Comment 15

•

7 years ago

Hey dvander, great job with bug 1338308! I'm curious - now that that's in tree, do you know if we can safely use the shmem mechanism to send these annotations that are computed at exception time in the content process: http://searchfox.org/mozilla-central/rev/ea31c4b64f34a29415a69fb768f8051495547315/toolkit/crashreporter/nsExceptionHandler.cpp#1387-1419 ?

If so, we can probably just use that instead of the stuff from bug 1236108 for content process annotations.

Flags: needinfo?(dvander)

David Anderson [:dvander] - inactive, e-mail if emergency

Comment 16

•

7 years ago

(In reply to Mike Conley (:mconley) from comment #15)
> Hey dvander, great job with bug 1338308! I'm curious - now that that's in
> tree, do you know if we can safely use the shmem mechanism to send these
> annotations that are computed at exception time in the content process:
> http://searchfox.org/mozilla-central/rev/
> ea31c4b64f34a29415a69fb768f8051495547315/toolkit/crashreporter/
> nsExceptionHandler.cpp#1387-1419 ?
> 
> If so, we can probably just use that instead of the stuff from bug 1236108
> for content process annotations.

I'm not 100% sure, it would depend on two things:
 (1) that the memory location containing the CrashReporterClient/shmem hasn't been stomped
 (2) the UI process hasn't yet tried to read the metadata shmem (which happens on ActorDestroy, after the IPDL pipe is detected broken)

If both of those are true it would be safe to switch to normal annotations again. Though now that I think about it, there's no reason we can't just do the annotation right before the exception occurs (i.e. inside MOZ_CRASH and wherever).

Flags: needinfo?(dvander)

Mike Conley (:mconley) (:⚙️)

Comment 17

•

7 years ago

(In reply to David Anderson [:dvander] from comment #16)
> 
> I'm not 100% sure, it would depend on two things:
>  (1) that the memory location containing the CrashReporterClient/shmem
> hasn't been stomped
>  (2) the UI process hasn't yet tried to read the metadata shmem (which
> happens on ActorDestroy, after the IPDL pipe is detected broken)
> 

Okay, great. I don't think there's too much I can do about (1), but if my memory serves me, (2) is already satisfied. I believe that by the time that the ExceptionHandler has been writing these annotations, the content process is still in the process of going down, and the pipe should still be up. aklotz, does that sound right?

> If both of those are true it would be safe to switch to normal annotations
> again. Though now that I think about it, there's no reason we can't just do
> the annotation right before the exception occurs (i.e. inside MOZ_CRASH and
> wherever).

I suppose we could do that, assuming we only cared about the crashes that are occurring intentionally. Another question for aklotz - can I assume that PrepareChildExceptionTimeAnnotations is writing annotations that are useful even when we're crashing unintentionally?

Flags: needinfo?(aklotz)

(No longer employed by Mozilla) Aaron Klotz

Comment 18

•

7 years ago

(In reply to Mike Conley (:mconley) from comment #17)
> I suppose we could do that, assuming we only cared about the crashes that
> are occurring intentionally. Another question for aklotz - can I assume that
> PrepareChildExceptionTimeAnnotations is writing annotations that are useful
> even when we're crashing unintentionally?

Yes, you may assume that.

Flags: needinfo?(aklotz)

Mike Conley (:mconley) (:⚙️)

Comment 19

•

7 years ago

I'm afraid Photon / QF stuff is taking a lot of my cycles lately. Freeing this up.

Assignee: mconley → nobody

Status: ASSIGNED → NEW

Gabriele Svelto [:gsvelto] (PTO)

Comment 20

•

5 years ago

All issues related to missing annotations should have been solved by bug 1547698. Also if I lookup crashes sent from the infobar on crashstats plenty of them have the crash reason set so I guess this if fixed.

Status: NEW → RESOLVED

Closed: 5 years ago

Resolution: --- → FIXED

Bugzilla

Quick Search

Crash reason does not appear in minidumps submitted via the infobar

Categories

(Toolkit :: Crash Reporting, defect)

Tracking

()

People

(Reporter: billm, Unassigned)

References

(Depends on 1 open bug)

Details

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Updated

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Updated

Comment 11

Comment 12

Comment 13

Updated

Comment 14

Comment 15

Comment 16

Comment 17

Comment 18

Comment 19

Comment 20