Closed Bug 1605209 Opened 1 year ago Closed 1 year ago

Add memory region to crash reports containing current actor/message manager message being processed

Categories

(Core :: DOM: Content Processes, task, P2)

task

Tracking

()

RESOLVED FIXED
mozilla77
Fission Milestone M5b
Tracking Status
firefox77 --- fixed

People

(Reporter: kmag, Assigned: Yoric)

References

(Blocks 1 open bug)

Details

Attachments

(4 files)

Crash reports don't contain JS stacks, which makes it hard to diagnose crashes caused by JS unless there's particularly relevant C++ on the stack. In the case of JS calling or called by actor or message manager messages, there often isn't.

It would help a lot if we simply stored the name of the actor/message we're currently sending or receiving in a memory region that we register with the crash reporter, so that we can inspect it to get some context for crash reports where it seems relevant.

This should add a small amount of overhead to message processing, but hopefully not much, since it just requires copying a handful of machine words which are already in the cache anyway.

Tracking for Fission dogfooding (M5)

kmag says he will have a patch up soon. This bug will make Fission crash reports easier to debug.

Fission Milestone: --- → M5
Priority: -- → P1

Moving P1 M5 bugs to M5a milestone

Fission Milestone: M5 → M5a

:kmag has a lot of bugs on his plate, so moving to m5b, and unassigning so we can find a new owner.

Assignee: kmaglione+bmo → nobody
Fission Milestone: M5a → M5b
Priority: P1 → P2

Yoric, will you have time to work on this bug after you finish JSContentActor?

This bug will improve Fission crash reporting. We expect to receive a lot of new crash reports when we start dogfood testing soon.

Flags: needinfo?(dteller)

I can do that, yes.
We'll need to check the perf impact, though.

Assignee: nobody → dteller
Flags: needinfo?(dteller)

So, I think that the best way to do this is to do something like:

  • nsICrashReporter::annotateCrashReport("Fission:agent", agentName) whenever we start receiving a message (start of recvRawMessage);
  • nsICrashReporter::annotateCrashReport("Fission:agent", "") whenever we're done receiving a message (end of recvRawMessage).

Gabriele, is there a better way?

Flags: needinfo?(gsvelto)

That's the simple way. The other way if you have a persistent piece of memory that's holding this is using CrashReporter::RegisterAppMemory(). However inspecting that is a lot harder on Socorro, because you need to manually download the minidump and analyze it with minidump_dump, it's not surface any other way.

BTW if you're using annotations and you're in C++ it's better to just use CrashReporter::AnnotateCrashReport() without going through the XPCOM interface.

Flags: needinfo?(gsvelto)

Thanks.

Attaching a first version. I'm not sure how to best test this.

I guess

  1. I'll need a mochitest of whichever kind we're using in dom/ipc/tests/JSWindowActor;
  2. Create a JSWindowActor{Child, Parent} pair;
  3. Send the message "crash" to the child;
  4. Make sure that the child does crash;
  5. From the parent, somehow locate the crash file <-- I've seen the code that does this in a xpcshell, but how do I do this in a mochitest?
  6. Run the minidump analyzer manually from the parent;
  7. From the parent, look at the table of annotations;
  8. Check that we have the right annotations.

Does this make sense? Is there a better way?

Flags: needinfo?(gsvelto)
Flags: needinfo?(nika)

I have no idea how to test crash annotations, so this is a :gsvelto question.

Flags: needinfo?(nika)

BrowserTestUtils.crashFrame() is what you're looking for. It returns a promise that resolves to an object holding the crash annotations so you can inspect them.

Flags: needinfo?(gsvelto)

Thanks. Any chance that there is also a way to crash without using an actor, to test that the annotation is not present when we're not in an actor?

Flags: needinfo?(gsvelto)

(In reply to David Teller [:Yoric] (please use "needinfo") from comment #12)

Thanks. Any chance that there is also a way to crash without using an actor, to test that the annotation is not present when we're not in an actor?

I don't know enough about actors to answer that question :-(

Flags: needinfo?(gsvelto)

(In reply to David Teller [:Yoric] (please use "needinfo") from comment #12)

Thanks. Any chance that there is also a way to crash without using an actor, to test that the annotation is not present when we're not in an actor?

Maybe this helps: do_crash is how crashes are generated for xpcshell-tests.

This should save (a little) memory and avoid quite a few conversions.

Pushed by dteller@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/289f5bbac1ae
Annotate JSWindowActor's message exchanges to determine which actor is on the stack of a crash;r=gsvelto,nika
https://hg.mozilla.org/integration/autoland/rev/e0e6dbf1d48d
Testing JSWindowActor crash annotations;r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/d81b566ad94f
Extending BrowserTestUtils to allow out-of-stack crashing;r=mconley
https://hg.mozilla.org/integration/autoland/rev/6eb1cc169dbf
Turn actor names into nsCString;r=nika

Backed out 4 changesets (Bug 1605209) for causing browser-chrome failures at dom/ipc/tests/JSWindowActor/browser_crash_report.js

Push with failure: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&classifiedState=unclassified&revision=6eb1cc169dbf3f258bed8cb747ced0ebfb15188d

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=298064845&repo=autoland&lineNumber=3174

Backout link: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&classifiedState=unclassified&revision=54789c4c8fcbfd4f1fc4827ae2c10791815ad39a

[task 2020-04-17T09:40:49.285Z] 09:40:49     INFO - TEST-START | dom/ipc/tests/JSWindowActor/browser_crash_report.js
[task 2020-04-17T09:40:50.862Z] 09:40:50     INFO - TEST-INFO | started process screentopng
[task 2020-04-17T09:40:51.456Z] 09:40:51     INFO - TEST-INFO | screentopng: exit 0
[task 2020-04-17T09:40:51.457Z] 09:40:51     INFO - Buffered messages logged at 09:40:49
[task 2020-04-17T09:40:51.458Z] 09:40:51     INFO - Entering test bound 
[task 2020-04-17T09:40:51.458Z] 09:40:51     INFO - Entering test: crash actor
[task 2020-04-17T09:40:51.459Z] 09:40:51     INFO - Buffered messages logged at 09:40:50
[task 2020-04-17T09:40:51.459Z] 09:40:51     INFO - browser ready
[task 2020-04-17T09:40:51.460Z] 09:40:51     INFO - Cannot test crash annotations without a crash reporter
[task 2020-04-17T09:40:51.460Z] 09:40:51     INFO - Exiting test: crash actor
[task 2020-04-17T09:40:51.461Z] 09:40:51     INFO - Leaving test bound 
[task 2020-04-17T09:40:51.462Z] 09:40:51     INFO - Buffered messages finished
[task 2020-04-17T09:40:51.465Z] 09:40:51     INFO - TEST-UNEXPECTED-FAIL | dom/ipc/tests/JSWindowActor/browser_crash_report.js | This test contains no passes, no fails and no todos. Maybe it threw a silent exception? Make sure you use waitForExplicitFinish() if you need it. - 
[task 2020-04-17T09:40:51.466Z] 09:40:51     INFO - GECKO(3502) | MEMORY STAT vsizeMaxContiguous not supported in this build configuration.
[task 2020-04-17T09:40:51.469Z] 09:40:51     INFO - GECKO(3502) | MEMORY STAT heapAllocated not supported in this build configuration.
[task 2020-04-17T09:40:51.469Z] 09:40:51     INFO - GECKO(3502) | MEMORY STAT | vsize 20974971MB | residentFast 1114MB
[task 2020-04-17T09:40:51.469Z] 09:40:51     INFO - TEST-OK | dom/ipc/tests/JSWindowActor/browser_crash_report.js | took 1620ms
[task 2020-04-17T09:40:51.469Z] 09:40:51     INFO - checking window state
[task 2020-04-17T09:40:51.469Z] 09:40:51     INFO - GECKO(3502) | must wait for focus
Flags: needinfo?(dteller)
Backout by dvarga@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6daca7282599
Backed out 4 changesets for causing browser-chrome failures at dom/ipc/tests/JSWindowActor/browser_crash_report.js
Pushed by dteller@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b2da7e86bdb0
Annotate JSWindowActor's message exchanges to determine which actor is on the stack of a crash;r=gsvelto,nika
https://hg.mozilla.org/integration/autoland/rev/61b7701d6f58
Testing JSWindowActor crash annotations;r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/dfa9edcdc743
Extending BrowserTestUtils to allow out-of-stack crashing;r=mconley
https://hg.mozilla.org/integration/autoland/rev/279da844cdfc
Turn actor names into nsCString;r=nika
Flags: needinfo?(dteller)
Duplicate of this bug: 1623981
You need to log in before you can comment on or make changes to this bug.