Closed Bug 1280477 Opened 3 years ago Closed 3 years ago

Store stack traces into the crash.main event file for each browser crash

Categories

(Toolkit :: Crash Reporting, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox52 --- fixed

People

(Reporter: gsvelto, Assigned: gsvelto)

References

Details

(Whiteboard: [fce-active-legacy])

Attachments

(2 files, 14 obsolete files)

249.71 KB, patch
gsvelto
: review+
Details | Diff | Splinter Review
24.37 KB, patch
ted
: review+
Details | Diff | Splinter Review
+++ This bug was initially created as a clone of Bug #1280470 +++

Once bug 1280470 has been implemented we should be able to gather the stack traces from an incoming minidump when launching the crash reporter client. The traces should be stored in the crash.main event file; we should bump the event file version to make the crash manager aware of the changes.
Blocks: 1280484
Are we going to store the complete JSON output of minidump-stackwalk? A lot of that information is valuable beyond the stack trace itself.
Flags: needinfo?(gsvelto)
Whiteboard: [fce-active]
For the initial step, we were going for the stack only. Mostly because that has a clear need for data transform that the rest of the data would not. If we're going to add more, I'd like to quantify what before we spec the data transform step.
There's quite a few extra fields available in the minidump including processor information, OS information, etc... I would start with stack traces and progressively add stuff that we deem important. Not only to reduce the scope of this work during its early stages but because most of this data would need to be vetted for privacy reasons and I remember that the process of adding bits of client information to Telemetry can be very long.
Flags: needinfo?(gsvelto)
Additionally we probably already collect functionally equivalent data in telemetry for much of what the minidump contains. I agree that getting stack traces working is the most important immediate concern.
As data steward, I can make the decisions about data privacy. I'm concerned that *most* of the output of minidump-stackwalk is probably valuable in some situations. In general we have the module list, the largest_free_vm_block data. In particular cases when people are diagnosing we also have the stackwalk particulars. If you want to start with just the stack, that's fine, but we should plan on doing the full set.
Okay. Do note that a lot of that data comes from the stackwalk implementation that lives in the Socorro repo, so we'd have to either import that code into m-c or copy/paste bits out. lonnen has been talking about splitting up the Socorro repo, so we could potentially move it elsewhere and share that code with m-c if we want to have a consistent set of data between crash-stats and client-side crash reports.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #5)
> As data steward, I can make the decisions about data privacy.

That would make things a lot simpler :)

> I'm concerned
> that *most* of the output of minidump-stackwalk is probably valuable in some
> situations. In general we have the module list, the largest_free_vm_block
> data. In particular cases when people are diagnosing we also have the
> stackwalk particulars. If you want to start with just the stack, that's
> fine, but we should plan on doing the full set.

I'll open a bug and attach sample output from minidumps taken on different platforms so that we can evaluate what else we can take.
The stackwalker I mentioned in comment 6 (and that bsmeberg is referring to vis things like `largest_free_vm_block` is this code:
https://github.com/mozilla/socorro/blob/master/minidump-stackwalk/stackwalker.cc

It takes the minidump and the .extra data (it expects it to be in JSON format, Socorro does that translation) and produces JSON.
I've started working on this basing myself on the code pointed out in comment 8. I'll be producing JSON which will be stored in the event file and then sent with the crash ping. The output will be compatible with Socorro's stackwalker JSON output minus the redundant fields e.g. a frame's module_offset and normalized fields can be recomputed from the remaining data so there's no point in sending them with telemetry. A frame's module info might also be an index instead of the module name if we want to save more space but for now I'm not doing it for compatibility's sake.
Assignee: nobody → gsvelto
Status: NEW → ASSIGNED
This is a WIP patch that processes the main minidump for a crash and puts the generated traces in the crash event file under the StackTraces key/value pair. I've made the resulting JSON trace in a format that's similar to the one used by Socorro minus all the redundant fields that are already available in the crash ping and those that can be regenerated from others. Also the trace is not symbolicated. Overall the new wntry in the crash file will look like this (on a single line without spaces or newlines):

{
  "status":"OK",
  "crash_info":{
    "address":"0x8",
    "crashing_thread":0,
    "type":"SIGSEGV"
  },
  "main_module":0,
  "modules":[
    {
      "base_addr":"0x400000",
      "code_id":"id",
      "debug_file":"firefox",
      "debug_id":"3D20E1805F4DCEBC63685E9898CD04EC0",
      "end_addr":"0x428000",
      "filename":"firefox",
      "version":""
    },
    ...
  ],
  "thread_count":55,
  "threads":[
    {
      "frame_count":132,
      "frames":[
        {
          "frame":0,
          "module":"libxul.so",
          "offset":"0x7f1a74f85aa2",
          "trust":"context"
        },
        {
          "frame":1,
          "module":"libxul.so",
          "offset":"0x7f1a74f8a1a5",
          "trust":"frame_pointer"
        },
        {
          "frame":2,
          "module":"libxul.so",
          "offset":"0x7f1a74fa24c8",
          "trust":"frame_pointer"
        },
        ...
      ]
    }
  ]
}
Attachment #8768807 - Flags: feedback?(ted)
Attachment #8768807 - Flags: feedback?(benjamin)
Is there specific feedback you want? It's unlikely that I'll be able to look at this in general until Monday.
Flags: needinfo?(gsvelto)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #11)
> Is there specific feedback you want? It's unlikely that I'll be able to look
> at this in general until Monday.

Nothing specific, I just wanted to make sure that it's OK to send the data I'm gathering via telemetry and that the overall method I'm using is sensible for achieving our design goals.
Flags: needinfo?(gsvelto)
Attachment #8768807 - Flags: feedback?(benjamin)
Updated patch, much uglier than before but including the changes described in bug 1280470 comment 14. In a nutshell instead of processing the minidump files in the crash reporter client we now have a dedicated executable which analyzes the minidump before the crash reporter is launched.

This currently has a lot of limitations:

- It's working only on Linux and I might be doing something wrong there anyway

- It most likely doesn't event compile on Windows and Mac, let alone run there

- The code I've added is pretty clunky and badly needs refactoring into something sensible (in my defense I must say that the code in nsExceptionHandler.cpp is remarkably clunky already)
Attachment #8768807 - Attachment is obsolete: true
Attachment #8768807 - Flags: feedback?(ted)
This is an updated patch that works correctly on Linux, builds on Mac and Android and doesn't on Windows (but that's breakpad's fault, we need bug 1264367 for Windows to build correctly).

So what this patch does is:

- Introduce a new program (minidump-analyzer) that will generate the stack traces from the minidump and stuff them into the crash event file

- Refactor the code in nsExceptionHandler.cpp so that launching a program is factored out of the minidump callback

- Change the minidump callback to call the analyzer before the crashreporter. My patch is currently waiting for the analyzer to finish before launching the crashreporter. We might want to put a timeout there so that it doesn't accidentally get stuck if for some reason the analyzer takes too long.

Some limitations of this patch:

- We don't support Android yet, I think I'll do that in a separate bug because crash handling is slightly different there.

- I haven't tested on Mac and Windows, it might work, it might not.

- It's not pref'd out for release/aurora, that should be trivial to add though

Ted, let me know what you think about this. The code is not super-pretty yet but it does everything it needs to do already.
Attachment #8774748 - Attachment is obsolete: true
Attachment #8775184 - Flags: feedback?(ted)
Comment on attachment 8775184 [details] [diff] [review]
[PATCH WIP] Write the stack traces extracted from a crash dump into the crash.main event file

Review of attachment 8775184 [details] [diff] [review]:
-----------------------------------------------------------------

This approach seems OK, although I think maybe it'd be better to run minidump-analyzer from the crashreporter client rather than adding one more thing to do in the crashed process. For content/plugin crashes it should be fine to run it from the chrome process.

I wonder if we'd be happier long-term figuring out how to share the stackwalker from Socorro instead? The only dependency it has outside of Breakpad is jsoncpp (which is liberally licensed). One quirk there is that that stackwalker takes the annotations as input (what we write into the .extra file) but it expects to receive them in JSON format, so we'd have to write our .extra files as JSON (which probably isn't a terrible idea anyway).
Attachment #8775184 - Flags: feedback?(ted)
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #15)
> This approach seems OK, although I think maybe it'd be better to run
> minidump-analyzer from the crashreporter client rather than adding one more
> thing to do in the crashed process.

Right, it's definitely possible but it would require more system-dependent code for locating the executable. I can explore how complex it would be and report back.

> I wonder if we'd be happier long-term figuring out how to share the
> stackwalker from Socorro instead? The only dependency it has outside of
> Breakpad is jsoncpp (which is liberally licensed). One quirk there is that
> that stackwalker takes the annotations as input (what we write into the
> .extra file) but it expects to receive them in JSON format, so we'd have to
> write our .extra files as JSON (which probably isn't a terrible idea anyway).

It's somewhat desirable since we want the same output from both. The downside would be that we'd pull yet another external dependency into gecko and I'm loath to do that.
Yet another iteration this time the minidump analyzer is being run from the crashreporter executable instead of the crashed Firefox process. I've tested it on Linux, Mac and Windows and it works fine in all three cases. I haven't tried non-debug builds yet but that's next on my list of tests. The only gripe I have with the patch is that there's quite a bit of code duplication between the crashreporter and minidump analyzer. I'll try to cut down on that a bit but it will require further work too; for example to put the shared functions into a small library used by both.
Attachment #8775184 - Attachment is obsolete: true
Whoops, wrong patch. This is more like it.
Attachment #8776884 - Attachment is obsolete: true
I've finished cleaning up and polishing the patch. I'm not asking for review just yet because I'm waiting for an extensive try-run to finish but in the meantime the code should remain pretty stable since it seems to be working well on Linux, Mac and Windows. The main changes are:

- The minidump-analyzer is called from the crashreporter instead of the crashed Firefox process

- I've removed all the changes to nsExceptionHandler.cpp to make the review leaner, since they're mostly cleanups I'll post them in a follow up

- The minidump-analyzer is properly packaged in the Mac version

The only issue I'm running into is that under Windows a window pops up showing breakpad's log and then closes automatically when it's done. I don't know how to fix that unless I fix bug 1289784 first
Attachment #8776934 - Attachment is obsolete: true
Attachment #8776959 - Flags: feedback?(ted)
The try run is looking good; there doesn't seem to be any tests breaking due to this:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5542a6ea078b

I should also find a way to test it though I'm not sure how yet. I have to investigate how we test the crash reporter and maybe expand on that.
I've looked into the tests and I can't seem to find any testing the crashreporter client proper. Down the road we'll have to think about something though.
My previous try run turned out green and there weren't any broken tests so this is good for review. I did only one small build-system change compared to the previous patch: the minidump-analyzer is built only for nightly builds so on aurora/beta/release this feature will be effectively disabled which is our first goal. This is the smallest possible change to disable this; what it will cause is that on those builds the crash reporter will try to run the minidump-analyzer, won't find it and just move along with the usual crash reporting process.
Attachment #8776959 - Attachment is obsolete: true
Attachment #8776959 - Flags: feedback?(ted)
Attachment #8777727 - Flags: review?(ted)
Changing the title to make the scope clearer since I'm about to open bugs for the other cases we've discussed.
Summary: Store stack traces into the crash.main event file for each crash → Store stack traces into the crash.main event file for each browser crash
Blocks: 1293656
I've only done formatting changes and modified the documentation to reflect the changes to the crash reporting workflow. The code is exactly the same as the previous patch.
Attachment #8777727 - Attachment is obsolete: true
Attachment #8777727 - Flags: review?(ted)
Attachment #8781283 - Flags: review?(ted)
Comment on attachment 8781283 [details] [diff] [review]
[PATCH v3] Write the stack traces extracted from a crash dump into the crash.main event file

Clearing the review flag because of the changes requested in bug 1280484 comment 13. These should be pretty small and shouldn't affect the overall structure of the patch.
Attachment #8781283 - Flags: review?(ted)
Blocks: 1296274
Updated patch, this does a couple of things:

- Remove the frame_count, thread_count and frame fields (they are redundant)

- Make the module field in each frame an index in the module array rather than a string

The rest is unchanged.
Attachment #8781283 - Attachment is obsolete: true
Attachment #8782634 - Flags: review?(ted)
Comment on attachment 8782634 [details] [diff] [review]
[PATCH v4] Write the stack traces extracted from a crash dump into the crash.main event file

Review of attachment 8782634 [details] [diff] [review]:
-----------------------------------------------------------------

This patch is generally fine, but the manual JSON generation in the analyzer makes me very uncomfortable, so the r- is for that. Could you pull in jsoncpp to do that? It's pretty tiny:
https://github.com/open-source-parsers/jsoncpp

That's what the Socorro stackwalker uses:
https://github.com/mozilla/socorro/blob/master/minidump-stackwalk/stackwalker.cc

Longer-term I'd like to figure out a way to share the Socorro stackwalker so we're not maintaining two different versions of this code, and also so we can get the same set of extra annotations out from client-side and server-side stackwalking.

Aside from the JSON bit I just have some small comments, and I'd really like to avoid adding a new Makefile.in since we're trying to get rid of them.

::: toolkit/crashreporter/client/crashreporter.cpp
@@ +492,5 @@
>    return UIFileExists(reportPath);
>  }
>  
> +static string
> +GetMinidumpAnalyzerPath( void )

nit: You don't need the `void`.

::: toolkit/crashreporter/client/crashreporter.h
@@ +26,5 @@
>  #include <windows.h>
>  
>  #define UI_SNPRINTF _snprintf
>  #define UI_DIR_SEPARATOR "\\"
> +#define UI_CRASH_REPORTER_FILENAME "crashreporter.exe"

You could instead stick BIN_SUFFIX in DEFINES in moz.build, like:
https://dxr.mozilla.org/mozilla-central/rev/24763f58772d45279a935790f732d80851924b46/dom/ipc/moz.build#166

::: toolkit/crashreporter/client/crashreporter_win.cpp
@@ +1549,5 @@
> +{
> +  wstring cmdLine;
> +
> +  cmdLine += L"\"" + UTF8ToWide(exename) + L"\" ";
> +  cmdLine += L"\"" + UTF8ToWide(filename) + L"\" ";

Ugh, I hate Windows' CreateProcess API.

@@ +1554,5 @@
> +
> +  STARTUPINFO si;
> +  PROCESS_INFORMATION pi;
> +
> +  ZeroMemory(&si, sizeof(si));

I don't think it's in our style guide, but I prefer zero-initializing structs by just writing `T foo = {};`.

::: toolkit/crashreporter/minidump-analyzer/Makefile.in
@@ +8,5 @@
> +ifeq ($(OS_ARCH),Darwin)
> +libs::
> +	$(NSINSTALL) -D $(DIST)/bin/crashreporter.app
> +	$(NSINSTALL) -D $(DIST)/bin/crashreporter.app/Contents/MacOS
> +	$(NSINSTALL) $(DIST)/bin/minidump-analyzer $(DIST)/bin/crashreporter.app/Contents/MacOS

I think for this case you can get away with just putting (in moz.build):
```
if CONFIG['OS_TARGET'] == 'Darwin':
  DIST_SUBDIR = 'crashreporter.app/Contents/MacOS'
```

::: toolkit/crashreporter/minidump-analyzer/minidump.cpp
@@ +1,1 @@
> +/* -*- Mode: C++; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */

nit: I would name this file the same as the resulting binary, `minidump-analyzer.cpp`.

@@ +21,5 @@
> +#if defined(XP_WIN32)
> +
> +#include <windows.h>
> +
> +#define DIR_SEPARATOR "\\"

Does Breakpad not have useful code for this? That's a bummer.

@@ +126,5 @@
> +}
> +
> +#endif
> +
> +static string gEventsPath;

Does this need to be global? You're only using it in one function, just pass it as a parameter?

@@ +136,5 @@
> +  }
> +};
> +
> +// We use this set to get the index of a module when listed by address
> +static set<const CodeModule*, ModuleCompare> gModules;

I wonder if this would be more straightforward to use as a `map<const CodeModule*, unsigned int>`, where you just stored the modules directly?

@@ +174,5 @@
> +      moduleIndex << distance(gModules.begin(), itr);
> +    }
> +  }
> +
> +  aStream << "{\"module\":\"" << moduleIndex.str()

Is there a reason you're writing the index as a string? Seems like it'd be better to just leave the "module" field out when a frame isn't in a known module. That's what the Socorro stackwalker does:
https://github.com/mozilla/socorro/blob/f6a02e45015b4d502d31c2fb20d34d7941071200/minidump-stackwalk/stackwalker.cc#L510
Attachment #8782634 - Flags: review?(ted) → review-
(In reply to Gabriele Svelto [:gsvelto] from comment #21)
> I've looked into the tests and I can't seem to find any testing the
> crashreporter client proper. Down the road we'll have to think about
> something though.

Yeah, it's sort of a hard thing to test because we'd have to automate native UI. We've talked about adding an environment variable to allow the client to auto-submit which would at least let us test that code path.

You should be able to write some simple standalone tests for the analyzer, at least. We've got xpcshell tests that crash an xpcshell subprocess and give you back the path to the minidump as well as the annotations, so you can write a test that crashes, runs minidump-analyzer, and does some simple sanity checks on the output. This test is a decent example:
https://dxr.mozilla.org/mozilla-central/rev/24763f58772d45279a935790f732d80851924b46/toolkit/crashreporter/test/unit/test_crashreporter_crash.js#17
Thanks a lot for the thorough review Ted, I'll address your comments as soon as I'm done with the Breakpad integration. Just a few comments inline:

(In reply to Ted Mielczarek [:ted.mielczarek] from comment #27)
> This patch is generally fine, but the manual JSON generation in the analyzer
> makes me very uncomfortable, so the r- is for that. Could you pull in
> jsoncpp to do that? It's pretty tiny:
> https://github.com/open-source-parsers/jsoncpp

I thought about that but while very small it would yet another external project pulled into the gecko sources and I'm loath to do that. What if I write just a handful of helpers to make the code terser and easier to read? If you still think that jsoncpp is our best option I'll pull it in but I don't know how to handle this particular scenario; I suppose that it'll need to be at very least vetted by legal because of the license, etc...

> That's what the Socorro stackwalker uses:
> https://github.com/mozilla/socorro/blob/master/minidump-stackwalk/
> stackwalker.cc
> 
> Longer-term I'd like to figure out a way to share the Socorro stackwalker so
> we're not maintaining two different versions of this code, and also so we
> can get the same set of extra annotations out from client-side and
> server-side stackwalking.

That was my idea too but in bug 1280484 Benjamin asked for changes in the output of the minidump analyzer that makes it fundamentally different than Socorro's output (e.g. within a frame the owning module is referenced by index instead of by name). As long as they produce different output we won't be able to merge them.

> Aside from the JSON bit I just have some small comments, and I'd really like
> to avoid adding a new Makefile.in since we're trying to get rid of them.

Good point, I'll try to get rid of it with your suggestion.

> ::: toolkit/crashreporter/client/crashreporter.h
> @@ +26,5 @@
> >  #include <windows.h>
> >  
> >  #define UI_SNPRINTF _snprintf
> >  #define UI_DIR_SEPARATOR "\\"
> > +#define UI_CRASH_REPORTER_FILENAME "crashreporter.exe"
> 
> You could instead stick BIN_SUFFIX in DEFINES in moz.build, like:
> https://dxr.mozilla.org/mozilla-central/rev/
> 24763f58772d45279a935790f732d80851924b46/dom/ipc/moz.build#166

Thanks for the tip! Will do.

> ::: toolkit/crashreporter/client/crashreporter_win.cpp
> @@ +1549,5 @@
> > +{
> > +  wstring cmdLine;
> > +
> > +  cmdLine += L"\"" + UTF8ToWide(exename) + L"\" ";
> > +  cmdLine += L"\"" + UTF8ToWide(filename) + L"\" ";
> 
> Ugh, I hate Windows' CreateProcess API.

Yeah, I've also noticed that this code is duplicated in RestartApplication(), I'll factor it out in a single function.

> @@ +21,5 @@
> > +#if defined(XP_WIN32)
> > +
> > +#include <windows.h>
> > +
> > +#define DIR_SEPARATOR "\\"
> 
> Does Breakpad not have useful code for this? That's a bummer.

I hadn't thought about that. I think it should; I'll have a look.

> @@ +126,5 @@
> > +}
> > +
> > +#endif
> > +
> > +static string gEventsPath;
> 
> Does this need to be global? You're only using it in one function, just pass
> it as a parameter?

Yeah, it's a leftover.

> @@ +136,5 @@
> > +  }
> > +};
> > +
> > +// We use this set to get the index of a module when listed by address
> > +static set<const CodeModule*, ModuleCompare> gModules;
> 
> I wonder if this would be more straightforward to use as a `map<const
> CodeModule*, unsigned int>`, where you just stored the modules directly?

Yes it would, I feel kind of silly for having done it using distance() :|

> Is there a reason you're writing the index as a string? Seems like it'd be
> better to just leave the "module" field out when a frame isn't in a known
> module. That's what the Socorro stackwalker does:
> https://github.com/mozilla/socorro/blob/
> f6a02e45015b4d502d31c2fb20d34d7941071200/minidump-stackwalk/stackwalker.
> cc#L510

OK, I'll leave out the module entirely when it's not available then.

(In reply to Ted Mielczarek [:ted.mielczarek] from comment #28)
> You should be able to write some simple standalone tests for the analyzer,
> at least. We've got xpcshell tests that crash an xpcshell subprocess and
> give you back the path to the minidump as well as the annotations, so you
> can write a test that crashes, runs minidump-analyzer, and does some simple
> sanity checks on the output. This test is a decent example:
> https://dxr.mozilla.org/mozilla-central/rev/
> 24763f58772d45279a935790f732d80851924b46/toolkit/crashreporter/test/unit/
> test_crashreporter_crash.js#17

Good point, I'll open bugs for that too.
(In reply to Gabriele Svelto [:gsvelto] from comment #29)
> I thought about that but while very small it would yet another external
> project pulled into the gecko sources and I'm loath to do that. What if I

While I understand the pain, I still think it's worthwhile to make the code more readable and future-proof ourselves against serialization bugs.

> write just a handful of helpers to make the code terser and easier to read?
> If you still think that jsoncpp is our best option I'll pull it in but I
> don't know how to handle this particular scenario; I suppose that it'll need
> to be at very least vetted by legal because of the license, etc...

Nah, it's open-source and with a permissive license: "JsonCpp is licensed under the MIT license, or public domain if desired and recognized in your jurisdiction", we just might want to list it in license.html:
https://dxr.mozilla.org/mozilla-central/source/toolkit/content/license.html

We can probably skip that if we choose to use it under a PD license, but you could ask Gerv if you're not sure.

> That was my idea too but in bug 1280484 Benjamin asked for changes in the
> output of the minidump analyzer that makes it fundamentally different than
> Socorro's output (e.g. within a frame the owning module is referenced by
> index instead of by name). As long as they produce different output we won't
> be able to merge them.

I think we should at least figure out a path that gets us to using the same code. We could rename one of the module fields, either rename socorro's to "module_name", or rename this one to "module_index", so that you could have one or the other (or both) present.
This is a patch that imports version 1.7.5 of jsoncpp into the tree and builds it as a static library. As with breakpad all the non-necessary bits have been removed (non-moz.build build system files, test files, etc...). License, authors, and similar files have been left alone. There's a few things that still need to be done:

- I'm working on a script akin to update-breakpad.sh to easily maintain these sources

- I've tested this on Linux, so I need to test it on other platforms too

- I'm obsoleting the main patch of this bug because I need to update it to pick up this stuff
Attachment #8782634 - Attachment is obsolete: true
Here's the main patch which includes the new jsoncpp-based output (I've tried to be as close as possible to socorro's stack walker when writing it), addresses all the review comments and includes the changes requested by Benjamin in bug 1280484 (namely "module" is replaced with "module_index" and is an index within the module array and "offset" is renamed "ip").
Revised patch with an additional script used to update jsoncpp based on update-breakpad.sh
Attachment #8788170 - Attachment is obsolete: true
This stuff is almost ready but I'm not asking for review just yet because jsoncpp code doesn't seem to pass our static analysis so I'll need to work around that first. I'm currently waiting on my static-analysis enable build to finish :-(
Updated patch with a few additions:

- An out-of-tree patch was added to fix the clang static analysis issues. Unfortunately this is not the kind of stuff we'll be able to land easily upstream so we'll have to deal with it this way for now.

- A script was added to automatically update the clone, this is essentially a modified version of udpate-breakpad.sh
Attachment #8788872 - Attachment is obsolete: true
Attachment #8789356 - Flags: review?(ted)
Comment on attachment 8788855 [details] [diff] [review]
[PATCH  2/2 v2] Write the stack traces extracted from a crash dump into the crash.main event file

Ready for review, see comment 32.
Attachment #8788855 - Flags: review?(ted)
Comment on attachment 8789356 [details] [diff] [review]
[PATCH 1/2 v3] Import jsoncpp into the source tree and integrate it into the build

Review of attachment 8789356 [details] [diff] [review]:
-----------------------------------------------------------------

I had thought we'd use the amalgamated source file just to minimize the number of files we need in-tree, but it's not particularly important:
https://github.com/open-source-parsers/jsoncpp#generating-amalgamated-source-and-header

::: toolkit/crashreporter/jsoncpp-patches/0001-Fix-various-issues-with-clang-s-static-analysis.patch
@@ +1,4 @@
> +From 9559272ad329f5009e4a8640c1d6243602165257 Mon Sep 17 00:00:00 2001
> +From: Gabriele Svelto <gsvelto@mozilla.com>
> +Date: Thu, 8 Sep 2016 10:58:30 +0200
> +Subject: [PATCH] Fix various issues with clang's static analysis

Is there not a way to disable static analysis checks for a particular file or directory? Obviously those `MOZ_IMPLICIT` changes are not upstreamable, which is unfortunate.

If there is a way I'd obviously rather use that. It doesn't really make sense to try to enforce our coding standards (no explicit constructors) onto third-party projects. If there isn't can you file a follow-up bug about adding one? I'm actually surprised we haven't hit this sort of situation in other places.
Attachment #8789356 - Flags: review?(ted) → review+
Comment on attachment 8788855 [details] [diff] [review]
[PATCH  2/2 v2] Write the stack traces extracted from a crash dump into the crash.main event file

Review of attachment 8788855 [details] [diff] [review]:
-----------------------------------------------------------------

I'd like to get a bug on file to get us to a place where we're sharing the same stackwalker code between Socorro and this feature, even if we have to add some options to meet these requirements. I think that will save us a lot of headaches in the long term. Longer-term I'd love for that to wind up being built on my `rust-minidump` crate, but that needs a bit more work to be usable there. All these silly helper functions that you have to write because you're in C++ make me terribly sad!

::: toolkit/crashreporter/docs/index.rst
@@ +87,5 @@
>  argument.
>  
>  The *crash reporter client* performs a number of roles. There's a lot going
>  on, so you may want to look at ``main()`` in ``crashreporter.cpp``. First,
> +stack traces are extracted from the dump via the *minidump analyzer* tool.

It would be nice to mention what the minidump analyzer *does* with those stack traces.

::: toolkit/crashreporter/minidump-analyzer/minidump-analyzer.cpp
@@ +141,5 @@
> +  }
> +};
> +
> +// We use this map to get the index of a module when listed by address
> +static map<const CodeModule*, unsigned int, ModuleCompare> gModules;

I would have a slight preference to having this be a local variable in `ConvertProcessStateToJSON` and then passing it in as a parameter to the functions that need it. Not critical, but globals make the code harder to reason about.

@@ +228,5 @@
> +      }
> +    }
> +
> +    frameNode["trust"] = FrameTrust(frame->trust);
> +    // The 'ip' field is equivalent to socorro's 'offset' field

In hindsight 'offset' was kind of a terrible name for this field. (It was intended to distinguish it from module_offset / function_offset.)

@@ +355,5 @@
> +
> +static bool
> +ProcessMinidump(Json::Value& aRoot) {
> +  BasicSourceLineResolver resolver;
> +  MinidumpProcessor minidumpProcessor(nullptr, &resolver);

You might want to note that the `nullptr` here is because we don't have a symbol provider.
Attachment #8788855 - Flags: review?(ted) → review+
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #37)
> I had thought we'd use the amalgamated source file just to minimize the
> number of files we need in-tree, but it's not particularly important:
> https://github.com/open-source-parsers/jsoncpp#generating-amalgamated-source-
> and-header

I missed that, we can use it later maybe.

> Is there not a way to disable static analysis checks for a particular file
> or directory? Obviously those `MOZ_IMPLICIT` changes are not upstreamable,
> which is unfortunate.

I've dug into this further and it turns out that there is: modifying the clang plugin to explicitly whitelist namespaces & files. I thought we'd have a moz.build option and when I couldn't find it I gave up. I'll modify the patch to use that whitelisting mechanism instead.

(In reply to Ted Mielczarek [:ted.mielczarek] from comment #38)
> I'd like to get a bug on file to get us to a place where we're sharing the
> same stackwalker code between Socorro and this feature, even if we have to
> add some options to meet these requirements. I think that will save us a lot
> of headaches in the long term.

Yes, I'll do that. It's just a matter of adding an option to exclude certain things and rename a couple of others when outputting in one format of the other, it should be really trivial to make this shared.

> Longer-term I'd love for that to wind up
> being built on my `rust-minidump` crate, but that needs a bit more work to
> be usable there. All these silly helper functions that you have to write
> because you're in C++ make me terribly sad!

Yes, we've talked with David and Benjamin about it and the long-term goal is to port all this stuff into Rust. In general the crash reporter client code is old, crufty and very system-specific. It could use a Rust rewrite to get rid of most of the cruft and make it more solid and easier to maintain.

> It would be nice to mention what the minidump analyzer *does* with those
> stack traces.

Good point, I'll add that.

> I would have a slight preference to having this be a local variable in
> `ConvertProcessStateToJSON` and then passing it in as a parameter to the
> functions that need it. Not critical, but globals make the code harder to
> reason about.

OK, I can do that too.

> In hindsight 'offset' was kind of a terrible name for this field. (It was
> intended to distinguish it from module_offset / function_offset.)

Yeah, I just picked it up from Socorro but this makes more sense.

> You might want to note that the `nullptr` here is because we don't have a
> symbol provider.

Will do.
Blocks: 1303077
Updated patch with local changes removed and the necessary adjustments to the clang plugin. Carrying over the r+ flag.
Attachment #8789356 - Attachment is obsolete: true
Attachment #8791763 - Flags: review+
Updated patch with review comments addressed, carrying over the r+ flag.
Attachment #8788855 - Attachment is obsolete: true
Attachment #8791764 - Flags: review+
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d52564553d5e
Import jsoncpp into the source tree and integrate it into the build r=ted
https://hg.mozilla.org/integration/mozilla-inbound/rev/17fe7328e60a
Write the stack traces extracted from a crash dump into the crash.main event file r=ted
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/d55d2a73d11233578b47c03e4a07922f354cd9bc for depending on one or both of the other things of yours I backed out.
Blocks: 1307153
I've had the time to revise this patch a little more and I've changed it slightly. The logic is the same but instead of having the analyzer append the stack traces to the main crash event file; it appends them to the .extra file that comes with the minidump.

The rationale is that we'll need this for analyzing content crashes too and those don't have a matching event files, but they do have an extra file so this works in both cases. The crashreporter client already has all the bits to handle both the extra and event files so the changes were minimal there.

An additional upside of this change is that the patch is smaller since I could throw away a nice amount of duplicate code from the minidump analyzer.

Asking for review again since the change is non-trivial.
Attachment #8791764 - Attachment is obsolete: true
Attachment #8798412 - Flags: review?(ted)
Attachment #8798412 - Attachment description: [PATCH 2/2 v4] Write the stack traces extracted from a crash dump into the crash.main event file r=ted → [PATCH 2/2 v4] Write the stack traces extracted from a crash dump into the crash.main event file
Comment on attachment 8798412 [details] [diff] [review]
[PATCH 2/2 v4] Write the stack traces extracted from a crash dump into the crash.main event file

Review of attachment 8798412 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, thanks!

::: toolkit/crashreporter/docs/index.rst
@@ +89,5 @@
>  The *crash reporter client* performs a number of roles. There's a lot going
>  on, so you may want to look at ``main()`` in ``crashreporter.cpp``. First,
> +stack traces are extracted from the dump via the *minidump analyzer* tool.
> +The resulting traces are appended to the main crash event file. Then, the
> +*crash reporter client* verifies that the dump data is sane. If it isn't

This isn't quite correct as of this patch, is it?
Attachment #8798412 - Flags: review?(ted) → review+
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #46)
> ::: toolkit/crashreporter/docs/index.rst
> @@ +89,5 @@
> >  The *crash reporter client* performs a number of roles. There's a lot going
> >  on, so you may want to look at ``main()`` in ``crashreporter.cpp``. First,
> > +stack traces are extracted from the dump via the *minidump analyzer* tool.
> > +The resulting traces are appended to the main crash event file. Then, the
> > +*crash reporter client* verifies that the dump data is sane. If it isn't
> 
> This isn't quite correct as of this patch, is it?

Right, nice catch, I'll adjust it before landing.
This is ready to land, here's a green try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0a645c8b8b46

I've addressed the inconsistencies in the documentation and explicitly added an extra provision in the CrashManager do delete the StackTraces entry in the crash ping metadata. I'll add it explicitly in bug 1280484; in the meantime I don't want spurious data to be sent to our servers.
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/59b0d8cc0303
Import jsoncpp into the source tree and integrate it into the build r=ted
https://hg.mozilla.org/integration/mozilla-inbound/rev/825173ada166
Write the stack traces extracted from a crash dump into the crash.main event file r=ted
No longer blocks: 1303077
https://hg.mozilla.org/mozilla-central/rev/59b0d8cc0303
https://hg.mozilla.org/mozilla-central/rev/825173ada166
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Blocks: 1317968
Depends on: 1322983
Blocks: 1328657
Blocks: 1345654
No longer blocks: 1307153
Whiteboard: [fce-active] → [fce-active-legacy]
You need to log in before you can comment on or make changes to this bug.