Use hg.mozilla.org to map crashes to bug components by way of source files when possible

Status: NEW
Product: Socorro
Component: Backend
Opened 2 years ago, updated 3 months ago

People

(Reporter: ted, Assigned: kanru)

(Reporter)

Description

2 years ago
In a dev.platform thread a few people expressed that they'd like to have a feed of crashes in the area of the codebase they work on. I think it's feasible to implement this nowadays, since we have bug components for many files in the source tree in moz.build files. For example:
https://dxr.mozilla.org/mozilla-central/rev/4d63dde701b47b8661ab7990f197b6b60e543839/dom/media/moz.build#7

We also have a service on hg.mo for reading this metadata for any file at any revision in the repository:
http://mozilla-version-control-tools.readthedocs.io/en/latest/hgmo/mozbuildinfo.html

Unfortunately this is currently broken (bug 1263973), so that will need to be fixed first.

Armed with that, I think what we should do is take the source file from the last stack frame that was used to build the signature, and if it starts with hg:hg.mozilla.org, use that to build a query against hg.mo for the bug component. If we get a result, we should store that in the processed crash. If we allow querying based on that data then people should be able to get lists of crashes by bug component easily. Since the data is maintained in the tree it can easily be updated by developers.

As a concrete example, this crash:
https://crash-stats.mozilla.com/report/index/a8ccd6bc-209c-4d60-a07d-7c76c2160526

has its signature generated from a single frame, frame 0, which has:
"file": "hg:hg.mozilla.org/mozilla-central:dom/media/MediaFormatReader.cpp:829d3be6ba64",

so we could build the query:
https://hg.mozilla.org/mozilla-central/json-mozbuildinfo/829d3be6ba64?p=dom/media/MediaFormatReader.cpp
(basically '{repo}/json-mozbuildinfo/{rev}?p={file}' like we do to build the source links in report/index).

When that web service is working again that ought to return something like:
```
{
  "files": {
    "dom/media/MediaFormatReader.cpp": {
      "bug_component": [
        "Core",
        "Video/Audio"
      ]
    }
  }
}
```
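The URL construction and response handling described above can be sketched roughly as follows. This is only an illustration, not Socorro code: the function names `mozbuildinfo_url` and `bug_component` are hypothetical, and the sketch assumes the frame's "file" value always has the four colon-separated fields seen in the example.

```python
from urllib.parse import quote


def mozbuildinfo_url(source_file):
    """Build a json-mozbuildinfo URL from a frame's "file" value, or return None."""
    # Assumed shape: hg:<host>/<repo>:<path>:<revision>
    parts = source_file.split(":")
    if len(parts) != 4 or parts[0] != "hg":
        return None
    _, repo, path, rev = parts
    if not repo.startswith("hg.mozilla.org/"):
        return None
    # Same pattern as in the description: {repo}/json-mozbuildinfo/{rev}?p={file}
    return "https://{}/json-mozbuildinfo/{}?p={}".format(repo, rev, quote(path))


def bug_component(response, path):
    """Return (product, component) from a parsed response, or None if absent."""
    info = response.get("files", {}).get(path, {})
    component = info.get("bug_component")
    if not component or len(component) != 2:
        return None
    return tuple(component)


frame_file = "hg:hg.mozilla.org/mozilla-central:dom/media/MediaFormatReader.cpp:829d3be6ba64"
url = mozbuildinfo_url(frame_file)

# Parsed form of the example response shown above.
response = {
    "files": {
        "dom/media/MediaFormatReader.cpp": {"bug_component": ["Core", "Video/Audio"]}
    }
}
component = bug_component(response, "dom/media/MediaFormatReader.cpp")
```

Returning None for anything that does not look like an hg.mozilla.org source file keeps the processor from issuing queries for frames without usable source information.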
(Assignee)

Comment 1

a year ago
At last week's uptime meeting I mentioned that a per-component crash rate dashboard would be useful. I initially thought we would need to query Bugzilla for the triaged component, but then I found this bug. I might take a stab at this once bug 1263973 is fixed.

Comment 2

As a workaround for bug 1263973, you could use the data from https://wiki.mozilla.org/Modules/All.

There's a module in the libmozdata library that makes this data easy to use (https://github.com/mozilla/libmozdata/blob/master/libmozdata/modules.py).

Example usage:
```
from libmozdata import modules
modules.module_from_path('dom/indexedDB/IDBDatabase.cpp')
```

The output is an object:
```
{u'ownersEmeritus': [], u'name': u'IndexedDB', u'peers': [{u'name': u'Jonas Sicking', u'email': u'jonas@sicking.cc'}, {u'name': u'Kyle Huey', u'email': u'me@kylehuey.com'}, {u'name': u'Jan Varga', u'email': u'jvarga@mozilla.com'}], u'discussionGroup': u'http://www.mozilla.org/community/forums/#dev-platform', u'peersEmeritus': [], u'urls': [{u'directory': u'https://developer.mozilla.org/en/IndexedDB'}], u'owners': [{u'name': u'Ben Turner', u'email': u'bent@mozilla.com'}], u'bugzillaComponents': [u'Core::DOM: IndexedDB'], u'sourceDirs': [u'dom/indexedDB/'], u'description': u''}
```

Of course it isn't perfect, but better than nothing.
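
To get from that output to the product/component pair that the json-mozbuildinfo approach would yield, the `bugzillaComponents` strings would need splitting. A minimal sketch, assuming the `Product::Component` form seen in the output above; `split_bugzilla_component` is a hypothetical helper, not part of libmozdata, and the dict below is a trimmed literal copy of the output rather than a live call.

```python
def split_bugzilla_component(value):
    """Split 'Product::Component' into (product, component), or return None."""
    # Split on the first '::' only, since component names can contain single colons.
    product, sep, component = value.partition("::")
    if not sep:
        return None
    return product.strip(), component.strip()


# Trimmed copy of the module_from_path() output shown above.
module = {"name": "IndexedDB", "bugzillaComponents": ["Core::DOM: IndexedDB"]}
components = [split_bugzilla_component(c) for c in module["bugzillaComponents"]]
```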
(Assignee)

Comment 3

a year ago
(In reply to Marco Castelluccio [:marco] from comment #2)
> As a workaround for bug 1263973, you could use the data from
> https://wiki.mozilla.org/Modules/All.
> 
> There's a module in the libmozdata library that makes this data easy to use
> (https://github.com/mozilla/libmozdata/blob/master/libmozdata/modules.py).
> 
> Example usage:
> from libmozdata import modules
> modules.module_from_path('dom/indexedDB/IDBDatabase.cpp')
> 
> The output is an object:
> {u'ownersEmeritus': [], u'name': u'IndexedDB', u'peers': [{u'name': u'Jonas
> Sicking', u'email': u'jonas@sicking.cc'}, {u'name': u'Kyle Huey', u'email':
> u'me@kylehuey.com'}, {u'name': u'Jan Varga', u'email':
> u'jvarga@mozilla.com'}], u'discussionGroup':
> u'http://www.mozilla.org/community/forums/#dev-platform', u'peersEmeritus':
> [], u'urls': [{u'directory':
> u'https://developer.mozilla.org/en/IndexedDB'}], u'owners': [{u'name': u'Ben
> Turner', u'email': u'bent@mozilla.com'}], u'bugzillaComponents':
> [u'Core::DOM: IndexedDB'], u'sourceDirs': [u'dom/indexedDB/'],
> u'description': u''}
> 
> Of course it isn't perfect, but better than nothing.

It looks like the modules module uses a static copy of the wiki data in the file modules.json. Maybe we want to do that anyway, since querying hg.mozilla.org while processing each crash report may be too expensive?

Maybe we could generate a static copy of the json-mozbuildinfo data for each official build, so that Socorro only needs to query that.
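
Short of a full static copy, the cost concern could also be eased by caching lookups in the processor, since many crashes share the same (repo, revision, path). A rough sketch, assuming some `fetch(repo, rev, path)` callable already exists that performs the hg.mozilla.org query; both `fetch` and `make_cached_lookup` are hypothetical names.

```python
from functools import lru_cache


def make_cached_lookup(fetch, maxsize=4096):
    """Wrap fetch(repo, rev, path) so repeated lookups are answered from memory."""

    @lru_cache(maxsize=maxsize)
    def lookup(repo, rev, path):
        return fetch(repo, rev, path)

    return lookup


# Stand-in for the real network call, recording how often it is invoked.
calls = []


def fake_fetch(repo, rev, path):
    calls.append((repo, rev, path))
    return ("Core", "Video/Audio")


lookup = make_cached_lookup(fake_fetch)
first = lookup("mozilla-central", "829d3be6ba64", "dom/media/MediaFormatReader.cpp")
second = lookup("mozilla-central", "829d3be6ba64", "dom/media/MediaFormatReader.cpp")
```

Because the key includes the revision, cached results never go stale; the trade-off is that each new build starts with a cold cache.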

Comment 4

(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #3)
> Looks like the modules module uses a static copy of the wiki in file
> modules.json. Maybe we want to do that anyway since querying hg.mozilla.org
> when processing each crash reports maybe too expensive?
> 
> Maybe we could generate a static copy of json-mozbuildinfo info for each
> official build and socorro only need to query that.

Yes, that's what I did at the time because most moz.build files did not
contain any info about the component. I don't know if the situation is
different now.
(Reporter)

Comment 5

a year ago
On the plus side, using the data from the moz.build files means that you can get developers to annotate things properly in order to help themselves get better crash reporting :)
(Assignee)

Updated

a year ago
Assignee: nobody → kchen
(Assignee)

Updated

a year ago
Depends on: 1299747
(Assignee)

Updated

a year ago
No longer depends on: 1299747

Comment 6

I keep poking at bug 1263973 (the bug that has json-mozbuildinfo broken) every few months and don't have much to show for it.

All that HTTP API is doing is essentially invoking `hg mozbuildinfo` from a sandbox. `hg mozbuildinfo` is implemented at https://hg.mozilla.org/hgcustom/version-control-tools/file/407adc612136/hgext/hgmo/__init__.py#l554. And that command is essentially a JSON wrapper around https://hg.mozilla.org/hgcustom/version-control-tools/file/407adc612136/pylib/mozhg/mozhg/mozbuildinfo.py.

If you wanted to, you could make a clone of the repo anywhere and invoke this functionality. You do want to sandbox execution since the whole thing is essentially arbitrary code execution.

Comment 7

The json-mozbuildinfo endpoint is now working again. e.g. https://hg.mozilla.org/mozilla-central/json-mozbuildinfo/829d3be6ba64?p=dom/media/MediaFormatReader.cpp

If you find any issues with it, please open new bugs against Developer Services :: hg.mozilla.org.

Comment 8

I should mention that in some cases you may want to query for the metadata for the latest version of a file. In that case, you can plug a symbolic revision name into the URL. e.g. https://hg.mozilla.org/mozilla-central/json-mozbuildinfo/default?p=dom/media/MediaFormatReader.cpp

This will ask for metadata from the "default" branch head, which should also be equivalent to the "tip" revision on mozilla-central.

Comment 9

Depends on: 1337806
I'm fixing the dependent bug. I am moving at a good rate through the source tree, adding BUG_COMPONENTS to the moz.build files; more help with this would be useful :)
Depends on: 1328351
No longer depends on: 1337806
(Assignee)

Comment 10

a year ago
(In reply to Joel Maher ( :jmaher) from comment #9)
> I fixing the dependent bug, I am moving at a good rate through the source
> tree of adding BUG_COMPONENTS to the moz.build files, more work would be
> helpful :)

Great! I was about to do the same, but you beat me to it ;)
(Reporter)

Comment 11

3 months ago
gps did some work recently to make the build spit out a JSON file with this info:
https://groups.google.com/forum/#!topic/mozilla.dev.platform/l8DaPwjOMqA

It'd probably be simpler to ingest that data instead. The nicest thing would be to pull in the data that matches a build when we find new builds (I assume there's some replacement for ftpscraper nowadays?), but as a start we could just have Socorro use the latest version of the data since our source tree layout doesn't change that often.

The latest info from mozilla-central can be fetched via the taskcluster index here:
https://index.taskcluster.net/v1/task/gecko.v2.mozilla-central.latest.source.source-bugzilla-info/artifacts/public/components-normalized.json
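
Looking up a file in that data could be sketched as below. The structure used here is an assumption that should be checked against the real artifact: a "components" map from an ID to a [product, component] pair, and a "paths" tree of nested directories whose leaves are those IDs. The sample data is made up for illustration and `component_for_path` is a hypothetical name.

```python
def component_for_path(data, path):
    """Return (product, component) for a source path, or None if it is not listed."""
    node = data["paths"]
    for part in path.split("/"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    if isinstance(node, dict):
        # The path names a directory, not a file.
        return None
    # JSON object keys are strings, so convert the leaf ID before the lookup.
    return tuple(data["components"][str(node)])


# Illustrative data in the ASSUMED shape, not copied from the real artifact.
sample = {
    "components": {"0": ["Core", "Video/Audio"]},
    "paths": {"dom": {"media": {"MediaFormatReader.cpp": 0}}},
}
result = component_for_path(sample, "dom/media/MediaFormatReader.cpp")
```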