Closed
Bug 634498
Opened 14 years ago
Closed 8 years ago
setting up a data source for a variety of module analysis map reduce jobs.
Categories
(Socorro :: General, task)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: chofmann, Unassigned)
Details
clone of the idea in https://bugzilla.mozilla.org/show_bug.cgi?id=634343#c7
it would be nice if we could run a number of jobs that run analysis over all
the module lists of all crash reports.
See:
- bug 634343 for the need to find mismatched .dll versions that lead to crashes that result from incomplete installs
- dbaron's module correlations would be another (ratio of the presence of a .dll in a 1 day sample of crashes for a signature vs. ratio of its presence in reports across all signatures)
- mapping of all suspected malware/unknown .dlls would be another
- and stuff like in [Bug 634097] Compare beta 10 hardware acceleration usage (the presence of d2d1.dll being loaded) to beta 11 hardware acceleration usage is still another example.
these all would benefit by setting up a map of crash_rpt/module data list pairs, and maybe a few pieces of other crash meta data, so additional map/reduce operations like those listed above could work efficiently.
-chris "maybe read just enough about hadoop to be dangerous" hofmann
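A minimal Python sketch of the kind of module correlation described above, offered as an illustration only. It assumes the data source can hand back two plain dictionaries keyed by report id (one with module name lists, one with signatures); the function name and data shapes are hypothetical and are not Socorro code or dbaron's actual script.

```python
from collections import Counter


def module_correlations(module_lists, signatures, target_signature):
    """Compare module presence within one signature against all crashes.

    module_lists: dict of report_id -> iterable of module names (assumed shape)
    signatures:   dict of report_id -> signature string (assumed shape)
    Returns (module, ratio_in_signature, ratio_overall) tuples, sorted so the
    modules most over-represented in the signature come first.
    """
    in_signature = Counter()
    overall = Counter()
    n_signature = 0
    n_all = 0

    for report_id, modules in module_lists.items():
        unique = set(modules)  # count each module once per report
        n_all += 1
        overall.update(unique)
        if signatures.get(report_id) == target_signature:
            n_signature += 1
            in_signature.update(unique)

    if n_signature == 0:
        return []

    rows = [
        (module, count / n_signature, overall[module] / n_all)
        for module, count in in_signature.items()
    ]
    rows.sort(key=lambda row: row[1] - row[2], reverse=True)
    return rows
```

A module that shows up in most crashes for one signature but in few crashes overall sorts to the top, which is the signal the correlation reports look for.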
Reporter
Comment 1•14 years ago
https://bugzilla.mozilla.org/show_bug.cgi?id=630201#c19 is also similar to the request in bug 634097
Comment 2•14 years ago
Data to e.g. people and then in a later version to the UI?
I feel like we ought to have somewhere better to put all these reports, since I think there are going to be more and more as time goes on, and some of them will be confidential. Jabba, any thoughts?
This is what ted and I are talking about for bug 598098, too.
Comment 3•14 years ago
Is this related to bug 620146 ?
Reporter
Comment 4•14 years ago
bug 620146 is about a place to do a variety of reporting experiments where we still have gaps in understanding crash data and making the best use of the data we have.
I think this is more directly similar to Bug 594777, but for the kind of reporting we need here we would also have to add .dll version info to the output suggested in bug 594777 comment 5.
none of the data that would be in this particular output would be confidential. It's basically an output that makes the existing "module info" for all reports easy to get at, search, and use for sample counts and correlations.
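To make the need for .dll version info concrete, here is a rough Python sketch of a mismatched-version check in the spirit of bug 634343. It assumes the output carries (module name, version) pairs per report and that someone supplies a set of modules expected to ship at the same version; the function name, the data shape, and the example module names in the usage note are all hypothetical.

```python
def find_version_mismatches(module_versions, shipped_together):
    """Flag reports where modules that should share a version do not.

    module_versions:  dict of report_id -> list of (module_name, version)
                      pairs (assumed shape of the proposed output)
    shipped_together: set of module names expected to carry one version
                      (a hypothetical, caller-supplied list)
    Returns dict of report_id -> {module_name: version} for mismatched reports.
    """
    mismatched = {}
    for report_id, modules in module_versions.items():
        versions = {
            name: version
            for name, version in modules
            if name in shipped_together
        }
        # More than one distinct version suggests an incomplete install.
        if len(set(versions.values())) > 1:
            mismatched[report_id] = versions
    return mismatched
```

For example, calling it with `shipped_together={"xul.dll", "mozjs.dll"}` would return only the reports in which those two modules report different versions.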
Comment 5•14 years ago
data source as mentioned by laura in comment #2 will also help https://bugzilla.mozilla.org/show_bug.cgi?id=620180
Reporter
Comment 6•14 years ago
reposting some comments made on irc as food for thought
there are three basic pieces of data that could be output daily from which many different reports can be derived.
1) "crash meta data" -- [report_id] signature, url, ...
   this is basically all the stuff in the .csv files [1]
2) "module list data" -- [report_id] module1, module2, module3, ...
3) "stack data" -- [report_id] frame1, frame2, frame3, ...
from these 3 "maps" you could define just about all the custom reports that we are doing now, and many more interesting correlations and other reports that we need.
building most reports is just a matter of setting up a list of "interesting crash reports" that are a subset of things like product, release, signature or other interesting combination pairs, then using that list of reports to do further reductions on module data or stack data (jesse's frame2 report, and/or stuff at http://people.mozilla.org/crash_stacks/stack-summary-4.0b11pre.txt),
or do additional correlations on other pieces of meta data.
[1]
meta data list
1 signature
2 url
3 uuid_url
4 client_crash_date
5 date_processed
6 last_crash
7 product
8 version
9 build
10 branch
11 os_name
12 os_version
13 cpu_info
14 address
15 bug_list
16 user_comments
17 uptime_seconds
18 email
19 adu_count
20 topmost_filenames
21 addons_checked
22 flash_version
23 hangid
24 reason
25 process_type
26 app_notes
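The "select interesting reports, then reduce" flow from this comment can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the meta data and stack maps are modeled as dictionaries keyed by report id, field names follow the meta data list above, and `select_reports` and `count_frame` are hypothetical helpers, not existing Socorro functions and not a reimplementation of jesse's frame2 report.

```python
from collections import Counter


def select_reports(meta, **criteria):
    """Return the ids of reports whose meta data matches every criterion.

    meta: dict of report_id -> dict of meta data fields (assumed shape),
          e.g. {"product": "Firefox", "version": "4.0b11", ...}
    """
    return {
        report_id
        for report_id, fields in meta.items()
        if all(fields.get(key) == value for key, value in criteria.items())
    }


def count_frame(stacks, report_ids, depth):
    """Count which frame appears at a given stack depth (0-based).

    stacks: dict of report_id -> list of frame names (assumed shape)
    Reports with stacks shorter than depth + 1 are skipped.
    """
    counts = Counter()
    for report_id in report_ids:
        frames = stacks.get(report_id, [])
        if len(frames) > depth:
            counts[frames[depth]] += 1
    return counts
```

The same selected id set could be passed to a reduction over the module list map instead, which is the point of keeping the three maps separate but keyed the same way.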
Comment 7•14 years ago
I really wish we could make some progress on the alternate processed json format. Once we have that, we can do another POC for the ElasticSearch and most of these analysis tasks could easily be done directly in there.
Comment 8•14 years ago
(In reply to comment #7)
> I really wish we could make some progress on the alternate processed json
> format. Once we have that, we can do another POC for the ElasticSearch and
> most of these analysis tasks could easily be done directly in there.
Agreed. How do you guys feel about tackling that for 1.7.8?
Reporter
Comment 9•14 years ago
(In reply to comment #8)
> (In reply to comment #7)
> > I really wish we could make some progress on the alternate processed json
> > format. Once we have that, we can do another POC for the ElasticSearch and
> > most of these analysis tasks could easily be done directly in there.
>
> Agreed. How do you guys feel about tackling that for 1.7.8?
and are there bugs on file to define and track that work?
Comment 10•14 years ago
bug 573100 is already targeted at 1.7.8.
Assignee
Updated•13 years ago
Component: Socorro → General
Product: Webtools → Socorro
Comment 11•8 years ago
we no longer have map reduce jobs
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME