Closed
Bug 554373
Opened 15 years ago
Closed 15 years ago
Correlation Reports API
Categories
(Socorro :: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: ozten, Assigned: xstevens)
References
Details
(Whiteboard: Implemented feedback and now testing speed with production dataset)
Currently dbaron's correlation reports are hosted on people as giant text files.
His code should be migrated to hbase/hadoop and a report or API should be created to make accessing this data easier and more robust.
Reporter | ||
Updated•15 years ago
|
Whiteboard: cloud
Assignee | ||
Updated•15 years ago
|
Assignee: nobody → xstevens
Reporter | ||
Comment 1•15 years ago
|
||
What is the status update?
Assignee | ||
Comment 2•15 years ago
|
||
I've been working out some bugs and trying to improve performance.
Assignee | ||
Comment 3•15 years ago
|
||
I've got a working implementation. It does the counting with Hadoop MapReduce and then reads that output with a Python script for formatting. I checked the Python code into the crash-data-tools project, where dbaron's original code lives. I checked the MapReduce code into moco/metrics/hadoop/crash-reports. I don't really want things to live in two separate projects, so I'm looking for feedback if anyone has any.
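For readers unfamiliar with the counting step, here is a minimal Python sketch of the map/reduce idea, run in memory rather than on Hadoop. The crash record layout, the `map_crash` and `reduce_counts` names, and the sample data are all assumptions for illustration; this is not the code that was checked in.

```python
from collections import Counter


def map_crash(crash):
    """Map step: emit one (key, 1) pair per module loaded in a crash."""
    prefix = (crash["product"], crash["version"], crash["os"], crash["signature"])
    for module in crash["modules"]:
        yield prefix + (module,), 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


# Hypothetical crash records; real reports carry many more fields.
crashes = [
    {"product": "Firefox", "version": "3.6.4", "os": "Windows NT",
     "signature": "sigA", "modules": ["a.dll", "b.dll"]},
    {"product": "Firefox", "version": "3.6.4", "os": "Windows NT",
     "signature": "sigA", "modules": ["a.dll"]},
]

pairs = [pair for crash in crashes for pair in map_crash(crash)]
counts = reduce_counts(pairs)
```

A formatting script would then only need to read these per-key totals, which matches the split between counting and formatting described above.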
Comment 4•15 years ago
|
||
You could put the Python into the Socorro repo, but that doesn't solve the two-repos problem (it would be socorro + metrics).
Reporter | ||
Comment 5•15 years ago
|
||
What does the API for getting these correlations out look like?
The current prod integration is a hack. We'll want a clean API that, for each type of correlation report, supports:
1) Accessing correlations by prod/version/os/signature
2) Accessing multiple correlations by prod/version/os/list_of_signatures
Comment 6•15 years ago
|
||
If we can limit retrieval of correlation data to *always* requiring at least a prod/version/os/ prefix, then we can set up a correlations table with a rowkey of prod/version/os/signature.
That would allow an API to retrieve the correlation data for one specific signature, or even to scan all signatures for a given prod/version/os/ prefix.
Does this sound useful? More importantly, can you think of cases where this wouldn't work?
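To make the rowkey idea concrete, here is a small Python sketch that imitates a prefix scan over sorted keys, which is roughly how a scan over a sorted HBase table behaves. The `/` separator, the `row_key` and `prefix_scan` helpers, and the sample signatures are assumptions for illustration only; no real HBase client is used.

```python
from bisect import bisect_left

SEP = "/"  # assumed separator between rowkey components


def row_key(product, version, os_name, signature=""):
    """Build a rowkey; an empty signature yields the prod/version/os/ prefix."""
    return SEP.join([product, version, os_name, signature])


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix, relying on sorted key order."""
    start = bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # sorted order means no later key can match
        matches.append(key)
    return matches


# Hypothetical table contents, kept sorted like rows in an HBase table.
keys = sorted([
    row_key("Firefox", "3.6.4", "Windows NT", "sigA"),
    row_key("Firefox", "3.6.4", "Windows NT", "sigB"),
    row_key("Firefox", "3.6", "Windows NT", "sigC"),
    row_key("Thunderbird", "3.0", "Windows NT", "sigD"),
])

firefox_364 = prefix_scan(keys, row_key("Firefox", "3.6.4", "Windows NT"))
firefox_36 = prefix_scan(keys, row_key("Firefox", "3.6", "Windows NT"))
```

Keeping the trailing separator in the prefix is what stops a scan for version 3.6 from also picking up 3.6.4 rows.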
Assignee | ||
Updated•15 years ago
|
Summary: Correlation reports should be generated in hadoop report system → Correlation Reports API
Whiteboard: cloud → Basic design has been started
Assignee | ||
Comment 7•15 years ago
|
||
I created an MR job to count all of the product/version/os/sigs for a given day. These counts can then be turned into a correlation report via the REST API, like so:
http://cm-hadoop01:8080/correlation-report/report/20100701/Thunderbird/3.0/Windows%20NT/JS_CallTracer%7CEXCEPTION_ACCESS_VIOLATION
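As a rough illustration of how a client might assemble this URL, here is a Python sketch that percent-encodes each path segment. The `report_url` helper is hypothetical, and the path layout is inferred only from the example URL above; the staging host is copied from it as-is.

```python
from urllib.parse import quote

# Base taken from the example URL above (staging host).
BASE = "http://cm-hadoop01:8080/correlation-report/report"


def report_url(day, product, version, os_name, signature):
    """Join the path segments, percent-encoding spaces, '|' and other reserved characters."""
    parts = [day, product, version, os_name, signature]
    return BASE + "/" + "/".join(quote(part, safe="") for part in parts)


url = report_url(
    "20100701",
    "Thunderbird",
    "3.0",
    "Windows NT",
    "JS_CallTracer|EXCEPTION_ACCESS_VIOLATION",
)
```

Passing `safe=""` matters here because signatures can contain `/` or `|`, which would otherwise be left unescaped or split the path.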
Assignee | ||
Updated•15 years ago
|
Whiteboard: Basic design has been started → Implementation is nearing completion - would love to get feedback
Reporter | ||
Comment 8•15 years ago
|
||
(In reply to comment #7)
Great work! This API is very similar to what production uses. We can ship this and change the semantics of what is in the correlation report, or tweak this API to match the prod data. There is also one additional API in use against dbaron's correlation reports (flat files).
Details:
Correlations are placed on the following screens in production:
/report/list
/report/index/{uuid}
/topcrasher/{Product}/{Version}
Existing Page:
http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&signature=UserCallWinProcCheckWow&version=Firefox%3A3.6.8#modver
Snippet
Modules
EXCEPTION_ACCESS_VIOLATION (12411)
97% (12031/12411) vs. 61% (98974/162039) shdocvw.dll
52% (6469/12411) vs. 19% (30824/162039) msvcr71.dll
41% (5054/12411) vs. 9% (13901/162039) nppdf32.dll
89% (11035/12411) vs. 58% (94239/162039) samlib.dll
99% (12229/12411) vs. 68% (110292/162039) nssckbi.dll
API output
"interesting-modules":[
{
"module":"oleaut32.dll",
"sigCount":2,
"totalSigCount":184,
"sigPercent":1.0869565,
"osCount":2,
"totalOsCount":184,
"osPercent":1.0869565
From our first example if we break down '97% (12031/12411) vs. 61% (98974/162039) shdocvw.dll' into the new API's output variables:
module = shdocvw.dll
sigCount = 12031
totalSigCount = 12411
sigPercent = 97%
One group of properties is missing: the "all crashes that match, ignoring the crash signature" values.
overallCount = 98974
totalCount = 162039
overallPercent = 61%
osCount, totalOsCount, and osPercent don't exist and are new properties. They look fine. I'm not sure about the name osCount, etc.
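The mapping above can be sketched in Python by parsing one line of the flat-file report into the proposed field names. The regular expression and the `parse_line` helper are assumptions based only on the sample lines in this comment; percentages are recomputed from the counts rather than read from the text.

```python
import re

# Matches lines like: "97% (12031/12411) vs. 61% (98974/162039) shdocvw.dll"
LINE_RE = re.compile(
    r"(\d+)% \((\d+)/(\d+)\) vs\.\s+(\d+)% \((\d+)/(\d+)\) (\S+)"
)


def parse_line(line):
    """Turn one flat-file correlation line into a dict of API-style fields."""
    match = LINE_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError("unrecognized correlation line: %r" % line)
    _, sig_count, total_sig, _, overall, total, module = match.groups()
    sig_count, total_sig = int(sig_count), int(total_sig)
    overall, total = int(overall), int(total)
    return {
        "module": module,
        "sigCount": sig_count,
        "totalSigCount": total_sig,
        "sigPercent": 100.0 * sig_count / total_sig,
        "overallCount": overall,
        "totalCount": total,
        "overallPercent": 100.0 * overall / total,
    }


row = parse_line("97% (12031/12411) vs. 61% (98974/162039) shdocvw.dll")
```

Rounding `sigPercent` and `overallPercent` to whole numbers should reproduce the 97% and 61% shown on the existing page.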
Another issue is that we have a bulk version of this API which does not take a signature. It would be something like
http://cm-hadoop01:8080/correlation-report/report/{Day}/{Product}/{Version}/{OS Name}
The results contain a list of signatures, each with only its highest correlation. To see this API in action, check out
http://crash-stats.mozilla.com/topcrasher/byversion/Firefox/3.6.8
Look in the Correlation column and click on one with data.
Reporter | ||
Comment 9•15 years ago
|
||
It would be nice if we split signature in the output into
signature
crash_reason
This way the UI doesn't have to split on '|'.
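For reference, the split the UI would otherwise have to do might look like the Python sketch below. It assumes the crash reason is whatever follows the last '|', since signatures in this thread (for example `hang | KiFastSystemCallRet|EXCEPTION_BREAKPOINT`) can themselves contain '|'. The `split_signature` helper is hypothetical.

```python
def split_signature(combined):
    """Split 'signature|crash_reason' on the last '|' (assumed convention)."""
    signature, sep, crash_reason = combined.rpartition("|")
    if not sep:
        # No '|' present: treat the whole string as the signature.
        return combined, None
    return signature, crash_reason


simple = split_signature("JS_CallTracer|EXCEPTION_ACCESS_VIOLATION")
hang = split_signature("hang | KiFastSystemCallRet|EXCEPTION_BREAKPOINT")
```

Returning the two values as separate fields from the API avoids every client having to agree on this splitting rule.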
Assignee | ||
Comment 10•15 years ago
|
||
Austin,
Just to be clear, osCount, totalOsCount, and osPercent are calculated the same way they are currently. Those represent the "all crashes that match ignoring the crash signature". I can rename those to overallCount, totalCount, and overallPercent.
I've split out crash_reason from signature in the return value.
Reporter | ||
Comment 11•15 years ago
|
||
Okay, perfect. I thought they were different (based on IRC conversation).
Assignee | ||
Comment 12•15 years ago
|
||
Firefox numbers are always easier for me to look at:
http://cm-hadoop01:8080/correlation-report/report/20100701/Firefox/3.6.4/Windows%20NT/hang%20%7C%20KiFastSystemCallRet%7CEXCEPTION_BREAKPOINT
Again this is staging so don't expect to compare these to production numbers for this day just yet.
Reporter | ||
Comment 13•15 years ago
|
||
(In reply to comment #12)
Just confirming there is no specific question for me here.
Assignee | ||
Comment 14•15 years ago
|
||
Nope. I'm working on some of the changes you suggested including adding top crashers.
Assignee | ||
Updated•15 years ago
|
Whiteboard: Implementation is nearing completion - would love to get feedback → Implemented feedback and now testing speed with production dataset
Assignee | ||
Comment 15•15 years ago
|
||
This functionality is now complete, but it will probably need code review, documentation, etc. before we deploy.
Assignee | ||
Updated•15 years ago
|
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated•14 years ago
|
Component: Socorro → General
Product: Webtools → Socorro