Closed
Bug 544583
Opened 14 years ago
Closed 14 years ago
Identify source of TCBS data changes between 3/5 and 3/4
Categories
(Socorro :: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: ozten, Unassigned)
References
()
Details
Attachments
(1 file)
270.10 KB,
image/png
Ken Kovash noticed the ADU numbers changed between being retrieved today (3/5) and yesterday (3/4).

Date      Firefox 3.5.7 Crashes (retrieved today)   Firefox 3.5.7 Crashes (retrieved yesterday)
1/22/10   15,535                                    15,535
1/23/10   14,911                                    122,160
1/24/10   118,930                                   14,392
1/25/10   16,389                                    125,948
1/26/10   16,727                                    16,727
1/27/10   112,084                                   112,084
1/28/10   16,116                                    126,000
1/29/10   11,347                                    91,573
1/30/10   6,718                                     6,718
1/31/10   6,873                                     6,873
2/1/10    38,016                                    4,771
2/2/10    12,626                                    12,626
2/3/10    10,431                                    16,087

ADU data comes from raw_adu and our top_crashes_by_signature (TCBS) tables. TCBS is calculated on date_processed on an hourly basis. Once created, this data does not change.

Is any of the above incorrect? Did we rerun TCBS reports for 1/23, 1/24, 1/25, 1/28, 1/29, and 2/1 to backfill missing data after our data recovery efforts?
Reporter | ||
Comment 1•14 years ago
|
||
Odd that 1/23, 1/25 and 1/28 grow by 8x and 1/24 and 2/1 shrink by 8x (not exactly 8, no I'm not doing drugs)
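To make the "roughly 8x" observation concrete, here is a minimal sketch that computes the ratio between the two retrievals for each date in Comment 0. The numbers are copied from that table; the function name `changed_ratios` is hypothetical and exists only for this illustration.

```python
# Crash counts from Comment 0: date -> (retrieved today, retrieved yesterday).
retrievals = {
    "1/22/10": (15535, 15535),
    "1/23/10": (14911, 122160),
    "1/24/10": (118930, 14392),
    "1/25/10": (16389, 125948),
    "1/26/10": (16727, 16727),
    "1/27/10": (112084, 112084),
    "1/28/10": (16116, 126000),
    "1/29/10": (11347, 91573),
    "1/30/10": (6718, 6718),
    "1/31/10": (6873, 6873),
    "2/1/10": (38016, 4771),
    "2/2/10": (12626, 12626),
    "2/3/10": (10431, 16087),
}


def changed_ratios(data):
    """Return {date: larger/smaller} for dates whose two retrievals differ."""
    ratios = {}
    for date, (today, yesterday) in data.items():
        if today != yesterday:
            ratios[date] = max(today, yesterday) / min(today, yesterday)
    return ratios


ratios = changed_ratios(retrievals)
for date, ratio in ratios.items():
    print(f"{date}: differs by {ratio:.2f}x")
```

Run against this table, the dates that differ mostly land near a factor of 8, while 2/3/10 also differs but by a much smaller factor.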
Reporter | ||
Comment 2•14 years ago
|
||
Adding link to daily ADU report, which is the report referenced in Comment 0.
Comment 3•14 years ago
|
||
This does look like a change in the contents of the top_crashes_by_signature table to me. No way to directly examine that. I suspect that if there was a change, Aravind had a hand in doing it. Added him to the CC list.
Reporter | ||
Comment 4•14 years ago
|
||
Filed a bug to refresh our development copy of production data.
Depends on: 544608
Comment 5•14 years ago
|
||
Is this fixed?
Comment 6•14 years ago
|
||
I'm also trying to figure out if this is behind some of the spikes we see for some signatures over the last few days for bugs like https://bugzilla.mozilla.org/show_bug.cgi?id=542203

day                  count   signature
20100122-crashdata   146     nsXHREventTarget::GetParentObject
20100123-crashdata   148     nsXHREventTarget::GetParentObject
20100124-crashdata   145     nsXHREventTarget::GetParentObject
20100125-crashdata   1950    nsXHREventTarget::GetParentObject
20100126-crashdata   11731   nsXHREventTarget::GetParentObject
20100127-crashdata   10300   nsXHREventTarget::GetParentObject
20100129-crashdata   1143    nsXHREventTarget::GetParentObject
20100130-crashdata   467     nsXHREventTarget::GetParentObject
20100131-crashdata   444     nsXHREventTarget::GetParentObject

172:crashdata chofmann$ ./stacktrend.sh nsXHREventTarget::GetParentObject 201002*
date nsXHREventTarget::GetParentObjectcrashes
20100201-crashdata   761     nsXHREventTarget::GetParentObject
20100202-crashdata   527     nsXHREventTarget::GetParentObject
20100203-crashdata   406     nsXHREventTarget::GetParentObject
20100204-crashdata   328     nsXHREventTarget::GetParentObject
Comment 7•14 years ago
|
||
these are counts out of the .csv files.
Reporter | ||
Comment 8•14 years ago
|
||
I've traced the web service call that populates this data...

SQL for 1/24 outputs
2010-01-24 00:00:00   Linux     20
2010-01-24 00:00:00   Mac       8295
2010-01-24 00:00:00   Windows   61472
2010-01-24 00:00:00   Windows   12

Service responds with
Windows - 12 crashes 57434300 users
Linux - 20 crashes 673199 users
Mac - 8295 crashes 4025750 users

Why a fluctuation? It's possible that the order in which the results come back is non-deterministic and the service layer only uses a single row per OS.
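The suspected failure mode can be sketched in a few lines. This is an illustration of the hypothesis only, not the actual service code and not the actual fix: the function names are hypothetical, and summing the duplicate rows is just one order-independent alternative shown for contrast. The rows are the SQL output for 1/24 quoted above.

```python
from itertools import permutations

# Rows as returned by the SQL for 1/24: (date, os_name, crashes).
rows = [
    ("2010-01-24", "Linux", 20),
    ("2010-01-24", "Mac", 8295),
    ("2010-01-24", "Windows", 61472),
    ("2010-01-24", "Windows", 12),
]


def one_row_per_os(result_rows):
    """Keep only the first row seen for each OS, so row order matters."""
    picked = {}
    for _date, os_name, crashes in result_rows:
        picked.setdefault(os_name, crashes)
    return picked


def summed_per_os(result_rows):
    """Add up every row for each OS, so row order does not matter."""
    totals = {}
    for _date, os_name, crashes in result_rows:
        totals[os_name] = totals.get(os_name, 0) + crashes
    return totals


# Try every possible row order, as an unordered query result might produce.
first_seen = {one_row_per_os(p)["Windows"] for p in permutations(rows)}
summed = {summed_per_os(p)["Windows"] for p in permutations(rows)}

print(sorted(first_seen))  # [12, 61472]
print(sorted(summed))      # [61484]
```

Keeping one row per OS yields either 12 or 61472 Windows crashes depending on row order, which would account for a day's total jumping between retrievals. Whether the two Windows rows should really be added together depends on why the query returns two of them, which this sketch does not address.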
Comment 9•14 years ago
|
||
I looked at the hourly rates for total, 3.6, and 3.5.7 for recent days of the .csv files that have been produced after the system down-time on 2010 01 28. Overall crash volume in total crashes per hour and for 3.5.7 is down, and 3.6 is up a little. This would be explained by people moving from 3.5.7 to 3.6, and by the fact that 3.6 throttling was adjusted from 25% to 15% coming out of the socorro 1.4 update on 1/28. So overall these numbers seem to be reasonable. That would leave the strange spikes in some signatures, like in comment 6, unexplained.
Comment 10•14 years ago
|
||
How are we doing here? Are the numbers consistent? Can Ken do the things he needs to do to compare 3.5 w/ 3.6 yet? Who owns this bug?
Comment 11•14 years ago
|
||
The numbers I get using data from the .csv files are looking great for firefox 3.6. https://wiki.mozilla.org/CrashKill/Crashr#3.6_RC1.2C_RC2.2C_Final shows 3.4-3.8 crashes per 100 users, which would make it more stable than 3.0.x. There is one outlier on 2010 02 05, but it's possible that we just didn't get all the data on that day. There are about 12k fewer crashes on that day than I would have expected in the .csv file.
Comment 12•14 years ago
|
||
and it looks like the .csv file for that date only contains data up to 4pm

crash count   date_processed
7186          2010020500
7943          2010020501
8314          2010020502
8628          2010020503
9963          2010020504
11252         2010020505
12234         2010020506
12553         2010020507
13030         2010020508
12912         2010020509
12535         2010020510
12127         2010020511
12052         2010020512
11297         2010020513
10350         2010020514
9634          2010020515
8166          2010020516
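A quick way to confirm a truncated day is to list which hours have no rows at all. The sketch below uses the hourly counts quoted above; the `YYYYMMDDHH` key format is taken from that listing, and the function name `missing_hours` is hypothetical.

```python
# Hourly crash counts for 2010-02-05, keyed by date_processed (YYYYMMDDHH).
hourly_counts = {
    "2010020500": 7186, "2010020501": 7943, "2010020502": 8314,
    "2010020503": 8628, "2010020504": 9963, "2010020505": 11252,
    "2010020506": 12234, "2010020507": 12553, "2010020508": 13030,
    "2010020509": 12912, "2010020510": 12535, "2010020511": 12127,
    "2010020512": 12052, "2010020513": 11297, "2010020514": 10350,
    "2010020515": 9634, "2010020516": 8166,
}


def missing_hours(counts, day):
    """Return the hours (0-23) of `day` that have no entry in `counts`."""
    return [hour for hour in range(24) if f"{day}{hour:02d}" not in counts]


missing = missing_hours(hourly_counts, "20100205")
print(missing)  # [17, 18, 19, 20, 21, 22, 23]
```

Hours 17 through 23 are absent, which matches the file stopping at the 4pm hour.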
Comment 13•14 years ago
|
||
was this resolved? it'd be nice to be able to do analysis with our crash data and know that we're 100% sure of the underlying data.
Reporter | ||
Comment 14•14 years ago
|
||
The fix is on stage: http://crash-stats.stage.mozilla.com/. It is scheduled for production Thursday night. Please help test the fix (Bug#545035) on stage.
Comment 15•14 years ago
|
||
Looking good in production.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•12 years ago
|
Component: Socorro → General
Product: Webtools → Socorro