re-process Mac OS X crash reports from the past week to pick up Flash symbol data

RESOLVED FIXED

Status

task
--
critical
RESOLVED FIXED
9 years ago
8 years ago

People

(Reporter: jaas, Assigned: lars)

Tracking

Trunk
All
macOS

Firefox Tracking Flags

(Not tracked)

Details

We should have Flash symbols on Mac OS X as of today. However, by default only new reports will have the symbol data. We need better crash data ASAP due to the impending release of Firefox 4. Can we re-process Mac OS X reports from the past week in order to pick up the symbol data?
Yep.  High priority.  Need to know what we can do here.
The first thing we need to do is figure out how many crashes there are to reprocess.  Can I assume that we want to do every Mac crash since (and including) Jan 31?

We could just put all their ooids into the priority queue.  The processors will then chew through them. 

This could result in signature changes to these crashes.  That would mean that the Top Crash By Signature (TCBS) aggregate report would be incorrect and would have to be regenerated.  I'm looking into the practicality of that.
Assignee: nobody → lars
There are 108K Mac crashes in the last week.  The technique of running them through the priority system will work fine.  However, while these 108K crashes are processing, all regular crashes will be delayed.  I suggest we do this during the night or other off hours.

The order of operations will be as follows:
1) IT: update dm-breakpad-devdb from the latest production snapshot so that I may test*
2) lars: test regeneration of TCBS 
3) IT or jberkus: insert 108K mac crashes into priority jobs table in postgres
4) IT or jberkus: deletion of all TCBS data from 2011/01/31 onward
5) IT: manual run of TCBS app 

Steps 3, 4, and 5 need to be done during off hours to minimize impact.

* Waiting to hear what the status of dm-breakpad-devdb actually is.  It may have to be upgraded to postgres 9 before it can be loaded with production data.
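Step 3 above (putting the ooids into the priority jobs table) could look roughly like the sketch below. It is only an illustration: the table name `priorityjobs` and column `uuid` are assumptions, not taken from this bug, and it uses an in-memory SQLite database as a stand-in so that it runs anywhere. Against the real Postgres database the driver, placeholder style, and duplicate handling would differ.

```python
import sqlite3


def queue_priority_jobs(conn, ooids):
    """Insert each ooid into the (hypothetical) priorityjobs table.

    Duplicates are skipped so the same list can be resubmitted safely.
    "INSERT OR IGNORE" is SQLite syntax used for this stand-in only.
    """
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany(
            "INSERT OR IGNORE INTO priorityjobs (uuid) VALUES (?)",
            [(ooid,) for ooid in ooids],
        )


def make_demo_db():
    """Create an in-memory database with the assumed table layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE priorityjobs (uuid TEXT PRIMARY KEY)")
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    # The only real ooid here is the one quoted later in this bug.
    queue_priority_jobs(db, ["f1f63d30-666f-4e4d-bb70-4a45f2110207"])
    print(db.execute("SELECT COUNT(*) FROM priorityjobs").fetchone()[0])
```

The processors would then pick the queued ooids up as described above; nothing in this sketch touches TCBS data.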
Wait, do we want to delete ALL TCBS data, or just the data associated with OSX?
Mostly out of curiosity, about how many crash reports are we capable of processing per hour (not counting any special setup for this particular run)?
To my list in comment #3, I add:

6) IT or jberkus: regenerate the signature_productdims table.

We've gotten around step 1 and I've completed step 2.

We're going to have to regenerate TCBS data for all OSes for the last week.  There may be some minor changes to the tallies for other OSes.  Normally, priority jobs that are submitted after TCBS runs are not included.  By regenerating the TCBS, these late priority jobs will get included.

When we hit step 3, there will be user impact for up to 3 hours.
Josh,

Currently we can do around 120,000 crashes per hour.  There's a TODO to investigate speeding this up, but since that's way above current processing requirements, it hasn't been that urgent.
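As a rough check on the numbers in this bug, the sketch below estimates how long the backlog would occupy the processors. It assumes the 108K count and the approximate 120,000 crashes per hour quoted above, and that the processors do nothing else during the run; the function name is made up for illustration.

```python
def estimated_minutes(crash_count, crashes_per_hour):
    """Minutes needed to process crash_count crashes at a steady rate."""
    if crashes_per_hour <= 0:
        raise ValueError("crashes_per_hour must be positive")
    return crash_count * 60 / crashes_per_hour


if __name__ == "__main__":
    # 108K Mac crashes at roughly 120,000 crashes per hour.
    print(estimated_minutes(108_000, 120_000))  # 54.0
```

That comes to a little under an hour, comfortably inside the "up to 3 hours" of user impact mentioned above.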
At 7:00pm, we started the resubmission of all of last week's OSX crashes.

At 7:37pm, the processors completed reprocessing all those OSX jobs.  Could someone who can tell whether they got the proper symbols please take a look?

At 8:10pm, the top crashes by signature table was regenerated for the last week.  The TCBS report should now look normal.

At 8:25pm, the signature_productdims table was regenerated.

That wraps it up; we're done.  If anyone sees anything suspicious or odd, please give a shout.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
I don't see any Mac OS X crash reports from the past week with proper symbols for Flash. Everything is still just library address offsets.
Can you give some ooids of crashes that do not show proper symbols, please?  Are crashes being submitted today getting the proper symbols?
I just looked up one:
https://crash-stats.mozilla.com/report/index/f1f63d30-666f-4e4d-bb70-4a45f2110207

The top frame is showing as:
0 	FlashPlayer-10.6 	FlashPlayer-10.6@0x2ace83

I downloaded the raw dump and the matching symbol file that I had uploaded to the symbol server, and processing it locally gives a function for the top frame instead:
Thread 0 (crashed)
 0  FlashPlayer-10.6!F_1512892623______________________________________ + 0x23

I uploaded the symbols to dm-symbolpush01 yesterday, and Aravind manually synced them to PHX for me. For reference, the Flash Player symbol file for this crash is:
symbols_os/FlashPlayer-10.6/7F846A865E18B6E54DA2E5B46E7CF5B20/FlashPlayer-10.6.sym
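The path above follows the pattern store root / module name / debug identifier / module name plus ".sym". A small sketch of building such a path is below; the pattern is inferred from this one example, and the helper name is made up for illustration.

```python
import posixpath


def symbol_file_path(store_root, module_name, debug_id):
    """Build <store_root>/<module_name>/<debug_id>/<module_name>.sym."""
    return posixpath.join(store_root, module_name, debug_id, module_name + ".sym")


if __name__ == "__main__":
    print(symbol_file_path(
        "symbols_os",
        "FlashPlayer-10.6",
        "7F846A865E18B6E54DA2E5B46E7CF5B20",
    ))
```

With the values from this crash it reproduces the path quoted above.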
Ok, this is not the fault of the reprocessing, it's my fault. When I uploaded the symbols I didn't check the permissions on the symbol files, and they aren't world-readable. We'll need to get someone from IT to chmod them all properly, and have this re-run.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
In particular, something like "chmod -R +r symbols_os/Flash*" in the PHX symbol store should work.
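Before re-running, it may help to confirm that nothing under the symbol directory is still unreadable. The sketch below lists files that lack the world-readable bit and directories that lack world read or search permission. It is a generic check, not part of any Socorro tooling, and the directory passed in is whatever the symbol store path is on the host.

```python
import os
import stat


def not_world_readable(root):
    """Return paths under root that other users cannot read or traverse."""
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            mode = os.stat(path).st_mode
            # Directories need both read and search (execute) for others.
            if not (mode & stat.S_IROTH and mode & stat.S_IXOTH):
                problems.append(path)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.stat(path).st_mode & stat.S_IROTH:
                problems.append(path)
    return sorted(problems)


if __name__ == "__main__":
    import sys
    for path in not_world_readable(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(path)
```

An empty result after the chmod suggests the processors should be able to read the symbol files.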
Depends on: 632241
Aravind took care of that, so we should be able to do this again and have it work. Sorry for the trouble.
The reprocessing finished on Tuesday evening at about 8:20.  I spot-checked some of the crashes and they appear to have the proper symbols this time.

I think we're done.
Status: REOPENED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Looks great, thanks guys!
Component: Socorro → General
Product: Webtools → Socorro