Bug 765001: Add version numbers to modulelist
Status: RESOLVED FIXED (opened 12 years ago, closed 12 years ago)
Categories: Socorro :: Data request (task)
Tracking: Not tracked
Target Milestone: 22
People: Reporter: brandon, Assigned: rhelmer
Whiteboard: [qa-]
Attachments: 1 file (1.99 MB, application/octet-stream)
A daily module list is generated for modules in Socorro crash reports. We'd like to use this to feed data to the Dragnet DLL directory. Having the version numbers included in the comma-delimited data would be helpful for this.
Reporter | ||
Comment 1•12 years ago
It occurs to me that we wanted to have a call about this bug. Xavier, can you put one on the calendar for early next week?
Comment 2•12 years ago
Sorry I didn't get back to you earlier. I sent out the invite for a meeting on Monday morning.
Comment 3•12 years ago
Sent code out and discussed setup details. Going to assign to Brandon/Rob to take it from here. Please ping me if you need any further help.
Assignee: xstevens → bsavage
Reporter | ||
Updated•12 years ago
Assignee: bsavage → rhelmer
Assignee | ||
Comment 4•12 years ago
Now that we have a place to run these, I have been testing the latest modulelist.pig and it doesn't seem to be finding anything:

Input(s):
Successfully read 0 records (5696 bytes) from: "hbase://crash_reports"

Output(s):
Successfully stored 0 records in: "hdfs://hp-node70.phx1.mozilla.com/user/rhelmer/modulelist-20120801-20120801"

I don't know if this is a problem with data missing on secondary, or a problem with the pig script. Looking into it.
Status: NEW → ASSIGNED
Assignee | ||
Comment 5•12 years ago
Hmm reading the code it looks like the date format is 'yyMMdd' and I have been giving it 'YYYYMMdd', going to try the former.
Assignee | ||
Comment 6•12 years ago
Of course we are blocked on getting this to production since the admin box is for some reason unable to run even the old module list map/reduce job (bug 779912)
Depends on: 779912
Assignee | ||
Comment 7•12 years ago
(In reply to Robert Helmer [:rhelmer] from comment #5)
> Hmm reading the code it looks like the date format is 'yyMMdd' and I have
> been giving it 'YYYYMMdd', going to try the former.

OK yeah that's the problem, this works fine :)
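The mix-up from comment 5 is easy to reproduce. Below is a minimal Python sketch, assuming the job's 'yyMMdd' is a Java SimpleDateFormat-style pattern (two-digit year, month, day), which corresponds to "%y%m%d" in strftime. The helper names job_date_arg and looks_like_yymmdd are hypothetical and not part of modulelist.pig.

```python
from datetime import date


def job_date_arg(day: date) -> str:
    """Format a date as two-digit year, month, day (the assumed 'yyMMdd' shape)."""
    return day.strftime("%y%m%d")


def looks_like_yymmdd(arg: str) -> bool:
    """Cheap guard: exactly six digits, so an eight-digit '20120801' is rejected."""
    return len(arg) == 6 and arg.isdigit()


print(job_date_arg(date(2012, 8, 1)))   # 120801
print(looks_like_yymmdd("20120801"))    # False
```

A guard like this in a wrapper script would fail fast on a wrongly formatted date instead of letting the job read 0 records.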
Assignee | ||
Comment 8•12 years ago
This ran ok but the output looks like:

sxs.dll,sxs.pdb,6C678DA4C31348AF8CDF90FC641DD4FB2,0x6f540000

Is that right ^? "0x6f540000" looks like a memory address, so I am suspecting this is really:

module,pdb,checksum,addr

where addr is either addr_start or addr_end.

The underlying data here is a pipe-delimited string (key is "dump") in the processed JSON, looks like the parser is:
https://github.com/mozilla-metrics/socorro-toolbox/blob/master/src/main/java/com/mozilla/socorro/pig/eval/ModuleBag.java
Comment 9•12 years ago
Looks like it's grabbing the wrong field, yes. The parser seems to have the right naming for the fields:
https://github.com/mozilla-metrics/socorro-toolbox/blob/master/src/main/java/com/mozilla/socorro/pig/eval/ModuleBag.java#L49
http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/minidump_stackwalk.cc#346
so I'd guess someone just used the wrong index. (My kingdom for bug 573100's JSON output!)
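The "wrong index" theory can be illustrated with a short Python sketch. It assumes the module lines in the pipe-delimited dump follow the layout of Breakpad's machine-readable minidump_stackwalk output (Module|filename|version|debug file|debug id|base address|max address|main). The sample line reuses the values from comment 8; its version and max address are invented for illustration, and parse_module_line is a hypothetical helper, not the ModuleBag code.

```python
from typing import NamedTuple, Optional


class Module(NamedTuple):
    filename: str
    version: str
    debug_file: str
    debug_id: str
    base_address: str


def parse_module_line(line: str) -> Optional[Module]:
    """Parse one 'Module|...' line; return None for any other line type."""
    fields = line.rstrip("\n").split("|")
    if fields[0] != "Module" or len(fields) < 6:
        return None
    # Assumed layout: fields[1]=filename, [2]=version, [3]=debug file,
    # [4]=debug id, [5]=base address. Reading indices 1, 3, 4, 5 instead would
    # skip the version and emit the base address, which matches the shape of
    # the output in comment 8.
    return Module(fields[1], fields[2], fields[3], fields[4], fields[5])


# Version and max address below are invented for illustration.
sample = "Module|sxs.dll|5.1.2600.0|sxs.pdb|6C678DA4C31348AF8CDF90FC641DD4FB2|0x6f540000|0x6f5fffff|0"
module = parse_module_line(sample)
print(",".join([module.filename, module.version, module.debug_file, module.debug_id]))
```

Naming the fields once, rather than indexing the split list at each use, makes an off-by-one like this harder to introduce.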
Assignee | ||
Comment 10•12 years ago
(In reply to Ted Mielczarek [:ted] from comment #9)
> Looks like it's grabbing the wrong field, yes. The parser seems to have the
> right naming for the fields:
> https://github.com/mozilla-metrics/socorro-toolbox/blob/master/src/main/java/com/mozilla/socorro/pig/eval/ModuleBag.java#L49

Doing something about that "TODO" would be a good start :) I can do that actually.

> http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/minidump_stackwalk.cc#346
> so I'd guess someone just used the wrong index. (My kingdom for bug 573100's
> JSON output!)

Indeed!
Assignee | ||
Comment 11•12 years ago
OK bug mentioned in comment 8 is fixed:
https://github.com/mozilla-metrics/socorro-toolbox/commit/9ae18469c853a48c6505ad2e03a4240c90b4eaa0

Investigating further, it looks like various fields are sometimes blank. I've just put up a PR to ignore these crashes:
https://github.com/mozilla-metrics/socorro-toolbox/commit/9ae18469c853a48c6505ad2e03a4240c90b4eaa0

Ted, does that make sense? I am assuming that if we don't have all of:

libname (dll), version, pdb, checksum

then it's not useful. Is that correct or would you like to see those?
Comment 12•12 years ago
It's fairly common to have some information missing, I think leaving blank fields would be okay. We should always have the library name, but I'd expect quite a few modules with missing fields, especially dealing with things like malware. My Windows symbol fetching script requires version, debug file (pdb), debug identifier (what you have labeled as checksum), but it can deal with them missing, I believe.
Comment 13•12 years ago
Related: if you can give me a sample output of this once you're fairly confident in the code, I'd like to test it to make sure my script doesn't choke.
Comment 14•12 years ago
Fixed the comment: https://github.com/mozilla-metrics/socorro-toolbox/pull/3
Assignee | ||
Comment 15•12 years ago
(In reply to Robert Helmer [:rhelmer] from comment #11)
> OK bug mentioned in comment 8 is fixed:
> https://github.com/mozilla-metrics/socorro-toolbox/commit/9ae18469c853a48c6505ad2e03a4240c90b4eaa0
>
> Investigating further, it looks like various fields are sometimes blank.
> I've just put up a PR to ignore these crashes:
> https://github.com/mozilla-metrics/socorro-toolbox/commit/9ae18469c853a48c6505ad2e03a4240c90b4eaa0

BTW this is the wrong link, the PR for input validation is:
https://github.com/mozilla-metrics/socorro-toolbox/pull/2
Assignee | ||
Comment 16•12 years ago
(In reply to Ted Mielczarek [:ted] from comment #13)
> Related: if you can give me a sample output of this once you're fairly
> confident in the code, I'd like to test it to make sure my script doesn't
> choke.

Here's a sample, now I've added input validation (each field must either be what we expect or be an empty string):

1.dll,,1.pdb,E033DD21934E4925A721E8D253C62F384
KS.dll,2.6.10465.1,,
TV.dll,5.1.0.0,TV.pdb,4A1370C8ECF042F880288DCD7E49BB901
dr.dll,6.7.2213.202,dr.pdb,81A3CCA461394040A624CD506E51462D7
dr.dll,7.0.2813.205,dr.pdb,F55DB1C5E0EB4ED0BD15C0A62E3DAC2E4
rf.dll,7.4.1.0,RoboformSDK.pdb,B4FEADDDB78E411BACA9302A048EDA801
ACE.dll,2.17.1.1,ACE.pdb,459B464CD5F84E1D98C6E6F553EC94462
ACE.dll,2.13.54.1,,
Log.dll,2.3.0.7,Log.pdb,35B871ED1CC64B3DB9DB9F80005139833
MPR.dll,6.1.7600.16385,mpr.pdb,1408743D42224025A49E02A79CE7190A2
Wpc.dll,1.0.0.1,Wpc.pdb,F57BCFDA3EE4436B88F68BCFDABD1DD02
awt.dll,6.0.30.5,awt.pdb,68C7A88C6F894BECB6A9BC7FAADD0A131
gkh.dll,1.0.4.1,gkh.pdb,E025BC1BCB624879B3938628E2920A581
jvm.dll,10.0.0.23,jvm.pdb,13857CF9BBF14B49B33472C9CCB72D5E1
lpk.dll,6.1.7127.0,lpk.pdb,E461F2C4516742D08461A74251C5EA7E2
lpk.dll,5.1.2600.2135,lpk.pdb,0E28834B363C48078C9C042C979B617D1
lpk.dll,6.0.6002.18051,lpk.pdb,60147F32956B430090EE31AD9827F6B22
mmm.dll,,,
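Any consumer of this output has to tolerate empty fields. Here is a minimal Python sketch, assuming the column order in this sample is name, version, debug file, debug identifier (the field the thread calls "checksum"). It uses four rows copied from the sample. The helper has_symbol_lookup_fields is hypothetical and only reflects comment 12's note that the symbol fetching script wants version, debug file, and debug identifier; it is not that script's actual logic.

```python
import csv
import io

# Four rows copied from the sample in comment 16.
SAMPLE = """\
1.dll,,1.pdb,E033DD21934E4925A721E8D253C62F384
KS.dll,2.6.10465.1,,
TV.dll,5.1.0.0,TV.pdb,4A1370C8ECF042F880288DCD7E49BB901
mmm.dll,,,
"""

# Assumed column order for this sample.
COLUMNS = ("name", "version", "debug_file", "debug_id")


def read_modules(text):
    """Yield one dict per row; missing values come through as empty strings."""
    for row in csv.reader(io.StringIO(text)):
        if len(row) == len(COLUMNS):
            yield dict(zip(COLUMNS, row))


def has_symbol_lookup_fields(module):
    """True when version, debug file, and debug id are all present."""
    return all(module[key] for key in ("version", "debug_file", "debug_id"))


complete = [m["name"] for m in read_modules(SAMPLE) if has_symbol_lookup_fields(m)]
print(complete)  # ['TV.dll']
```

Keeping the incomplete rows in the file and filtering on the consumer side matches the preference in comment 12 for leaving fields blank rather than dropping modules.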
Assignee | ||
Comment 17•12 years ago
Ted - is the "version" field always 4 places? I've rebased the branch I am using for input validation, here is what it looks like now:
https://github.com/mozilla-metrics/socorro-toolbox/pull/2
Comment 18•12 years ago
(In reply to Robert Helmer [:rhelmer] from comment #16)
> Here's a sample, now I've added input validation (each field must either be
> what we expect or be an empty string):

I was sort of hoping for the full output of a run as a csv file that I could point my script at, just for convenience's sake.

(In reply to Robert Helmer [:rhelmer] from comment #17)
> Ted - is the "version" field always 4 places?

Yes, it will either be x.x.x.x or empty:
http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/minidump.cc#1942
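With the version confirmed as either x.x.x.x or empty, the "what we expect or an empty string" rule from comment 16 could look roughly like this for the version field. This is a Python sketch only: the regular expression and the clean_version helper are assumptions for illustration, not the validation code in the socorro-toolbox pull request.

```python
import re

# Four dot-separated decimal components, e.g. "6.1.7600.16385".
VERSION_RE = re.compile(r"\d+(\.\d+){3}")


def clean_version(raw: str) -> str:
    """Return the version if it has the x.x.x.x shape, otherwise an empty string."""
    value = raw.strip()
    return value if VERSION_RE.fullmatch(value) else ""


print(clean_version("6.1.7600.16385"))  # 6.1.7600.16385
print(repr(clean_version("1.0")))       # ''
```

Mapping anything unexpected to an empty string keeps the row in the output while still giving consumers a predictable field shape.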
Comment 19•12 years ago
Oh, for ease of use on my script, it'd be better if you stuck the version number at the end, to keep the other fields in their existing order.
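Moving the version to the end is a small reordering step. The Python sketch below assumes rows arrive in the order shown in comment 16 (name, version, debug file, debug identifier) and that the "existing order" is name, debug file, debug identifier, as in the output from comment 8. The function names are hypothetical.

```python
import csv
import io


def move_version_last(row):
    """Reorder [name, version, debug_file, debug_id] to [name, debug_file, debug_id, version]."""
    name, version, debug_file, debug_id = row
    return [name, debug_file, debug_id, version]


def reorder(text: str) -> str:
    """Rewrite every CSV row in text with the version column moved to the end."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        writer.writerow(move_version_last(row))
    return out.getvalue()


rows = "lpk.dll,6.1.7127.0,lpk.pdb,E461F2C4516742D08461A74251C5EA7E2\nKS.dll,2.6.10465.1,,\n"
print(reorder(rows), end="")
```

Appending the new column rather than inserting it means a script that reads only the first three fields keeps working unchanged.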
Assignee | ||
Comment 20•12 years ago
||
Assignee | ||
Comment 21•12 years ago
(In reply to Ted Mielczarek [:ted] from comment #18)
> (In reply to Robert Helmer [:rhelmer] from comment #16)
> > Here's a sample, now I've added input validation (each field must either be
> > what we expect or be an empty string):
>
> I was sort of hoping for the full output of a run as a csv file that I could
> point my script at, just for convenience's sake.

(In reply to Ted Mielczarek [:ted] from comment #19)
> Oh, for ease of use on my script, it'd be better if you stuck the version
> number at the end, to keep the other fields in their existing order.

Let me know how attachment 651474 [details] looks.
Comment 22•12 years ago
With a tiny fix, the symbol fetching script accepts this data just fine: http://hg.mozilla.org/users/tmielczarek_mozilla.com/fetch-win32-symbols/rev/84eda43d42c9
Comment 23•12 years ago
Also the data looks about as spotty as I would expect from having looked at a number of crash reports over the years. :)
Assignee | ||
Comment 24•12 years ago
Cool, ok I think I have the changes we need to Socorro to support using the newer metrics repo and pig job etc, just need a bit more testing:
https://github.com/rhelmer/socorro/tree/bug765001-modulelist-pig

This should be able to make the release immediately following mobeta (rapid beta support)
Target Milestone: --- → 19
Assignee | ||
Updated•12 years ago
Target Milestone: 19 → 20
Assignee | ||
Comment 25•12 years ago
r? https://github.com/mozilla/socorro/pull/840 This is a royal pain to test locally, I suggest we get it on dev and test there.
Comment 26•12 years ago
Commits pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/486a479f141d02212ec63e9dd4db0ab0d65c9c9a
bug 765001 - switch to new way of doing mapreduce jobs, starting with modulelist

https://github.com/mozilla/socorro/commit/24128887edb45a6cecd0fdabcb260dd79fa6ddb6
Merge pull request #840 from rhelmer/bug765001-modulelist-pig

bug 765001 - switch to new way of doing mapreduce jobs, starting with mo...
Comment 27•12 years ago
Commit pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/438fa4e1bb985e5a0936c2f457208c9deb8a0e55
Revert "bug 765001 - switch to new way of doing mapreduce jobs, starting with modulelist"

This reverts commit 486a479f141d02212ec63e9dd4db0ab0d65c9c9a.
Assignee | ||
Comment 28•12 years ago
(In reply to [github robot] from comment #27)
> Commit pushed to master at https://github.com/mozilla/socorro
>
> https://github.com/mozilla/socorro/commit/438fa4e1bb985e5a0936c2f457208c9deb8a0e55
> Revert "bug 765001 - switch to new way of doing mapreduce jobs, starting
> with modulelist"
>
> This reverts commit 486a479f141d02212ec63e9dd4db0ab0d65c9c9a.

Had to back this out until Jenkins gets maven installed (bug 792630)
Depends on: 792630
Target Milestone: 20 → 21
Assignee | ||
Updated•12 years ago
Target Milestone: 21 → 23
Comment 29•12 years ago
Commits pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/57a5b2de77cca0ff8b03084a09d8014ab02d803c
bug 765001 - switch to new way of doing mapreduce jobs, starting with modulelist

This reverts commit 438fa4e1bb985e5a0936c2f457208c9deb8a0e55.

https://github.com/mozilla/socorro/commit/50df35a936fb3c0034532f3a63a51359fc4b4bcb
Merge pull request #858 from rhelmer/bug765001-modulelist-version

bug 765001 - switch to new way of doing mapreduce jobs, starting with mo...
Assignee | ||
Updated•12 years ago
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: 23 → 22
Comment 30•12 years ago
Commits pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/b73207aa75d2829e352877b13dd458683eef66e8
bug 765001 - use fatal function correctly

https://github.com/mozilla/socorro/commit/4e25a68e9f302fdb2af3fba47297c154ea465d3d
bug 765001 - specify JAVA_HOME and PIG_CLASSPATH

https://github.com/mozilla/socorro/commit/875c85d2bb9b8b8d45fd65e5af1ae1ee8de07a9f
bug 765001 - upgrade to latest version of socorro-toolbox

https://github.com/mozilla/socorro/commit/fa5cd33f7134ddedae96d7e1de7b96b9efc3b1ae
Merge pull request #860 from rhelmer/bug765001-modulelist-version

Bug765001 modulelist version
Comment 31•12 years ago
Commits pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/813b5da82350c17efcf7adc7d2c1078b916f7e5b
bug 765001 - PIG_CLASSPATH and JAVA_HOME must be exported for pig

https://github.com/mozilla/socorro/commit/2a5d498293c9df3365ca6dbb49c5549350d8aead
Merge pull request #861 from rhelmer/bug765001-modulelist-version

bug 765001 - PIG_CLASSPATH and JAVA_HOME must be exported for pig
Comment 32•12 years ago
Commit pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/f331ebdbd88a901d53925dd8ddc74a3a20ca50e5
Merge pull request #863 from rhelmer/bug765001-modulelist-version

bug 650904 - use SOCORRO_DIR, make system specify JAVA_HOME
Comment 33•12 years ago
Commit pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/ee975822f8bdb23cbb42e92fe1c183e06a29aa9a
Merge pull request #865 from rhelmer/bug765001-modulelist-version

bug 650904 - getmerge requires file as arg
Assignee | ||
Comment 34•12 years ago
OK! we are able to run pig jobs from stage now. Please verify: http://people.mozilla.org/~rhelmer/temp/modulelist-20121004.txt.gz
Updated•12 years ago
OS: Mac OS X → All
Hardware: x86 → All
Whiteboard: [qa-]
Reporter | ||
Comment 35•12 years ago
The data looks good here. Thanks Rob.