818069 - Provide some way to get data equivalent to Ted's dump-lookup tool

Reporter

Description

•

13 years ago

I have this tool: http://hg.mozilla.org/users/tmielczarek_mozilla.com/dump-lookup/ You give it a minidump and it scans the stack of the crashing thread and prints everything that looks like it could possibly be a return address. For crashes with horrible stacks this can be very useful. See bug 817946 comment 6 for an example of the output. I run this occasionally locally, but it requires downloading the minidump and then downloading the matching symbols, which is quite a pain. It would be awesome if there was some way to have Socorro run this on-demand for me, since it has access to all these things already. I wouldn't want to run it automatically since it's not necessary in most cases, but providing the ability for logged-in users to click a button and get the output would be really handy.
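The core idea of the scan described above can be sketched in a few lines. This is only an illustration, not the dump-lookup implementation: the `Module` type, the `scan_stack` function, and the little-endian word layout are all assumptions made for the example, and the real tool additionally resolves each hit to function and source information using the matching symbols.

```python
import struct
from typing import List, NamedTuple, Tuple


class Module(NamedTuple):
    """Hypothetical description of a loaded module's address range."""
    name: str
    base: int
    size: int


def scan_stack(stack: bytes, modules: List[Module],
               word_size: int = 8) -> List[Tuple[int, str, int]]:
    """Walk raw stack memory one word at a time and report every word
    whose value falls inside a loaded module.

    Returns (stack_offset, module_name, offset_in_module) tuples. Any
    such word *might* be a return address; nothing here proves it is.
    """
    fmt = "<Q" if word_size == 8 else "<I"  # assumes little-endian words
    hits = []
    for off in range(0, len(stack) - word_size + 1, word_size):
        (value,) = struct.unpack_from(fmt, stack, off)
        for module in modules:
            if module.base <= value < module.base + module.size:
                hits.append((off, module.name, value - module.base))
                break
    return hits
```

Every hit is only a candidate, which is why the output can be long and is mostly useful when the normal stack walk has gone badly wrong.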

Benjamin Smedberg

Comment 1

•

13 years ago

There's nothing private about the results, right? As long as we hid it behind a POST so that webcrawlers didn't hose our server, we could probably do this on-demand in the middleware.

(not currently active) Ted Mielczarek

Reporter

Comment 2

•

13 years ago

No, the output is totally safe, it's just module, function, source info. I'd only be worried about people DOSing us.

Selena Deckelmann :selenamarie :selena

Comment 3

•

12 years ago

This seems like a priority job for a processor. Does that sound ok to you, :lars?

Laura Thomson :laura

Comment 4

•

12 years ago

We could hook it up to the raw dump tab in the webapp, and run it on demand, so we'd need a mware service. The binary needs to run somewhere that has access to HBase and symbols. This could work via a priority job - I like that idea. Once we do it we should probably save and cache the result, somewhere - PostgreSQL maybe?

Assignee: nobody → sdeckelmann

Target Milestone: --- → 55

Benjamin Smedberg

Comment 5

•

12 years ago

We could also just hook this up so it's included in the default output of minidump-stackwalk, either right now or after we have JSON output. How's that JSON output coming?

Selena Deckelmann :selenamarie :selena

Comment 6

•

12 years ago

(In reply to Benjamin Smedberg [:bsmedberg] from comment #5)
> We could also just hook this up so it's included in the default output of
> minidump-stackwalk, either right now or after we have JSON output. How's
> that JSON output coming?

Patch is on dev! We've got about 800 raw crashes in there now. Will be on stage next week.

Selena Deckelmann :selenamarie :selena

Comment 7

•

12 years ago

Need to revisit this -- unsure what I'm supposed to be doing on this. Was the JSON raw_crash enough?

Target Milestone: 55 → 56

(not currently active) Ted Mielczarek

Reporter

Comment 8

•

12 years ago

If we switch to the JSON-producing minidump_stackwalk we could pretty easily include this output there, but it doesn't currently exist. I think laura assigned this to you to investigate the feasibility of just stuffing the output of this tool (a wall of text) into Postgres.

Selena Deckelmann :selenamarie :selena

Updated

•

12 years ago

Target Milestone: 56 → 57

Wayne Mery (:wsmwk)

Comment 9

•

12 years ago

Perhaps an error, but according to "[tools-socorro] Socorro 57 Released" this landed in production for 57. Where is this in the UI? I'm looking at https://crash-stats.mozilla.com/report/index/327c0b39-6655-4ac6-adf6-96a112130829 for example.

Flags: needinfo?

[DEACTIVATED] Adrian Gaudebert

Comment 10

•

12 years ago

It didn't land with 57.

Flags: needinfo?

Target Milestone: 57 → 58

Selena Deckelmann :selenamarie :selena

Updated

•

12 years ago

Target Milestone: 58 → 59

Selena Deckelmann :selenamarie :selena

Comment 11

•

12 years ago

I believe this depends on us turning json_minidump_stackwalk on. Happy to put it into the database once that's done.

Target Milestone: 59 → ---

(not currently active) Ted Mielczarek

Reporter

Comment 12

•

12 years ago

Sort of, in that we can shoehorn this data into the JSON output, but I don't know that we want to by default. This tool can be pretty verbose, and it's not necessary for 99+% of crashes.

Benjamin Smedberg

Comment 13

•

12 years ago

How big is it? Certainly always including that data would be easier than reprocessing dumps to get the data later.

Robert Kaiser

Comment 14

•

12 years ago

Well, I guess when breakpad can flawlessly walk the frames without guessing, the processor could probably omit running this. That said, we want processing to stay really fast, especially as we are looking forward to processing 100% of all collected crashes to get their classifications so we can message back to the users (we would still put the full data for only 10% of release crashes in the DB for analysis, and store only the classification for user feedback for the rest).

Benjamin Smedberg

Comment 15

•

12 years ago

Here's my UI spec and suggested implementation of this:

* Do not run dump-lookup by default, but run it on demand and store the results for a period of time.
* On the report/index page there should be a new tab, "Stack Lookup".
* If a stack lookup is available, it should be displayed to everyone.
* If a stack lookup has not been run but we still have the minidump, there should be a "Request Stack Lookup" button for logged-in users.
* Requested stack lookups should be run asynchronously as a priority job.
* Stack lookups should be saved in hbase in a new field such as processed_lookup:txt with an expiration of 90 days.

I realize this involves some moving parts and so it's not a trivial change, but this would help normal Mozilla engineers a lot.
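The display rules in this spec reduce to a small decision function. The sketch below is one possible reading of them, not existing Socorro code; `LookupTabState` and `stack_lookup_tab_state` are hypothetical names, and the three boolean inputs are assumed to be available to the report/index view.

```python
from enum import Enum


class LookupTabState(Enum):
    """Hypothetical states for the proposed "Stack Lookup" tab."""
    SHOW_RESULT = "show_result"
    SHOW_REQUEST_BUTTON = "show_request_button"
    NOTHING = "nothing"


def stack_lookup_tab_state(has_lookup: bool, has_minidump: bool,
                           logged_in: bool) -> LookupTabState:
    """Decide what the tab shows for a given crash report and viewer."""
    if has_lookup:
        # A stored lookup is displayed to everyone, logged in or not.
        return LookupTabState.SHOW_RESULT
    if has_minidump and logged_in:
        # No lookup yet, but the minidump still exists: offer the button.
        return LookupTabState.SHOW_REQUEST_BUTTON
    # Either the minidump has expired or the viewer is anonymous.
    return LookupTabState.NOTHING
```

Checking the stored result first means an anonymous visitor still sees a lookup that someone else requested earlier.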

K Lars Lohn [:lars] [:klohn]

Comment 16

•

11 years ago

Implementation of the backend of this could be sped up significantly if the output of the "dump-lookup" could be saved in the processed_crash itself and share its retention policy. I'm imagining this implementation:

1) middleware has method to:
  1.1) fetch a raw_crash with a given 'crash_id'
  1.2) add a 'dump-lookup' flag to it
  1.3) resave the raw_crash
  1.4) put the 'crash_id' into the reprocessing queue
2) processor
  2.1) reprocess normally with all the standard rules
  2.2) using a new rule, if 'dump-lookup' flag is present in the raw crash, invoke 'dump-lookup' tool and save results in a new 'stack-lookup' key in the processed crash
  2.3) save the processed_crash normally

In the UI: if the "stack-lookup" is present in the processed crash, display it.

This same method can be used to solve Bug 977778.
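The flow proposed in this comment could look roughly like the sketch below. It uses in-memory dictionaries and a deque as stand-ins for crash storage and the reprocessing queue, so none of the names (`raw_crashes`, `processed_crashes`, `reprocessing_queue`, `request_dump_lookup`, `process_next`) are real Socorro APIs. The tool itself is passed in as a callable so the example makes no assumption about how dump-lookup is invoked.

```python
from collections import deque
from typing import Callable, Dict

# In-memory stand-ins for crash storage and the reprocessing queue.
raw_crashes: Dict[str, dict] = {}
processed_crashes: Dict[str, dict] = {}
reprocessing_queue: deque = deque()


def request_dump_lookup(crash_id: str) -> None:
    """Middleware side (steps 1.1 to 1.4)."""
    raw = raw_crashes[crash_id]          # 1.1 fetch the raw_crash
    raw["dump-lookup"] = True            # 1.2 add the flag
    raw_crashes[crash_id] = raw          # 1.3 resave the raw_crash
    reprocessing_queue.append(crash_id)  # 1.4 queue for reprocessing


def process_next(run_dump_lookup: Callable[[str], str]) -> str:
    """Processor side (steps 2.1 to 2.3) for the next queued crash."""
    crash_id = reprocessing_queue.popleft()
    raw = raw_crashes[crash_id]
    # 2.1 placeholder for the standard processing rules
    processed = {"crash_id": crash_id}
    # 2.2 the new rule: only run the tool when the flag is present
    if raw.get("dump-lookup"):
        processed["stack-lookup"] = run_dump_lookup(crash_id)
    # 2.3 save the processed_crash as usual
    processed_crashes[crash_id] = processed
    return crash_id
```

Because the result lives inside the processed crash, the UI check is just a key lookup, and the output shares the processed crash's retention policy without needing a separate expiry mechanism.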

K Lars Lohn [:lars] [:klohn]

Updated

•

11 years ago

Depends on: 1121462

K Lars Lohn [:lars] [:klohn]

Updated

•

11 years ago

Depends on: 1121469

Selena Deckelmann :selenamarie :selena

Updated

•

9 years ago

Assignee: sdeckelmann → nobody

Lonnen :lonnen

Updated

•

9 years ago

Flags: needinfo?(chris.lonnen)

Lonnen :lonnen

Updated

•

9 years ago

Flags: needinfo?(chris.lonnen)

Lonnen :lonnen

Comment 17

•

8 years ago

Discussing possible processor changes with team, sec, etc. We may need to consider more carefully which processes we run if we pursue this.

Categories

(Socorro :: General, task)

Tracking

(Not tracked)

People

(Reporter: ted, Unassigned)

References

Details

Crash Data

Security

(public)
