Closed Bug 1097262 Opened 10 years ago Closed 10 years ago

Distill vadump data into something supersearch-able

Categories

(Socorro :: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: away, Assigned: away)

References

Details

Attachments

(3 files)

I'd like to be able to slice-and-dice crashes based on information about memory allocation regions. Specifically, for bug 1062065, I want to supersearch on "total size of write-combine regions". I can provide code that shows how to extract this number from a minidump. Can we run it on the processors and insert an annotation? (I'd prefer to avoid doing this on the client since it may add hundreds of milliseconds on a slow machine) For now the write-combine thing is all I want. If we can create a generic system for other vadump metrics, then great, but I don't want to block on that. Ted are you able to work on this and/or recommend a better owner?
Flags: needinfo?(ted)
(In reply to David Major [:dmajor] (UTC+13) from comment #0)
> Can we run it on the processors and insert an annotation?

I guess by "insert an annotation" you mean "put something into the processed JSON"? I hope processed fields are in SuperSearch. :)
Currently we have an alternate minidump-stackwalk-like tool which can dump the VA list. We can add that code to the normal stackwalker, but we don't want to store all of that data in the processed crash because it is large and not useful in general. So basically this would be something like:

* make minidump-stackwalk output the VA data
* make that data available to classifiers, but exclude it from the stored processed crash
* write a classifier to calculate "total size of write-combine regions"
* make sure we can supersearch on that classifier field
In this specific case we'd only care about counters[GFX]
Couldn't we do this more simply by calculating this value in the stackwalker and outputting it in the processed JSON? We already calculate "largest free VM region" using the same data: https://github.com/mozilla/socorro/blob/master/minidump-stackwalk/stackwalker.cc#L694
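For reference, a minimal sketch of that approach (this is not attachment 8521974 itself, which isn't quoted in this bug), assuming Breakpad's MinidumpMemoryInfoList/MDRawMemoryInfo API and its MD_MEMORY_PROTECT_WRITECOMBINE protection flag:

// Sketch only: total the sizes of committed regions whose page protection
// carries the write-combine modifier bit, next to the existing
// "largest free VM block" loop in stackwalker.cc.
uint64_t write_combine_size = 0;
// 'minidump' stands for the Minidump object the stackwalker already has open.
MinidumpMemoryInfoList* memory_info_list = minidump.GetMemoryInfoList();
if (memory_info_list) {
  for (unsigned int i = 0; i < memory_info_list->info_count(); i++) {
    const MDRawMemoryInfo* raw_info =
        memory_info_list->GetMemoryInfoAtIndex(i)->info();
    if (raw_info->state == MD_MEMORY_STATE_COMMIT &&
        (raw_info->protection & MD_MEMORY_PROTECT_WRITECOMBINE)) {
      write_combine_size += raw_info->region_size;
    }
  }
}
// Emit into the processed JSON alongside the existing fields.
root["write_combine_size"] = ToInt(write_combine_size);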
Flags: needinfo?(ted)
For the problem at hand, it sounds like that would work great. Would it get messy if we had a bunch of these things?
Life is messy, it's fine. :)
I don't speak git so I'm posting this here :) Can you try it out?
Attachment #8521974 - Flags: review?(ted)
Ping - this is pretty much my only hope of getting anywhere on the OOMs in bug 1062065.
Comment on attachment 8521974 [details] [diff] [review]
stackwalker patch

Review of attachment 8521974 [details] [diff] [review]:
-----------------------------------------------------------------

This looks pretty sensible. I'll try to build it and give it a spin today. Do you have a sample crash report I could run it on?
Attachment #8521974 - Flags: review?(ted) → review+
> Do you have a sample crash report I could run it on?

Try these and see if you get the same numbers (within MB/MiB tolerance):

34.0a1 bp-1379d5dc-0263-42dc-a0b0-ec2c02140901 2947MB
34.0a1 bp-bdb4c0af-8159-4628-93ff-2281a2140901 3121MB
34.0a1 bp-a3b212d7-505f-44c7-8f62-0d41b2140831 3384MB
Blocks: 1101179
Could you try this version instead? I fixed a typo and added a measurement for the fragmentation issue that I just discovered in bug 1101179.
Attachment #8524872 - Flags: review?(ted)
(In reply to David Major [:dmajor] (UTC+13) from comment #10)
> Try these and see if you get the same numbers (within MB/MiB tolerance):
>
> 34.0a1 bp-1379d5dc-0263-42dc-a0b0-ec2c02140901 2947MB

"tiny_block_size" : "276303872",
"write_combine_size" : "3090685952"

> 34.0a1 bp-bdb4c0af-8159-4628-93ff-2281a2140901 3121MB

"tiny_block_size" : "63107072",
"write_combine_size" : "3272806400"

> 34.0a1 bp-a3b212d7-505f-44c7-8f62-0d41b2140831 3384MB

"tiny_block_size" : "109993984",
"write_combine_size" : "3548434432"
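Converting those write_combine_size values to MiB (my arithmetic, assuming the figures in comment #10 are mebibytes):

3090685952 / 2^20 ≈ 2947.5 MiB (vs. 2947MB)
3272806400 / 2^20 ≈ 3121.2 MiB (vs. 3121MB)
3548434432 / 2^20 ≈ 3384.1 MiB (vs. 3384MB)

All three match within the stated MB/MiB tolerance.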
Comment on attachment 8524872 [details] [diff] [review]
stackwalker patch with fragmentation size

Review of attachment 8524872 [details] [diff] [review]:
-----------------------------------------------------------------

::: stackwalker.cc.orig
@@ +722,5 @@
> +      write_combine_size += raw_info->region_size;
> +    }
> +
> +    if (raw_info->state == MD_MEMORY_STATE_FREE &&
> +        raw_info->region_size < 0x100000) {

What's the constant size here for? (Do you have a comment we can put in here?)

@@ +738,5 @@
>    }
>  
>    root["largest_free_vm_block"] = ToHex(largest_free_block);
> +  root["write_combine_size"] = ToInt(write_combine_size);
> +  root["tiny_block_size"] = ToInt(tiny_block_size);

I had to tweak ToInt to take a uint64_t, we were overflowing 32-bit ints here.
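The original ToInt isn't quoted in this bug; a sketch of a uint64_t-safe version that would produce the quoted decimal strings seen in the output above (the signature and body here are assumptions, not the landed code):

static std::string ToInt(uint64_t value) {
  // Format as a decimal string so values above 2^32 (like the ~3GB
  // write_combine_size figures) don't get truncated to a 32-bit int.
  char buffer[32];
  snprintf(buffer, sizeof(buffer), "%llu",
           static_cast<unsigned long long>(value));
  return std::string(buffer);
}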
Attachment #8524872 - Flags: review?(ted) → review+
Commit pushed to master at https://github.com/mozilla/socorro https://github.com/mozilla/socorro/commit/fad8d33aea7c37fae36bb544da5261a30dba297a bug 1097262 - calculate some more memory stats in processed JSON. r=ted
Assignee: nobody → dmajor
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Thanks for the tweaks and landing! The test numbers look good. Does it need additional work to get into production?

> What's the constant size here for? (Do you have a comment we can put in here?)

Something like: "Minimum block size required by jemalloc and JS allocator"
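For the record, 0x100000 bytes is 1 MiB. In context the comment might read something like this (sketch wording only; the accumulation line inside the block is inferred from the tiny_block_size field in the output, since the quoted hunk cuts off at the brace):

    if (raw_info->state == MD_MEMORY_STATE_FREE &&
        raw_info->region_size < 0x100000) {
      // 0x100000 (1 MiB) is the minimum block size required by the jemalloc
      // and JS allocators; free regions smaller than this can't be used for
      // those allocations, so they count toward fragmentation.
      tiny_block_size += raw_info->region_size;
    }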
This will go live whenever the next Socorro push to production happens.
Target Milestone: --- → 111