Closed Bug 1014343 Opened 10 years ago Closed 10 years ago

DMD: add ability to do diffs of memory dumps

Categories

(Core :: DMD, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla36

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

Attachments

(1 file)

dbaron described on dev-platform one way he still uses trace-malloc:

> One is to check for leaks that involve caches (i.e., don't involve
> unreachable pointers).  One can take two memory dumps at different
> times and build a allocation-stack-tree-diff of the two dumps.  Thus
> one can get into a certain state, do things that should lead to the
> memory usage being the same as the first state, and see if anything
> interesting increased (with allocation stacks).  This has been
> useful for finding caches that weren't being cleared properly.
I've dabbled in a tool to do this for CC logs.  One thing to watch out for is "replay attacks": an object at address 0xFOO can get freed, then another object allocated at the same address.  I'm not sure if you can deal with this problem without some kind of explicit support from the logging, like "Oh, here's a new log, and FYI, these objects that were in the old log have been freed."  That sounds doable, but it will make it more annoying.  Maybe trace-malloc suffers from this, and a feature like that isn't really critical.
> I've dabbled in a tool to do this for CC logs.  One thing to watch out for
> is "replay attacks": an object at address 0xFOO can get freed, then another
> object allocated at the same address.

Valgrind has a hack to work around this behaviour in a similar case: it holds onto freed blocks for a while so they don't get recycled too quickly. There's a flag that lets you control how long the blocks are held onto.
The trace-malloc tool in question doesn't care about addresses, it only cares about total amount of memory allocated at a given stack prefix.
This patch implements diffs. Changes of note in dmd.py:

- Previously the file reading and output printing was done in a single function
  (main). Now it's split into two, because we have to read two files, diff
  them, and then print the output for the diff. I've used the term "digest" for
  the intermediate data in this process. See main() to understand this better.

- There's now code for diffing digests.

- All the sorting code now uses absolute values.

- The way blocks are aggregated into records has changed. It used to be based
  on the base-32 frame keys (e.g. "AXd") of the |allocatedAt| and |reportedAt|
  stack traces -- if two blocks had matching frame keys for all their traces,
  they'd be put in the same record.
  
  It's now based on the frame values (e.g. "#00: foo (X.cpp:99)") of
  those same stack traces. This is because you can't sensible compare
  frame keys from two different DMD runs, because they're quasi-random, but you
  can compare frame values.

There's also some new tests, and test_dmd.js needed some refactoring to handle
tests that take two filenames.
Attachment #8506571 - Flags: review?(continuation)
Assignee: nobody → n.nethercote
Status: NEW → ASSIGNED
Comment on attachment 8506571 [details] [diff] [review]
Add diff support to dmd.py

Review of attachment 8506571 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/replace/dmd/dmd.py
@@ +69,3 @@
>          self.usableSizes = collections.defaultdict(int)
>  
> +    def isZero(self, args):

Please add a comment to the effect that Record can be used to store the difference of two DMD records as well as an individual record.  Maybe also add a comment on these three methods that they are for diffs?

Part of me thinks it would be good to have RecordDiff as a subclass of Record, but I guess that's overengineering.

@@ +128,5 @@
>  
>      @staticmethod
>      def cmpByUsableSize(r1, r2):
>          # Sort by usable size, then req size, then by isSampled.
> +        return cmp(abs(r1.usableSize), abs(r2.usableSize)) or \

Maybe this is a matter of taste, but I'd think you'd want all of the positive diffs (from largest to smallest), then all of the negative diffs (from largest to smallest), not mixed all together like this.  If I get the diff of two DMD logs, I'm usually either interested in either what appeared or (sometimes) what went away, not both.

@@ +468,5 @@
> +    return records3
> +
> +
> +def diffDigests(args, d1, d2):
> +    d3 = {}

It isn't a big deal, but the way you create the digest here is inconsistent with how the digest is created above.

::: memory/replace/dmd/test/script-diff1.json
@@ +1,5 @@
> +{
> + "version": 1,
> + "invocation": {
> +  "dmdEnvVar": "--sample-below=127",
> +  "sampleBelowSize": 127 

micro nit: trailing space
Attachment #8506571 - Flags: review?(continuation) → review+
> Maybe this is a matter of taste, but I'd think you'd want all of the
> positive diffs (from largest to smallest), then all of the negative diffs
> (from largest to smallest), not mixed all together like this.

I strongly disagree. For one, about:memory used to behave like you suggest and I ended up changing it to the sort-on-absolute approach because it ends up being more what people want. Secondly, because dmd.py only shows the first 1000 records, doing it your way is likely to cut off all of the large negative diff records :(
Yeah, I guess you'd have to sort by absolute value, trim off the excess, then sort by non-absolute value. I can always add some additional sort options later if it bothers me.
https://hg.mozilla.org/mozilla-central/rev/9fe610b8aa4f
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: