Last Comment Bug 845609 - Need a way to diff memory reports
: Need a way to diff memory reports
Status: RESOLVED FIXED
[MemShrink]
:
Product: Toolkit
Classification: Components
Component: about:memory (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla22
Assigned To: Nicholas Nethercote [:njn]
: about.memory
:
Mentors:
: 791873 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-02-26 16:59 PST by Nicholas Nethercote [:njn]
Modified: 2013-03-25 16:08 PDT (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
(part 1) - Add a script to diff JSON memory report dumps. (12.11 KB, patch)
2013-02-27 22:03 PST, Nicholas Nethercote [:njn]
bugmail: review+
Details | Diff | Splinter Review
(part 2) - Add a diff mode to about:memory. (4.80 KB, patch)
2013-02-27 22:04 PST, Nicholas Nethercote [:njn]
justin.lebar+bug: review+
Details | Diff | Splinter Review

Description Nicholas Nethercote [:njn] 2013-02-26 16:59:13 PST
The ability to get the diff of two memory reports will be very useful.  The main question is how to do it.

- Diffing about:memory (or about:memory?verbose) output is probably a bad way to do it.

- Diffing the JSON reports is probably a much better way to do it.  There exist lots of JSON diffing tools, though I don't know how good a job they do and if we need something custom that understands our JSON schema.  I have a vague idea that kats already has a script that does this.

- I really want this capability on AWSY as well.  It queries the memory reporters directly and I don't know anything about the internal data structure it uses to present the data in tree form.

Any thoughts about the best way to proceed?
Comment 1 Nicholas Nethercote [:njn] 2013-02-26 16:59:43 PST
*** Bug 791873 has been marked as a duplicate of this bug. ***
Comment 2 John Schoenick [:johns] 2013-02-26 17:09:36 PST
You can see an example of AWSY's json format for build dumps here:

[Warning: 12M] http://people.mozilla.com/~jschoenick/example_awsy_dump.json

The biggest problem would be memory reporter names that change, e.g. ones with non-deterministic window IDs and such. For AWSY, we could just crawl both trees diffing matching nodes, but handling nodes that diverge is tricky. If two windows have identical child nodes, but represent different websites, it doesn't make sense to diff them just because dump A has one window and dump B has the other.

when I get some more time I can add a "dumb" memory differ to AWSY that ignores divergent nodes and we can see how useful that is, though something that diffs memory reports from vastly different browsers/OSes/scenarios would probably only show useful numbers for the top levels of the tree.
Comment 3 Nicholas Nethercote [:njn] 2013-02-26 19:09:24 PST
> The biggest problem would be memory reporter names that change, e.g. ones
> with non-deterministic window IDs and such.

Good point.  billm's zones patch will add some addresses in there two.  It might help to filtering out the non-deterministic bits before diffing.
Comment 4 Kartikaya Gupta (email:kats@mozilla.com) 2013-02-26 20:26:33 PST
(In reply to Nicholas Nethercote [:njn] from comment #0)
> I have a
> vague idea that kats already has a script that does this.
> 

My quick/dirty tool to do this is at https://github.com/staktrace/awsy-armv6/blob/master/awsy-data-generator/Differ.java. The gist of it is that it builds a cumulative size tree from the JSON data (basically the same thing you see on AWSY for any given data point) and then does a dumb compare of the two trees, sorting the results at a given level of the tree by size difference. See example output at https://bug837131.bugzilla.mozilla.org/attachment.cgi?id=709064

I found that this was decent enough to get a quick idea of what might account for large differences, but also found there was a lot of noise in the diffs and matching up nodes in the two trees better might have reduced some of that.
Comment 5 Justin Lebar (not reading bugmail) 2013-02-26 22:53:46 PST
I think ideally we'd show the diff in about:memory; I at least have been quite happy with how we show the B2G memory reports in about:memory.  We could generate the diff using an external tool and then load it into about:memory, or we could generate the diff in about:memory from the two source JSON files.  Logistically, posting a diff as a single file instead of as two JSON files is simpler, so maybe we'd want to at least support the single-file thing.

I don't think it would be particularly complicated to hack "diff mode" into about:memory.
Comment 6 Nicholas Nethercote [:njn] 2013-02-27 22:03:55 PST
Created attachment 719328 [details] [diff] [review]
(part 1) - Add a script to diff JSON memory report dumps.

For the moment I've taken the "diff with an external tool" approach.  The
attached patch puts this script in a new dir
toolkit/components/aboutmemory/tools/, which seemed as good a place as any.

I wrote the script in JS so that it can be incorporated into aboutMemory.js
later if we decide that's useful.

It doesn't do any kind of stripping of non-deterministic info because I haven't
found it necessary yet.  If you're diffing two reports taken within a single
browser session a lot of those things (e.g. window identifiers, zone addresses
once they're added) will be the same.

This doesn't help us with AWSY, but I figured it's a reasonable place to start.
Comment 7 Nicholas Nethercote [:njn] 2013-02-27 22:04:34 PST
Created attachment 719329 [details] [diff] [review]
(part 2) - Add a diff mode to about:memory.

This patch adds an undocumented "diff" mode to aboutMemory.js.  When enabled, 
it doesn't complain about negative amounts and percentages greater than 100,
and it sorts sub-trees according to their absolute size.

It's invoked with e.g. "about:memory?diff" or "about:memory?verbose&diff".
Comment 8 Kartikaya Gupta (email:kats@mozilla.com) 2013-02-28 04:54:52 PST
Comment on attachment 719328 [details] [diff] [review]
(part 1) - Add a script to diff JSON memory report dumps.

Review of attachment 719328 [details] [diff] [review]:
-----------------------------------------------------------------

The patch looks pretty good, I have a bunch of fairly minor comments below. I would like to see some code to handle stripping out hex addresses from the path, though, unless you think it's not needed for some reason. Going to about:memory in my current Firefox instance I see things like "worker(resource://gre/modules/osfile/osfile_async_worker.js, 0x117075800)" where the address will almost certainly change between runs and clutter up the diff unnecessarily.

(I guess my use case for this tool is mostly to compare between two snapshots on different runs of FF, whereas in a comment above you said your use case was to compare two snapshots from the same run of FF at different times, which is why I view the address-stripping as more important).

No serious issues though, so r+'ing.

::: toolkit/components/aboutmemory/tools/diff-memory-reports.js
@@ +55,5 @@
> +  this._nMerged = aNMerged;
> +}
> +
> +Report.prototype = {
> +  assertCompatible: function(aKind, aUnits)

I think this function should take just a report as an argument, rather than specific fields from the report. That way the function can be enhanced to check other things without affecting call sites too.

@@ +122,5 @@
> +// Return a new reportMap which is the diff of two reportMaps.  Empties
> +// aReportMap2 along the way.
> +function diffReportMaps(aReportMap1, aReportMap2)
> +{
> +  let reportMap3 = {};

Not a fan of "reportMap3" as a name; I'd prefer something more conventional like "result"

@@ +162,5 @@
> +      // copy, and then use amount=0 in the remainder.  When viewed in
> +      // about:memory, this shows up as an entry with a "[2]"-style suffix
> +      // and the correct amount.
> +      let split = processPath.split('::');
> +      assert(split.length == 2);

I feel like this assert could very easily be false if the path contains '::' (and since the paths sometimes contain URLs, it's quite possible). I'd feel better if this code handled that case rather than asserting.

@@ +191,5 @@
> +
> +// Diff two JSON objects holding memory reports.
> +function diff(aJson1, aJson2)
> +{
> +  let json3 = {}

Same here; I'd prefer a name like "result"

@@ +201,5 @@
> +
> +  json3.hasMozMallocUsableSize = aJson1.hasMozMallocUsableSize;
> +  assert(json3.hasMozMallocUsableSize !== undefined &&
> +         json3.hasMozMallocUsableSize === aJson2.hasMozMallocUsableSize,
> +         "hasMozMallocUsableSize properties don't match");

This hunk of code to copy a property and throw if it's not the same in both files could be pulled out into a helper method.
Comment 9 John Schoenick [:johns] 2013-02-28 12:04:29 PST
We have issue #21 to implement this on AWSY:
  https://github.com/Nephyrin/MozAreWeSlimYet/issues/21
Comment 10 Nicholas Nethercote [:njn] 2013-02-28 17:11:25 PST
> I would like to see some code to handle stripping out hex addresses from the
> path

I've added stripping for hex addresses and the PID listed in the process... without the latter comparisons between different runs didn't work at all :)


> > +Report.prototype = {
> > +  assertCompatible: function(aKind, aUnits)
> 
> I think this function should take just a report as an argument, rather than
> specific fields from the report.

I'd like that too, but at one of the call sites it has a Report object, and at the other it has a JSON report object.  I.e. the types don't match.
Comment 11 Justin Lebar (not reading bugmail) 2013-02-28 17:25:24 PST
Comment on attachment 719329 [details] [diff] [review]
(part 2) - Add a diff mode to about:memory.

That was surprisingly simple...

> +TreeNode.compareAmounts = function(aA, aB) {
> +  let a, b;
> +  if (gIsDiff) {
> +    a = Math.abs(aA._amount), b = Math.abs(aB._amount);
> +  } else {
> +    a = aA._amount, b = aB._amount;
> +  }

Please use semicolons here (or something) instead of commas.  If you care about
brevity, you could do

  let a = gIsDiff ? Math.abs(aA._amount) : aA._amount;
  let b = gIsDiff ? Math.abs(aB._amount) : aB._amount;

.
Comment 12 Nicholas Nethercote [:njn] 2013-02-28 18:48:10 PST
I ended up moving the test entirely within the script.  It was neater and allowed me to write comments within the test objects.

https://hg.mozilla.org/integration/mozilla-inbound/rev/0d6047885fbf
https://hg.mozilla.org/integration/mozilla-inbound/rev/68aa2847d965

Note You need to log in before you can comment on or make changes to this bug.