Closed Bug 918207 Opened 11 years ago Closed 11 years ago

Fast per-tab memory profiling

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla27

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

(Blocks 1 open bug)

Details

(Whiteboard: [MemShrink:P1])

Attachments

(3 files, 2 obsolete files)

It'd be great to be able to quickly measure the memory consumption of a single tab.  This could be the basis of a "tab monitor" widget, which e.g. measures it once per second and graphs it over time.

The measurement needs to be fast so that it doesn't interfere too much with normal browser execution.  Its accuracy need not be perfect, though it must not miss too much stuff.
Summary: Provide a way to quickly measure the memory consumption of a single tab → Fast per-tab memory profiling
This version uses the same (very thorough and non-optimized) paths as
about:memory, but it only measures the current tab.

See browser/devtools/commandline/BuiltinCommands.jsm for example usage, where I
implemented a basic "mem" GCLI command that measures the tab and prints the
results.
BTW, the attached patch only measures the DOM/Other measurements for the top window;  it misses all the child windows (iframes, etc) in the tab.  I'll fix this oversight tomorrow.
Could this also be the basis for a "window.performance.memoryUsage" feature that authors can use to take memory measurements at certain parts of their code? That'd help with debugging memory leaks.

(Note that I'm aware of side-channel information leak issues that will probably prevent this from being activated by default, but it could be activated (maybe per-site) in the devtools).
> Could this also be the basis for a "window.performance.memoryUsage" feature
> that authors can use to take memory measurements at certain parts of their
> code?

Not sure.  It requires access to the memory reporter manager, which I think you can only do from chrome code.  I guess we could find a way to expose that info to content if it's deemed safe, but that's beyond the scope of this bug.
Depends on: 919889
Blocks: 919950
OS: Linux → All
Hardware: x86_64 → All
Depends on: 920852
Depends on: 921923
This version makes the following changes.

- Now measures style memory consumption separately (as an extra outparam of
  nsIMemoryReporter::sizeOfTab()).

- Now measures all the window objects in a tab, not just the top one.

Measuring a fully-loaded TechCrunch on my new, fast desktop machine takes about
50 ms.  (On my two-year old Mac laptop it's about 60ms.)  Measuring Gmail takes
about 30 ms on my desktop.  There's scope for getting that down considerably,
though I'm not worrying too much about performance right now;  I figure I can
work on that more if/when this lands.
Panos, have you had a chance to try the patches out?  I'd like to have confirmation that the interface I've exposed is suitable.  Thanks!
Flags: needinfo?(past)
I just did, and the responsiveness improvement is impressive! I'm working on using this API to create a little monitor widget embedded in the developer toolbar that will constantly display memory usage when enabled.

Is this API available from chrome workers for extra jank-free operation, or is the measurement impact already going to affect page scrolling/rendering making it unnecessary?
Flags: needinfo?(past)
> Is this API available from chrome workers for extra jank-free operation

Nope, sorry.  The memory reporters are strongly tied to the main thread.  E.g. the JS reporter basically pauses the main JS runtime so it can take a heap census.  I don't think this is ever going to be completely jank-free :/
A new version, ready for review...

- billm, can you review the JS/XPConnect changes?

- bz, can you review the DOM/layout/XPCOM changes?

- Panos, the main change over the previous versions is that the time taken to
  do the measurement is now available via outparams of sizeOfTab.  You might be
  able to use this to automatically adjust the sampling frequency so that
  responsiveness isn't harmed too much.  (The user could select how much
  perturbation they're willing to endure, e.g. 10% of time spent on memory
  measurements.)

This patch implements coarse-grained per-tab memory measurements, which will be
used for a devtool memory ticker widget (bug 923275) that Panos is working on.

Notes for both reviewers:
- The existing fine-grained memory reporters are re-used by this patch, but
  they are coarsened by putting them into the new TabSizes/nsTabSizes structs,
  which feature only a handful of buckets (i.e. many fewer than about:memory
  provides).

Notes for bz:
- The FOR_EACH_SIZE macro is just like one that's already in use in
  js/public/MemoryMetrics.h;  it's a bit ugly but makes it impossible to omit a
  field in any operation involving all fields.

- nsMemoryReporterManager gets a couple of new fields to hold callbacks.  It
  already has some fields for existing callbacks.

One shortcoming is that there are no tests.  I'm currently working on a JSAPI
that constructs an exact number of objects and strings whose sizes should be
verifiable.  Measuring the other JS stuff (e.g. shapes) is much harder and I
don't have any ideas how to do that.  Testing the measurements of non-JS stuff
is also hard;  DOM nodes might be possible, but style and other stuff is much
harder.

There is definitely also scope for making sizeOfTab() faster, but that can be a
follow-up.
Attachment #816435 - Flags: review?(wmccloskey)
Attachment #816435 - Flags: review?(bzbarsky)
Attachment #812448 - Attachment is obsolete: true
Attachment #808428 - Attachment is obsolete: true
bz is on vacation for a couple of weeks.
Comment on attachment 816435 [details] [diff] [review]
Support per-tab memory profiling.

Thanks for letting me know... and sorry :)
Attachment #816435 - Flags: review?(bzbarsky) → review?(continuation)
Comment on attachment 816435 [details] [diff] [review]
Support per-tab memory profiling.

Review of attachment 816435 [details] [diff] [review]:
-----------------------------------------------------------------

I only reviewed the changes to Iteration.cpp, jsgc.h, jsobj.cpp, and jsobj.h. The rest I don't understand.

::: js/src/gc/Iteration.cpp
@@ +54,5 @@
>                                          JSIterateCompartmentCallback compartmentCallback,
>                                          IterateArenaCallback arenaCallback,
>                                          IterateCellCallback cellCallback)
>  {
>      AutoPrepareForTracing prop(rt);

This should probably be called prep.

@@ +70,5 @@
> +                                       JSIterateCompartmentCallback compartmentCallback,
> +                                       IterateArenaCallback arenaCallback,
> +                                       IterateCellCallback cellCallback)
> +{
> +    AutoPrepareForTracing prop(rt);

This too.
Attachment #816435 - Flags: review?(wmccloskey) → review+
Comment on attachment 816435 [details] [diff] [review]
Support per-tab memory profiling.

Review of attachment 816435 [details] [diff] [review]:
-----------------------------------------------------------------

I looked at the other files in js/ (including the xpconnect stuff) and it looks okay too.
Comment on attachment 816435 [details] [diff] [review]
Support per-tab memory profiling.

Till, can you review just MemoryMetrics.{h,cpp}?  You're probably not very familiar with it, but with jlebar gone nobody (other than me) really is.  Basically, it's all memory reporting stuff, and this patch is adding a new way to invoke the memory reporters, one that works with a single zone (instead of all zones) and aggregates the existing fine-grained measurements into TabSizes, which is a new struct holding coarse-grained measurements.

A couple of things worth noting:
- AddSizeOfTab() is a simplified version of the existing CollectRuntimeStats() function.  Looking at the two in parallel will be useful.

- The FOR_EACH_SIZE macros in MemoryMetrics.h are a recent addition.  We used to not have them, and we'd just have all these structs with all these fields and I always worried that someone would write a function that was supposed to do something with every field (e.g. a constructor, or copy constructor, or an add() function) and miss a field.  Or add a field and not update all the functions appropriately.  FOR_EACH_SIZE is a bit ugly, but makes it much harder for that kind of error to occur.

Thanks!
Attachment #816435 - Flags: review?(till)
I'll review it, too, but you should get a layout person to review nsArenaMemoryStats.h.  It looks like roc reviewed the initial landing of the code in that file.
Whiteboard: [MemShrink] → [MemShrink:P1]
Comment on attachment 816435 [details] [diff] [review]
Support per-tab memory profiling.

Review of attachment 816435 [details] [diff] [review]:
-----------------------------------------------------------------

I did try my best to understand what the changes in MemoryMetrics.* do. Based on that, and FWIW, I think this is all good.

::: js/src/vm/MemoryMetrics.cpp
@@ +555,5 @@
> +    if (!rtStats.zoneStatsVector.reserve(1))
> +        return false;
> +
> +    if (!rtStats.compartmentStatsVector.reserve(zone->compartments.length()))
> +        return false;

Maybe switch these two around? It's not important at all, but I think I'd like for this function's and CollectRuntimeStats' basic structure to be the same as much as possible.
Attachment #816435 - Flags: review?(till) → review+
Comment on attachment 816435 [details] [diff] [review]
Support per-tab memory profiling.

Review of attachment 816435 [details] [diff] [review]:
-----------------------------------------------------------------

froydnj, can you please review the parts relating to nsArenaMemoryStats?  Comment 14 (and to a lesser extent, comment 9) gives some background.  Thanks.
Attachment #816435 - Flags: review?(nfroyd)
Comment on attachment 816435 [details] [diff] [review]
Support per-tab memory profiling.

Review of attachment 816435 [details] [diff] [review]:
-----------------------------------------------------------------

nsTabStats doesn't really belong here.  I think nsWindowMemoryReporter should be taking the responsibility of aggregating measurements for both the fast and slow paths; nsArenaMemoryStats should just be recording the raw numbers.  You lose the niceness of FOR_EACH_SIZE, unfortunately (or the export of it gets complicated).

::: layout/base/nsArenaMemoryStats.h
@@ +16,5 @@
> +      Style,      // Style stuff.
> +      Other       // Everything else.
> +  };
> +
> +  nsTabSizes() { mozilla::PodZero(this); }

Needs a #include "mozilla/PodOperations.h"

@@ +24,5 @@
> +    switch (kind) {
> +      case DOM:   mDom   += n; break;
> +      case Style: mStyle += n; break;
> +      case Other: mOther += n; break;
> +      default:    MOZ_CRASH("bad nsTabSizes kind");

Needs a #include "mozilla/Assertions.h"

@@ +40,5 @@
> +
> +struct nsArenaMemoryStats {
> +#define FOR_EACH_SIZE(macro) \
> +  macro(Other, mLineBoxes) \
> +  macro(Other, mRuleNodes) \

This should be Style, not Other.

@@ +50,5 @@
> +      #define FRAME_ID(classname) FRAME_ID_STAT_FIELD(classname)(),
> +      #include "nsFrameIdList.h"
> +      #undef FRAME_ID
> +      dummy()
> +  {}

++ for having a real constructor here.  You could just mozilla::PodZero(this) instead of initializing individual fields.

@@ +52,5 @@
> +      #undef FRAME_ID
> +      dummy()
> +  {}
> +
> +  void updateTabSizes(nsTabSizes *sizes) const

Ergo, this should be moved to somewhere within nsWindowMemoryReporter's implementation.
Attachment #816435 - Flags: review?(nfroyd) → feedback+
Comment on attachment 816435 [details] [diff] [review]
Support per-tab memory profiling.

Review of attachment 816435 [details] [diff] [review]:
-----------------------------------------------------------------

A DOM peer should look at the changes in nsWindowMemoryReporter.cpp.  I don't know too much about the window-y stuff you are doing there.  I nominate Olli.  Sorry to be a pain about this, I know how annoying it is to get things landed that cut across multiple modules.  This is an advantage of having bz review things, who is a peer of layout and DOM, vs me, who is a peer of neither.

I didn't review any of the js/ aside from the XPConnect change.

r=me assuming you get a layout person to review nsArenaMemoryStats.h.

::: dom/base/nsWindowMemoryReporter.cpp
@@ +32,5 @@
>  NS_IMPL_ISUPPORTS3(nsWindowMemoryReporter, nsIMemoryReporter, nsIObserver,
>                     nsSupportsWeakReference)
>  
> +static nsresult
> +AddNonJSSizeOfWindowAndItsDescendents(nsGlobalWindow* aWindow, nsTabSizes* sizes)

aSizes

Maybe you should assert that aWindow is an outer window?

@@ +34,5 @@
>  
> +static nsresult
> +AddNonJSSizeOfWindowAndItsDescendents(nsGlobalWindow* aWindow, nsTabSizes* sizes)
> +{
> +  // Measure the outer window.

The whitespace around these comments seems a little odd to me.  Maybe either get rid of the blank line right between the section comment and the section that implements it, or something?

::: dom/base/nsWindowMemoryReporter.h
@@ +37,5 @@
> +  macro(Other, mLayoutPresShell) \
> +  macro(Style, mLayoutStyleSets) \
> +  macro(Other, mLayoutTextRuns) \
> +  macro(Other, mLayoutPresContext) \
> +  macro(Other, mPropertyTables) \

Given that these are all the same type, an alternate way to implement this would be what the GC does, which is have an array, and then a bunch of enums for each case.  For instance, see Phase in gc/Statistics.h.  Then you don't need these weird higher order macros, which I find hard to understand, even as somebody who has done a ton of programming in pure higher-order dependently typed programming languages.  To deal with the extra "kind" info (ie DOM vs. Style vs. Other), you can order the enums, and add extra limits for each kind, like what AllocKind does (FINALIZE_OBJECT_LAST).

Anyways, I'm not asking you to rewrite it or anything, just some food for thought.

@@ +43,5 @@
>  public:
> +  nsWindowSizes(mozilla::MallocSizeOf aMallocSizeOf)
> +    : FOR_EACH_SIZE(NS_ZERO_SIZE)
> +      mArenaStats(),
> +      mMallocSizeOf(aMallocSizeOf)

It looks like you could just PodZero this then set mMallocSizeOf. I don't know if that's an improvement or not.

@@ +46,5 @@
> +      mArenaStats(),
> +      mMallocSizeOf(aMallocSizeOf)
> +  {}
> +
> +  void updateTabSizes(nsTabSizes *sizes) const {

Here and elsewhere, "updateTabSizes" is not a descriptive name.  Update how?  Maybe "addToTabSizes"?

::: js/xpconnect/src/XPCJSRuntime.cpp
@@ +2674,5 @@
>      return NS_OK;
>  }
>  
> +static nsresult
> +JSSizeOfTab(JSObject* obj, size_t* jsObjectsSize, size_t* jsStringsSize,

* to the right, I think...

::: layout/base/nsArenaMemoryStats.h
@@ +35,5 @@
>  };
>  
> +#define NS_DECL_SIZE(kind, mSize)        size_t mSize;
> +#define NS_ZERO_SIZE(kind, mSize)        mSize(),
> +#define NS_UPDATE_TAB_SIZES(kind, mSize) sizes->add(nsTabSizes::kind, mSize);

Maybe these 3 macro names could end in _REPORTER_SIZE or something instead of _SIZE?  The names as-is are quite generic given how specialized they are for the higher-order-macropalypse, and they are being used across multiple files.

@@ +52,5 @@
> +      #undef FRAME_ID
> +      dummy()
> +  {}
> +
> +  void updateTabSizes(nsTabSizes *sizes) const

I don't really understand froydnj's comment on updateTabSizes.  This file looks reasonable-ish to me.

::: xpcom/base/nsIMemoryReporter.idl
@@ +329,5 @@
>  };
>  
>  %{C++
>  
> +#include "js/TypeDecls.h"

This include seems like overkill if you are just using it to forward declare JSObject, but you know more than me about the philosophy of that file, so whatever you think is best.

@@ +388,5 @@
> +                                     size_t* aStyleSize,
> +                                     size_t* aOtherSize);
> +
> +#define DECL_REGISTER_SIZE_OF_TAB(name) \
> +    nsresult Register##name##SizeOfTab(name##SizeOfTabFn aSizeOfTabFn);

This kind of seems like macro overkill, given that there are only two cases here (are we ever going to have something besides JS and non-JS?), and it makes it hard to search for the registration functions, but okay.

::: xpcom/base/nsMemoryReporterManager.cpp
@@ +1305,5 @@
> +
> +    TimeStamp t3 = TimeStamp::Now();
> +
> +    *aTotalSize = 0;
> +    #define DO(aN, n) { *aN = (n); *aTotalSize += (n); }

Maybe ACCUMULATE_SIZE instead of the not-very-descriptive DO?  This is pretty simple and locally scoped so not a big deal.
Attachment #816435 - Flags: review?(continuation)
Attachment #816435 - Flags: review?(bugs)
Attachment #816435 - Flags: review+
Comment on attachment 816435 [details] [diff] [review]
Support per-tab memory profiling.


>+static nsresult
>+AddNonJSSizeOfWindowAndItsDescendents(nsGlobalWindow* aWindow, nsTabSizes* sizes)
aSizes

>+{
>+  // Measure the outer window.
>+
>+  nsWindowSizes outerWindowSizes(moz_malloc_size_of);
>+  aWindow->AddSizeOfIncludingThis(&outerWindowSizes);
>+  outerWindowSizes.updateTabSizes(sizes);
>+
>+  // Measure the inner window.
>+
>+  nsWindowSizes innerWindowSizes(moz_malloc_size_of);
>+  nsRefPtr<nsGlobalWindow> inner = aWindow->GetCurrentInnerWindowInternal();
No need for nsRefPtr

>+  for (uint32_t i = 0; i < length; i++) {
>+      nsCOMPtr<nsIDOMWindow> child;
>+      rv = frames->Item(i, getter_AddRefs(child));
>+      NS_ENSURE_SUCCESS(rv, rv);
Based on the ::Item implementation success check isn't too useful.
NS_ENSURE_STATE(child); would be better

>+
>+      nsGlobalWindow *childWin =
* goes with the type.

>+        static_cast<nsGlobalWindow *>(static_cast<nsIDOMWindow *>(child.get()));
ditto


>+static nsresult
>+NonJSSizeOfTab(nsPIDOMWindow* aWindow, size_t* aDomSize, size_t* aStyleSize, size_t* aOtherSize)
>+{
>+  nsRefPtr<nsGlobalWindow> window = static_cast<nsGlobalWindow*>(aWindow);
nsGlobalWindow* window =
Attachment #816435 - Flags: review?(bugs) → review+
This version addresses various review comments.  Most notably:

- I kept the FOR_EACH_SIZE macros.  It's passed review previously, it's an
  approach used in quite a few places in the Mozilla codebase (e.g. jsast.tbl,
  jsop.tbl, jsprototypes.h), and it allows concise references like |x| instead
  of |fields[X].size| that you get with the Phase-style approach.

- I was going to move nsTabSizes from nsArenaMemoryStats.h to
  nsWindowMemoryReporter.h, but then I can't define
  nsArenaMemoryStats::addToTabSizes().  So I left it in place.  Having the
  fine-grained sizes struct know about the coarse-grained sizes struct seems
  better than the other way around.  froydnj, sound ok?
  
- I kept the use of the FOR_EACH_SIZE macros, but I avoided exposing
  NS_DECL_SIZE et al to all and sundry by duplicating them in both
  nsArenaMemoryStats.h and nsWindowMemoryReporter.h.  I figure the duplication
  is ok because the macros are so simple, and the reduced exposure makes it
  worthwhile.

- I renamed updateTabSizes/UPDATE_TAB_SIZES as addToTabSizes/ADD_TO_TAB_SIZES.

- I augmented test_memoryReporters.xul to at least call sizeOfTab(), even if it
  can't actually check the results.

- Various other minor things.
Attachment #820110 - Flags: review?(nfroyd)
> > +static nsresult
> > +AddNonJSSizeOfWindowAndItsDescendents(nsGlobalWindow* aWindow, nsTabSizes* sizes)
> 
> Maybe you should assert that aWindow is an outer window?

I tried that but it's always false for the top window that gets measured, but then true for all the nested windows.
Comment on attachment 820110 [details] [diff] [review]
Address review comments.

Review of attachment 820110 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Nicholas Nethercote [:njn] from comment #21)
> - I was going to move nsTabSizes from nsArenaMemoryStats.h to
>   nsWindowMemoryReporter.h, but then I can't define
>   nsArenaMemoryStats::addToTabSizes().  So I left it in place.  Having the
>   fine-grained sizes struct know about the coarse-grained sizes struct seems
>   better than the other way around.  froydnj, sound ok?

I think of the coarse-grained sizes struct as a better place to make decisions about how to aggregate the members of the fine-grained sizes struct.  We already let nsWindowMemoryReporter make decisions about aggregating nsArenaMemoryStats for about:memory, so letting nsArenaMemoryStats make decisions about aggregating into nsTabSizes seems backwards in this case.  But I can see arguments both ways.

Anyway, the desire for code cleanliness with FOR_EACH_SIZE and whatnot is on your side, I think this can just go in as-is.

::: layout/base/nsArenaMemoryStats.h
@@ +10,1 @@
>  class nsTabSizes {

Don't forget the additional #includes for PodZero and MOZ_CRASH mentioned in the other review.
Attachment #820110 - Flags: review?(nfroyd) → review+
Here's a simple patch implementing a new GCLI command ("mem") that invokes the
new API.  (Actually it invokes it 100 times... I've been using this to measure
the speed of the measurement.)  It shows how to invoke the new API.
https://hg.mozilla.org/mozilla-central/rev/2d2444eecf83
https://hg.mozilla.org/mozilla-central/rev/25ca894f1716
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
(In reply to Nicholas Nethercote [:njn] from comment #25)
> Created attachment 820813 [details] [diff] [review]
> Implement a GCLI command that calls the new API.
> 
> Here's a simple patch implementing a new GCLI command ("mem") that invokes
> the
> new API.  (Actually it invokes it 100 times... I've been using this to
> measure
> the speed of the measurement.)  It shows how to invoke the new API.

Nick, can we get this added to a separate bug in Firefox :: Developer Tools: Command Line ?

We should land it.

Nice work!
Depends on: 932440
I filed bug 932609 for the GCLI command.  I have some questions there about the presentation of the results.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: