Last Comment Bug 698968 - Add mallocSizeOf functions and start using them
: Add mallocSizeOf functions and start using them
Status: RESOLVED FIXED
[MemShrink:P2]
:
Product: Core
Classification: Components
Component: General (show other bugs)
: unspecified
: x86_64 Linux
: -- normal (vote)
: mozilla11
Assigned To: Nicholas Nethercote [:njn]
:
Mentors:
Depends on: 705945
Blocks: DarkMatter 700914 704400 704695
  Show dependency treegraph
 
Reported: 2011-11-01 18:36 PDT by Nicholas Nethercote [:njn]
Modified: 2011-11-28 17:09 PST (History)
7 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (against m-i 80142:fe41fa77e51a) (72.77 KB, patch)
2011-11-10 18:44 PST, Nicholas Nethercote [:njn]
justin.lebar+bug: review+
bhackett1024: review+
Details | Diff | Splinter Review
patch, v2 (against m-i 80363:aa05d607c95e) (76.54 KB, patch)
2011-11-16 20:34 PST, Nicholas Nethercote [:njn]
no flags Details | Diff | Splinter Review
patch v3 (52.81 KB, patch)
2011-11-22 17:37 PST, Nicholas Nethercote [:njn]
n.nethercote: review+
Details | Diff | Splinter Review
patch v4 (73.84 KB, patch)
2011-11-22 18:12 PST, Nicholas Nethercote [:njn]
jfkthame: review+
bzbarsky: superreview+
Details | Diff | Splinter Review

Description Nicholas Nethercote [:njn] 2011-11-01 18:36:09 PDT
Several memory reporters use moz_malloc_usable_size to get the sizes of heap blocks (and ones that don't should).  Because moz_malloc_usable_size always returns 0 on some platforms, we need a fallback size computation.  So we see stuff like this a lot:

  size_t usable = moz_malloc_usable_size(p);
  size_t n = usable ? usable : sizeof(P);

bhackett suggested introducing a new function that takes a pointer and a computed size, like this:

  size_t n = moz_foo(p, sizeof(P));

On platforms where moz_malloc_usable_size returns 0, it would return its second argument.  On other platforms it would check that the usable size is not smaller than the computed size.  It would also provide a useful hook for determining the amount of live slop -- we wouldn't put slop counts into about:memory, but it would be easy to add a debugging printf statement to moz_foo() for ad hoc slop profiling.

I need to decide what the function would actually be called (moz_sizeof_block?) and where to put it (mozalloc.cpp?).
Comment 1 Nicholas Nethercote [:njn] 2011-11-10 18:44:42 PST
Created attachment 573719 [details] [diff] [review]
patch (against m-i 80142:fe41fa77e51a)

[Preliminary info for reviewers:

- bhackett: can you please review the JS engine/XPCJSRuntime changes.

- jfkthame: can you please review the gfx/layout changes.

- jlebar: can you please review everything else.

- bz:  does this need super-review?  If so, can you please do it?
]

This bug is an important step towards improving our memory reporter
infrastructure.  Some explanation is required.  Sorry it's so long, but it's
necessary to understand some of the code changes.

One key thing is that memory reporters are easy to get wrong.  We've had
numerous bugs in them, causing them to over- or under-count, often leading
to bogus (e.g. negative) numbers in about:memory.  We need to (a) make them
easier to get right, and (b) easier to catch when they're wrong.

For (a), we need to use moz_malloc_usable_size everywhere possible.  It's
much easier and more reliable than computing sizes of things.  It has the
added bonus of measuring slop bytes, which are a sizeable part of
"heap-unclassified".

For (b), we need DMD.  It catches when bytes of memory are double-reported.
It will also, once I've made some changes, detect if you report memory that
isn't on the heap or mmap'd.  And DMD is also obviously crucial for getting
"heap-unclassified" down.

But DMD needs integration with Firefox -- memory reporters need to tell DMD
what memory they are reporting.  That integration currently involves a
biggish patch (see bug 676724) which is a pain to maintain.  Moving forward,
I want that patch to be as small as possible, and that's a big motivation
for many of the changes in this patch.  (In fact, hopefully the integration
patch will just become part of Firefox one day, but that's not ready to
happen yet.)

---

Now, about the patch.  It adds the notion of moz_malloc_usable_size()-like 
function that has a computed size as a fallback.  The basic name I'm using
for this kind of function is |mucSize|, short for
"malloc-usable-or-computed-size".  (Such functions will be used a lot, so I
chose short name.)

The idea is that eventually, every SizeOf()-style function in the codebase
is going to use one of these mucSize() functions.  In fact, there'll be a
standard one that's used for memory reporters, called
mozilla::MemoryReporterMUCSize(), and this patch adds it.  It's crucial that
all memory reporters end up using this function, because it'll be a single
integration point with DMD.  This function can also do some sanity checking
of the computedSizes (I've already found some bugs thanks to this checking).
Finally, by adding some temporary debugging printfs to
mozilla::MemoryReporterMUCSize(), we can get information about slop, which
is really useful.

Having read that, you're probably thinking that lots of SizeOf() style
functions should call MemoryReporterMUCSize() directly.  It turns out that's
a bad idea, because it pigeon-holes those SizeOf() functions into only being
used by memory reporters, which is something worth avoiding.  So instead,
I'm passing a |mucSize| function to every SizeOf()-style function, keeping
them flexible.  One use I've already found for this flexibility is that you
can create a reporter-specific variant of MemoryReporterMUCSize() if you
want reporter-specific info.  (I used this trick to get the slop numbers for
the reporters such as "text-runs", see below.)  This is a useful enough
trick that I created a macro, NS_MEMORY_REPORT_MUCSIZE, for creating
variants, and MemoryReporterMUCSize() is implemented using it (with the
variant name "default").

Another benefit of passing the |mucSize| function everywhere:  if we didn't
do that, we'd have to define/declare MemoryReporterMUCSize() somewhere
really low-level, like nscore.h.  With the passing approach I only needed to
add the typedef |nsMUCSizeFun| to nscore.h.

Another design decision in this patch:  sometimes when you call
foo->SizeOf() you want to count |foo| itself, and sometimes you don't.  I
was using a bool parameter for some of these cases, but jlebar and I agreed
that having foo->SizeOfIncludingThis() and/or foo->SizeOfExcludingThis() is
better, because it makes it clear if you are counting, and makes it hard for
people to fail to consider which case they need.  (And that's good, because
calling moz_malloc_usable_size() on stack-allocated memory is a no-no.)  I
converted some existing cases to use this naming convention, but there are
more to be done which I'll do later.

The JS engine changes are fairly straightforward.  There's that
shrinkSlots() change bhackett gave me via IRC.

The gfx/layout changes are more involved.  In particular:
- I converted lots of PRUint64 variables to size_t;
- I used return values instead of outparams wherever possible;
- I renamed various size-computation functions to include "SizeOf" in their
  names;
- I avoided using aTotal==NULL to indicate when sizeof bits should be cleared.

Here's some data about the reductions in "heap-unclassified" due to using
moz_malloc_usable_size instead of a computed size:

- "textruns": on a big TBPL log, |computedSize| was 125MB, actual size was
  138MB;  that 13MB would be missed without this patch.

- "textrun-word-cache": on the same log, |computedSize| was 62MB, actual
  size was only 14KB more.  This is because most of the allocations are
  really big so the slop fraction is tiny.

- "history-links-hashtable": unchanged, it allocates a power-of-two and so
  never has slop.
Comment 2 Brian Hackett (:bhackett) 2011-11-11 08:20:41 PST
Comment on attachment 573719 [details] [diff] [review]
patch (against m-i 80142:fe41fa77e51a)

Review of attachment 573719 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/shell/js.cpp
@@ +3839,5 @@
>  
>  #ifdef JS_METHODJIT
>  
>  static size_t
> +zero_usable_size(void *p, size_t computedSize)

This function should be renamed.
Comment 3 Justin Lebar (not reading bugmail) 2011-11-15 22:48:12 PST
I'm sorry for sitting on this review, Nick.

One thing which strikes me after skimming the patch is the new function's name.  mucSize is a much less obvious name than malloc_usable_size.

I wonder if we could just call mucSize mallocUsableSize, or even mallocSize.  The only difference is the extra parameter, right?  malloc_usable_size isn't a standard function, so we could just declare that our malloc_usable_size takes two parameters, the pointer and a guess at the size.

I know it's a bikeshed, but we're committing to paint a lot of walls in this color...

I wish we didn't have to pass a function pointer to SizeOf -- it's just another complicated thing that people writing these functions have to understand -- but I totally agree with your reasons for having it.  Maybe I'll have some insight after I read the patch.
Comment 4 Nicholas Nethercote [:njn] 2011-11-16 01:18:32 PST
I went through various names in earlier versions of the patch and wasn't happy until I hit upon mucSize.  malloc_usable_size is semi-standard, so I don't like using MallocUsableSize for a function with a different signature.  And malloc_usable_size is called malloc_size on Mac, so that's no good either.  (It's called _msize on Windows, go figure.)

I think Huffman encoding (common names are short) works well for variable/function names, and we're going to have quite a lot of these.  And the fact that there's the nsMUCSize typedef means that someone unfamiliar with the meaning has something they can look up -- the acronym is explained in the typedef's comment.

As for the function pointer, it's the least worst solution I could think of, and I spent quite a bit of time thinking about it.  I've been using it in the JS engine for a while because moz_malloc_usable_size isn't available (because the JS engine can be built independently from Gecko) and it's been working well there.  And it really makes things work really nicely for DMD.
Comment 5 Justin Lebar (not reading bugmail) 2011-11-16 08:09:36 PST
We could do mallocSize, right?  That doesn't collide with mac's malloc_size, and it's just a few chars longer than mucSize.

I like having "malloc" in the name so you know you can only use it on pointers returned by malloc.  "mucSize" is pretty opaque in comparison.

Obviously it's a tradeoff between length and readability.

Again, sorry to make a bikeshed out of this.
Comment 6 Justin Lebar (not reading bugmail) 2011-11-16 15:02:42 PST
Nick and I talked on IRC about this.  I think mallocSize is better than mucSize, and he disagrees.  Basically, I prefer mallocSize because it gives the reader a decent hint as to what it does, and Nick prefers mucSize because mallocSize is misleading (Mac has malloc_size which doesn't have the or-computed fallback that mucSize has).

I'm not going to gate the review on this, although opinions one way or another might help us reach consensus.
Comment 7 Jonathan Kew (:jfkthame) 2011-11-16 15:43:46 PST
FWIW, my preference would be mallocSize; I find mucSize rather cryptic and non-memorable, even after it's been explained.
Comment 8 Jonathan Kew (:jfkthame) 2011-11-16 15:48:07 PST
Or even mallocedBlockSize, maybe? Longer, but I could live with that for the sake of clarity.
Comment 9 Justin Lebar (not reading bugmail) 2011-11-16 16:25:54 PST
Comment on attachment 573719 [details] [diff] [review]
patch (against m-i 80142:fe41fa77e51a)

>+    size_t sizeOfIncludingThis(JSMUCSizeFun mucSize) const {
>+        return mucSize((void*)this, sizeof(HashTable)) + sizeOfExcludingThis(mucSize);

Ugh, can we remove the need for this cast, here and elsewhere?  Is the problem
that JSMUCSizeFun takes a void* rather than a const void*?

>@@ -1092,17 +1090,18 @@ class HashMap
>      *     char c = r.front().value;
>      *
>      * Also see the definition of Range in HashTable above (with T = Entry).
>      */
>     typedef typename Impl::Range Range;
>     Range all() const                                 { return impl.all(); }
>     size_t count() const                              { return impl.count(); }
>     size_t capacity() const                           { return impl.capacity(); }
>-    size_t sizeOf(JSUsableSizeFun usf, bool cm) const { return impl.sizeOf(usf, cm); }
>+    size_t sizeOfExcludingThis(JSMUCSizeFun mucSize) const { return impl.sizeOfExcludingThis(mucSize); }
>+    size_t sizeOfIncludingThis(JSMUCSizeFun mucSize) const { return impl.sizeOfIncludingThis(mucSize); }

impl is a member variable.  Since it's the first member and HashMap doesn't have a vtable, then this == &impl, and the code should work.  Otherwise, you'll end up calling malloc_usable_size on &impl when &impl isn't heap-allocated, and we'll crash, right?

Since this is so tricky, I'd prefer we set a good example and do

  return mucSize(*this, sizeof(*this)) + impl.sizeOfExcludingThis(mucSize);

>@@ -1293,17 +1292,18 @@ class HashSet
>      *     int i = r.front();
>      *
>      * Also see the definition of Range in HashTable above.
>      */
>     typedef typename Impl::Range Range;
>     Range all() const                                 { return impl.all(); }
>     size_t count() const                              { return impl.count(); }
>     size_t capacity() const                           { return impl.capacity(); }
>-    size_t sizeOf(JSUsableSizeFun usf, bool cm) const { return impl.sizeOf(usf, cm); }
>+    size_t sizeOfExcludingThis(JSMUCSizeFun mucSize) const { return impl.sizeOfExcludingThis(mucSize); }
>+    size_t sizeOfIncludingThis(JSMUCSizeFun mucSize) const { return impl.sizeOfIncludingThis(mucSize); }

Same as for HashMap.

>diff --git a/js/public/Utility.h b/js/public/Utility.h
>--- a/js/public/Utility.h
>+++ b/js/public/Utility.h
>@@ -890,21 +890,19 @@ RoundUpPow2(size_t x)
>     return size_t(1) << JS_CEILING_LOG2W(x);
> }
> 
> } /* namespace js */
> 
> #endif /* defined(__cplusplus) */
> 
> /*
>- * This signature is for malloc_usable_size-like functions used to measure
>- * memory usage.  A return value of zero indicates that the size is unknown,
>- * and so a fall-back computation should be done for the size.
>+ * This is SpiderMonkey's equivalent to |nsMUCSizeFun|.
>  */
>-typedef size_t(*JSUsableSizeFun)(void *p);
>+typedef size_t(*JSMUCSizeFun)(void *p, size_t computedSize);

You could define this in mfbt, if you wanted.  (mfbt is a set of headers shared between Gecko and JS.)

>diff --git a/js/xpconnect/src/XPCJSRuntime.cpp b/js/xpconnect/src/XPCJSRuntime.cpp
>--- a/js/xpconnect/src/XPCJSRuntime.cpp
>+++ b/js/xpconnect/src/XPCJSRuntime.cpp

> PRInt64
> GetCompartmentTjitDataAllocatorsMainSize(JSCompartment *c)
> {
>     return c->hasTraceMonitor()
>-         ? c->traceMonitor()->getVMAllocatorsMainSize(moz_malloc_usable_size)
>+         ? c->traceMonitor()->getVMAllocatorsMainSize(MemoryReporterMUCSize)
>          : 0;
> }
> 
> PRInt64
> GetCompartmentTjitDataAllocatorsReserveSize(JSCompartment *c)
> {
>     return c->hasTraceMonitor()
>-         ? c->traceMonitor()->getVMAllocatorsReserveSize(moz_malloc_usable_size)
>+         ? c->traceMonitor()->getVMAllocatorsReserveSize(MemoryReporterMUCSize)
>          : 0;
> }
> 
> PRInt64
> GetCompartmentTjitDataTraceMonitorSize(JSCompartment *c)
> {
>     return c->hasTraceMonitor()
>-         ? c->traceMonitor()->getTraceMonitorSize(moz_malloc_usable_size)
>+         ? c->traceMonitor()->getTraceMonitorSize(MemoryReporterMUCSize)
>          : 0;
> }

Maybe add a comment explaining that you should only call these if you're
actually a memory reporter?  Otherwise we'll have DMD problems, right?

Or you could just take the mucSizeFun as a parameter.  Maybe that's clearer.

> PRInt64 GetHistoryObserversSize()
> {
>   History* history = History::GetService();
>   if (!history)
>     return 0;
>-  return sizeof(*history) + history->SizeOf();
>+  return history->SizeOfIncludingThis(MemoryReporterMUCSize);
> }

Comment that this func is only for use with a memory reporter, please, or take
mucSizeFun as a parameter.

>+ * - They provide more information to DMD which is a tool that helps keep
>+ *   about:memory's "heap-unclassified" number low (see bug 676724 for
>+ *   details).

Comma before "which", please.

>+namespace mozilla {
>+
>+/*
>+ * This function should be used by all traversal-based memory reporters.
>+ * - On platforms where moz_malloc_usable_size() returns 0 it just returns
>+ *   |computedSize|.

How about "On platforms where malloc_usable_size() or an equivalent isn't
available".

>+/*
>+ * These functions are like moz_malloc_usable_size(), and should be used by all
>+ * counter-based memory reporters. They don't have a |computedSize|
>+ * fallback.
>+ */
>+size_t MemoryReporterMUSizeForCounterInc(void *ptr);
>+size_t MemoryReporterMUSizeForCounterDec(void *ptr);

Why don't these have a |computedSize| fallback?  I see that in hunspell we
don't have the computed size available, but shouldn't the fallback still be
here, perhaps with a param that defaults to 0?

>+/*
>+ * For the purposes of debugging, temporary profiling, and DMD integration, it
>+ * is sometimes useful to temporarily create multiple variants of
>+ * MemoryReporterMUCSize(), with each one distinguished by a string id.  This
>+ * macro makes creating such variants easy.
>+ */
>+#define NS_MEMORY_REPORTER_MUCSIZE_FUN(fn, name)                              \
>+  size_t fn(void *ptr, size_t computedSize)                                   \
>+  {                                                                           \
>+      size_t usable = moz_malloc_usable_size(ptr);                            \
>+      if (!usable) {                                                          \
>+          return computedSize;                                                \
>+      }                                                                       \
>+      /* The factor of two is because no reasonable heap allocator will    */ \
>+      /* return a block more than twice the requested size.  The one       */ \
>+      /* exception is for tiny blocks, where any request less than 8 bytes */ \
>+      /* will likely be rounded up to 8 bytes.  Also, if computedSize is 0 */ \
>+      /* that means "I don't know or care" and so we don't check.          */ \

Need a period or a semicolon after 0.

>+      NS_ASSERTION(usable >= computedSize,                                    \
>+                   "MemoryReporterMUCSize: computedSize is too big");         \
>+      NS_ASSERTION(usable < computedSize * 2 || usable <= 8 ||                \
>+                   computedSize == 0,                                         \
>+                   "MemoryReporterMUCSize: computedSize is too small");       \
>+      return usable;                                                          \
>+  }

8 bytes here and in the comment above should be sizeof(void*).  At least, that
matches what we do in jemalloc.

|name| isn't used anywhere in this macro; is that intentional?

>diff --git a/xpcom/base/nscore.h b/xpcom/base/nscore.h
>--- a/xpcom/base/nscore.h
>+++ b/xpcom/base/nscore.h
>@@ -52,16 +52,24 @@
> #  include "mozilla/mozalloc_macro_wrappers.h"
> #endif
> 
> /**
>  * Incorporate the core NSPR data types which XPCOM uses.
>  */
> #include "prtypes.h"
> 
>+/**
>+ * "mucsize" is short for "malloc--usable-or-computed_size".  This is for
>+ * functions that are like malloc_usable_size but also take a computed size as
>+ * a fallback.  Such functions are used for measuring the size of data
>+ * structures.
>+ */
>+typedef size_t(*nsMUCSizeFun)(void *p, size_t computedSize);

Probably should be const void* p, but really, whatever it has to be so you
don't have to cast |this| to void*...

>diff --git a/xpcom/glue/nsTHashtable.h b/xpcom/glue/nsTHashtable.h
>--- a/xpcom/glue/nsTHashtable.h
>+++ b/xpcom/glue/nsTHashtable.h
>@@ -246,20 +246,20 @@ public:
>    */
>   void Clear()
>   {
>     NS_ASSERTION(mTable.entrySize, "nsTHashtable was not initialized properly.");
> 
>     PL_DHashTableEnumerate(&mTable, PL_DHashStubEnumRemove, nsnull);
>   }
> 
>-  PRUint64 SizeOf()
>+  size_t SizeOfEntryStorage(nsMUCSizeFun mucSize)
>   {
>     if (IsInitialized()) {
>-      return PL_DHashTableSizeOf(&mTable);
>+      return PL_DHashTableSizeOfEntryStorage(&mTable, mucSize);
>     }
>     return 0;
>   }

It's not clear to me from the name that this is equivalent to a shallow version of SizeOfExcludingThis.  Do you like |ShallowSizeOfExcludingThis| better?  I'd be fine with |SizeOfIncludingThis| too -- it should be obvious that the hashtable can't follow pointers of the things it holds.

>diff --git a/xpcom/glue/pldhash.cpp b/xpcom/glue/pldhash.cpp
>--- a/xpcom/glue/pldhash.cpp
>+++ b/xpcom/glue/pldhash.cpp
>-PRUint64
>-PL_DHashTableSizeOf(PLDHashTable *table)
>+size_t
>+PL_DHashTableSizeOfEntryStorage(PLDHashTable *table, nsMUCSizeFun mucSize)
> {
>-  PRUint64 size = 0;
>-
>-#ifdef MOZALLOC_HAVE_MALLOC_USABLE_SIZE
>-  // Even when moz_malloc_usable_size is defined, it might always return 0, if
>-  // the allocator in use doesn't support malloc_usable_size.
>-  size = moz_malloc_usable_size(table->entryStore);
>-#endif
>-
>-  if (size == 0) {
>-    size = PL_DHASH_TABLE_SIZE(table) * table->entrySize;
>-  }
>-
>-  return size;
>+  return mucSize(table->entryStore,
>+                 PL_DHASH_TABLE_SIZE(table) * table->entrySize);
> }

Same naming issue as with nsHashtable.h.
Comment 10 Boris Zbarsky [:bz] 2011-11-16 16:30:06 PST
actualSize ?
Comment 11 Nicholas Nethercote [:njn] 2011-11-16 20:10:24 PST
> >+    size_t sizeOfIncludingThis(JSMUCSizeFun mucSize) const {
> >+        return mucSize((void*)this, sizeof(HashTable)) + sizeOfExcludingThis(mucSize);
> 
> Ugh, can we remove the need for this cast, here and elsewhere?  Is the
> problem
> that JSMUCSizeFun takes a void* rather than a const void*?

That's right.  But changing JSMUCSizeFun/nsMUCSizeFun to take a |const void*| works and avoids the need for the casts, so I'll do that.  (I have to do a (void*) cast inside MemoryReporterMUCSize when passing the pointer to moz_malloc_usable_size, but it's fewer casts in total.)


> > PRInt64
> > GetCompartmentTjitDataTraceMonitorSize(JSCompartment *c)
> > {
> >     return c->hasTraceMonitor()
> >-         ? c->traceMonitor()->getTraceMonitorSize(moz_malloc_usable_size)
> >+         ? c->traceMonitor()->getTraceMonitorSize(MemoryReporterMUCSize)
> >          : 0;
> > }
> 
> Maybe add a comment explaining that you should only call these if you're
> actually a memory reporter?  Otherwise we'll have DMD problems, right?
> 
> Or you could just take the mucSizeFun as a parameter.  Maybe that's clearer.

Taking a parameter is better than a comment, but this code is about to be removed along with the rest of the tracer (bug 698201) so I won't bother.


> > PRInt64 GetHistoryObserversSize()
> > {
> >   History* history = History::GetService();
> >   if (!history)
> >     return 0;
> >-  return sizeof(*history) + history->SizeOf();
> >+  return history->SizeOfIncludingThis(MemoryReporterMUCSize);
> > }
> 
> Comment that this func is only for use with a memory reporter, please, or
> take
> mucSizeFun as a parameter.

GetHistoryObserverSize() is the function that implements the |amount| field of the "explicit/history-links-hashtable" memory reporter so it can't take a mucSizeFun parameter.  (In other words, this is the highest point in the call chain that MemoryReporterMUCSize can be injected as a parameter.)  Do you still want the comment?


> >+      NS_ASSERTION(usable >= computedSize,                                    \
> >+                   "MemoryReporterMUCSize: computedSize is too big");         \
> >+      NS_ASSERTION(usable < computedSize * 2 || usable <= 8 ||                \
> >+                   computedSize == 0,                                         \
> >+                   "MemoryReporterMUCSize: computedSize is too small");       \
> >+      return usable;                                                          \
> >+  }
> 
> 8 bytes here and in the comment above should be sizeof(void*).  At least,
> that matches what we do in jemalloc.

It's not guaranteed that we're using jemalloc.  (E.g. if you're on a --disable-jemalloc build on Windows or Mac, moz_malloc_usable_size() still works by falling back to _msize or malloc_size.)  And some allocators have 8 as their minimum size;  in fact I think various standards actually mandate this and jemalloc violates it.


> >+typedef size_t(*JSMUCSizeFun)(void *p, size_t computedSize);
>
> You could define this in mfbt, if you wanted.  (mfbt is a set of headers shared between > Gecko and JS.)

I tried that but it looks like lots of places don't #include mfbt headers (e.g. I got my first compile error in xpcom/glue/pldhash.h).  The current code works so I'll stick with it.


> >-  PRUint64 SizeOf()
> >+  size_t SizeOfEntryStorage(nsMUCSizeFun mucSize)
> >   {
> >     if (IsInitialized()) {
> >-      return PL_DHashTableSizeOf(&mTable);
> >+      return PL_DHashTableSizeOfEntryStorage(&mTable, mucSize);
> >     }
> >     return 0;
> >   }
> 
> It's not clear to me from the name that this is equivalent to a shallow
> version of SizeOfExcludingThis.  Do you like |ShallowSizeOfExcludingThis|
> better? I'd be fine with |SizeOfIncludingThis| too -- it should be obvious
> that the hashtable can't follow pointers of the things it holds.

I like the idea of "Shallow" -- all SizeOf-style functions should be implicitly "deep" (i.e. chase pointers) and so exceptions should be marked.

Beyond that, I want to have a way to get the size of the pointee of one member in an object.  In the JS engine we have Shape::sizeOfPropertyTableIncludingThis() and Shape::sizeOfKidsIncludingThis() and there is a circumstance where we want to call one but not the other.  (The |IncludingThis| suffix is necessary in those cases because the members are pointers to objects, not just pointers to arrays.)  So, we need a per-member convention and |SizeOfFoo|, where |Foo| is the member name, seems reasonable.

With those two conventions in mind, I think |ShallowSizeOfEntryStorage| is clear -- we're measuring just the entryStorage field, and we're doing it in a shallow manner.  But |ShallowSizeOfExcludingThis| is less clear -- what's being measured in a shallow manner?  The object itself?  What if there were other fields as well?  So I've gone with |ShallowSizeOfEntryStorage|.


I've addressed all the other issues.  I've left the mucSize name alone for now, waiting on further opinions and final agreement.  I'll attach the patch shortly.
Comment 12 Nicholas Nethercote [:njn] 2011-11-16 20:34:20 PST
Created attachment 575083 [details] [diff] [review]
patch, v2 (against m-i 80363:aa05d607c95e)
Comment 13 Nicholas Nethercote [:njn] 2011-11-16 20:35:48 PST
I'm getting this link error with patch v2, no idea what it's about:

/usr/bin/ld.gold.real: error: /home/njn/moz/mi4/d64/toolkit/library/../../extensions/spellcheck/hunspell/src/mozHunspell.o: requires dynamic R_X86_64_PC32 reloc against 'mozilla::MemoryReporterMUCSizeForCounterInc(void*, unsigned long)' which may overflow at runtime; recompile with -fPIC
/usr/bin/ld.gold.real: error: /home/njn/moz/mi4/d64/toolkit/library/../../extensions/spellcheck/hunspell/src/mozHunspell.o: requires dynamic R_X86_64_PC32 reloc against 'mozilla::MemoryReporterMUCSizeForCounterDec(void*, unsigned long)' which may overflow at runtime; recompile with -fPIC
/home/njn/moz/mi4/extensions/spellcheck/hunspell/src/mozHunspell.cpp:103: error: undefined reference to 'mozilla::MemoryReporterMUCSizeForCounterInc(void*, unsigned long)'
/home/njn/moz/mi4/extensions/spellcheck/hunspell/src/mozHunspell.cpp:107: error: undefined reference to 'mozilla::MemoryReporterMUCSizeForCounterDec(void*, unsigned long)'
collect2: ld returned 1 exit status
Comment 14 Justin Lebar (not reading bugmail) 2011-11-16 20:42:37 PST
> some allocators have 8 as their minimum size;  in fact I think various standards actually mandate 
> this and jemalloc violates it.

You're right.  Here's what C99 says:

> The pointer returned if the allocation
> succeeds is suitably aligned so that it may be assigned to a pointer to any type of object
> and then used to access such an object or an array of such objects in the space allocated
> (until the space is explicitly deallocated)

Our assumption is that if we have to align all pointers to X bytes, that X is the minimum malloc usable size.

But void* is not the type with the largest alignment on all platforms.  In particular, alignof(double) > alignof(void*) on some 32-bit platforms.  Our jemalloc, in violation of the spec, has sizeof(void*) as the minimum alignment.
Comment 15 Justin Lebar (not reading bugmail) 2011-11-16 20:43:49 PST
> I'm getting this link error with patch v2, no idea what it's about:

I've gotten that recently.  I think it means the linker can't find that function (see the undefined reference errors at the end).  Maybe you added const in the implementation but not the header?
Comment 16 Justin Lebar (not reading bugmail) 2011-11-16 20:50:15 PST
> GetHistoryObserverSize() is the function that implements the |amount| field of the "explicit/history-
> links-hashtable" memory reporter so it can't take a mucSizeFun parameter.  (In other words, this is 
> the highest point in the call chain that MemoryReporterMUCSize can be injected as a parameter.)  Do  
> you still want the comment?

Ah, right.  Up to you; my concern is just that people will cargo-cult and break DMD by using MemoryReporterMUCSize when they shouldn't.  But you're the one who would feel most of the pain of that.
Comment 17 Justin Lebar (not reading bugmail) 2011-11-16 21:05:40 PST
> |name| isn't used anywhere in this macro; is that intentional?

This parameter is still in your patch v2.  Is it supposed to be there?

> I think |ShallowSizeOfEntryStorage| is clear -- we're measuring just the entryStorage field, and 
> we're doing it in a shallow manner.  But |ShallowSizeOfExcludingThis| is less clear -- what's being 
> measured in a shallow manner?  The object itself?  What if there were other fields as well?

I prefer SizeOfEntryStorage to ShallowSizeOfEntryStorage myself.  But I feel like asking for the size of the hashtable's "entry storage" exposes an implementation detail unnecessarily.  I don't really care how the hashtable stores things; I just want the size of the hashtable, with the understanding that it's not going to peer into the pointers contained in the table.

I'm also concerned about the explosion of SizeOf* names, so I was hoping that if ShallowSizeOfExcludingThis was a close enough approximation, it would be better than adding a whole new class of name.

But we have bigger names to bikeshed about, so whatever you decide is fine with me.
Comment 18 Nicholas Nethercote [:njn] 2011-11-16 21:08:59 PST
> Maybe you added const in the implementation but not the header?

That's it, thanks for the tip!
Comment 19 Nicholas Nethercote [:njn] 2011-11-16 21:11:35 PST
(In reply to Justin Lebar [:jlebar] from comment #17)
> > |name| isn't used anywhere in this macro; is that intentional?
> 
> This parameter is still in your patch v2.  Is it supposed to be there?

Yes, I augmented the comment in an attempt to explain why it's there.

> I prefer SizeOfEntryStorage to ShallowSizeOfEntryStorage myself.  But I feel
> like asking for the size of the hashtable's "entry storage" exposes an
> implementation detail unnecessarily.  I don't really care how the hashtable
> stores things; I just want the size of the hashtable, with the understanding
> that it's not going to peer into the pointers contained in the table.

Maybe the right thing to do is to make a deep SizeOf for hashtables -- you'd pass in extra params (a function and data) that would be called for each entry.
Comment 20 Nicholas Nethercote [:njn] 2011-11-16 21:31:58 PST
> Maybe the right thing to do is to make a deep SizeOf for hashtables -- you'd
> pass in extra params (a function and data) that would be called for each
> entry.

Then you end up with the odd situation that part of the measurement (the size of the entryStorage array) is returned via the function's return value, and the other part (the size of the things pointed to by the entries) is returned via the void* data argument.  And the caller would have to add them together.

So I haven't done that for the moment, but I have switched to |ShallowSizeOfExcludingThis| in my local copy of the patch.
Comment 21 Justin Lebar (not reading bugmail) 2011-11-16 21:57:56 PST
> Yes, I augmented the comment in an attempt to explain why it's there.

Thanks.

> Then you end up with the odd situation that part of the measurement (the size of the entryStorage 
> array) is returned via the function's return value, and the other part (the size of the things 
> pointed to by the entries) is returned via the void* data argument.

I like the deep sizeof!  You could work around this issue with some gymnastics.  If you'll forgive my bizarre syntax:

typedef fn_type as (entry_type -> PRUint64)
typedef itr_fn_data as {fn_type fn, PRUint64 sum}

def DeepSizeOf(fn_type user_fn):
  def itr_fn(entry_type entry, void* data):
    d = (itr_fn_data*) data;
    d.sum += d.fn(entry);
    return PL_DHASH_NEXT;
  
  itr_fn_data d = {user_fn, 0};
  PL_DHASH_ITERATE(mTable, d);
  return d.sum + size of entry array + [sizeof(*this)]
Comment 22 Nicholas Nethercote [:njn] 2011-11-16 22:01:43 PST
What would be even better would be proper iterators for nsTHashTable/pldhash.  Like the ones that js::HashTable has :)
Comment 23 Justin Lebar (not reading bugmail) 2011-11-16 22:27:57 PST
> PL_DHASH_ITERATE(mTable, d);

er, this should be

> PL_DHASH_ITERATE(mTable, itr_fn, d);

but you get it.

Anyway, nobody's stopping you from fixing pldhash, or at least adding iterators to nsTHashTable.  :)  Gecko's collections are really rusty in general, compared to those in JS.
Comment 24 Nicholas Nethercote [:njn] 2011-11-20 11:29:39 PST
jfkthame:  can you give an ETA on review?  This patch is the foundation for a ton of memory reporter work I'm planning to do.
Comment 25 Nicholas Nethercote [:njn] 2011-11-21 19:15:48 PST
jlebar and I have agreed to change |mucSize| to |mallocSizeOf|.  I won't bother updating the patch under review, though.
Comment 26 Nicholas Nethercote [:njn] 2011-11-22 17:37:34 PST
Created attachment 576363 [details] [diff] [review]
patch v3

I split the textrun changes off into bug 704695.  Now only super-review is needed for the non-textrun parts.  I've carried over jlebar and bhackett's r+ review results.
Comment 27 Nicholas Nethercote [:njn] 2011-11-22 18:12:17 PST
Created attachment 576377 [details] [diff] [review]
patch v4

Ugh, the nsHashTable changes can't be easily separated from the textrun changes.  Going back to the bigger patch.  It uses m|allocSizeOf| as the name now, instead of |mucSize|.
Comment 28 Boris Zbarsky [:bz] 2011-11-23 10:32:26 PST
Comment on attachment 576377 [details] [diff] [review]
patch v4

sr=me.  I like the explicit bits about whether |this| is included!
Comment 29 Nicholas Nethercote [:njn] 2011-11-23 14:14:51 PST
> I like the explicit bits about whether |this| is included!

Me too!  I originally had |bool countMe| flags but jlebar suggested the SizeOfIncludingThis/SizeOfExcludingThis convention which is better.
Comment 30 Jonathan Kew (:jfkthame) 2011-11-25 04:44:23 PST
Comment on attachment 576377 [details] [diff] [review]
patch v4

Review of attachment 576377 [details] [diff] [review]:
-----------------------------------------------------------------

(This only relates to the textrun stuff: I looked at the changes in gfxFont.*, gfxTextRunWordCache.*, nsLayoutUtils.*, nsPresShell.* and nsTextRunTransformations.*. AIUI the other files have been reviewed separately, right?)

Looks good, modulo a couple of nits, see below. Please consider using a more explicit name for ClearSizeOf(), such as ClearSizeOfAccountedFlag() (or maybe Reset.....?) to make it clearer what this method is for.

r=me with comments addressed. I don't want to bikeshed interminably on that name, but at least give it some consideration before landing.

::: gfx/thebes/gfxFont.cpp
@@ +1183,5 @@
>  
>      // synthetic-bold strikes are each offset one device pixel in run direction
>      // (these values are only needed if IsSyntheticBold() is true)
> +    double synBoldOnePixelOffset = 0;
> +    PRInt32 strikes = 0;

Why this change? It doesn't seem related to the issue here at all, and shouldn't be necessary.

@@ +4499,5 @@
>  
>      return total;
>  }
> +size_t
> +gfxTextRun::SizeOfIncludingThis(nsMallocSizeOfFun aMallocSizeOf)

From the nits department: please leave a blank line between functions.

::: gfx/thebes/gfxFont.h
@@ +2065,2 @@
>          mFlags &= ~gfxTextRunFactory::TEXT_RUN_SIZE_ACCOUNTED;
>      }

I don't much like the new name ClearSizeOf() here (and in gfxTextRunWordCache)... it seems less meaningful than ClearSizeAccounted was. If anything, ClearSizeOf sounds like it would cause MaybeSizeOf to return zero, whereas in fact it causes it to *not* return zero the next time it's called.

So, could you consider reverting this name, or trying to come up with something clearer?

(Do we have a similar pattern anywhere else, btw?)

::: gfx/thebes/gfxTextRunWordCache.h
@@ +114,5 @@
> +
> +    /**
> +     * This clears the TEXT_RUN_MEMORY_ACCOUNTED flag on each textRun found.
> +     */
> +    static void ClearSizeOf();

As above, I don't think this name is very clear; would prefer something like ClearSizeOfAccounted.

::: layout/base/nsLayoutUtils.h
@@ +1440,4 @@
>     * total, and sets the TEXT_RUN_MEMORY_ACCOUNTED flag to avoid double-
>     * accounting. (Runs with this flag already set will be skipped.)
>     * Expected usage pattern is therefore to call twice:
> +   *    (void)SizeOfTextRunsForFrames(rootFrame, mallocSizeOf, true);

With clear==true, the mallocSizeOf function will never be used, will it? So (although it doesn't really matter), it would seem logical to pass nsnull here.

(Which would mean the bool parameter is redundant, as you could take a null mallocSizeOf function pointer as indicating "clear the flags, rather than actually doing the accounting". But perhaps that's unnecessarily cryptic - there's no harm in having the explicit parameter, I suppose.)
Comment 31 Nicholas Nethercote [:njn] 2011-11-27 19:05:58 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/d807cb7b91e5


> (This only relates to the textrun stuff: I looked at the changes in
> gfxFont.*, gfxTextRunWordCache.*, nsLayoutUtils.*, nsPresShell.* and
> nsTextRunTransformations.*. AIUI the other files have been reviewed
> separately, right?)

Correct.


> > +    double synBoldOnePixelOffset = 0;
> > +    PRInt32 strikes = 0;
> 
> Why this change? It doesn't seem related to the issue here at all, and
> shouldn't be necessary.

It was to avoid some bogus uninitialized variable warnings in GCC.


> I don't much like the new name ClearSizeOf() here (and in
> gfxTextRunWordCache)... it seems less meaningful than ClearSizeAccounted
> was. If anything, ClearSizeOf sounds like it would cause MaybeSizeOf to
> return zero, whereas in fact it causes it to *not* return zero the next time
> it's called.

I changed it to ResetSizeOfAccountingFlags throughout.


> With clear==true, the mallocSizeOf function will never be used, will it? So
> (although it doesn't really matter), it would seem logical to pass nsnull
> here.
> 
> (Which would mean the bool parameter is redundant, as you could take a null
> mallocSizeOf function pointer as indicating "clear the flags, rather than
> actually doing the accounting". But perhaps that's unnecessarily cryptic -
> there's no harm in having the explicit parameter, I suppose.)

I kept the explicit parameter, but changed it to nsnull where appropriate.

Thanks for the review!
Comment 32 Marco Bonardo [::mak] (Away 6-20 Aug) 2011-11-28 05:30:16 PST
https://hg.mozilla.org/mozilla-central/rev/d807cb7b91e5

Note You need to log in before you can comment on or make changes to this bug.