Last Comment Bug 898558 - Move heap-dirty and heap-committed-unused into explicit, add more heap reporters, and change their names
: Move heap-dirty and heap-committed-unused into explicit, add more heap report...
Status: RESOLVED FIXED
[MemShrink]
:
Product: Toolkit
Classification: Components
Component: about:memory (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla25
Assigned To: Justin Lebar (not reading bugmail)
:
Mentors:
Depends on: 912462 899880
Blocks:
  Show dependency treegraph
 
Reported: 2013-07-26 11:26 PDT by Justin Lebar (not reading bugmail)
Modified: 2013-09-04 05:43 PDT (History)
8 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch, v1 (3.61 KB, patch)
2013-07-26 14:01 PDT, Justin Lebar (not reading bugmail)
n.nethercote: feedback+
Details | Diff | Review
Part 0, v1: Rework jemalloc_stats so it exposes how much memory is used for bookkeeping. (3.83 KB, patch)
2013-07-26 18:22 PDT, Justin Lebar (not reading bugmail)
no flags Details | Diff | Review
Part 1, v2: Rework the jemalloc heap memory reporters. (13.39 KB, patch)
2013-07-26 18:24 PDT, Justin Lebar (not reading bugmail)
n.nethercote: review+
Details | Diff | Review
Part 0, v2: Rework jemalloc_stats so it exposes how much memory is used for bookkeeping. (4.06 KB, patch)
2013-07-27 00:23 PDT, Justin Lebar (not reading bugmail)
mh+mozilla: review+
Details | Diff | Review

Description Justin Lebar (not reading bugmail) 2013-07-26 11:26:27 PDT
I'd like us to copy heap-dirty and heap-committed-unused into explicit.

The reason is that, when diff'ing memory reports, it's very difficult to look
at anything other than the explicit tree.  Only that tree is guaranteed not to
double-count.

I grant that heap-dirty and heap-committed-unused don't count memory
'explicitly requested by the application'.  But I think putting them under
explicit will confuse exactly zero people, and changing the name 'explicit'
will break too many consumers.
Comment 1 Justin Lebar (not reading bugmail) 2013-07-26 14:01:12 PDT
Created attachment 781922 [details] [diff] [review]
Patch, v1
Comment 2 Nicholas Nethercote [:njn] 2013-07-26 14:46:46 PDT
> The reason is that, when diff'ing memory reports, it's very difficult
> to look at anything other than the explicit tree.

Why is it difficult?

In my current session I have:

559.91 MB (100.0%) -- explicit
  3.87 MB ── heap-dirty
 87.57 MB ── heap-committed-unused

So this would be a major change to the meaning of explicit, which would (among other things) throw off our AWSY numbers.

Can you explain what this change would let us do that we currently cannot?
Comment 3 Justin Lebar (not reading bugmail) 2013-07-26 15:24:48 PDT
> Can you explain what this change would let us do that we currently cannot?

Because human beings are turing-complete, it is impossible to rearrange memory reporters such that there exists a value we can compute using the rearranged memory reporters that we could not have computed before the rearrangement.

I obviously cannot meet this burden, but I also think it's obviously the wrong burden.

> Why is it difficult?

This is the right question.

Suppose you want to compare two memory reports.  The difference in their RSS is R, and the difference in their explicit is E.

If |R - E| ~= 0 -- that is, if most of the difference in memory usage between the two processes is explained by the difference in explicit -- then this change isn't important.

But consider the opposite situation, when |R - E| >> 0.  Now we need to understand why the RSS increased even though explicit stayed the same.

This could happen for a variety of reasons.  We could have non-heap memory that isn't covered by a memory reporter (e.g. graphics memory).  We could have any number of bugs in our memory reporter infrastructure or in about:memory.  In general understanding these situations is difficult, so we want to avoid them as much as possible.

That is, the goal is for |R - E| to be as small as possible, because this makes understanding where memory is going as easy as possible, and that's the goal of about:memory.  API cleanliness is subsidiary to this goal; a clean memory-reporting API that doesn't give reports that help us easily understand where memory is going isn't useful, because the API is not an end in itself.

Putting |heap-committed-unused| and |heap-dirty| in explicit furthers the goal of decreasing |R - E|.  This has real-world consequences.  Today khuey and I were trying to understand why a B2G process's explicit stayed the same but its RSS increased.  We wasted about twenty minutes on it before I remembered the business with the heap numbers.  Almost anyone else would have simply misunderstood the diff.  By far the easiest way I can think of to prevent these mistakes is to make this change.

> So this would be a major change to the meaning of explicit

I agree it's a change.  Like I said in comment 0, I think the question is, will anyone be confused by this?  I think the answer is clearly no.

> which would (among other things) throw off our AWSY numbers.

a) We don't care about the AWSY explicit number.

   Note that we don't care about explicit precisely because |R - E| can
   sometimes be large.  So it's inconsistent to at once say that we shouldn't
   make this change because it will change AWSY's explicit numbers and they are
   important to keep consistent, and on the other hand to say that we don't
   care about making |R - E| smaller.

b) It would be a one-time bump to explicit.  I seriously doubt this would
   preclude our ability to find bugs, particularly because:

c) As you say, making this change does not preclude us from subtracting these
   values out if we want to.  Just as this change does not let us do anything
   that we currently cannot, it also does not prevent us from continuing to do
   anything that we currently can.

I think we agree that the Right Thing to do would be to change the hirearchy so that something else is at the top, and "explicit" lives under that.  I'd be totally fine with that change, but that's a week of work that I don't have to dedicate to this right now; I'm trying to fix a problem we actually face.
Comment 4 Nicholas Nethercote [:njn] 2013-07-26 16:02:35 PDT
Ok, I'm warming to the idea.  One sticking point is that I keep forgetting what heap-committed-unused and heap-dirty mean, even after reading the about:memory tooltips multiple times.  Let me try to summarize:

- heap-committed-unused:  jemalloc fragmentation and internal book-keeping, basically.

- heap-dirty: freed blocks that jemalloc is holding onto for quick recycling.

I wonder if there are better names for these.

Are you 100% certain they don't overlap?  I could imagine heap-dirty being a subset of heap-committed-unused.

The tool-tip for explicit will also need updating if this change goes ahead.
Comment 5 Justin Lebar (not reading bugmail) 2013-07-26 16:17:02 PDT
> - heap-committed-unused:  jemalloc fragmentation and internal book-keeping, basically.
>
> - heap-dirty: freed blocks that jemalloc is holding onto for quick recycling.

That's what I understand.

> I wonder if there are better names for these.

I think we could split out jemalloc bookkeeping and fragmentation pretty easily; in jemalloc_stats, it just adds "base" (bookkeeping) to committed.  Then we could have

  heap-fragmentation (yay)
  heap-bookkeeping
  heap-page-cache (?)

> Are you 100% certain they don't overlap?

It looks like you're right, they do overlap.  Or at least, I see situations where we decrement ndirty and committed at the same time, and if they didn't overlap, we wouldn't do that.

Good call.

(This speaks to my point about how hard it is to do these computations to figure out why RSS increased.)

> The tool-tip for explicit will also need updating if this change goes ahead.

How?
Comment 6 Nicholas Nethercote [:njn] 2013-07-26 16:27:30 PDT
> It looks like you're right, they do overlap.  Or at least, I see situations
> where we decrement ndirty and committed at the same time, and if they didn't
> overlap, we wouldn't do that.

So whatever we end up with, good names and non-overlapping is paramount.

> > The tool-tip for explicit will also need updating if this change goes ahead.
> 
> How?

Currently we have:

"This tree covers explicit memory allocations by the application, both at the \
operating system level (via calls to functions such as VirtualAlloc, \
vm_allocate, and mmap), and at the heap allocation level (via functions such \
as malloc, calloc, realloc, memalign, operator new, and operator new[]) that \
have not been explicitly decommitted (i.e. evicted from memory and swap). \
\n\n\
It excludes memory that is mapped implicitly such as code and data segments, \
and thread stacks.  It also excludes heap memory that has been freed by the \
application but is still being held onto by the heap allocator. \
\n\n\
It is not guaranteed to cover every explicit allocation, but it does cover \
most (including the entire heap), and therefore it is the single best number \
to focus on when trying to reduce memory usage.",

The first paragraph could explain that it includes malloc-heap book-keeping (and whatever else we end up including).

The "It also excludes heap memory..." sentence in the second paragraph might be wrong, depending on what happens with heap-dirty.

We'll also need to think about the non-jemalloc case.  It might be easy.
Comment 7 Justin Lebar (not reading bugmail) 2013-07-26 16:31:50 PDT
> The tool-tip for explicit will also need updating if this change goes ahead.

Oh, right; I missed the "for explicit" bit.  I forgot that explicit had a tooltip!
Comment 8 Nicholas Nethercote [:njn] 2013-07-26 16:58:30 PDT
Comment on attachment 781922 [details] [diff] [review]
Patch, v1

f+: basic idea seems ok, but another viewing is needed.  Thanks.
Comment 9 Justin Lebar (not reading bugmail) 2013-07-26 18:11:55 PDT
> 16.31 MB (16.88%) -- heap-overhead
> ├───6.74 MB (06.97%) ── bookkeeping
> ├───6.71 MB (06.94%) ── fragmentation
> └───2.87 MB (02.97%) ── page-cache

I think this is nice.

Upon reflection, I think putting these values in explicit only makes sense.  Don't Repeat Yourself.

I got rid of heap-unused (which counted mapped memory), because it was confusing and not an interesting data point.

Patch in a moment.
Comment 10 Justin Lebar (not reading bugmail) 2013-07-26 18:22:50 PDT
Created attachment 782057 [details] [diff] [review]
Part 0, v1: Rework jemalloc_stats so it exposes how much memory is used for bookkeeping.

This patch also gets rid of the redundant "committed" entry, so now
there's no confusion as to which entries overlap.
Comment 11 Justin Lebar (not reading bugmail) 2013-07-26 18:24:01 PDT
Created attachment 782058 [details] [diff] [review]
Part 1, v2: Rework the jemalloc heap memory reporters.
Comment 12 Nicholas Nethercote [:njn] 2013-07-26 20:49:15 PDT
Comment on attachment 782058 [details] [diff] [review]
Part 1, v2: Rework the jemalloc heap memory reporters.

Review of attachment 782058 [details] [diff] [review]:
-----------------------------------------------------------------

We'll need to add an annotation to AWSY to explain the jump in 'explicit'.

::: toolkit/components/aboutmemory/content/aboutMemory.js
@@ +322,5 @@
>    'explicit' :
> +"This tree covers explicit memory allocations by the application.  It includes
> +\n\n
> +* allocations made at the operating system level (via calls to functions such as
> +VirtualAlloc, vm_allocate, and mmap),

Plaintext list formatting inside HTML tooltips feels ironic.  I trust you've checked this looks ok.

@@ +336,2 @@
>  \n\n\
> +Explicit is not guaranteed to cover every explicit allocation, but it does cover \

Change "Explicit" to "'explicit'", for consistency with other tooltips.

::: toolkit/components/telemetry/TelemetryPing.js
@@ +72,5 @@
>    "images-content-used-uncompressed":
>      "MEMORY_IMAGES_CONTENT_USED_UNCOMPRESSED",
>    "heap-allocated": "MEMORY_HEAP_ALLOCATED",
> +  "heap-overhead": "MEMORY_HEAP_COMMITTED_UNUSED",
> +  "heap-overhead-ratio": "MEMORY_HEAP_COMMITTED_UNUSED_RATIO",

This doesn't look right.  Surely you should change the values as well as the keys?

::: xpcom/base/nsMemoryReporterManager.cpp
@@ +497,5 @@
>          return (int64_t) stats.allocated;
>      }
>  };
> +
> +class HeapFragmentationReporter MOZ_FINAL : public MemoryReporterBase

Nit: I try to mirror the report class names and the path names.  So this should be |HeapOverheadFragmentationReporter|.  Likewise for the other reporters below.

@@ +501,5 @@
> +class HeapFragmentationReporter MOZ_FINAL : public MemoryReporterBase
> +{
> +public:
> +    // We mark this and the other heap-overhead reporters as non-heap because
> +    // "heap" memory, in our context, means "counted in heap-allocated", which

I think it's clearer if you change 'non-heap' to 'KIND_NONHEAP' and '"heap"' to 'KIND_HEAP'.

@@ +502,5 @@
> +{
> +public:
> +    // We mark this and the other heap-overhead reporters as non-heap because
> +    // "heap" memory, in our context, means "counted in heap-allocated", which
> +    // this does not.

s/does not/is not/.

@@ +510,5 @@
> +"Committed bytes which do not correspond to an active allocation and which the "
> +"allocator is not intentionally keeping alive (i.e., not 'heap-bookkeeping' or "
> +"'heap-page-cache').  The allocator will waste some space under any "
> +"circumstances, a large value here may indicate that the allocator is "
> +"performing poorly.")

OMG is that a comma splice? :)  I think you want to add "While" to the start of the sentence.

@@ +534,5 @@
> +    int64_t Amount()
> +    {
> +        jemalloc_stats_t stats;
> +        jemalloc_stats(&stats);
> +        return stats.fragmentation;

Copy/paste error:  |stats.bookkeeping|, not |stats.fragmentation|.

And now I'm wondering how you got different numbers for bookkeeping and fragmentation in your example in comment 9...

@@ +544,5 @@
> +public:
> +    // aExplicit indicates whether this memory reporter should include
> +    // "explicit/" in its path.  We mark the explicit reporter as non-heap
> +    // because "heap" memory, in our context, means "counted in
> +    // heap-allocated", which this is not.

This comment should be removed now.

@@ +558,5 @@
> +    int64_t Amount()
> +    {
> +        jemalloc_stats_t stats;
> +        jemalloc_stats(&stats);
> +        return (int64_t) stats.dirty;

Is it worth renaming |dirty| as |pagecache| in the first patch?  It would make this patch easier to understand.

@@ +571,5 @@
> +"Memory mapped by the heap allocator that is committed, i.e. in physical "
> +"memory or paged to disk.  This value should be equal to 'heap-allocated' + "
> +"'heap-fragmentation' + 'heap-bookkeeping' + 'heap-page-cache', but because "
> +"these values are not read at exactly the same time, the result probably "
> +"won't be exact.")

Use of "exactly" and "exact" close together is awkward.  How about "..., but may not be, because these values are not read at exactly the same time".
Comment 13 Justin Lebar (not reading bugmail) 2013-07-26 23:57:25 PDT
> And now I'm wondering how you got different numbers for bookkeeping and fragmentation in 
> your example in comment 9...

It's that whole bit about reading the jemalloc stats multiple times.

I should have been more suspicious of the nearly-identical values there.

> Surely you should change the values as well as the keys?

That would create new telemetry histograms, which I figure we don't want to do, since the data is the same.

> OMG is that a comma splice?

Agh!  I am ashamed.
Comment 14 Justin Lebar (not reading bugmail) 2013-07-27 00:07:36 PDT
> I trust you've checked this looks ok.

It looks fine to me, but I'm not known for my discerning eye.
Comment 15 Justin Lebar (not reading bugmail) 2013-07-27 00:19:29 PDT
> So this should be |HeapOverheadFragmentationReporter|.

Okay, but note the ambiguity with HeapOverheadRatioReporter, which corresponds to heap-overhead-ratio, not heap-overhead/ratio.

> Is it worth renaming |dirty| as |pagecache| in the first patch?  It would make this patch 
> easier to understand.

Sure, if glandium is OK with that, that's fine with me.  I'd thought it was better to keep the stats members aligned with jemalloc's names, but I've already changed most of the other names (e.g., "bookkeeping" instead of "base"), so changing this one makes sense to me.

I reworked it to

> Memory mapped by the heap allocator that is committed, i.e. in physical
> memory or paged to disk.  This value corresponds to 'heap-allocated' + 
> 'heap-fragmentation' + 'heap-bookkeeping' + 'heap-page-cache', but because
> these values are read at different times, the result probably won't be
> exact.

does that sound OK?
Comment 16 Justin Lebar (not reading bugmail) 2013-07-27 00:23:32 PDT
Created attachment 782103 [details] [diff] [review]
Part 0, v2: Rework jemalloc_stats so it exposes how much memory is used for bookkeeping.
Comment 17 Nicholas Nethercote [:njn] 2013-07-27 15:35:30 PDT
> It's that whole bit about reading the jemalloc stats multiple times.

Ah, yep.


> > So this should be |HeapOverheadFragmentationReporter|.
> 
> Okay, but note the ambiguity with HeapOverheadRatioReporter, which
> corresponds to heap-overhead-ratio, not heap-overhead/ratio.

That's fine.


> I reworked it to
> 
> > Memory mapped by the heap allocator that is committed, i.e. in physical
> > memory or paged to disk.  This value corresponds to 'heap-allocated' + 
> > 'heap-fragmentation' + 'heap-bookkeeping' + 'heap-page-cache', but because
> > these values are read at different times, the result probably won't be
> > exact.

s/be exact/match exactly/?  If you want.
Comment 18 Mike Hommey [:glandium] 2013-07-28 16:19:11 PDT
Comment on attachment 782103 [details] [diff] [review]
Part 0, v2: Rework jemalloc_stats so it exposes how much memory is used for bookkeeping.

Review of attachment 782103 [details] [diff] [review]:
-----------------------------------------------------------------

Bikeshedding.

::: memory/mozjemalloc/jemalloc_types.h
@@ +74,5 @@
>  	size_t	mapped;		/* Bytes mapped (not necessarily committed). */
> +	size_t	allocated;	/* Bytes allocated (committed, in use by application). */
> +        size_t  fragmentation;  /* Bytes committed, not in use by the
> +                                   application, and not intentionally left
> +                                   unused (i.e., not dirty). */

It's a count of wasted memory. It has not so much to do with fragmentation.
Comment 19 Mike Hommey [:glandium] 2013-07-28 16:20:03 PDT
Can you file a bug to get something like this in jemalloc 3 and make it block bug 762449?
Comment 20 Justin Lebar (not reading bugmail) 2013-07-28 20:12:15 PDT
> It's a count of wasted memory. It has not so much to do with fragmentation.

I started listing out the different things counted here in an attempt to demonstrate that the majority was fragmentation, but then I discovered that you were right.  :)

I'll update things; thanks.
Comment 21 Justin Lebar (not reading bugmail) 2013-07-29 09:12:55 PDT
Filed bug 899126 for jemalloc3.
Comment 22 Justin Lebar (not reading bugmail) 2013-07-29 10:09:01 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/905805f269ba
https://hg.mozilla.org/integration/mozilla-inbound/rev/d4f1b39a8575

Thanks, guys.  I think this will be really helpful.
Comment 24 Justin Lebar (not reading bugmail) 2013-07-30 11:33:08 PDT
Annotation added to awsy: https://github.com/Nephyrin/MozAreWeSlimYet/commit/8ea378087d39c64f64b174155dde6ac4760dfa09
Comment 25 Jonathan Griffin (:jgriffin) 2013-07-31 15:16:26 PDT
This patch breaks my local emulator builds with:

BUILDSTATUS TIERDIR_START memory/replace
jemalloc_config.c
mozjemalloc_compat.c
/home/jgriffin/mozilla-inbound/src/memory/build/mozjemalloc_compat.c: In function 'replace_jemalloc_stats':
/home/jgriffin/mozilla-inbound/src/memory/build/mozjemalloc_compat.c:81: error: 'jemalloc_stats_t' has no member named 'dirty'
/home/jgriffin/mozilla-inbound/src/memory/build/mozjemalloc_compat.c:82: error: 'jemalloc_stats_t' has no member named 'committed'
/home/jgriffin/mozilla-inbound/src/memory/build/mozjemalloc_compat.c:82: error: 'jemalloc_stats_t' has no member named 'dirty'

The error doesn't seem to affect TBPL builds, however.  Any idea why?
Comment 26 Jonathan Griffin (:jgriffin) 2013-07-31 15:20:22 PDT
TBPL seems to build different files than my local build.  TBPL shows:

11:31:38     INFO -  BUILDSTATUS TIERDIR_START memory
11:31:38     INFO -  jemalloc.c
11:31:39     INFO -  jemalloc_config.c
11:31:39     INFO -  mozmemory_wrap.c
11:31:39     INFO -  BUILDSTATUS TIERDIR_FINISH memory

whereas my local build compiles mozjemalloc_compat.c after jemalloc_config.c.
Comment 27 Justin Lebar (not reading bugmail) 2013-07-31 15:20:49 PDT
If you're building mozjemalloc_compat.c, I think that means you're building with jemalloc3 (there's a #error at the top of the file otherwise).

jemalloc3 shouldn't be the default.  At the risk of asking the obvious, do you have something weird in your build config?
Comment 28 Jonathan Griffin (:jgriffin) 2013-07-31 15:25:46 PDT
Hmm, I don't, but I see TBPL has:

11:30:35     INFO -  	export DISABLE_JEMALLOC="" && \

which may explain the difference.  Should the non-DISABLE_JEMALLOC codepath be supported?  If not, I'll update my config.
Comment 29 Justin Lebar (not reading bugmail) 2013-07-31 15:30:20 PDT
> Should the non-DISABLE_JEMALLOC codepath be supported?

I'm having trouble with the nested negatives here, but yes, we should not require DISABLE_JEMALLOC.
Comment 30 Mike Hommey [:glandium] 2013-07-31 17:22:37 PDT
bug 899880 was filed for that build error, fwiw.
Comment 31 Jonathan Griffin (:jgriffin) 2013-07-31 17:38:54 PDT
Ah, defining MOZ_DMD is apparently responsible for triggering the MOZ_JEMALLOC3 code path.
Comment 32 Mike Hommey [:glandium] 2013-07-31 17:46:48 PDT
(In reply to Jonathan Griffin (:jgriffin) from comment #31)
> Ah, defining MOZ_DMD is apparently responsible for triggering the
> MOZ_JEMALLOC3 code path.

Yes, DMD enabled replace-malloc, which enables jemalloc3 as a replace-malloc lib.
Comment 33 Justin Lebar (not reading bugmail) 2013-07-31 18:00:47 PDT
I guess I need to fix that bug then, don't I?  :)

Note You need to log in before you can comment on or make changes to this bug.