Closed Bug 1094552 (cumulative-heap-profiling) Opened 10 years ago Closed 9 years ago

DMD: support cumulative heap profiling

Categories

(Core :: DMD, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla37

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

(Blocks 3 open bugs)

Details

(Whiteboard: [MemShrink:P2])

Attachments

(6 files, 1 obsolete file)

I've been having some success with a hacked version of DMD that doesn't free any heap memory -- so instead of doing live memory profiling, it does cumulative memory profiling. It uses *lots* of memory, but I've used it to find lots of places where we have excessive heap churn, such as silly reallocation practices.

Therefore, I think it's worth cleaning this up and landing it.
Summary: Add a no-free mode, to enable cumulative memory profiling → DMD: Add a no-free mode, to enable cumulative memory profiling
Blocks: 1094564
Blocks: 1095272
Depends on: 1095307
froydnj requested this.
Summary: DMD: Add a no-free mode, to enable cumulative memory profiling → DMD: support cumulative heap profiling
When bug 1073662 landed it caused an tresize improvement of something like 5% on OS X 10.6:

http://graphs.mozilla.org/graph.html#tests=%5B%5B254,63,21%5D%5D&sel=1415894932000,1416067732000&displayrange=7&datatype=running

Recycling may have to be disabled on 10.6 (bug 1100485), but I'm curious if the improvement is because there's a lot of heap churn during tresize. If you have time, is that something you could look into?
Alias: cumulative-heap-profiling
Blocks: 1100219
Blocks: 1096624
Blocks: 1095307
No longer depends on: 1095307
Depends on: 1102525
The getter is called IsSampled(), so the field should match that.
Attachment #8526543 - Flags: review?(continuation)
Attachment #8526543 - Flags: review?(continuation) → review+
This patch:

- Uses |auto| in Range loops, so more of them fit on a single line.

- Converts one use of HashSet::Enum (which is only needed if you're modifying
  the HashSet as you iterate) to HashSet::Range.
Attachment #8526549 - Flags: review?(continuation)
This is to give better contrast with |DeadBlock|, which will be added in the
next patch.
Attachment #8526550 - Flags: review?(continuation)
That's the preliminary patches out of the way. The big one that actually implements the cumulative profiling is still to come, but I won't finish it until next week.
Attachment #8526549 - Flags: review?(continuation) → review+
Whiteboard: [MemShrink]
Comment on attachment 8526550 [details] [diff] [review]
(part 3) - DMD: rename |Block| as |LiveBlock|

Review of attachment 8526550 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for splitting these cleanups into separate patches!
Attachment #8526550 - Flags: review?(continuation) → review+
Whiteboard: [MemShrink] → [MemShrink:P2]
Attachment #8520932 - Attachment is obsolete: true
Attachment #8528817 - Attachment is patch: true
> (part 5) - DMD: choose the profiling mode at start-up

I apologize for how messy the changes relating to the tests are. It was unavoidable.
Attachment #8528817 - Flags: review?(continuation) → review+
Comment on attachment 8528841 [details] [diff] [review]
(part 5) - DMD: choose the profiling mode at start-up

Review of attachment 8528841 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/replace/dmd/DMD.cpp
@@ +304,5 @@
>        : mDefault(aDefault), mMax(aMax), mActual(aDefault)
>      {}
>    };
>  
> +  // About modes: DMD has several modes. These modes affect what data is

nit: "About modes:" seems unnecessary.

@@ +305,5 @@
>      {}
>    };
>  
> +  // About modes: DMD has several modes. These modes affect what data is
> +  // recorded and written to the output file. And the written data affects the

nit: Drop the "And".

@@ +315,5 @@
> +  // single DMD run. But if you are only interested in one of the simpler
> +  // modes, you'd pay the price of (a) increased memory usage and (b) *very*
> +  // large log files.
> +  //
> +  // Another possibility is at the other extreme: specify the mode as soon as

Please put the option you actually implemented first, because that's what most people reading this comment care about.

It seems odd to me to describe things we don't do in a comment, rather than just in Bugzilla, but if you want to leave it here, that's fine.

@@ +328,5 @@
> +  //
> +  enum Mode
> +  {
> +    // For each live block, this mode outputs: size (usable and slop),
> +    // allocation stack, and whether it's sampled. Good for live heap

In this paragraph and the next, I'd prefer if there were complete sentences.  Something like "Good for" -> "This option is good for" and "The default mode." -> "This is the default mode."

@@ +1551,5 @@
>  public:
>    ToIdStringConverter() : mNextId(0) { mIdMap.init(512); }
>  
> +  // Converts a pointer to a unique ID. Reuses the existing ID for the pointer
> +  // if it's been seen before.

Accidental change?

@@ +1645,5 @@
> +        mode = "live";
> +      } else if (gOptions->IsLiveWithReportsMode()) {
> +        mode = "live-with-reports";
> +      } else {
> +        MOZ_ASSERT(false);

Add a description for this assert, please, like "Unknown DMD mode.".

::: memory/replace/dmd/DMD.h
@@ +275,5 @@
>  {
>    return !!DMDFuncs::Get();
>  }
>  
> +// Resets all DMD options and clears all recorded data about allocations. Only

I think this comment should explicitly call out that it sets the options to those passed in via aOptions.

::: memory/replace/dmd/test/SmokeDMD.cpp
@@ +171,5 @@
> +  e2 = (char*) realloc(e2, 512);
> +  Report(e2);
> +
> +  // First realloc is like malloc;  second realloc creates a min-sized block.
> +  // XXX: on Windows, second realloc frees the block.

Woah, really?  How does that happen?

::: memory/replace/dmd/test/test_dmd.js
@@ +130,5 @@
>  
> +  function test2(aTestName, aMode) {
> +    let name = "full-" + aTestName + "-" + aMode;
> +    jsonFile = FileUtils.getFile("CurWorkD", [name + ".json"]);
> +    test(name, [jsonFile.path])

nit: missing semicolon.

::: python/mozbuild/mozbuild/mach_commands.py
@@ +839,5 @@
>  
>      @CommandArgumentGroup('DMD')
>      @CommandArgument('--dmd', action='store_true', group='DMD',
>          help='Enable DMD. The following arguments have no effect without this.')
> +    @CommandArgument('--mode', choices=['live', 'live-with-reports'], group='DMD',

Maybe --dmd-mode?  --mode seems really generic.
Attachment #8528841 - Flags: review?(continuation) → review+
Comment on attachment 8528845 [details] [diff] [review]
(part 6) - DMD: add support for cumulative heap profiling.

Review of attachment 8528845 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the comments addressed.

This is quite nice.

I find it odd that we know whether a block is alive or dead at runtime, but then throw that information out when it comes to the JSON output.  It seems like it wouldn't be too much more work to somehow record that in the JSON.  Then the dmd.py output could have nice Live { ... } and Dead { ... } entries.  Of course, you've demonstrated that this is quite useful as is, so you or I can file a followup bug for that.

::: memory/replace/dmd/DMD.cpp
@@ +444,5 @@
>    bool IsLocked() { return mIsLocked; }
>  };
>  
>  // This lock must be held while manipulating global state such as
> +// gStackTraceTable, gLiveBlockTable, gDeadBlockList. Note that gOptions is

Accidental change?

@@ +1012,5 @@
> +    , mSlopSize(aLb.SlopSize())
> +    , mAllocStackTrace_mIsSampled(aLb.AllocStackTrace(), aLb.IsSampled())
> +  {
> +    MOZ_ASSERT(AllocStackTrace());
> +    MOZ_ASSERT_IF(IsSampled(), SlopSize() == 0);

This assertion is axiomatically true.  SlopSize() should be changed so the assertion is useful, as described below.

@@ +1022,5 @@
> +
> +  // Sampled blocks always have zero slop.
> +  size_t SlopSize() const
> +  {
> +    return IsSampled() ? 0 : mSlopSize;

This should always return mSlopSize. Then you can also remove the comment.

@@ +1027,5 @@
> +  }
> +
> +  size_t UsableSize() const
> +  {
> +    return IsSampled() ? mReqSize : (mReqSize + mSlopSize);

This should be |return mReqSize + mSlopSize|.  No need to check IsSampled() when we have mSlopSize computed already.

@@ +1263,3 @@
>    } else {
> +    // If realloc fails, we undo the prior operations by re-inserting the old
> +    // pointer into the live block table. (We don't have to do anything with

Parens don't really feel necessary to me.  That's an important comment!

@@ +1263,5 @@
>    } else {
> +    // If realloc fails, we undo the prior operations by re-inserting the old
> +    // pointer into the live block table. (We don't have to do anything with
> +    // the dead block list because the dead block hasn't yet been inserted.)
> +    // list. It will look like it was allocated for the first time here, which

spurious "list." snuck in here.

@@ +1528,5 @@
>      gLiveBlockTable = InfallibleAllocPolicy::new_<LiveBlockTable>();
>      gLiveBlockTable->init(8192);
> +
> +    // Create this even if the mode isn't LiveAndDead, in case the mode is
> +    // changed later on (as is done by SmokeDMD.cpp, for example).

How big is this empty dead block list going to be?  It doesn't seem great to pay this space cost when we're mostly going to not be using it.  It seems like we could allocate it later if needed in the ResetOptions() method.  Though you'd need to add a bunch of null checks, so whatever you think is best.

::: memory/replace/dmd/dmd.py
@@ +635,2 @@
>          liveUsableSize, liveBlocks = \
> +            printRecords(mode, liveRecords, heapUsableSize)

Please change the recordKind argument to be 'live or dead' when |mode| is 'live-and-dead'.  In the dmd.py output, seeing that a block is "Live-and-dead" makes it sound like it is both live and dead, which confused me.  Plus the dashes aren't needed for human-read output.
Attachment #8528845 - Flags: review?(continuation) → review+
> > +  // Converts a pointer to a unique ID. Reuses the existing ID for the pointer
> > +  // if it's been seen before.
> 
> Accidental change?

The line had more than 80 chars, so I changed it.

> > +  // First realloc is like malloc;  second realloc creates a min-sized block.
> > +  // XXX: on Windows, second realloc frees the block.
> 
> Woah, really?  How does that happen?

The behaviour of realloc(p, 0) depends on the implementation. Some allocators treat is like free(p). Others treat it like a realloc down to a minimum-sized block. See bug 1035001.

> Maybe --dmd-mode?  --mode seems really generic.

It is... but all the other DMD-specific options to |mach run| match the actual DMD options, and I don't want to lose that consistency, and it's not causing a problem now, so I'll leave it unchanged.
> I find it odd that we know whether a block is alive or dead at runtime, but
> then throw that information out when it comes to the JSON output.  It seems
> like it wouldn't be too much more work to somehow record that in the JSON. 
> Then the dmd.py output could have nice Live { ... } and Dead { ... }
> entries.  Of course, you've demonstrated that this is quite useful as is, so
> you or I can file a followup bug for that.

I thought about that and decided that merging live and dead blocks was better. Because otherwise we'll like end up with cases where we have, say, 1000 dead blocks and then 10 live blocks with the same allocation trace. And presenting them separately doesn't seem useful -- you'll probably just miss the 10 live blocks because it'll be some insignificant record down the bottom. In such a case the live blocks will probably just end up dead in a little while anyway.


> >  // This lock must be held while manipulating global state such as
> > +// gStackTraceTable, gLiveBlockTable, gDeadBlockList. Note that gOptions is
> 
> Accidental change?

No. Adding gDeadBlockList to the list is useful information.


> How big is this empty dead block list going to be?  It doesn't seem great to
> pay this space cost when we're mostly going to not be using it.  It seems
> like we could allocate it later if needed in the ResetOptions() method. 
> Though you'd need to add a bunch of null checks, so whatever you think is
> best.

Running with |--mode=live --show-dump-stats=yes| answers this:

> Dead block list:              32 bytes (0 entries)

I'll add a comment that it's tiny :)
Blocks: 1110530
Blocks: 1110596
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
Blocks: 1111432
Blocks: 1111879
Blocks: 1111885
Blocks: 1113010
Blocks: 1113069
Blocks: 1124545
You need to log in before you can comment on or make changes to this bug.