DMD: Refactor and reduce space consumption of blocks and block groups

RESOLVED FIXED in Firefox 19

Status

()

defect
RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: njn, Assigned: njn)

Tracking

Trunk
mozilla20
Points:
---

Firefox Tracking Flags

(firefox19 fixed, firefox20 fixed, b2g18 fixed)

Details

Attachments

(8 attachments, 1 obsolete attachment)

35.06 KB, patch
justin.lebar+bug
: review+
justin.lebar+bug
: approval-mozilla-aurora+
justin.lebar+bug
: approval-mozilla-b2g18+
Details | Diff | Splinter Review
5.08 KB, patch
justin.lebar+bug
: review+
Details | Diff | Splinter Review
19.88 KB, patch
justin.lebar+bug
: review+
Details | Diff | Splinter Review
5.10 KB, patch
justin.lebar+bug
: review+
Details | Diff | Splinter Review
11.57 KB, patch
justin.lebar+bug
: review+
Details | Diff | Splinter Review
7.15 KB, patch
justin.lebar+bug
: review+
Details | Diff | Splinter Review
9.49 KB, patch
justin.lebar+bug
: review+
Details | Diff | Splinter Review
806 bytes, patch
justin.lebar+bug
: review+
Details | Diff | Splinter Review
In bug 717853 comment 133 I promised a post-landing refactoring to fix up various things.  There will be several patches, and there will be some easy space wins along the way.
This patch:

- Splits BlockKey into LiveBlockKey and DoubleReportBlockKey.  They don't have
  much overlap, so this is reasonable.  And it gets rid of that bloody union.

- Renames Block as LiveBlock, because it inherits from LiveBlockKey.

- Splits BlockGroup into three:  a base class (named BlockGroup) and two
  sub-classes, LiveBlockGroup (which also inherits from LiveBlockKey) and
  DoubleReportBlockGroup (which also inherits from DoubleBlockKey).

Problems this solves:

- Prior to this patch, Block inherited from BlockKey, but only the |Live|
  variant, never the |DoubleReport| variant.  This necessitated things like
  asserting that we had a |Live| variant at various places which was silly.

- This shrinks the per-live-heap-block metadata.  (LiveBlock is smaller than
  Block.)  Here are the (64-bit) stats from stress mode before and after:

    Stack traces:           208,000 bytes
    Stack trace table:      262,224 bytes (16,384 entries, 1,000 used)
    Live block table:    83,886,160 bytes (1,048,576 entries, 500,000 used)
    Double-report table:        400 bytes (4 entries, 0 used)
    Unreported table:       163,840 bytes (2,048 entries, 1,000 used)
    Reported table:         163,840 bytes (2,048 entries, 0 used)
  ->
    Stack traces:           208,000 bytes
    Stack trace table:      262,224 bytes (16,384 entries, 1,000 used)
    Live block table:    67,108,944 bytes (1,048,576 entries, 500,000 used)
    Double-report table:        368 bytes (4 entries, 0 used)
    Unreported table:       131,072 bytes (2,048 entries, 1,000 used)
    Reported table:         131,072 bytes (2,048 entries, 0 used)
Attachment #691196 - Flags: review?(justin.lebar+bug)
LiveBlockGroup is a LiveBlockKey, so the has policy can be moved from the
former into the latter.  Likewise for DoubleReportBlock{Group,Key}.
Attachment #691197 - Flags: review?(justin.lebar+bug)
BlockSize is used for Blocks and BlockGroups and FrameGroups.  This patch
renames it as AllocSize, which makes it sound less Block-specific.
Attachment #691198 - Flags: review?(justin.lebar+bug)
I thought I could make BlockSize smaller, because mSlop is redundant w.r.t. mReq.  But while that's true for malloc/calloc/realloc, it's not true for memalign/posix_memalign/realloc.

For those ones, jemalloc gives an actual size that's a multiple of the aligned size.  For example:

- memalign(128, 129) gives a 256 byte block, for 127 bytes of slop.
- valloc(1) gives a 4096 byte block, for 4095 bytes of slop.

So we have to store both.  How annoying.
Could you go the other way around?  Store mSlop and use malloc_usable_size to compute the allocated size?
> Could you go the other way around?  Store mSlop and use malloc_usable_size
> to compute the allocated size?

Hmm, if I have mReq and the actual pointer to the block, that would suffice.  Maybe that'll work.
(In reply to Nicholas Nethercote [:njn] from comment #1)
> Created attachment 691196 [details] [diff] [review]
> (part 1) - DMD: Split BlockKey in two, and fix the ensuing fall-out.

A note about this patch:  I've done a minimal amount of renaming in this patch, using names that make sense in terms of the names we already have.  But I'm planning a subsequent patch that will improve the naming overall.
Comment on attachment 691198 [details] [diff] [review]
(part 3) - DMD: Rename BlockSize as AllocSize.

Let's hold off on patch 3 for the moment, as a subsequent patch might render it unnecessary.
Attachment #691198 - Flags: review?(justin.lebar+bug)
Attachment #691197 - Flags: review?(justin.lebar+bug) → review+
Comment on attachment 691196 [details] [diff] [review]
(part 1) - DMD: Split BlockKey in two, and fix the ensuing fall-out.

>+// This is used by both LiveBlocks and LiveBlockGroups.
>+class LiveBlockKey

[...]

>+  // Live blocks can be reported in two ways.
>+  // - The most common is via a memory reporter traversal -- the block is
>+  //   reported when the reporter runs, causing DMD to mark it as reported,
>+  //   and DMD must clear the marking once it has finished its analysis.
>+  // - Less common are ones that are reported immediately on allocation.  DMD
>+  //   must *not* clear the markings of these blocks once it has finished its
>+  //   analysis.  The |mReportedOnAlloc| field is set for such blocks.
>+  const StackTrace* mReportStackTrace;  // nullptr if unreported
>+  const char*       mReporterName;      // gUnreportedName if unreported
>+  bool              mReportedOnAlloc;   // true if block was reported

You may have thought of this already, but:

* You could save another word by moving mReportedOnAlloc into one of the
pointers.

* We could save two or three words for most blocks by splitting out the
reported-on-alloc blocks from the other ones (since msot blocks aren't
reported-on-alloc).  Given that the block table is where the vast majority of
our memory goes, perhaps this would be worthwhile.

>   bool IsReported() const
>   {
>-    MOZ_ASSERT(IsSaneLiveBlock());  // should only call this on live blocks
>-    bool isRep = ReporterName() != gUnreportedName;
>+    MOZ_ASSERT(IsSane());  // should only call this on live blocks

Missed a spot.

>+  static bool Match(const LiveBlockKey& aA, const LiveBlockKey& aB)
>   {
>+    return aA.mAllocStackTrace   == aB.mAllocStackTrace &&
>+           aA.mReportStackTrace == aB.mReportStackTrace &&
>+           aA.mReporterName     == aB.mReporterName;

Alignment.

>@@ -1040,132 +998,205 @@ replace_free(void* aPtr)
>   FreeCallback(aPtr, t);
>   gMallocTable->free(aPtr);
> }
> 
> namespace mozilla {
> namespace dmd {
> 
> //---------------------------------------------------------------------------
>-// Block groups
>+// Live and double-report block groups
> //---------------------------------------------------------------------------
> 
>-// A group of one or more heap blocks with a common BlockKey.
>-class BlockGroup : public BlockKey
>+class BlockGroup

Maybe BlockGroupValues or BlockGroupStats or something would be a better name
here.  But I see now that you're planning a mass-rename sometime later.  OK.

> {
>+protected:
>+  // {Live,DoubleReport}BlockKey serve as the key in
>+  // {Live,DoubleReport}BlockGroupTable.  Thes two fields constitute the value,
>+  // so it's ok for them to be |mutable|.
>+  mutable uint32_t  mNumBlocks;     // number of blocks with this LiveBlockKey
>   mutable BlockSize mCombinedSize;  // combined size of those blocks
> 
> public:
>+  explicit BlockGroup()
>+    : mNumBlocks(0),
>       mCombinedSize()
>   {}

explicit only does something to one-arg constructors.  At least, I'm pretty
sure.

Explicitly calling the mCombinedSize default constructor isn't necessary, but
it's no problem if you want to keep it.

>+// A group of one or more live heap blocks with a common LiveBlockKey.
>+class LiveBlockGroup : public LiveBlockKey, public BlockGroup
>+{
>+  friend class FrameGroup;      // FrameGroups are created from LiveBlockGroups
>+
>+public:
>+  explicit LiveBlockGroup(const LiveBlockKey& aKey)
>+    : LiveBlockKey(aKey),
>+      BlockGroup()
>+  {}

Don't need the explicit BlockGroup() constructor call (unless you want it).
Attachment #691196 - Flags: review?(justin.lebar+bug) → review+
> * You could save another word by moving mReportedOnAlloc into one of the
> pointers.

I have that on my list, thanks.


> * We could save two or three words for most blocks by splitting out the
> reported-on-alloc blocks from the other ones (since msot blocks aren't
> reported-on-alloc).  Given that the block table is where the vast majority of
> our memory goes, perhaps this would be worthwhile.

Then I'd need another table to hold all the report data when the memory reporters run.  In the case where we open lots of tabs, then close them all, then run the memory reporters, it would be better space-wise.  But in the case where we open lots of tabs and then run the memory reporters, it would be worse because of the overhead of splitting the data across two tables.


> explicit only does something to one-arg constructors.  At least, I'm pretty
> sure.

Copy-paste fail.  Interesting there's no compiler warning.


> Explicitly calling the mCombinedSize default constructor isn't necessary, but
> it's no problem if you want to keep it.

Yeah, I like it.
> But in the case where we open lots of tabs and then run the memory reporters, it would be 
> worse because of the overhead of splitting the data across two tables.

I was thinking that after you ran the memory reporters, you'd clear all the data for all of the non-report-on-alloc blocks.  The memory report info is only persistent for those blocks for which it needs to be persistent.

Whether you implement that by having a third table that you simply delete after reporting is done, or whether you shove all of the reported stuff into the same table as the reported-on-alloc blocks is up to the implementer, of course...
This patch splits BlockSize into BlockSize (which pertains to a single block)
and GroupSize (which pertains to a group of blocks).  It also reduces the size
of BlockSize by not storing the slop size;  it can be recovered with the
requested size and the block pointer.

This reduces the size of an entry in the LiveBlock table from 64 to 56 bytes.
Attachment #691665 - Flags: review?(justin.lebar+bug)
Comment on attachment 691665 [details] [diff] [review]
(part 3) - DMD: Distinguish BlockSize and GroupSize.

>+  // This assumes that we'll never request an allocation of 2 GiB or more on
>+  // 32-bit platforms.

You could assert this if you're worried about it, since we are in fact
intercepting calls to malloc.  :)

>+  size_t Usable(const void* aPtr) const
>   {
>+    return mSampled ? mReq : MallocSizeOf(aPtr);
>+  }

You said earlier that we couldn't do this.  What changed?
Attachment #691665 - Flags: review?(justin.lebar+bug) → review+
> You could assert this if you're worried about it, since we are in fact
> intercepting calls to malloc.  :)

You mean assert-assert, or always-crash-assert?  The latter is more attractive.


> You said earlier that we couldn't do this.  What changed?

I'm not sure what you mean by "this"...  but maybe I was mistaken.
> You mean assert-assert, or always-crash-assert?  The latter is more attractive.

The latter.

> I'm not sure what you mean by "this"...  but maybe I was mistaken.

The issue in comment 4.
Attachment #691198 - Attachment is obsolete: true
> The issue in comment 4.

I thought that given the requested size, you could find the actual size with malloc_good_size.  You can't.  But if you also have the block pointer, you can (that's what I meant in comment 6).
Now that BlockSize is only inherited by LiveBlock, it can be inlined into 
LiveBlock.
Attachment #691694 - Flags: review?(justin.lebar+bug)
 > I thought that given the requested size, you could find the actual size with
> malloc_good_size.  You can't.  But if you also have the block pointer, you
> can (that's what I meant in comment 6).

Ah, I see.  Got it.
Attachment #691694 - Flags: review?(justin.lebar+bug) → review+
Currently LiveBlockTable is a hashmap(ptr, LiveBlock).  This patch moves the
ptr into LiveBlock and changes the tabe to a hashset(LiveBlock).

This means there is no need to pass the pointer into Slop() and Usable().

Note that LiveBlock's hash policy now overrides the one it inherits from 
LiveBlockKey, which is gross.  I'll fix that in the next patch.
Attachment #691695 - Flags: review?(justin.lebar+bug)
Attachment #691695 - Flags: review?(justin.lebar+bug) → review+
LiveBlockKey is currently the base class for both LiveBlock and
LiveBlockGroup.  This patch changes that for LiveBlock.

As a result, LiveBlock and LiveBlockKey have some same-named fields and some
similar methods, and you can no longer use a LiveBlock to look up a
LiveBlockGroupTable (you have to build a LiveBlockKey from the LiveBlock
first).

I think the slight repetition is worth it for the general unspaghettifying
effect, e.g.:

- Hash policies are easier to understand now -- no overriding.

- The mReportedOnAlloc field is only needed in LiveBlock.

- The same-named fields are mutable in LiveBlock and const in LiveBlockKey.
Attachment #691697 - Flags: review?(justin.lebar+bug)
This previous patch didn't actually compile, due to badly ordered classes.
This patch just moves LiveBlockKey and DoubleReportBlockKey without any other
changes.  (I did it like this because patch 6 and 6b combined produce a
horrible diff.)
Attachment #691699 - Flags: review?(justin.lebar+bug)
Attachment #691699 - Flags: review?(justin.lebar+bug) → review+
Might as well do this while I'm in here.
Attachment #691700 - Flags: review?(justin.lebar+bug)
Comment on attachment 691700 [details] [diff] [review]
(part 7) - DMD: Fix bug in strdup_.

Sorry again about missing this one during review; I've written this bug before and know to look for it.
Attachment #691700 - Flags: review?(justin.lebar+bug) → review+
Comment on attachment 691697 [details] [diff] [review]
(part 6) - DMD: Don't use LiveBlockKey in LiveBlock.

Wouldn't it be nice if we could use virtual functions?  Ah well...  :)

>@@ -782,37 +767,68 @@ public:
>            aA.mReportStackTrace1 == aB.mReportStackTrace1 &&
>            aA.mReportStackTrace2 == aB.mReportStackTrace2 &&
>            aA.mReporterName1     == aB.mReporterName1 &&
>            aA.mReporterName2     == aB.mReporterName2;
>   }
> };
> 
> // A live heap block.
>+class LiveBlock
> {
>+  friend class LiveBlockKey;

You know what I'm going to say about this...  Can we just add getters for the
necessary fields?  (It took me forever to figure out that you needed the friend
for the LiveBlockKey constructor.)
Attachment #691697 - Flags: review?(justin.lebar+bug) → review+
> You know what I'm going to say about this...

Alright, alright.  It was late in the day, I was feeling lazy, and the friend option was less code :P
Blocks: 822148
No longer blocks: 822148
Comment on attachment 691196 [details] [diff] [review]
(part 1) - DMD: Split BlockKey in two, and fix the ensuing fall-out.

[Triage Comment]
a=me for all patches that landed as part of this bug.  RyanVM, let me know if not marking each patch individually makes your life harder.
Attachment #691196 - Flags: approval-mozilla-b2g18+
Attachment #691196 - Flags: approval-mozilla-aurora+
Not a problem as long as you say so.
Comment on attachment 691665 [details] [diff] [review]
(part 3) - DMD: Distinguish BlockSize and GroupSize.

>+  const GroupSize& GroupSize() const { return mGroupSize; }
This is invalid C++, because in this scope GroupSize should refer to the function name and not the type. It just so happens that the compiler doesn't know that GroupSize refers to the function name until after it's parsed it as the type. GCC detects this if you use -fpedantic.
(In reply to comment #31)
>(From update of attachment 691665 [details] [diff] [review])
>>+  const GroupSize& GroupSize() const { return mGroupSize; }
>This is invalid C++, because in this scope GroupSize should refer to the
>function name and not the type. It just so happens that the compiler doesn't
>know that GroupSize refers to the function name until after it's parsed it
>as the type. GCC detects this if you use -fpedantic.
(Fixed by bug 822698)
Component: General → DMD
You need to log in before you can comment on or make changes to this bug.