Closed
Bug 1094564
Opened 6 years ago
Closed 6 years ago
Use a segmented vector in SnowWhiteKiller
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla36
People
(Reporter: njn, Assigned: njn)
References
Details
Attachments
(2 files, 1 obsolete file)
|
9.55 KB,
text/plain
|
Details | |
|
7.57 KB,
patch
|
smaug
:
review+
|
Details | Diff | Splinter Review |
SnowWhiteKiller allocates an nsTArray whose size is based on a worst-case
scenario, e.g. if every object in the purple buffer needs to be appended to it.
And it can be quite big -- often 0.5 MiB or more each time. Here are some
sizes from a DMD run:
> Individual block sizes: 913,408; 733,184; 724,992; 716,800 x 2; 499,712;
> 462,848 x 2; 454,656 x 2; 450,560 x 2; 446,464; 442,368; 438,272 x 2; 434,176
> x 2; 430,080 x 2; 425,984; 421,888; 413,696; 409,600; 401,408 x 2; 397,312;
> 393,216; 376,832; 368,640; 364,544 x 2; 233,472 x 5; 167,936; 135,168;
> 57,344; 40,960 x 3; 36,864 x 7; 32,768 x 4; 28,672 x 8; 24,576 x 9; 20,480 x
> 5; 16,384 x 5; 12,288 x 24; 8,192 x 17; 4,096 x 20; ~4,093 x 2
But in practice, most of the time the nsTArray never gets even close to being
filled. In fact, around 95% of the time a 4 KiB (on 32-bit) or 8 KiB (on
64-bit) buffer will suffice. So a better approach is to use a segmented vector
type and grow it as necessary, instead of allocating for the worse case.
Benefits:
- It will greatly reduce the cumulative allocations.
- It will avoids the need for large allocations, which can be a problem on
win32 when virtual address space runs low.
- It might help with fragmentation, because all the allocations will be a
single nice size (4 or 8 KiB) instead of having all these awkward sizes like
892 KiB and 716 KiB and 708 KiB.| Assignee | ||
Comment 1•6 years ago
|
||
Here are some more measurements from a different browser session. Each line
shows a SnowWhiteKiller instantiation, the required length and the allocated
capacity for the array. (This is measured in elements, not bytes.)
E.g. this line:
> 3 / 12547 (0.02%)
indicates that we allocated an nsTArray with a capacity of 12547, but we only
ended up using 3 of those elements.
Summary stats:
- 574 cases in total.
- 80 (13.9%) of them ended up using 0 elements.
- 555 (96.7%) of them needed <= 341 elements, which is how many can fit in 4
KiB on 32-bit and 8 KiB on 64-bit.
- The maximum number of used elements is 7212, which is 173,088 bytes on
64-bit.
- The maximum allocated capacity is 30,519, which is 732,456 bytes on 64-bit.| Assignee | ||
Comment 2•6 years ago
|
||
This patch implements the change suggested in comment 0. If I start the browser, open Gmail, CNN and TechCrunch, and the shutdown, I get: - old: 28,440,888 bytes of cumulative allocations from 210 allocs - new: 1,351,680 bytes of cumulative allocations from 330 allocs I considered writing a generic SegmentedVector class, but ended up opting for code specific to this file, for conciseness.
Attachment #8517878 -
Flags: review?(bugs)
| Assignee | ||
Comment 3•6 years ago
|
||
Here are cumulative measurements using the current code in an e10s build when running http://gregor-wagner.com/tmp/mem50: > PARENT PROCESS > -------------- > Live { > ~1,200 blocks in heap block record 33 of 4,903 > ~82,550,742 bytes (~80,033,302 requested / ~2,517,440 slop) > Individual block sizes: 1,122,304; 1,118,208; 479,232; 434,176; 417,792 x 2; 413,696 x 5; 409,600; 364,544; 323,584; 319,488; 307,200; 294,912 x 2; 266,240; 253,952; 249,856; 245,760; 241,664; 237,568 x 5; 233,472 x 6; 229,376 x 6; 225,280 x 3; 221,184 x 4; 217,088; 212,992 x 3; 204,800 x 2; 200,704 x 2; 196,608; 192,512; 188,416 x 5; 180,224 x 4; 176,128 x 4; 172,032 x 8; 167,936 x 7; 163,840 x 4; 159,744 x 2; 155,648 x 5; 151,552 x 6; 147,456 x 8; 143,360 x 3; 139,264 x 13; 135,168 x 9; 131,072 x 8; 126,976 x 16; 122,880 x 13; 118,784 x 16; 114,688 x 9; 110,592 x 14; 106,496 x 22; 102,400 x 18; 98,304 x 18; 94,208 x 17; 90,112 x 19; 86,016 x 31; 81,920 x 27; 77,824 x 32; 73,728 x 29; 69,632 x 39; 65,536 x 30; 61,440 x 33; 57,344 x 43; 53,248 x 24; 49,152 x 29; 45,056 x 40; 40,960 x 44; 36,864 x 48; 32,768 x 55; 28,672 x 41; 24,576 x 61; 20,480 x 49; 16,384 x 53; 12,288 x 64; 8,192 x 72; 4,096 x 37; ~4,093 x 14 > 0.64% of the heap (60.73% cumulative) > Allocated at { > #01: nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithMemutils>::EnsureCapacity(unsigned long, unsigned long) (/home/njn/moz/mi5/co64dmd/toolkit/components/ur > l-classifier/../../../dist/include/nsTArray-inl.h:135) > #02: nsTArray_Impl<SnowWhiteObject, nsTArrayFallibleAllocator>::SetCapacity(unsigned long) (/home/njn/moz/mi5/co64dmd/xpcom/base/../../dist/include/nsTArray.h:1478) > #03: SnowWhiteKiller (/home/njn/moz/mi5/co64dmd/xpcom/base/../../../xpcom/base/nsCycleCollector.cpp:2606) > } > } > > CHILD PROCESS > ------------- > Live { > ~1,319 blocks in heap block record 7 of 5,967 > ~270,606,294 bytes (~267,927,758 requested / ~2,678,536 slop) > Individual block sizes: 2,383,872; 2,367,488; 2,301,952; 2,256,896; 2,252,800 x 16; 2,093,056; 2,068,480; 1,998,848; 1,359,872; 1,101,824; 933,888; 786,432; 741,376; 733,184; 716,800; 700,416; 696,320; 692,224; 679,936; 675,840; 667,648; 663,552 x 2; 638,976 x 2; 630,784 x 2; 626,688 x 2; 622,592; 618,496; 614,400 x 2; 610,304; 606,208 x 2; 602,112 x 4; 598,016 x 2; 593,920 x 2; 589,824 x 2; 585,728 x 3; 581,632; 577,536 x 2; 569,344 x 4; 565,248 x 7; 561,152 x 2; 557,056 x 2; 552,960 x 4; 548,864; 544,768 x 3; 540,672; 536,576 x 2; 532,480 x 4; 528,384; 520,192; 512,000; 507,904; 503,808 x 2; 499,712; 495,616; 483,328 x 2; 475,136; 471,040 x 3; 466,944; 462,848 x 2; 458,752 x 2; 454,656 x 2; 450,560 x 2; 446,464 x 3; 442,368; 434,176; 425,984 x 4; 421,888; 417,792 x 2; 413,696 x 3; 405,504 x 3; 401,408 x 3; 397,312 x 2; 393,216 x 2; 389,120 x 3; 385,024 x 2; 376,832 x 2; 372,736 x 4; 368,640 x 4; 364,544 x 5; 360,448 x 5; 352,256 x 2; 348,160 x 5; 344,064 x 3; 339,968 x 3; 335,872 x 6; 331,776 x 6; 327,680; 323,584 x 4; 319,488; 315,392 x 2; 311,296 x 3; 307,200 x 8; 303,104 x 4; 299,008 x 2; 294,912 x 2; 290,816 x 5; 286,720 x 4; 282,624; 278,528 x 6; 274,432 x 5; 270,336 x 4; 266,240 x 5; 262,144 x 6; 258,048 x 7; 253,952 x 2; 249,856 x 4; 245,760 x 3; 241,664 x 6; 237,568 x 7; 233,472 x 7; 229,376 x 5; 225,280 x 8; 221,184 x 8; 217,088 x 9; 212,992 x 14; 208,896 x 14; 204,800 x 19; 200,704 x 15; 196,608 x 13; 192,512 x 8; 188,416 x 13; 184,320 x 10; 180,224 x 18; 176,128 x 19; 172,032 x 17; 167,936 x 18; 163,840 x 22; 159,744 x 18; 155,648 x 14; 151,552 x 31; 147,456 x 28; 143,360 x 15; 139,264 x 30; 135,168 x 32; 131,072 x 27; 126,976 x 9; 122,880 x 21; 118,784 x 14; 114,688 x 17; 110,592 x 7; 106,496 x 22; 102,400 x 18; 98,304 x 11; 94,208 x 22; 90,112 x 29; 86,016 x 19; 81,920 x 5; 77,824 x 16; 73,728 x 6; 69,632 x 15; 65,536 x 14; 61,440 x 16; 57,344 x 17; 53,248 x 19; 49,152 x 29; 45,056 x 20; 40,960 x 30; 36,864 x 29; 32,768 x 19; 28,672 x 22; 24,576 x 28; 20,480 x 34; 16,384 x 9; 12,288 x 12; 8,192 x 32; 4,096 x 25; ~4,093 x 14 > 1.91% of the heap (44.37% cumulative) > Allocated at { > #01: nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithMemutils>::EnsureCapacity(unsigned long, unsigned long) (/home/njn/moz/mi5/co64dmd/toolkit/components/ur > l-classifier/../../../dist/include/nsTArray-inl.h:135) > #02: nsTArray_Impl<SnowWhiteObject, nsTArrayFallibleAllocator>::SetCapacity(unsigned long) (/home/njn/moz/mi5/co64dmd/xpcom/base/../../dist/include/nsTArray.h:1478) > #03: SnowWhiteKiller (/home/njn/moz/mi5/co64dmd/xpcom/base/../../../xpcom/base/nsCycleCollector.cpp:2606) > } > }
| Assignee | ||
Updated•6 years ago
|
Depends on: cumulative-heap-profiling
Updated•6 years ago
|
Summary: Used a segmented vector in SnowWhiteKiller → Use a segmented vector in SnowWhiteKiller
Comment 4•6 years ago
|
||
(We really need to put a proper segmented array to some public place, like xpcom/* or mfbt/*.) Couldn't you at least use LinkedList and not ChunkList. But even better would be if you could reuse SegmentedArray which is already used for JSPurpleBuffer. It does have magical '60' for the array size, but that could be factored out to be more flexible.
Comment 5•6 years ago
|
||
Comment on attachment 8517878 [details] [diff] [review] Used a segmented vector in SnowWhiteKiller So unless there is some really good reason to add yet another segmented array, we should use the existing one (especially because it is a template class already).
Attachment #8517878 -
Flags: review?(bugs) → review-
| Assignee | ||
Comment 6•6 years ago
|
||
This patch generalizes SegmentedArray a little, and then uses it instead of nsTArray in SnowWhiteKiller.
Attachment #8518666 -
Flags: review?(bugs)
| Assignee | ||
Updated•6 years ago
|
Attachment #8517878 -
Attachment is obsolete: true
Comment 7•6 years ago
|
||
Comment on attachment 8518666 [details] [diff] [review] Used SegmentedArray in SnowWhiteKiller >+ // Segments are 4 KiB on 32-bit and 8 KiB on 64-bit. >+ static const size_t kIdealSegmentSize = sizeof(void*) * 1024; >+ static const size_t kSingleElemSegmentSize = >+ sizeof(SegmentedArrayElement<SnowWhiteObject, 1>); >+ static const size_t kSegmentCapacity = >+ (kIdealSegmentSize - kSingleElemSegmentSize) / sizeof(SnowWhiteObject) + 1; >+ >+ static const size_t kActualSegmentSize = >+ sizeof(SegmentedArrayElement<SnowWhiteObject, kSegmentCapacity>); 2 spaces for indentation. > ~SnowWhiteKiller() ... >+ auto segment = mObjects.GetFirstSegment(); Ok, this is about the only case where use o >+ while (segment) { >+ for (uint32_t i = 0; i < segment->Length(); i++) { >+ SnowWhiteObject& o = segment->ElementAt(i); >+ if (!o.mRefCnt->get() && !o.mRefCnt->IsInPurpleBuffer()) { >+ mCollector->RemoveObjectFromGraph(o.mPointer); >+ o.mRefCnt->stabilizeForDeletion(); >+ o.mParticipant->Trace(o.mPointer, *this, nullptr); >+ o.mParticipant->DeleteCycleCollectable(o.mPointer); >+ } > } >+ segment = segment->getNext(); > } Given the current SegmentedArray setup, you'd need to have mObjects.Clear(); call here. Don't you get an assertion all the time without it? (I don't recall why I wanted the explicit .Clear() call and MOZ_ASSERT(IsEmpty()); in SegmentedArray dtor.) > > void > Visit(nsPurpleBuffer& aBuffer, nsPurpleBufferEntry* aEntry) > { > MOZ_ASSERT(aEntry->mObject, "Null object in purple buffer"); > if (!aEntry->mRefCnt->get()) { > void* o = aEntry->mObject; > nsCycleCollectionParticipant* cp = aEntry->mParticipant; > CanonicalizeParticipant(&o, &cp); > SnowWhiteObject swo = { o, cp, aEntry->mRefCnt }; >- if (mObjects.AppendElement(swo)) { >- aBuffer.Remove(aEntry); >- } >+ mObjects.AppendElement(swo); >+ aBuffer.Remove(aEntry); > } > } > > bool HasSnowWhiteObjects() const > { >- return mObjects.Length() > 0; >+ return !mObjects.IsEmpty(); > } > > virtual void Trace(JS::Heap<JS::Value>* aValue, const char* aName, > void* aClosure) const > { > JS::Value val = *aValue; > if (val.isMarkable()) { > void* thing = val.toGCThing(); >@@ -2696,17 +2717,17 @@ public: > > virtual void Trace(JS::Heap<JSFunction*>* aFunction, const char* aName, > void* aClosure) const > { > } > > private: > nsCycleCollector* mCollector; >- FallibleTArray<SnowWhiteObject> mObjects; >+ ObjectsArray mObjects; > }; > > class RemoveSkippableVisitor : public SnowWhiteKiller > { > public: > RemoveSkippableVisitor(nsCycleCollector* aCollector, > uint32_t aMaxCount, bool aRemoveChildlessNodes, > bool aAsyncSnowWhiteFreeing,
Attachment #8518666 -
Flags: review?(bugs) → review+
| Assignee | ||
Comment 8•6 years ago
|
||
> >+ auto segment = mObjects.GetFirstSegment(); > Ok, this is about the only case where use o Sorry, I don't understand this comment. > Given the current SegmentedArray setup, you'd need to have mObjects.Clear(); > call here. > Don't you get an assertion all the time without it? > (I don't recall why I wanted the explicit .Clear() call and > MOZ_ASSERT(IsEmpty()); in SegmentedArray dtor.) Good catch. I did a try run and got build errors on most platforms, due to having SegmentedArrayElement as a class name and a typedef. Linux-ASAN was one of the few platforms that built and I got a bunch of leaks there, very likely because of this. Shouldn't be hard to fix, I'll definitely do so and re-run on try before landing. BTW, something I forgot to mention is that adding a new SnowWhiteObject to the array is now infallible. I figure this is probably ok, because if we fail to allocate 4 KiB or 8 KiB then we're in deep trouble anyway. But I could make it fallible if you think it's important.
Comment 9•6 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #8) > > >+ auto segment = mObjects.GetFirstSegment(); > > Ok, this is about the only case where use o > > Sorry, I don't understand this comment. What happened to my comment... I think I said "Ok, this is about the only case where use of auto is not horrible." > BTW, something I forgot to mention is that adding a new SnowWhiteObject to > the array is now infallible. I figure this is probably ok, because if we > fail to allocate 4 KiB or 8 KiB then we're in deep trouble anyway. But I > could make it fallible if you think it's important. I don't think it is important.
| Assignee | ||
Comment 10•6 years ago
|
||
Try looks good on an updated patch that calls Clear(): https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=70ae0f061954
| Assignee | ||
Comment 11•6 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/095f5e23c5c5
Comment 12•6 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/095f5e23c5c5
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
You need to log in
before you can comment on or make changes to this bug.
Description
•