Closed
Bug 1368369
Opened 7 years ago
Closed 7 years ago
Use an nsTArray rather than AutoTArray to hold frame properties and embed FrameProperties directly in nsIFrame to reduce RAM footprint
Categories
(Core :: Layout, enhancement)
Core
Layout
Tracking
()
RESOLVED
FIXED
mozilla55
Tracking | Status | |
---|---|---|
firefox55 | --- | fixed |
People
(Reporter: jfkthame, Assigned: jfkthame)
References
Details
Attachments
(2 files, 2 obsolete files)
5.42 KB,
patch
|
MatsPalmgren_bugz
:
review+
|
Details | Diff | Splinter Review |
9.28 KB,
patch
|
MatsPalmgren_bugz
:
review+
|
Details | Diff | Splinter Review |
The FrameProperties object in bug 1365982 uses an AutoTArray<PropertyValue,1>, so that a single property can be stored in the FrameProperties object itself. When more than one property is present, an additional heap allocation will be needed to create a separate array contents buffer. It might be worth making the autoarray buffer slightly larger, so that the need for an additional allocation becomes rarer, but of course there'll be a space trade-off to be considered. I did some general browsing with an instrumented build (visiting major sites like wikipedia, ebay, amazon, nytimes, bbc news, gmail, ...). Altogether, about 170K frames of all types were created during the (short) session. Of these, 66% never had any frame properties set, and so no FrameProperties object was created for them. Of the 34% (57K) frames that did get properties, about 60% only ever had a single property, and so it was stored in the autoarray's inline buffer, with no additional allocation. A further 22% had two properties, and a further 8% had three. Higher numbers of properties on a frame, as would be expected, become increasingly rare. This suggests that if we bumped the autoarray buffer size from 1 to 3 (increasing the size of FrameProperties, assuming a 64-bit build, from 32 to 64 bytes), we'd avoid separate array buffer allocation for about 90% of the frames that have any properties attached (up from 60% now). Profiling suggests this may significantly reduce the time spent under FrameProperties methods. With a current trunk build, loading the HTML5 spec and filtering the profile for FrameProperties, I generally see around 18-20ms. With the autoarray buffer increased to 3, this drops to around 10-12ms.
Assignee | ||
Comment 1•7 years ago
|
||
Maybe worth doing, for the slight extra perf (see above)?
Attachment #8872190 -
Flags: review?(mats)
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → jfkthame
Status: NEW → ASSIGNED
Comment 2•7 years ago
|
||
Another way of saying that is that increasing the AutoTArray size to 3 makes an additional
10.2% (96.6% - 86.4%) of the total number of frames avoid one extra malloc during its lifetime?
Using your numbers:
66% of all frames have no properties
20.4% have 1 property => 86.4% (with <=1)
7.48% have 2 properties => 93.88% (with <=2)
2.72% have 3 properties => 96.6% (with <=3)
I looked into the allocation size overhead a bit, which increases significantly with this change:
AutoTArray<PropertyValue,1>
0: 0 (i.e. I'm not including the 8 bytes for nsIFrame:mProperties in any numbers)
1: required_sz:16 actual_sz:32 mallocs:1 (overhead: 16 -- for 34199 frames that's 547184 bytes)
2: required_sz:32 actual_sz:96 mallocs:2 (overhead: 64 -- for 12539 frames that's 802496 bytes)
3: required_sz:48 actual_sz:96 mallocs:2 (overhead: 48 -- for 4559 frames that's 218832 bytes)
4: required_sz:64 actual_sz:160 mallocs:3 (overhead: 96)
5: required_sz:80 actual_sz:160 mallocs:3 (overhead: 80)
6: required_sz:96 actual_sz:160 mallocs:3 (overhead: 64)
7: required_sz:112 actual_sz:160 mallocs:3 (overhead: 48)
8: required_sz:128 actual_sz:288 mallocs:4 (overhead: 160)
AutoTArray<PropertyValue,3>
0: 0 (i.e. I'm not including the 8 bytes for nsIFrame:mProperties in any numbers)
1: required_sz:16 actual_sz:64 mallocs:1 (overhead: 48 -- for 34199 frames that's 1641552 bytes)
2: required_sz:32 actual_sz:64 mallocs:1 (overhead: 32 -- for 12539 frames that's 401248 bytes)
3: required_sz:48 actual_sz:64 mallocs:1 (overhead: 16 -- for 4559 frames that's 72944 bytes)
4: required_sz:64 actual_sz:192 mallocs:2 (overhead: 128)
5: required_sz:80 actual_sz:192 mallocs:2 (overhead: 112)
6: required_sz:96 actual_sz:160 mallocs:2 (overhead: 64)
7: required_sz:112 actual_sz:160 mallocs:2 (overhead: 48)
8: required_sz:128 actual_sz:288 mallocs:3 (overhead: 160)
The "overhead" here is the sum of moz_malloc_usable_size for the allocations made
minus the sizeof(mProperties[0]) * number of properties), which is also an
*underestimation* since we're wasting 8 bytes for the key for each value,
but for comparing N=1 vs N=3 this doesn't matter much.
So, currently for N=1, we have: total overhead 1568512, 68395 mallocs
and for N=3, we have: total overhead 2115744, 51297 mallocs.
So with an additional 547232 bytes overhead, we save 17098 mallocs.
(only counting frames with 3 properties or less for which you gave frame counts)
I'm not sure these stats helps much to make a decision, but I wanted to save
them for the record anyway.
> ... loading the HTML5 spec and filtering the profile for FrameProperties,
> I generally see around 18-20ms. With the autoarray buffer increased to 3,
> this drops to around 10-12ms.
While it's an impressive improvement in terms of time spent in FrameProperties,
we're still "just" saving 8ms for a page load that takes about 6s, so I'm
kind of leaning towards it's not worth it given the additional memory overhead.
Can you provide more details about the page load improvement for some important
web site(s)? How much memory is typically allocated for that page and what is
the typical load time? how many operations is performed for each of the sizes?
how much FrameProperties time did we save by using N=3?
This would make it easier to see if the memory cost is worth it or not.
Flags: needinfo?(jfkthame)
Assignee | ||
Comment 3•7 years ago
|
||
Actually, what I think we should do, to make it easier to explore this, is to address the fact that currently frame properties are not accounted for at all in about:memory. :( Prior to bug 1365982, the size of the FramePropertyTable was included in the layout/pres-contexts figure, but now that the properties are no longer managed by the presContext, that's not the case. But they're also not accounted for by the individual frames, which are just reported via the arena stats. So we need to add a reporter/mechanism that handles the FrameProperties. I'll file a separate bug on this.
Flags: needinfo?(jfkthame)
Assignee | ||
Comment 4•7 years ago
|
||
Another possibility here, actually, would be to replace the AutoTArray with a plain nsTArray, and then in nsIFrame, make the mProperties field directly hold a FrameProperties object instead of a pointer. The size of FrameProperties would just be the size of an nsTArray (i.e. the same as a pointer); a frame with no properties would do no additional allocations; when the first property is added, the nsTArray would allocate storage (where currently we create the FrameProperties object at that time); as more properties are added, array storage allocation just continues as usual. Exactly how this compares (in speed and memory use) to the AutoTArray<PropertyValue,1> version would depend on details of how array allocation is managed, but it should be pretty close. In simple profiling, it seems to show more time spent in FrameProperties methods; but I suspect that's misleading, because the current code has the overhead of checking "if (!mProperties) ..." in the nsIFrame accessors (so it doesn't show up under FrameProperties in the profile), whereas the nsTArray version can drop these checks; in effect, they happen implicitly inside the FrameProperties methods.
Assignee | ||
Comment 5•7 years ago
|
||
Here's the alternative approach mentioned above. I rather like the simplification this gives. Better memory reporting would be helpful in evaluating the trade-offs here, though, before making any actual decisions
Assignee | ||
Comment 6•7 years ago
|
||
Using the patch in bug 1368654, here are snippets from about:memory showing frame-properties memory usage for a few typical pages. (These initial results are with current trunk code where FrameProperties has an AutoTArray<PropertyValue,1>.) eBay search results page │ │ │ │ │ ├────133,792 B (00.06%) ── frame-properties [4] nytimes article │ │ │ │ │ ├─────69,008 B (00.04%) ── frame-properties [3] bbc news home page │ │ │ │ ├─────76,784 B (00.04%) ── frame-properties amazon search results │ │ │ │ ├────403,120 B (00.33%) ── frame-properties wikipedia article │ │ │ ├────489,712 B (00.50%) ── frame-properties Updating FrameProperties to use AutoTArray<PropertyValue,3>, and loading the same pages, we get: eBay │ │ │ │ │ ├────206,784 B (00.10%) ── frame-properties [4] nytimes │ │ │ │ │ ├────111,904 B (00.06%) ── frame-properties [2] bbc │ │ │ ├────117,232 B (00.07%) ── frame-properties amazon │ │ │ │ ├────447,472 B (00.36%) ── frame-properties wikipedia │ │ │ ├────818,832 B (00.83%) ── frame-properties (The test pages may not be identical, due to search results changing between runs, etc., but should be pretty comparable.) So that's a pretty substantial increase in RAM footprint, which I don't think looks worthwhile for the sake of a millisecond here or there. OTOH, replacing the AutoTArray with an nsTArray<PropertyValue>, as in the alternative patch above (comment 5) is much more interesting: eBay │ │ │ │ ├─────60,032 B (00.03%) ── frame-properties [4] nytimes │ │ │ │ │ ├─────26,416 B (00.01%) ── frame-properties [2] bbc │ │ │ │ ├─────36,320 B (00.02%) ── frame-properties amazon │ │ │ │ ├────150,160 B (00.12%) ── frame-properties wikipedia │ │ │ ├────169,104 B (00.17%) ── frame-properties Given this, I think it's clear we should switch to the nsTArray implementation and get rid of the AutoTArray overhead altogether.
Assignee | ||
Updated•7 years ago
|
Summary: Consider increasing the autoarray buffer size in FrameProperties, to reduce number of separate allocations → Use an nsTArray rather than AutoTArray to hold frame properties and embed FrameProperties directly in nsIFrame to reduce RAM footprint
Assignee | ||
Comment 7•7 years ago
|
||
Attachment #8872706 -
Flags: review?(mats)
Assignee | ||
Updated•7 years ago
|
Attachment #8872606 -
Attachment is obsolete: true
Comment 8•7 years ago
|
||
You might want to re-measure, see bug 1368654 comment 3. :-)
Assignee | ||
Comment 9•7 years ago
|
||
(In reply to Mats Palmgren (:mats) from comment #8) > You might want to re-measure, see bug comment 3. :-) Indeed, thanks for catching that! Re-measuring with the part 2 patch from bug 1368654 included: Current (FrameProperties uses AutoTArray<PropertyValue,1>): eBay │ │ │ │ │ ├────151,968 B (00.08%) ── frame-properties [4] nytimes │ │ │ │ ├─────71,808 B (00.04%) ── frame-properties bbc news │ │ │ │ │ ├─────95,392 B (00.05%) ── frame-properties amazon │ │ │ │ ├────264,928 B (00.23%) ── frame-properties wikipedia │ │ │ ├────488,928 B (00.47%) ── frame-properties AutoTArray<PropertyValue,3>: eBay │ │ │ │ │ ├────174,976 B (00.08%) ── frame-properties [4] nytimes │ │ │ │ ├─────92,288 B (00.04%) ── frame-properties bbc news │ │ │ │ │ ├─────95,936 B (00.04%) ── frame-properties amazon │ │ │ │ ├────371,776 B (00.32%) ── frame-properties wikipedia │ │ │ ├────647,936 B (00.63%) ── frame-properties nsTArray<PropertyValue>: eBay │ │ │ │ │ ├────118,400 B (00.07%) ── frame-properties [4] nytimes │ │ │ │ │ ├─────54,688 B (00.03%) ── frame-properties bbc news │ │ │ │ │ ├─────68,960 B (00.04%) ── frame-properties amazon │ │ │ │ ├────222,880 B (00.19%) ── frame-properties wikipedia │ │ │ ├────366,112 B (00.36%) ── frame-properties
Assignee | ||
Updated•7 years ago
|
Attachment #8872190 -
Attachment is obsolete: true
Attachment #8872190 -
Flags: review?(mats)
Comment 10•7 years ago
|
||
Comment on attachment 8872706 [details] [diff] [review] Use nsTArray instead of AutoTArray in FrameProperties, and embed the FrameProperties object directly in nsIFrame > - MOZ_ASSERT(mMaxLength > 0, "redundant FrameProperties!"); It looks like this was the only use of mMaxLength, so we can now remove it? Also, I was looking through the Set/Add/Remove/DeleteInternal methods in the .cpp file and I think those are trivial enough to be moved to the .h file to give the compiler the chance to optimize these calls better. It will then see all the way from call site -> nsIFrame -> FrameProperties -> nsTArray. (Alternatively, add a FramePropertiesInlines.h file if you think it's too much code for the .h file) Feel free to tack that on as a part 2 here if you agree.
Attachment #8872706 -
Flags: review?(mats) → review+
Assignee | ||
Comment 11•7 years ago
|
||
Added patch to move the internal methods to the header, as suggested. (I've also updated the first patch locally to remove mMaxLength, now redundant.)
Attachment #8873078 -
Flags: review?(mats)
Comment 12•7 years ago
|
||
Comment on attachment 8873078 [details] [diff] [review] part 2 - Move internal FrameProperties methods to the header as inlines, for better optimization opportunities Thanks, r=mats >+ void DeleteAll(const nsIFrame* aFrame) { >+ for (auto& prop : mProperties) { >+ prop.DestroyValueFor(aFrame); Perhaps this is overkill, but it wouldn't hurt to assert that Length() doesn't change while we iterate here. Something like this?: #ifdef DEBUG size_t len = mProperties.Length(); #endif for (auto& prop : mProperties) { prop.DestroyValueFor(aFrame); MOZ_ASSERT(len == mProperties.Length());
Attachment #8873078 -
Flags: review?(mats) → review+
Assignee | ||
Comment 13•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/4abbee1979c215ffbf126534d45bf1ca76c0b29b Bug 1368369 - pt 1 - Use nsTArray instead of AutoTArray in FrameProperties, and embed the FrameProperties object directly in nsIFrame. r=mats https://hg.mozilla.org/integration/mozilla-inbound/rev/0ea05f32de96073ce623e0831c0f79a65daebda0 Bug 1368369 - pt 2 - Move internal FrameProperties methods to the header as inlines, for better optimization opportunities. r=mats
Comment 14•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/4abbee1979c2 https://hg.mozilla.org/mozilla-central/rev/0ea05f32de96
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in
before you can comment on or make changes to this bug.
Description
•