Last Comment Bug 675132 - JSRope::flatten wastes ~25% of the memory it allocates due to bad rounding up
: JSRope::flatten wastes ~25% of the memory it allocates due to bad rounding up
Status: RESOLVED FIXED
[MemShrink:P1][clownshoes][inbound]
: footprint
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla8
Assigned To: Nicholas Nethercote [:njn]
:
Mentors:
Depends on:
Blocks: DarkMatter
  Show dependency treegraph
 
Reported: 2011-07-28 20:56 PDT by Nicholas Nethercote [:njn]
Modified: 2012-06-14 19:50 PDT (History)
13 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Instrumentation patch (1.57 KB, patch)
2011-07-28 20:56 PDT, Nicholas Nethercote [:njn]
no flags Details | Diff | Splinter Review
patch (1.55 KB, patch)
2011-07-28 21:20 PDT, Nicholas Nethercote [:njn]
luke: review-
Details | Diff | Splinter Review
stress test (284 bytes, text/plain)
2011-07-28 21:24 PDT, Nicholas Nethercote [:njn]
no flags Details
variation on njn's patch (4.14 KB, patch)
2011-07-29 09:34 PDT, Luke Wagner [:luke]
n.nethercote: review+
Details | Diff | Splinter Review

Description Nicholas Nethercote [:njn] 2011-07-28 20:56:31 PDT
Created attachment 549299 [details] [diff] [review]
Instrumentation patch

Memory allocators are usually very good at allocating chunks that have a size that is a power-of-two.  Indeed, often they'll round up to the nearest power-of-two.  For example, if you ask jemalloc for 1048577 (2^20+1) bytes of memory, it'll give you 2097152 (2^21) bytes.

For this reason, when you have flexibility in the amount you are allocating, it can be a good idea to round up to a power of two.  That way, it's very likely that the allocator won't need to round up, and so you won't waste any space.

JSRope::flatten does exactly this.  The overallocation is reasonable -- having some spare space at the end means that we may be able to do future concatenations without having to allocate a new buffer.

However, JSRope::flatten makes a small but critical error.  RopeCapacityFor() does the power-of-two computation, but doesn't allow for the terminating NULL char.  AllocChars then adds sizeof(JSChar) -- 2 bytes -- to the number returned by RopeCapacityFor(), and requests that much.  So it ends up requesting an amount that is 2 bytes more than a power-of-two, and jemalloc rounds it up more itself.

Here's a small table showing what this means:

intended request    actual request     actual allocation
----------------    --------------     -----------------
             64                 66                    80
            128                130                   144
            256                258                   272
            512                514                  1024
           1024               1026                  2048
            ...                ...                   ...
        1048576            1048578               2097152

Not only is that excess space wasted, it's also contributing to the heap-unclassified bucket in about:memory.

Attached is a patch that instruments JSRope::flatten so that it sums up the total amount of waste.  For a browser session where I open about:memory?verbose and gmail.com, I get this:

  TOTAL: (req: 7682010, usable: 10139632, waste: 2457622)

So ~25% of the allocated memory is completely wasted.  (Note that those numbers are the sum of all allocations;  not all of them will be live at the end.)  With the problem fixed (patch coming shortly) the results are like this:

  TOTAL: (req: 8010624, usable: 8010624, waste: 0)

There don't seem to be enough enough strings live to make the improvement obvious in about:memory for gmail.  But I did write a synthetic stress test that showed clear improvements in both 'explicit' and 'heap-unclassified'.


than the threshold of noise for gmail.  But this should still be fixed.
Comment 1 Nicholas Nethercote [:njn] 2011-07-28 21:20:01 PDT
Created attachment 549301 [details] [diff] [review]
patch

Luke, this appears to work, but I'm not totally clear on which of |wholeLength|, |wholeCapacity| and |str->d.s.u2.capacity| are supposed to include the NULL char.
Prior to my change, none of them included it.  With my change, the latter two do count it, which sounds right to me, but I could be wrong.

(Nb: are ExtensibleStrings always NULL-terminated?  I can't tell from the big comment in vm/String.h.)
Comment 2 Nicholas Nethercote [:njn] 2011-07-28 21:24:11 PDT
Created attachment 549302 [details]
stress test

In case anyone's interested, here's the synthetic stress test.  Some numbers from about:memory:

                     before      after
                     ------      -----
explicit             366.2MB     343.3MB
a.html/string-chars  297.6MB     297.6MB
heap-unclassified     37.6MB      20.2MB
Comment 3 Luke Wagner [:luke] 2011-07-29 09:32:50 PDT
Comment on attachment 549301 [details] [diff] [review]
patch

Great find, and this makes an important general point to be wary of when doubling large buffers!

You are right that capacity currently does not include the null char.  But I think there is a bug in this patch since, if you change capacity to include the null char, you need to fix the uses to avoid an off by one error.  That's simple enough, but I'd rather just keep capacity and length having the same wrt including the null char or not since the sole purpose of 'capacity' is to compare with 'length'.

Another thing: I think RopeCapacityFor can just be rolled into AllocChars since there is now only a single caller and we want to see all the logic at once so that all the null-char-counting logic is together.  Patch shortly.
Comment 4 Luke Wagner [:luke] 2011-07-29 09:34:33 PDT
Created attachment 549402 [details] [diff] [review]
variation on njn's patch
Comment 5 Andrew McCreight [:mccr8] 2011-07-29 09:48:27 PDT
Comment on attachment 549402 [details] [diff] [review]
variation on njn's patch

typo in a comment ('does't'): String length does't include the null char, so include it here before
Comment 6 Nicholas Nethercote [:njn] 2011-07-29 13:52:43 PDT
(In reply to comment #3)
> 
> Another thing: I think RopeCapacityFor can just be rolled into AllocChars
> since there is now only a single caller and we want to see all the logic at
> once so that all the null-char-counting logic is together.  Patch shortly.

I almost did that, but |wholeCapacity| is used lower down in the function so it's not clear to me that it's safe.
Comment 7 Nicholas Nethercote [:njn] 2011-08-01 18:04:19 PDT
Comment on attachment 549402 [details] [diff] [review]
variation on njn's patch

Review of attachment 549402 [details] [diff] [review]:
-----------------------------------------------------------------

r=me if you fix the "not null-terminated" error in the JSFlatString comment, as discussed on IRC.  Thanks!
Comment 8 Nicholas Nethercote [:njn] 2011-08-01 21:11:53 PDT
Luke, are you happy to land this?
Comment 9 Luke Wagner [:luke] 2011-08-02 09:40:17 PDT
[clownshoes], huh; that's... sassy

http://hg.mozilla.org/integration/mozilla-inbound/rev/0884753e359c
Comment 10 Marco Bonardo [::mak] 2011-08-03 02:12:23 PDT
http://hg.mozilla.org/mozilla-central/rev/0884753e359c

Note You need to log in before you can comment on or make changes to this bug.