Closed Bug 54601 Opened 25 years ago Closed 25 years ago

fix string allocation policy

Categories

(Core :: XPCOM, defect, P3)

defect

Tracking

()

RESOLVED FIXED

People

(Reporter: scc, Assigned: scc)

Details

(Whiteboard: [rtm++])

Attachments

(1 file)

Currently strings (|nsString| and |nsCString| anyway) allocate storage always in increments of some chunk-size. This leads to much waste, particularly for strings that are never edited. At one time, we tried to move to an exact-size allocation strategy, which solved the waste, but visibly negatively impacted run-time because of heavily edited strings where the size changes ended up being, e.g., grow by one character at a time. This lead to excessive copying (to maintain string data during the reallocation that represented `growing the string'). The well-known answer to growth and copying problems of this nature is to double the size on a request for growth. Combining these two rules gives us a strategy that we think will improve both current run-time and current storage overhead, that is: allocate exact size, double on fault.
cc'ing the people involved in the recent email exchange about this situation
Status: NEW → ASSIGNED
I will try this patch and will report back with deltas from the "live bloat" stuff. (That'll give us an idea of how much memory this saves.)
To test this, I ran: ./mozilla --trace-malloc /dev/null \ /export/waterson/seamonkey/mozilla/tools/trace-malloc/live-bloat.html and then immediately clicked the "dump" button. This'll basically show all the memory that's in use after running through startup and loading an almost-empty web page. Anyway, here's the bottom line. Before this change, nsStr::Alloc() (which is the funnel through which all string allocation occurs) accounted for 343,832 bytes of "live bloat" (that is, live objects that are resident in the heap). After the change, nsStr::Alloc() had only left 93,147 bytes resident in the heap. I would presume the savings to be even more significant with mail, real content, etc.; however, I couldn't collect the data because trace-malloc kept dying with... Assertion failure: _PT_PTHREAD_MUTEX_IS_LOCKED(mon->lock.mutex), at ptsynch.c:507 brendan, you got any ideas why this happens? IMO we should have a few people (esp. those who noticed the previous performance problems!) run with this change in their tree. If nothing turns up, then let's check it in on the trunk and nominate for the nscp branch.
I'm applying this patch to my linux and winnt branch builds now. If I see any problems, I'll report them here.
I'm running with this patch on the branch as well. I'll try it with Purify too. So far, so good.
Purify didn't show any problems.
This is a significant gain for our footprint problems, and a simple well-understood solution. Nominating for beta3.
Keywords: nsbeta3
Whiteboard: [nsbeta3+]
Brendan points out that this still doesn't solve the problem of ``cut to fit _after_ editing''. These string classes currently support the notion that |S.SetCapacity(0)| discards all storage. |S.SetCapacity(N)| where 0 < N <= S.`capacity' is ignored. Before this patch, |S.SetCapacity(N)| where N > S.`capacity' grows |S| to a multiple of some chunk size. With this patch, if |S|'s capacity was |0|, it is cut-to-fit (exact size allocation) the new capacity; if |S| is non-empty, it is grown by multiplying by the smallest power of 2 that will make it >= the desired capacity. This means that before the patch you could _never_ cut-to-fit (except to discard all storage); after the patch, you can cut-to-fit an empty string. This behavior favors reduces copying time but does not automatically squeeze the last bit of space savings out of strings. Callers can cut-to-fit a string that has been heavily edited (and hence, probably has a capacity > its length) by assigning it into a new string, e.g., nsString temp = ...; // edit |temp| nsString y(temp); // |y| is now a copy of |temp|, but with no extra storage We could add a member function to accomplish this directly, but it would have the same performance as the pattern above. If |y| already existed (e.g., when assigning into a member variable) you can use this form nsString temp = ...; // edit |temp| mMemberString.SetCapacity(0); mMemberString = temp; Whether I add a member function to do this, or we advocate this pattern, savvy string consumers who need cut-to-fit-after-editing will need to make source changes; that's why a new member function is not part of this patch.
Cut to fit is a separate issue. Let's lobby to get what we've got into the branch today.
Talked to Waterson about this. Make that RTM, not b3.
Keywords: nsbeta3rtm
Whiteboard: [nsbeta3+] → [rtm+]
I'm not seeing any problems on win32 or linux.
I think there should be a "reduce my capacity to fit my length" method, because realloc to a smaller size should not allocate and copy -- it can work in place. So the longer form with a temporary is not equivalent to the best implementation for this case. /be
My irc comments on cut-to-fit were not meant to stop this patch from going into the trunk. It looks like warren has blessed it; I'll a= if that helps. /be
Marking rtm++. Please check this in as soon as possible.
Whiteboard: [rtm+] → [rtm++]
checked in on the trunk and on the branch. Thanks to everyone who helped me test this patch.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
rayw: please verify. Thanks.
Is there anyone else on this list who can help me verify this bug fix?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: