Closed Bug 46702 Opened 25 years ago Closed 24 years ago

parser token recycler unbounded in size

Categories

(Core :: DOM: HTML Parser, defect, P1)

All
Windows NT
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: warrensomebody, Assigned: vidur)

References

Details

(Keywords: memory-footprint, memory-leak, Whiteboard: [nsbeta3-][fixinhand][PDTP1][rtm-])

Attachments

(1 file)

1) Parser token and string usage Description: The parser currently heap allocates small CToken instances and keeps them on a recycler list "for performance reasons". The recycler list is currently unbounded in size. Text tokens also use nsStrings to store copies of their text data - also heap allocated. The overall memory usage shows up high in bloatblame logs. The working set memory usage also stays pretty high because of the unbounded token recycler. Module owner: harishd@netscape.com Task owner: harishd@netscape.com, scc@netscape.com, vidur@netscape.com Status: Harish is looking at arena allocating the token objects. Scott is working on a segmented string implementation which can deal with substrings that share the underlying buffer for the token strings.
P.S. This was from the footprint meeting 7/26/00.
Keywords: footprint, mlk, nsbeta3
Whiteboard: nsbeta3+
Whiteboard: nsbeta3+ → [nsbeta3+]
Depends on: 46899
harishd: I moved nsFixedSizeAllocator.[h|cpp] out of rdf and into xpcom/ds, so you can use it for this if you like.
Status 8/9/00 footprint meeting: Plan is to have scc get fragmented string stuff into the build. Vidur to do the parser work to integrate it. Scc should focus on the nsString methods required by the parser first.
Setting priority of P1.
Priority: P3 → P1
Status: NEW → ASSIGNED
I think there is a better answer here. First, though, the recycler wasn't used for "performance reasons" -- it was added to reduce memory usage. Here's what I'd do: 1. Add an upper bound to the recycler. It's trivial to do, with the bulk of the effort going toward measuring performance. 2. I'd have (text) tokens point to the original buffer rather than copy the text data into strings. Note: other tokens are pretty thin -- and usually dont use heap allocation for string data. 3. Use a segmented buffer for the main parser buffer. Add a min. of synchronization so that tokens that point to the segmented buffer never lose the content they point to. (This is a very brief period of time, but it can be tricky due to scripts and rentrancy).
Harish has the fix in hand but it causing an additional 25K of bloat. He's going to work with waterson to figure that out over today.
Whiteboard: [nsbeta3+] → [nsbeta3+][fixinhand]
So what is the net gain here? PDT needs numbers to make proper call here.
Bringing up a the first browser window leads to about 9Mb of data allocated from the heap left in memory once all is said and done. (This was measured in a debug build on Linux using the "live bloat" tool that I describe in news://news.mozilla.org/39AC57EA.1020501%40netscape.com). With harishd's fix (which I think has already been checked in), we have 8.3Mb of data that's still allocated in the heap after the first window comes up. That's a savings of about 700Kb, or almost 10% to the heap size. The next step is to reduce the peak memory size. CTokens currently use nsString's as their member variables. So even though the CToken objects themselves are allocated and destroyed en masse, they have member variables the go out to the heap to allocate space for their buffers. This causes a "large amount" of transient memory use and data copying. (How much is a "large amount"? I don't know. I can find out for you if you'd like.) So, vidur and scc plan to replace the nsString objects with "placeholders" that look and smell like nsStrings, but really just indirect into the raw necko buffer that's being parsed. In other words, no data copied, no extra heap allocation, lower peak memory usage. Hope that helps.
Token cache work is done. Giving bug to Scott for segmented string work. Scott, please reassign this bug to Vidur once you're done. Thanx
Assignee: harishd → scc
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
PDT agrees P1, but segmented string work is in, right? Can this go to vidur now?
Whiteboard: [nsbeta3+][fixinhand] → [nsbeta3+][fixinhand][PDTP1]
Yes ... Vidur has requested a few small API changes to the |nsSlidingSubstring| and |nsSlidingString| classes; but the classes themselves are done and checked- in, and he can have this bug. I'll continue to support him in this effort.
Assignee: scc → vidur
Status: ASSIGNED → NEW
marking nsbeta3-. We won't hold pr3 for this. Nominating for rtm
Keywords: rtm
Whiteboard: [nsbeta3+][fixinhand][PDTP1] → [nsbeta3-][fixinhand][PDTP1]
Why is this bug totally languishing??? It's too late for patches of this size, did we actually need this???
We have all the memory performance stuff we could handle for RTM at this point. Marking rtm minus. Note that the first wave of this bug fix landed back in late August.
Whiteboard: [nsbeta3-][fixinhand][PDTP1] → [nsbeta3-][fixinhand][PDTP1][rtm-]
updated qa contact.
QA Contact: janc → bsharma
Fixed earlier this year.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Verified on: build: 2001-07-02-04-Trunk platform: WinNT Marking it verified as per above developer comments.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: