This patch seeks to reduce the number of new/delete calls from the nsLineLayout class. This class builds a 2 lists for its internal use, and building these lists can cause a lot of memory allocation activity. This had apparently been noticed before and the problem had been partially addressed by obtaining the first 5 elements for each list from internal arrays that were embeded into the class, and also managing a free list of previously used elements. This was generally effective. I put some print statements into the code, and most nsLineLayout classes are destroyed with out ever exceeding the 5 element limit which can be satisfied w/o calling new. However, there are situations where the list can be much longer than 5 elements. I observed this on a page which provided chatroom services. On this page the class needs approximatly 50 elements and classes are created and destroyed rapidly. It is particulary annoying when a page like this is slow because it is susposed to be providing a realtime experience. Jprof runs showed new/delete from nsLineLayout contributing a measurable amount of runtime. I removed the embeded arrays from the class and replaced it with a single Arena which is used to provide elements for both lists. Jprof shows that this reduces time spend in malloc. I am slightly concerned that for the non-problematic case where the lists are less than 5 elements, we are now doing a single memory allocation where we were not doing any before. However, I have not observed any problems resulting from this. A side effect of this change is that the size of the nsLineLayout class drops from 1100 byes to 200 bytes due to the removal of the internal buffers. This class was created on the stack and was called recursivly. There was some code which switched from stack allocation to heap allocation after 30 recursions were done. This was to avoid running out of stack space. Since the class is 5x smaller, this complexity no longer seems worth the trouble. We can go to 150 recursions using the same amount of memory, and I have never seen us get close to even 30. I changed the code to always allocate on the stack and put in a check to bail out if we ever hit 150 levels deep. The net result of these changes is a slight reduction in the size of the program, a larger reduction in the amount of source code, and a better ability to handle pages which require long lists in nsLineLayout.
Created attachment 152512 [details] [diff] [review] Enhanced patch This patch adds a call to PL_FreeArenaList() which puts the deallocated Arena on a global free list. This allows it to be quickly reused by the Arena system. It gets rid of the problem of adding an extra allocation, and this is quite visable with jprof.
Comment on attachment 152512 [details] [diff] [review] Enhanced patch Marvellous. Thanks.
Created attachment 152573 [details] [diff] [review] Further size reduction in nsLineLayout I did some more playing around, and discovered that I could trigger my recursion limiting code by creating a deeply nested table. Tables seem to be limited to 32 levels of nesting, and at that level, I needed a depth of 202 in the Inline Reflow code. I managed to remove another 50 bytes from the nsLineLayout class, dropping it from its original size of 1100 bytes down to 150. At this size I dont think we need to worry about it consuming excessive stack space anymore, and I removed the logic that limited the number recursive calls. The latest size reduction was possible because the class contains an nsDeque which is only used in between 5 and 10 percent of the instances. I removed this member from the class and replaced it with a point that is only allocated when we are actually going to make use of it.
Robert, thanks for the r+sr. I have done a bit more work and included an updated patch. Can I get an r+sr on it?
Status: NEW → RESOLVED
Last Resolved: 14 years ago
Resolution: --- → FIXED
(In reply to comment #2) > Created an attachment (id=152512) > Enhanced patch > > This patch adds a call to PL_FreeArenaList() which puts the deallocated Arena > on a global free list. I don't know if it is a good thing to do. I've searched lxr for PL_FreeArenaList/PL_FinishArenaList and it looks like noone use PL_FreeArenaList. It's the same with JS_FreeArenaList wrt JS_FinishArenaList. The global freelist is not used by Mozilla (it's used by NSS though). So I guess there might be some problem with it after all.
You need to log in before you can comment on or make changes to this bug.