Closed
Bug 113393
Opened 23 years ago
Closed 23 years ago
zlib : 407 allocs account for about 8% of all allocation cost
Categories
(SeaMonkey :: UI Design, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
mozilla0.9.8
People
(Reporter: dp, Assigned: dp)
References
Details
(Keywords: perf)
Attachments
(1 file, 3 obsolete files)
11.31 KB,
patch
|
dp
:
review+
dveditz
:
superreview+
|
Details | Diff | Splinter Review |
[279,700] malloc : 0.896 sec [ 42.32%]
[ 9,151] calloc : 0.211 sec [ 9.94%] <-- zlib
[ 3,493] realloc: 0.022 sec [ 1.05%]
[287,080] free : 0.989 sec [ 46.70%]
-------------------------------------
[579,424] total : 2.118 sec [100.00%]
I instrumented trace-malloc to dump before and after timing on each malloc/free
in an attempt to find a pattern on expensive mallocs.
Here is the high level view. Total startup is about 6 secs (debug - once garrett
checks in tracemalloc for optimized builds I will switch over).
Overall malloc/free is certainly about 30% of startup cost.
The big depressing part is for every jar file read we do:
<size of file> + 32k + 11k + 3 more tiny allocs
In effect, allmost all the 9.94% of total alloc cost is from zlib. 80% of it is
caused by 413 allocs (207 32768, 206 11520) which are allocations done when
reading a file from jar by zlib. Buffer management into zlib could be a win.
Assignee | ||
Updated•23 years ago
|
Status: NEW → ASSIGNED
Priority: -- → P2
Target Milestone: --- → mozilla0.9.7
Comment 1•23 years ago
|
||
InflateItem() could at least be cleaned up to use the passed-in buffer directly
in the inflate, instead of coping everything around twice (and probably more).
I'll write that separately: bug 113404
Assignee | ||
Updated•23 years ago
|
Assignee | ||
Comment 2•23 years ago
|
||
Time consumed by the 32768 allocs and 11520 allocs from zlib
chetu> grep 'calloc 32768' alloc.out | perl bucket.pl
[ 0] malloc : 0.000 sec [ 0.00%]
[ 207] calloc : 0.114 sec [100.00%]
[ 0] realloc: 0.000 sec [ 0.00%]
[ 0] free : 0.000 sec [ 0.00%]
-------------------------------------
[ 207] total : 0.114 sec [100.00%]
chetu> grep 'calloc 11520' alloc.out | perl bucket.pl
[ 0] malloc : 0.000 sec [ 0.00%]
[ 206] calloc : 0.032 sec [100.00%]
[ 0] realloc: 0.000 sec [ 0.00%]
[ 0] free : 0.000 sec [ 0.00%]
-------------------------------------
[ 206] total : 0.032 sec [100.00%]
Updated•23 years ago
|
QA Contact: sairuh → jrgm
Assignee | ||
Comment 3•23 years ago
|
||
This eliminates 1200+ mallocs from zlib. Timing wise it improves startup by
about 1.5%
Assignee | ||
Comment 4•23 years ago
|
||
Minor code modifications from previous patch.
The only thing left is to figure out when we release the 48k of memory that the
allocator holds.
Assignee | ||
Updated•23 years ago
|
Attachment #60573 -
Attachment is obsolete: true
Assignee | ||
Comment 5•23 years ago
|
||
performance improvement is about 4% of startup on a optimized builg (after I
killed the virus scan that kept running and screwing up my timging measurements)
Assignee | ||
Comment 6•23 years ago
|
||
Strategy for releasing the memory the allocator holds:
a) timer - set a timer and if allocator wasnt used within a specific time,
cleanup. I am thinking of 10 secs.
b) Once startup is over, release and disable this allocator all together.
Disadvantage is mail, mailcompose, aim windows wont get the benefit of the
performant allocator.
I am leaning towards (a). With (a) if we release, and mail startsup say, the
allocator will kick in, grab its 48k and 10sec after mail has started up will
release the 48k
Chris/Simon, would love your input.
Assignee | ||
Comment 7•23 years ago
|
||
After talking to Alec, I am going to wait to do the timer thing. Pav is moving
timers into xpcom and I can do it then. Right now jar using widget (where timer
lives) is a bad dependency
Total memory bloat caused by this would be 46k I am going to make module
shutdown clean the memory for now.
Comment 8•23 years ago
|
||
Comment on attachment 60785 [details] [diff] [review]
Eliminte 1200+ allocations from zlib
the current patch looks good to me :)
sr=darin
out of curiosity... wby the memset in zalloc? is zalloc meant to resemble
calloc?
Assignee | ||
Comment 9•23 years ago
|
||
Yeah zalloc is replacing a calloc that zlib does.
Assignee | ||
Comment 10•23 years ago
|
||
Improvements from previous patch:
- uses a ptr as opposed to a static object
- releases alloced memory on shutdown
Attachment #60785 -
Attachment is obsolete: true
Assignee | ||
Comment 11•23 years ago
|
||
dveditz/darin : r=/sr= ?
Comment 12•23 years ago
|
||
Comment on attachment 61120 [details] [diff] [review]
Final patch - implements zlib allocator
r/sr=darin, but how about putting the
extern nsZlibAllocator *gZlibAllocator;
in nsZlibAllocator.h ?
Attachment #61120 -
Flags: superreview+
Comment 13•23 years ago
|
||
Comment on attachment 61120 [details] [diff] [review]
Final patch - implements zlib allocator
>+nsZlibAllocator::nsZlibAllocator()
>+{
>+ for (int i = 0; i < NBUCKETS; i++) {
>+ mMemBucket[i].ptr = NULL;
>+ mMemBucket[i].size = 0;
>+ mMemBucket[i].inUse = PR_FALSE;
>+ }
>+ return;
>+}
A memset() might be cheaper, and the constructor could be
moved inline into the .h file to reduce clutter.
>+int nsZlibAllocator::zClearBuckets()
...
>+ mMemBucket[i].ptr = NULL;
>+ mMemBucket[i].size = 0;
>+ mMemBucket[i].inUse = PR_TRUE;
why PR_TRUE? Is this to catch double clearing?
>+void * nsZlibAllocator::zAlloc(PRUint32 items, PRUint32 size)
This is totally not threadsafe -- how do you protect against collisions?
This needs to get tested on a dual processor machine, several thread problems
cropped up in nsJAR/nsZipArchive when used on dual-processor machines. The
locking already in nsJAR is likely to save you for the chrome case, but I worry
about other zip users who might not be on the UI thread. A quick glance through
lxr didn't turn up anything though.
>+ if (!mMemBucket[i].inUse && mMemBucket[i].ptr && mMemBucket[i].size == totalsize)
>+ {
>+ // zero out memory, increase refcnt and return it
>+ memset(mMemBucket[i].ptr, 0, totalsize);
>+ mMemBucket[i].inUse = PR_TRUE;
>+ return mMemBucket[i].ptr;
>+ }
So you memset if it's the right size
>+ // See if we have an allocated bucket
>+ if (freeAllocatedBucket)
>+ {
>+ // Mark it used
>+ freeAllocatedBucket->inUse = PR_TRUE;
>+ return freeAllocatedBucket->ptr;
>+ }
But not if it's too big.
>+ void *ptr = calloc(realitems, size);
And if it's too small it gets cleared, too.
>+ if (freeBucket)
>+ {
>+ // Found free slot. Store it
>+ freeBucket->inUse = PR_TRUE;
>+ freeBucket->ptr = ptr;
>+ freeBucket->size = realitems * size;
>+ }
But if it's too small you leak whatever used to be there (if anything).
>Index: nsZlibAllocator.h
>===================================================================
>RCS file: nsZlibAllocator.h
>diff -N nsZlibAllocator.h
>--- /dev/null 1 Jan 1970 00:00:00 -0000
>+++ nsZlibAllocator.h 10 Dec 2001 18:53:23 -0000
>@@ -0,0 +1,53 @@
>+/* -*- Mode: C; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
>+/*
>+ * The contents of this file are subject to the Netscape Public
>+ * License Version 1.1 (the "License"); you may not use this file
Need new boilerplate header from http://www.mozilla.org/MPL/
mozilla.org discourages new files from using the NPL, so use the *M*PL
tri-license version
http://www.mozilla.org/MPL/boilerplate-1.1/mpl-tri-license-c
Fill in the Original Code and Initial Developer as you have here:
"Mozilla client code" and "Netscape Communications Corporation".
Copyright date is 2001
Attachment #61120 -
Flags: superreview+
Assignee | ||
Comment 14•23 years ago
|
||
>A memset() might be cheaper, and the constructor could be
>moved inline into the .h file to reduce clutter.
Sure. will do.
>why PR_TRUE? Is this to catch double clearing?
That was a typo. It should have been PR_FALSE.
>>+ // See if we have an allocated bucket
>>+ if (freeAllocatedBucket)
>>+ {
>>+ // Mark it used
>>+ freeAllocatedBucket->inUse = PR_TRUE;
>>+ return freeAllocatedBucket->ptr;
>>+ }
>
>But not if it's too big.
Yes. Nice catch. Will add the memset here.
>But if it's too small you leak whatever used to be there (if anything).
That is true too. Lucky, that wasnt the intent and that case never happened.
freeBucket needs to point to a truely free bucket. Will make sure.
> Need new boilerplate header from http://www.mozilla.org/MPL/
Yup. Did that one too.
So the only thing is thread safeness. My logic was that there was locks in
nsJAR::GetInputStream() that does a lock and serializes calls to
nsJARInputStream::Init() -> nsZipArchive::ReadInit() -> nsZipArchive::InflateItem()
The comment here assumes nsZipArchive is not threadsafe. So I felt introducing
another level of locking was wasteful.
NS_IMETHODIMP
nsJAR::GetInputStream(const char* aFilename, nsIInputStream** result)
{
// nsZipArchive and zlib are not thread safe
// we need to use a lock to prevent bug #51267
nsAutoLock lock(mLock);
NS_ENSURE_ARG_POINTER(result);
nsresult rv;
nsJARInputStream* jis = nsnull;
rv = nsJARInputStream::Create(nsnull, NS_GET_IID(nsIInputStream), (void**)&jis);
if (!jis) return NS_ERROR_FAILURE;
rv = jis->Init(this, aFilename);
....
BTW, a very nice review as usual. Thanks. I will attach new patch fixing all
other parts of the review.
Comment 15•23 years ago
|
||
I did some lxr hunting and am satisfied on the threadsafety front. Only the
native installers use nsZipArchive directly (using the STANDALONE version) and
they're both single-threaded and separate from Mozilla. Everything else goes
though the locks in nsJAR.
Assignee | ||
Comment 16•23 years ago
|
||
This includes all of dveditz's comments + moving extern into .h file per darin.
Attachment #61120 -
Attachment is obsolete: true
Comment 17•23 years ago
|
||
Comment on attachment 61405 [details] [diff] [review]
Fixing comments from dveditz
>+ nsZlibAllocator() {
>+ memset(mMemBucket, 0, sizeof(mMemBucket));
>+ printf("sizeof(mMemBucket) = %d\n", sizeof(mMemBucket));
>+ return;
>+ }
printf?
Attachment #61405 -
Flags: needs-work+
Assignee | ||
Comment 18•23 years ago
|
||
That was a stray debug printf. Will remove on checkin.
Comment 19•23 years ago
|
||
Comment on attachment 61405 [details] [diff] [review]
Fixing comments from dveditz
>+ * The Original Code is Suresh Duddi <dp@netscape.com>
"I am not a number", likewise you are not code. "Mozilla client code"
or some such will do.
>+ * The Initial Developer of the Original Code is
>+ * Suresh Duddi <dp@netscape.com>
>+ * Portions created by the Initial Developer are Copyright (C) 2001
>+ * the Initial Developer. All Rights Reserved.
As an employee everything you produce is copyright the company, and
the initial developer (a legal term of the license) should be
Netscape Communications Corporation
>+ * Contributor(s):
This is where names go. Sometimes people attach comments like
(original author) or some particularly big restructuring they did,
so if you like:
+ * Contributor(s):
+ * Suresh Duddi <dp@netscape.com> (original author)
>+ nsZlibAllocator() {
>+ memset(mMemBucket, 0, sizeof(mMemBucket));
>+ printf("sizeof(mMemBucket) = %d\n", sizeof(mMemBucket));
>+ return;
>+ }
You've zapped the printf(), right?
sr=dveditz
Attachment #61405 -
Flags: superreview+
Comment 20•23 years ago
|
||
r=darin with the fixups dveditz mentioned.
Assignee | ||
Comment 21•23 years ago
|
||
You can tell I hate reading the license. Anywhere there was a _____ I
religiously put my name :-) Thanks Guys. I made the changes. Yeah I zapped the
printf.
Assignee | ||
Comment 22•23 years ago
|
||
Final cost analysis:
Before fix:
opt> perl e:/dp/bucket.pl < malloc-cost.out.sorted
[239623] malloc : 57900 ticks - 1.035 sec [ 43.65%]
[ 7808] calloc : 14318 ticks - 0.256 sec [ 10.79%]
[ 3395] realloc: 2082 ticks - 0.037 sec [ 1.57%]
[242391] free : 58347 ticks - 1.043 sec [ 43.99%]
----------------------------------------------------
[493218] total : 132647 ticks - 2.372 sec [100.00%]
After fix:
opt> perl e:/dp/bucket.pl < malloc-cost.out.sorted
[252784] malloc : 51906 ticks - 0.928 sec [ 45.79%]
[ 6910] calloc : 3866 ticks - 0.069 sec [ 3.41%]
[ 3452] realloc: 1690 ticks - 0.030 sec [ 1.49%]
[259093] free : 55892 ticks - 0.999 sec [ 49.31%]
----------------------------------------------------
[522241] total : 113354 ticks - 2.027 sec [100.00%]
Overall improvement is about 14% of allocation cost. This would translate to
anywhere betn 4-8% of startup.
Assignee | ||
Comment 23•23 years ago
|
||
Comment on attachment 61405 [details] [diff] [review]
Fixing comments from dveditz
darin r=ed.
Attachment #61405 -
Flags: needs-work+ → review+
Assignee | ||
Comment 24•23 years ago
|
||
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Comment 25•23 years ago
|
||
timers are now in xpcom (let's hope they can stay there -- thread-based timers
have caused regressions, as expected; not all are in hand, but I will help to
fix them yet preserve thread-based/safe timers). So, is there a bug on the 10
second timer idea?
/be
Assignee | ||
Comment 26•23 years ago
|
||
Just filed bug 118061 for the 10sec timer to release memory
Updated•20 years ago
|
Product: Core → Mozilla Application Suite
You need to log in
before you can comment on or make changes to this bug.
Description
•