Closed Bug 116187 (Opened 23 years ago, Closed 23 years ago)
nsZipArchive::BuildFileList allocates 180k in 4500+ allocations
Categories: Core :: General, defect, P2
Tracking: RESOLVED FIXED, Target Milestone: mozilla1.0
People: Reporter: dp, Assigned: dp
Keywords: memory-footprint, perf
Attachments (1 file, 3 obsolete files):
5.10 KB, patch (blythe: review+, alecf: superreview+, asa: approval+)
Description • 23 years ago
#allocs For Total bytes
--------------------------------------
[ 2251 ] - nsZipItem - 72032
[ 2251 ] - namelen - 95008
We should be able to arena allocate these.
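For context, here is a minimal sketch of the idea, assuming an nsZipArchive member `PLArenaPool mArena`; the stub struct and the helper name `CreateZipItem` are illustrative, not the actual patch. Both the `nsZipItem` and its NUL-terminated name are carved out of one NSPR arena, so the thousands of small malloc calls collapse into a handful of block allocations that are all released together when the archive goes away.

```cpp
// Sketch only -- hypothetical names, simplified nsZipItem.
#include "plarena.h"
#include <new>      // placement new
#include <string.h> // memcpy

struct nsZipItem {
  nsZipItem* next;   // hash-chain link (the real item has more fields)
  char*      name;   // points into the arena
};

struct ZipArchiveSketch {
  PLArenaPool mArena;

  ZipArchiveSketch() {
    PL_INIT_ARENA_POOL(&mArena, "ZipArena", 8 * 1024);
  }
  ~ZipArchiveSketch() {
    PL_FinishArenaPool(&mArena);  // frees every item and name in one shot
  }

  nsZipItem* CreateZipItem(const char* aName, unsigned aNameLen) {
    void* mem;
    // One arena carve-out for the item header...
    PL_ARENA_ALLOCATE(mem, &mArena, sizeof(nsZipItem));
    if (!mem)
      return 0;
    nsZipItem* item = new (mem) nsZipItem();

    // ...and one for the NUL-terminated name (sizeof(char) == 1, no scaling).
    void* buf;
    PL_ARENA_ALLOCATE(buf, &mArena, aNameLen + 1);
    if (!buf)
      return 0;
    memcpy(buf, aName, aNameLen);
    static_cast<char*>(buf)[aNameLen] = '\0';
    item->name = static_cast<char*>(buf);
    return item;
  }
};
```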
Comment 1 (Assignee) • 23 years ago
nsZipItem and the name are arena allocated.
Comment 2 (Assignee) • 23 years ago
Brendan/dveditz sr=/r= ?
Comment 3 • 23 years ago
Comment on attachment 62534 [details] [diff] [review]
Arena allocate the 4500+ allocations of filelist and name
>+#if !defined(STANDALONE)
>+ PL_ARENA_ALLOCATE(mem, &mArena, sizeof(char) * (namelen + 1));
Standard C and C++ say sizeof(char) == 1, so you don't need to scale here.
> nsZipArchive::nsZipArchive()
>- : kMagic(ZIP_MAGIC), mFd(0)
>+ : kMagic(ZIP_MAGIC), kArenaBlockSize(8*1024), mFd(0)
Call me old-fashioned, but would a #define ZIP_ARENA_SIZE (8*1024) not be
better? Or do most C++ compilers "inline" the const and avoid a load on every
use?
>+#ifndef STANDALONE
>+ // Initialize our arena
>+ memset(&mArena, 0, sizeof mArena);
>+ PL_INIT_ARENA_POOL(&mArena, "ZipArena", kArenaBlockSize);
Don't memset mArena to 0; PL_INIT_ARENA_POOL => PL_InitArenaPool sets all
members.
>@@ -241,6 +247,9 @@
>
> PRFileDesc *mFd;
> nsZipItem* mFiles[ZIP_TABSIZE];
>+#ifndef STANDALONE
>+ PLArenaPool mArena;
>+#endif
Nit: members are not consistently indented.
/be
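Taken together, the constructor change being asked for would look roughly like this; a sketch under the assumption that the block size becomes a macro (the checked-in patch may differ):

```cpp
// Sketch: block size as a preprocessor constant, and no memset before
// PL_INIT_ARENA_POOL, since PL_InitArenaPool fills in every pool member.
#define ZIP_ARENA_BLOCK_SIZE (8 * 1024)

nsZipArchive::nsZipArchive()
  : kMagic(ZIP_MAGIC), mFd(0)
{
  // ... existing member setup elided ...
#ifndef STANDALONE
  PL_INIT_ARENA_POOL(&mArena, "ZipArena", ZIP_ARENA_BLOCK_SIZE);
#endif
}
```

On the const-vs-#define question: a #define or a file-scope static const is normally folded into an immediate operand, whereas a per-instance const member like kArenaBlockSize occupies storage in every object and is set at construction, so the compiler cannot always avoid the load.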
Comment 4 • 23 years ago
Comment on attachment 62534 [details] [diff] [review]
Arena allocate the 4500+ allocations of filelist and name
> nsZipArchive::nsZipArchive()
>- : kMagic(ZIP_MAGIC), mFd(0)
>+ : kMagic(ZIP_MAGIC), kArenaBlockSize(8*1024), mFd(0)
I'm a little worried about using this large a block. Each
zip file gets its own pool, Netscape has 16 chrome archives,
so we could be wasting 128K worst case. Even if you say on
average we'll only waste a quarter of a block per archive
(and I don't know how you could say that) you'll still be
wasting 32K, which just happens to be the savings Garrett got
after all his work shrinking the size of nsZipItem.
For the very smallest archives you should use 512-byte blocks;
a few of them fit in just one block. For larger archives (the
transition should be at least 16K, but could reasonably go up
to 100K) use 1K blocks.
The largest archive, comm.jar, has 30,875 bytes of data we store
by my calculation. Will 31 1K allocations kill us vs. 4 8K
allocations? If so you can pick other points to transition to
larger block sizes, but I'm sure we don't want the footprint hit
on the smaller archives.
[To guess the amount of data we're storing I used
unzip -Z1 <archive> | wc -lc
then multiplied the first number by 32 (sizeof nsZipItem)
and added the second number for the name data. wc counts
newlines and we store null terminators so it works out.]
>+#ifndef STANDALONE
>+ // Initialize our arena
>+ memset(&mArena, 0, sizeof mArena);
>+ PL_INIT_ARENA_POOL(&mArena, "ZipArena", kArenaBlockSize);
>+#endif
If you do what I suggest you'll have to wait on this until we
know the file size. Maybe in Open, or in BuildFileList.
BuildFileList might be best since we're already looking for
the end of the file. You could also make guesses based on the
size of the central directory, which might be slightly more
reliable if you worry about archives containing a few huge files
or lots of minuscule ones.
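A sketch of what that could look like, assuming BuildFileList initializes the pool only once it has located the central directory; the helper name is hypothetical and the thresholds simply follow the numbers above:

```cpp
// Sketch: defer arena creation until the central-directory size is known,
// then pick a block size -- 512 bytes for tiny archives, 1K otherwise.
#include "plarena.h"
#include "prtypes.h"

static PRUint32 PickZipArenaBlockSize(PRUint32 aCentralDirSize)
{
  return (aCentralDirSize < 16 * 1024) ? 512 : 1024;
}

// ...then, inside nsZipArchive::BuildFileList(), after finding the
// end-of-central-directory record and before creating any nsZipItem:
//
//   PL_INIT_ARENA_POOL(&mArena, "ZipArena",
//                      PickZipArenaBlockSize(centralDirSize));
```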
Comment 5 • 23 years ago
See also Bug 118455:
it is about using file mapping to avoid those allocations when all items in the
jar file are stored and not compressed. Comments are welcome!
Comment 6 • 23 years ago
You're thinking of bug 113393, the zlib allocations (which have been largely
eliminated). Storing files rather than compressing them wouldn't help the
nsZipItem allocations at all because we still have to track where the files are
inside the archive.
Comment 7 (Assignee) • 23 years ago
Fixed all review comments.
Dan, I switched the ARENA_BLOCK_SIZE to 4K and made the arena global to all zip
archives. That would control the wastage, if any. What do you think?
Attachment #62534 - Attachment is obsolete: true
Comment 8 • 23 years ago
4k still sounds too large, potentially 64K wasted with 16 .jar files (probably
closer to 32K in practice). We're trading footprint for performance here, but
we're only getting performance measurements. What's the footprint cost?
Please measure the allocations in the original code (I guess that's your 180K
figure), with your 4K buffer, and with a 1K buffer.
Comment 9 (Assignee) • 23 years ago
Dan, it isn't an arena per nsZipArchive; I switched it to one global arena shared
by all nsZipArchives. Thus wastage is at worst 4K, probabilistically 2K.
Updated • 23 years ago
Attachment #64266 - Flags: needs-work+
Comment 10 • 23 years ago
Comment on attachment 64266 [details] [diff] [review]
Comments from Brendan and Dveditz
Whoa, that's a huge change to hide under the description
"Comments from Brendan and Dveditz".
In STANDALONE mode (used by the native installers) you now no
longer delete nsZipItems so they all leak, and you removed its
destructor so even if you do delete them the names are all leaked.
By having a global pool we get screwed by transient archives,
such as might be opened by XPInstall or signed scripts. Or if
the ZipReader cache fills up and evicts an archive. And then
again when it most likely reopens the chrome archive it just
evicted.
By having a shared memory pool you're also exposed to thread-safety
issues. nsZipArchive is advertised as not threadsafe so people
will take care to use each on a single thread (or wrap it in
nsJAR which has locks for the critical entry points), but there's
no guarantee two different threads won't open two different archives
at the same time.
Even though it's potentially more wasteful of memory you were on
surer footing with a per-archive pool.
Comment 11 • 23 years ago
One way around the per-archive pool is to recycle items. The "screwed by
transient archives" problem dveditz mentions is one example of using non-LIFO
allocation lifetime request strings with an arena pool, which is always bad.
If, however, you could make a freelist of items, the freelist would buffer the
non-LIFO load and present a more LIFO load to the underlying pool.
That doesn't deal with thread safety. You'd want a PRLock, I think. But then
you're paying what might be too high an overhead per item allocation. So Dan's
point about per-archive pools probably still wins, if you can make the arena
size small enough that internal fragmentation doesn't waste too much space.
It's not too hard to compute the average utilization for a given arena size,
given real numbers on mean items per archive (max and variance would be useful
too) -- do you have those?
/be
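A rough illustration of the freelist-plus-lock idea, reusing the stub nsZipItem from the sketch in the description; everything here is hypothetical and none of these names exist in the tree. Freed items are pushed onto a freelist instead of being returned to the arena, so transient archives present a roughly LIFO load to the shared pool, and the PRLock covers two threads opening different archives at once.

```cpp
#include "plarena.h"
#include "prlock.h"

// Hypothetical recycling allocator for nsZipItem over one shared arena.
class ZipItemAllocator {
public:
  ZipItemAllocator() : mFreeList(0) {
    mLock = PR_NewLock();
    PL_INIT_ARENA_POOL(&mPool, "ZipItemPool", 1024);
  }
  ~ZipItemAllocator() {
    PL_FinishArenaPool(&mPool);
    PR_DestroyLock(mLock);
  }
  nsZipItem* Alloc() {
    PR_Lock(mLock);
    nsZipItem* item = mFreeList;
    if (item) {
      mFreeList = item->next;               // recycle a freed item
    } else {
      void* mem;
      PL_ARENA_ALLOCATE(mem, &mPool, sizeof(nsZipItem));
      item = static_cast<nsZipItem*>(mem);  // may be null on OOM
    }
    PR_Unlock(mLock);
    return item;
  }
  void Free(nsZipItem* item) {
    PR_Lock(mLock);
    item->next = mFreeList;                 // buffers the non-LIFO churn
    mFreeList = item;
    PR_Unlock(mLock);
  }
private:
  PLArenaPool mPool;
  nsZipItem*  mFreeList;
  PRLock*     mLock;
};
```

The per-item lock acquisition is exactly the overhead mentioned above, which is why the per-archive pool still looks like the better trade.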
Comment 12 (Assignee) • 23 years ago
Yeah, I am convinced that a per-archive pool is better now. Will revert. Will
publish numbers too.
Comment 13 (Assignee) • 23 years ago
> Whoa, that's a huge change to hide under the description
> "Comments from Brendan and Dveditz".
Hey, in my defense, I did add comments about the switch to the global arena.
Updated (Assignee) • 23 years ago
Attachment #64266 - Attachment is obsolete: true
Attachment #64266 - Flags: needs-work+
Comment 14 • 23 years ago
Got it. Your global arena mention in the comments didn't sink in when I read it,
and later when I reviewed the patch I saw only the short comment in the
attachment table.
Updated (Assignee) • 23 years ago
Target Milestone: mozilla0.9.8 → mozilla0.9.9
Updated (Assignee) • 23 years ago
Target Milestone: mozilla0.9.9 → mozilla1.0
Comment 15 (Assignee) • 23 years ago
Using 1K blocks for the arena. This has all of Brendan's earlier comments
addressed except the block-size definition: C++ will substitute the const value
instead of loading it each time.
Comment 16 (Assignee) • 23 years ago
chetu> grep Arena out | perl arena.pl
Arena Alloc blks waste
--------------------------------------
1a755c8 11790 - 12 1k Blocks : 498
1a755e0 32131 - 32 1k Blocks : 637
1a755f8 31863 - 32 1k Blocks : 905
1a75610 1260 - 2 1k Blocks : 788
1a75628 3275 - 4 1k Blocks : 821
1a75640 32057 - 32 1k Blocks : 711
1a75658 2067 - 3 1k Blocks : 1005
1a75670 13237 - 13 1k Blocks : 75
1a75688 344 - 1 1k Blocks : 680
1a756a0 7416 - 8 1k Blocks : 776
1a756b8 6443 - 7 1k Blocks : 725
--------------------------------------
Total 141883 - 146 1k Blocks : 7621
Malloc wastage : 30880 in 3860 allocations
So we save : 3,714 small malloc() calls totaling 22,091 bytes
This is for a Mozilla build, measured over startup.
Comment 17 (Assignee) • 23 years ago
Seems like Dan is out until 3/11. Garrett r=, Brendan sr=?
Comment 18 (Assignee) • 23 years ago
Cc'ing Alec. Brendan's email says he is out and can't sr.
Garrett r=, Alec sr=?
Comment 19 • 23 years ago
Comment on attachment 72750 [details] [diff] [review]
Arena allocate zip archive list
r=blythe
Attachment #72750 - Flags: review+
Comment 20 • 23 years ago
Comment on attachment 72750 [details] [diff] [review]
Arena allocate zip archive list
what's PL_ARENA_CONST_ALIGN_MASK for?
Don't we need to manually call the ~nsZipItem destructor in the arena case? If
we don't, can you comment to explain why it isn't necessary?
Comment 21 (Assignee) • 23 years ago
plarena.h:
/*
* If the including .c file uses only one power-of-2 alignment, it may define
* PL_ARENA_CONST_ALIGN_MASK to the alignment mask and save a few instructions
* per ALLOCATE and GROW.
*/
Setting it to 7 causes all arena allocations to be aligned on an 8-byte boundary;
malloc does that too.
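In practice that is just a define that has to appear before the include; a sketch:

```cpp
// Defining the mask before including plarena.h lets PL_ARENA_ALLOCATE and
// PL_ARENA_GROW be compiled with a fixed 8-byte alignment (mask 7) instead
// of reading the per-pool mask at runtime.
#define PL_ARENA_CONST_ALIGN_MASK 7
#include "plarena.h"
```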
Yeah, we really don't need to call ~nsZipItem. I added this comment:
// CAUTION:
// We don't need to delete each of the nsZipItems, as the memory for
// the zip item and the filename it holds are both allocated from the Arena.
// Hence, destroying the Arena is like destroying all the memory
// for all the nsZipItems in one shot. But if the ~nsZipItem is doing
// anything more than cleaning up memory, we should start calling it.
I also added better error checking for when memory allocation fails, to protect
against deleting the item in a low-memory situation. Nice point. Will attach a new patch.
Comment 22 (Assignee) • 23 years ago
- Added comments per alecf
- Consistent use of #ifndef STANDALONE instead of #if !defined(STANDALONE), per brendan
- Under low-memory conditions, prevent delete of nsZipItem (see the sketch below)
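The low-memory guard amounts to something like the following inside the file-list loop; a fragment-style sketch, with the surrounding code and the exact error constant assumed rather than quoted from the patch:

```cpp
// Sketch: bail out when the arena carve-out fails, without ever running
// `delete item` -- the item lives in the arena, not on the heap.
void* mem;
PL_ARENA_ALLOCATE(mem, &mArena, sizeof(nsZipItem));
if (!mem) {
  status = ZIP_ERR_MEMORY;  // assumed error code
  break;
}
nsZipItem* item = new (mem) nsZipItem();
```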
Attachment #72750 - Attachment is obsolete: true
Comment 23 • 23 years ago
Comment on attachment 72836 [details] [diff] [review]
Added comments per alecf; more error checking
r=blythe
Attachment #72836 - Flags: review+
Comment 24 • 23 years ago
Comment on attachment 72836 [details] [diff] [review]
Added comments per alecf; more error checking
ah, so I forgot about the whole MOZ_COUNT_CTOR/DTOR stuff
I noticed that MOZ_DECL_CTOR_COUNTER() is just defined as being empty, in both
debug and release builds. This means we've essentially lost bloat counting of
nsZipItems.
I guess I'm ok with this, but here's why: If we manually call the destructors,
we take an unnecessary perf hit just to get debug info. If we only call the
destructor on debug builds, I think that means there is too much difference
between debug and release, and could lead to worse problems along the way....
Attachment #72836 - Flags: superreview+
Comment 25 • 23 years ago
Comment on attachment 72836 [details] [diff] [review]
Added comments per alecf; more error checking
sr=alecf
Comment 26 • 23 years ago
Can't we just inline the MOZ_COUNT_DTOR stuff where we free up the structure?
Comment 27 (Assignee) • 23 years ago
Why? Since we are allocating these objects in an arena, it seems wrong to count
how many are created and destroyed: they are destroyed all at once or not at all,
and "not at all" means a leak of the arena, which we should catch by other means.
Inlining, even if possible, would make DEBUG shutdown faster; I wouldn't mind
putting it in non-inlined form either. But calling the destructor is the part that
seems far off, and writing another loop to go over all these objects just to count
them seems off too, because of the arena.
Comment 28 • 23 years ago
I think the object should still appear in the bloat logs. The bloat logs are
measuring object creation; since there (might) be some cost in the actual
creation of an object (i.e. through the constructor itself, independent of the
allocation), we still get to keep some decent accounting stats through the bloat
logs.
Comment 29 • 23 years ago
OK, I'll buy that. Thanks.
Comment 30 • 23 years ago
Alec: bloat logs are for tracking bloat, which is a space issue independent of
the complexity of a constructor. Performance issues are better identified with
profiles, which give call-context in addition to run-wide call counts, so I'm
not worried about losing that data here.
Comment 31 • 23 years ago
Fair enough. OK, I guess we just #ifdef out the MOZ_COUNT_CTOR in non-standalone
builds.
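Read literally, that would mean something like the sketch below; whether the real patch keeps the counter at all, or puts the #ifdef elsewhere, may differ:

```cpp
nsZipItem::nsZipItem()
{
#ifdef STANDALONE
  // Standalone builds still new/delete items individually, so the ctor/dtor
  // counters stay balanced there.  In the arena case the destructor is never
  // run, so the counter is compiled out to avoid skewing the bloat log.
  MOZ_COUNT_CTOR(nsZipItem);
#endif
}
```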
Comment 32 • 23 years ago
Comment on attachment 72836 [details] [diff] [review]
Added comments per alecf; more error checking
a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #72836 - Flags: approval+
Comment 33 • 23 years ago
Looks good to me, too. r=/moa=dveditz if anyone ever asks.
Comment 34 (Assignee) • 23 years ago
Fix checked in to trunk.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED