nsZipArchive:BuildFileList allocates 180k in 4500+ allocations

RESOLVED FIXED in mozilla1.0

Status

RESOLVED FIXED (P2, normal)
Reported: 17 years ago
Last modified: 5 years ago

People

(Reporter: dp, Assigned: dp)

Tracking

Keywords: memory-footprint, perf
Version: Trunk
Target Milestone: mozilla1.0
Platform: x86 Windows 2000
Points: ---

Firefox Tracking Flags

(Not tracked)

Attachments

(1 attachment, 3 obsolete attachments)

(Assignee)

Description

17 years ago
#allocs    For           Total bytes
--------------------------------------
[ 2251 ] -  nsZipItem  -  72032
[ 2251 ] -  namelen    -  95008

We should be able to arena allocate these.
(Assignee)

Updated

17 years ago
Blocks: 7251
Status: NEW → ASSIGNED
Keywords: perf
Priority: -- → P2
Target Milestone: --- → mozilla0.9.8
(Assignee)

Comment 1

17 years ago
Created attachment 62534 [details] [diff] [review]
Arena allocate the 4500+ allocations of filelist and name 

nsZipItem and the name are arena allocated.
(Assignee)

Comment 2

17 years ago
Brendan/dveditz sr=/r= ?
Comment on attachment 62534 [details] [diff] [review]
Arena allocate the 4500+ allocations of filelist and name 

>+#if !defined(STANDALONE)
>+    PL_ARENA_ALLOCATE(mem, &mArena, sizeof(char) * (namelen + 1));

Standard C and C++ say sizeof(char) == 1, so you don't need to scale here.

> nsZipArchive::nsZipArchive()
>-  : kMagic(ZIP_MAGIC), mFd(0)
>+  : kMagic(ZIP_MAGIC), kArenaBlockSize(8*1024), mFd(0)

Call me old-fashioned, but would a #define ZIP_ARENA_SIZE (8*1024) not be
better? Or do most C++ compilers "inline" the const and avoid a load on every
use?

>+#ifndef STANDALONE
>+  // Initialize our arena
>+  memset(&mArena, 0, sizeof mArena);
>+  PL_INIT_ARENA_POOL(&mArena, "ZipArena", kArenaBlockSize);

Don't memset mArena to 0, PL_INIT_ARENA_POOL => PL_InitArenaPool sets all
members.

>@@ -241,6 +247,9 @@
> 
>   PRFileDesc    *mFd;
>   nsZipItem*    mFiles[ZIP_TABSIZE];
>+#ifndef STANDALONE
>+  PLArenaPool         mArena;
>+#endif

Nit: members are not consistently indented.

/be
Comment on attachment 62534 [details] [diff] [review]
Arena allocate the 4500+ allocations of filelist and name 

> nsZipArchive::nsZipArchive()
>-  : kMagic(ZIP_MAGIC), mFd(0)
>+  : kMagic(ZIP_MAGIC), kArenaBlockSize(8*1024), mFd(0)

I'm a little worried about using this large a block. Each zip
file gets its own pool, and Netscape has 16 chrome archives,
so we could be wasting 128K worst case. Even if you say on
average we'll only waste a quarter of a block per archive
(and I don't know how you could say that), you'll still be
wasting 32K, which just happens to be the savings Garrett got
after all his work shrinking the size of nsZipItem.

For the very smallest archives you should use 512 byte blocks,
and a few of them fit in just one block. For larger archives
(transition should be at least 16K, but could reasonably go up
to 100K) use 1k blocks.

The largest archive, comm.jar, has 30,875 bytes of data we store
by my calculation. Will 31 1K allocations kill us vs. 4 8K
allocations? If so you can pick other points to transition to
larger block sizes, but I'm sure we don't want the footprint hit
on the smaller archives.

[To guess the amount of data we're storing I used
  unzip -Z1 <archive> | wc -lc
then multiplied the first number by 32 (sizeof nsZipItem)
and added the second number for the name data. wc counts
newlines and we store null terminators so it works out.]

>+#ifndef STANDALONE
>+  // Initialize our arena
>+  memset(&mArena, 0, sizeof mArena);
>+  PL_INIT_ARENA_POOL(&mArena, "ZipArena", kArenaBlockSize);
>+#endif

If you do what I suggest you'll have to wait on this until we
know the filesize. Maybe in the Open, or in BuildFileList.
BuildFileList might be best since we're already looking for
the end of the file. You could also make guesses based on the
size of the central directory which might be slightly more
reliable if you worry about archives containing a few huge files
or lots of miniscule ones.

Comment 5

17 years ago
See also bug 118455:
it is about using file mapping to avoid those allocations when all items in the
jar file are stored rather than compressed. Comments are welcome!
You're thinking of bug 113393, the zlib allocations (which have been largely
eliminated). Storing files rather than compressing them wouldn't help the
nsZipItem allocations at all because we still have to track where the files are
inside the archive.
(Assignee)

Comment 7

17 years ago
Created attachment 64266 [details] [diff] [review]
Comments from Brendan and Dveditz

Fixed all review comments.

Dan, I switched the ARENA_BLOCK_SIZE to 4k and made the arena global to all zip
archives. That should control the wastage, if any. What do you think?
Attachment #62534 - Attachment is obsolete: true
4k still sounds too large, potentially 64K wasted with 16 .jar files (probably
closer to 32K in practice). We're trading footprint for performance here, but
we're only getting performance measurements. What's the footprint cost?

Please measure the allocations in the original code (I guess that's your 180K
figure), with your 4K buffer, and with a 1K buffer.
(Assignee)

Comment 9

17 years ago
Dan, it isn't an arena per nsZipArchive; I switched it to one global arena shared
by all nsZipArchives. Thus wastage is at worst 4k, probabilistically 2k.
Attachment #64266 - Flags: needs-work+
Comment on attachment 64266 [details] [diff] [review]
Comments from Brendan and Dveditz

Whoa, that's a huge change to hide under the description
"Comments from Brendan and Dveditz".

In STANDALONE mode (used by the native installers) you now no
longer delete nsZipItems, so they all leak; and you removed its
destructor, so even if you do delete them the names are all leaked.

By having a global pool we get screwed by transient archives,
such as might be opened by XPInstall or signed scripts. Or if
the ZipReader cache fills up and evicts an archive. And then
again when it most likely reopens the chrome archive it just
evicted.

By having a shared memory pool you're also exposed to thread-safety
issues. nsZipArchive is advertised as not threadsafe so people
will take care to use each on a single thread (or wrap it in
nsJAR which has locks for the critical entry points), but there's
no guarantee two different threads won't open two different archives
at the same time.

Even though it's potentially more wasteful of memory you were on
surer footing with a per-archive pool.
One way around the per-archive pool is to recycle items.  The "screwed by
transient archives" problem dveditz mentions is one example of presenting a
non-LIFO stream of allocation-lifetime requests to an arena pool, which is
always bad.  If, however, you could make a freelist of items, the freelist
would buffer the non-LIFO load and present a more LIFO load to the
underlying pool.

That doesn't deal with thread safety.  You'd want a PRLock, I think.  But then
you're paying what might be too high an overhead per item allocation.  So Dan's
point about per-archive pools probably still wins, if you can make the arena
size small enough that internal fragmentation doesn't waste too much space. 
It's not too hard to compute the average utilization for a given arena size,
given real numbers on mean items per archive (max and variance would be useful
too) -- do you have those?

/be
(Assignee)

Comment 12

17 years ago
Yeah, I am convinced now that a per-archive pool is better. Will revert, and
will publish numbers too.
(Assignee)

Comment 13

17 years ago
> Whoa, that's a huge change to hide under the description
> "Comments from Brendan and Dveditz".

Hey, in my defense, I did add comments about the switch to a global arena.

(Assignee)

Updated

17 years ago
Attachment #64266 - Attachment is obsolete: true
Attachment #64266 - Flags: needs-work+
Got it. Your global arena mention in the comments didn't sink in when I read it,
and later when I reviewed the patch I saw only the short comment in the
attachment table.
(Assignee)

Updated

17 years ago
Target Milestone: mozilla0.9.8 → mozilla0.9.9
(Assignee)

Updated

17 years ago
Target Milestone: mozilla0.9.9 → mozilla1.0
(Assignee)

Updated

17 years ago
Keywords: footprint, nsbeta1
(Assignee)

Comment 15

17 years ago
Created attachment 72750 [details] [diff] [review]
Arena allocate zip archive list

Using 1k blocks for the arena. This addresses all of Brendan's earlier comments
except the block-size definition: C++ will substitute the const value instead
of loading it each time.
(Assignee)

Comment 16

17 years ago
chetu> grep Arena out | perl arena.pl
Arena    Alloc    blks           waste
--------------------------------------
1a755c8  11790 -  12 1k Blocks :   498
1a755e0  32131 -  32 1k Blocks :   637
1a755f8  31863 -  32 1k Blocks :   905
1a75610   1260 -   2 1k Blocks :   788
1a75628   3275 -   4 1k Blocks :   821
1a75640  32057 -  32 1k Blocks :   711
1a75658   2067 -   3 1k Blocks :  1005
1a75670  13237 -  13 1k Blocks :    75
1a75688    344 -   1 1k Blocks :   680
1a756a0   7416 -   8 1k Blocks :   776
1a756b8   6443 -   7 1k Blocks :   725
--------------------------------------
Total   141883 - 146 1k Blocks :  7621
Malloc wastage : 30880 in 3860 allocations

So we save 3,714 small malloc() calls totaling 22,091 bytes.

This is for a Mozilla build at startup.
(Assignee)

Comment 17

17 years ago
Seems like Dan is out until 3/11. Garrett r=, Brendan sr=?
(Assignee)

Comment 18

17 years ago
CCing Alec. Brendan's email says he is out and can't sr.

Garrett r=, Alec sr=?
Comment on attachment 72750 [details] [diff] [review]
Arena allocate zip archive list

r=blythe
Attachment #72750 - Flags: review+

Comment 20

17 years ago
Comment on attachment 72750 [details] [diff] [review]
Arena allocate zip archive list

what's PL_ARENA_CONST_ALIGN_MASK for?

Don't we need to manually call the ~nsZipItem destructor in the arena case? If
we don't, can you comment to explain why it isn't necessary?
(Assignee)

Comment 21

17 years ago
plarena.h:
/*
 * If the including .c file uses only one power-of-2 alignment, it may define
 * PL_ARENA_CONST_ALIGN_MASK to the alignment mask and save a few instructions
 * per ALLOCATE and GROW.
 */
Setting it to 7 causes all arena allocations to be aligned on an 8-byte
boundary, which matches what malloc does.

Yeah, we really don't need to call ~nsZipItem. I added this comment:

  // CAUTION:
  // We don't need to delete each nsZipItem, as the memory for
  // the zip item and the filename it holds are both allocated from the arena.
  // Hence, destroying the arena destroys all the memory
  // for all the nsZipItems in one shot. But if ~nsZipItem ever does
  // anything more than clean up memory, we should start calling it.

I also added better error checking so we don't delete an item when memory
allocation fails in a low-memory situation. Nice point. Will attach a new patch.
(Assignee)

Comment 22

17 years ago
Created attachment 72836 [details] [diff] [review]
Added comments per alecf;  more error checking

- Added comments per alecf
- consistent use of #ifndef STANDALONE instead of #if !defined(STANDALONE) per
brendan
- Under low memory condition, prevent delete of nsZipItem
Attachment #72750 - Attachment is obsolete: true
Comment on attachment 72836 [details] [diff] [review]
Added comments per alecf;  more error checking

r=blythe
Attachment #72836 - Flags: review+

Comment 24

17 years ago
Comment on attachment 72836 [details] [diff] [review]
Added comments per alecf;  more error checking

ah, so I forgot about the whole MOZ_COUNT_CTOR/DTOR stuff

I noticed that MOZ_DECL_CTOR_COUNTER() is just defined as being empty, in both
debug and release builds. This means we've essentially lost bloat counting of
nsZipItems. 

I guess I'm ok with this, but here's why: If we manually call the destructors,
we take an unnecessary perf hit just to get debug info. If we only call the
destructor on debug builds, I think that means there is too much difference
between debug and release, and could lead to worse problems along the way....
Attachment #72836 - Flags: superreview+

Comment 25

17 years ago
Comment on attachment 72836 [details] [diff] [review]
Added comments per alecf;  more error checking


sr=alecf
(Assignee)

Updated

17 years ago
Keywords: nsbeta1 → nsbeta1+
Can't we just inline the MOZ_COUNT_DTOR stuff where we free up the structure?
(Assignee)

Comment 27

17 years ago
Why? Since we are allocating these objects in an arena, it seems wrong to count
how many are created and destroyed: they are destroyed all at once or not at
all, and "not at all" means a leak of the arena itself, which we should catch
by other means.

Inlining it, even if possible, would make DEBUG shutdown faster, and I wouldn't
mind putting it in non-inlined form either. But calling the destructor is the
part that seems far off, and writing another loop to go over all these objects
just to count them seems off too, for the same arena reason.

Comment 28

17 years ago
I think the object should still appear in the bloat logs. The bloat logs
measure object creation; since there might be some cost in the actual creation
of an object (i.e., through the constructor itself, independent of the
allocation), we still get to keep some decent accounting stats through the
bloat logs.
Alec: bloat logs are for tracking bloat, which is a space issue independent of
the complexity of a constructor.  Performance issues are better identified with
profiles, which give call-context in addition to run-wide call counts, so I'm
not worried about losing that data here.

Comment 31

17 years ago
Fair enough. OK, I guess we just #ifdef out the MOZ_COUNT_CTOR in non-standalone
builds.
Comment on attachment 72836 [details] [diff] [review]
Added comments per alecf;  more error checking

a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #72836 - Flags: approval+
Looks good to me, too. r=/moa=dveditz if anyone ever asks.
(Assignee)

Comment 34

17 years ago
Fix checked in to trunk.
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED

Updated

11 years ago
Component: XP Miscellany → General
QA Contact: brendan → general