Closed Bug 189528 Opened 22 years ago Closed 21 years ago

libjar and consumers make too many copies

Categories

(Core :: Networking: JAR, defect, P2)

Tracking

RESOLVED FIXED
mozilla1.4alpha

People

(Reporter: alecf, Assigned: alecf)

References

Details

(Keywords: memory-footprint, perf, topembed+)

Attachments

(1 file, 5 obsolete files)

Ok, I was poking around looking under the covers for bug 121341, and I found
some nastiness. 

First of all, when reading a file out of a .jar file, libjar only provides a
Read() API, which copies stuff out of an internal buffer into one provided by
the client, even though libjar has a completely decompressed buffer in memory.
That's one extra copy.

In the case of CSS and String Bundles, we use UTF8InputStream to convert the
buffer to Unicode on the fly - it buffers the data in an nsByteInputStream (the
first extra copy mentioned above), and converts it into an internal
nsUnicodeBuffer, a second extra 'copy' (conversion is part of the process
though, so remember that for later...). Converting the whole buffer is often
unnecessary; generally only particular strings within the buffer need to be
converted.

Finally, when CSS or the String Bundle parses the Unicode file, it makes a final
copy of the Unicode data and uses it later - that's a third extra copy of the
original data, and the fourth copy including the original.

If we provide some sort of API on nsZipArchive like
Consume(PRUint32 aBytes, char ** aResultBuffer, PRUint32* aResultBytes);

nsZipArchive could hand back parts of the buffer to nsJARInputStream. Then
nsJARInputStream's ReadSegments could be fixed to call this instead, and pass
the original buffer back to the nsWriteSegmentFun. This would allow the CSS or
nsPersistentProperties parser to parse the original data. They would be dealing
with raw UTF8 data, and would have to convert any necessary strings on the fly.
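To make that concrete, here's roughly what a non-copying ReadSegments could
look like (a sketch only: Consume is the proposed API above, and mZipArchive is
an assumed member name, not actual code):

NS_IMETHODIMP
nsJARInputStream::ReadSegments(nsWriteSegmentFun aWriter, void* aClosure,
                               PRUint32 aCount, PRUint32* aBytesRead)
{
    char* buf;
    PRUint32 avail;
    // borrow a window into libjar's already-decompressed buffer - no copy
    nsresult rv = mZipArchive->Consume(aCount, &buf, &avail);
    if (NS_FAILED(rv))
        return rv;

    PRUint32 written;
    rv = aWriter(this, aClosure, buf, 0, avail, &written);
    // per the nsIInputStream contract, a writer error just means "stop"
    *aBytesRead = NS_SUCCEEDED(rv) ? written : 0;
    return NS_OK;
}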

In the case of CSS, keywords are 7-bit clean, and only random user data like
attribute names/values and URLs would have to be converted from UTF8 (and even
in the case of URLs, necko can already handle UTF8 strings just fine).

Benefits I foresee:
- simplified code should result in a reduction in static compiled code across
the board
- restructuring some of the consumers to use ReadSegments() will allow data to
be fed to the CSS parser and nsPersistentProperties asynchronously
- restructuring these consumers will allow us to benefit from future
memory-map-based caches (i.e. speeding up cached CSS)
- performance benefits, because we're not making so many copies of the data
- reduced memory usage, because fewer buffers are needed for the copies

I have not looked at reading XUL out of JARs because most of the time that is
covered by the fastload stuff. It may get a speedup on the side, but it is not
part of the goal of this bug.
OS: Windows 2000 → All
Hardware: PC → All
Attached patch improve ReadSegments on libjar (obsolete) — Splinter Review
here's a non-copying version of ReadSegments for libjar. Nobody actually calls
nsJARInputStream::ReadSegments just yet, but they will :)
Status: NEW → ASSIGNED
Priority: -- → P2
Target Milestone: --- → mozilla1.4alpha
so I talked with darin for a bit about this - his suggestion was more that we
fix nsJARInputStream's Read() such that it uses zlib to expand into the buffer
on demand, rather than expanding the whole file into memory.

I did more investigation, and discovered yet ANOTHER copy - as zlib decodes the
file, it is expanded into a temporary stack-based buffer before being appended
onto the whole-file buffer.

we also discovered that nsSyncLoaderStream was using NS_NewBufferedInputStream,
which loads the data synchronously on the UI thread - instead he suggested using
nsIStreamTransportService to make an nsITransport for the stream, and reading it
that way.
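In sketch form, that looks something like this (the contract ID and interfaces
are real necko pieces; the glue and the jarStream variable are illustrative):

nsresult rv;
nsCOMPtr<nsIStreamTransportService> sts =
    do_GetService("@mozilla.org/network/stream-transport-service;1", &rv);
if (NS_FAILED(rv))
    return rv;

// wrap the blocking jar stream in a transport...
nsCOMPtr<nsITransport> transport;
rv = sts->CreateInputTransport(jarStream, -1, -1, PR_TRUE,
                               getter_AddRefs(transport));
if (NS_FAILED(rv))
    return rv;

// ...so the actual reads happen on a transport thread, not the UI thread
nsCOMPtr<nsIInputStream> wrappedStream;
rv = transport->OpenInputStream(0, 0, 0, getter_AddRefs(wrappedStream));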

I've got a patch for that coming up.
Attached patch fix the syncloader (obsolete) — Splinter Review
use nsIStreamTransportService to load the data on another thread. Looking for
comments/reviews.
Comment on attachment 112427 [details] [diff] [review]
fix the syncloader

alec: this looks great!  any word on how this affects Ts/Txul?
+    if (NS_FAILED(rv))
+        return rv;

please use NS_ENSURE_SUCCESS(rv, rv); instead.
> please use NS_ENSURE_SUCCESS(rv, rv); instead.

I don't think reviewers should push for use of the NS_ENSURE macros, for 2 reasons:
1. They hide control flow.
2. It's harder to set a breakpoint on the 'return rv'.
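For reference, the two styles side by side (NS_ENSURE_SUCCESS(rv, rv) expands
to roughly the explicit form below, plus a warning; SomeCall/AnotherCall are
stand-ins, not real API):

nsresult rv = SomeCall();
if (NS_FAILED(rv))
    return rv;                // explicit: visible control flow, and a
                              // trivial place to set a breakpoint

rv = AnotherCall();
NS_ENSURE_SUCCESS(rv, rv);    // macro: terse, warns on failure, returns early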
I don't want to turn this bug into a holy war about the existence of the
NS_ENSURE macros. And I'm not holding a review of this patch for that
(especially since I shouldn't be reviewing the patch anyway - it's way outside
my area of knowledge).

However

I don't like arguments against using NS_ENSURE macros, anywhere, ever. There are
mainly three arguments against it that I hear:

1. It makes it impossible to add a breakpoint
2. It warns
3. It hides control flow

To which my answer is
1. Yes, this is true. But the number of times that someone wants to put a
breakpoint there is extremely small compared to the number of times that someone
looks over the code and tries to figure out what it does (or doesn't) do.

Having the macros cleans up the code a lot and therefore helps when you try to
grok the code.

And for the times when you really need to have a breakpoint, change the code so
that you can put a breakpoint there.

2. OK, then let's remove the warning. And maybe add a version called
NS_ENSURE_SUCCESS_AND_WARN or some such. I don't care much about the warning,
but more often than not I personally think it's a good thing to warn.

see also http://bugzilla.mozilla.org/show_bug.cgi?id=183999#c15

3. It shouldn't be news to anybody hacking mozilla that NS_ENSURE returns if the
test fails. Without knowing that much you shouldn't, IMHO, get cvs access.
However, if it's a biggie for other people, let's rename the macro. How about
NS_RETURN_UNLESS_*?



My main concern with not using the macro is that it will obfuscate the code. You
*should* pretty much do error checking after every function that returns an
nsresult (which is 9 out of 10). Having a 2-line (or 3-line with {}) |if| after
every such call will make the code hard to read and will make people do less
error checking.

Yes, i care about this deeply :-)
Mainly because I'm always nervous when I'm returning an error value, since it's
quite likely that someone up the call chain isn't checking the returned error
value and bad things will happen. (Has anybody ever dared run mozilla when low
on memory? I'll bet this week's bellybutton lint that we'll crash because
someone fails to do proper error checking.)
Pros and Cons of NS_ENSURE_* aside:

It has long been agreed that the use of NS_ENSURE_* is up to the coder and the
owner of the code they are modifying. The style of error handling should be
maintained within the file that is being modified. 

Thus, if there were a lot of NS_ENSURE_*'s I might be inclined to use them, but
as the coder it is still my own discretion. Reviewers should only make such
requests if they are strong owners of the code in question.

Feel free to quote me on that. :)
nominating some of my footprint/perf bugs for topembed
Keywords: topembed
Comment on attachment 112427 [details] [diff] [review]
fix the syncloader

ok, I'm going to fix this differently, based on the work I'm doing in bug 11232
Attachment #112427 - Attachment is obsolete: true
ok, this took longer than I thought, mainly due to my trying to understand the
ownership model... and then trying to change it without breaking the STANDALONE
libjar (speaking of which, where is the standalone version used, so I can test
it? I tried building xpinstall, but I don't think I saw it being used).

Anyway, I've done a few things that I don't want to forget:
1) in the normal world, nsJAR now owns/supplies the file descriptor that is
handed to its nsZipArchive in order to initialize the nsJARInputStream's
nsZipRead. Thus, nsJARInputStream owns the nsZipRead (which has a pointer to the
file descriptor) and the nsJAR, and the nsJAR also owns the file descriptor.
Hm... that's a little wacky, but bear with me.

This allows the nsJARInputStream to exist on any thread, since it owns all the
state surrounding the reading of the file... it actually doesn't even matter if
it moves from thread to thread, as it encapsulates all its own private data. The
other thing that's going on here that shouldn't get too tricky with threads is
that when a JAR is opened and the nsZipItems are created, they are never
changed. They are also created before any other threads have access to the list
of items. This makes it safe for multiple threads to access the data at the same
time. The only trick is going to be the shutdown of the archive, so that the
data can be destroyed. It might be as simple as just making the AddRef/Release
threadsafe (and that might already be the case, anyhow).

2) in the standalone world, nsZipArchive has an mFd member and the assumption is
that there is only one thread. Every read of the archive passes through this
file descriptor

Anyhow, that's my story for now. I've got a compiling patch and I've walked the
code a few times, but I haven't even tested it yet.
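To make the ownership in (1) concrete, the shape is roughly this (a sketch;
only the ownership-relevant members are shown, and the member names are
illustrative, not the exact patch):

class nsJARInputStream : public nsIInputStream
{
    nsZipRead*             mZipRead; // owned by the stream; holds a
                                     // non-owning pointer to mJAR's fd
    nsCOMPtr<nsIZipReader> mJAR;     // strong ref: keeps the nsJAR, and the
                                     // file descriptor it owns, alive
};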
alecf:
BTW: Did you see the ideas in bug 118455 ("libjar performance / RFE: Replace
{compressed JAR, read(), uncompress()} with {uncompressed JARs, mmap()}") yet ?
alec: I think the stub installer might use the standalone version of libjar.
cc'ing doug... he'd know :)
certainly interesting. In talking with Darin, he said that his experience has
actually been that mmap'ing can potentially be slower if you're just sucking in
data off the disk to be read once... mmap()'ing is more helpful if you're
frequently accessing different parts of a file.

That said, as a part of this I'll certainly be making direct reads of
uncompressed data fast, so that there is still only one buffer. My guess is that
we'll see more or less the same performance as if we had mmap'ed...
Alec Flett wrote:
> That said, as a part of this I'll certainly be making direct reads of
> uncompressed data fast, so that there is still only one buffer. My guess is 
> that we'll see more or less the same performance as if we had mmap'ed...

... but what about the footprint? Currently we cache the stuff in memory per
mozilla instance, while mmap()'ed pages would
1) not count as allocated heap memory
and
2) be read-only and shareable between mozilla instances, and therefore only be
loaded once on a server even if many, many users use the zilla
I am already eliminating libjar's heap footprint... and sure, mmap wouldn't
count for heap footprint, but it would count for memory footprint (i.e. resident
size, etc) - you're just shuffling pages around. And yes, it would be readonly,
but the nature of libjar, given that we have things like XUL caching and
.properties file caching, is to read a file in once (maybe twice) and then
forget about it... so sharing the pages between mozilla instances isn't really a
win, since the pages would only be read once and then un-mmap()'ed anyway.

Hey, if someone comes up with a decent mmap() implementation that proves to be
better, more power to them.... my work doesn't prevent it from happening, or
make it any harder.
roland: plus there is little point optimizing for the multi-user scenario you
describe if it is costly in the single-user scenario.
Darin Fisher wrote:
> roland: plus there is little point optimizing for the multi user scenario you
> describe if it is costly in the single user scenario.

If the price for mmap() in the single-user scenario is nearly the same as in the
current code it will be a win (for the server users) ... :)
darin - the standalone installer only requires zlib.
Keywords: topembed → topembed+
Attached patch decompress on demand (obsolete) — Splinter Review
this patch isn't finished, because I haven't written the async copy part yet
(see nsZipRead::ContinueCopy).

But in any case, I've turned libjar on its head - or maybe on its feet because
I think it was really backwards before.

The significant things are:
- nsZipArchive now only represents the meta data for the .jar file - it doesn't
have anything to do with the read state of any particular file in the .jar
- nsZipRead now does a lot of the work itself, rather than the work happening
in nsZipArchive, and updating nsZipRead. This allows consumers to hold on to an
nsZipRead and thus have a complete snapshot of a read/decompress in progress.

The issue I've run into with this so far is that inflate() is crashing under
specific, kind of weird circumstances. It works fine when loading off the UI
thread, but crashes randomly in inflate() when running on a background thread.

After talking a little with darin, my theory is that it has something to do
with Read() being called on different threads. In a simple case where we simply
read one file on a background thread, there are no crashes. I'm going to try
expanding the test case to run many simultaneous reads...
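For the record, the on-demand read I'm after is shaped roughly like this (a
sketch, not the patch; mZs, mFd, and mInBuf are assumed nsZipRead members, and
a real version must also clamp reads to the item's compressed size):

nsresult nsZipRead::Read(char* aBuf, PRUint32 aCount, PRUint32* aBytesRead)
{
    mZs.next_out  = (Bytef*) aBuf;
    mZs.avail_out = aCount;

    while (mZs.avail_out > 0) {
        if (mZs.avail_in == 0) {
            // refill the compressed-input buffer from the archive's fd
            PRInt32 n = PR_Read(mFd, mInBuf, sizeof(mInBuf));
            if (n < 0)
                return NS_ERROR_FAILURE;
            if (n == 0)
                break;                     // out of compressed data
            mZs.next_in  = (Bytef*) mInBuf;
            mZs.avail_in = (uInt) n;
        }
        int zerr = inflate(&mZs, Z_SYNC_FLUSH);
        if (zerr == Z_STREAM_END)
            break;                         // item fully decompressed
        if (zerr != Z_OK)
            return NS_ERROR_FAILURE;
    }

    *aBytesRead = aCount - mZs.avail_out;  // only what the caller asked for
    return NS_OK;
}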
Attachment #112202 - Attachment is obsolete: true
alec: you might also want to try limiting the number of stream transport threads
to just 1.  see nsStreamTransportService, and look for the thread pool it creates.
Well, it seems to run fine when it's just one file being loaded from one
background thread - I think it's the moving among threads that's confusing it.
oh, perhaps if two distinct URLs are being loaded from the same JAR file?!?
theoretically, that should be possible, with these changes.
Attached patch decompress on demand v1.1 (obsolete) — Splinter Review
ok, here's an updated patch... I'm very VERY close... turns out some of the
problems I was having were due to a miscalculated byte count when reading in
large files. This one works completely in my test case, but for some reason is
causing a deadlock somewhere. It gets mostly through startup and then just
deadlocks before showing the first window. Still investigating.
Attachment #114463 - Attachment is obsolete: true
Attached patch decompress on demand v1.2 (obsolete) — Splinter Review
yay! I finally worked through all the off-by-one errors (and the deadlock that
I mentioned to darin just went away - my theory is that my tree wasn't up to
date when I saw it).

The last patch didn't clean itself up by closing files or destroying the
stream. This one cleans everything up. I ran this through spacetrace and saw
some of the big allocations disappear!

I'm finally looking for reviews on this.
Attachment #115674 - Attachment is obsolete: true
Comment on attachment 116444 [details] [diff] [review]
decompress on demand v1.2

ok, this is ready for reviews. I'll be running with this while I await them.

I don't know who to ask for reviews here, so I'm starting with darin (because
it's necko-related) and dveditz (cuz, well... I dunno, it seems like something
you might know something about).

If only dp were here :)
Attachment #116444 - Flags: superreview?(darin)
Attachment #116444 - Flags: review?(dveditz)
the "standalone" version is used by the native install stubs to unpack the
bootstrap xpcom.xpi before any Mozilla code exists.

I haven't had a look at the patch yet, but we had real threading problems in
nsJAR (wrapping the raw nsZipArchive) before we put locks in to protect shared
structures. I think the reading into a buffer was in part to get around that
(not sure, didn't do the nsJAR part).

CC'ing ssu to make sure the installers are still going to work with these patches.
To test the installers, just run build.pl from:
  mozilla/xpinstall/wizard/windows/builder

it'll create the installer in:
  mozilla/dist/install

If you're having problems generating the installer, let me know and I'll try to
apply your patch on my tree and test the installer for you.
Comment on attachment 116444 [details] [diff] [review]
decompress on demand v1.2

darin and I went one round on this so far, and I have a few things to clean up
(possibly leaking file descriptors, ownership issues, etc)

new patch coming tomorrow.
Attachment #116444 - Flags: superreview?(darin)
Attachment #116444 - Flags: review?(dveditz)
ok, here's an updated patch after talking with darin. I cleaned up all the
bogus ZIP_USES_FD crap. I've documented the ownership of the file descriptors
(it is confusing!) both in the top of nsZipArchive.h and when the ownership is
actually taken. Looking for more reviews...
Attachment #116444 - Attachment is obsolete: true
Comment on attachment 116785 [details] [diff] [review]
decompress on demand v1.3

Requesting reviews - I can sit down on Wednesday or Thursday with anyone to go
over this...
Attachment #116785 - Flags: superreview?(darin)
Attachment #116785 - Flags: review?(dougt)
Comment on attachment 116785 [details] [diff] [review]
decompress on demand v1.3

ok, I got verbal r=dougt and sr=darin with some minor tweaks...
Attachment #116785 - Flags: superreview?(darin)
Attachment #116785 - Flags: superreview+
Attachment #116785 - Flags: review?(dougt)
Attachment #116785 - Flags: review+
yay! marking fixed.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
This may have pushed up the leak stats from 729K to 1.93M. I say 'may' because
brad is the only box running the refcount leak tests, and I was busy changing
its config at the time. Plus, the bloat-diff thing from tinderbox has been
broken on non-windows since late last year, and by the time we noticed, the old
logs had already been replaced.

http://bonsai.mozilla.org/cvsquery.cgi?module=MozillaTinderboxAll&date=explicit&mindate=1047667500&maxdate=1047675839

See
http://tegu.mozilla.org/graph/query.cgi?testname=trace_malloc_leaks&units=bytes&tbox=backupboy&autoscale=1&days=7&avg=1
for the graph

This checkin is the only non-trivial one in that window, though. It's also
possible that something else is leaking the JAR stuff, or that the configuration
changes I made did something (I also backed out the codesighs changes because of
objdir issues, so there was nothing to compare by the time of the next run).
brad managed to find an old leak log, from Feb 22. It was done on a different
machine, but with the same config/compiler (except for a different CPU speed).
It's 500K, so I won't attach it unless it's wanted, but it does start:
1284771 malloc
  1273644 _ZN20nsRecyclingAllocator6MallocEji
    1273644
/mnt/4/tinderbox/brad/Linux_2.4.19_Clobber/mozilla/../tinderbox-obj/dist/bin/components/libjar50.so+69C9
      1241920 inflate_blocks_new
        1241920 inflateInit2_

so I'd guess that this bug was it.
alecf: ping? Is this leak real?
sorry, yes, this is a real leak, and it's filed as bug 198133 - I'm all over it;
just this morning I figured out part of the leak...
*** Bug 113404 has been marked as a duplicate of this bug. ***
*** Bug 125368 has been marked as a duplicate of this bug. ***
Component: XP Miscellany → General
QA Contact: brendan → general
Component: General → Networking: JAR
QA Contact: general → networking.jar