Open
Bug 487394
Opened 16 years ago
Updated 2 years ago
investigate setting NSS_DISABLE_ARENA_FREE_LIST so that NSS doesn't hold on to memory it's not using
Categories
(Core :: Security: PSM, defect, P3)
NEW
People
(Reporter: rob, Unassigned)
Details
(Whiteboard: [psm-logic][psm-backlog])
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2a1pre) Gecko/20090407 Minefield/3.6a1pre
Build Identifier: trunk
From bug #485052...
The committed patch allocates memory when PSM initializes NSS, and attempts to deallocate that memory when PSM deinitializes NSS. This causes a leak regression because currently neither NSS nor PSM calls PL_ArenaFinish() to properly free (i.e. return to the heap) the global PLArena free list.
Boris says (comment #32) that the rest of Gecko avoids bloat when it uses arenas by always calling PL_ArenaFinish() within a well-defined lifetime.
Nelson asserts (comment #25) that NSS should not call PL_ArenaFinish() itself. He suggests that PSM could work around the issue by setting the NSS_DISABLE_ARENA_FREE_LIST environment variable before initializing NSS.
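For illustration, the suggested workaround boils down to putting the variable into the process environment before NSS starts. This is a minimal sketch, assuming a POSIX system. `setenv` is the POSIX call, not what PSM would necessarily use (NSPR has its own `PR_SetEnv`), the value "1" is an arbitrary assumption, and `init_nss_placeholder` is a hypothetical stand-in for the real initialization call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* expose setenv() under strict -std= modes */
#endif
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for whatever call PSM uses to initialize NSS. */
static int init_nss_placeholder(void)
{
    return 0;
}

/* Put NSS_DISABLE_ARENA_FREE_LIST into the environment, then initialize.
 * Returns 0 on success, -1 if the variable could not be set. */
int init_with_arena_free_list_disabled(void)
{
    /* The value "1" is an assumption; the third argument (1) means
     * "overwrite any existing value". */
    if (setenv("NSS_DISABLE_ARENA_FREE_LIST", "1", 1) != 0) {
        return -1;
    }

    /* Sanity check: the variable is now visible to later readers
     * of the environment in this process. */
    assert(getenv("NSS_DISABLE_ARENA_FREE_LIST") != NULL);

    return init_nss_placeholder();
}
```

The ordering is the point of the sketch: the variable goes in first, so that anything which reads the environment during or after initialization sees it.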
On https://developer.mozilla.org/en/NSS_Memory_allocation, Nelson notes that disabling the arena free list "makes NSS slower".
Which is best: speed or lack-of-bloat?
Reproducible: Always
Updated•16 years ago
Summary: PSM should define NSS_DISABLE_AREA_FREE_LIST before initializing NSS → PSM should define NSS_DISABLE_ARENA_FREE_LIST before initializing NSS
Version: unspecified → Trunk
Comment 1•16 years ago
To be clear: Bug 485052 caused an Lk regression (trace-malloc leaks at shutdown) of about 15kB for Firefox on all platforms. A similar regression is seen on the Thunderbird tinderboxes.
Although the implementation of bug 485052 isn't at fault, it has exposed (by increasing the leak figures) the lack of cleanup of the arenas.
There is a general effort to reduce leaks to zero. IMHO, having some shutdown leaks "allowed" will cloud what is a leak and what isn't, especially when patches like the one in bug 485052 increase leaks in an "expected" area.
Flags: blocking1.9.1?
Summary: PSM should define NSS_DISABLE_ARENA_FREE_LIST before initializing NSS → PSM should define NSS_DISABLE_ARENA_FREE_LIST before initializing NSS / Lk regression on 7th April 2009
Updated•16 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 2•16 years ago
If we just want to fix the shutdown leak, we need to add a PL_ArenaFinish call during shutdown.
If the memory possibly used by the NSS arenas is unbounded, then we need to set the NSS_DISABLE_ARENA_FREE_LIST environment variable.
If it's bounded, what is the bound?
Comment 3•16 years ago
Not a regression, not a blocker, but we'd take a patch. Feel free to renominate if there are compelling reasons to block on it.
Flags: blocking1.9.1? → blocking1.9.1-
Comment 5•16 years ago
I want to clarify a number of points, some at Rob's request.
1. A PLArenaPool is a small structure that keeps track of a list of memory
blocks known as PLArenas. PLArenas have a minimum size, which is often larger
than the size of a typical small data structure. When you attempt to allocate
memory from a PLArenaPool, the code tries the following, in this order:
it allocates from unused space in one of the arenas already associated with
that PLArenaPool, or it takes a "free" PLArena from the global PLArena free
list, or it allocates a new PLArena from the heap and links that into the
PLArenaPool.
Given a PLArenaPool holding one or more PLArenas, there is no public function
to free just a single one of those arenas. Instead of freeing the individual
PLArenas, the caller "frees" the entire PLArenaPool. There are two ways to do
this. One method, PL_FinishArenaPool, frees all the PLArenas back to the heap.
The other method, PL_FreeArenaPool, takes all the PLArenas away from the PLArenaPool and puts them on the global free list of PLArenas.
When it wants to destroy a PLArenaPool, NSS calls PL_FreeArenaPool, unless
the NSS_DISABLE_ARENA_FREE_LIST environment variable is set, in which case,
NSS calls PL_FinishArenaPool. The environment variable may be set or cleared
at any time. It only affects how PLArenaPools are destroyed, not how they are
allocated. The allocation algorithm ALWAYS tries to allocate from the free
list before allocating from the heap, even if the free list is empty. So, setting the environment variable in the middle of the running program will cause the free list of PLArenas to stop growing, and to shrink until it is
empty. Once it is set, PLArenas that are already on the free list will be
taken from it for new allocations, but they will be freed to the heap when
the PLArenaPool is destroyed.
Function PL_ArenaFinish (not to be confused with PL_FinishArenaPool) flushes
the PLArena free list, freeing all those PLArenas back to the heap, and
destroying the lock that protects the free list. It is intended to be called
only at the end of the process, or at such time as PLArenaPools will never be
used thereafter in the remainder of the process lifetime. NSS never calls
this function because NSS does not presume itself to be the only user of the PLArenaPool code.
2. In answer to Boris's question, NSPR's PLArenaPool code does not keep track
of the amount of space, nor the number of PLArenas, on the PLArena free list.
There is a high water mark but it is not recorded, tracked, or bounded by
NSPR.
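The mechanics described in point 1 can be sketched as a toy model. This is not NSPR's code: `Arena`, `Pool`, `pool_alloc`, `pool_destroy`, `free_list_flush` and `free_list_length` are hypothetical names, and the flag `g_disable_free_list` merely stands in for the NSS_DISABLE_ARENA_FREE_LIST environment variable. The model only does the bookkeeping (which source an allocation came from, and where arenas go on destruction).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only; none of these names are NSPR's. */
typedef struct Arena {
    struct Arena *next;
    size_t size; /* usable bytes in this arena */
    size_t used; /* bytes handed out so far */
} Arena;

typedef struct { Arena *first; } Pool;

typedef enum { ALLOC_FAILED, FROM_POOL, FROM_FREE_LIST, FROM_HEAP } Source;

static Arena *g_free_list = NULL;        /* global list of idle arenas */
static bool g_disable_free_list = false; /* stands in for the env variable */

/* Reserve nbytes from the pool and report where the space came from. */
Source pool_alloc(Pool *pool, size_t nbytes, size_t arena_size)
{
    assert(pool != NULL);

    /* 1. Unused space in an arena the pool already owns. */
    for (Arena *a = pool->first; a != NULL; a = a->next) {
        if (a->size - a->used >= nbytes) {
            a->used += nbytes;
            return FROM_POOL;
        }
    }

    /* 2. A large-enough idle arena on the global free list.
     *    This step runs whether or not the flag is set. */
    for (Arena **link = &g_free_list; *link != NULL; link = &(*link)->next) {
        if ((*link)->size >= nbytes) {
            Arena *a = *link;
            *link = a->next; /* unlink from the free list */
            a->used = nbytes;
            a->next = pool->first;
            pool->first = a;
            return FROM_FREE_LIST;
        }
    }

    /* 3. A brand-new arena from the heap. */
    size_t size = nbytes > arena_size ? nbytes : arena_size;
    Arena *a = malloc(sizeof *a + size);
    if (a == NULL) {
        return ALLOC_FAILED;
    }
    a->size = size;
    a->used = nbytes;
    a->next = pool->first;
    pool->first = a;
    return FROM_HEAP;
}

/* Destroy a pool: park its arenas on the free list, or return them to
 * the heap when the flag is set. */
void pool_destroy(Pool *pool)
{
    while (pool->first != NULL) {
        Arena *a = pool->first;
        pool->first = a->next;
        if (g_disable_free_list) {
            free(a);
        } else {
            a->used = 0;
            a->next = g_free_list;
            g_free_list = a;
        }
    }
}

/* Return every idle arena to the heap; reports how many were freed. */
size_t free_list_flush(void)
{
    size_t freed = 0;
    while (g_free_list != NULL) {
        Arena *a = g_free_list;
        g_free_list = a->next;
        free(a);
        freed++;
    }
    return freed;
}

/* Number of idle arenas currently parked on the free list. */
size_t free_list_length(void)
{
    size_t n = 0;
    for (Arena *a = g_free_list; a != NULL; a = a->next) {
        n++;
    }
    return n;
}
```

In this model, setting the flag in the middle of a run has the effect described above: destroyed pools stop feeding the free list, while new allocations keep draining it until it is empty. Note that the model, like the description, keeps no record of the list's peak size.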
Summary: PSM should define NSS_DISABLE_ARENA_FREE_LIST before initializing NSS / Lk regression on 7th April 2009 → PSM should define NSS_DISABLE_ARENA_FREE_LIST / Lk regression on 7th April 2009
Comment 6•16 years ago
That doesn't answer my question. My question is whether NSS's specific use of the arena APIs is bounded in terms of the number of PLArenas it will allocate over a process lifetime, at least as used via PSM.
Or put another way, whether it's possible to cause the browser to allocate 500MB worth of PLArenas via NSS on visiting a web page, say.
Comment 7•16 years ago
(In reply to comment #6)
> That doesn't answer my question. My question is whether NSS's specific use of
> the arena APIs is bounded in terms of the number of PLArenas it will allocate
> over a process lifetime, at least as used via PSM.
NSS imposes no bound, but as a practical matter, it is bounded by Firefox's bound on the number of simultaneous SSL connections.
> Or put another way, whether it's possible to cause the browser to allocate
> 500MB worth of PLArenas via NSS on visiting a web page, say.
It would take thousands upon thousands of simultaneous TCP connections to
reach such numbers. So, I would say the answer is no.
This is all measurable. Given that all the PLArenas allocated through NSS
are now leaked at shutdown, just total up the space of those leaked PLArenas. That's the high water mark for that run.
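The measurement suggested here is a plain sum over whatever arenas remain at shutdown. A sketch, assuming hypothetical `LeakedArena` records (built, say, from a leak log); nothing below is an NSPR or trace-malloc API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one arena still allocated at shutdown. */
typedef struct LeakedArena {
    const struct LeakedArena *next;
    size_t size; /* bytes occupied by this arena */
} LeakedArena;

/* Sum the sizes of all leaked arenas in the list. The result is the
 * high water mark for that particular run only. */
size_t total_leaked_bytes(const LeakedArena *head)
{
    size_t total = 0;
    for (const LeakedArena *a = head; a != NULL; a = a->next) {
        /* Defensive check: a wrapped sum would mean corrupt input. */
        assert(total + a->size >= total);
        total += a->size;
    }
    return total;
}
```

A figure obtained this way describes one run; it says nothing about a bound over all possible runs.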
Comment 8•16 years ago
> it is bounded by Firefox's bound on the number of simultaneous SSL connections.
Is that really guaranteed? The patch that caused this bug to be filed doesn't do any SSL connections at all, but increased the number of arenas allocated...
> That's the high water mark for that run.
That doesn't answer my question either. My question is whether there is a high water mark limit over all possible runs.
Comment 9•16 years ago
NSS allocates space from PLArenaPools for data objects that correspond to
sockets, keys, certificates, etc, but not for bulk data. The actual
application data (e.g. HTTP requests and responses) are not allocated from
PLArenaPools. Some of the objects allocated in PLArenaPools are long-lived,
holding what is essentially configuration information.
So, the high water mark is a function of two classes of use:
a) very long lived objects, and
b) shorter lived objects whose numbers and total space correlate to the
high water number of simultaneous connections, but not to amount of data
transferred on those connections.
Please take my suggestion, and instead of imagining the worst, measure the
amount of space in leaked PLArenas allocated by NSS.
Your most recent question is answered in comment 7.
Comment 10•16 years ago
> and instead of imagining the worst, measure the amount of space in leaked
> PLArenas allocated by NSS
Since I'm precisely interested in the worst-case behavior, that won't do me much good.
> Your most recent question is answered in comment 7.
Meaning the answer is "no"?
Note that certificates can be quite long-lived, in general. Gecko assumes that certificate objects (or rather nsIX509Cert objects) are small enough to attach one to every image on a web page, for example. I have no idea whether they're sharing the same underlying NSS object if all the images come from the same server, say.
> correlate to the high water number of simultaneous connections, but not to
> amount of data transferred on those connections.
I wasn't assuming it was anything like the amount of data transferred, and I'm glad it's not. But the "simultaneous connections" thing doesn't match what I know about how PSM treats certificates. It's at the very least closer to "high-water-mark number of SSL sites that have all been loaded in the browser and not yet navigated away from", which is quite a bit larger than the number of SSL connections.
In any case, the question I'm asking is not an NSS question, but a PSM question, since PSM is what mediates the browser's interaction with these objects and determines their lifetimes. I'm more or less waiting for Kai's answer here, unless someone else happens to know the details of that code.
Comment 11•16 years ago
> Meaning the answer is "no"?
Boris, as I wrote at the beginning of comment 7: "NSS imposes no bound"
(on the memory allocated from PLArenaPools).
The memory allocated by NSS can be divided into two categories:
a) That which NSS allocates for its own purposes, in the course of doing
SSL or S/MIME, and
b) That which NSS allocates at the request of the application that calls it.
I can characterize the behavior of the first category of memory allocation,
and did so in comment 9. I cannot characterize the second category.
Within Firefox, that is really a PSM question, as you've noted.
Regarding certificates, NSS itself is very miserly with the memory used to
hold certificates. NSS has reference counted objects for those, and a
hash table that keeps track of them all, to avoid duplication of certs in
multiple objects. So, hopefully, all those PSM objects that hold a cert
reference for every image (really!? I had no idea) are holding references
to the same object for all the images that come from the same server.
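The sharing scheme described here (reference-counted objects plus a hash table that prevents duplicates) can be sketched as a small interning table. This is a toy model, not NSS's data structures: `Cert`, `cert_acquire` and `cert_release` are hypothetical names, and a plain string key stands in for whatever identifies a certificate.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 64

/* Hypothetical shared certificate object. */
typedef struct Cert {
    struct Cert *next; /* next entry in the same hash bucket */
    unsigned refcount; /* number of holders of this object */
    char *key;         /* stands in for the certificate's identity */
} Cert;

static Cert *g_table[TABLE_SIZE];

/* Simple string hash mapped onto a bucket index. */
static size_t bucket_for(const char *key)
{
    size_t h = 5381;
    while (*key != '\0') {
        h = h * 33 + (unsigned char)*key++;
    }
    return h % TABLE_SIZE;
}

/* Return the shared object for key, creating it on first use.
 * Returns NULL only if memory allocation fails. */
Cert *cert_acquire(const char *key)
{
    size_t b = bucket_for(key);
    for (Cert *c = g_table[b]; c != NULL; c = c->next) {
        if (strcmp(c->key, key) == 0) {
            c->refcount++; /* already known: share it */
            return c;
        }
    }

    Cert *c = malloc(sizeof *c);
    if (c == NULL) {
        return NULL;
    }
    c->key = malloc(strlen(key) + 1);
    if (c->key == NULL) {
        free(c);
        return NULL;
    }
    strcpy(c->key, key);
    c->refcount = 1;
    c->next = g_table[b];
    g_table[b] = c;
    return c;
}

/* Drop one reference; the object is freed when the last holder lets go. */
void cert_release(Cert *cert)
{
    assert(cert != NULL && cert->refcount > 0);
    if (--cert->refcount > 0) {
        return;
    }
    Cert **link = &g_table[bucket_for(cert->key)];
    while (*link != cert) {
        link = &(*link)->next;
    }
    *link = cert->next; /* unlink from the bucket chain */
    free(cert->key);
    free(cert);
}
```

In this model, a thousand holders asking for the same key share a single object, so the memory cost per extra holder is one counter increment. Whether PSM's per-image certificate objects end up sharing in this way is exactly the open question in this thread.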
Updated•14 years ago
Assignee: kaie → nobody
Whiteboard: [psm-logic]
Updated•9 years ago
Whiteboard: [psm-logic] → [psm-logic][psm-backlog]
Updated•7 years ago
Priority: -- → P2
Summary: PSM should define NSS_DISABLE_ARENA_FREE_LIST / Lk regression on 7th April 2009 → investigate setting NSS_DISABLE_ARENA_FREE_LIST so that NSS doesn't hold on to memory it's not using
Comment 12•6 years ago
Moving to p3 because no activity for at least 1 year(s).
See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information
Priority: P2 → P3
Updated•2 years ago
Severity: normal → S3