Status

NSPR
NSPR
P1
normal
ASSIGNED
17 years ago
11 years ago

People

(Reporter: Kirk Erickson, Assigned: Wan-Teh Chang)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(6 attachments, 2 obsolete attachments)

(Reporter)

Description

17 years ago
The iws folks (jpierre@netscape.com) found that the webserver 
is 60% faster when SmartHeap is engaged.  SmartHeap is not usable
for the most part though, because it breaks the Java garbage collector.
This bug is an attempt to address heap contention in NSS specifically.

Detailed notes are attached and available in:

	/u/kirke/nss/collector/HINTS

As of this filing, I've created 3 freelists for data structures that
are allocated often.  Negligable performance improvement running selfserv
resulted from these changes, but Wan-Teh asked that I file a bug for the record
with my patch.  It creates freelists for SECITEM, session, and context which
recycle data structures rather than calling malloc() repeatedly.

The NSS library addresses heap contention already by using arenas.
Larry (larryh@netscape.com) instrumented NSPR with lock wrapper code to
ascertain the "high-water-mark" of the arenas.  He found that we didn't
stand to gain anything by fiddling with the arena sizes. See [Bug 87725]
Instrument NSS use of PLArena.

Solaris Collector/Analyzer proved to be most useful in identifying contention.

2001-0720-0931 To reduce contention on the global lock that lives in malloc,
we replace calls to calls with high Inclusive Other sec.  Lock contention
(LMS_USER_LOCK - see p. 144 of Analyzing Program Performance with WorkShop) is
included in this stat.

    PORT_Alloc() with PORT_ArenaAlloc()     malloc()
    PORT_ZAlloc() with PORT_ArenaZAlloc()   calloc()
    
    1. pk11_CreateNewContextInSlot()        110.950
    2. pk11_NewSession()                     32.390
    3. pk11_InitGeneric()                    19.010
    4. SECTITEM_DupItem()                    19.010

Turning on "synctrace" yields "inclusive sync wait time".
After instituting the session and context freelists originating from
(1) and (2), I discovered how to get synctrace working.

killall selfserv
rm -rf .test* test*
selfserv --optimize --debug
collector store filename test.1.er
collector store directory /export/kirke/workarea-nss33
collector synctrace threshold 0
collector synctrace on
collector show
collector enable
run -t 8 -d /export/kirke/certs -n "iws-files.red.iplanet.com's Cert-O-Matic II
ID" -p 12344 -w iplanet

go --restartonly iws-files xxx
https://iws-files.red.iplanet.com:12344/stop

2001-0802-1438 Analyzer output (post changes from axilla)
is backed up in axilla:/tmp/collector-2001-0801

1. Use exclusive metrics to locate functions that have high metric values.
Exclusive metrics for a function are calculated from events which occur
inside the function itself; they exclude metrics coming from calls to
other functions.

    mutex_lock              671.741
    pthread_mutex_lock      590.656
    pthread_cond_wait       162.759

2. Use inclusive metrics to determine which call sequence was responsible
for high metric values.  Inclusive metrics are calculated from events
which occur inside the function and any function it calls; thet include
metrics coming from calls to other functions.

    mutex_lock
        malloc                              346.955
            PORT_Alloc                              220.448
                pk11_CreateNewContextInSlot         239.185
                pk11_InitGeneric                     89.980
                pk11_NewSession                      53.553
                SECITEM_DupItem                      50.668
            calloc (PORT_ZAlloc)            120.285
                pk11_getKeyFromList                  41.699
                ssl3_InitState                       39.052
                ssl3_CreateSecurityInfo              34.271
            calloc (PR_NewLock)              36.644
                pk11_CreateNewContextInSlot         239.185
                pk11_NewSession                      53.553
                pk11_getKeyFromList                  41.699
        free                                323.514
    pthread_mutex_lock
        PR_Lock                             590.158
                ssl_Read                            981.068
                pk11_CreateNewContextInSlot         239.185
                pk11_DestroyContext                 156.575
                pk11_CloneContext                   106.912
    pthread_cond_wait
            PORT_Alloc          220.448

3. Use attributed metrics to trace a particular call sequence to the
function or functions that are responsible for high metric values.
Attributed metricss tell you how much of a metric came from calls from
or to another function; they attribute metrics to another function.

    ssl_Read                                        981.068
        ssl_SecureRecv                                      942.061
            ssl_Do1stHandshake                              942.014
                ....
                ssl3_HandleHandshakeMessage                 839.295
                    ssl3_HandleClientHello                  710.477
                        ssl3_InitPendingCipherSpec          398.732
                            ssl3_GenerateSesssionKeys       256.173
                                PK11_Derive                 133.360
                                PK11_SymKeyFromHandle       117.690
                                    pk11_CloseSession        64.088
                                        pk11_DestroySession  59.602 (free)
                                    PK11_CreateSymKey        53.603
                            PK11_CreateContextBySymKey      129.557
                                pk11_getKeyFromList          41.699
                                PK11_ReferenceSlot           23.690
                                    pk11_CreateSymKey
                                        pk11_getKeyFromList  41.699
                                            pk11_getNewSession
                                                NSC_OpenSession
                                                    pk11_NewSession 53.553
                        ssl3_SendFinished                   201.115
    pk11_CreateNewContextInSlot                     239.185
        pk11_context_init                                   127.106
            NSC_SignInit                                     88.203
                pk11_InitGeneric                             38.436
                    PORT_Alloc                               87.165
            NSC_DigestInit                                   49.012
                pk11_SessionFromHandle                       26.961
                    PR_Lock
                pk11_InitGeneric                             21.414
                    PORT_Alloc
        SECITEM_DupItem                                      50.566
            PORT_Alloc                                       50.668
        pk11_GetNewSession                                   48.251
            NSC_OpenSession                                  89.527
                pk11_NewSession                              53.553
                PR_Lock                                      35.974
    pk11_DestroyContext                             156.575
        PK11_FreeSlot                                        89.663
            PR_Lock
        SECITEM_FreeItem                             89.663
            free                                             41.572
        free                                                 22.721
    pk11_CloneContext                               106.912
        pk11_CreateNewContextInSlot                          69.319
            pk11_context_init                               127.106
                NSC_SignInit                                 88.203
                    pk11_InitGeneric                         38.436
                        PORT_Alloc                           87.165
    pk11_InitGeneric                                 89.980
        PORT_Alloc                                           87.165
    pk11_NewSession                                  53.553
        PR_NewLock                                           35.823
        PORT_Alloc                                           17.730
    SECITEM_DupItem                                  50.668
        PORT_Alloc                                           50.668
    pk11_getKeyFromList                              41.699
        pk11_GetNewSession                                   41.275
            NSC_OpenSession                                  89.527
                pk11_NewSession                              53.553
                PR_Lock                                      35.974
    ssl3_InitState                                   39.052
        PK11_CreateDigestContext                             38.565
            pk11_CreateNewContextInSlot                      32.836
    ssl3_CreateSecurityInfo                          34.271
        PORT_ZAlloc                                          17.893
(Reporter)

Comment 1

17 years ago
Created attachment 47508 [details]
Kirk's notes
(Reporter)

Comment 2

17 years ago
Here are selfserv runs on Linux.
The "tip" is for comparison (no change).
The "items" isolate the secitem.c change.
The "lists" results are with all 3 (context, session, and item freelists).

box 010827 tip-delay-full               12.86   13.37   14.46    74.79
box 010827 tip-delay-restart             4.48    4.53    4.71   220.91
box 010827 tip-nodelay-full             12.79   12.87   13.51    77.67
box 010827 tip-nodelay-restart           4.42    4.61    4.71   216.68
box 010827 tip-nodelayclient-full       12.79   12.84   12.96    77.88
box 010827 tip-nodelayclient-restart     4.44    4.56    4.62   219.14
box 010827 tip-nodelayserver-full       12.62   12.77   12.90    78.32
box 010827 tip-nodelayserver-restart     4.52    4.59    4.66   218.02

box 010827 items-delay-full             12.79   12.94   13.65    77.31
box 010827 items-delay-restart           4.46    4.60    4.70   217.47
box 010827 items-nodelay-full           12.71   12.83   13.10    77.92
box 010827 items-nodelay-restart         4.39    4.52    4.60   221.08
box 010827 items-nodelayclient-full     12.83   13.14   14.13    76.12
box 010827 items-nodelayclient-restart   4.52    4.81    4.94   207.76
box 010827 items-nodelayserver-full     12.88   13.00   13.04    76.94
box 010827 items-nodelayserver-restart   4.50    4.57    4.65   218.74

box 010829 lists-delay-full             12.61   12.67   13.42    78.96
box 010829 lists-delay-restart           4.19    4.36    4.53   229.18
box 010829 lists-nodelay-full           12.58   12.71   12.81    78.69
box 010829 lists-nodelay-restart         4.27    4.39    4.47   228.05
box 010829 lists-nodelayclient-full     12.54   12.79   13.35    78.20
box 010829 lists-nodelayclient-restart   4.27    4.37    4.57   228.66
box 010829 lists-nodelayserver-full     12.57   12.69   12.94    78.80
box 010829 lists-nodelayserver-restart   4.23    4.32    4.37   231.57

Comment 3

16 years ago
Marking NEW.
Status: UNCONFIRMED → NEW
Ever confirmed: true
(Reporter)

Comment 4

16 years ago
Created attachment 51740 [details] [diff] [review]
NSPR based zone allocator
(Reporter)

Comment 5

16 years ago
Placing zone allocator in PORT_Alloc and friends didn't yield 
performance gain based on selfserv/strsclnt.   Entertained the
thought of creating a separate malloc lib so we catch freebl
and EVERYTHING, but Wan-Teh and I feared we would be like SmartHeap
and likely fail Java garbage collection overriding malloc.
Also this approach won't work on all platforms.  So the NSPR
PR_Malloc level is where we would like Nelson's zone allocator.

Initial tests show no performance gain.  Wondering now if we're 
stressing like Julien did (with GNAT), driving Enterprise Server.
Increasing strsclnt threads from default (8) to 100 degraded
performance.  Need to settle on appropriate strsclnt/selfserv
parameters to reveal the performance gain basically removing the
mallocs should yield.

The zone allocator is rigged to print out stats on normal exit.
Here's an example strsclnt/selfserv run, which was done when the
allocator was in the PORT_ memory routines and some collector/analyzer
output run under Solaris

SunOS iws-files 5.8 Generic sun4u sparc SUNW,Ultra-250

Running: selfserv -t 8 -d /export/kirke/certs -n "iws-files.red.iplanet.com's
Cert-O-Matic II ID" -p 12344 -w iplanet 
(process id 22796)
Reading libcollector.so 
Creating experiment database test.1.er ...
Reading libfreebl_hybrid_3.so
pool: 1, zone: 0, size: 16, free: 21, hit: 4440, miss: 25, contend: 0
pool: 1, zone: 1, size: 64, free: 37, hit: 25299, miss: 130, contend: 0
pool: 1, zone: 2, size: 256, free: 16, hit: 9329, miss: 79, contend: 0
pool: 1, zone: 3, size: 1024, free: 24, hit: 18020, miss: 44, contend: 2
pool: 1, zone: 4, size: 4096, free: 17, hit: 17987, miss: 76, contend: 1
pool: 2, zone: 0, size: 16, free: 18, hit: 41550, miss: 18, contend: 0
pool: 2, zone: 1, size: 64, free: 13, hit: 17337, miss: 18, contend: 0
pool: 2, zone: 2, size: 256, free: 24, hit: 60049, miss: 26, contend: 1
pool: 2, zone: 3, size: 1024, free: 4, hit: 6924, miss: 4, contend: 0 
pool: 2, zone: 4, size: 4096, free: 1, hit: 1153, miss: 6, contend: 0
pool: 2, zone: 6, size: 65536, free: 1, hit: 1153, miss: 1, contend: 0
pool: 3, zone: 0, size: 16, free: 18, hit: 41382, miss: 18, contend: 0
pool: 3, zone: 1, size: 64, free: 9, hit: 17241, miss: 11, contend: 0
pool: 3, zone: 2, size: 256, free: 25, hit: 59698, miss: 27, contend: 1
pool: 3, zone: 3, size: 1024, free: 3, hit: 6897, miss: 3, contend: 0
pool: 3, zone: 4, size: 4096, free: 1, hit: 1149, miss: 7, contend: 0
pool: 3, zone: 6, size: 65536, free: 1, hit: 1149, miss: 1, contend: 0
pool: 4, zone: 0, size: 16, free: 18, hit: 39630, miss: 18, contend: 0
pool: 4, zone: 1, size: 64, free: 13, hit: 16522, miss: 15, contend: 0
pool: 4, zone: 2, size: 256, free: 25, hit: 57238, miss: 26, contend: 1
pool: 4, zone: 3, size: 1024, free: 4, hit: 6604, miss: 4, contend: 0
pool: 4, zone: 4, size: 4096, free: 1, hit: 1100, miss: 1, contend: 0
pool: 4, zone: 6, size: 65536, free: 1, hit: 1100, miss: 1, contend: 0
pool: 5, zone: 0, size: 16, free: 18, hit: 39630, miss: 18, contend: 0
pool: 5, zone: 1, size: 64, free: 9, hit: 16529, miss: 17, contend: 0
pool: 5, zone: 2, size: 256, free: 24, hit: 57261, miss: 27, contend: 0
pool: 5, zone: 3, size: 1024, free: 4, hit: 6604, miss: 4, contend: 0
pool: 5, zone: 4, size: 4096, free: 1, hit: 1100, miss: 10, contend: 0
pool: 5, zone: 6, size: 65536, free: 1, hit: 1100, miss: 1, contend: 0
pool: 7, zone: 0, size: 16, free: 18, hit: 39306, miss: 18, contend: 0
pool: 7, zone: 1, size: 64, free: 13, hit: 16387, miss: 22, contend: 0
pool: 7, zone: 2, size: 256, free: 25, hit: 56774, miss: 27, contend: 0
pool: 7, zone: 3, size: 1024, free: 4, hit: 6550, miss: 4, contend: 0
pool: 7, zone: 4, size: 4096, free: 1, hit: 1091, miss: 10, contend: 0
pool: 7, zone: 6, size: 65536, free: 1, hit: 1091, miss: 1, contend: 0
pool: 8, zone: 0, size: 16, free: 18, hit: 41984, miss: 18, contend: 0
pool: 8, zone: 1, size: 64, free: 13, hit: 17535, miss: 19, contend: 0
pool: 8, zone: 2, size: 256, free: 24, hit: 60582, miss: 26, contend: 1
pool: 8, zone: 3, size: 1024, free: 4, hit: 7002, miss: 4, contend: 0
pool: 8, zone: 4, size: 4096, free: 1, hit: 1165, miss: 5, contend: 0
pool: 8, zone: 6, size: 65536, free: 1, hit: 1165, miss: 1, contend: 0
pool: 9, zone: 0, size: 16, free: 18, hit: 40554, miss: 18, contend: 0
pool: 9, zone: 1, size: 64, free: 9, hit: 16896, miss: 18, contend: 0
pool: 9, zone: 2, size: 256, free: 25, hit: 58605, miss: 26, contend: 0
pool: 9, zone: 3, size: 1024, free: 3, hit: 6759, miss: 3, contend: 0
pool: 9, zone: 4, size: 4096, free: 1, hit: 1126, miss: 11, contend: 0
pool: 9, zone: 6, size: 65536, free: 1, hit: 1126, miss: 1, contend: 0
pool: 10, zone: 0, size: 16, free: 18, hit: 39966, miss: 18, contend: 0
pool: 10, zone: 1, size: 64, free: 13, hit: 16677, miss: 16, contend: 0
pool: 10, zone: 2, size: 256, free: 24, hit: 57730, miss: 28, contend: 2
pool: 10, zone: 3, size: 1024, free: 4, hit: 6660, miss: 4, contend: 0
pool: 10, zone: 4, size: 4096, free: 1, hit: 1109, miss: 4, contend: 0
pool: 10, zone: 6, size: 65536, free: 1, hit: 1109, miss: 1, contend: 0

execution completed, exit code is 0
(/usr/suntools/internal/SUNWspro/bin/../WS6U1/bin/sparcv9/

2052 exclusive sync wait sec (pthread_mutex_lock) 
1219 exclusive sync wait sec (pthread_cond_wait)
 591 exclusive sync wait sec (mutex_lock)

2048 inclusive sync wait sec PR_Lock

Status: NEW → ASSIGNED
(Reporter)

Comment 6

16 years ago
kirke@iws-files[17] !! | grep so
Julien,

I thought I should let you know where the NSPR based zone allocator
can be found for the record incase you have an opportunity to exercise
the web server.  We're wondering if my selfserv/strsclnt benchmarking
is stressing enough.  I built on iws-files:

f "*nspr*" | grep so
./mozilla/dist/SunOS5.8_OPT.OBJ/bin/libnspr4.so
./mozilla/dist/SunOS5.8_OPT.OBJ/lib/libnspr4.so
./mozilla/dist/SunOS5.8_DBG.OBJ/bin/libnspr4.so
./mozilla/dist/SunOS5.8_DBG.OBJ/lib/libnspr4.so
./mozilla/nsprpub/pr/src/SunOS5.8_OPT.OBJ/libnspr4.so
./mozilla/nsprpub/pr/src/SunOS5.8_DBG.OBJ/libnspr4.so
kirke@iws-files[18] pwd
/export/kirke/workarea-nss33
kirke@iws-files[19] 

Comment 7

16 years ago
Kirk,

I don't have access to much hardware to do my stress testing yet as I was doing
before, since our labs aren't setup. I will try to see if the network in my cube
is sufficient for measurements but I'm not certain.
(Reporter)

Comment 8

16 years ago
Julien,

I've built on Axilla under Solaris 2.6:

kirke@axilla[18] f "*nspr*so"
./mozilla/nsprpub/pr/src/SunOS5.6_OPT.OBJ/libnspr4.so
./mozilla/dist/SunOS5.6_OPT.OBJ/bin/libnspr4.so
./mozilla/dist/SunOS5.6_OPT.OBJ/lib/libnspr4.so

-rwxrwxr-x   1 kirke    staff     374480 Oct  4 10:44
./mozilla/nsprpub/pr/src/SunOS5.6_OPT.OBJ/libnspr4.so*kirke@axilla[21] pwd
/export/home8/kirke/workarea-nss33
kirke@axilla[22] 

My certificate expired for Axilla, and cert-o-matic (troll.red.iplanet.com)
has disappeared so I need to create a cert server and generate new certs
before I'll have selfserv numbers.  Looking forward to see your results.

Thanks a bunch for your help.
(Reporter)

Comment 9

16 years ago
Created attachment 55933 [details]
Julien's iWS stress results comparing zones/smartheap
Created attachment 58817 [details] [diff] [review]
reimplement PR_Realloc to work with blocks from malloc

Note, this patch is a patch to the file prmem.c after the patch in 
attachment 51740 [details] [diff] [review] has already been applied.  

This patch also cleans up some indentation problems with the previous 
patch.

Please review this patch.  
Please retest iWS with this patch applied.
(Assignee)

Comment 11

16 years ago
Created attachment 62271 [details] [diff] [review]
Combined patch.

I combined the two patches (attachment 51740 [details] [diff] [review] and attachment 58817 [details] [diff] [review])
and did some editing to follow the NSPR coding style.

This patch was generated against the tip of NSPR.
Attachment #51740 - Attachment is obsolete: true
Attachment #58817 - Attachment is obsolete: true
(Assignee)

Comment 12

16 years ago
Some comments about the combined patch (attachment 62271 [details] [diff] [review]).

1. The zone allocator currently can only be used on Unix
platforms that have pthreads.  It can be ported to all
platforms that have native threads.

2. To enable the zone allocator, one needs to set the
environment variable NSPR_USE_ZONE_ALLOCATOR to 1.

Comment 13

16 years ago
Wan-Teh,

I believe we previously agreed to also have another method to turn on the zone
allocator - a call to an API. This is especially necessary on NT for services.
Otherwise if the system environment variable is used, all applications under NT
end up using the zone allocator, even the client, which will waste memory.
The recommended way to turn on the allocator should be the call to that API, not
setting the environment variable, which should be only for testing application
compatibility with the zone allocator.
(Assignee)

Comment 14

16 years ago
Julien,

You are right.  I am going to use an alternate method that
achieves the same goal.  I will allow an application to
define a magic global variable and initialize it to 1 to
enable the zone allocator.  The advantage of this method
over the function call method is that this magic global
variable can be checked by NSPR initialization routine,
whereas the function call method cannot guarantee that
the function is called before NSPR is initialized.

Comment 15

16 years ago
Wan-Teh,

I don't think your approach is portable. Trying to access a global variable
defined in the executable from a DLL is likely to be impossible on some
platforms. I know it works on Unix and NT. But I experimented with this on OS/2. 

DosQueryModuleHandle is specified to return a module handle only for DLLs.
However, I was able to get it to successfully return one by passing
"program.exe". Unfortunately, a subsequent DosQueryProcAddr - which is supposed
to find an address in a module - returns error 6, invalid module, regardless of
whether I try to look by symbol name or ordinal. 
This might be a question worth asking the OS/2 kernel guys.

Comment 16

16 years ago
Created attachment 62291 [details]
OS/2 test program to try to load a variable from an executable. Mike Kaply, please pass this to OS/2 kernel development

Comment 17

16 years ago
I know this started as an NSS issue, but the resolution will be in NSPR. Should
we change the component for this bug to NSPR ?

(Assignee)

Comment 18

16 years ago
Good point.  Changing the product to NSPR.
Status: ASSIGNED → NEW
Component: Libraries → NSPR
Product: NSS → NSPR
QA Contact: sonja.mirtitsch → wtc
Target Milestone: --- → 4.2
Version: 3.4 → 4.1.2

Comment 19

16 years ago
It is definitely not possible to do a DosQueryProcAddr on an entrypoint in an 
EXE on OS/2.

You can only load and call functions in DLLs.

The same is true of global variables.

I like Julien's idea of an API call to turn it on.

Comment 20

16 years ago
Mike,

I'd appreciate if you could go back to Scott Garfinkle and still ask if there is
a way to do that, if you haven't yet. I was able to get ahold of the EXE's
module handle in my test program. Just not to load any entry points from it as
the handle was considered invalid by DosQueryProcAddr. I'm wondering what the
EXE handle is good for.

About the API call, the issue is complicated. Specifically, there is the problem
of using a single allocator throughout the life of a process.
Ideally, we would like NSPR to select its allocation method when it gets
initialized. The problem is that it usually gets initialized incidentally rather
than explicitly. Typically, a C++ object at global scope will invoke some
function before main() would get a chance to make the API call to select the
desired allocator. However an environment variable works for that implicit NSPR
initialization case.

One solution we came up with is that we could default to using the current
malloc/free allocator in NSPR when the program starts up. Then the user makes
the API call, and the allocator is switched. The problem is how to work with
existing objects that were allocated before the switchover. One way to do that
is that if an object to free does not exist in the allocator's list, we fall
back to free. The downside is that we can no longer assert that this is an
invalid pointer like we would if we were sure everything was allocated with the
zone allocator. So it is not as clean an approach, though it is portable and
workable. I think what Wan-Teh is afraid of is that on some platforms the free()
of a bad pointer would be a no-op and mask some application bugs that would
normally be automatically detected when using the zone allocator.


(Assignee)

Updated

16 years ago
Status: NEW → ASSIGNED
Priority: -- → P1

Comment 21

16 years ago
on NT, you may want to look at the MPHeap sample:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vcsample98/html/vcsmpmpheapsamplemultithreadedheapmanager.asp

I think that it does the job


Comment 22

16 years ago
There are other heap products out there that work as well. The problem is they
all interfere with the Java garbage collection code. With the zone allocator, we
don't replace all malloc calls, just those which we use in NSS.

Also we really need something that works on all platforms (or most common unix
platforms, since that make up the most commonly used servers).
I submit that we now have something that works on all platforms and 
that satisfies the heap needs of NSS and NSPR without interfering with
things like Java's run time which makes platform-dependent assumptions
about the heap implementation.
(Assignee)

Comment 24

16 years ago
Created attachment 66300 [details] [diff] [review]
Additional patch, to be applied on top of attachment 62271 [details] [diff] [review]

I fixed a few bugs in the previous patch (attachment 62271 [details] [diff] [review])
and a compiler warning.  Initialize the zone allocator
first during NSPR initialization.  Print warning messages
if memory blocks from ordinary malloc are handed to the
zone allocator.
Maybe the new printfs should be inside #ifdef DEBUG or something similar.
(Assignee)

Comment 26

16 years ago
Nelson,

I was debating whether I should do that.  Since you made the
same point, I added #ifdef DEBUG around the warning printfs.
(Assignee)

Comment 27

16 years ago
Created attachment 66844 [details] [diff] [review]
Patch to support the magic "nspr_use_zone_allocator" global variable

This patch allows an application to define a magic global
variable to enable the zone allocator.	I have tested it
on HP-UX, Linux, and Solaris.

To enable the zone allocator, add the following to your
main program:

PRBool nspr_use_zone_allocator = PR_TRUE;

On Solaris, this symbols gets exported automatically.

On HP-UX, you need to pass "+ee nspr_use_zone_allocator"
to the linker.

On Linux, you need to use the -E linker option.

This is going to make our release notes complicated.  Maybe
I should listen to you guys and just add an API call for
enabling the zone allocator.

Comment 28

16 years ago
Wan-Teh,

Your approach also has another problem, which is that it may not be the main
program linking directly with NSPR but some other shared library or Windows DLL.
In this case trying to grab the global variable would likely fail.

I'd like to see the API call if it can be done without the expense of losing the
PR_ASSERT on invalid free after the switch to the zone allocator. I'm still
wondering how this can be done.

Here is a suggestion :
Since the asserts are only for debug builds presumably, we shouldn't care about
memory usage for those cases. So in the debug build, when NSPR is implicitly
initialized, it could allocate a few extra bytes before the buffer, to flag info
about the memory block. Then after the switch is done, the PR_Free could look at
those bytes and know even about the blocks that weren't initialized by the zone
allocator before the API call, and assert. Of course this means the debug and
opt build wil behave somewhat differently. But if the zone allocator is well
tested, it could work and let us continue to debug problems regardless of
whether the zone allocator is enabled or disabled.

Comment 29

16 years ago
Well actually with my suggestion you would have to return a pointer a few bytes
after the flagged data. Then in the free, you would go back a few bytes and
check the data. Possibly calling a page fault right there if the pointer is
invalid or it is a double free. So it wouldn't exactly be like a PR_ASSERT, but
it would still occur only in case of an application bug.
(Assignee)

Comment 30

16 years ago
Patrick Beard suggested that a symbol in the main executable
program can be looked up as follows:
You can get the CFBundle of the main executable CFBundleGetMainBundle(),
and then look up a symbol in it using CFBundleGetFunctionPointerForName().
See CFBundle.h.
QA Contact: wtchang → nspr

Comment 31

11 years ago
The target milestone is already released. Resetting target milestone.
Target Milestone: 4.2 → ---
You need to log in before you can comment on or make changes to this bug.