Closed Bug 1094650 Opened 10 years ago Closed 10 years ago

Avoid excessive heap allocation in nss_builtins_FindObjectsInit()

Categories

(NSS :: Libraries, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
3.17.3

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

Attachments

(1 file, 1 obsolete file)

nss_builtins_FindObjectsInit has this comment at the top:

  /* This could be made more efficient.  I'm rather rushed. */

It is correct. The function looks through all 398 NSS builtin objects looking
for matches. And it pre-allocates an array with 398 entries to hold the match
results. On 64-bit this array takes 4 KiB, due to jemalloc rounding up the
request of 8 * 398 = 3184 bytes.

While doing a run with an e10s build of Firefox of
http://gregor-wagner.com/tmp/mem50 (which opens 50 sites) I see a match
count distribution like this:

> 235038 counts:
> (  1)   119312 (50.8%, 50.8%): nss matches = 1 / 398
> (  2)   113122 (48.1%, 98.9%): nss matches = 0 / 398
> (  3)     2604 ( 1.1%,100.0%): nss matches = 2 / 398

We almost always get 0 or 1 matches, so allocating an array for 398 matches is
overkill. Doing it 235,038 times gives cumulative allocations of almost 1 GiB,
and this accounts for nearly 9% of the cumulative heap allocations in the e10s
parent process!
This patch changes nss_builtins_FindObjectsInit to start with a small
stack-allocated array, and only switch to a heap-allocated array if the small
array overflows.

I chose a length of 1 for the array because that's enough to avoid 99% or more
of the heap allocations, but the heap allocation case still happens frequently
enough that we can be confident that it is correct.
Attachment #8517971 - Flags: review?(brian)
Attachment #8517971 - Flags: review?(brian) → review?(rrelyea)
Comment on attachment 8517971 [details] [diff] [review]
Avoid excessive heap allocation in nss_builtins_FindObjectsInit()

Review of attachment 8517971 [details] [diff] [review]:
-----------------------------------------------------------------

This type of allocation is common within NSS (in softoken, for instance, we have a fixed static space for store attributes and only allocate if the attribute is bigger then the default static space). 

The only things I would ask are:

1) make sure this compiles on all the mozilla platforms (see comment in bfind.c).
2) how often would we get an additional win if STACK_BUF_LENGTH were larger (say up to 10)?
3) let's make sure Kai is CC'ed on this bug, he may need to make similiar changes to libpem.

I'll handle #3. If #1 is fine, please set the checkin_needed keyword

bob

::: security/nss/lib/ckfw/builtins/bfind.c
@@ +190,5 @@
> +   * array later if the number of matches exceeds STACK_BUF_LENGTH.
> +   */
> +  #define STACK_BUF_LENGTH 1
> +  builtinsInternalObject *stackTemp[STACK_BUF_LENGTH];
> +  builtinsInternalObject **temp = stackTemp;

Make sure this compiles on all our platforms. I vaguely remember some platforms having problems with static pointer initializers (HP comes to mind).

If it compiles on all the mozilla platforms, we should be fine (gcc doesn't have a problem with this kind of code).
Attachment #8517971 - Flags: review?(rrelyea)
Attachment #8517971 - Flags: review+
Attachment #8517971 - Flags: feedback?(kaie+oldbugzilla)
OK, I see the try looks good. I'm setting checkin-needed.
Keywords: checkin-needed
> 2) how often would we get an additional win if STACK_BUF_LENGTH were larger (say up to 10)?

As per comment 1, I prefer to keep it smaller so that we hit the slow path moderately often. It's still a 99%+ reduction.

Do I need to add "r=rrelyea" to the patch commit message?
yes, it'd a good idea, do you have checkin privilege into NSS?
New patch just adds "r=relyea" to the commit message.
Attachment #8517971 - Attachment is obsolete: true
Attachment #8517971 - Flags: feedback?(kaie+oldbugzilla)
Comment on attachment 8518609 [details] [diff] [review]
Avoid excessive heap allocation in nss_builtins_FindObjectsInit()

Carrying over r+ from previous version.
Attachment #8518609 - Flags: review+
Attachment #8518609 - Flags: feedback?(kaie+oldbugzilla)
(In reply to Robert Relyea from comment #6)
> yes, it'd a good idea, do you have checkin privilege into NSS?

No. This is the first NSS patch I've written.
Blocks: 1095272
I've now checked out NSS directly and I can generate the patch against it instead of against mozilla-central if that is helpful.
checkin-needed ping! Thank you.
https://hg.mozilla.org/projects/nss/rev/902bc119dcdb
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → 3.18
Attachment #8518609 - Flags: feedback?(kaie+oldbugzilla)
Comment on attachment 8518609 [details] [diff] [review]
Avoid excessive heap allocation in nss_builtins_FindObjectsInit()

Review of attachment 8518609 [details] [diff] [review]:
-----------------------------------------------------------------

r=wtc. I suggest some small changes.

::: security/nss/lib/ckfw/builtins/bfind.c
@@ +188,5 @@
> +   * 99% of the time we get 0 or 1 matches. So we start with a small
> +   * stack-allocated array to hold the matches and switch to a heap-allocated
> +   * array later if the number of matches exceeds STACK_BUF_LENGTH.
> +   */
> +  #define STACK_BUF_LENGTH 1

Nit: we can just use 1 when declaring the stackTemp array and
use PR_ARRAY_SIZE(stackTemp) instead of STACK_BUF_LENGTH. This
way we can avoid the STACK_BUF_LENGTH macro, whose scope is not
limited to this function.

@@ +227,5 @@
> +      if( fo->n == STACK_BUF_LENGTH ) {
> +        /* Switch from the small stack array to a heap-allocated array large
> +         * enough to handle matches in all remaining cases. */
> +        temp = nss_ZNEWARRAY((NSSArena *)NULL, builtinsInternalObject *,
> +                             fo->n + nss_builtins_nObjects - i);

Nit: replace fo->n with STACK_BUF_LENGTH or PR_ARRAY_SIZE(stackTemp).

@@ +234,5 @@
> +          goto loser;
> +        }
> +        tempIsHeapAllocated = PR_TRUE;
> +        (void)nsslibc_memcpy(temp, stackTemp,
> +                             sizeof(builtinsInternalObject *) * fo->n);

This can be a constant -- either
    sizeof(stackTemp)
or
    sizeof(builtinsInternalObject *) * STACK_BUF_LENGTH
Attachment #8518609 - Flags: review+
wtc: thank you for the review. This has already landed, with rrelyea's r+, and your suggested changes are all minor. So I'm not sure whether to just leave it as is or not.

Also, with all due respect, given that you have other review requests that have been open for a long time (e.g. bug 1073578), it might be better to spend time on them rather than reviewing patches that have already been reviewed. Thank you.
Target Milestone: 3.18 → 3.17.3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: