Closed Bug 812085 Opened 12 years ago Closed 11 years ago

Initialize Windows CRITICAL_SECTIONs without debug info and with nonzero spin count

Categories

(NSPR :: NSPR, defect, P2)

All
Windows Vista
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cpeterson, Assigned: cpeterson)

References

(Blocks 1 open bug)

Details

(Whiteboard: [MemShrink:P3])

Attachments

(1 file, 2 obsolete files)

(Nicholas, I thought you might be interested in this NSPR patch because it relates to Windows memory footprint.)

Starting with Windows Vista, every CRITICAL_SECTION allocates an extra RTL_CRITICAL_SECTION_DEBUG object. Unfortunately, this debug object is not reclaimed by DeleteCriticalSection(), causing an apparent memory leak [1]. This is a debugging "feature", not a bug. If we are running on Vista or later, use InitializeCriticalSectionEx() [2] to allocate CRITICAL_SECTIONs without debug objects.

I only fixed NSPR's WIN95 configuration because that is what Firefox uses. Should I also fix WINNT configuration? I would need to move the _PR_MD_*LOCK() functions from w95cv.c to a source file shared by both WIN95 and WINNT builds (such as ntmisc.c or ntsem.c).

[1] http://stackoverflow.com/questions/804848/critical-sections-leaking-memory-on-vista-win2008#889853

[2] http://msdn.microsoft.com/en-us/library/windows/desktop/ms683477%28v=vs.85%29.aspx
Attachment #681871 - Flags: review?(wtc)
Attachment #681871 - Flags: feedback?(n.nethercote)
By default, CRITICAL_SECTIONs are initialized with a spin count of 0. MSDN says the Windows heap manager uses a spin count of 4000 [1], but Joe Duffy's "Concurrent Programming on Windows" book suggests 1500 is a "more reasonable starting point" [2]. On single-processor systems, the spin count is ignored and the critical section spin count is set to 0.

Some Firefox code already calls InitializeCriticalSectionAndSpinCount() directly [3]. Some of the spin counts specified are 0, 2000, 4000, 4096, and 5000. A more conservative change might be to continue using the default zero spin count for PRLock on pre-Vista machines and only use a nonzero spin count for Vista and newer machines.

I compared some Talos test runs with a spin count of 0 (default), 1500, and 4000 but I did not really see any big performance changes (except some odd tp5n xref numbers, but those tests seem to fluctuate on mozilla-central too).

* spin count 0: https://tbpl.mozilla.org/?tree=Try&rev=fa994b83c4d2
* spin count 1500: https://tbpl.mozilla.org/?tree=Try&rev=5eb5def7147d
* spin count 4000: https://tbpl.mozilla.org/?tree=Try&rev=e9ea9a1adb9b

[1] http://msdn.microsoft.com/en-us/library/windows/desktop/ms683476%28v=vs.85%29.aspx

[2] http://books.google.com/books?id=o4ohrd0_yA0C&q=1%2C500

[3] https://mxr.mozilla.org/mozilla-central/ident?i=InitializeCriticalSectionAndSpinCount
Attachment #681876 - Flags: review?(wtc)
Attachment #681876 - Flags: feedback?(n.nethercote)
Thanks for CC'ing me.  I don't know the first thing about such Windows arcana, so I can't really provide feedback on the patches.  But I do have some questions.  Have you been able to confirm that this leak is actually happening, and that this change prevents it?  And if so, have you been able to approximately quantify the leak rate?

Thanks!
Whiteboard: [MemShrink]
Attachment #681871 - Flags: feedback?(n.nethercote)
Attachment #681876 - Flags: feedback?(n.nethercote)
Chris, have you done any measurements to know for sure if this is a real problem?
Priority: -- → P2
Target Milestone: --- → 4.9.5
(In reply to Chris Peterson (:cpeterson) from comment #1)
> Some Firefox code already calls InitializeCriticalSectionAndSpinCount()
> directly [3].
Are NSPR's system requirements same as Firefox's one?
(In reply to Ehsan Akhgari [:ehsan] from comment #3)
> Chris, have you done any measurements to know for sure if this is a real
> problem?

Ehsan, after running some more tests on Windows 7, I have NOT been able to reproduce any CRITICAL_SECTION resource leaks, as described on Stack Overflow. However, I have learned a lot more about CRITICAL_SECTIONs internals from MSDN [1]. <:)

The CRITICAL_SECTION struct has a DebugInfo field, which is a pointer to a RTL_CRITICAL_SECTION_DEBUG struct. DebugInfo maintains lock contention statistics and an entry in a global linked list of all RTL_CRITICAL_SECTION_DEBUGs active in the process. The first 64 RTL_CRITICAL_SECTION_DEBUGs are allocated from a static array in NTDLL's .data section. Additional RTL_CRITICAL_SECTION_DEBUGs are allocated (and freed) from the process heap.

So these RTL_CRITICAL_SECTION_DEBUGs are not leaked, but they do add extra overhead for allocating the debug struct, maintaining the linked list, and updating lock statistics on every EnterCriticalSection() call. If we are not using these debug features, then I think using the CRITICAL_SECTION_NO_DEBUG_INFO flag is worthwhile (even if there is no resource leak).

[1] http://msdn.microsoft.com/en-us/magazine/cc164040.aspx#S3
(In reply to Masatoshi Kimura [:emk] from comment #4)
> (In reply to Chris Peterson (:cpeterson) from comment #1)
> > Some Firefox code already calls InitializeCriticalSectionAndSpinCount()
> > directly [3].
> Are NSPR's system requirements same as Firefox's one?

What is NSPR's current version requirement for Windows? The NSPR docs mention some ancient versions, such as Windows 3.1, Windows 9x, and NT 3.51.

Firefox requires Windows XP SP2.
I'd be fine with using CRITICAL_SECTION_NO_DEBUG_INFO anyways.  I just was not sure if it's actually going to fix any memory related problem.  :-)
(In reply to Chris Peterson (:cpeterson) from comment #6)
>
> What is NSPR's current version requirement for Windows? The NSPR docs
> mention some ancient versions, such as Windows 3.1, Windows 9x, and NT 3.51.
> 
> Firefox requires Windows XP SP2.

Hi Chris. NSPR also requires Windows XP SP2.
MemShrink:P3 until we have some measurements.
Whiteboard: [MemShrink] → [MemShrink:P3]
wtc, are there any additional tests or reviews you would like for these patches?
Priority: P2 → --
Comment on attachment 681871 [details] [diff] [review]
CRITICAL_SECTION-debug-info.patch

Review of attachment 681871 [details] [diff] [review]:
-----------------------------------------------------------------

r=wtc. Thanks for the patch. I can take care of the suggested changes when I
check this in. I have a question for you below.

::: nsprpub/pr/src/md/windows/w95cv.c
@@ +329,5 @@
> +}
> +
> +PRStatus _PR_MD_NEW_LOCK(_MDLock *lock)
> +{
> +    CRITICAL_SECTION *cs = &((lock)->mutex);

The parentheses are not necessary when this is a function (as opposed to a macro):

    CRITICAL_SECTION *cs = &lock->mutex;

@@ +332,5 @@
> +{
> +    CRITICAL_SECTION *cs = &((lock)->mutex);
> +
> +    if (sInitializeCriticalSectionEx) {
> +        if (!sInitializeCriticalSectionEx(cs, 0,

Do you know what a dwSpinCount of 0 means?

Does it mean "use the default spin count" or "do not spin"?
Attachment #681871 - Flags: review?(wtc) → review+
thanks, wtc.

(In reply to Wan-Teh Chang from comment #11)
> > +    if (sInitializeCriticalSectionEx) {
> > +        if (!sInitializeCriticalSectionEx(cs, 0,
> 
> Do you know what a dwSpinCount of 0 means?
> 
> Does it mean "use the default spin count" or "do not spin"?

dwSpinCount 0 means "do not spin". This is the default value for Win32 CRITICAL_SECTIONs, even on multi-processor systems. On single-processor systems, dwSpinCount is ignored and the critical section spin count is always 0.

MSDN reference:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms683477%28v=vs.85%29.aspx
Comment on attachment 681876 [details] [diff] [review]
part-2-CRITICAL_SECTION-spin-count.patch

Review of attachment 681876 [details] [diff] [review]:
-----------------------------------------------------------------

r=wtc. I will take care of the suggested changes when I check this in.

::: nsprpub/pr/src/md/windows/w95cv.c
@@ +333,5 @@
> + * Joe Duffy's "Concurrent Programming on Windows" book suggests 1500 is
> + * a "reasonable starting point". On single-processor systems, the spin
> + * count is ignored and the critical section spin count is set to 0.
> + */
> +#define _MD_LOCK_SPIN_COUNT 1500

The MSDN pages for the critical section functions mention that the Windows
heap manager uses a spin count of 4000. So that would also be a good spin
count.

@@ +351,1 @@
>      }

I suggest rewriting this as follows:

    BOOL ok;

    if (sInitializeCriticalSectionEx) {
        ok = sInitializeCriticalSectionEx(cs, _MD_LOCK_SPIN_COUNT,
                                          CRITICAL_SECTION_NO_DEBUG_INFO);
    } else {
        ok = InitializeCriticalSectionAndSpinCount(cs, _MD_LOCK_SPIN_COUNT);
    }
    if (!ok) {
        PR_SetError(...);
        return PR_FAILURE;
    }
Attachment #681876 - Flags: review?(wtc) → review+
(In reply to Chris Peterson (:cpeterson) from comment #12)
> 
> dwSpinCount 0 means "do not spin". This is the default value for Win32
> CRITICAL_SECTIONs, even on multi-processor systems. On single-processor
> systems, dwSpinCount is ignored and the critical section spin count is
> always 0.
> 
> MSDN reference:
> http://msdn.microsoft.com/en-us/library/windows/desktop/ms683477%28v=vs.
> 85%29.aspx

If I only read that MSDN page, I cannot figure out what is the default
spin count for a CRITICAL_SECTION initialized with InitializeCriticalSection().
Could you explain why you think it is 0? Thanks.
Priority: -- → P2
Good question. MSDN does not actually document the default value, but the following MSDN article ("Break Free of Code Deadlocks in Critical Sections Under Windows") says:

"On multiprocessor systems, if the critical section is unavailable, the calling thread will spin dwSpinCount times before performing a wait operation on a semaphore associated with the critical section. ... This field defaults to zero, but can be set to a different value with the InitializeCriticalSectionAndSpinCount API.

I have confirmed this in the debugger on a single-processor and a multi-processor VMware virtual machine.

http://msdn.microsoft.com/en-us/magazine/cc164040.aspx
Chris, thanks a lot for answering my questions. I like both changes
so I checked in both of them together.

Patch checked in on the NSPR trunk (NSPR 4.9.6).

Checking in pr/include/md/_win95.h;
/cvsroot/mozilla/nsprpub/pr/include/md/_win95.h,v  <--  _win95.h
new revision: 3.42; previous revision: 3.41
done
Checking in pr/src/md/windows/w95cv.c;
/cvsroot/mozilla/nsprpub/pr/src/md/windows/w95cv.c,v  <--  w95cv.c
new revision: 3.7; previous revision: 3.6
done
Attachment #681871 - Attachment is obsolete: true
Attachment #681876 - Attachment is obsolete: true
Attachment #712277 - Flags: checked-in+
I will push a new NSPR update to mozilla-central soon.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: 4.9.5 → 4.9.6
The patch has just been pushed to mozilla-central:
https://hg.mozilla.org/mozilla-central/rev/a4ce303cca2f
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: