Open Bug 295514 Opened 19 years ago Updated 2 years ago

Do not use InitializeCriticalSection without exception handling wrappings [@ RtlRaiseStatus - InitializeCriticalSection - PR_NewLock] => Exception STATUS_NO_MEMORY (c0000017)

Categories

(NSPR :: NSPR, defect, P5)

x86
Windows XP

Tracking

(Not tracked)

People

(Reporter: timeless, Unassigned)

References

Details

(Whiteboard: #16 topcrash)

Attachments

(2 files, 1 obsolete file)

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/initializecriticalsection.asp

Return Values

This function does not return a value.

In low memory situations, InitializeCriticalSection can raise a STATUS_NO_MEMORY
exception.

NSPR doesn't use exception handling, which means we die:
Returning Client
(ea4.758): C++ EH exception - code e06d7363 (first chance)
Getting Remoting Client
Returning Client
(ea4.758): C++ EH exception - code e06d7363 (first chance)
(ea4.758): C++ EH exception - code e06d7363 (first chance)
(ea4.758): C++ EH exception - code e06d7363 (first chance)
(ea4.758): C++ EH exception - code e06d7363 (first chance)
(ea4.758): C++ EH exception - code e06d7363 (first chance)
(ea4.758): Unknown exception - code c0000017 (first chance)
(ea4.758): Unknown exception - code c0000017 (!!! second chance !!!)
eax=0012e7f0 ebx=05d66850 ecx=7c9119fa edx=7c97c080 esi=09a1ebd8 edi=00866e7e
eip=7c964ed1 esp=0012e7f0 ebp=0012e840 iopl=0         nv up ei pl zr na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!RtlRaiseStatus+0x26:
7c964ed1 c9               leave
0:000> kv
ChildEBP RetAddr  Args to Child
0012e840 7c843944 c0000017 0012e8a8 3001509e ntdll!RtlRaiseStatus+0x26 (FPO: [Non-Fpo])
0012e84c 3001509e 09a1ebf4 098ac080 00b953ce kernel32!InitializeCriticalSection+0x19 (FPO: [Non-Fpo])
0012e858 00b953ce 05fe9c00 00b954b6 01cb618c nspr4!PR_NewLock+0x2e (FPO: [0,0,0]) (CONV: cdecl) [c:\build\chs3\build\mozilla\nsprpub\pr\src\threads\combined\prulock.c @ 204]
0012e860 00b954b6 01cb618c 00a58558 00000001 necko!nsTransportEventSinkProxy::nsTransportEventSinkProxy+0x26 (FPO: [3,0,0]) (CONV: thiscall) [c:\build\chs3\build\mozilla\netwerk\base\src\nstransportutils.cpp @ 63]
0012e870 00bc9a97 05fe9c14 01cb618c 00a58558 necko!net_NewTransportEventSinkProxy+0x1f (FPO: [4,0,0]) (CONV: cdecl) [c:\build\chs3\build\mozilla\netwerk\base\src\nstransportutils.cpp @ 189]
0012e8a8 00bc2480 00000001 08f02098 01cb61d0 necko!nsHttpTransaction::Init+0x43 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\build\chs3\build\mozilla\netwerk\protocol\http\src\nshttptransaction.cpp @ 184]
0012ea48 00bc5e0d 80000000 01cb6170 00831093 necko!nsHttpChannel::SetupTransaction+0x590 (FPO: [EBP 0x0012ea68] [0,90,0]) (CONV: thiscall) [c:\build\chs3\build\mozilla\netwerk\protocol\http\src\nshttpchannel.cpp @ 622]
0012ea68 00bc6324 00a58118 01cb6170 0908f558 necko!nsHttpChannel::Connect+0x150 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\build\chs3\build\mozilla\netwerk\protocol\http\src\nshttpchannel.cpp @ 359]
0012ea84 013bfd5c 00a58118 0908f558 00000000 necko!nsHttpChannel::AsyncOpen+0x122 (FPO: [Non-Fpo]) (CONV: stdcall) [c:\build\chs3\build\mozilla\netwerk\protocol\http\src\nshttpchannel.cpp @ 3153]
0012eaac 013c09f0 01cb6170 020a2810 01cb6170 docshell!nsDocumentOpenInfo::Open+0x38 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\build\chs3\build\mozilla\uriloader\base\nsuriloader.cpp @ 226]
0012eae4 013bb389 00ac8eb0 06441b18 00000000 docshell!nsURILoader::OpenURI+0xf9 (FPO: [Non-Fpo]) (CONV: stdcall) [c:\build\chs3\build\mozilla\uriloader\base\nsuriloader.cpp @ 846]
0012eb08 013be1e4 00000003 00ac8eb0 020a2810 docshell!nsDocShell::DoChannelLoad+0xad (FPO: [Non-Fpo]) (CONV: thiscall) [c:\build\chs3\build\mozilla\docshell\base\nsdocshell.cpp @ 5717]
0012eb60 013bde09 06441b18 0923d488 08ef0698 docshell!nsDocShell::DoURILoad+0x38c (FPO: [Non-Fpo]) (CONV: thiscall) [c:\build\chs3\build\mozilla\docshell\base\nsdocshell.cpp @ 5572]
0012ec50 013b7166 09419cd8 06441b18 0923d488 docshell!nsDocShell::InternalLoad+0x6f4 (FPO: [EBP 0x0012ece4] [14,46,0]) (CONV: stdcall) [c:\build\chs3\build\mozilla\docshell\base\nsdocshell.cpp @ 5356]
0012ece4 013b8b9b 009eefa0 06441b18 00000000 docshell!nsDocShell::LoadURI+0x336 (FPO: [Non-Fpo]) (CONV: stdcall) [c:\build\chs3\build\mozilla\docshell\base\nsdocshell.cpp @ 757]
0012edac 00865004 091887d0 08f75c08 00000000 docshell!nsDocShell::LoadURI+0x186 (FPO: [EBP 0x0012ede0] [6,40,0]) (CONV: stdcall) [c:\build\chs3\build\mozilla\docshell\base\nsdocshell.cpp @ 2566]
0012ede0 00af357a 020a2820 00000008 00000005 xpcom_core!XPTC_InvokeByIndex+0x27 (CONV: cdecl) [c:\build\chs3\build\mozilla\xpcom\reflect\xptcall\src\md\win32\xptcinvoke.cpp @ 102]
0012ef8c 00af5400 0012efa8 00000000 00000000 xpc3250!XPCWrappedNative::CallMethod+0x6c4 (FPO: [Non-Fpo]) (CONV: cdecl) [c:\build\chs3\build\mozilla\js\src\xpconnect\src\xpcwrappednative.cpp @ 2065]
0012f020 00b2d071 0174e008 01fe50e8 00000005 xpc3250!XPC_WN_CallMethod+0x8e (FPO: [Non-Fpo]) (CONV: cdecl) [c:\build\chs3\build\mozilla\js\src\xpconnect\src\xpcwrappednativejsops.cpp @ 1287]
0012f0d4 00b3246f 00000001 00000005 00000000 js3250!js_Invoke+0x531 (FPO: [EBP 0x0012f1c8] [3,35,0]) (CONV: cdecl) [c:\build\chs3\build\mozilla\js\src\jsinterp.c @ 1320]

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnucmg/html/UCMGch09.asp
says:
Moreover, there is a slight chance that entering a critical section may fail due
to memory limitations. The InitializeCriticalSectionAndSpinCount form of the
critical section function then returns a status of STATUS_NO_MEMORY. This is an
improvement over the InitializeCriticalSection function, which does not return
any status as can be determined by its void return type.
Unfortunately, the documentation at
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/initializecriticalsectionandspincount.asp
says:

Return Values

If the function succeeds, the return value is nonzero.

If the function fails, the return value is zero (0). To get extended error
information, call GetLastError.

    Windows Me/98/95:  This function does not have a return value. If the
function fails, it raises an exception.

Requirements
Client 	Requires Windows XP, Windows 2000 Professional, Windows NT Workstation
4.0 SP3 and later, Windows Me, or Windows 98.
Server 	Requires Windows Server 2003, Windows 2000 Server, or Windows NT Server
4.0 SP3 and later.

So that alternative doesn't work: on Windows Me/98/95 the function still raises
an exception instead of returning a status. I'm going to write a patch that
uses __try / __except.
This is a checkpoint. Current mingw doesn't support SEH. There's a patched
version of mingw which supports SEH, but I need to see some documentation
explaining how to use it (it doesn't like the GetExceptionCode() statement).
AFAIK, the compilers from Borland, Digital Mars, Intel, Microsoft, and Watcom
all support SEH. The only one with issues is mingw, and I'm talking with some
ReactOS people to see if I can find an easy way to work around that problem.
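For illustration only (this is not the attached patch): a minimal sketch of the kind of guard discussed above. The names `sketch_init_critical_section`, `sketch_have_seh`, `SKETCH_HAVE_SEH`, and `SKETCH_NO_MEMORY_CODE` are hypothetical, and using `_MSC_VER` as the test for __try / __except support is an assumption. Compilers without SEH fall back to the unguarded call.

```c
#if defined(_WIN32)
#include <windows.h>
#endif

/* Hypothetical feature macro: 1 when MSVC-style __try/__except is assumed usable. */
#if defined(_WIN32) && defined(_MSC_VER)
#define SKETCH_HAVE_SEH 1
#else
#define SKETCH_HAVE_SEH 0
#endif

/* Exception code seen in the crash log above (c0000017, STATUS_NO_MEMORY). */
#define SKETCH_NO_MEMORY_CODE 0xC0000017UL

/* Reports whether the guarded path was compiled in. */
int sketch_have_seh(void)
{
    return SKETCH_HAVE_SEH;
}

#if defined(_WIN32)
/* Returns 1 on success, 0 if initialization raised STATUS_NO_MEMORY. */
int sketch_init_critical_section(CRITICAL_SECTION *cs)
{
#if SKETCH_HAVE_SEH
    __try {
        InitializeCriticalSection(cs);
    } __except (GetExceptionCode() == SKETCH_NO_MEMORY_CODE
                    ? EXCEPTION_EXECUTE_HANDLER
                    : EXCEPTION_CONTINUE_SEARCH) {
        return 0; /* caller can fail lock creation instead of crashing */
    }
    return 1;
#else
    /* No SEH support in this compiler: call unguarded, as the code does today. */
    InitializeCriticalSection(cs);
    return 1;
#endif
}
#endif
```

A caller such as a lock constructor could then free its partially built lock and return NULL when the wrapper returns 0. Whether callers cope with a NULL lock is a separate question.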
QA Contact: wtchang → nspr
This signature is still seen on a variety of releases, but seems to be in higher volume on 3.6.4 than expected. It is the #16 topcrash among non-Flash, non-hang 3.6.4 crashes. Maybe the rise in rank is due to the removal of other, higher-ranking crashes?

checking --- RtlRaiseStatus 20100512-crashdata.csv
found in: 3.6.3 3.6.4 3.6 3.7a5pre 3.0.4 3.5.9 3.6.2 3.5.2 3.0.19 3.0.1 3.7a1pre 3.6b5 3.6b4 3.5.8 3.5.7 3.0.7 3.0.11 3.0
release     total-crashes   RtlRaiseStatus-crashes   fraction
all         353463          96                       0.000271598
3.6.3       242682          46                       0.000189548
3.6.4       20166           18                       0.000892591
3.6         13382           6                        0.000448363
3.7a5pre    2128            4                        0.0018797
3.0.4       627             4                        0.00637959
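
For reference, the last column is the RtlRaiseStatus crash count divided by the total crash count for that release (a fraction, not a percentage). A minimal sketch of that arithmetic follows; the function name `crash_fraction` is hypothetical.

```c
#include <assert.h>

/* Fraction of a release's crashes that carry the given signature. */
double crash_fraction(unsigned long signature_crashes,
                      unsigned long total_crashes)
{
    assert(total_crashes > 0); /* avoid dividing by zero */
    return (double)signature_crashes / (double)total_crashes;
}
```

With the numbers above, `crash_fraction(18, 20166)` for 3.6.4 is roughly 4.7 times `crash_fraction(46, 242682)` for 3.6.3, though the 3.6.4 sample is small.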

All crashes in the sample I looked at from 2010-05-12 are on Windows XP.

The stacks vary so it could be multiple bugs that we need to spin off.

3.6.4 reports at
http://crash-stats.mozilla.com/report/list?version=Firefox%3A3.6.4&signature=RtlRaiseStatus
Looking at a sample of the top of the stacks, they generally break down into something like this:

  31 ntdll.dll RtlRaiseStatus ntdll.dll TransformMD5 ntdll.dll RtlLeaveCriticalSection
  28 ntdll.dll RtlRaiseStatus ntdll.dll TransformMD5 ntdll.dll RtlEnterCriticalSection
  28 ntdll.dll RtlRaiseStatus ntdll.dll SHATransformP3 ntdll.dll RtlEnterCriticalSection
   7 ntdll.dll RtlRaiseStatus ntdll.dll SHATransformP3 ntdll.dll RtlLeaveCriticalSection
   1 ntdll.dll RtlRaiseStatus ntdll.dll RtlpWaitForCriticalSection ntdll.dll RtlLeaveCriticalSection
   1 ntdll.dll RtlRaiseStatus ntdll.dll RtlpGetRegistrationHead ntdll.dll RtlEnterCriticalSection

more detailed breakdown including firefox version is attached.

should we spin other bugs off for some of these stacks?
Attachment #445119 - Attachment is obsolete: true
I think ideally we want to find a solution to this that doesn't involve the
use of an exception handler.  If we can find a way to do it that never throws
exceptions, that seems best.  Here are some issues with exception handlers:

- Registering an exception handler before, and unregistering it after, every
PR_Lock call adds time to a function that is already heavily scrutinized for
its performance, and affects both the contended and uncontended cases.
If we must register an exception handler, ideally we'd do it once when the
NSPR shared lib (*) is loaded and unregister it once when the shared lib is
unloaded.  And of course this raises the question of how it's handled when NSPR
is no longer a separate shared lib, but is a part of a monster libXUL. :-/

- Other exception handlers may exist.  We don't want to preempt them.  
We'd want some way to "daisy chain" exception handlers, so that if our 
exception handler got called, and we could determine that the exception was 
not due to a memory allocation failure during an EnterCriticalSection call,
we could pass it on to the next handler.  Perhaps this is easy to do.  
I don't know.  

Chris, I don't think we need separate bugs for separate stacks that are variants on this theme.  I think it's all just a single issue with a single
solution.
Priority: -- → P2
Whiteboard: #16 topcrash
is there any chance that out of process plugins or some other changes in firefox 3.6.4 and mozilla-central could have made this problem worse?

The higher concentration in 3.6.4 and 3.7 alpha makes me a bit nervous about a volume regression.

crashes  version top frames of the stack

   8 3.6.4	ntdll.dll RtlRaiseStatus	ntdll.dll SHATransformP3	ntdll.dll RtlEnterCriticalSection	nspr4.dll PR_Lock	xul.dll mozilla::MutexAutoLock::MutexAutoLock(mozilla::Mutex &)	xul.dll mozilla::plugins::ChildAsyncCall::RemoveFromAsyncList()	

   5 3.6.4	ntdll.dll RtlRaiseStatus	ntdll.dll TransformMD5	ntdll.dll RtlEnterCriticalSection	nspr4.dll PR_Lock	xul.dll mozilla::MutexAutoLock::MutexAutoLock(mozilla::Mutex &)	xul.dll mozilla::plugins::ChildAsyncCall::RemoveFromAsyncList()	

   4 3.7a5pre	ntdll.dll RtlRaiseStatus	ntdll.dll SHATransformP3	ntdll.dll RtlEnterCriticalSection	nspr4.dll PR_Lock	xul.dll XPCJSRuntime::XPCJSRuntime(nsXPConnect *)	xul.dll XPCJSRuntime::newXPCJSRuntime(nsXPConnect *)
Well, ultimately the problem is exhaustion of heap space, I think, leading
to ENOMEM.  Anything that increases the size and/or frequency of leaks and
hence increases the frequency of ENOMEM errors will increase the frequency
of this crash on Windows I think.  Might be that plugins are just using up
a lot more of the address space, leaving less available for heap.  Or maybe
FF or some plugin has a bad leak.  I haven't looked to see ... is there 
some particular plugin/extension found common among these crash dumps?
timeless' patch uses __try / __except inline SEH. I don't know what kind of performance penalty that creates, but the way he's written it you shouldn't have to worry about chaining to other exception handlers, since his __except filter explicitly accepts only the specific exception he wants to catch.
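
To make the chaining point concrete, here is a minimal sketch of such a filter decision, not taken from the patch. The function name and the `SKETCH_` macros are hypothetical. Their values mirror the Windows SDK's EXCEPTION_EXECUTE_HANDLER (1), EXCEPTION_CONTINUE_SEARCH (0), and STATUS_NO_MEMORY (0xC0000017), redefined here only so the sketch compiles without `<windows.h>`.

```c
/* Stand-ins mirroring the Windows SDK values (assumption: defined locally
   so the sketch builds on any platform). */
#define SKETCH_EXECUTE_HANDLER  1            /* EXCEPTION_EXECUTE_HANDLER */
#define SKETCH_CONTINUE_SEARCH  0            /* EXCEPTION_CONTINUE_SEARCH */
#define SKETCH_STATUS_NO_MEMORY 0xC0000017UL /* STATUS_NO_MEMORY */

/* Filter decision for an __except clause: handle only the out-of-memory
   status and let every other exception keep searching for a handler. */
int sketch_no_memory_filter(unsigned long exception_code)
{
    return exception_code == SKETCH_STATUS_NO_MEMORY
               ? SKETCH_EXECUTE_HANDLER
               : SKETCH_CONTINUE_SEARCH;
}
```

On MSVC this would be used as `__except (sketch_no_memory_filter(GetExceptionCode()))`. Any other code, such as the C++ EH code e06d7363 in the log above, yields "continue search", so outer handlers still see it.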
The bug that timeless filed and the crashes that chofmann mentions in comment 7 are *not* the same thing. Pretty much all of chofmann's crashes are real crashes in PR_Lock with a deleted lock object, not STATUS_NO_MEMORY exceptions in PR_NewLock.

I do not think that Firefox wants to pay any sort of performance penalty for low-memory checking of InitializeCriticalSection (we would abort in Mutex or crash using the null PRLock* anyway).
Depends on: 565912
Chris, if it will make Ben happy, go ahead and file a separate bug about the 
stack traces involving EnterCriticalSection.  Please add a "see also" comment
pointing to this bug.
In reply to comment 9, one way to view the challenge is to figure out how to 
do in C what C++ does in response to timeless's try/catch code.  I'm guessing
that C++'s exception handling is quite different, with some general purpose
exception handler that's registered with the OS all the time, and try/catch
handlers that register with that general purpose handler, but that's only a 
guess.  A disassembly and/or trace would reveal all, at significant cost of 
time and effort.
(In reply to comment #11)
> Chris, if it will make Ben happy, go ahead and file a separate bug about the 
> stack traces involving EnterCriticalSection.  Please add a "see also" comment
> pointing to this bug.

https://bugzilla.mozilla.org/show_bug.cgi?id=563847#c17
Structured exception handling is actually supposed to be relatively cheap, IIRC:

http://msdn.microsoft.com/en-us/library/ms679270%28v=VS.85%29.aspx

Please note that __try/__except is not C++; it's an MSVC extension (and supported by most Windows compilers...).

The bug assignee didn't login in Bugzilla in the last 7 months and this bug has priority 'P2'/severity 'critical'.
:KaiE, could you have a look please?
For more information, please visit auto_nag documentation.

Assignee: wtc → nobody
Flags: needinfo?(kaie)
Severity: critical → S4
Flags: needinfo?(kaie)
Priority: P2 → P5