Closed
Bug 308277
Opened 19 years ago
Closed 18 years ago
Norman Virus Control crashes thunderbird [@ Niphk.dll + 0x2bb3]
Categories
(NSPR :: NSPR, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
4.7
People
(Reporter: bugzilla, Assigned: wtc)
References
(Depends on 1 open bug, )
Details
(Keywords: crash, relnote)
Crash Data
Attachments
(2 files, 2 obsolete files)
|
6.15 KB,
patch
|
Details | Diff | Splinter Review | |
|
629 bytes,
text/plain
|
Details |
thunderbird in the background suddenly crashed TB9297895W http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=TB9297895W Thunderbird version 1.6a1 (20050912) nightly build
http://forum.norman.com/viewtopic.php? p=1211&sid=9f76723bd1c492d1ec7b162c981ec1d8 Incident ID: 9297895 Stack Signature Niphk.dll + 0x2bb3 (0x678a2bb3) 0f68639a Product ID ThunderbirdTrunk Build ID 2005091206 Trigger Time 2005-09-12 22:34:02.0 Platform Win32 Operating System Windows NT 5.1 build 2600 Module Niphk.dll + (00002bb3) URL visited User Comments Since Last Crash 2561 sec Total Uptime 2561 sec Trigger Reason Access violation Source File, Line No. N/A Stack Trace Niphk.dll + 0x2bb3 (0x678a2bb3) _PR_MD_RECV [e:/builds/tinderbox/thunderbird- trunk/WINNT_5.0_Depend/mozilla/nsprpub/pr/src/md/windows/w95sock.c, line 214]
Summary: thunderbird in the background died → Norman Virus Control crashes thunderbird [@ Niphk.dll + 0x2bb3]
Assignee: mscott → nobody
Severity: normal → critical
Component: Mail Window Front End → MailNews: Networking
Keywords: relnote
Product: Thunderbird → Core
QA Contact: grylchan
Comment 2•19 years ago
|
||
NipHK.dll is not part of thunderbird - we can't really control crashes in dlls that aren't part of our product.
that's not true. at this point, i think nspr probably needs to add SEH around all calls it makes to this series of win32api's. we used to do that for plugins (until someone improved the compiler or something).
Assignee: nobody → wtchang
Component: MailNews: Networking → NSPR
Product: Core → NSPR
QA Contact: grylchan → wtchang
Version: Trunk → 4.6
| Assignee | ||
Comment 5•18 years ago
|
||
The three Talkback URLs in this bug report all say: Incident Not Found. DB error looking for Incident ID: 9297895 The forum.norman.com link points to a page that doesn't mention "Mozilla", "Firefox", or "Thunderbird". The only info useful to me is the NSPR source file and line number at the end of comment 1. It tells me NSPR is calling the Winsock "recv" function. I don't know what else I can do to fix this.
see bug 348170 and friends. basically for windows you need a flavor of try// around the call to wsock recv :)
Comment 7•18 years ago
|
||
This patch include NSPR Guard macros, and uses them to catch the error by means of SEH. This patch is not tested.
Comment 8•18 years ago
|
||
Timeless in comment #6 > see bug 348170 and friends. by friends you mean bug 156493 and bug ?____?
Comment 10•18 years ago
|
||
The previous patch (attachment #235086 [details] [diff] [review]) contained a bug which caused the linker to fail because of a non-existing reference to MessageBoxA (in user32.lib). The new patch includes code to automatically link in user32.lib for all source files which #include "prguard.h".
Attachment #235086 -
Attachment is obsolete: true
| Assignee | ||
Comment 11•18 years ago
|
||
The BEGIN_GUARD and END_GUARD macros should be defined in the NSPR-internal header file mozilla/nsprpub/pr/include/md/_win95.h. When the macros are defined there, we only need to handle the two possible C compilers: MSVC and GCC (MinGW). We don't need to handle __cplusplus or non-Windows compilers. NSPR may be used by command-line applications, so it should not call MessageBox to report an error. In this case you just need to call WSASetLastError. Isn't this really a bug in Norman Virus Control? Its Niphk.dll should not throw an exception. Has this crash been reported to Norman Virus Control? I'm not convinced that NSPR should catch exceptions.
Comment 12•18 years ago
|
||
wtc: i've seen probably half a dozen *different* apps cause crashes like this by messing up winsock. while it's their fault, our users shouldn't suffer.
Comment 13•18 years ago
|
||
(In reply to comment #11) > The BEGIN_GUARD and END_GUARD macros should be defined > in the NSPR-internal header file mozilla/nsprpub/pr/include/md/_win95.h. Well, I chose the public include directory, because these macros will actually be used in many other places besides NSPR internal source files, see bug #329299, bug #325756, bug #317420, bug #314723, etc. In fact, it may be used in any source file which is calling a Win32 API function (or calls into other APIs, such as plugins). > When the macros are defined there, we only need to handle > the two possible C compilers: MSVC and GCC (MinGW). We > don't need to handle __cplusplus or non-Windows compilers. For now, I'm not sure how to do SEH in MinGW, so the current code only handles MSVC (any gcc gurus out there who can help?). The __cplusplus handling is there because SEH in C and C++ are different. See my comments in bug #348170. For C++, the best option is enabling the -EHac- compiler option (see attachment #233455 [details] [diff] [review]) and then using try/catch(...) blocks, while in C, those are not available, and __try/__except must be used. > NSPR may be used by command-line applications, so it should > not call MessageBox to report an error. In this case you > just need to call WSASetLastError. Thanks for mentioning it. I'm preparing a new patch which addresses this issue. I'll send it shortly. > Isn't this really a bug in Norman Virus Control? Its Niphk.dll > should not throw an exception. Has this crash been reported > to Norman Virus Control? I'm not convinced that NSPR should > catch exceptions. I agree with timeless in comment #12.
Comment 14•18 years ago
|
||
This patch adds support for console applications to address the issue raised in comment #11.
Attachment #235220 -
Attachment is obsolete: true
| Assignee | ||
Comment 15•18 years ago
|
||
Thank you very much for your patch. But I don't want to catch exceptions thrown by code that should not throw exceptions. In the Microsoft documentation of the recv() function (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winsock/winsock/recv_2.asp), search for "except", "try", and "catch". You will find none of these. So the code in question (a Winsock service provider?) failed to conform to the specification. It is Norman Virus Control's responsibility to fix the code that throws exception. I also did a simple experiment: #include <stdio.h> #include <windows.h> int main() { int *p = 0; __try { *p = 0; } __except(EXCEPTION_EXECUTE_HANDLER) { printf("exception caught\n"); } return 0; } This experiment shows that the proposed patch will even mask null pointer dereferencing, a serious programming error. I'm sorry that I can't accept this patch.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → WONTFIX
Target Milestone: --- → 4.7
Comment 16•18 years ago
|
||
wtc: that's precisely the point. i'm an end user. i so don't care that some lame module in my system is crashing. i want my web browser to work.
Comment 17•18 years ago
|
||
(In reply to comment #15) > Thank you very much for your patch. But I don't want to > catch exceptions thrown by code that should not throw > exceptions. > > In the Microsoft documentation of the recv() function > (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winsock/winsock/recv_2.asp), > search for "except", "try", and "catch". You will find > none of these. So the code in question (a Winsock service > provider?) failed to conform to the specification. It is > Norman Virus Control's responsibility to fix the code that > throws exception. Quite true. But the point is, what from Normal Virus Control's developers' point of view is an error that should be fixed, from our point of view is an error preventing a network communication. The current behavior of Mozilla (including Thunderbird, Firefox, etc.) in response to this error is terminating the whole application, while I think that the expected behavior should be a message to the user that there's an unknown error in network communications, or something like that. In other words, the end users should not suffer from the technicalities (because it's an exception raised in Normal Virus Control's code, we won't do anything about it), and instead, get the behavior they usually expect (getting an error message which gives them some idea on what went wrong, and gives them the opportunity to terminate the application *if they want to*.) > I also did a simple experiment: > > #include <stdio.h> > #include <windows.h> > > int main() > { > int *p = 0; > > __try { > *p = 0; > } __except(EXCEPTION_EXECUTE_HANDLER) { > printf("exception caught\n"); > } > return 0; > } > > This experiment shows that the proposed patch will even mask > null pointer dereferencing, a serious programming error. I'm > sorry that I can't accept this patch. Well, the kind of exception handling in my patch was pretty simple. Exception handling can be turned off in debug (and alpha/beta) builds (where the focus is on catching errors. Moreover, there are even mode advanced solutions which can detect which module the exception occurs in, and act accordingly, log the error, send a Talkback report, etc. I don't think refusing to answer this issue is the right choice.
| Assignee | ||
Comment 18•18 years ago
|
||
If you catch and ignore an exception caused by a programming error in a lower layer, the lower layer may be left in an inconsistent state, and it may be dangerous to call into the lower layer again. I put together a simple test program that illustrates this point. When the lower layer dereferences a null pointer, it doesn't get to release an internal semaphore. So when we catch and ignore that access violation and call the lower again, the lower layer deadlocks trying to wait for the semaphore. If I spend more time, I may be able to create a test program in which the inconsistent state causes a worse problem than a deadlock. But I hope you get the idea. Again, I thank you for your interest in and patches for this bug. I know you want to help Mozilla users, and as a programmer I am fascinated by the technical side of this bug. But it is a bad idea to keep going after a programming error may have left the program in an inconsistent state. If we really want to help our users, we can report this crash in Niphk.dll to the vendor of Norman Virus Control and help them fix this crash.
Comment 19•18 years ago
|
||
(In reply to comment #18) > If you catch and ignore an exception caused by a programming > error in a lower layer, the lower layer may be left in an > inconsistent state, and it may be dangerous to call into the > lower layer again. I put together a simple test program that > illustrates this point. When the lower layer dereferences a null > pointer, it doesn't get to release an internal semaphore. So > when we catch and ignore that access violation and call the > lower again, the lower layer deadlocks trying to wait for the > semaphore. If I spend more time, I may be able to create a > test program in which the inconsistent state causes a worse > problem than a deadlock. But I hope you get the idea. Thanks for the sample program. I stand corrected. I think this sample proves that my patch would not *solve* the issue. And it could create more problems in the future, which would be more obscure to investigation. So I guess I should go on and close bug #350087 (as WONTFIX) on the same ground as well?
Comment 20•18 years ago
|
||
wtc: here's the thing. you're arguing that sometimes a crashed program will deadlock. I'm asserting that an aborted program will *always* die. I don't care that /sometimes/ a caught exception will result in an unhappy program. and i certainly don't care that when we catch the exception we explain to the user why the program is unhappy. telling the user that they have an infected wsa chain would be a *good* thing. and if we happened to get enough users complaining to their lame vendors, maybe enough vendors would fix the problem that it would go away.
Updated•13 years ago
|
Crash Signature: [@ Niphk.dll + 0x2bb3]
You need to log in
before you can comment on or make changes to this bug.
Description
•