Closed
Bug 393777
Opened 17 years ago
Closed 17 years ago
Crash [@ KiFastSystemCallRet] or [@ KiIntSystemCall] through [@ RNG_FileForRNG]
Categories
(NSS :: Libraries, defect)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 331404
People
(Reporter: MatsPalmgren_bugz, Unassigned)
References
Details
(Keywords: dev-doc-complete, topcrash)
Crash Data
KiFastSystemCallRet is currently topcrash #5 over the past 3 months:
http://crash-stats.mozilla.com/?do_query=1&query_search=signature&query_type=contains&query=&date=&range_value=3&range_unit=months

A typical stack looks like this:

0 KiFastSystemCallRet
1 NtWaitForSingleObject
2 WaitForSingleObjectEx
3 WaitForSingleObject
4 google_breakpad::ExceptionHandler::WriteMinidumpOnHandlerThread(...)
5 google_breakpad::ExceptionHandler::HandleInvalidParameter(...)
6 __loctotime64_t
7 _stat64i32
8 RNG_FileForRNG
9 ReadFiles
10 EnumSystemFiles
11 ReadSystemFiles
12 RNG_SystemInfoForRNG
13 nsc_CommonInitialize
14 NSC_Initialize
15 secmod_ModuleInit
16 SECMOD_LoadPKCS11Module
17 SECMOD_LoadModule
18 SECMOD_LoadModule
19 nss_Init
20 NSS_InitReadWrite
21 nsNSSComponent::InitializeNSS(int)
22 nsNSSComponent::Init()
23 nsNSSComponentConstructor
24 nsGenericFactory::CreateInstance
25 nsComponentManagerImpl::CreateInstanceByContractID
26 nsComponentManagerImpl::GetServiceByContractID
...

It looks to me as if there could be a bug in RNG_FileForRNG or any of the NSS functions leading up to it. FWIW, all crashes involving RNG_FileForRNG are on Windows:
http://crash-stats.mozilla.com/?do_query=1&query_search=stack&query_type=contains&query=RNG_FileForRNG&date=&range_value=3&range_unit=months
Flags: blocking1.9?
Comment 1•17 years ago
The URL
http://crash-stats.mozilla.com/report/list?range_unit=months&query_search=stack&query_type=contains&signature=KiFastSystemCallRet&query=RNG_FileForRNG&range_value=3
tells us that these crashes are all in FF3 alphas. But it does not tell us these crucial pieces of information:
a) what version of NSS is being used in that product?
b) what version of the MSVC compiler, and which service pack, was used to build the FF3 builds in question?

This problem appears to be fundamentally the same as bug 331404. The major difference is that bug 331404 was a problem seen with a call to the MSVCRT function _findnext (which enumerates files in a directory), and this bug is seen with a call to the MSVCRT function stat (which retrieves information about a named file). In both cases, if the file being returned from the enumeration, or being stat'ed, has a date/time stamp older than Jan 1 1970, the MSVCRT function throws an invalid parameter exception.

This is a bug in the MSVCRT function. It effectively declares that the calling program has made an error, passing an invalid parameter, when in fact the problem is that the date/time stamp recorded in the file system is an absurdly old value, one that cannot be represented in a time_t. This is NOT a fault in the calling program.

Microsoft claims to have fixed this MSVCRT bug in Service Pack 1 for MSVC8 (a.k.a. MSVC Express 2005), and indeed bug 331404 did go away when people began to build FF with MSVC8 SP1. FF3 has recently standardized on MSVC8 SP1 as the "reference platform" for building FF3 on Windows. See bug 365952.

So, my first suspicion is that this bug is being seen in builds that use MSVC8's original MSVCRT (not SP1). If that is so, then the solution is simply to ensure that, going forward, all builds are built with MSVC8 SP1, and not with the original MSVC8 version of MSVCRT.

Microsoft has not advised us as to what change they made in MSVCRT to fix the problem. It is possible that they changed the default exception handler to ignore this particular error. But these crash stacks are not using MSVC8's default exception handler. They're using a google_breakpad exception handler. It is possible that the Google exception handler has, in effect, recreated the problem that was present in the default exception handler of MSVC8 SP0, so that with the Google exception handler in place, even MSVCRT SP1 exhibits the same bug. If that is so, then this bug is actually the fault of the Google exception handler.

So, the next step is to identify the version of MSVC8's MSVCRT being used in the crashing builds. It would also be good to somehow remove Google's breakpad exception handler, and see if that cures the problem. If so, this is a bug for Google to fix (I think).
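To make the "older than Jan 1 1970" condition concrete, here is a minimal, portable sketch of the arithmetic involved. It assumes only the well-known Windows convention that file times are stored as 100-nanosecond ticks since 1601-01-01; the function names are hypothetical, and this is not NSS code, MSVCRT code, or the source of the "findold" tool. A timestamp that converts to a negative number of seconds is one that falls before the Unix epoch, which is the case described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* 100-nanosecond ticks between 1601-01-01 (the Windows file time epoch)
 * and 1970-01-01 (the Unix epoch): 11644473600 seconds * 10^7. */
#define FILETIME_TICKS_TO_UNIX_EPOCH 116444736000000000LL
#define FILETIME_TICKS_PER_SECOND    10000000LL

/* Hypothetical helper: convert Windows file time ticks to seconds relative
 * to the Unix epoch. The result is negative for timestamps before 1970.
 * Division is floored so that a fraction of a second before the epoch
 * still yields -1 rather than 0. Assumes ticks fits in a signed 64-bit
 * value, which holds for any realistic file time. */
static int64_t filetime_to_unix_seconds(uint64_t ticks)
{
    int64_t delta = (int64_t)ticks - FILETIME_TICKS_TO_UNIX_EPOCH;
    int64_t secs = delta / FILETIME_TICKS_PER_SECOND;
    if (delta % FILETIME_TICKS_PER_SECOND < 0) {
        secs -= 1;
    }
    return secs;
}

/* Hypothetical helper: true if the file time predates 1970-01-01 00:00:00 UTC,
 * i.e. the kind of timestamp this bug is about. */
static bool filetime_is_pre_epoch(uint64_t ticks)
{
    return filetime_to_unix_seconds(ticks) < 0;
}
```

A caller that obtains raw file times through the Win32 API rather than through stat could use such a check to skip or flag suspect files before handing them to the CRT; whether that is a sensible workaround for NSS is not something this bug establishes.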
Depends on: 365952
Comment 2•17 years ago
Oh, I might add that there are a couple of Windows program executables attached to bug 331404 that help with the diagnosis and provide a workaround. The program "findold" (attachment 224295) will find and report all relevant files with dates older than 1970 Jan 1. The program "touch" (attachment 224177) will correct the bad stamps in any files identified by findold. Limited instructions on their use may be found in bug 365952 comment 4.

The MSVC++ 2005 SP1 Redistributable Package (x86) may be downloaded from
http://www.microsoft.com/downloads/details.aspx?familyid=200B2FD9-AE1A-4A14-984D-389C36F85647&displaylang=en
Installing it is believed to cure this problem.
Reporter
Comment 3•17 years ago
FWIW, the dev docs don't seem to make it clear that MSVC8 SP1 is a requirement:
http://developer.mozilla.org/en/docs/Windows_Build_Prerequisites
I looked at the Tinderbox build log (WINNT 5.2 fxnewref-win32-Dep Nightly) but I can't tell if it's using SP1 or not. Can someone with access to the box check please?
Comment 4•17 years ago
VC8SP1 is not a requirement; there's just a bug in the original VC8 CRT that happens to crash for us in certain circumstances. fxnewref-win32-tbox is using VC8SP1, although I wish that was documented somewhere other than in random bug comments (bug 384624 and bug 365952).
Comment 5•17 years ago
Also, you can determine the version of the CRT being used from the crash reports, it's just not obvious. Look at the version of msvcr80.dll. Through some poking around, I've found that 8.0.50727.42 is the original release, and 8.00.50727.762 is SP1. Also, if you look at the build ID graph at the top of that crash reports page, you'll notice that this isn't occurring in recent builds, which makes sense. I think this is probably fixed by the switch to SP1.
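The version check described in this comment can be sketched as a small classifier. This is a minimal illustration, assuming only the two msvcr80.dll version numbers quoted above (revision 42 for the original release, 762 for SP1); the enum and function names are hypothetical, and any other revision is deliberately reported as unknown rather than guessed.

```c
#include <stdio.h>

/* Hypothetical labels for the two CRT releases named in the comment above. */
typedef enum {
    CRT_UNKNOWN,
    CRT_VC8_ORIGINAL, /* 8.0.50727.42, the original VC8 release (SP0) */
    CRT_VC8_SP1       /* 8.0.50727.762, VC8 SP1 */
} crt_release;

/* Classify an msvcr80.dll version string as it appears in a crash report.
 * Accepts both "8.0.50727.762" and "8.00.50727.762", since %u reads
 * "00" as 0. Only the two revisions quoted in this bug are recognized. */
static crt_release classify_msvcr80(const char *version)
{
    unsigned major, minor, build, revision;

    if (version == NULL ||
        sscanf(version, "%u.%u.%u.%u", &major, &minor, &build, &revision) != 4) {
        return CRT_UNKNOWN;
    }
    if (major != 8 || minor != 0 || build != 50727) {
        return CRT_UNKNOWN;
    }
    if (revision == 42) {
        return CRT_VC8_ORIGINAL;
    }
    if (revision == 762) {
        return CRT_VC8_SP1;
    }
    return CRT_UNKNOWN;
}
```

Run over the module lists of the crash reports, a check like this would separate reports from SP0 builds from reports from SP1 builds, which is the distinction comment 1 asks for.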
Reporter
Comment 6•17 years ago
> VC8SP1 is not a requirement

I think the dev docs should at least recommend using VC8SP1, if not require it. (bug 331404 comment 69 hints at other problems with SP0 as well)

> I think this is probably fixed by the switch to SP1

I think you're right.
Comment 7•17 years ago
In reply to comment 3 and comment 4, VS8 SP1 is not required. NSS works fine when built with VC5 (!), VC6, VC7, and VC8-SP1. But builds with VC8-SP0 have problems due to VC8-SP0's CRT. The requirement is NOT to use VC8 SP0 (a.k.a. Express 2005 SP0).
Updated•17 years ago
Keywords: dev-doc-needed → dev-doc-complete
Updated•13 years ago
Crash Signature: [@ KiFastSystemCallRet]
[@ KiIntSystemCall]
[@ RNG_FileForRNG]