Closed
Bug 588873
Opened 14 years ago
Closed 10 years ago
Firefox 4.0b Crash [@ NS_AccumulateFastLoadChecksum(unsigned int*, unsigned char const*, unsigned int, int) ]
Categories
(Core :: XPCOM, defect)
Tracking
()
RESOLVED
WORKSFORME
Tracking | Status | |
---|---|---|
blocking2.0 | --- | - |
People
(Reporter: chofmann, Unassigned)
References
Details
(Keywords: crash, regression, Whiteboard: not-ready)
Crash Data
Attachments
(1 file, 3 obsolete files)
7.63 KB,
patch
|
benjamin
:
approval2.0-
|
Details | Diff | Splinter Review |
we crested 400k user on b4 and starting to see some new signatures turn up. this one in fast load code. Correlation to startup or time of session 43 total crashes for NS_AccumulateFastLoadChecksum on 20100818-crashdata.csv 40 startup crashes inside 30 sec. 27 repeated crashes inside 3 min. of last crash Stacks look like http://crash-stats.mozilla.com/report/index/aa934437-6d71-4da5-8d60-8e6e82100819 Frame Module Signature [Expand] Source 0 xul.dll NS_AccumulateFastLoadChecksum xpcom/io/nsFastLoadFile.cpp:172 1 xul.dll nsFastLoadFileReader::ComputeChecksum xpcom/io/nsFastLoadFile.cpp:628 2 xul.dll nsFastLoadFileReader::Open xpcom/io/nsFastLoadFile.cpp:908 3 xul.dll NS_NewFastLoadFileReader xpcom/io/nsFastLoadFile.cpp:1167 4 xul.dll nsFastLoadService::NewInputStream xpcom/io/nsFastLoadService.cpp:171 5 xul.dll mozJSComponentLoader::StartFastLoad 6 xul.dll mozJSComponentLoader::GlobalForLocation js/src/xpconnect/loader/mozJSComponentLoader.cpp:1274 7 xul.dll mozJSComponentLoader::LoadModuleImpl js/src/xpconnect/loader/mozJSComponentLoader.cpp:793 8 xul.dll mozJSComponentLoader::LoadModule js/src/xpconnect/loader/mozJSComponentLoader.cpp:728 9 xul.dll nsComponentManagerImpl::KnownModule::Load xpcom/components/nsComponentManager.cpp:953 10 xul.dll nsFactoryEntry::GetFactory xpcom/components/nsComponentManager.cpp:1942 11 xul.dll nsComponentManagerImpl::CreateInstanceByContractID xpcom/components/nsComponentManager.cpp:1304 12 xul.dll nsComponentManagerImpl::GetServiceByContractID xpcom/components/nsComponentManager.cpp:1670 13 xul.dll CallGetService obj-firefox/xpcom/build/nsComponentManagerUtils.cpp:94 14 xul.dll nsCOMPtr_base::assign_from_gs_contractid_with_error obj-firefox/xpcom/build/nsCOMPtr.cpp:141 15 xul.dll nsAppStartupNotifier::Observe embedding/components/appstartup/src/nsAppStartupNotifier.cpp:100 16 xul.dll XRE_main toolkit/xre/nsAppRunner.cpp:3465 17 firefox.exe wmain toolkit/xre/nsWindowsWMain.cpp:120 18 firefox.exe __tmainCRTStartup obj-firefox/memory/jemalloc/crtsrc/crtexe.c:591 19 kernel32.dll BaseProcessStart more at http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=contains&query=NS_AccumulateFastLoadChecksum%28unsigned&date=08%2F19%2F2010%2011%3A21%3A00&range_value=1&range_unit=weeks&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=NS_AccumulateFastLoadChecksum%28unsigned%20int*%2C%20unsigned%20char%20const*%2C%20unsigned%20int%2C%20int%29 all 289490 43 0.000148537 4.0b3 15642 31 0.00198184 4.0b2 2871 11 0.00383142 4.0b1 1802 1 0.000554939 os breakdown NS_AccumulateFastLoadChecksumTotal 43 Win5.1 0.91 Win6.0 0.02 Win6.1 0.07
Reporter | ||
Updated•14 years ago
|
Updated•14 years ago
|
Severity: normal → critical
Comment 1•14 years ago
|
||
I stared at this for a while, can't see what's causing it. The error means that we are somehow reading outside of the mmapped area, but given the existing bounds checks in there, I don't see how that can happen.
Comment 2•14 years ago
|
||
This should make this crash go away
Attachment #477697 -
Flags: review?(bhsieh)
Comment 3•14 years ago
|
||
Comment on attachment 477697 [details] [diff] [review] fix Per IRC discussion, we're worried about allowing ComputeChecksum() to throw an exception. It'd be better to catch the exception inside the method, especially since it's exposed through the .idl
Attachment #477697 -
Flags: review?(bhsieh) → review-
Updated•14 years ago
|
Assignee: nobody → tglek
blocking2.0: ? → betaN+
Comment 4•14 years ago
|
||
This is still happening in beta7 - lower on the list though ie: #200. I haven't found the signature in any of the latest crash reports on the trunk.
Reporter | ||
Comment 5•14 years ago
|
||
still around in beta7 with about 13 crashes per day. also seen on 4.0b8pre see this query for date restriction removed for a list of the latest crashes. http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=contains&query=NS_AccumulateFastLoadChecksum%28unsigned&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=NS_AccumulateFastLoadChecksum%28unsigned%20int*%2C%20unsigned%20char%20const*%2C%20unsigned%20int%2C%20int%29
Comment 6•14 years ago
|
||
This adds exception_in_page guards to mmap-io and switches to descriptor io in ComputeChecksum which avoids this exception and provides faster startup performance.
Attachment #477697 -
Attachment is obsolete: true
Attachment #494491 -
Flags: review?
Updated•14 years ago
|
Attachment #494491 -
Flags: review? → review?(ben.hsieh)
Comment 7•14 years ago
|
||
>+ if (rem) { >+ NS_AccumulateFastLoadChecksum(&checksum, >+ reinterpret_cast<PRUint8*>(buf), >+ rem, >+ PR_TRUE); Is this last block necessary after the loop? Doesn't seems like other callsites worry about remainder left from NS_AFLC(). >+#ifdef XP_WIN >+ nsAutoString name; >+ rv = mFile->GetPath(name); >+ NS_ENSURE_SUCCESS(rv, rv); >+ HANDLE winFD = ::CreateFileW(name.get(), GENERIC_READ, FILE_SHARE_READ, >+ NULL, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL); >+ if (winFD == INVALID_HANDLE_VALUE) >+ return NS_ERROR_FAILURE; >+ fd = PR_ImportFile((PROsfd) winFD); >+#else How come OpenNSPRFileDesc doesn't work here? Maybe put a comment here. r+ with above
Updated•14 years ago
|
Attachment #494491 -
Flags: review?(ben.hsieh) → review+
Comment 8•14 years ago
|
||
(In reply to comment #7) > >+ if (rem) { > >+ NS_AccumulateFastLoadChecksum(&checksum, > >+ reinterpret_cast<PRUint8*>(buf), > >+ rem, > >+ PR_TRUE); > Is this last block necessary after the loop? Doesn't seems like other callsites > worry about remainder left from NS_AFLC(). I brought back old reading logic here. Rest of the callsites don't have to worry about Read() underfill as they read in big enough chunks. > > >+#ifdef XP_WIN > >+ nsAutoString name; > >+ rv = mFile->GetPath(name); > >+ NS_ENSURE_SUCCESS(rv, rv); > >+ HANDLE winFD = ::CreateFileW(name.get(), GENERIC_READ, FILE_SHARE_READ, > >+ NULL, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL); > >+ if (winFD == INVALID_HANDLE_VALUE) > >+ return NS_ERROR_FAILURE; > >+ fd = PR_ImportFile((PROsfd) winFD); > >+#else > How come OpenNSPRFileDesc doesn't work here? Maybe put a comment here. > > r+ with above Ok, FILE_FLAG_SEQUENTIAL_SCAN is special sauce that turns on aggressive readahead in windows
Comment 9•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/9ddbf8ab23a5
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
I backed this out to investigate whether it caused bug 617048: http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=c7b784648b02
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Based on the backouts, either this or bug 605672 caused bug 617048. (My money is on this one, but not all *that* strongly.) If you think it's not this one, feel free to try relanding *after* beta8 branches or is tagged, but make sure not to reland on the same day as bug 605672, and watch the crash stats the next day (e.g., on http://dbaron.org/mozilla/crashes-by-build ).
Reporter | ||
Comment 12•14 years ago
|
||
also looking at crash stats its not clear that the fix had impact in solving the original crash bug sort this query by build id to see b8pre crashes on dec 7 with build id 20101207030325 which should have been inside the range with the patch from comment 9 was landed and wend comment 10 backed out the patch. http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=contains&query=NS_AccumulateFastLoadChecksum%28unsigned&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=NS_AccumulateFastLoadChecksum%28unsigned%20int*%2C%20unsigned%20char%20const*%2C%20unsigned%20int%2C%20int%29
Comment 13•14 years ago
|
||
(In reply to comment #12) > also looking at crash stats its not clear that the fix had impact in solving > the original crash bug > > sort this query by build id to see b8pre crashes on dec 7 with build id > 20101207030325 which should have been inside the range with the patch from > comment 9 was landed and wend comment 10 backed out the patch. > > http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=contains&query=NS_AccumulateFastLoadChecksum%28unsigned&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=NS_AccumulateFastLoadChecksum%28unsigned%20int*%2C%20unsigned%20char%20const*%2C%20unsigned%20int%2C%20int%29 It did, but looks it did uncover another issue in the code. If look at the new crash, it's a different (and more rare) exception...the same one as present in firefox 3.6(ie before fasl was rewritten). I guess restoring the old algorithm also restored some old bug. I doubt that this caused bug 617048 because it changed a low level detail in how fasl is read, the resulting structures should be identical, with and without this patch.
Comment 14•14 years ago
|
||
The only thing that I can tell is wrong with the old code is using memcpy instead of memmove(memory shouldn't overlap in memcpy). Wonder if that could cause issues.
Comment 15•14 years ago
|
||
The code was also storing return value of PR_Read() in a PRUint32
Attachment #494491 -
Attachment is obsolete: true
Attachment #497382 -
Flags: review+
Updated•14 years ago
|
blocking2.0: betaN+ → -
Updated•14 years ago
|
Attachment #497382 -
Flags: approval2.0?
Comment 16•14 years ago
|
||
Please don't carry over review flags, it makes bugs confusing.
Updated•14 years ago
|
Attachment #497382 -
Flags: approval2.0? → approval2.0+
Comment 17•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/30e1cfee3545 hopefully my revised patch takes care of the crash, closing for now.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Comment 18•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/5634bfc97402 no dice
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 19•13 years ago
|
||
had a += typo instead of = in the previous patch :(
Attachment #497382 -
Attachment is obsolete: true
Updated•13 years ago
|
Attachment #505204 -
Flags: approval2.0?
Comment 20•13 years ago
|
||
Comment on attachment 505204 [details] [diff] [review] use descriptor for checksum(fixed) This bounced once and I don't think it's critical for the release.
Attachment #505204 -
Flags: approval2.0? → approval2.0-
Comment 21•13 years ago
|
||
Comment on attachment 497382 [details] [diff] [review] use descriptor for checksum(fixed) (bookkeeping)
Attachment #497382 -
Flags: approval2.0+
Comment 22•13 years ago
|
||
I don't feel comfortable taking this on the cedar branch.
Whiteboard: not-ready
Assignee | ||
Updated•13 years ago
|
Crash Signature: [@ NS_AccumulateFastLoadChecksum(unsigned int*, unsigned char const*, unsigned int, int) ]
Updated•13 years ago
|
Assignee: tglek → nobody
Comment 23•13 years ago
|
||
This seems to not happen in versions past 6.0.x - have we fixed this in some other way?
Crash Signature: [@ NS_AccumulateFastLoadChecksum(unsigned int*, unsigned char const*, unsigned int, int) ] → [@ NS_AccumulateFastLoadChecksum(unsigned int*, unsigned char const*, unsigned int, int) ]
[@ NS_AccumulateFastLoadChecksum ]
Comment 24•13 years ago
|
||
this code was removed.
Comment 25•10 years ago
|
||
Fixed by code removal per comment 24.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 10 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•