Closed Bug 524313 Opened 11 years ago Closed 11 years ago

crash [@ operator new(unsigned int) | nsFastLoadFileReader::ReadFooter(nsFastLoadFileReader::nsFastLoadFooter*)]

Categories

(Core :: XUL, defect, P2)

x86
All
defect

Tracking

()

RESOLVED FIXED
mozilla1.9.3a1
Tracking Status
status1.9.2 --- beta3-fixed

People

(Reporter: chofmann, Assigned: taras.mozilla)

References

Details

(Keywords: crash, topcrash, Whiteboard: [crashkill])

Crash Data

Attachments

(2 files)

this bug spun off from Bug 511758 and out of memory crashes with "new"  RaiseException crash signatures but this one deals only with  nsFastLoadFileReader::ReadFooter and stacks that look something like:

0      kernel32.dll      RaiseException      
1     mozcrt19.dll     _CxxThrowException     throw.cpp:159
2     mozcrt19.dll     operator new     new.cpp:57
3     xul.dll     nsFastLoadFileReader::ReadFooter    
mozilla/xpcom/io/nsFastLoadFile.cpp:722
4     xul.dll     nsFastLoadFileReader::Open    
mozilla/xpcom/io/nsFastLoadFile.cpp:978
5     xul.dll     NS_NewFastLoadFileReader    
mozilla/xpcom/io/nsFastLoadFile.cpp:1225
6     xul.dll     nsFastLoadService::NewInputStream    
mozilla/xpcom/io/nsFastLoadService.cpp:172
7     xul.dll     mozJSComponentLoader::StartFastLoad    
mozilla/js/src/xpconnect/loader/mozJSComponentLoader.cpp:880
8     xul.dll     CallGetService     nsComponentManagerUtils.cpp:94
9     xul.dll     nsSSLSocketProvider::AddRef    
mozilla/netwerk/base/src/nsFileStreams.cpp:364
10     xul.dll     mozJSComponentLoader::GlobalForLocation    
mozilla/js/src/xpconnect/loader/mozJSComponentLoader.cpp:1127
11     xul.dll     nsLocalFile::ResolveAndStat    
mozilla/xpcom/io/nsLocalFileWin.cpp:847
12     xul.dll     nsComponentManagerImpl::AutoRegisterComponent    
mozilla/xpcom/components/nsComponentManager.cpp:2988
13     xul.dll     nsLocalFile::IsDirectory    
mozilla/xpcom/io/nsLocalFileWin.cpp:2399
14     xul.dll     nsComponentManagerImpl::AutoRegisterImpl    
mozilla/xpcom/components/nsComponentManager.cpp:2862

not sure XUL is the right component but a several fastload bugs seem to end up here so that might be a good place to start.
This is about a third of the 3000/wk _CxxThrowException crashes in Firefox 3.5.3.  So once bug 511758 fixes the signature distribution, this will be approximately topcrash #90.

I'm guessing this is an overlarge allocation rather than a high-memory-use issue, since it happens during startup.

bp-5e061b65-6e60-473e-a42a-e26b12091024
bp-277f7917-1c8f-4369-9927-2aa532091024
Keywords: topcrash
Why is our use of operator new throw an exception?
Maybe you need to toss a std::nothrow in there, like in bug 290284?
(In reply to comment #2)
> Why is our use of operator new throw an exception?

On Windows, the default implementation of operator new throws an exception on out of memory. In the past couple of days, I have been seeing other crashes in fastload stuff. I wonder if we're reading either invalid or old fastload files (by which I mean files that should have been invalidated, but weren't for whatever reason).

If this were the case, then aFooter->mNumIDs could be incorrectly large, causing us to "run out of memory".

(In reply to comment #3)
> Maybe you need to toss a std::nothrow in there, like in bug 290284?

I think this would wallpaper over the problem.
(In reply to comment #4)
> (In reply to comment #2)
> > Why is our use of operator new throw an exception?
> 
> On Windows, the default implementation of operator new throws an exception on
> out of memory. In the past couple of days, I have been seeing other crashes in
> fastload stuff. I wonder if we're reading either invalid or old fastload files
> (by which I mean files that should have been invalidated, but weren't for
> whatever reason).

I doubt it.

> 
> If this were the case, then aFooter->mNumIDs could be incorrectly large,
> causing us to "run out of memory".

Sounds like this is oldschool corruption. I think the problem is that readfooter is called before we checksum.


I still can't believe new[] throws on windows, means we will fail in random places.
(In reply to comment #5)
> 
> I still can't believe new[] throws on windows, means we will fail in random
> places.

For example,when two firefox.exe is running on different profile, and one of them is executing testcase of bug 345161, another one REALLY fails in random places.
I've been thinking about this over the weekend. It's a nasty crash, especially if it happens this frequently. It means that people can't startup firefox without deleting their .mfasl files(or whole profile dir). We checksum the whole file, but checkum happens after the footer is read, leading to bad situations like this.

This is a patch against trunk. 192 one should be pretty much identical.
Attachment #408328 - Flags: review?(brendan)
Comment on attachment 408328 [details] [diff] [review]
checksum before
[Checkin: Comment 10]

>@@ -887,6 +887,15 @@ nsFastLoadFileReader::Open()
>     if (NS_FAILED(rv))
>         return rv;
> 
>+    PRUint32 checksum;
>+    rv = ComputeChecksum(&checksum);
>+    if (NS_FAILED(rv))
>+        return rv;
>+    
>+    if (checksum != mHeader.mChecksum) {
>+        return NS_ERROR_FAILURE;
>+    }

Don't overbrace.

What, if anything, mitigated this or prevent it for all the years before the recent fastload changes? Did we verify the checksum (in XUL or XPC code) before calling Open, and then just recently we transposed the order of those two operations?

/be
Attachment #408328 - Flags: review?(brendan) → review+
> What, if anything, mitigated this or prevent it for all the years before the
> recent fastload changes? Did we verify the checksum (in XUL or XPC code) before
> calling Open, and then just recently we transposed the order of those two
> operations?

The code that is crashing is without my changes. What mitigated it was new[] returning null instead of throwing an exception in VC7
http://hg.mozilla.org/mozilla-central/rev/236673fa1570
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Note that afther this patch, nsXULPrototypeCache::StartFastLoad(nsIURI* aURI) contains a redundant check:
   780         if (NS_SUCCEEDED(rv)) {
   781             if (NS_SUCCEEDED(rv)) {
Comment on attachment 408328 [details] [diff] [review]
checksum before
[Checkin: Comment 10]

This deals with a pretty critical bug where the .mfasl footer gets corrupted, resulting a firefox that wont be able to startup.
Attachment #408328 - Flags: approval1.9.1.5?
re comment 9:
>  What mitigated it was new[] returning null instead of throwing an exception in VC7

we are looking for explainations of why we see such a difference in crash rates between 3.0.x and 3.5.x.   did we make the vc7 switch in 3.5?  could this change in compiler behavior result in changes in crash behavior elsewhere, or might it be isolated to just the situation here in this bug?
chofmann, I see new-throwing crashes in 3.0.14 too:
bp-1bc2e5a3-0f21-4114-b293-230172091028
[@ RaiseException | _CxxThrowException] somehow went from topcrash #4 in 3.0.14 to topcrash #28 in 3.5.3!
so, msvc6 (and earlier, although very few mozilla devs remember earlier) returned null for new failure.

http://binglongx.spaces.live.com/blog/cns!142CBF6D49079DE8!210.entry

my poor memory indicated that msvc7 had a way to specify the msvc6 behavior, but i can't find it.

msvc8 and later didn't offer that at all (you have to use nothrow everywhere).

this has been a problem everywhere for a while. but dealing with it is something we (quasi) decided (by inaction) not to do for a while.
Before we take this on 1.9.1, can we get it landed on 1.9.2, preferably in a beta?

Also, comment 7 says the 1.9.2 patch should be "pretty much identical" to the trunk patch. Are we sure that's true for 1.9.1? Taras, have you tested and built with this patch on 1.9.1?
Flags: blocking1.9.2?
Whiteboard: [crashkill]
(In reply to comment #16) 
> Also, comment 7 says the 1.9.2 patch should be "pretty much identical" to the
> trunk patch. Are we sure that's true for 1.9.1? Taras, have you tested and
> built with this patch on 1.9.1?

I have only tested this on trunk
Flags: blocking1.9.2? → blocking1.9.2+
Whiteboard: [crashkill] → [crashkill][needs 1.9.2 landing]
Whiteboard: [crashkill][needs 1.9.2 landing] → [crashkill][needs 1.9.2 landing, needs 1.9.2 patch]
Attachment #408328 - Flags: approval1.9.1.6?
Comment on attachment 408328 [details] [diff] [review]
checksum before
[Checkin: Comment 10]

Please request approval after you've tested the patch on 1.9.1. If it turns out to be a different patch (excluding offsets, of course), please attach a 1.9.1-specific patch and request approval for it.
same as previous patch by this applies cleanly on 192
Keywords: checkin-needed
Whiteboard: [crashkill][needs 1.9.2 landing, needs 1.9.2 patch] → [crashkill][needs 1.9.2 landing]
Comment on attachment 410853 [details] [diff] [review]
192 patch
[Checkin: Comment 20]


http://hg.mozilla.org/releases/mozilla-1.9.2/rev/839555d1e6fe
Attachment #410853 - Attachment description: 192 patch → 192 patch [Checkin: Comment 20]
Attachment #408328 - Attachment description: checksum before → checksum before [Checkin: Comment 10]
Keywords: checkin-needed
Whiteboard: [crashkill][needs 1.9.2 landing] → [crashkill]
Target Milestone: --- → mozilla1.9.3a1
Summary: crash [@ RaiseException ] nsFastLoadFileReader::ReadFooter → crash [@ operator new(unsigned int) | nsFastLoadFileReader::ReadFooter(nsFastLoadFileReader::nsFastLoadFooter*)]
Crash Signature: [@ operator new(unsigned int) | nsFastLoadFileReader::ReadFooter(nsFastLoadFileReader::nsFastLoadFooter*)]
You need to log in before you can comment on or make changes to this bug.