Closed
Bug 525326
Opened 15 years ago
Closed 15 years ago
Crashes in gif decoder [@ xul.dll@0x348945][@ xul.dll@0x348864][@nsGIFDecoder2::GifWrite(unsigned char const*, unsigned int)]
Categories
(Core :: Graphics: ImageLib, defect, P1)
Core
Graphics: ImageLib
Tracking
()
RESOLVED
FIXED
mozilla1.9.3a1
People
(Reporter: joe, Assigned: jrmuizel)
References
Details
(5 keywords, Whiteboard: [sg:dos] null deref)
Crash Data
Attachments
(4 files, 1 obsolete file)
673 bytes,
image/gif
|
Details | |
11.33 KB,
image/gif
|
Details | |
1.01 KB,
patch
|
samuel.sidler+old
:
approval1.9.1.5+
samuel.sidler+old
:
approval1.9.0.16+
|
Details | Diff | Splinter Review |
1.49 KB,
patch
|
joe
:
review+
|
Details | Diff | Splinter Review |
We're seeing lots of crashes in the GIF decoder, involving nsGIFDecoder2::ProcessData calling GifWrite(), then a null dereference. Right now, this is showing up in the noted crash site, but this is a Windows-specific, 3.5.4-specific crash site, since the offset in xul.dll will change with every build, and obviously xul.dll has no meaning on OS X or Linux.
Our first priority is figuring out where we're crashing; Jeff Muizelaar and I are working on that. A parallel process that someone else could look into is whether this started in 3.5.4, or whether similar crashes existed elsewhere.
Comment 1•15 years ago
|
||
A lot of the comments for this crash implicate the bookmarks stuff, which makes me think that it's to do with .ICO files. Is there a bandaid that we might be able to issue or way we can figure out if the favicon service might be causing this to fail hard?
Comment 2•15 years ago
|
||
Comment 3•15 years ago
|
||
(Note that there very well could be a Mac- or Linux-specific crash with a different signature that's exactly the same, but it might not be in the top 100.)
Reporter | ||
Comment 4•15 years ago
|
||
It shouldn't have anything to do with .ICO files, but if a site specifies a GIF as its favicon, I think we'll happily load it.
Comment 5•15 years ago
|
||
The first time I see this stack signature on 1.9.1 is in 3.5.4 builds on 10/22. It's not present in 3.5.4pre builds prior to that, which almost makes me think we regressed this in the patch we took for build 3 of 3.5.4... wasn't that a gif bug?
Comment 6•15 years ago
|
||
Comment 7•15 years ago
|
||
I'm thinking of bug 519589.
Comment 8•15 years ago
|
||
I'm a liar. Joe pointed out that I should be searching for other xul.dll crashes in general.
In the first 3.5.4 build, we saw some of these crashes at xul.dll@0x350ca9. e.g.: http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.5.4&query_search=signature&query_type=startswith&query=xul.dll&date=10%2F20%2F2009&range_value=1&range_unit=weeks&do_query=1&signature=xul.dll%400x350ca9
There's not much data for 3.5.4pre builds (because there's not many users) and there's only a few crashes in 7-day periods that crash in xul.dll, but the first time I see this crash on 1.9.1 is bp-7c622a06-1d57-4d34-afd8-540af2091016. That crash happened using the 20091002095927 nightly.
Note: I also discovered that this crash has a second xul.dll signature in 3.5.4: http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.5.4&query_search=signature&query_type=startswith&query=xul.dll&date=&range_value=1&range_unit=weeks&do_query=1&signature=xul.dll%400x348864
Summary: Crashes in gif decoder [@ xul.dll@0x348945] → Crashes in gif decoder [@ xul.dll@0x348945][@ xul.dll@0x348864]
Reporter | ||
Comment 9•15 years ago
|
||
Things I need to fix this bug:
- A line number this crashes on, or at least a general area; or
- a GIF file that can reproduce this.
I believe that Sam has a couple of URLs that look suspicious; if someone on Windows (though I believe it to be a general problem, no use limiting ourselves) wants to talk to Sam, then browse through some of the URLs, trying to crash, that would be a big help.
Comment 10•15 years ago
|
||
Can we get in touch (via email, if provided) from some of the people who wrote comments in the crash stacks asking them to tell us the URLs of the sites in those folders? That might be a way to get GIFs that cause the crash.
Comment 12•15 years ago
|
||
http://twitter.com/beltzner/status/5272436091 - let's see if that helps
Updated•15 years ago
|
Keywords: testcase-wanted
Whiteboard: [sg:dos] null deref
Assignee | ||
Comment 14•15 years ago
|
||
It's pretty weird that we don't get any symbols or line numbers for this location. There's no relevant information in xul.sym for this range, so the problem isn't in the stack unwinder. Interestingly enough the symbol information for GifWrite() is duplicated in xul.sym. I'll check the pdb tomorrow to see if it's also a mess. Perhaps this problem is related to bug 520651.
Assignee | ||
Comment 15•15 years ago
|
||
Having a quick read through of the assembly at that address I'd guess that the crash is near this block at line 1137 in 1.9.1
if (mGIFStruct.images_decoded) {
// Copy global colormap into the palette of current frame
PRUint32 size;
mImageFrame->GetPaletteData(&mColormap, &size);
memcpy(mColormap, mGIFStruct.global_colormap, size);
}
Comment 16•15 years ago
|
||
(In reply to comment #14)
> It's pretty weird that we don't get any symbols or line numbers for this
> location.
17:37 < lars> or do you mean that you're looking to get an explanation from breakpad as to why there are no symbols for that offset into xul.dll? In the past, I've seen ted indicate that it might mean a jump into random memory because other offsets in that dll do have symbols
19:08 <@ted> ss: yeah, that looks like a jump to random memory, basically
19:10 <@ted> looks like GifWrite got inlined into ProcessData there
Assignee | ||
Comment 17•15 years ago
|
||
Here's my suggested annotation of the assembly:
We crash dereferencing eax which is presumably null.
52f78945 8b10 mov edx,dword ptr [eax]
if we assume that eax is mImageFrame then edx will be a pointer to the vtable
52f78947 8b5240 mov edx,dword ptr [edx+40h]
here we load the address of GetPaletteData into edx
52f7894a 8d4c241c lea ecx,[esp+1Ch]
52f7894e 51 push ecx
push the address of 'size' onto the stack
52f7894f 57 push edi
push the address of 'mColormap' onto the stack
52f78950 50 push eax
push 'this' onto the stack
52f78951 ffd2 call edx
call GetPaletteData
52f78953 8b44241c mov eax,dword ptr [esp+1Ch]
put the value of 'size' into eax. esp+1C is stack local variable
52f78957 8b0f mov ecx,dword ptr [edi]
put the value of 'mColormap' onto the stack
52f78959 50 push eax
push the size onto the stack
52f7895a 8d86b8210000 lea eax,[esi+21B8h]
get the address mGIFStruct.global_colormap?
52f78960 50 push eax
push mGIFStruct.global_colormap?
52f78961 51 push ecx
push mColormap
52f78962 e8ef7df3ff call xul!memcpy (52eb0756)
This annotation fits very well with my original hypothesis. So it looks like the problem is that mImageFrame is null.
Assignee | ||
Comment 18•15 years ago
|
||
(In reply to comment #16)
> (In reply to comment #14)
> > It's pretty weird that we don't get any symbols or line numbers for this
> > location.
>
> 17:37 < lars> or do you mean that you're looking to get an explanation from
> breakpad as to why there are no symbols for that offset into xul.dll? In the
> past, I've seen ted indicate that it might mean a jump into random memory
> because other offsets in that dll do have symbols
> 19:08 <@ted> ss: yeah, that looks like a jump to random memory, basically
> 19:10 <@ted> looks like GifWrite got inlined into ProcessData there
From what I can tell, there was no jump into random memory. The code at 0x348945 is part of the GIF decoder and seems to make sense. If you look at the addresses in the .sym file there are no symbols even close to this region...
Assignee | ||
Comment 19•15 years ago
|
||
This area of code saw a lot of churn during the 3.5.4 release:
bug 514776, bug 511689 and bug 519589
Comment 20•15 years ago
|
||
Joe was pointing at bug 511689 on IRC earlier. It's not bug 519589 because that landed on 1.9.1 after our first sighting of this crash.
Assignee | ||
Comment 21•15 years ago
|
||
My next guess is that the problem is related to invalid depth handling.
Assignee | ||
Comment 22•15 years ago
|
||
Looks like my guess may not have been right.
The attached image has two frames, the second one has dimensions of 65535x65535. These dimensions cause BeginImageFrame to fail, after which mImageFrame is null.
Since, mGIFStruct.images_decoded is true we pass into this block:
// Copy global colormap into the palette of current frame
PRUint32 size;
mImageFrame->GetPaletteData(&mColormap, &size);
memcpy(mColormap, mGIFStruct.global_colormap, size);
and proceed to crash. The likely cause of this is bug 511689, but I haven't figured out why.
I should've looked at the url's before making a test case, but for what it's worth, this page: http://gallery.site.hu/u/Snokuk_54/GO/Vegyes_egyebek/smileyk/smileyk/?g2_page=2 has gif that also causes crashes on 3.5.4. I've not investigated why.
Assignee | ||
Comment 23•15 years ago
|
||
Here's the gif from the url that I mentioned earlier that crashes.
That's it for me tonight.
Assignee | ||
Updated•15 years ago
|
blocking1.9.1: --- → ?
Flags: blocking1.9.2?
Flags: blocking1.9.0.16?
Updated•15 years ago
|
blocking1.9.1: ? → .5+
status1.9.1:
--- → wanted
Flags: wanted1.9.0.x+
Flags: blocking1.9.0.16?
Flags: blocking1.9.0.16+
Comment 24•15 years ago
|
||
Yeah, "random memory" was too strong of a statement there. Interesting that we don't have symbols for that area, then. Is the symbol dumper missing anything, or is this just the optimizer confusing the heck out of us? I noticed that it appeared to have inlined all of GifWrite.
Updated•15 years ago
|
blocking1.9.1: .5+ → ---
status1.9.1:
wanted → ---
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P1
Assignee | ||
Comment 25•15 years ago
|
||
Some new information:
- Trunk and 1.9.2 do not seem to crash with the examples I've found. Have we seen any crashes like this with those trees?
- The gif from the wild seems to have a similar problem to the one I created.
Updated•15 years ago
|
blocking1.9.1: --- → .5+
status1.9.1:
--- → wanted
Assignee | ||
Comment 26•15 years ago
|
||
The reason we don't crash on trunk and 1.9.2 is that we don't assume that mImageFrame is valid when copying over palette data.
blocking1.9.1: .5+ → ---
status1.9.1:
wanted → ---
Updated•15 years ago
|
blocking1.9.1: --- → .5+
status1.9.1:
--- → wanted
Assignee | ||
Comment 27•15 years ago
|
||
This patch will fix the problem, but I'm not sure about its risk. We didn't have the check before and were fine, so I don't want to break anything else. Still, this patch is better than what we currently have.
Assignee | ||
Updated•15 years ago
|
Attachment #409368 -
Flags: superreview?
Attachment #409368 -
Flags: review?(joe)
Attachment #409368 -
Flags: review?(alfredkayser)
Assignee | ||
Updated•15 years ago
|
Attachment #409368 -
Flags: superreview?
Comment 28•15 years ago
|
||
Just for reference. This also crashes on OS X: bp-6011f77f-de07-4131-b61d-986502091030
First 5 frames:
0 XUL nsGIFDecoder2::GifWrite modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp:1141
1 XUL nsGIFDecoder2::ProcessData modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp:250
2 XUL ReadDataOut modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp:190
3 XUL nsInputStreamTee::WriteSegmentFun xpcom/io/nsInputStreamTee.cpp:102
4 XUL nsPipeInputStream::ReadSegments xpcom/io/nsPipe3.cpp:799
5 XUL nsGIFDecoder2::WriteFrom modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp:269
Assignee: nobody → jmuizelaar
Severity: normal → critical
Status: NEW → ASSIGNED
OS: Mac OS X → All
Hardware: x86 → All
Summary: Crashes in gif decoder [@ xul.dll@0x348945][@ xul.dll@0x348864] → Crashes in gif decoder [@ xul.dll@0x348945][@ xul.dll@0x348864][@nsGIFDecoder2::GifWrite(unsigned char const*, unsigned int)]
Assignee | ||
Comment 29•15 years ago
|
||
This bug was actually caused by bug 514776 which removed the check for null mImageFrame. I don't know why that check was removed. The patch that I posted earlier should be safe to go in.
Attachment #409368 -
Flags: review?(joe) → review+
Updated•15 years ago
|
Attachment #409368 -
Flags: review?(alfredkayser) → review+
Assignee | ||
Comment 30•15 years ago
|
||
Pushed to 1.9.1 as:
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/544bc11bc532
I don't have cvs access so I'll need a volunteer to push this to 1.9.0 cvs.
Attachment #409368 -
Attachment is obsolete: true
Assignee | ||
Updated•15 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 31•15 years ago
|
||
Comment on attachment 409517 [details] [diff] [review]
Patch including the checkin message
Why did this land on 1.9.1 without approval? Please read http://groups.google.com/group/mozilla.dev.planning/browse_thread/thread/3ca007cf277a5147.
Attachment #409517 -
Flags: approval1.9.1.5?
Attachment #409517 -
Flags: approval1.9.0.16?
Assignee | ||
Comment 32•15 years ago
|
||
(In reply to comment #31)
> (From update of attachment 409517 [details] [diff] [review])
> Why did this land on 1.9.1 without approval? Please read
> http://groups.google.com/group/mozilla.dev.planning/browse_thread/thread/3ca007cf277a5147.
I didn't realize that was required. Should I back it out?
Comment 33•15 years ago
|
||
Comment on attachment 409517 [details] [diff] [review]
Patch including the checkin message
No, don't back it out. The more coverage this gets, the better.
Attachment #409517 -
Flags: approval1.9.1.5?
Attachment #409517 -
Flags: approval1.9.1.5+
Attachment #409517 -
Flags: approval1.9.0.16?
Attachment #409517 -
Flags: approval1.9.0.16+
Comment 34•15 years ago
|
||
(Approved across those two branches.)
Updated•15 years ago
|
Keywords: testcase-wanted → testcase
Comment 35•15 years ago
|
||
Joe/Jeff (or anyone else): Can you please get this landed on 1.9.1 ASAP? We'd like as much coverage as we can get.
Comment 36•15 years ago
|
||
(What am I talking about? It already landed. *sigh*)
Reporter | ||
Comment 37•15 years ago
|
||
Checking in modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp;
/cvsroot/mozilla/modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp,v <-- nsGIFDecoder2.cpp
new revision: 1.108; previous revision: 1.107
Keywords: fixed1.9.0.16
Reporter | ||
Comment 38•15 years ago
|
||
I do not believe there is any reason to leave the gifs that cause crashes as secret attachments, correct? This is just a DoS, and one that has been well-described in checkin comments, at that?
Also, Jeff, can you check in a crash test for this bug (m-c only is ok)?
Finally, I think we should probably file a bug on breakpad, or at least our installation of it, for serving areas of GifWrite without symbol information? Or do we honestly think that that 483-line function was inlined?
Comment 39•15 years ago
|
||
(In reply to comment #38)
> I do not believe there is any reason to leave the gifs that cause crashes as
> secret attachments, correct? This is just a DoS, and one that has been
> well-described in checkin comments, at that?
Correct. Open them up and land them away!
Comment 40•15 years ago
|
||
(In reply to comment #38)
> Finally, I think we should probably file a bug on breakpad, or at least our
> installation of it, for serving areas of GifWrite without symbol information?
> Or do we honestly think that that 483-line function was inlined?
We could file a bug on this yes. I do believe it was inlined, since it's only called in one place. Why wouldn't it be? PGO does some pretty aggressive inlining. Lacking symbols for that case is bad, of course.
Assignee | ||
Comment 41•15 years ago
|
||
Attachment #409684 -
Flags: review?(joe)
Updated•15 years ago
|
Attachment #409517 -
Flags: approval1.9.1.5+
Updated•15 years ago
|
blocking1.9.1: .6+ → .5+
status1.9.1:
.6-fixed → ---
Comment 42•15 years ago
|
||
This needs to land on GECKO1914_20091006_RELBRANCH for 1.9.1.5.
Reporter | ||
Comment 43•15 years ago
|
||
Updated•15 years ago
|
status1.9.1:
--- → .5-fixed
Reporter | ||
Updated•15 years ago
|
Attachment #409684 -
Flags: review?(joe) → review+
Comment 44•15 years ago
|
||
Verified this is fixed on 1.9.1 by making sure both testcases didn't crash 3.5.3, did crash 3.5.4, and didn't crash using Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.5pre) Gecko/20091102 Shiretoko/3.5.5pre
I'll save the keyword for when we verify using official builds.
Assignee | ||
Comment 45•15 years ago
|
||
Test case pushed:
http://hg.mozilla.org/mozilla-central/rev/a954f52e3149
Updated•15 years ago
|
Flags: in-testsuite?
Target Milestone: --- → mozilla1.9.3a1
Reporter | ||
Updated•15 years ago
|
Flags: in-testsuite? → in-testsuite+
Reporter | ||
Updated•15 years ago
|
Attachment #409290 -
Attachment is private: false
Reporter | ||
Updated•15 years ago
|
Attachment #409291 -
Attachment is private: false
Comment 46•15 years ago
|
||
Jeff, when this patch will be checked into 1.9.2?
Assignee | ||
Comment 47•15 years ago
|
||
(In reply to comment #46)
> Jeff, when this patch will be checked into 1.9.2?
1.9.2 is not affected.
Flags: blocking1.9.2+
Comment 48•15 years ago
|
||
can the tests be pushed to 1.9.2?
Comment 49•15 years ago
|
||
Verified for 1.9.1.5 with the official build 1 (Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5).
Verified the crash in 1.9.1.4 and the lack of crash in 1.9.1.3 as well.
Keywords: verified1.9.1
Comment 50•15 years ago
|
||
(In reply to comment #38)
> Finally, I think we should probably file a bug on breakpad, or at least our
> installation of it, for serving areas of GifWrite without symbol information?
> Or do we honestly think that that 483-line function was inlined?
I filed bug 526484 on investigating this.
Assignee | ||
Comment 52•15 years ago
|
||
Comment on attachment 409517 [details] [diff] [review]
Patch including the checkin message
This patch doesn't need 1.9.1.6 approval because it is already in 1.9.1.5
Attachment #409517 -
Flags: approval1.9.1.6+
Comment 53•15 years ago
|
||
Verified again for 1.9.0.16 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.16pre) Gecko/2009111921 GranParadiso/3.0.16pre (.NET CLR 3.5.30729). Behavior is the same (good) behavior as 1.9.0.15.
Keywords: fixed1.9.0.16 → verified1.9.0.16
Updated•13 years ago
|
Crash Signature: [@ xul.dll@0x348945]
[@ xul.dll@0x348864]
[@nsGIFDecoder2::GifWrite(unsigned char const*, unsigned int)]
You need to log in
before you can comment on or make changes to this bug.
Description
•