Closed Bug 505474 Opened 10 years ago Closed 10 years ago

New crash in talos [@ nsGIFDecoder2::Close()]

Categories

(Core :: ImageLib, defect)

x86
Windows NT
defect
Not set

Tracking

()

RESOLVED FIXED
Tracking Status
status1.9.2 --- beta1-fixed

People

(Reporter: benjamin, Assigned: joe)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Talos has been orange today more than usual with the following crash:

 0  mozcrt19.dll!memset [memset.asm : 134 + 0x0]
 1  xul.dll + 0x3f6b49
 2  xul.dll!nsGIFDecoder2::Close() [nsGIFDecoder2.cpp:eac913463798 : 163 + 0x8]
 3  xul.dll!nsGIFDecoder2::Release() [nsGIFDecoder2.cpp:eac913463798 : 113 + 0x1f]
 4  xul.dll!imgRequest::OnStopRequest(nsIRequest *,nsISupports *,unsigned int) [imgRequest.cpp:eac913463798 : 847 + 0x7]
 5  xul.dll!ProxyListener::OnStopRequest(nsIRequest *,nsISupports *,unsigned int) [imgLoader.cpp:eac913463798 : 1662 + 0x18]
 6  xul.dll!nsStreamListenerTee::OnStopRequest(nsIRequest *,nsISupports *,unsigned int) [nsStreamListenerTee.cpp:eac913463798 : 65 + 0x18]
 7  xul.dll!nsHttpChannel::OnStopRequest(nsIRequest *,nsISupports *,unsigned int) [nsHttpChannel.cpp:eac913463798 : 5204 + 0x28]
 8  xul.dll!nsInputStreamPump::OnStateStop() [nsInputStreamPump.cpp:eac913463798 : 576 + 0x1b]
 9  xul.dll!nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream *) [nsInputStreamPump.cpp:eac913463798 : 401 + 0x7]
10  xul.dll!nsInputStreamReadyEvent::Run() [nsStreamUtils.cpp:eac913463798 : 190 + 0x6]
11  xul.dll!nsThread::ProcessNextEvent(int,int *) [nsThread.cpp:eac913463798 : 527 + 0xf]
12  xul.dll!nsBaseAppShell::Run() [nsBaseAppShell.cpp:eac913463798 : 170 + 0x23]
13  xul.dll!nsAppStartup::Run() [nsAppStartup.cpp:eac913463798 : 193 + 0xc]
14  nspr4.dll!PR_GetEnv + 0xa
15  firefox.exe!wmain [nsWindowsWMain.cpp:eac913463798 : 110 + 0xe5]

Logs:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1248183057.1248190401.20688.gz&fulltext=1
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1248182240.1248189115.6026.gz&fulltext=1
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1248175004.1248181755.16623.gz&fulltext=1

I've only seen it on Windows...
Flags: blocking1.9.2+
Whiteboard: [orange]
This is actually a consistent orange on the fast talos boxes on Windows. Investigating...
Summary: New "random" crash in talos [@ nsGIFDecoder2::Close()] → New crash in talos [@ nsGIFDecoder2::Close()]
which patch caused this?
(In reply to comment #3)
> which patch caused this?

Looks like http://hg.mozilla.org/mozilla-central/rev/b94bc4be53ca.
Attached patch fixSplinter Review
Before bug 753 landed, we implicitly tracked whether we'd called EndImageFrame() by setting mImageFrame to null inside EndImageFrame(). Once we stopped doing that, the state machine tried writing to memory that had already been freed. 

(This memory is actually freed once we error out on the invalid GIF, but the internal state-tracking mechanics made it so we never wrote to that memory even though we have stale pointers to it. Unfortunately when we entered EndImageFrame() a second time, that state tracking was invalid and we attempted to write to the stale pointer.)

This patch introduces a tracking variable, mCurrentFrame, which is set to the frame of the currently-decoding frame when we're in the middle of decoding it, and -1 otherwise. This ensures we don't enter EndImageFrame() a second time, restoring the functionality we had prior to bug 753.
Attachment #389854 - Flags: review?(vladimir)
(In reply to comment #5)
> Created an attachment (id=389854) [details]
> fix
> 
> Before bug 753 landed, we implicitly tracked whether we'd called
> EndImageFrame() by setting mImageFrame to null inside EndImageFrame(). Once we
> stopped doing that, the state machine tried writing to memory that had already
> been freed. 
> 
> (This memory is actually freed once we error out on the invalid GIF, but the
> internal state-tracking mechanics made it so we never wrote to that memory even
> though we have stale pointers to it. Unfortunately when we entered
> EndImageFrame() a second time, that state tracking was invalid and we attempted
> to write to the stale pointer.)
> 
> This patch introduces a tracking variable, mCurrentFrame, which is set to the
> frame of the currently-decoding frame when we're in the middle of decoding it,
> and -1 otherwise. This ensures we don't enter EndImageFrame() a second time,
> restoring the functionality we had prior to bug 753.

It might be good to include some of this rationale into a comment in the code.
I suggested that it be the commit message, but the last paragraph seems like it would be good as a comment, slightly morphed.
http://hg.mozilla.org/mozilla-central/rev/4c3f1074ab37
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
It's important to have a specific testcase for this, rather than relying on the Tp pageset to catch it for us next time.
Flags: in-testsuite?
Mass change: adding fixed1.9.2 keyword

(This bug was identified as a mozilla1.9.2 blocker which was fixed before the mozilla-1.9.2 repository was branched (August 13th, 2009) as per this query: http://is.gd/2ydcb - if this bug is not actually fixed on mozilla1.9.2, please remove the keyword. Apologies for the bugspam)
Keywords: fixed1.9.2
Blocks: 438871
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.