Last Comment Bug 723053 - Crash @ mozilla::image::RasterImage::IsDecodeFinished
: Crash @ mozilla::image::RasterImage::IsDecodeFinished
Status: RESOLVED FIXED
[sg:critical] debug-only testcase in ...
: crash, regression
Product: Core
Classification: Components
Component: ImageLib (show other bugs)
: 12 Branch
: x86_64 Windows 7
: -- critical (vote)
: mozilla13
Assigned To: Nobody; OK to take it and work on it
:
Mentors:
: 723254 (view as bug list)
Depends on:
Blocks: 715308 723254
  Show dependency treegraph
 
Reported: 2012-02-01 03:34 PST by Scoobidiver (away)
Modified: 2012-06-05 11:46 PDT (History)
10 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
unaffected
unaffected
+
fixed
+
fixed
unaffected
unaffected


Attachments

Description Scoobidiver (away) 2012-02-01 03:34:09 PST
It first appeared in 12.a1/20120128.
The regression range might be (low volume):
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c07595bee6cf&tochange=e99e0dc97746
It's likely caused by bug 715308.

Signature 	mozilla::image::RasterImage::IsDecodeFinished() More Reports Search
UUID	b03d9058-5ebb-4424-a752-4a2e52120131
Date Processed	2012-01-31 22:28:21
Uptime	119
Install Age	4.7 hours since version was first installed.
Install Time	2012-01-31 17:45:15
Product	Firefox
Version	12.0a1
Build ID	20120131031150
Release Channel	nightly
OS	Windows NT
OS Version	6.1.7601 Service Pack 1
Build Architecture	amd64
Build Architecture Info	family 6 model 23 stepping 7
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0x310071
App Notes 	
AdapterVendorID: 0x1002, AdapterDeviceID: 0x6719, AdapterSubsysID: 03ba1043, AdapterDriverVersion: 8.821.0.0
D2D? D2D+
DWrite? DWrite+
D3D10 Layers? D3D10 Layers+
EMCheckCompatibility	True

Frame 	Module 	Signature [Expand] 	Source
0 	xul.dll 	mozilla::image::RasterImage::IsDecodeFinished 	image/src/RasterImage.cpp:2734
1 	xul.dll 	mozilla::image::RasterImage::DecodeWorker::Run 	
2 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:657
3 	xul.dll 	MessageLoop::DoWork 	ipc/chromium/src/base/message_loop.cc:412
4 	xul.dll 	xul.dll@0xef71f 	
5 	xul.dll 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:110
6 	xul.dll 	xul.dll@0x3a394b 	
7 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:201
8 	nspr4.dll 	PR_GetThreadPrivate 	nsprpub/pr/src/threads/prtpd.c:232
9 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:175
10 	msvcr90.dll 	getenv_helper_nolock 	
11 	xul.dll 	MessageLoop::current 	ipc/chromium/src/base/message_loop.cc:85
12 	xul.dll 	nsBaseAppShell::Run 	widget/xpwidgets/nsBaseAppShell.cpp:189
13 	xul.dll 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:220
14 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3537
...

More reports at:
https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3Aimage%3A%3ARasterImage%3A%3AIsDecodeFinished%28%29
Comment 1 Curtis Koenig [:curtisk-use curtis.koenig+bzATgmail.com]] 2012-02-01 08:17:57 PST
marked core-security at request of jlebar
Comment 2 Justin Lebar (not reading bugmail) 2012-02-01 08:28:39 PST
This may be a use-after-free error.  It's crashing on one of these two lines:

if (mDecoder->IsSizeDecode())
  if (mHasSize)

But it's not a null-pointer exception.
Comment 3 Justin Lebar (not reading bugmail) 2012-02-01 11:12:30 PST
We're going to back out bug 715308.
Comment 4 Christian Holler (:decoder) 2012-02-01 12:53:49 PST
It seems I just reported this bug with a testcase, see bug 723254. Is this the same? It's a use-after-free problem.
Comment 5 Daniel Veditz [:dveditz] 2012-02-01 15:38:07 PST
*** Bug 723254 has been marked as a duplicate of this bug. ***
Comment 6 Justin Lebar (not reading bugmail) 2012-02-01 16:14:33 PST
Woohoo, got a stack!  (Doesn't happen consistently in Valgrind, but often enough.)

==29284== Invalid read of size 8
==29284==    at 0x67776C3: mozilla::image::RasterImage::DecodeWorker::Run() (RasterImage.cpp:2932)
==29284==    by 0x73DF8D3: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:657)
==29284==    by 0x739E46B: NS_ProcessNextEvent_P(nsIThread*, bool) (nsThreadUtils.cpp:245)
==29284==    by 0x72EB2D9: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (MessagePump.cpp:110)
==29284==    by 0x7409D49: MessageLoop::RunInternal() (message_loop.cc:208)
==29284==    by 0x7409D71: MessageLoop::Run() (message_loop.cc:201)
==29284==    by 0x7210B42: nsBaseAppShell::Run() (nsBaseAppShell.cpp:189)
==29284==    by 0x70613AF: nsAppStartup::Run() (nsAppStartup.cpp:220)
==29284==    by 0x65FE1BE: XRE_main (nsAppRunner.cpp:3537)
==29284==    by 0x401808: main (nsBrowserApp.cpp:205)
==29284==  Address 0x1b4123e0 is 208 bytes inside a block of size 296 free'd
==29284==    at 0x402A674: free (vg_replace_malloc.c:427)
==29284==    by 0x677426D: mozilla::image::RasterImage::Release() (in /home/jlebar/code/moz/ff-git/objdir-nightly/toolkit/library/libxul.so)
==29284==    by 0x678C7C8: imgRequest::~imgRequest() (nsAutoPtr.h:907)
==29284==    by 0x678C865: imgRequest::~imgRequest() (imgRequest.cpp:136)
==29284==    by 0x6788381: imgRequest::Release() (in /home/jlebar/code/moz/ff-git/objdir-nightly/toolkit/library/libxul.so)
==29284==    by 0x67929FC: mozilla::image::nsJPEGDecoder::~nsJPEGDecoder() (nsJPEGDecoder.cpp:155)
==29284==    by 0x67749D6: mozilla::image::Decoder::Release() (in /home/jlebar/code/moz/ff-git/objdir-nightly/toolkit/library/libxul.so)
==29284==    by 0x6777548: mozilla::image::RasterImage::DecodeWorker::DecodeSomeOfImage(mozilla::image::RasterImage*, mozilla::image::RasterImage::DecodeWorker::DecodeType) (RasterImage.cpp:2979)
==29284==    by 0x67776C2: mozilla::image::RasterImage::DecodeWorker::Run() (RasterImage.cpp:2928)
==29284==    by 0x73DF8D3: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:657)
==29284==    by 0x739E46B: NS_ProcessNextEvent_P(nsIThread*, bool) (nsThreadUtils.cpp:245)
==29284==    by 0x72EB2D9: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (MessagePump.cpp:110)
==29284==
Comment 7 Justin Lebar (not reading bugmail) 2012-02-01 16:19:00 PST
It looks like releasing the decoder in DecodeWorker::DecodeSomeOfImage can release the image itself.

If we hold a strong ref to the image in DecodeWorker::Run, that may fix all our problems.
Comment 8 Justin Lebar (not reading bugmail) 2012-02-01 16:26:31 PST
https://tbpl.mozilla.org/?tree=Try&rev=3c33563f8f9c
Comment 9 Justin Lebar (not reading bugmail) 2012-02-02 13:29:22 PST
I've backed out the offending code from m-i.  I'll back it out from Aurora once the tree re-opens.
Comment 10 Daniel Veditz [:dveditz] 2012-02-02 13:43:54 PST
comment 9 is https://hg.mozilla.org/integration/mozilla-inbound/rev/350ba395c507

Does that mean bug 715308 should be reopened, and this one fixed? And according to the check-in comment, bug 721510 was backed out and should be reopened too?
Comment 11 Justin Lebar (not reading bugmail) 2012-02-02 13:45:54 PST
Yes, but we don't modify bugs' status until the changes have landed on m-c.
Comment 12 Justin Lebar (not reading bugmail) 2012-02-04 09:54:30 PST
This should be fixed now that we've backed out bug 715308 on m-c and Aurora:

https://hg.mozilla.org/mozilla-central/rev/350ba395c507
https://hg.mozilla.org/releases/mozilla-aurora/rev/63278b4cbe87
Comment 13 Al Billings [:abillings] 2012-05-02 17:56:08 PDT
Just to avoid confusion, where does the testcase, if any, for this live?
Comment 14 Al Billings [:abillings] 2012-05-03 11:28:59 PDT
Verified using the testcase in bug 723254 with debug nightly trunk because I can read the whiteboard.
Comment 15 Ioana (away) 2012-05-11 05:46:09 PDT
Can any of you guys help me with some guidelines for verifying this bug on non-debug builds?
Comment 16 Christian Holler (:decoder) 2012-05-11 06:01:13 PDT
(In reply to Ioana Budnar [QA] from comment #15)
> Can any of you guys help me with some guidelines for verifying this bug on
> non-debug builds?

I think there's little you can do on non-debug builds here. Comment 6 confirms that it shows up in Valgrind but even for that the build needs support. I'd try to verify in the debug build of the respective opt-version you're trying to check.
Comment 17 Al Billings [:abillings] 2012-05-11 17:14:22 PDT
Ionana, debug builds are available on ftp.mozilla.org with the rest of the Firefox builds.
Comment 18 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-05-22 15:47:41 PDT
Ioana, you should be able to verify this with:

Firefox 13: ftp://ftp.mozilla.org/pub/firefox/nightly/2012-05-22-mozilla-beta-debug/
Firefox 12: ftp://ftp.mozilla.org/pub/firefox/nightly/2012/03/2012-03-30-mozilla-beta-debug/
Comment 19 Paul Silaghi, QA [:pauly] 2012-05-23 08:15:04 PDT
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #18)
> Ioana, you should be able to verify this with:
> 
> Firefox 13:
> ftp://ftp.mozilla.org/pub/firefox/nightly/2012-05-22-mozilla-beta-debug/
> Firefox 12:
> ftp://ftp.mozilla.org/pub/firefox/nightly/2012/03/2012-03-30-mozilla-beta-
> debug/

No crashes using the testcase from bug 723254 on Win 7/64, Ubuntu 12.04, Mac OS X 10.6. Verified fixed on FF 12, 13.
Comment 20 Al Billings [:abillings] 2012-05-23 15:29:14 PDT
Paul, did you verify that you saw the crashes first in pre-fix builds before checking them in post-fix builds?
Comment 21 Paul Silaghi, QA [:pauly] 2012-05-24 04:17:52 PDT
No, in post-fix builds only.
Comment 22 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-05-24 10:11:40 PDT
(In reply to Paul Silaghi [QA] from comment #21)
> No, in post-fix builds only.

Paul, please confirm that you are able to reproduce the crash in affected builds as well. You need to make sure you can reproduce the bug before you can verify the fix.

Thanks
Comment 23 Paul Silaghi, QA [:pauly] 2012-05-25 01:46:30 PDT
You're right Anthony, I should've done that, I always do.
The thing is I can't reproduce the crash on any nightly debug build between 2012-01-28 and 2012-02-02 both on Win 7 and Ubuntu. 
Al, do you remember if you could reproduce this and with what build ?
Thanks
Comment 24 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-05-25 09:40:08 PDT
Reverting your verifications Paul until we can find a reproducible case to trust.
Comment 25 Al Billings [:abillings] 2012-05-25 11:05:04 PDT
(In reply to Paul Silaghi [QA] from comment #23)

> Al, do you remember if you could reproduce this and with what build ?

I made my own debug...
Comment 26 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-06-01 16:18:18 PDT
I've been unable to get a debug nightly working from before 2012-02-04. I've tried several builds and bc has been trying to assist me for a few hours. I keep getting side-by-side errors even though I have the VC2005 and VC2010 libraries installed.

Can someone else help verify this fixed for Firefox 13?
Comment 27 Christian Holler (:decoder) 2012-06-02 03:41:38 PDT
According to comment 14, this is verified fixed on trunk. Any reason why you did remove the VERIFIED here? The verification discussion after that comment is only about branches.
Comment 28 Scoobidiver (away) 2012-06-02 04:49:36 PDT
(In reply to Christian Holler (:decoder) from comment #27)
> According to comment 14, this is verified fixed on trunk. Any reason why you
> did remove the VERIFIED here? The verification discussion after that comment
> is only about branches.
According to comment 24, nothing has been verified in the trunk (Fx 13) and the Beta channel (Fx 12) because there are no reliable STR.
Comment 29 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-06-02 07:41:31 PDT
Further to that, Al's verification in comment 14 was performed on "debug nightly trunk" on May 3rd. Assuming he used the May 3rd Nightly, that would be Firefox 15.

I guess the question here is do we mark verified fixed because it's been verified on Nightly, regardless of Target Milestone?
Comment 30 Tony Chung [:tchung] 2012-06-02 09:27:07 PDT
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #29)
> Further to that, Al's verification in comment 14 was performed on "debug
> nightly trunk" on May 3rd. Assuming he used the May 3rd Nightly, that would
> be Firefox 15.
> 
> I guess the question here is do we mark verified fixed because it's been
> verified on Nightly, regardless of Target Milestone?

I'd agree al's verification would suffice for now, since the directions call for debug bits.  we certainly should keep an eye on crash stats to see if this bug regresses.
Comment 31 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-06-02 17:26:27 PDT
Scoobidiver, since you filed this bug I will leave the final call to you.

The fix landed when Firefox 13 was in Nightly. If this wasn't fixed, this would have still reproduced in Firefox 15 by nature of riding the train.

Given Al's verification in Firefox 15 Nightly and the fact that I don't seen any recent instances of this signature on crash-stats, can we call this verified?
Comment 33 Christian Holler (:decoder) 2012-06-03 08:09:50 PDT
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #29)
> Further to that, Al's verification in comment 14 was performed on "debug
> nightly trunk" on May 3rd. Assuming he used the May 3rd Nightly, that would
> be Firefox 15.

My understanding of VERIFIED FIXED is that it means the bug has been verified fixed on trunk (which is what abillings did) and thats how many other people use this as well) This does not say anything about the branches like Fx12 or 13, we have branch verified flags for that.
Comment 34 Scoobidiver (away) 2012-06-03 09:05:38 PDT
(In reply to Christian Holler (:decoder) from comment #33)
> My understanding of VERIFIED FIXED is that it means the bug has been
> verified fixed on trunk (which is what abillings did) and thats how many
> other people use this as well
The target milestone indicates the trunk version which is 13.0. A Billing verified the fix based on a testcase, not on crash stats.
Comment 35 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-06-03 09:48:12 PDT
(In reply to Scoobidiver from comment #32)
> Based on crash stats, I can't say it's fixed in 12.0 or 13.0 (happened in
> 13.0b2 and 13.0b3):
> https://crash-stats.mozilla.com/report/
> list?product=Firefox&query_search=signature&query_type=contains&query=mozilla
> %3A%3Aimage%3A%3ARasterImage%3A%3AIsDecodeFinished&reason_type=contains&range
> _value=4&range_unit=weeks&do_query=1&signature=mozilla%3A%3Aimage%3A%3ARaster
> Image%3A%3AIsDecodeFinished%28%29

Can someone from Engineering look into this? I would suspect that these should no longer be existent on Firefox 12/13 given the status of this bug. On the other hand, there are only 21 crashes in the last month, which seems pretty low in the grand scheme of things.

Is it possible these 21 crashes are frankenbuilds?
What was the volume like when the issue was first reported?
Is it possible these crashes are the same signature but a different problem?
Should we reopen this bug or file a new one for investigation?
Comment 36 Al Billings [:abillings] 2012-06-04 17:31:08 PDT
(In reply to Scoobidiver from comment #34)
> (In reply to Christian Holler (:decoder) from comment #33)
> > My understanding of VERIFIED FIXED is that it means the bug has been
> > verified fixed on trunk (which is what abillings did) and thats how many
> > other people use this as well
> The target milestone indicates the trunk version which is 13.0. A Billing
> verified the fix based on a testcase, not on crash stats.

I did not notice that this was a non-trunk logged issue. :-)
Comment 37 Christian Holler (:decoder) 2012-06-05 02:57:59 PDT
The problem here seems to me that bug 723053 (a security bug which affects trunk) is duplicate to this bug. So this is either a trunk issue, or the duplicate status is incorrect. As we cannot mark VERIFIED on trunk in the duplicate bug, we must do it here then, or remove the duplication status. Thoughts on that?
Comment 38 Al Billings [:abillings] 2012-06-05 10:21:10 PDT
(In reply to Christian Holler (:decoder) from comment #37)
> 
> Thoughts on that?

I need a beer?
Comment 39 Scoobidiver (away) 2012-06-05 11:46:05 PDT
(In reply to Al Billings [:abillings] from comment #38)
> I need a beer?
See bug 700000.

Note You need to log in before you can comment on or make changes to this bug.