Closed Bug 723053 Opened 12 years ago Closed 12 years ago

Crash @ mozilla::image::RasterImage::IsDecodeFinished

Categories

(Core :: Graphics: ImageLib, defect)

12 Branch
x86_64
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla13
Tracking Status
firefox10 --- unaffected
firefox11 --- unaffected
firefox12 + fixed
firefox13 + fixed
firefox-esr10 --- unaffected
status1.9.2 --- unaffected

People

(Reporter: scoobidiver, Unassigned)

References

Details

(Keywords: crash, regression, Whiteboard: [sg:critical] debug-only testcase in bug 723254)

Crash Data

It first appeared in 12.a1/20120128.
The regression range might be (low volume):
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c07595bee6cf&tochange=e99e0dc97746
It's likely caused by bug 715308.

Signature 	mozilla::image::RasterImage::IsDecodeFinished() More Reports Search
UUID	b03d9058-5ebb-4424-a752-4a2e52120131
Date Processed	2012-01-31 22:28:21
Uptime	119
Install Age	4.7 hours since version was first installed.
Install Time	2012-01-31 17:45:15
Product	Firefox
Version	12.0a1
Build ID	20120131031150
Release Channel	nightly
OS	Windows NT
OS Version	6.1.7601 Service Pack 1
Build Architecture	amd64
Build Architecture Info	family 6 model 23 stepping 7
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0x310071
App Notes 	
AdapterVendorID: 0x1002, AdapterDeviceID: 0x6719, AdapterSubsysID: 03ba1043, AdapterDriverVersion: 8.821.0.0
D2D? D2D+
DWrite? DWrite+
D3D10 Layers? D3D10 Layers+
EMCheckCompatibility	True

Frame 	Module 	Signature [Expand] 	Source
0 	xul.dll 	mozilla::image::RasterImage::IsDecodeFinished 	image/src/RasterImage.cpp:2734
1 	xul.dll 	mozilla::image::RasterImage::DecodeWorker::Run 	
2 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:657
3 	xul.dll 	MessageLoop::DoWork 	ipc/chromium/src/base/message_loop.cc:412
4 	xul.dll 	xul.dll@0xef71f 	
5 	xul.dll 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:110
6 	xul.dll 	xul.dll@0x3a394b 	
7 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:201
8 	nspr4.dll 	PR_GetThreadPrivate 	nsprpub/pr/src/threads/prtpd.c:232
9 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:175
10 	msvcr90.dll 	getenv_helper_nolock 	
11 	xul.dll 	MessageLoop::current 	ipc/chromium/src/base/message_loop.cc:85
12 	xul.dll 	nsBaseAppShell::Run 	widget/xpwidgets/nsBaseAppShell.cpp:189
13 	xul.dll 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:220
14 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3537
...

More reports at:
https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3Aimage%3A%3ARasterImage%3A%3AIsDecodeFinished%28%29
marked core-security at request of jlebar
This may be a use-after-free error.  It's crashing on one of these two lines:

if (mDecoder->IsSizeDecode())
  if (mHasSize)

But it's not a null-pointer exception.
We're going to back out bug 715308.
It seems I just reported this bug with a testcase, see bug 723254. Is this the same? It's a use-after-free problem.
Whiteboard: [sg:critical]
Blocks: 723254
Woohoo, got a stack!  (Doesn't happen consistently in Valgrind, but often enough.)

==29284== Invalid read of size 8
==29284==    at 0x67776C3: mozilla::image::RasterImage::DecodeWorker::Run() (RasterImage.cpp:2932)
==29284==    by 0x73DF8D3: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:657)
==29284==    by 0x739E46B: NS_ProcessNextEvent_P(nsIThread*, bool) (nsThreadUtils.cpp:245)
==29284==    by 0x72EB2D9: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (MessagePump.cpp:110)
==29284==    by 0x7409D49: MessageLoop::RunInternal() (message_loop.cc:208)
==29284==    by 0x7409D71: MessageLoop::Run() (message_loop.cc:201)
==29284==    by 0x7210B42: nsBaseAppShell::Run() (nsBaseAppShell.cpp:189)
==29284==    by 0x70613AF: nsAppStartup::Run() (nsAppStartup.cpp:220)
==29284==    by 0x65FE1BE: XRE_main (nsAppRunner.cpp:3537)
==29284==    by 0x401808: main (nsBrowserApp.cpp:205)
==29284==  Address 0x1b4123e0 is 208 bytes inside a block of size 296 free'd
==29284==    at 0x402A674: free (vg_replace_malloc.c:427)
==29284==    by 0x677426D: mozilla::image::RasterImage::Release() (in /home/jlebar/code/moz/ff-git/objdir-nightly/toolkit/library/libxul.so)
==29284==    by 0x678C7C8: imgRequest::~imgRequest() (nsAutoPtr.h:907)
==29284==    by 0x678C865: imgRequest::~imgRequest() (imgRequest.cpp:136)
==29284==    by 0x6788381: imgRequest::Release() (in /home/jlebar/code/moz/ff-git/objdir-nightly/toolkit/library/libxul.so)
==29284==    by 0x67929FC: mozilla::image::nsJPEGDecoder::~nsJPEGDecoder() (nsJPEGDecoder.cpp:155)
==29284==    by 0x67749D6: mozilla::image::Decoder::Release() (in /home/jlebar/code/moz/ff-git/objdir-nightly/toolkit/library/libxul.so)
==29284==    by 0x6777548: mozilla::image::RasterImage::DecodeWorker::DecodeSomeOfImage(mozilla::image::RasterImage*, mozilla::image::RasterImage::DecodeWorker::DecodeType) (RasterImage.cpp:2979)
==29284==    by 0x67776C2: mozilla::image::RasterImage::DecodeWorker::Run() (RasterImage.cpp:2928)
==29284==    by 0x73DF8D3: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:657)
==29284==    by 0x739E46B: NS_ProcessNextEvent_P(nsIThread*, bool) (nsThreadUtils.cpp:245)
==29284==    by 0x72EB2D9: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (MessagePump.cpp:110)
==29284==
It looks like releasing the decoder in DecodeWorker::DecodeSomeOfImage can release the image itself.

If we hold a strong ref to the image in DecodeWorker::Run, that may fix all our problems.
Whiteboard: [sg:critical] → [sg:critical] debug-only testcase in bug 723254
I've backed out the offending code from m-i.  I'll back it out from Aurora once the tree re-opens.
comment 9 is https://hg.mozilla.org/integration/mozilla-inbound/rev/350ba395c507

Does that mean bug 715308 should be reopened, and this one fixed? And according to the check-in comment, bug 721510 was backed out and should be reopened too?
Yes, but we don't modify bugs' status until the changes have landed on m-c.
This should be fixed now that we've backed out bug 715308 on m-c and Aurora:

https://hg.mozilla.org/mozilla-central/rev/350ba395c507
https://hg.mozilla.org/releases/mozilla-aurora/rev/63278b4cbe87
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla13
Whiteboard: [sg:critical] debug-only testcase in bug 723254 → [sg:critical][qa+] debug-only testcase in bug 723254
Group: core-security
Just to avoid confusion, where does the testcase, if any, for this live?
Verified using the testcase in bug 723254 with debug nightly trunk because I can read the whiteboard.
Status: RESOLVED → VERIFIED
Can any of you guys help me with some guidelines for verifying this bug on non-debug builds?
(In reply to Ioana Budnar [QA] from comment #15)
> Can any of you guys help me with some guidelines for verifying this bug on
> non-debug builds?

I think there's little you can do on non-debug builds here. Comment 6 confirms that it shows up in Valgrind but even for that the build needs support. I'd try to verify in the debug build of the respective opt-version you're trying to check.
Ionana, debug builds are available on ftp.mozilla.org with the rest of the Firefox builds.
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #18)
> Ioana, you should be able to verify this with:
> 
> Firefox 13:
> ftp://ftp.mozilla.org/pub/firefox/nightly/2012-05-22-mozilla-beta-debug/
> Firefox 12:
> ftp://ftp.mozilla.org/pub/firefox/nightly/2012/03/2012-03-30-mozilla-beta-
> debug/

No crashes using the testcase from bug 723254 on Win 7/64, Ubuntu 12.04, Mac OS X 10.6. Verified fixed on FF 12, 13.
Whiteboard: [sg:critical][qa+] debug-only testcase in bug 723254 → [sg:critical][qa!] debug-only testcase in bug 723254
Paul, did you verify that you saw the crashes first in pre-fix builds before checking them in post-fix builds?
No, in post-fix builds only.
(In reply to Paul Silaghi [QA] from comment #21)
> No, in post-fix builds only.

Paul, please confirm that you are able to reproduce the crash in affected builds as well. You need to make sure you can reproduce the bug before you can verify the fix.

Thanks
You're right Anthony, I should've done that, I always do.
The thing is I can't reproduce the crash on any nightly debug build between 2012-01-28 and 2012-02-02 both on Win 7 and Ubuntu. 
Al, do you remember if you could reproduce this and with what build ?
Thanks
Reverting your verifications Paul until we can find a reproducible case to trust.
Whiteboard: [sg:critical][qa!] debug-only testcase in bug 723254 → [sg:critical][qa+] debug-only testcase in bug 723254
(In reply to Paul Silaghi [QA] from comment #23)

> Al, do you remember if you could reproduce this and with what build ?

I made my own debug...
I've been unable to get a debug nightly working from before 2012-02-04. I've tried several builds and bc has been trying to assist me for a few hours. I keep getting side-by-side errors even though I have the VC2005 and VC2010 libraries installed.

Can someone else help verify this fixed for Firefox 13?
Status: VERIFIED → RESOLVED
Closed: 12 years ago12 years ago
According to comment 14, this is verified fixed on trunk. Any reason why you did remove the VERIFIED here? The verification discussion after that comment is only about branches.
(In reply to Christian Holler (:decoder) from comment #27)
> According to comment 14, this is verified fixed on trunk. Any reason why you
> did remove the VERIFIED here? The verification discussion after that comment
> is only about branches.
According to comment 24, nothing has been verified in the trunk (Fx 13) and the Beta channel (Fx 12) because there are no reliable STR.
Further to that, Al's verification in comment 14 was performed on "debug nightly trunk" on May 3rd. Assuming he used the May 3rd Nightly, that would be Firefox 15.

I guess the question here is do we mark verified fixed because it's been verified on Nightly, regardless of Target Milestone?
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #29)
> Further to that, Al's verification in comment 14 was performed on "debug
> nightly trunk" on May 3rd. Assuming he used the May 3rd Nightly, that would
> be Firefox 15.
> 
> I guess the question here is do we mark verified fixed because it's been
> verified on Nightly, regardless of Target Milestone?

I'd agree al's verification would suffice for now, since the directions call for debug bits.  we certainly should keep an eye on crash stats to see if this bug regresses.
Scoobidiver, since you filed this bug I will leave the final call to you.

The fix landed when Firefox 13 was in Nightly. If this wasn't fixed, this would have still reproduced in Firefox 15 by nature of riding the train.

Given Al's verification in Firefox 15 Nightly and the fact that I don't seen any recent instances of this signature on crash-stats, can we call this verified?
Whiteboard: [sg:critical][qa+] debug-only testcase in bug 723254 → [sg:critical] debug-only testcase in bug 723254
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #29)
> Further to that, Al's verification in comment 14 was performed on "debug
> nightly trunk" on May 3rd. Assuming he used the May 3rd Nightly, that would
> be Firefox 15.

My understanding of VERIFIED FIXED is that it means the bug has been verified fixed on trunk (which is what abillings did) and thats how many other people use this as well) This does not say anything about the branches like Fx12 or 13, we have branch verified flags for that.
(In reply to Christian Holler (:decoder) from comment #33)
> My understanding of VERIFIED FIXED is that it means the bug has been
> verified fixed on trunk (which is what abillings did) and thats how many
> other people use this as well
The target milestone indicates the trunk version which is 13.0. A Billing verified the fix based on a testcase, not on crash stats.
(In reply to Scoobidiver from comment #32)
> Based on crash stats, I can't say it's fixed in 12.0 or 13.0 (happened in
> 13.0b2 and 13.0b3):
> https://crash-stats.mozilla.com/report/
> list?product=Firefox&query_search=signature&query_type=contains&query=mozilla
> %3A%3Aimage%3A%3ARasterImage%3A%3AIsDecodeFinished&reason_type=contains&range
> _value=4&range_unit=weeks&do_query=1&signature=mozilla%3A%3Aimage%3A%3ARaster
> Image%3A%3AIsDecodeFinished%28%29

Can someone from Engineering look into this? I would suspect that these should no longer be existent on Firefox 12/13 given the status of this bug. On the other hand, there are only 21 crashes in the last month, which seems pretty low in the grand scheme of things.

Is it possible these 21 crashes are frankenbuilds?
What was the volume like when the issue was first reported?
Is it possible these crashes are the same signature but a different problem?
Should we reopen this bug or file a new one for investigation?
(In reply to Scoobidiver from comment #34)
> (In reply to Christian Holler (:decoder) from comment #33)
> > My understanding of VERIFIED FIXED is that it means the bug has been
> > verified fixed on trunk (which is what abillings did) and thats how many
> > other people use this as well
> The target milestone indicates the trunk version which is 13.0. A Billing
> verified the fix based on a testcase, not on crash stats.

I did not notice that this was a non-trunk logged issue. :-)
The problem here seems to me that bug 723053 (a security bug which affects trunk) is duplicate to this bug. So this is either a trunk issue, or the duplicate status is incorrect. As we cannot mark VERIFIED on trunk in the duplicate bug, we must do it here then, or remove the duplication status. Thoughts on that?
(In reply to Christian Holler (:decoder) from comment #37)
> 
> Thoughts on that?

I need a beer?
(In reply to Al Billings [:abillings] from comment #38)
> I need a beer?
See bug 700000.
You need to log in before you can comment on or make changes to this bug.