Last Comment Bug 673984 - crash in _purecall | nsImageLoader::DoRedraw(nsRect const*)
: crash in _purecall | nsImageLoader::DoRedraw(nsRect const*)
Status: RESOLVED FIXED
: crash
Product: Core
Classification: Components
Component: ImageLib (show other bugs)
: Trunk
: x86 Windows NT
: -- critical (vote)
: mozilla8
Assigned To: Scott Johnson (:jwir3)
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-07-25 10:52 PDT by Robert Kaiser (not working on stability any more)
Modified: 2011-08-23 13:27 PDT (History)
7 users (show)
Ms2ger: in‑testsuite?
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
b673984 (3.23 KB, patch)
2011-08-11 19:51 PDT, Scott Johnson (:jwir3)
dholbert: review+
dbaron: review+
Details | Diff | Review
b673984 (v2) [r=dbaron,dholbert] (3.36 KB, patch)
2011-08-14 19:17 PDT, Scott Johnson (:jwir3)
jaywir3: review+
Details | Diff | Review
b673984 (v3) [r=dbaron,dholbert] (3.33 KB, patch)
2011-08-14 19:48 PDT, Scott Johnson (:jwir3)
jaywir3: review+
dholbert: checkin+
Details | Diff | Review

Description Robert Kaiser (not working on stability any more) 2011-07-25 10:52:29 PDT
This bug was filed from the Socorro interface and is 
report bp-9503ee31-f847-4fde-8a69-a8d802110725 .
============================================================= 

Stack:

0 	mozcrt19.dll 	_purecall 	obj-firefox/memory/jemalloc/crtsrc/purevirt.c:54
1 	xul.dll 	nsImageLoader::DoRedraw 	layout/base/nsImageLoader.cpp:240
2 	xul.dll 	nsImageLoader::OnStopFrame 	
3 	xul.dll 	mozilla::imagelib::Decoder::PostFrameStop 	modules/libpr0n/src/Decoder.cpp:248
4 	xul.dll 	mozilla::imagelib::Decoder::Finish 	
5 	xul.dll 	mozilla::imagelib::RasterImage::ShutdownDecoder 	modules/libpr0n/src/RasterImage.cpp:2214
6 	xul.dll 	mozilla::imagelib::RasterImage::UnlockImage 	
7 	xul.dll 	imgRequestProxy::UnlockImage 	modules/libpr0n/src/imgRequestProxy.cpp:363
8 	xul.dll 	nsDocument::RemoveImage 	content/base/src/nsDocument.cpp:8216
9 	xul.dll 	nsResetStyleData::Destroy 	layout/style/nsRuleNode.h:110


This started rising significantly on 8.0a1 Nightlies with the 20110723 build IDs and higher. It seems like there was residual crash before in that function, but now it's significant, with over 30 crashes with the build ID from the 23rd so far.
Comment 1 Robert Kaiser (not working on stability any more) 2011-07-25 10:53:05 PDT
https://crash-stats.mozilla.com/report/list?signature=_purecall%20%7C%20nsImageLoader%3A%3ADoRedraw%28nsRect%20const%2A%29 lists more of those crashes, btw.
Comment 2 Jeff Muizelaar [:jrmuizel] 2011-08-08 05:48:11 PDT
Seems like this would be caused by us having a bad frame?
Comment 3 Robert Kaiser (not working on stability any more) 2011-08-11 08:29:09 PDT
Any chance to find a fix? This continues to be a trunk topcrash (#7 in yesterday's data).
Comment 4 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2011-08-11 13:12:13 PDT
Not sure why this just started happening, but the basic problem is this:

PresShell::NotifyDestroyingFrame only calls mPresContext->StopImagesFor(aFrame) when mIgnoreFrameDestruction is false.  That gets set to true during teardown, in particular, at the start of nsFrameManager::Destroy.

We destroy all the image loaders (which have pointers to frames) in nsPresContext::SetShell(null).

However, in PresShell::Destroy there's a bunch of stuff between "FrameManager()->Destroy()" and "mPresContext->SetShell(nsnull)", and this crash is in code in between the two.

We should probably move the code to destroy all image loaders from nsPresContext::SetShell to a new function called by PresShell::SetIgnoreFrameDestruction
Comment 5 Scott Johnson (:jwir3) 2011-08-11 19:51:31 PDT
Created attachment 552580 [details] [diff] [review]
b673984

Patch which I believe fixes this issue by following guidance in comment 4. I am currently unable to reproduce this issue, so any testing assistance would be greatly appreciated.
Comment 6 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2011-08-11 21:00:32 PDT
Comment on attachment 552580 [details] [diff] [review]
b673984

I think it's probably worth putting a null-check of mPresContext around the call to mPresContext->DestroyImageLoaders() (unless you can show why it's not needed).  With that, r=dbaron
Comment 7 Daniel Holbert [:dholbert] 2011-08-11 23:25:28 PDT
Comment on attachment 552580 [details] [diff] [review]
b673984

Looks good to me!  (modulo dbaron's suggestion)
Comment 8 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2011-08-12 11:44:00 PDT
Also, the commit message should describe what you're changing and why, rather than just being the summary of the bug.

If you upload a revised patch addressing those comments (and with the commit message noting reviews), one of us can push it to mozilla-inbound.
Comment 9 Scott Johnson (:jwir3) 2011-08-14 19:17:06 PDT
Created attachment 553084 [details] [diff] [review]
b673984 (v2) [r=dbaron,dholbert]

Added null check as per dbaron's comments. Also modified commit message to be more informative. 

Try tbpl link: http://tbpl.mozilla.org/?tree=Try&rev=92029dec2b1c
Comment 10 Daniel Holbert [:dholbert] 2011-08-14 19:29:33 PDT
(In reply to David Baron [:dbaron] from comment #6)
> I think it's probably worth putting a null-check of mPresContext around the
> call to mPresContext->DestroyImageLoaders()

(In reply to Scott Johnson (:jwir3) from comment #9)
> Added null check as per dbaron's comments.

Looks like your null-check is actually a NS_ABORT_IF_FALSE  -- I think dbaron meant an actual "if" check to keep from crashing in that case, e.g.:
  if (mPresContext) {
    mPresContext->DestroyImageLoaders();
  }

I'm not sure off the top of my head under what conditions it'd be null, but we do bother to null-check mPresContext* up two stack levels from the method you're touching (in PresShell::Destroy), so we're apparently at a point in the code where it could be null.

* http://mxr.mozilla.org/mozilla-central/source/layout/base/nsPresShell.cpp#1916
Comment 11 Scott Johnson (:jwir3) 2011-08-14 19:47:35 PDT
(In reply to Daniel Holbert [:dholbert] from comment #10)
> Looks like your null-check is actually a NS_ABORT_IF_FALSE  -- I think
> dbaron meant an actual "if" check to keep from crashing in that case, e.g.:

Ah, sorry, my mistake. I've corrected this and posted a new patch. Thanks for pointing this out!
Comment 12 Scott Johnson (:jwir3) 2011-08-14 19:48:26 PDT
Created attachment 553089 [details] [diff] [review]
b673984 (v3) [r=dbaron,dholbert]

Incorrect null check fixed.
Comment 13 Daniel Holbert [:dholbert] 2011-08-15 11:47:16 PDT
Landed on mozilla-inbound:
  http://hg.mozilla.org/integration/mozilla-inbound/rev/f9be98f4a042
Comment 15 Sheila Mooney 2011-08-23 13:27:34 PDT
I don't see any of these signatures on 8.0a2 and nothing on 8.0a1 in builds after the 16th and nothing on the trunk. I think we are good. I also think we don't need to track this so removing the nom.

Note You need to log in before you can comment on or make changes to this bug.