Closed Bug 853774 Opened 7 years ago Closed 6 years ago

crash in imgRequest::GetStatusTracker

Categories

(Core :: ImageLib, defect, critical)

22 Branch
defect
Not set
critical

Tracking

()

RESOLVED DUPLICATE of bug 878037
Tracking Status
firefox21 --- unaffected
firefox22 + verified
firefox23 --- verified
firefox24 --- verified

People

(Reporter: scoobidiver, Assigned: tnikkel)

References

()

Details

(4 keywords)

Crash Data

Attachments

(3 files)

There are a few crashes. It doesn't seem a recent regression.

Signature 	imgRequest::GetStatusTracker() More Reports Search
UUID	ea5fb406-6c05-48a1-8ab7-714c62130322
Date Processed	2013-03-22 00:36:31
Uptime	17
Last Crash	45 seconds before submission
Install Age	2.8 hours since version was first installed.
Install Time	2013-03-21 21:46:03
Product	Firefox
Version	22.0a1
Build ID	20130321090706
Release Channel	nightly
OS	Windows NT
OS Version	6.2.9200
Build Architecture	x86
Build Architecture Info	AuthenticAMD family 16 model 5 stepping 3
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0x30
App Notes 	
AdapterVendorID: 0x1002, AdapterDeviceID: 0x9712, AdapterSubsysID: 2aac103c, AdapterDriverVersion: 8.97.10.6
D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers+ 
Processor Notes 	sp-processor01.phx1.mozilla.com_14425:2008
EMCheckCompatibility	True
Adapter Vendor ID	0x1002
Adapter Device ID	0x9712
Total Virtual Memory	4294836224
Available Virtual Memory	3910184960
System Memory Use Percentage	26
Available Page File	19794317312
Available Physical Memory	4566089728
Accessibility	Active

Frame 	Module 	Signature 	Source
0 	xul.dll 	imgRequest::GetStatusTracker 	image/src/imgRequest.cpp:143
1 	xul.dll 	RequestBehaviour::SetOwner 	image/src/imgRequestProxy.cpp:58
2 	xul.dll 	imgRequestProxy::Init 	image/src/imgRequestProxy.cpp:163
3 	xul.dll 	imgRequestProxy::PerformClone 	image/src/imgRequestProxy.cpp:589
4 	xul.dll 	imgRequestProxy::Clone 	image/src/imgRequestProxy.cpp:568
5 	xul.dll 	nsBulletFrame::DidSetStyleContext 	layout/generic/nsBulletFrame.cpp:132
6 	xul.dll 	nsFrame::Init 	layout/generic/nsFrame.cpp:559
7 	xul.dll 	nsBlockFrame::SetInitialChildList 	layout/generic/nsBlockFrame.cpp:6576
8 	xul.dll 	nsCSSFrameConstructor::ConstructBlock 	layout/base/nsCSSFrameConstructor.cpp:10950
9 	xul.dll 	nsCSSFrameConstructor::ConstructNonScrollableBlock 	layout/base/nsCSSFrameConstructor.cpp:4409
10 	xul.dll 	nsCSSFrameConstructor::ConstructFrameFromItemInternal 	layout/base/nsCSSFrameConstructor.cpp:3532
11 	xul.dll 	nsCSSFrameConstructor::ConstructFramesFromItem 	layout/base/nsCSSFrameConstructor.cpp:5481
12 	xul.dll 	nsCSSFrameConstructor::ConstructFramesFromItemList 	layout/base/nsCSSFrameConstructor.cpp:9812
13 	xul.dll 	nsCSSFrameConstructor::ProcessChildren 	layout/base/nsCSSFrameConstructor.cpp:9954
14 	xul.dll 	nsCSSFrameConstructor::InitAndRestoreFrame 	layout/base/nsCSSFrameConstructor.cpp:4436
15 	xul.dll 	EnumRulesMatching<PseudoElementRuleProcessorData> 	layout/style/nsStyleSet.cpp:634
16 	xul.dll 	xul.dll@0xfe3e0 	
17 		@0x52ae370

More reports at:
https://crash-stats.mozilla.com/report/list?signature=imgRequest%3A%3AGetStatusTracker%28%29
It has spiked since 22.0a1/20130321 before the patch of bug 716140 landed. The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=8156df33b757&tochange=1d6fe70c79c5
Keywords: regression
Version: Trunk → 22 Branch
To Seth for a look...
Assignee: nobody → seth
So assuming the root cause is in imagelib, the following pushes are suspect:

cd08011c06f7	Joe Drew — Bug 851755 - Make nsImageBoxFrame block its document's onload when its image blocks onload. r=tn
3ec52a8caf73	Joe Drew — Bug 852413 - Wait even more for painting to finish before forcing a paint. r=mattwoodrow
4a24d7b3126f	Joe Drew — Bug 852413 - Make test_image_layers.html wait for its image to load. r=mattwoodrow
0b92b896aaa1	Joe Drew — Bug 850441 - Allow BlockOnload/UnblockOnload from both the current and pending request. r=khuey

I'll take a look at those and see if I can narrow it down more.
I looked into this a bit more and I'm rather puzzled. The regression range doesn't seem that useful; I suspect the true culprit isn't in there. At the moment my best guess is that the check around nsBulletFrame:120 is the culprit. This code assumes (implicitly) that if two imgRequestProxys have different URIs, they have different underlying imgRequests. Redirects can mean that that isn't true; two different original URIs may end up mapping to the same resource in the end, which would mean that the imgRequests would be the same even though the URIs differ. If nsBulletFrame::mImageRequest is the last imgRequestProxy holding on to that imgRequest, the imgRequest could be freed when CancelAndForgetObserver is called and we could see the crash observed in this bug report when Clone is called later.

This is just a theory, again, but it's the best I've got. I'll upload a patch based on this idea shortly. I think this _is_ a real bug, even if it isn't actually to blame for this problem, so it's worth fixing regardless.
If this patch passes review I think the best thing we can do here is push it in and see if the crashes stop, since we don't have STR. We'll leave this bug open until we confirm the issue has gone away.
Attachment #734928 - Flags: review?(tnikkel) → review+
(In reply to Seth Fowler [:seth] from comment #6)
> since we don't have STR.
Comments talk about printing to a PDF file.
Did landing this slip through the cracks?
(In reply to Timothy Nikkel (:tn) from comment #8)
> Did landing this slip through the cracks?

Apparently so. Will push this in shortly. Let's see how it goes.
(In reply to Scoobidiver from comment #7)
> (In reply to Seth Fowler [:seth] from comment #6)
> > since we don't have STR.
> Comments talk about printing to a PDF file.

I confess I didn't realize there were comments from reporters until now. That's useful; thanks for pointing it out.
The tree opened for a split second and I got it in. =)

https://hg.mozilla.org/integration/mozilla-inbound/rev/bbec3b84cf8b
https://hg.mozilla.org/mozilla-central/rev/bbec3b84cf8b
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Will take another look.
Every comments talk about printing.
It's #10 top browser crasher in 22.0a2 and #47 in 23.0a1.
Keywords: topcrash
I just can't seem to reproduce this locally. I've tried quite a few things. Regardless, given the frequency of this crash I think the right thing to do for now is to make RequestBehaviour handle null owners gracefully.
Attachment #742141 - Flags: review?(tnikkel)
Note that this is _still_ not guaranteed to solve the crash; it may just move it around. =( If that does happen, though, the manner in which it moves should help pin things down a bit.
Comment on attachment 742141 [details] [diff] [review]
RequestBehaviour must handle null owners.

This seems ok, but I don't know the ramifications of having a null mOwner, so I'm forwarding this to Joe, feel free to re-assign again.
Attachment #742141 - Flags: review?(tnikkel) → review?(joe)
mOwner can be null; in particular imgRequestProxyStatic has a null mOwner, so this could be happening with print preview.
Attachment #742141 - Flags: review?(joe) → review+
Ah, which comment 15 specifically says.
(In reply to Joe Drew (:JOEDREW! \o/) from comment #20)
> mOwner can be null; in particular imgRequestProxyStatic has a null mOwner,
> so this could be happening with print preview.

Sure, but the code changed in the patch is for RequestBehaviour, not StaticBehaviour. That's what's a bit puzzling to me. I don't expect that we should clone a static imgRequestProxy to produce a normal one, nor that we should clone a normal imgRequestProxy with a null owner. My concern is that there is another bug here, somewhere.
https://hg.mozilla.org/mozilla-central/rev/648b0cd18883
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
There are still crashes mainly on 64-bit builds. See https://crash-stats.mozilla.com/report/list?version=Firefox:23.0a1&signature=imgRequest%3A%3AGetStatusTracker%28%29

The stack trace of comment 0 no longer exists after the fix but there are two other kinds of stack traces:
Frame 	Module 	Signature 	Source
0 	xul.dll 	imgRequest::GetStatusTracker 	image/src/imgRequest.cpp:135
1 	xul.dll 	imgRequestProxy::SyncNotifyListener 	image/src/imgRequestProxy.cpp:963
2 	xul.dll 	imgRequestProxy::PerformClone 	image/src/imgRequestProxy.cpp:605
3 	xul.dll 	imgRequestProxy::Clone 	image/src/imgRequestProxy.cpp:573
4 	xul.dll 	nsBulletFrame::DidSetStyleContext 	layout/generic/nsBulletFrame.cpp:128
5 	xul.dll 	nsFrame::Init 	layout/generic/nsFrame.cpp:549
6 	xul.dll 	nsBlockFrame::SetInitialChildList 	layout/generic/nsBlockFrame.cpp:6586
7 	xul.dll 	nsCSSFrameConstructor::ConstructBlock 	layout/base/nsCSSFrameConstructor.cpp:10983

Frame 	Module 	Signature 	Source
0 	xul.dll 	imgRequest::GetStatusTracker 	image/src/imgRequest.cpp:138
1 	xul.dll 	RequestBehaviour::GetImage 	image/src/imgRequestProxy.cpp:83
2 	xul.dll 	imgRequestProxy::LockImage 	image/src/imgRequestProxy.cpp:390
3 	xul.dll 	nsRuleNode::ComputeListData 	layout/style/nsRuleNode.cpp:6403
4 	xul.dll 	nsRuleNode::WalkRuleTree 	layout/style/nsStyleStructList.h:53
5 	xul.dll 	nsStyleContext::CalcStyleDifference 	layout/style/nsStyleContext.cpp:478
6 	xul.dll 	nsFrameManager::ReResolveStyleContext 	layout/base/nsFrameManager.cpp:1289
7 	xul.dll 	nsFrameManager::ReResolveStyleContext 	layout/base/nsFrameManager.cpp:1591
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
That doesn't surprise me. I'll have a look.
The previous fix is definitely wrong. Debugging to try to narrow this down. Whatever the ultimate fix is, it will back out the previous fix.
I'm doing some testing locally as well, but just to increase my chances of reproducing the problem, here's a try job with some asserts that I'm hoping will pin the issue down:

https://tbpl.mozilla.org/?tree=Try&rev=f75698d95795
I'm at a loss. Per discussion with dbaron, I think we need to turn some of these assertions into crashes. Hopefully we can get some new stack traces that will help shed light on the subject.
Attachment #752539 - Flags: review?(joe)
Comment on attachment 752539 [details] [diff] [review]
Crash if sanity checks related to imgRequestProxy::Clone fail.

Review of attachment 752539 [details] [diff] [review]:
-----------------------------------------------------------------

::: image/src/imgRequestProxy.cpp
@@ +63,5 @@
>        mOwnerHasImage = false;
>      }
>    }
>  
> +  virtual bool IsStatic() { return false; }

MOZ_OVERRIDE

@@ +84,5 @@
> +  if (mOwnerHasImage && !mOwner) {
> +    NS_WARNING("If mOwnerHasImage is true mOwner must be true");
> +    MOZ_CRASH();
> +  }
> +  

whitespace

@@ +103,5 @@
> +  if (mOwnerHasImage && !mOwner) {
> +    NS_WARNING("If mOwnerHasImage is true mOwner must be true");
> +    MOZ_CRASH();
> +  }
> +  

whitespace

@@ +181,5 @@
> +  if (!mBehaviour->IsStatic() && !aOwner) {
> +    NS_WARNING("Non-static imgRequestProxies should be initialized with an owner");
> +    MOZ_CRASH();
> +  }
> +  

whitespace

@@ +1048,5 @@
>    virtual void SetOwner(imgRequest* aOwner) MOZ_OVERRIDE {
>      MOZ_ASSERT(!aOwner, "We shouldn't be giving static requests a non-null owner.");
>    }
>  
> +  virtual bool IsStatic() { return true; }

MOZ_OVERRIDE
Attachment #752539 - Flags: review?(joe) → review+
Heh, I actually don't intend to leave this code in. Both this code and "RequestBehaviour must handle null owners." need to be backed out once we uncover the real problem.
Whiteboard: [leave open]
Do we want to land any speculative fixes in Beta 22? We've only got a couple of more opportunities to do so.
No, because we don't understand the problem.
(In reply to Ryan VanderMeulen [:RyanVM][Out May 23-27] from comment #33)
> https://hg.mozilla.org/mozilla-central/rev/be2b73e71d2a
There have been no crashes since 24.0a1/20130523. The working range is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c21ef3664c67&tochange=00b264c7cced
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Whiteboard: [leave open]
Target Milestone: mozilla23 → mozilla24
Very, very likely to be bug 861595.
(In reply to Joe Drew (:JOEDREW! \o/) from comment #37)
> Very, very likely to be bug 861595.
No because these crashes are not fixed in 23.0a2 while the patch of bug 861595 landed in 23.0a2/20130523.
For me, it's the third patch of this bug (see comment 33) that has fixed it.
Wow, that's surprising. That patch shouldn't've changed behaviour at all.
Another possible fix in the working range would be bug 873455.
(In reply to Joe Drew (:JOEDREW! \o/) from comment #39)
> Wow, that's surprising. That patch shouldn't've changed behaviour at all.
You're right. The signature has morphed. See https://crash-stats.mozilla.com/report/list?signature=imgRequestProxy%3A%3AInit%28imgRequest*%2C+nsILoadGroup*%2C+nsIURI*%2C+imgINotificationObserver*%29
Status: RESOLVED → REOPENED
Crash Signature: [@ imgRequest::GetStatusTracker()] → [@ imgRequest::GetStatusTracker()] [@ imgRequestProxy::Init(imgRequest*, nsILoadGroup*, nsIURI*, imgINotificationObserver*)]
Resolution: FIXED → ---
Target Milestone: mozilla24 → ---
Since it doesn't sound like we have a clear path forward at this point, we should get QA black box testing around:

"It went off when I hit the print button"
"For no reason Fire Fox crashed several time while in review of my investments"
"printing a boarding pass to PDF"
"www.freecreditreport.com"

A lot of mentions of printing to PDF - can test on any platform, but Win7 would be most applicable.
Keywords: qawanted
Depends on: 878037
I tried this on Fx22b3 and the latest nightly, and I'm able crash easily by trying to print to PDF on Mac. I used the following URL from one of the comments in the crash report:

http://www.bhinneka.com/products/sku00213896/lenovo_ideacentre_b340_911_all-in-one.aspx

And I got the following crash ID: 5300045f-0abd-4965-b55a-304bc2130531

On nightly it is similar. I tried doing the same on a Windows 7 VM but it's taking me a while to get the printer set up.
In that range bug 851785 looks interesting.
My money's on bug 716140 instead.
STR
1. Open URL comment#43
2. File > Print Preview


Regression window(m-i)
Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/baa90816699d
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20130318 Firefox/22.0 ID:20130318135932
Crash:
http://hg.mozilla.org/integration/mozilla-inbound/rev/25c76053d44a
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20130318 Firefox/22.0 ID:20130318141131
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=baa90816699d&tochange=25c76053d44a

Triggered by: Bug 851785
Blocks: 851785
As expected adding a StartDecoding call to imgRequestProxy::GetStaticRequest fixes the crash in this case. It does not fix the testcase in bug 878037. So these are definitely two distinct bugs. All of the regressing bugs in bug 878037 landed in 21 or before. The regressing bug here (bug 851785) landed for 22. Since on 21 this crash is currently ranked 247 and it is top 20 on 22 and 23 the testcase in this bug is probably the cause of the majority of these crashes and we should fix this before we release 22.
Timothy, I take it you're working on something for the FrameRect regression here?
Assignee: seth → tnikkel
Yeah, I'm trying to hack something up that makes us decode for static requests that should be safe enough to land on beta.
I haven't quite figured out how the decodedness of the image fits into this exactly, but the patch in bug 878037 does fix this bug. So we should probably just uplift that to beta. There might only be one bug here, and different testcases regressing at different times due to luck or bad luck. I'll keep looking so we can be confident though.
If the image is not decoded then RasterImage::GetAnimated returns an error and imgRequestProxy::GetStaticRequest will create a static proxy to be on the safe side because it can't prove that it's not animated. Once we have a static proxy we fall victim to Clone not be overridden on static proxy's (ie bug 878037). So bug 878037 is the full and proper fix here.
I'm not sure if we should mark this fixed or duplicate to bug 878037. Whoever tracks the branch flags and top crash stuff should decide what they want to happen.
"meh"
Status: REOPENED → RESOLVED
Closed: 7 years ago6 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 878037
You need to log in before you can comment on or make changes to this bug.