Closed Bug 599341 Opened 14 years ago Closed 14 years ago

Firefox 4.0b5 and b6 crash [@ PL_DHashTableOperate | nsObjectFrame::StopPluginInternal ][@ PL_DHashTableOperate | nsPresContext::GetRootPresContext()]

Categories

(Core Graveyard :: Plug-ins, defect)

defect
Not set
critical

Tracking

(blocking2.0 beta8+)

RESOLVED FIXED
mozilla2.0b7

People

(Reporter: scoobidiver, Assigned: MatsPalmgren_bugz)

References

Details

(Keywords: crash, Whiteboard: [crashkill])

Crash Data

Attachments

(3 files)

This new crash signature first appeared on 09/14 in 4.0b5 and 4.0b6, and is still present in b7pre.
So it is linked to an update rather than to a particular build.
An add-on update is the most plausible cause. There are also 3 crashes on Linux, so it is not an OS or driver update.
It could be the new Flash Player "Square" 10.2, which is in an alpha state.

It is the #44 top crasher in 4.0b6 over the last 2 weeks.

Here is a list of user comments:
"The crash seems to be an issue with the flash plugin. I have yet to determine whether any of the addons are responsible (the issue is rare enough to make troubleshooting difficult). The primary indicator of the problem is when I go to one page with an embedded flash video, then navigate somewhere else. Sometimes there appears to be an empty frame floating on top of the new page whose size and location coincides with the flash object."
"The background of www.warriordash.com started displaying over all the text, then the background would not disappear and overlaid other websites. I've had this happen before, it seems to be related to websites that have large animated (Flash?) ads."
"Trying to open a birthday card collection on cardfountain.com with google mail open in another tab."
"Closing tab that had a white block stuck overlaying it (which has something to do with a plugin further down on the page)"

Signature	PL_DHashTableOperate | nsObjectFrame::StopPluginInternal
UUID	748d9081-244f-462b-a305-e54952100921
Time 	2010-09-21 02:26:02.741392
Uptime	502112
Last Crash	1118361 seconds (1.8 weeks) before submission
Install Age	502163 seconds (5.8 days) since version was first installed.
Product	Firefox
Version	4.0b6
Build ID	20100914072643
Branch	2.0
OS	Mac OS X
OS Version	10.6.4 10F569
CPU	x86
CPU Info	family 6 model 23 stepping 6
Crash Reason	EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE
Crash Address	0x2c4
User Comments	Reloaded a Site (hit the reload-Button)

Crashing Thread
Frame 	Module 	Signature 	Source
0 	XUL 	PL_DHashTableOperate 	pldhash.c:615
1 	XUL 	nsObjectFrame::StopPluginInternal 	layout/generic/nsObjectFrame.cpp:2306
2 	XUL 	nsObjectFrame::DestroyFrom 	layout/generic/nsObjectFrame.cpp:642
3 	XUL 	nsFrameList::DestroyFramesFrom 	layout/generic/nsFrameList.cpp:98
4 	XUL 	nsContainerFrame::DestroyFrom 	layout/generic/nsContainerFrame.cpp:272
5 	XUL 	nsLineBox::DeleteLineList 	layout/generic/nsLineBox.cpp:336
6 	XUL 	nsBlockFrame::DestroyFrom 	layout/generic/nsBlockFrame.cpp:316
7 	XUL 	nsFrameList::DestroyFramesFrom 	layout/generic/nsFrameList.cpp:98
8 	XUL 	nsContainerFrame::DestroyFrom 	layout/generic/nsContainerFrame.cpp:272
.... and like 7, 8 ...
17 	XUL 	nsLineBox::DeleteLineList 	layout/generic/nsLineBox.cpp:336
18 	XUL 	nsBlockFrame::DestroyFrom 	layout/generic/nsBlockFrame.cpp:316
19 	XUL 	nsFrameList::DestroyFramesFrom 	layout/generic/nsFrameList.cpp:98
20 	XUL 	nsContainerFrame::DestroyFrom 	layout/generic/nsContainerFrame.cpp:272
.... and like 19, 20 ...
29 	XUL 	nsLineBox::DeleteLineList 	layout/generic/nsLineBox.cpp:336
30 	XUL 	nsBlockFrame::DestroyFrom 	layout/generic/nsBlockFrame.cpp:316
31 	XUL 	nsLineBox::DeleteLineList 	layout/generic/nsLineBox.cpp:336
32 	XUL 	nsBlockFrame::DestroyFrom 	layout/generic/nsBlockFrame.cpp:316
33 	XUL 	nsFrameList::DestroyFramesFrom 	layout/generic/nsFrameList.cpp:98
34 	XUL 	nsContainerFrame::DestroyFrom 	layout/generic/nsContainerFrame.cpp:272
... and like 33, 34 ...
39 	XUL 	nsFrameManager::Destroy 	layout/generic/nsIFrame.h:535
40 	XUL 	PresShell::Destroy 	layout/base/nsPresShell.cpp:1958
41 	XUL 	DocumentViewerImpl::DestroyPresShell 	layout/base/nsDocumentViewer.cpp:4298
42 	XUL 	DocumentViewerImpl::Destroy 	layout/base/nsDocumentViewer.cpp:1621
43 	XUL 	nsSHistory::EvictContentViewersInRange 	docshell/shistory/src/nsSHistory.cpp:890
44 	XUL 	nsSHistory::EvictAllContentViewers 	docshell/shistory/src/nsSHistory.cpp:681
45 	XUL 	nsDocShell::Destroy 	docshell/base/nsDocShell.cpp:4492
46 	XUL 	nsFrameLoader::Finalize 	content/base/src/nsFrameLoader.cpp:461
47 	XUL 	nsDocument::MaybeInitializeFinalizeFrameLoaders 	content/base/src/nsDocument.cpp:5402
48 	XUL 	nsDocument::EndUpdate 	content/base/src/nsDocument.cpp:3908
49 	XUL 	nsXULDocument::EndUpdate 	content/xul/document/src/nsXULDocument.cpp:3318
50 	XUL 	nsINode::doRemoveChildAt 	content/base/src/mozAutoDocUpdate.h:66
51 	XUL 	nsGenericElement::RemoveChildAt 	content/base/src/nsGenericElement.cpp:3634
52 	XUL 	nsXULElement::RemoveChildAt 	content/xul/content/src/nsXULElement.cpp:1008
53 	XUL 	nsIDOMNode_RemoveChild 	nsINode.h:485
54 	XUL 	js::Interpret 	js/src/jsinterp.cpp:4696
55 	XUL 	js::InvokeCommon<JSBool > 	js/src/jsinterp.cpp:577
56 	XUL 	js::Invoke 	js/src/jsinterp.cpp:696
57 	XUL 	js::InternalInvoke 	js/src/jsinterp.cpp:736
58 	XUL 	JS_CallFunctionValue 	js/src/jsinterp.h:651
59 	XUL 	nsJSContext::CallEventHandler 	dom/base/nsJSEnvironment.cpp:2248
60 	XUL 	nsJSEventListener::HandleEvent 	dom/src/events/nsJSEventListener.cpp:228
61 	XUL 	nsXBLPrototypeHandler::ExecuteHandler 	content/xbl/src/nsXBLPrototypeHandler.cpp:332
62 	XUL 	nsXBLEventHandler::HandleEvent 	content/xbl/src/nsXBLEventHandler.cpp:88
63 	XUL 	nsEventListenerManager::HandleEventSubType 	content/events/src/nsEventListenerManager.cpp:1112
64 	XUL 	nsEventListenerManager::HandleEventInternal 	content/events/src/nsEventListenerManager.cpp:1208
65 	XUL 	nsEventTargetChainItem::HandleEventTargetChain 	content/events/src/nsEventListenerManager.h:146
66 	XUL 	nsEventDispatcher::Dispatch 	content/events/src/nsEventDispatcher.cpp:628
67 	XUL 	nsTransitionManager::WillRefresh 	layout/style/nsTransitionManager.cpp:946
68 	XUL 	nsRefreshDriver::Notify 	layout/base/nsRefreshDriver.cpp:253
69 	XUL 	nsTimerImpl::Fire 	xpcom/threads/nsTimerImpl.cpp:428
70 	XUL 	nsTimerEvent::Run 	xpcom/threads/nsTimerImpl.cpp:517
71 	XUL 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:547
72 	XUL 	NS_ProcessPendingEvents_P 	nsThreadUtils.cpp:200
73 	XUL 	nsBaseAppShell::NativeEventCallback 	widget/src/xpwidgets/nsBaseAppShell.cpp:126
74 	XUL 	nsAppShell::ProcessGeckoEvents 	widget/src/cocoa/nsAppShell.mm:394
....

More reports at:
http://crash-stats.mozilla.com/report/list?range_value=4&range_unit=weeks&signature=PL_DHashTableOperate%20|%20nsObjectFrame%3A%3AStopPluginInternal
Forget about the add-on update theory; the crash reports were presented on two pages and I misread that.
Summary: Firefox 4.0b5 and b6 crash after an unknown addon update on 09/14th [@ PL_DHashTableOperate | nsObjectFrame::StopPluginInternal ] → Firefox 4.0b5 and b6 crash [@ PL_DHashTableOperate | nsObjectFrame::StopPluginInternal ]
This happened to me too: I clicked on a link and it crashed -> http://crash-stats.mozilla.com/report/index/bp-8af2ceb1-77bd-40c9-96fa-7187b2101006
Whiteboard: [crashkill]
Tomcat, can you try to figure out a way to reproduce?
I have been running Flash Version: 10.2.161.22 and I have yet to be able to reproduce this bug using many of the sites listed. Will keep trying. I also tried on 10.5 with the same Flash version.
Component: Layout → Plug-ins
QA Contact: layout → plugins
This also just happened to me:

bp-c53c3539-fa34-45ec-9ce0-5012e2101020

Let me know if I can help in providing more information to diagnose this.
> Let me know if I can help in providing more information to diagnose this.

Please try to figure out how to reproduce this crash.

If we can't reproduce it, there's not much we can do about it.

Also, what Flash version are you using?  Marcia says she doesn't see this crash with Flash 10.2.161.22 (a beta version).
> Also, what Flash version are you using?  Marcia says she doesn't see this crash
> with Flash 10.2.161.22 (a beta version).

Actually Flash probably isn't involved.  There's only a weak correlation between these crashes and the Flash plugin.
(In reply to comment #6)
> > Let me know if I can help in providing more information to diagnose this.
> 
> Please try to figure out how to reproduce this crash.
> 
> If we can't reproduce it, there's not much we can do about it.

I've got this crash only one time so far in my main browsing profile, right when I clicked inside the comment field on a bug page.  I don't have any idea how to reproduce this, unfortunately.

> Also, what Flash version are you using?  Marcia says she doesn't see this crash
> with Flash 10.2.161.22 (a beta version).

My Flash player version is 10.1.82.76.
Hmm, I wonder if nsPresContext::GetRootPresContext using frames is a problem as they are not always around. It could easily use views.
Actually, probably not, DocumentViewerImpl::Destroy disconnects the root view as well.
I think what's happening here is that, in nsObjectFrame::StopPluginInternal, rootPC is null, and we call rootPC->UnregisterPluginForGeometryUpdates(this) anyway.
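To illustrate why a null rootPC would show up as a crash at a small address (like the 0x2c4 in the report above) rather than at 0: a member access through a null object pointer touches address 0 plus the member's offset. This is only a sketch. PresContextSketch, HashTableSketch and the 0x2b0 padding are invented for illustration and do not reflect the real nsPresContext layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a hash table header; not the real PLDHashTable.
struct HashTableSketch {
    void* ops;
    std::uint32_t entryCount;
};

// Hypothetical stand-in for the pres context. The padding size is made up.
struct PresContextSketch {
    char otherMembers[0x2b0];            // placeholder for unrelated members
    HashTableSketch registeredPlugins;   // table of registered plugin frames
};

// Address a member access would touch if the object pointer were null:
// base address 0 plus the member's offset inside the object.
constexpr std::uintptr_t AddressTouchedViaNull(std::size_t memberOffset) {
    return 0 + memberOffset;
}

// Offset of the table itself, and of a field inside the table.
constexpr std::size_t kTableOffset =
    offsetof(PresContextSketch, registeredPlugins);
constexpr std::size_t kCountOffset =
    kTableOffset + offsetof(HashTableSketch, entryCount);

static_assert(kTableOffset == 0x2b0, "sketch layout assumption");
```

With these made-up sizes the table sits at offset 0x2b0, so a field read inside it through a null pointer lands a few bytes past that, well inside the unmapped first page. That would be consistent with a protection failure at a low address inside PL_DHashTableOperate.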
blocking2.0: --- → beta8+
Summary: Firefox 4.0b5 and b6 crash [@ PL_DHashTableOperate | nsObjectFrame::StopPluginInternal ] → Firefox 4.0b5 and b6 crash [@ PL_DHashTableOperate | nsObjectFrame::StopPluginInternal ][@ PL_DHashTableOperate | nsPresContext::GetRootPresContext()]
I suspect that this PresShell is Frozen and thus that StopPlugin
has already been called.  If so, it should be safe to null-check
and skip the UnregisterPluginForGeometryUpdates.
If not, then we have a dangling frame pointer in the root pres context.

If we don't want to depend on ancestor frames/views to get the root pres
context, we could add this to nsPresContext (given as arg to the ctor):
nsRefPtr<nsPresContext> mRootPresContext;
the only time we need to update it is when swapping doc shells (I think).

Or we could fallback and go through the docshell tree, like
PresShell::GetParentPresShell() which uses mForwardingContainer when
detached.
I don't think we want to use the docshell tree; it doesn't always match what we draw on the screen, see bug 593286 comment 5.

In bug 473178 comment 49 it was suggested to cache the root prescontext pointer for perf reasons, so that might be a good idea. Just need to make sure we cover all situations when it might change.
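A rough sketch of what caching the root could look like, mainly to show where the cache has to be refreshed. All names here (PresContextSketch, SetParent, CachedRoot, WalkToRoot) are hypothetical, and raw pointers are used for brevity; a real version would have to settle ownership (e.g. the strong ref suggested above) and find every place the tree can be re-parented.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model of a pres context tree with a cached root pointer.
class PresContextSketch {
public:
    PresContextSketch() : mCachedRoot(this) {}

    // Re-parenting (e.g. a docshell swap) is where the cache can go stale,
    // so it refreshes the cache for this context and all its descendants.
    void SetParent(PresContextSketch* newParent) {
        if (mParent) {
            auto& siblings = mParent->mChildren;
            siblings.erase(std::remove(siblings.begin(), siblings.end(), this),
                           siblings.end());
        }
        mParent = newParent;
        if (newParent) {
            newParent->mChildren.push_back(this);
        }
        UpdateCachedRoot(newParent ? newParent->mCachedRoot : this);
    }

    // Uncached lookup: follow parent links up to the top.
    PresContextSketch* WalkToRoot() {
        PresContextSketch* pc = this;
        while (pc->mParent) {
            pc = pc->mParent;
        }
        return pc;
    }

    // Cached lookup: constant time, no dependency on walking ancestors.
    PresContextSketch* CachedRoot() const { return mCachedRoot; }

private:
    void UpdateCachedRoot(PresContextSketch* root) {
        mCachedRoot = root;
        for (PresContextSketch* child : mChildren) {
            child->UpdateCachedRoot(root);
        }
    }

    PresContextSketch* mParent = nullptr;
    PresContextSketch* mCachedRoot;
    std::vector<PresContextSketch*> mChildren;
};
```

The point of the sketch is that the cached value only stays correct if every re-parenting path goes through something like SetParent; any path that changes the tree without updating the cache would leave descendants pointing at the wrong root.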
It looks like there may have been a volume increase on the trunk for PL_DHashTableOperate...nsObjectFrame::StopPluginInternal crashes, going from about 2-5 crashes per day to 20-25 crashes per day.

date     total crashes, then count build, count build, ...
         for PL_DHashTableOperate...nsObjectFrame::StopPluginInternal
20101010 57  53 4.0b6^\2010091407, 
                1 4.0b8pre^\2010101003, 1 4.0b8pre^\2010100703, 
                1 4.0b7pre^\2010100603, 1 4.0b5^\2010083107, 
20101011 63  60 4.0b6^\2010091407, 
                1 4.0b8pre^\2010101103, 1 4.0b8pre^\2010100703, 
                1 4.0b5^\2010083107, 
20101012 66  62 4.0b6^\2010091407, 
                2 4.0b8pre^\2010101203, 2 4.0b8pre^\2010101003, 
20101013 45  41 4.0b6^\2010091407, 
                2 4.0b8pre^\2010101307, 1 4.0b8pre^\2010101203, 
                1 4.0b7pre^\2010100603, 
20101014 59  54 4.0b6^\2010091407, 
                2 4.0b8pre^\2010101403, 1 4.0b8pre^\2010101307, 
                1 4.0b8pre^\2010101203, 1 4.0b7pre^\2010100603, 
20101015 56  44 4.0b6^\2010091407, 
                8 4.0b8pre^\2010101503, 2 4.0b8pre^\2010101403, 
                1 4.0b8pre^\2010101307, 1 4.0b7pre^\2010100603, 
20101016 76  57 4.0b6^\2010091407, 
                13 4.0b8pre^\2010101603, 2 4.0b8pre^\2010101503, 
                1 4.0b8pre^\2010101602, 1 4.0b8pre^\2010101307, 
                1 4.0b7pre^\2010091603, 1 4.0b5^\2010083107, 
20101017 76  57 4.0b6^\2010091407, 
                8 4.0b8pre^\2010101703, 8 4.0b8pre^\2010101603, 
                3 4.0b8pre^\2010101503, 
...

20101023 73  51 4.0b6^\2010091407, 
                10 4.0b8pre^\2010102303, 9 4.0b8pre^\2010102203, 
                2 4.0b8pre^\2010101803, 1 4.0b8pre^\2010102302, 


The new, mostly Mac, reports on the trunk can be isolated with this query.

http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A4.0b8pre&query_search=signature&query_type=contains&query=PL_DHashTableOperate%20|%20nsObjectFrame%3A%3AStopPluginInternal&date=10%2F24%2F2010%2016%3A46%3A19&range_value=1&range_unit=weeks&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=PL_DHashTableOperate%20|%20nsObjectFrame%3A%3AStopPluginInternal

Is this a regression, or a different bug that should be spun off?
Are you accounting for all signatures that this shows up as?  The bouncing between signatures may appear random for various reasons.
(In reply to comment #21)
> Are you accounting for all signatures that this shows up as?  The bouncing
> between signatures may appear random for various reasons.

If I combine all 3 signatures (from this bug and bug 606860) and look at only those reported on 4.0b8pre, I still see a bit of a spike on Oct 15.

count date

oct 1-9  77-113 crashes per day

  85 20101010-crashdata.csv
  94 20101011-crashdata.csv
 108 20101012-crashdata.csv
  81 20101013-crashdata.csv
  89 20101014-crashdata.csv

 153 20101015-crashdata.csv
 179 20101016-crashdata.csv
 205 20101017-crashdata.csv
 223 20101018-crashdata.csv
 185 20101019-crashdata.csv
 100 20101020-crashdata.csv
 145 20101021-crashdata.csv
 174 20101022-crashdata.csv
 185 20101023-crashdata.csv
Growth in ADUs could also explain the shift, but it's a more gradual ramp in the number of trunk users over this time rather than a sharp jump around the 15th.

date    	Firefox crashes	adus

2010-10-20	4148	44407	
2010-10-19	3641	43596	
2010-10-18	3805	41922	
2010-10-17	4069	33112	
2010-10-16	4521	31672	
2010-10-15	5519	36384	
2010-10-14	4522	35221	
2010-10-13	1427	32942	
2010-10-12	1427	28126	
2010-10-11	1144	25646	
2010-10-10	874	17826	
2010-10-09	815	15827	
2010-10-08	636	14854	
2010-10-07	219	4054
Oops, those are numbers for all versions. Here are numbers for just b8pre:

   1 20101008-crashdata.csv
   4 20101009-crashdata.csv
   2 20101010-crashdata.csv
   2 20101011-crashdata.csv
   4 20101012-crashdata.csv
   5 20101013-crashdata.csv
   5 20101014-crashdata.csv
  69 20101015-crashdata.csv
  95 20101016-crashdata.csv
 120 20101017-crashdata.csv
 123 20101018-crashdata.csv
  72 20101019-crashdata.csv
  33 20101020-crashdata.csv
  65 20101021-crashdata.csv
  74 20101022-crashdata.csv
 114 20101023-crashdata.csv

Also, b7pre has run at around 1-9 crashes per day for all of Oct, including before Oct 7 when most trunk users were on that version.
I crashed in this stack today, and it happened just like Tomcat described in Comment 2 - I clicked on a link and then crashed. https://crash-stats.mozilla.com/report/index/bp-83708678-80d9-4fff-9697-2968e2101025 is my report and I had a number of addons installed. If I am able to reproduce I will report back.
Attached patch: Patch rev. 1
Don't try to unregister or set widget geometry if we can't find the
root pres context.  This should only occur when the shell is frozen
in which case those operations aren't needed.
Attachment #486079 - Flags: review?(roc)
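For readers without the diff at hand, the shape of that guard can be sketched roughly as below. This is not the attached patch: ObjectFrameSketch, RootPresContextSketch and the shellFrozen flag are simplified stand-ins invented here, and the assertion only models the assumption that a missing root pres context implies a frozen shell.

```cpp
#include <cassert>
#include <set>

struct ObjectFrameSketch;

// Hypothetical stand-in for the root pres context's plugin registry.
struct RootPresContextSketch {
    std::set<const ObjectFrameSketch*> registeredPlugins;

    void UnregisterPluginForGeometryUpdates(const ObjectFrameSketch* frame) {
        registeredPlugins.erase(frame);
    }
};

// Hypothetical stand-in for the plugin frame.
struct ObjectFrameSketch {
    RootPresContextSketch* root = nullptr;  // null models "root not reachable"
    bool shellFrozen = false;
    bool pluginStopped = false;

    RootPresContextSketch* GetRootPresContext() const { return root; }

    void StopPluginInternal() {
        RootPresContextSketch* rootPC = GetRootPresContext();
        if (rootPC) {
            rootPC->UnregisterPluginForGeometryUpdates(this);
        } else {
            // Assumption from the description: this only happens for a frozen
            // shell, where the frame was already unregistered earlier.
            assert(shellFrozen && "no root pres context on a non-frozen shell");
        }
        pluginStopped = true;
    }
};
```

If the assumption is wrong and the root can be missing in other situations, the assertion fires in debug builds and the skipped unregister would leave a dangling frame pointer, which is the case the strong ref alternative is meant to cover.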
WIP on a strong ref solution as outlined in comment 18 just to get a sense
of the risks involved.  Let me know if you think this would be better and
I'll polish it a bit more...
Comment on attachment 486079
Patch rev. 1

nsIPresShell will need an IID bump.

Do we want/need to not make the ConfigureChildren call if we don't have a root prescontext?
Good point. I think we should still call ConfigureChildren.
Oh but the GetEmptyConfiguration call needs a root prescontext too or it does nothing.
(In reply to comment #32)
> nsIPresShell will need an IID bump.

Why is that necessary?
There's no change to the vtable, nor the object size. 

> Do we want/need to not make the ConfigureChildren call if we don't
> have a root prescontext?

My understanding is that the root pres context can only be null in the
case this shell is frozen in which case we have already been here once,
through nsObjectFrame::StopPlugin() from FreezeElement in
nsPresShell.cpp.  So it should have been done already, and more
importantly we should have unregistered the frame already
(that's why I added the assertion).

If I'm wrong and the root pres context can be null in other cases,
then this patch is flawed, and we must do something more robust
like having a strong ref.
Oh, ok.
Comment on attachment 486079
Patch rev. 1

Tree rules say I need explicit beta7 approval to land.
Attachment #486079 - Flags: approval2.0?
Comment on attachment 486079
Patch rev. 1

a2.0b7=dbaron

(This boils down to a null-check.)
Attachment #486079 - Flags: approval2.0? → approval2.0+
Keywords: checkin-needed
http://hg.mozilla.org/mozilla-central/rev/5cc1a77ffbad
Status: NEW → RESOLVED
Closed: 14 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [crashkill] → [crashkill][needs landing on beta7 branch]
Target Milestone: --- → mozilla2.0b8
Whiteboard: [crashkill][needs landing on beta7 branch] → [crashkill]
Target Milestone: mozilla2.0b8 → mozilla2.0b7
Crash Signature: [@ PL_DHashTableOperate | nsObjectFrame::StopPluginInternal ] [@ PL_DHashTableOperate | nsPresContext::GetRootPresContext()]
Product: Core → Core Graveyard