Crash in [@ nsPluginInstanceOwner::CreateWidget() ]

RESOLVED FIXED

Status

critical
RESOLVED FIXED
8 years ago
7 years ago

People

(Reporter: marcia, Assigned: roc)

Tracking

(4 keywords)

Trunk
x86
Windows Vista
crash, reproducible, testcase, topcrash

Firefox Tracking Flags

(blocking2.0 betaN+)

Attachments

(3 attachments, 2 obsolete attachments)

Harvested from B2 crash stats. Highest concentration of reports is B2, but there are a few 3.6.x and trunk crashes sprinkled in.

http://tinyurl.com/2eo5euh links to the crashes, all Windows. 

Frame  	Module  	Signature  	Source
0 	xul.dll 	nsPluginInstanceOwner::CreateWidget 	layout/generic/nsObjectFrame.cpp:5831
1 	xul.dll 	nsPluginHost::InstantiateEmbeddedPlugin 	modules/plugin/base/src/nsPluginHost.cpp:1083
2 	xul.dll 	nsObjectFrame::InstantiatePlugin 	layout/generic/nsObjectFrame.cpp:977
3 	xul.dll 	nsObjectFrame::Instantiate 	layout/generic/nsObjectFrame.cpp:2115
4 	xul.dll 	nsObjectLoadingContent::Instantiate 	content/base/src/nsObjectLoadingContent.cpp:1883
5 	xul.dll 	nsAsyncInstantiateEvent::Run 	content/base/src/nsObjectLoadingContent.cpp:165
6 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:547
7 	xul.dll 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:118
8 	xul.dll 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:219
9 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:202
10 	xul.dll 	xul.dll@0x319963 	
11 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:176
12 	xul.dll 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:175
13 	xul.dll 	xul.dll@0xa57bf3 	
14 	xul.dll 	nsAppStartup::Run 	toolkit/components/startup/src/nsAppStartup.cpp:191
15 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3630
16 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:120
17 	firefox.exe 	__tmainCRTStartup 	obj-firefox/memory/jemalloc/crtsrc/crtexe.c:591
18 	kernel32.dll 	BaseProcessStart 	

Hulu, Java applets, and opening Gmail are mentioned in the 14 user comments. One person mentioned that the Java version check freezes.

Comment 1

8 years ago
definitely looks like a regression in 4.0 betas

checking --- nsPluginInstanceOwner::CreateWidget 20100802-crashdata.csv
found in: 4.0b2 4.0b3pre 3.6.8 3.5.8
release total-crashes
              nsPluginInstanceOwner::CreateWidget crashes
                         pct.
all     272748  111     0.000406969
4.0b2    13035  103     0.0079018
4.0b3pre  1454    4     0.00275103
3.6.8   160934    3     1.86412e-05
3.5.8      604    1     0.00165563

os breakdown
nsPluginInstanceOwner::CreateWidget Total 110
Win5.1  0.30
Win6.0  0.06
Win6.1  0.62
Mac10.4 0.00
Mac10.5 0.00
Mac10.6 0.00
Lin2.4  0.00
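
The "pct." column in the release table above appears to be a plain ratio (crashes with this signature divided by total crashes for the release), not a percentage: 103 / 13035 is roughly 0.0079, or about 0.79% of 4.0b2 crashes. A minimal sketch of that arithmetic follows; `CrashRatio` and `PrintComment1Ratios` are hypothetical names used for illustration and are not part of the crash-stats tooling.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not from the crash-stats scripts): the fraction of a
// release's crash reports that carry a given signature.
double CrashRatio(unsigned signatureCrashes, unsigned totalCrashes) {
  // A signature cannot account for more crashes than the release has.
  assert(signatureCrashes <= totalCrashes);
  // Avoid dividing by zero for a release with no reports.
  if (totalCrashes == 0) {
    return 0.0;
  }
  return static_cast<double>(signatureCrashes) / totalCrashes;
}

// Recomputes the ratio for each row of the table, using the counts quoted
// in this comment.
void PrintComment1Ratios() {
  std::printf("all      %g\n", CrashRatio(111, 272748));
  std::printf("4.0b2    %g\n", CrashRatio(103, 13035));
  std::printf("4.0b3pre %g\n", CrashRatio(4, 1454));
  std::printf("3.6.8    %g\n", CrashRatio(3, 160934));
  std::printf("3.5.8    %g\n", CrashRatio(1, 604));
}
```

Read this way, the signature is roughly 400 times more frequent per crash on 4.0b2 than on 3.6.8, which is consistent with the regression claim.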
blocking2.0: --- → ?

Comment 2

8 years ago
currently #16 topcrash in beta2

Comment 3

8 years ago
What OSes?
blocking2.0: ? → beta4+

Comment 4

8 years ago
The OS percentage breakdown for Aug 2 is in comment 1.

full counts for the last month are below.

 278 4.0b2 Windows NT 6.1.7600
  98 4.0b2 Windows NT 5.1.2600 Service Pack 3
  37 4.0b2 Windows NT 5.1.2600 Service Pack 2
  32 4.0b3pre Windows NT 6.1.7600
  23 4.0b2 Windows NT 6.0.6002 Service Pack 2
  16 4.0b3pre Windows NT 5.1.2600 Service Pack 3
  16 4.0b2pre Windows NT 6.1.7600
   9 4.0b2 Windows NT 6.0.6001 Service Pack 1
   6 4.0b2 Windows NT 6.0.6000
   6 3.6.6 Windows NT 5.1.2600 Service Pack 3
   5 4.0b2pre Windows NT 5.1.2600 Service Pack 3
   4 4.0b2 Windows NT 5.2.3790 Service Pack 2
   3 4.0b2pre Windows NT 5.1.2600 Service Pack 2
   3 4.0b2 Windows NT 6.1.7601 Service Pack 1, v.178

Comment 5

8 years ago
roc, http://hg.mozilla.org/mozilla-central/annotate/961f253985a4/layout/generic/nsObjectFrame.cpp#l5831 parent is probably null (we'd hit the assertion in a debug build)... any clues?
How can a plugin widget be toplevel?

We could wallpaper this with a null check, but something is seriously broken to get into this state, and I have no idea what.
Created attachment 463186 [details]
testcase

I crash with this stacktrace, with this testcase.
Steps to reproduce:
- Open testcase, then focus the RealPlayer plugin
- Let the testcase reload itself a few times, after that the browser seems to get into a 'zombie' state, where it doesn't respond to input at all.
- Open up the Task Manager, under the 'Applications' tab, choose to end the Minefield task.

After that, Minefield closes, but the process still seems to continue. After a while I get a prompt from Windows, that says that Minefield doesn't respond. If I ignore that, then the Crash Reporter comes up and says that Minefield has crashed.

This is with the RealPlayer Plugin installed, version 6.0.12.732.

This is the crash report that I get:
http://crash-stats.mozilla.com/report/index/c04201ee-e048-447e-919a-5dad92100805
0 	xul.dll 	nsPluginInstanceOwner::CreateWidget 	layout/generic/nsObjectFrame.cpp:5911
1 	xul.dll 	nsPluginHost::InstantiateEmbeddedPlugin 	modules/plugin/base/src/nsPluginHost.cpp:1065
2 	xul.dll 	nsObjectFrame::InstantiatePlugin 	layout/generic/nsObjectFrame.cpp:978
Should this really block? I'm not sure this will be fixed for Beta 4 ...

Comment 9

8 years ago
It's a new regression and a topcrash, if that doesn't match our beta-blocker criterion then I really don't know what does!
Assignee: nobody → roc
I can't reproduce this with the steps in comment #7. Windows XP, Realplayer plugin version 6.0.12.775.

Martijn, do you have another plugin installed that handles type audio/aac? Realplayer doesn't handle that, at least not the way I've installed it.
Hmm, I guess you've probably got Quicktime installed and that probably handles AAC. I'll try installing Quicktime.
Installed Quicktime, still doesn't crash. Clicking on stuff in the RealPlayer plugin sometimes causes it to crash, but that doesn't hurt the browser. I've tried opening the testcase in Bugzilla and locally.

Martijn, what Windows version are you running? In a VM or not? I don't suppose you have VMWare record and replay available?
It turns out, I get the crash with the RealPlayer(tm) G2 LiveConnect-Enabled Plug-In (32-bit) running.
    File: nppl3260.dll
    Version: 6.0.12.732
    RealPlayer(tm) LiveConnect-Enabled Plug-In

Not really sure where you can get that from.

(Btw, disabling/enabling plugins in Tools->Addons doesn't really seem to stick, while the testcase is running)
Is Quicktime handling the audio/aac element? What Windows version?
Created attachment 465604 [details]
testcase

Sorry, it turns out this part:
<object type="*">
<object type="audio/aac">
</object>
</object>
isn't necessary to trigger the crash at all. (while minimizing a testcase, you sometimes get tired of minimizing further, thinking that some parts are necessary, while they aren't).

I'm on Windows 7 here. So the only plugin that you should need to get the crash is RealPlayer(tm) G2 LiveConnect-Enabled Plug-In.
In a debug build, I keep getting this assertion, when I move the mouse on that page:
###!!! ASSERTION: Received "nonqueued" message 295 during a synchronous IPC message for window 28780874 ("MozillaWindowClass"), sending it to DefWindowProc instead of the normal window procedure.: 'Error', file c:/mozilla-build/mozilla-central/ipc/glue/WindowsMessageLoop.cpp, line 318
Attachment #463186 - Attachment is obsolete: true
I'll try it in a debug build on Windows 7, but it'll have to be Monday (my time) so this almost certainly won't make beta4.
OK, I can reproduce that assertion in Windows 7. Maybe it happens in XP, but I was only testing an opt build in XP.

However, I do not see a browser crash. Sometimes the plugin container crashes (apparently in response to me clicking on the plugin or moving the browser window, it's hard to tell), Windows 7 pops up an alert telling me the plugin container has crashed, and the browser is hung until I dismiss the alert. But after dismissing the alert the browser resumes normally.

cjones, is this assertion a real "we are in big trouble" assertion?

Comment 18

8 years ago
Is this a debug build? You might need to set the plugin hang timeout to a nonzero value (5 seconds) in order to reproduce the crash?
(In reply to comment #17)
> cjones, is this assertion a real "we are in big trouble" assertion?

Yes, it's bad: while blocked on the plugin, we get message 295 and don't know what to do with it, so we deliver it to the DefWindowProc.

Jim/Ben, do you guys know what this message is?

Comment 20

8 years ago
WM_CHANGEUISTATE  	0x0127
Pushing to beta5.
blocking2.0: beta4+ → beta5+
Martijn, do you have a nonzero plugin hang timeout? Do you have D2D enabled?
(In reply to comment #22)
> Martijn, do you have a nonzero plugin hang timeout? Do you have D2D enabled?

My dom.ipc.plugins.timeoutSecs pref is set to 45s (default).
I used a build from 2010-08-13, those builds don't have anything to do with D2D, afaict.
Btw, I think I tried to reproduce this crash in a debug build, but it doesn't seem like I could catch it in there (although the tons of assertions might have gotten in the way).
Tried a home-made opt build and a nightly. No crash with D2D on. Plugin hang timeout is 45s (default), same as Martijn's.
No crash with D2D off either.

I'll look into this assertion now.
See http://msdn.microsoft.com/en-us/library/ms646342%28v=VS.85%29.aspx. It seems this could lead to synchronous triggering of WM_UPDATEUISTATE in our window. That can call through to nsGlobalWindow::SetKeyboardIndicators, thence into nsEventStateManager::SetContentState(NS_EVENT_STATE_FOCUS). That could be bad if it happens at the wrong time...

Martijn, you could try making the handler for WM_UPDATEUISTATE in nsWindow just break instead of doing anything. If that fixes the crash for you, then we know this is the right track.
Yeah, so my current theory is that RealPlayer sends a lot of WM_CHANGEUISTATE messages to its window. These propagate into our window while we're doing a synchronous operation on the plugin (normally painting, in my stacks) and on up to our toplevel window, where Windows turns them into WM_UPDATEUISTATE. That message propagates down through our windows and triggers nsEventStateManager::SetContentState while we're painting, which can trigger stuff like frame destruction. That can be deadly while we're painting.

Probably the most straightforward option is to defer the WM_CHANGEUISTATE message to run asynchronously, similar to the way other messages are deferred by the IPC code.

Another option would be to just drop WM_CHANGEUISTATE messages from plugins on the floor. But I guess the first way is better.
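
The deferral idea can be sketched in portable C++ as below. This is only an illustration under stated assumptions, not the actual code in ipc/glue/WindowsMessageLoop.cpp: `Message`, `DeferringDispatcher`, and all of its methods are hypothetical names, and the real Win32 `MSG` structure and window procedures are deliberately left out. The one fact taken from this bug is that WM_CHANGEUISTATE is message 0x0127 (decimal 295).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// WM_CHANGEUISTATE is 0x0127, i.e. the "message 295" from the assertion.
constexpr std::uint32_t kWmChangeUiState = 0x0127;

// Hypothetical stand-in for a window message (not the Win32 MSG struct).
struct Message {
  std::uint32_t id;
  std::uintptr_t wParam;
  std::intptr_t lParam;
};

// Hypothetical dispatcher: while a synchronous plugin call is in progress,
// WM_CHANGEUISTATE is queued instead of being handled re-entrantly.
class DeferringDispatcher {
 public:
  explicit DeferringDispatcher(std::function<void(const Message&)> handler)
      : mHandler(std::move(handler)) {}

  void EnterSyncCall() { ++mSyncDepth; }

  // Leaving the outermost synchronous call replays the deferred messages.
  void LeaveSyncCall() {
    assert(mSyncDepth > 0);
    if (--mSyncDepth == 0) {
      Flush();
    }
  }

  // Returns true if the message was handled now, false if it was deferred.
  bool Dispatch(const Message& msg) {
    if (mSyncDepth > 0 && msg.id == kWmChangeUiState) {
      mDeferred.push_back(msg);
      return false;
    }
    mHandler(msg);
    return true;
  }

  std::size_t PendingCount() const { return mDeferred.size(); }

 private:
  void Flush() {
    // Pop before calling the handler so that re-entrant dispatch is safe.
    while (!mDeferred.empty()) {
      Message msg = mDeferred.front();
      mDeferred.pop_front();
      mHandler(msg);
    }
  }

  std::function<void(const Message&)> mHandler;
  std::deque<Message> mDeferred;
  int mSyncDepth = 0;
};
```

The point of the design is that the UI-state handler, and anything it triggers such as focus state changes, only runs after the synchronous plugin operation has returned. The second option, dropping the message, would amount to returning from `Dispatch` without queuing.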
Comment on attachment 466533 [details] [diff] [review]
add prefs

Sorry, wrong bug.
Attachment #466533 - Attachment is obsolete: true
Attachment #466533 - Flags: review?(dbaron)
Unfortunately I can't prove my theory. I never see nsGlobalWindow::SetKeyboardIndicators getting called. Of course, I don't see crashes either.

Martijn, can you set a breakpoint on nsGlobalWindow::SetKeyboardIndicators and see if it gets called when you're running the testcase? A stack for the call would be helpful too.
The breakpoint at nsGlobalWindow::SetKeyboardIndicators doesn't seem to get called at all. But like I said, I don't crash with a debug Minefield, either. Only the plugin-container is crashing sometimes.

I'll try your suggestion in comment 16, but I'll need to build a non-debug version first for that.
You should be able to get an opt build with symbols, using the symbol server if you're using a nightly build, and set that breakpoint in it.
Created attachment 466639 [details]
stack from optimized build

Ok, I made an optimized build with symbols. I'm able to crash with that one. I attached the stack of that build, with some additional info.
Again, also with this optimized build, the breakpoint I set at nsGlobalWindow::SetKeyboardIndicators doesn't seem to get called at all.
Doing the suggestion in comment 27 doesn't seem to fix the crash.

However let me note that the fact that the browser can get into a 'zombie' state, where it doesn't respond to input at all, is perhaps the issue that needs to be addressed, since that seems to be the root cause of the crash.
I think we should write a patch to defer WM_CHANGEUISTATE to get rid of that assertion, even though I suspect now that it's not related to the crash.

Possibly the problems will be fixed by bug 130078.
Moving this to beta6+ - can someone try to verify comment 36 which indicates that bug 130078 may have resolved this issue?
blocking2.0: beta5+ → beta6+
Can we get this into beta 6? It's rank 14 in the top crash stats for Beta 5.
Roc: are you actively working on this?
Not lately. I can do what I suggested in comment #36, but that will probably not fix the crash. I'm going to try again to reproduce the crash when I set up my new Windows machine in a few days.
Currently #73 topcrash on nightlies and #20 on b6, which suggests that while it would be good to fix, it doesn't strictly block beta7, so moving to betaN.
blocking2.0: beta7+ → betaN+
Still cannot reproduce on new laptop. The plugin crashes every now and again, and the browser hangs while it's crashing, but the browser always recovers. This is the testcase in comment #15, homemade opt build on Windows 7, new profile, RealPlayer nppl3260.dll version 6.0.12.775.

Martijn, is it possible that the browser crash only happens on Vista?
Hmm, from the crash reports I guess not.

I'm not sure how to proceed here.

Comment 46

8 years ago
some more possible ideas for testing based on recent user comments.  these all look like a problem beyond real player.

   1 It still keeps freezing in flash apps. You're forcing me to use IE8 in Facebook.   infact the crash url is in IETAB .../extensions/ietab@ip.cn/chrome/content/container.html
   1 this has been going on for sometime now....constantly having to re-fresh the facebook web site

  inthemafia, frontierville, cafeworld, treasuremadness


   1 The Google Earth add-on is constantly causing Firefox to freeze up.
   1 crach by google earth plugin

   several http://www.orkut.com.br main user pages

this looks like it might require a login to get to but maybe we can set up a test account.   please tell me why my page keeps crashing. can't you fix this problem? blackplanet.com
 http://www.blackpeoplemeet.com/community/media/photo.cfm?Profile=AC2CB93318DD8B8D&type=1&photo=10982563&search=false&back=slideshow&backparam=&from=Photo 

   1 The browser just locks up solid when accessing Java content.
   1 Trying to access HP iLO... crashed when the Java launched.
2291 \N

  1 stupid plugin container... please fix from crashing firefox.. or better get rid of it

 several reports with google talk gadget

last few days if im in a window and click on link or other tab i have to minimize whole thing and bring it back up for it to complete the proccess. when that doesnt work it freezes and eventually crash.  ... crash url associated with this comment is 
   1 wyciwyg://7/http://talkgadget.google.com/talkgadget/notifierclient?client=sm&prop=iGoogle&nav=true&fid=gtn-roster-iframe-id&ts=0&debug=undefined&os=Win32&stime=1286483338397&fb=false&re=true&no=undefined&hc=true&ref=true&href=http%3A%2F%2Fwww.google.com%2
chofmann: Any way to extract the most common flash version from the people that are hitting this issue? That will help a bit with facebook testing.

If Martijn can get this in a VM with record and replay enabled, we can easily fix this.

Comment 49

8 years ago
flash versions for this signature where the crash was on facebook.

  30 10.1.85.3
  17 10.1.82.76
   5 10.1.53.64
   4 [blank]
   3 10.0.22.87
   2 10.0.2.54
   1 9.0.45.0
   1 10.2.161.23
   1 10.2.161.22
   1 10.1.51.45
   1 10.0.45.2
   1 10.0.32.18
(In reply to comment #44)
> Still cannot reproduce on new laptop. The plugin crashes every now and again,
> and the browser hangs while it's crashing, but the browser always recovers.
> This is the testcase in comment #15, homemade opt build on Windows 7, new
> profile, RealPlayer nppl3260.dll version 6.0.12.775.

Did you follow the steps in comment 7?
You need to open up the Task Manager and, under the 'Applications' tab, choose to end the Minefield task. The Minefield.exe process will still be there and will eventually crash after 20s or so.
Ah, so in comment #7, after the browser is in the zombie state, if you don't do anything it eventually recovers? That wasn't clear.
Zombies don't usually come back to life :-)
(In reply to comment #51)
> Ah, so in comment #7, after the browser is in the zombie state, if you don't do
> anything it eventually recovers? That wasn't clear.

Yes, it eventually recovers. Unless you close the browser in the Task Manager in the Applications tab.
ah ok, great! Then I can reproduce this. Hopefully I'll be able to figure out what's going on.

Updated

8 years ago
Whiteboard: reproducible

Updated

8 years ago
Keywords: reproducible
Whiteboard: reproducible

Comment 55

8 years ago
This is still appearing in the top 50 for the trunk.
A week's worth of trunk data shows 53 crashes for the week, all Windows: http://tinyurl.com/2cgbzwo
The parent widget at the crashing site will be whatever nsObjectFrame::CreateWidget passes as the parent widget when it creates the plugin widget. And it passes PresContext()->GetRootPresContext()->PresShell()->FrameManager()->GetRootFrame()->GetNearestWidget(), so if the plugin is in a document that has somehow been disconnected from the root document the parent widget could be null. Maybe we should check for null parent widget in nsObjectFrame::CreateWidget and refuse to show the plugin in that case?
Yeah, I guess we should just do that. But I also want to look into the hang state we get into.
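
The shape of the proposed guard can be sketched as follows. This is a simplified illustration, not the patch that landed: `Widget`, `CreateResult`, and `CreatePluginWidget` are hypothetical stand-ins and do not correspond to the real nsIWidget or nsObjectFrame::CreateWidget signatures.

```cpp
#include <memory>

// Hypothetical stand-in for a native widget (not Gecko's nsIWidget).
struct Widget {
  Widget* parent = nullptr;
};

enum class CreateResult { Ok, NoParentWidget };

// Sketch of the guard: refuse to create the plugin widget when no parent
// widget could be found, instead of dereferencing a null pointer later.
CreateResult CreatePluginWidget(Widget* parent, std::unique_ptr<Widget>& out) {
  if (!parent) {
    // A plugin widget should never be toplevel; bail out so the plugin is
    // simply not shown.
    return CreateResult::NoParentWidget;
  }
  out.reset(new Widget());
  out->parent = parent;
  return CreateResult::Ok;
}
```

A caller would treat `NoParentWidget` as "do not instantiate the plugin". That avoids the crash but leaves open the question of how the parent widget became null in the first place.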

Comment 59

8 years ago
This is #10 for beta7 with over 1900 crashes for the past week. It's lower at #25 on the trunk but we still have a bunch of older regressions polluting the list.
Created attachment 497913 [details] [diff] [review]
wallpaper-ish fix

I think this is the right thing to do. I don't quite buy the explanation that we're in a disconnected document; if that were the case, GetRootPresContext should return null. But this should fix the crash anyway.
Attachment #497913 - Flags: review?(joshmoz)
Keywords: testcase

Updated

8 years ago
Attachment #497913 - Flags: review?(joshmoz) → review+
Keywords: checkin-needed
Whiteboard: [needs landing]
http://hg.mozilla.org/mozilla-central/rev/28487d9876c5
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [needs landing]
Crash Signature: [@ nsPluginInstanceOwner::CreateWidget() ]