Closed Bug 626343 Opened 11 years ago Closed 11 years ago
Crash [@ ns
Plugin Instance Owner::Create Widget() ]
This crash signature is supposed to be fixed in 4.0b9pre/20101225 after the fixing of bug 584251 but there has been no crash stop. I don't reopen it as it has a patch. It is #7 top crasher in 4.0b9 over the last week. Signature nsPluginInstanceOwner::CreateWidget() UUID ae94955a-bed7-43f4-9917-436492110116 Time 2011-01-16 13:53:43.216190 Uptime 1834 Last Crash 1118682 seconds (1.8 weeks) before submission Install Age 1834 seconds (30.6 minutes) since version was first installed. Product Firefox Version 4.0b10pre Build ID 20110116030325 Branch 2.0 OS Windows NT OS Version 5.1.2600 Service Pack 3 CPU x86 CPU Info GenuineIntel family 6 model 22 stepping 1 Crash Reason EXCEPTION_ACCESS_VIOLATION_READ Crash Address 0x0 App Notes AdapterVendorID: 8086, AdapterDeviceID: 2772 Frame Module Signature [Expand] Source 0 xul.dll nsPluginInstanceOwner::CreateWidget layout/generic/nsObjectFrame.cpp:6323 1 xul.dll nsPluginHost::DoInstantiateEmbeddedPlugin modules/plugin/base/src/nsPluginHost.cpp:1101 2 xul.dll nsPluginHost::InstantiateEmbeddedPlugin modules/plugin/base/src/nsPluginHost.cpp:967 3 xul.dll nsObjectFrame::InstantiatePlugin layout/generic/nsObjectFrame.cpp:1043 More reports at: http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=exact&query=&range_value=4&range_unit=weeks&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=nsPluginInstanceOwner%3A%3ACreateWidget%28%29
In nsPluginInstanceOwner::CreateWidget the mObjectFrame->CreateWidget call is inside an "if (!view || !mWidget)" where view is the (outer) view of the object frame. Although it seems quite unlikely that we wouldn't have a view, it seems wrong to proceed in that case. If view is null and mWidget is not then proceeding is certainly be the wrong thing to do. And if both are null I don't think its the right thing to do either. We could try taking the !view part out of that if, or just null check parentWidget at the crashing site. Mats, what do you think?
Looking at the code history the view null check stems from a time when nsObjectFrame::CreateWidget created the view for the nsObjectFrame and it looks like we didn't have inner views at all for object frames (?). It actually used to just be a "if (!view)" check, the !widget part was added later.
This is a high ranking regression since 3.6.x. Its been around for the last few betas and is up to nearly 400 crashes per day. It should block 4.0 final. nsPluginInstanceOwner::CreateWidget.. date total breakdown by build crashes count build, count build, ... 20110117 386 243 4.0b92011011019, 78 4.0b82010121417, 32 4.0b72010110414, 11 4.0b62010091408, 9 4.0b22010072019, 3 4.0b52010083108, 2 4.0b32010080519, 2 4.0b10pre2011011603, 2 4.0b10pre2011011503, 2 3.6.132010120307, 1 4.0b42010081813, 1 4.0b10pre2011011703,
Let's try this, if it has no effect on the crash numbers we can look elsewhere.
Attachment #504788 - Flags: review?(roc)
Attachment #504788 - Flags: review?(roc) → review+
Landed that patch http://hg.mozilla.org/mozilla-central/rev/dd10fd6a8b24 We'll need to evaluate if this had any effect once we get new nightlies and enough data.
looks like 250-400 crashes per day on the betas, but much lower volume on trunk so we might need to wait for a beta to see full impact. here are recent numbers. nsPluginInstanceOwner::CreateWidget.. date total breakdown by build crashes count build, count build, ... 20110117 386 243 4.0b92011011019, 78 4.0b82010121417, 32 4.0b72010110414, 11 4.0b62010091408, 9 4.0b22010072019, 3 4.0b52010083108, 2 4.0b32010080519, 2 4.0b10pre2011011603, 2 4.0b10pre2011011503, 2 3.6.132010120307, 1 4.0b42010081813, 1 4.0b10pre2011011703,
3 crashes so far on the 2011-01-19 nightly, so we can probably conclude that patch didn't fix it.
I noticed that in nsObjectFrame::Instantiate the InstantiatePlugin call (which eventually leads to our crash site) is followed by a weak frame check. So that means the object frame can go away, perhaps it goes away somewhere and we need to check that somewhere.
If my 'frame is destroyed' theory is correct, we would need to figure out where it can be destroyed so we know where to add weak frame checks.
Any time you make an NPAPI call (NPP_New, NPP_SetWindow, and many others) you should assume that the plugin may do scripting which may destroy the frame. Plugins do crazy things.
Enough beta10 data to conclude this had no effect now?
To see if my "frame is destroyed" theory is true at all we could do something ugly and check if a weak frame we init in nsObjectFrame::InstantiatePlugin and store somewhere accessible by nsPluginInstanceOwner::CreateWidget is alive and then crash by accessing an invalid and sentinel address and then see if we get that in the crash stats.
I have less faith in that theory as if the frame is destroyed the pointer stored in nsPluginInstanceOwner would get cleared, so it wouldn't be stale.
nsWindow::Create can fail and return a failure code, we don't get this. It can fail if ::CreateWindowEx fails for some reason. I'm not sure if this is happening, but we should probably do this anyway.
Attachment #508326 - Flags: review?(roc)
Attachment #508326 - Flags: review?(roc) → review+
http://hg.mozilla.org/mozilla-central/rev/73f7643d522d And I've got more ideas on how to justify another wallpaper patch if that doesn't work.
That added some debugging code by accident, backed that part out: http://hg.mozilla.org/mozilla-central/rev/6fbc3c08894e
I've been operating under the theory that this crash is caused by using the parent widget at the crashing location where we didn't before. We started using the parent widget with retained layers (landed mid July 2010) and the first bug on this crash (bug 584251) was filed August 3, 2010.
We have no crashes so far in builds that have the last patch. Given the frequency before that it is pretty good evidence that the patch fixed the crash. So that means that the call to ::CreateWindowExW in nsWindow::Create fails. Why would that fail? Maybe nsWindow::OnDestroy has fired on the parent widget and so it's mWnd is null?
(In reply to comment #18) > We have no crashes so far in builds that have the last patch. Given the > frequency before that it is pretty good evidence that the patch fixed the > crash. > > So that means that the call to ::CreateWindowExW in nsWindow::Create fails. Why > would that fail? Maybe nsWindow::OnDestroy has fired on the parent widget and > so it's mWnd is null? Yep, if the parent hwnd is null and you're creating a child, it'll fail. We have a warning in there on this that you should see.
If the reason for this crash is that OnDestroy has been called on the parent widget then I think the correct fix here is just to correctly handle that error case.
Adding [has patch] to whiteboard to indicate that we've got a fix for this issue and make some bugzilla accounting a bit easier. If you decide to go for a better fix in this bug, feel free to remove this whiteboard addition.
Whiteboard: [hardblocker] → [hardblocker][has patch][wallpaper fix landed]
I disagree that it is wallpaper. This seems to be a valid state and we need to deal with errors. Before resolving this I'm going to wait for beta 11 data. I will do any further work in a followup.
Whiteboard: [hardblocker][has patch][wallpaper fix landed] → [hardblocker][waiting for more crash data before resolving]
I'm sorry for mis-classifying as wallpaper. I wasn't making any judgement, I just misread one of your comments about the patch. I've removed that bit from the whiteboard but am keeping [has patch] since this bug does have a patch and any future patching will, if I understand it correctly, happen in a follow-up bug. Sorry also for the interruption and confusion.
Whiteboard: [hardblocker][waiting for more crash data before resolving] → [hardblocker][has patch][waiting for more crash data before resolving]
Calling this fixed. If it turns out its not, we can reopen or a new bug. The followup mentioned in comment 22 was just to check the return value in callers of CreateWidget. Which caused orange on try server, so maybe I'll just stop poking the sleeping-plugin-bear and call this a win.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [hardblocker][has patch][waiting for more crash data before resolving] → [hardblocker]
(FTR the the try server failure in comment 24 was layout/reftests/bugs/599476.html on mac.)
Hmm, still a few crashes in beta 11. The volume is vastly reduced though.
We got one crash in a win64 build, and this one actually has a deeper stack. And the result is interesting because we are thawing a document when it comes out of the bfcache. But still I don't see how it could fail. The view hierarchy etc should be intact at this point. I just wanted to preserve this somewhere in case I tackle this crash signature again. https://crash-stats.mozilla.com/report/index/ddc73da6-d950-44fd-9681-9a4912110218 0 xul.dll nsPluginInstanceOwner::CreateWidget layout/generic/nsObjectFrame.cpp:6627 1 xul.dll nsPluginHost::DoInstantiateEmbeddedPlugin modules/plugin/base/src/nsPluginHost.cpp:1104 2 mozcrt19.dll arena_dalloc obj-firefox/memory/jemalloc/crtsrc/jemalloc.c:4291 3 mozcrt19.dll arena_dalloc obj-firefox/memory/jemalloc/crtsrc/jemalloc.c:4291 4 mozcrt19.dll memcmp obj-firefox/memory/jemalloc/crtsrc/memcmp.c:60 5 xul.dll SearchTable obj-firefox/xpcom/build/pldhash.c:439 6 xul.dll mozilla::FramePropertyTable::Get layout/base/FramePropertyTable.cpp:105 7 xul.dll nsACString_internal::Equals xpcom/string/src/nsTSubstring.cpp:600 8 xul.dll nsComponentManagerImpl::GetService xpcom/components/nsComponentManager.cpp:1425 9 xul.dll nsDocument::QueryInterface content/base/src/nsDocument.cpp:1730 10 xul.dll nsPluginHost::InstantiateEmbeddedPlugin modules/plugin/base/src/nsPluginHost.cpp:970 11 xul.dll nsCOMPtr_base::assign_from_qi obj-firefox/xpcom/build/nsCOMPtr.cpp:96 12 xul.dll nsObjectFrame::InstantiatePlugin layout/generic/nsObjectFrame.cpp:1141 13 xul.dll nsCOMPtr_base::assign_with_AddRef obj-firefox/xpcom/build/nsCOMPtr.cpp:88 14 xul.dll nsObjectFrame::Instantiate layout/generic/nsObjectFrame.cpp:2692 15 xul.dll nsObjectLoadingContent::Instantiate content/base/src/nsObjectLoadingContent.cpp:1893 16 xul.dll matchPrefEntry modules/libpref/src/prefapi.cpp:111 17 xul.dll NS_TableDrivenQI obj-firefox/xpcom/build/nsISupportsImpl.cpp:49 18 xul.dll nsObjectLoadingContent::EnsureInstantiation content/base/src/nsObjectLoadingContent.cpp:917 19 xul.dll nsCOMPtr_base::assign_from_qi obj-firefox/xpcom/build/nsCOMPtr.cpp:96 20 xul.dll ThawElement layout/base/nsPresShell.cpp:7628 21 xul.dll xul.dll@0x364b8f 22 xul.dll EnumWalkAllRules content/xbl/src/nsBindingManager.cpp:1350 23 xul.dll PL_DHashTableEnumerate obj-firefox/xpcom/build/pldhash.c:754 24 xul.dll nsIDocument::EnumerateFreezableElements content/base/src/nsDocument.cpp:8040 25 xul.dll xul.dll@0x364b8f 26 xul.dll xul.dll@0x363993 27 xul.dll xul.dll@0x62c83f 28 xul.dll PresShell::Thaw layout/base/nsPresShell.cpp:7651 29 xul.dll nsDocShell::RestoreFromHistory docshell/base/nsDocShell.cpp:7347
Note that Win64 stacks are slightly less trustworthy, since I never actually fixed bug 548035, so they rely exclusively on stack scanning to walk the stack.
The RC seems to produce better stacks for the low volume crashes that are left. Just recording some of the interesting ones here. https://crash-stats.mozilla.com/report/index/e07bdd2d-f6d1-4590-9dfb-4a42c2110312 is a plugin instantiated because of script access. https://crash-stats.mozilla.com/report/index/689ef17e-989e-49e6-a578-935cd2110312 https://crash-stats.mozilla.com/report/index/fd06a573-c8f1-45e7-9aef-d02b42110312 https://crash-stats.mozilla.com/report/index/f21a2ff7-7df7-495a-b8f7-771402110312 https://crash-stats.mozilla.com/report/index/466d9237-54f0-4c8c-b93d-b40a52110312 are all from the async instantiate event.
Bug 507926 seems related.
Crash Signature: [@ nsPluginInstanceOwner::CreateWidget() ]
You need to log in before you can comment on or make changes to this bug.