This bug was filed from the Socorro interface and is report bp-1f09769b-29d8-427b-b23e-9611a2150417. ============================================================= I hit this plugin crash three times in row when force reloading (SHIFT+CMD+R) cnn.com in a private browsing window. bp-5914e8a2-9665-4976-9b42-12b8d2150417 bp-1f09769b-29d8-427b-b23e-9611a2150417 bp-bf8f191c-f870-4477-9a85-b180e2150417 Frame Module Signature Source 0 XUL mozilla::plugins::PluginInstanceParent::NPP_DestroyStream(_NPStream*, short) dom/plugins/ipc/PluginInstanceParent.cpp 1 XUL mozilla::plugins::PluginModuleParent::NPP_DestroyStream(_NPP*, _NPStream*, short) dom/plugins/ipc/PluginModuleParent.cpp 2 XUL nsNPAPIPluginStreamListener::CleanUpStream(short) dom/plugins/base/nsNPAPIPluginStreamListener.cpp 3 XUL nsNPAPIPluginInstance::Stop() dom/plugins/base/nsNPAPIPluginInstance.cpp 4 XUL nsPluginHost::StopPluginInstance(nsNPAPIPluginInstance*) dom/plugins/base/nsPluginHost.cpp 5 XUL nsPluginDestroyRunnable::Run() dom/plugins/base/nsPluginHost.cpp 6 XUL nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp 7 XUL NS_ProcessPendingEvents(nsIThread*, unsigned int) xpcom/glue/nsThreadUtils.cpp 8 XUL nsBaseAppShell::NativeEventCallback() widget/nsBaseAppShell.cpp 9 XUL nsAppShell::ProcessGeckoEvents(void*) widget/cocoa/nsAppShell.mm Ø 10 CoreFoundation CoreFoundation@0x80a00 Ø 11 CoreFoundation CoreFoundation@0x72b8c Ø 12 CoreFoundation CoreFoundation@0x721be Ø 13 CoreFoundation CoreFoundation@0x71bd7 Ø 14 HIToolbox HIToolbox@0x3256e Ø 15 HIToolbox HIToolbox@0x322e9 Ø 16 HIToolbox HIToolbox@0x3212a Ø 17 AppKit AppKit@0x919ba Ø 18 AppKit AppKit@0x90f67 19 XUL -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] widget/cocoa/nsAppShell.mm Ø 20 AppKit AppKit@0x86bf2 21 XUL nsAppShell::Run() widget/cocoa/nsAppShell.mm 22 XUL nsAppStartup::Run() toolkit/components/startup/nsAppStartup.cpp 23 XUL XREMain::XRE_mainRun() toolkit/xre/nsAppRunner.cpp 24 XUL XREMain::XRE_main(int, char**, nsXREAppData const*) toolkit/xre/nsAppRunner.cpp 25 XUL XRE_main toolkit/xre/nsAppRunner.cpp 26 firefox main browser/app/nsBrowserApp.cpp 27 firefox start
I don't see this bug in today's m-c nightly, testing on OS X 10.8.5, 10.9.5 and 10.10.2. (I haven't yet upgraded from 10.10.2 to 10.10.3.) Is this consistently reproducible? Do you see it with a clean profile? When you pressed cmd-shift-r, did you wait until the page had finished (re)loading? I tried it both ways, but never crashed.
This crashes are reported as null-dereferences, and they really *are* null-derefences. Note that rbx is null in the raw dumps. I checked in the assembly code for the 2015-04-16 m-c nightly that these crashes take place as rbx is dereferenced. So in the following code, 's' is null: http://hg.mozilla.org/mozilla-central/annotate/a35163f83d22/dom/plugins/ipc/PluginInstanceParent.cpp#l1422 And we could trivially fix these crashes by doing a null check. Though I suppose it's better to try to figure out *how* 's' can be null here. As bug 1018360 comment #0 says, AMD64 processors have a design flaw that causes crash addresses > 0x7fffffffffff to be misreported (as null or 0xffffffffffffffff). Memory poisoned addresses match this pattern. So many (most?) Mac crashes that Socorro reports as null dereferences are really dereferences of memory poisoned values.
(In reply to Steven Michaud from comment #2) > And we could trivially fix these crashes by doing a null check. Though I > suppose it's better to try to figure out *how* 's' can be null here. > Indeed. That'll be how my Friday is spent!
ASAN might have just given me the answer: https://bugzilla.mozilla.org/show_bug.cgi?id=1151804#c16
Interesting. I assume the ASan builds don't use jemalloc. If this really is a use-after-free and jemalloc were in play, the crash would be at a memory-poisoned address. Chris seems to have used a standard m-c nightly (for 2015-04-16), which (as far as I know) all use jemalloc. So I don't think Chris's crashes involved any use-after-free.
Maybe Chris's crashes are use-after-destruction-but-just-before-free ... it that's possible.
Created attachment 8594337 [details] WinDbg log of crash I can reproduce this reliably. STR: 1. Install DivX Web Player (only Web Player is needed. Disable all the other crap unless you need it): http://www.divx.com/en/software/web-player/features 2. Launch Firefox with a new profile 3. Enable dom.ipc.plugins.asyncInit 4. Disable e10s and restart browser 5. Visit www.stagevu.com 6. Click the featured video on the left and allow the plugin 7. Happy crashing.
Created attachment 8595036 [details] [diff] [review] Null out NPStream::pdata and check for it
Comment on attachment 8595036 [details] [diff] [review] Null out NPStream::pdata and check for it Canceling this as I'm still seeing some failures on try.
[Tracking Requested - why for this release]: This is 10% of all Dev Edition browser crashes right now.
Created attachment 8595513 [details] [diff] [review] Null out NPStream::pdata and check for it (r2) OK, this one passes on non-e10s which is what we want. https://treeherder.mozilla.org/#/jobs?repo=try&revision=6489f9b59b4c
From crash stats it looks like 38 and earlier are unaffected. Tracking 39+
Comment on attachment 8595513 [details] [diff] [review] Null out NPStream::pdata and check for it (r2) Approval Request Comment [Feature/regressing bug #]: async plugin init [User impact if declined]: instability when starting plugin [Describe test coverage new/current, TreeHerder]: test_pluginstream_err.html when asyncInit is enabled. Tested locally and on try, but asyncInit is disabled on central. [Risks and why]: Low. Patch is a simple fix and cause of crashing is well understood. [String/UUID change made/needed]: None
Comment on attachment 8595513 [details] [diff] [review] Null out NPStream::pdata and check for it (r2) Looks like a safe patch. Let's get this on Aurora to test. Aurora+
I just received a crash [ bp-eecf1d46-4527-469c-990a-efbbf2150423 ] on a build that does not yet have this fix with the following slightly different signature: [@ mozilla::plugins::PluginModuleParent::NPP_DestroyStream(_NPP*, _NPStream*, short) ] Will the patch in this bug also cover this case?
I'm not sure.
(In reply to Aaron Klotz [:aklotz] (please use needinfo) from comment #19) > I'm not sure. Not knowing or understanding the code but looking at this logically... Given that you've added null checking to: dom/plugins/ipc/BrowserStreamParent.cpp And made a relevant change to: dom/plugins/ipc/PluginInstanceParent.cpp I'm assuming it would make sense for you to make the equivalent change to: dom/plugins/ipc/PluginModuleParent.cpp Do you want me to open a new bug so that can be done or do we need to wait for another crash?
We haven't seen any reports for builds newer than April 22. I think we should hold off until we see any crashes with recent builds.