crash in mozilla::plugins::PluginInstanceParent::NPP_DestroyStream(_NPStream*, short)

RESOLVED FIXED in Firefox 39

Status

()

Core
Plug-ins
--
critical
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: cpeterson, Assigned: aklotz)

Tracking

({crash, reproducible})

unspecified
mozilla40
crash, reproducible
Points:
---

Firefox Tracking Flags

(firefox38 disabled, firefox39+ fixed, firefox40+ fixed)

Details

(crash signature)

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

3 years ago
This bug was filed from the Socorro interface and is 
report bp-1f09769b-29d8-427b-b23e-9611a2150417.
=============================================================

I hit this plugin crash three times in row when force reloading (SHIFT+CMD+R) cnn.com in a private browsing window.

bp-5914e8a2-9665-4976-9b42-12b8d2150417
bp-1f09769b-29d8-427b-b23e-9611a2150417
bp-bf8f191c-f870-4477-9a85-b180e2150417

Frame 	Module 	Signature 	Source
0 	XUL 	mozilla::plugins::PluginInstanceParent::NPP_DestroyStream(_NPStream*, short) 	dom/plugins/ipc/PluginInstanceParent.cpp
1 	XUL 	mozilla::plugins::PluginModuleParent::NPP_DestroyStream(_NPP*, _NPStream*, short) 	dom/plugins/ipc/PluginModuleParent.cpp
2 	XUL 	nsNPAPIPluginStreamListener::CleanUpStream(short) 	dom/plugins/base/nsNPAPIPluginStreamListener.cpp
3 	XUL 	nsNPAPIPluginInstance::Stop() 	dom/plugins/base/nsNPAPIPluginInstance.cpp
4 	XUL 	nsPluginHost::StopPluginInstance(nsNPAPIPluginInstance*) 	dom/plugins/base/nsPluginHost.cpp
5 	XUL 	nsPluginDestroyRunnable::Run() 	dom/plugins/base/nsPluginHost.cpp
6 	XUL 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp
7 	XUL 	NS_ProcessPendingEvents(nsIThread*, unsigned int) 	xpcom/glue/nsThreadUtils.cpp
8 	XUL 	nsBaseAppShell::NativeEventCallback() 	widget/nsBaseAppShell.cpp
9 	XUL 	nsAppShell::ProcessGeckoEvents(void*) 	widget/cocoa/nsAppShell.mm
Ø 10 	CoreFoundation 	CoreFoundation@0x80a00 	
Ø 11 	CoreFoundation 	CoreFoundation@0x72b8c 	
Ø 12 	CoreFoundation 	CoreFoundation@0x721be 	
Ø 13 	CoreFoundation 	CoreFoundation@0x71bd7 	
Ø 14 	HIToolbox 	HIToolbox@0x3256e 	
Ø 15 	HIToolbox 	HIToolbox@0x322e9 	
Ø 16 	HIToolbox 	HIToolbox@0x3212a 	
Ø 17 	AppKit 	AppKit@0x919ba 	
Ø 18 	AppKit 	AppKit@0x90f67 	
19 	XUL 	-[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] 	widget/cocoa/nsAppShell.mm
Ø 20 	AppKit 	AppKit@0x86bf2 	
21 	XUL 	nsAppShell::Run() 	widget/cocoa/nsAppShell.mm
22 	XUL 	nsAppStartup::Run() 	toolkit/components/startup/nsAppStartup.cpp
23 	XUL 	XREMain::XRE_mainRun() 	toolkit/xre/nsAppRunner.cpp
24 	XUL 	XREMain::XRE_main(int, char**, nsXREAppData const*) 	toolkit/xre/nsAppRunner.cpp
25 	XUL 	XRE_main 	toolkit/xre/nsAppRunner.cpp
26 	firefox 	main 	browser/app/nsBrowserApp.cpp
27 	firefox 	start
I don't see this bug in today's m-c nightly, testing on OS X 10.8.5, 10.9.5 and 10.10.2.  (I haven't yet upgraded from 10.10.2 to 10.10.3.)

Is this consistently reproducible?  Do you see it with a clean profile?

When you pressed cmd-shift-r, did you wait until the page had finished (re)loading?  I tried it both ways, but never crashed.
This crashes are reported as null-dereferences, and they really *are* null-derefences.  Note that rbx is null in the raw dumps.  I checked in the assembly code for the 2015-04-16 m-c nightly that these crashes take place as rbx is dereferenced.

So in the following code, 's' is null:

http://hg.mozilla.org/mozilla-central/annotate/a35163f83d22/dom/plugins/ipc/PluginInstanceParent.cpp#l1422

And we could trivially fix these crashes by doing a null check.  Though I suppose it's better to try to figure out *how* 's' can be null here.

As bug 1018360 comment #0 says, AMD64 processors have a design flaw that causes crash addresses > 0x7fffffffffff to be misreported (as null or 0xffffffffffffffff).  Memory poisoned addresses match this pattern.  So many (most?) Mac crashes that Socorro reports as null dereferences are really dereferences of memory poisoned values.
(Assignee)

Updated

3 years ago
Blocks: 1151804
(Assignee)

Comment 3

3 years ago
(In reply to Steven Michaud from comment #2)
> And we could trivially fix these crashes by doing a null check.  Though I
> suppose it's better to try to figure out *how* 's' can be null here.
> 

Indeed. That'll be how my Friday is spent!
Assignee: nobody → aklotz
Status: NEW → ASSIGNED
status-firefox39: --- → affected
OS: Mac OS X → All
(Assignee)

Comment 4

3 years ago
ASAN might have just given me the answer:
https://bugzilla.mozilla.org/show_bug.cgi?id=1151804#c16
Interesting.  I assume the ASan builds don't use jemalloc.

If this really is a use-after-free and jemalloc were in play, the crash would be at a memory-poisoned address.  Chris seems to have used a standard m-c nightly (for 2015-04-16), which (as far as I know) all use jemalloc.  So I don't think Chris's crashes involved any use-after-free.
Maybe Chris's crashes are use-after-destruction-but-just-before-free ... it that's possible.

Comment 7

3 years ago
Created attachment 8594337 [details]
WinDbg log of crash

I can reproduce this reliably.

STR:

1. Install DivX Web Player (only Web Player is needed.  Disable all the other crap unless you need it): http://www.divx.com/en/software/web-player/features
2. Launch Firefox with a new profile
3. Enable dom.ipc.plugins.asyncInit
4. Disable e10s and restart browser
5. Visit www.stagevu.com
6. Click the featured video on the left and allow the plugin
7. Happy crashing.

Updated

3 years ago
Keywords: reproducible
(Assignee)

Comment 8

3 years ago
Created attachment 8595036 [details] [diff] [review]
Null out NPStream::pdata and check for it
Attachment #8595036 - Flags: review?(jmathies)
(Assignee)

Comment 9

3 years ago
Comment on attachment 8595036 [details] [diff] [review]
Null out NPStream::pdata and check for it

Canceling this as I'm still seeing some failures on try.
Attachment #8595036 - Flags: review?(jmathies)

Comment 10

3 years ago
[Tracking Requested - why for this release]:
This is 10% of all Dev Edition browser crashes right now.
tracking-firefox39: --- → ?
(Assignee)

Comment 11

3 years ago
Created attachment 8595513 [details] [diff] [review]
Null out NPStream::pdata and check for it (r2)

OK, this one passes on non-e10s which is what we want.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6489f9b59b4c
Attachment #8595036 - Attachment is obsolete: true
Attachment #8595513 - Flags: review?(jmathies)

Updated

3 years ago
Attachment #8595513 - Flags: review?(jmathies) → review+
(Assignee)

Comment 12

3 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/815511fdfea6
Thanks!
From crash stats it looks like 38 and earlier are unaffected. Tracking 39+
status-firefox38: --- → unaffected
tracking-firefox39: ? → +
tracking-firefox40: --- → +
(Assignee)

Comment 14

3 years ago
Comment on attachment 8595513 [details] [diff] [review]
Null out NPStream::pdata and check for it (r2)

Approval Request Comment
[Feature/regressing bug #]: async plugin init
[User impact if declined]: instability when starting plugin
[Describe test coverage new/current, TreeHerder]: test_pluginstream_err.html when asyncInit is enabled. Tested locally and on try, but asyncInit is disabled on central.
[Risks and why]: Low. Patch is a simple fix and cause of crashing is well understood.
[String/UUID change made/needed]: None
Attachment #8595513 - Flags: approval-mozilla-aurora?
Comment on attachment 8595513 [details] [diff] [review]
Null out NPStream::pdata and check for it (r2)

Looks like a safe patch. Let's get this on Aurora to test. Aurora+
Attachment #8595513 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(Assignee)

Comment 16

3 years ago
https://hg.mozilla.org/releases/mozilla-aurora/rev/4abcc0869ba0
status-firefox38: unaffected → disabled
status-firefox39: affected → fixed
https://hg.mozilla.org/mozilla-central/rev/815511fdfea6
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
status-firefox40: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla40

Comment 18

3 years ago
I just received a crash [ bp-eecf1d46-4527-469c-990a-efbbf2150423 ] on a build that does not yet have this fix with the following slightly different signature:

[@ mozilla::plugins::PluginModuleParent::NPP_DestroyStream(_NPP*, _NPStream*, short) ]

Will the patch in this bug also cover this case?
Flags: needinfo?(aklotz)
(Assignee)

Comment 19

3 years ago
I'm not sure.
Flags: needinfo?(aklotz)

Comment 20

3 years ago
(In reply to Aaron Klotz [:aklotz] (please use needinfo) from comment #19)
> I'm not sure.

Not knowing or understanding the code but looking at this logically...

Given that you've added null checking to:
dom/plugins/ipc/BrowserStreamParent.cpp

And made a relevant change to:
dom/plugins/ipc/PluginInstanceParent.cpp

I'm assuming it would make sense for you to make the equivalent change to:
dom/plugins/ipc/PluginModuleParent.cpp

Do you want me to open a new bug so that can be done or do we need to wait for another crash?
Flags: needinfo?(aklotz)
(Assignee)

Comment 21

3 years ago
We haven't seen any reports for builds newer than April 22. I think we should hold off until we see any crashes with recent builds.
Flags: needinfo?(aklotz)
You need to log in before you can comment on or make changes to this bug.