Last Comment Bug 767775 - Firefox crash in mozilla::plugins::PPluginInstance::Transition with abort message: "###!!! ABORT: __delete__()d actor: file e:/builds/moz2_slave/rel-m-rel-w32-bld/build/obj-firefox/ipc/ipdl/PPluginInstance.cpp, line 28"
: Firefox crash in mozilla::plugins::PPluginInstance::Transition with abort mes...
Status: VERIFIED FIXED
[flash-11.3]
: crash, topcrash
Product: Core
Classification: Components
Component: Plug-ins (show other bugs)
: 13 Branch
: x86 Windows 7
: -- critical (vote)
: mozilla16
Assigned To: Josh Matthews [:jdm]
:
Mentors:
Depends on: 767882
Blocks: 773996
  Show dependency treegraph
 
Reported: 2012-06-23 23:57 PDT by Scoobidiver (away)
Modified: 2012-08-03 00:49 PDT (History)
10 users (show)
ryanvm: in‑testsuite+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
+
wontfix
+
verified


Attachments
ipdltest testcase (9.04 KB, patch)
2012-07-10 08:24 PDT, Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg]
cjones.bugs: review+
Details | Diff | Splinter Review
Silently ignore any messages received between an RPC __delete__ message and its reply. (8.11 KB, patch)
2012-07-11 11:42 PDT, Josh Matthews [:jdm]
no flags Details | Diff | Splinter Review
Viciously and loudly kill any process sending messages that race with RPC __delete__ messages. (8.21 KB, patch)
2012-07-11 12:55 PDT, Josh Matthews [:jdm]
cjones.bugs: review+
Details | Diff | Splinter Review
Viciously and loudly kill any process sending messages that race with RPC __delete__ messages. Test by bsmedberg. (16.39 KB, patch)
2012-07-13 10:56 PDT, Josh Matthews [:jdm]
lukasblakk+bugs: approval‑mozilla‑beta+
Details | Diff | Splinter Review

Description Scoobidiver (away) 2012-06-23 23:57:21 PDT
It's #20 top browser crasher in 13.0.1, #14 in 14.0b8, #35 in 15.0a2 and #60 in 16.0a1.
It might be related to bug 763405 in XUL Fennec.

Almost all comments talk about Flash 11.3.

There's an abort message with browser crashes, not Flash crashes.

Stack traces are various but it seems there's a main pattern depending on the channel:
* up to 15.0a2:
Frame 	Module 	Signature 	Source
0 	mozalloc.dll 	mozalloc_abort 	memory/mozalloc/mozalloc_abort.cpp:46
1 	xul.dll 	NS_DebugBreak_P 	xpcom/base/nsDebugImpl.cpp:369
2 	xul.dll 	mozilla::plugins::PPluginInstance::Transition 	obj-firefox/ipc/ipdl/PPluginInstance.cpp:28
3 	xul.dll 	mozilla::plugins::PPluginInstanceParent::OnCallReceived 	obj-firefox/ipc/ipdl/PPluginInstanceParent.cpp:2063
4 	xul.dll 	mozilla::plugins::PPluginModuleParent::Lookup 	obj-firefox/ipc/ipdl/PCompositorChild.cpp:342
5 	xul.dll 	std::_Deque_iterator<IPC::Message,std::allocator<IPC::Message> >::operator- 	deque:457
6 	xul.dll 	mozilla::ipc::RPCChannel::Incall 	ipc/glue/RPCChannel.cpp:471
7 	xul.dll 	Pickle::Pickle 	ipc/chromium/src/base/pickle.cc:38
8 	xul.dll 	mozilla::plugins::PPluginInstanceParent::Call__delete__ 	obj-firefox/ipc/ipdl/PPluginInstanceParent.cpp:213
9 	xul.dll 	xul.dll@0xc3ada3 

* in 16.0a1:
Frame 	Module 	Signature 	Source
0 	mozalloc.dll 	mozalloc_abort 	memory/mozalloc/mozalloc_abort.cpp:23
1 	xul.dll 	NS_DebugBreak_P 	xpcom/base/nsDebugImpl.cpp:369
2 	xul.dll 	mozilla::plugins::PPluginInstance::Transition 	obj-firefox/ipc/ipdl/PPluginInstance.cpp:28
3 	xul.dll 	mozilla::plugins::PPluginInstanceParent::OnCallReceived 	obj-firefox/ipc/ipdl/PPluginInstanceParent.cpp:2063
4 	xul.dll 	mozilla::plugins::PPluginModuleParent::OnCallReceived 	obj-firefox/ipc/ipdl/PPluginModuleParent.cpp:1276
5 	xul.dll 	mozilla::ipc::RPCChannel::DispatchIncall 	ipc/glue/RPCChannel.cpp:485
6 	xul.dll 	mozilla::ipc::RPCChannel::Incall 	ipc/glue/RPCChannel.cpp:471
7 	xul.dll 	mozilla::ipc::RPCChannel::Call 	ipc/glue/RPCChannel.cpp:279
8 	xul.dll 	mozilla::plugins::PPluginInstanceParent::Call__delete__ 	obj-firefox/ipc/ipdl/PPluginInstanceParent.cpp:213
9 	xul.dll 	mozilla::plugins::PluginModuleParent::NPP_Destroy 	dom/plugins/ipc/PluginModuleParent.cpp:393
10 	xul.dll 	nsNPAPIPluginInstance::Stop 	dom/plugins/base/nsNPAPIPluginInstance.cpp:213
11 	xul.dll 	nsPluginHost::StopPluginInstance 	dom/plugins/base/nsPluginHost.cpp:3084
12 	xul.dll 	nsObjectLoadingContent::DoStopPlugin 	content/base/src/nsObjectLoadingContent.cpp:2213
13 	xul.dll 	nsObjectLoadingContent::StopPluginInstance 	content/base/src/nsObjectLoadingContent.cpp:2244
...

More reports at:
https://crash-stats.mozilla.com/report/list?signature=mozalloc_abort%28char+const*+const%29+|+NS_DebugBreak_P+|+mozilla%3A%3Aplugins%3A%3APPluginInstance%3A%3ATransition%28mozilla%3A%3Aplugins%3A%3APPluginInstance%3A%3AState%2C+mozilla%3A%3Aipc%3A%3ATrigger%2C+mozilla%3A%3Aplugins%3A%3APPluginInstance%3A%3AState*%29
Comment 1 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2012-06-26 12:01:20 PDT
Did this go away in 14.0b9? I think this might be roughly the same this as bug 751826 and fixed by bug 686335.
Comment 2 Marcia Knous [:marcia - use ni] 2012-06-26 12:44:16 PDT
From what I can see in this query - http://tinyurl.com/dyx8e2o, that signature is still present in Beta 9.

(In reply to Benjamin Smedberg  [:bsmedberg] from comment #1)
> Did this go away in 14.0b9? I think this might be roughly the same this as
> bug 751826 and fixed by bug 686335.
Comment 3 Scoobidiver (away) 2012-06-26 12:57:40 PDT
It's #10 top browser crasher in 14.0b8 and #12 in 14.0b9 so it's no fixed.
Comment 4 Marcia Knous [:marcia - use ni] 2012-06-27 02:27:02 PDT
Total Count 	URL
546 	about:blank
308 	https://www.facebook.com/?ref=tn_tnmn
231 	http://www.facebook.com/?ref=tn_tnmn
185 	http://www.facebook.com/
107 	https://www.facebook.com/
73 	https://www.facebook.com/?ref=logo
68 	javascript:'';
68 	http://love.mail.ru/my/messages.phtml
51 	http://apps.facebook.com/logout.php
48 	https://apps.facebook.com/logout.php
46 	http://www.facebook.com/?ref=logo
42 	https://apps.facebook.com/thesimssocial/?pf_ref=retry
29 	http://messagerie-11.sfr.fr/webmail/framePub.htm
24 	http://www.mamba.ru/my/messages.phtml
22 	http://www.aol.com/
22 	http://messagerie-13.sfr.fr/webmail/framePub.htm
21 	http://www.runescape.com/game.ws?j=1
20 	about:newtab
20 	http://www.meez.com/create-avatar.dm
20 	http://e.mail.ru/cgi-bin/msglist?back=1
19 	http://mail.aol.com/36472-112/aol-6/en-us/Suite.aspx
18 	http://mail.aol.com/36515-112/aol-6/en-us/Suite.aspx
18 	http://love.mail.ru/
18 	http://messagerie-12.sfr.fr/webmail/framePub.htm
16 	http://www.yahoo.com/
16 	http://www.myspace.com/games/play/104283?pm_cmp=mdp_104283_AppsGame_YourGames
16 	http://www.rr.com/
15 	http://www.youtube.com/
14 	http://xat.com/fairytaileaki
14 	http://mail.qip.ru/internal/iframe/mailqip_240x400
12 	http://mail.aol.com/36515-111/aol-6/en-us/Suite.aspx
12 	https://banner.pt.ray.fi/flashpoker/11/%5Bfi%5D/index.php?pageid=game&appset=RAY
11 	http://e.mail.ru/cgi-bin/msglist
11 	http://mail.aol.com/36472-111/aol-6/en-us/Suite.aspx
10 	http://www.milfmovs.com/
10 	http://espn.go.com/espnradio/play?s=espn
10 	http://mci2.webkinz.com/loader4.php?go=exitScreenW&aol=0&fs=0
10 	http://www.fandango.com/movie-trailer/
10 	http://shcgame.klicknation.com/apps/heroes/pages/blank.php
10 	http://www.lequipe.fr/
10 	http://www.jackpotjoy.com/game/games-lobby-jackpotjoy#free,DailyGames
10 	http://www.google.com/
9 	http://mamba.ru/my/messages.phtml
9 	http://www.iltasanomat.fi/
9 	http://www.google.fr/
9 	http://www.yandex.ru/
9 	http://www.wp.pl/
8 	http://www.optimum.net/
8 	https://maps.google.com/
8 	http://www.voanews.com/burmese/news/?refresh=1
8 	https://www.igmarkets.com/dealing/pd/cfd/chart/tickchart.jsp?chartKey=chart0
8 	http://r.mail.ru/n77699981?sz=1&
7 	https://twitter.com/#!/
7 	http://www.tumblr.com/dashboard
7 	http://love.rambler.ru/my/messages.phtml
7 	http://www.programme-tv.net/programme/programme-tnt.html
7 	http://www.orkut.com.br/Main#Home
7 	http://polskalokalna.pl/wiadomosci/slaskie/news/matka-szymona-uslyszala-zarzut-z
7 	http://www.isky.co.nz/livetv/sky_sport_2
7 	http://www.myspace.com/
7 	http://mci2.webkinz.com/loader4.php?go=exitScreen&aol=0&fs=0
7 	http://www.news.com.au/
7 	http://br.yahoo.com/
7 	http://www.google.com.br/
6 	http://www.fandango.com/prometheus_141625/movieoverview?date=
6 	http://www.lacapital.com.ar/
6 	http://www.skyrock.com/m/common/redirect_to_lv.php
6 	http://platform.twitter.com/widgets/follow_button.html
6 	http://espn.go.com/
6 	http://www.swiatseriali.pl/seriale/julia-664/news-na-zyciowych-zakretach,nId,613
6 	http://fr.chat.tchatche.com/home/Index#
6 	https://mail.google.com/mail/u/0/?shva=1#inbox
6 	http://www.skyrock.com/
6 	http://r.mail.ru/cln9725/love.mail.ru/
6 	http://www.google.de/
6 	http://www.programme-tv.net/programme/toutes-les-chaines/
6 	http://www.odnoklassniki.ru/
5 	http://www.isky.co.nz/livetv/sky_sport_1
5 	https://www.facebook.com/login.php?login_attempt=1
5 	http://www.voanews.com/burmese/news/
5 	https://apps.facebook.com/autocollectbonuses/?fb_source=bookmark_favorites&ref=b
5 	http://r.mail.ru/
5 	http://www.rr.com/home/home
Comment 5 Virgil Dicu [:virgil] [QA] 2012-07-02 09:04:45 PDT
Mozilla/5.0 (Windows NT 6.1; rv:13.0) Gecko/20100101 Firefox/13.0.1

No luck with my attempt to reproduce. 
Tried 13.0.1, Flash 11.3 with a Nvidia GeForce video card.

Loaded a bunch of flash games on miniclip.com, facebook games, stressed the browser a bit-changing tabs, scrolling, back/forward navigation with no luck.

Comments mention a lot of flash sites and Flash 11.3, but nothing particularly useful in reproducing.
Comment 6 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2012-07-02 09:09:24 PDT
cjones, I thought that incorrect lookups on the parent side were merely supposed to abort the connection/kill the child but not abort the parent process. Can you help me understand whether we can make this condition less fatal and only kill off the misbehaving plugin?
Comment 7 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-02 12:51:54 PDT
IIRC we abort if the parent side is *using* a deleted actor, i.e. trying to send it in an IPC message.  That could indicate that it's holding on to freed memory.  *Receiving* a deleted actor from the child side causes the child process to be killed off.  What's happening here?
Comment 8 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2012-07-02 13:00:31 PDT
It appears to me as if PPluginInstanceParent::Call__delete__ is racing with an incoming bad actor ID.
Comment 9 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-02 13:05:49 PDT
If the Call__delete__() is happening after ActorDestroy(), then that's a bug on the parent side.  That's not allowed.
Comment 10 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2012-07-06 20:07:11 PDT
Chris, I'm pretty sure that the Call__delete__ is correct. At least it appears to be correct here:

https://crash-stats.mozilla.com/report/index/0b35c170-db02-4b73-8a0a-114602120707

It's the message that comes *in* during the __delete__ which is on the bad actor.
Comment 11 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2012-07-10 08:24:00 PDT
Created attachment 640609 [details] [diff] [review]
ipdltest testcase

cjones, here is a testcase in ipdltest form:

* parent creates new subactor                              
* parent immediately __delete__s subactor using RPC call
*    child receives subactor constructor, sends async ping()
* ping response wins the RPC race, but then aborts because its a message to a deleted actor
Comment 12 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2012-07-10 12:56:03 PDT
jdm gets a fun one!

Josh, this is something that we might want to uplift to 15 or even a 14 dot-release if we can make a safe-enough patch.
Comment 13 Alex Keybl [:akeybl] 2012-07-10 19:21:16 PDT
This is currently #9 on the 13.0.1 top crash list.
Comment 14 Josh Matthews [:jdm] 2012-07-11 11:42:01 PDT
Created attachment 641145 [details] [diff] [review]
Silently ignore any messages received between an RPC __delete__ message and its reply.
Comment 15 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2012-07-11 12:05:19 PDT
I really don't want to silently ignore this message: this is an error condition which should cause us to trigger error conditions in the child, essentially aborting it. We should just avoid aborting the parent process.
Comment 16 Josh Matthews [:jdm] 2012-07-11 12:55:39 PDT
Created attachment 641170 [details] [diff] [review]
Viciously and loudly kill any process sending messages that race with RPC __delete__ messages.
Comment 17 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-12 13:01:38 PDT
Comment on attachment 640609 [details] [diff] [review]
ipdltest testcase

>diff --git a/ipc/ipdl/test/cxx/PTestBadActor.ipdl b/ipc/ipdl/test/cxx/PTestBadActor.ipdl

>+child:
>+  PTestBadActorSub();
>+  __delete__();

Please a short note explaining what this test is doing.  (We can't
write a protocol for it because it wouldn't type check.)

>diff --git a/ipc/ipdl/test/cxx/TestBadActor.cpp b/ipc/ipdl/test/cxx/TestBadActor.cpp

>+bool
>+TestBadActorChild::RecvPTestBadActorSubConstructor(PTestBadActorSubChild* actor)
>+{
>+  if (!actor->SendPing()) {
>+    NS_WARNING("Couldn't send ping to an actor which supposedly isn't dead yet.");

This should be a fail(), Ping should always go through (from the
sending side.)

r=me with those.
Comment 18 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-12 13:29:38 PDT
Comment on attachment 641170 [details] [diff] [review]
Viciously and loudly kill any process sending messages that race with RPC __delete__ messages.

>diff -r 27f19e81dbfb -r c303291bffb5 ipc/ipdl/ipdl/lower.py

>+    def dyingStat(self):

dyingState, but this doesn't seem to be used anywhere, so can probably
kill it.

>+            # in the event of an RPC delete message, we want to silently ignore any
>+            # messages that are received that are not a reply to the original message

That's not entirely true ... we want to complain loudly and
machine-gun violators.  It's just, if the parent is the sender, it
wants to drop the messages on the floor.

>diff -r 27f19e81dbfb -r c303291bffb5 ipc/ipdl/ipdl/type.py

>+        self.hasBlockingDelete = False

The key here isn't that the __delete__ is blocking, but rather that
it's reentrant.  Let's call this |hasReentrantDelete| instead.

>+        if p.decl.type.hasDelete:
>+            p.decl.type.hasBlockingDelete = self.symtab.lookup(_DELETE_MSG).type.isRpc()

p.decl.type.hasReentrantDelete =
    p.decl.type.hasDelete and self.symtab.lookup(_DELETE_MSG).type.isRpc()

Did you run the compiler-only tests?  Be sure to ---

  make -C $objdir/ipc/ipdl/test check

will run both the compiler-only and C++ tests.

r=me with above fixed and with tests passing.
Comment 19 Josh Matthews [:jdm] 2012-07-13 10:56:07 PDT
Created attachment 641957 [details] [diff] [review]
Viciously and loudly kill any process sending messages that race with RPC __delete__ messages. Test by bsmedberg.
Comment 20 Josh Matthews [:jdm] 2012-07-14 08:08:28 PDT
I don't have an SSH key I can push with right now, so I would appreciate somebody else pushing this for me.
Comment 21 Ryan VanderMeulen [:RyanVM] 2012-07-14 10:10:11 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/d97fd94a6348
Comment 22 Ryan VanderMeulen [:RyanVM] 2012-07-14 16:18:22 PDT
https://hg.mozilla.org/mozilla-central/rev/d97fd94a6348
Comment 23 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2012-07-20 09:22:19 PDT
Verified via crash-stats that this fixes the browser process crash. I'd really like this for beta, is there any reason we shouldn't request it?
Comment 24 Josh Matthews [:jdm] 2012-07-20 09:33:20 PDT
No argument from my end.
Comment 25 Alex Keybl [:akeybl] 2012-07-20 15:34:44 PDT
Josh - please nominate for beta approval as soon as you get the chance. Thanks!
Comment 26 Josh Matthews [:jdm] 2012-07-20 15:45:39 PDT
Comment on attachment 641957 [details] [diff] [review]
Viciously and loudly kill any process sending messages that race with RPC __delete__ messages. Test by bsmedberg.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Flash update exposing underlying problem in IPDL code
User impact if declined: High rate of unnecessary browser crashes
Testing completed (on m-c, etc.): Crash signature disappeared on nightlies.
Risk to taking this patch (and alternatives if risky): Shouldn't be any. The patch handles a problematic situation by aggressively cleaning it up.
String or UUID changes made by this patch: None.
Comment 27 Lukas Blakk [:lsblakk] use ?needinfo 2012-07-23 13:04:21 PDT
Comment on attachment 641957 [details] [diff] [review]
Viciously and loudly kill any process sending messages that race with RPC __delete__ messages. Test by bsmedberg.

looks good, let's get this in today so it's in the beta going to build tomorrow morning.
Comment 28 :Ehsan Akhgari (away Aug 1-5) 2012-07-23 13:23:26 PDT
http://hg.mozilla.org/releases/mozilla-beta/rev/3548fbfdbef6
Comment 29 Virgil Dicu [:virgil] [QA] 2012-07-27 06:22:39 PDT
http://bit.ly/OhA8vu

Marking as verified - no crashes for Nightly 17 since the patch landed.

Will wait for one extra week before setting this to verified in 15 beta 2.
Comment 30 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-08-02 11:25:32 PDT
Virgil, can you please verify for Beta now that we have 15.0b3 builds? Thanks.
Comment 31 Virgil Dicu [:virgil] [QA] 2012-08-03 00:49:10 PDT
http://bit.ly/OhA8vu

Only one crash in 15 b2 since the fix landed. This is safe to be marked as verified.

Note You need to log in before you can comment on or make changes to this bug.