Last Comment Bug 648935 - Firefox freezes when an alert box generated by Flash is shown.
: Firefox freezes when an alert box generated by Flash is shown.
Status: VERIFIED FIXED
: testcase
Product: Core
Classification: Components
Component: Plug-ins (show other bugs)
: Trunk
: x86 Windows 7
: -- normal with 1 vote (vote)
: mozilla6
Assigned To: Jim Mathies [:jimm]
:
Mentors:
http://www.broadband-xp.com/test/fire...
: 663773 (view as bug list)
Depends on: 658741 675645
Blocks:
  Show dependency treegraph
 
Reported: 2011-04-10 20:00 PDT by rshimazu
Modified: 2011-08-17 14:36 PDT (History)
15 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
stack (8.21 KB, text/plain)
2011-04-11 12:29 PDT, Jim Mathies [:jimm]
no flags Details
child and parent stacks (9.95 KB, text/plain)
2011-04-27 12:12 PDT, Jim Mathies [:jimm]
no flags Details
debug log (1.83 KB, text/plain)
2011-05-03 09:59 PDT, Jim Mathies [:jimm]
no flags Details
Test (6.46 KB, patch)
2011-05-03 12:54 PDT, Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
no flags Details | Diff | Splinter Review
spin work around (1.45 KB, patch)
2011-05-03 15:14 PDT, Jim Mathies [:jimm]
no flags Details | Diff | Splinter Review
Need to use the "remote stack depth view" when deciding whether to undefer, too (3.91 KB, patch)
2011-05-04 00:54 PDT, Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
benjamin: review+
Details | Diff | Splinter Review
Test for bug 648935. (5.77 KB, patch)
2011-05-04 00:57 PDT, Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
no flags Details | Diff | Splinter Review
cleanup v.1 (2.85 KB, patch)
2011-05-06 13:40 PDT, Jim Mathies [:jimm]
benjamin: review+
Details | Diff | Splinter Review
[backed out]focus change trap v.1 (2.72 KB, patch)
2011-05-06 13:44 PDT, Jim Mathies [:jimm]
benjamin: review+
Details | Diff | Splinter Review
child side paint v.1 (2.02 KB, patch)
2011-05-06 13:50 PDT, Jim Mathies [:jimm]
bent.mozilla: review+
Details | Diff | Splinter Review
alert fix v.1 (11.88 KB, patch)
2011-05-06 13:52 PDT, Jim Mathies [:jimm]
benjamin: review+
Details | Diff | Splinter Review
focus test case (980 bytes, text/html)
2011-05-09 13:29 PDT, Jim Mathies [:jimm]
no flags Details

Description rshimazu 2011-04-10 20:00:04 PDT
User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0) Gecko/20100101 Firefox/4.0
Build Identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0) Gecko/20100101 Firefox/4.0

When you set "dom.ipc.plugins.enabled" as true, you can not dismiss the alert box generated by Flash's ExternalInterface.call method. Firefox freezes. 

Reproducible: Always

Steps to Reproduce:
1. Go to the test case: http://www.broadband-xp.com/test/firefox4_alert/alert_test2.html
2. Click the character "A"
3.
Actual Results:  
Even if you click "OK" button, the alert box does not disappear but Firefox freezes.

Expected Results:  
If you click "OK" button, the alert box should disappear.

When you set "dom.ipc.plugins.enabled" as false, you will not see this problem.
For this test, you have to set it as the default.

http://kishiken.com/as/javascript/test.html
is also a test case even though this page is not mine.
Comment 1 aravindm 2011-04-10 22:24:30 PDT
Confirmed on Mozilla/5.0 (Windows NT 6.1; rv:2.0.1pre) Gecko/20110410 Firefox/4.0.1pre
Comment 2 Jim Mathies [:jimm] 2011-04-11 12:29:43 PDT
Created attachment 525132 [details]
stack
Comment 3 Jim Mathies [:jimm] 2011-04-27 12:12:49 PDT
Created attachment 528667 [details]
child and parent stacks

Interesting hang. Neither process is held up by a system call. The child is running in RPCChannel's WaitForNotify and the parent is spinning around in a javascript loop waiting for the alert to close. The js loop calls ProcessNextNativeEvent so incoming Windows events get processed. But for some reason the system won't allow the window to come to the forefront and accept and deliver mouse events. There doesn't appear to be any deferring of messages going on, I can shake to minimize the main window and the events it receives are delivered to the proper event procedure.

All in all strange hang, I'm not quite sure what's going on yet.
Comment 4 Jim Mathies [:jimm] 2011-04-27 12:17:16 PDT
What might be going on here is the event queue for the child window has some stuff in it windows wants delivered. We would defer anything that was non-queued and ignore anything that's queued. In some cases we've ran into with dialog focus queued messages also need to be delivered.
Comment 5 Jim Mathies [:jimm] 2011-04-27 13:03:04 PDT
(In reply to comment #4)
> What might be going on here is the event queue for the child window has some
> stuff in it windows wants delivered. We would defer anything that was
> non-queued and ignore anything that's queued. In some cases we've ran into with
> dialog focus queued messages also need to be delivered.

This doesn't appear to be the case. Always spinning the event loop in the child doesn't solve this hang.
Comment 6 Jim Mathies [:jimm] 2011-04-29 11:59:51 PDT
I've managed to hack my way around this, but I don't have a clean solution yet. First I setup RPCChannel's WaitForNotify in the child to always call into spin loop. That cleared out queued events that were pending for the plugin window. The other thing I had to do is block update window calls on the plugin from widget in the parent which hung. (There was a delayed paint event that fired after the alert was shown.)
Comment 7 Jim Mathies [:jimm] 2011-05-03 09:59:48 PDT
Created attachment 529739 [details]
debug log

Attached are the logs showing how we get into this hang. The child gets caught up waiting on a response to evaluate after deferring an incall from the parent. There's a comment in the RPC code detailing this, Iā€™m not sure of our reasoning here:

http://mxr.mozilla.org/mozilla-central/source/ipc/glue/RPCChannel.cpp#232

The sequence of events is:

child: calls NPN_Evaluate on parent
parent: display alert & display the overlay plus clip the plugin out of the view to display the overlay properly.

(Windows posts focus and paint related events to the plugin windows. Oddly enough, the paint goes to the window in the parent, and the focus goes to the child window in the child process. Neat!)

child: plugin window receives WM_SETFOCUS, sends Msg_PluginFocusChange
parent: plugin window receives WM_PAINT, sends Msg_UpdateWindow
child: receives Msg_UpdateWindow, defers it and goes back to waiting on the NPN_Evaluate reply
parent: receives Msg_PluginFocusChange incall, processes it. 
parent: sits waiting on the response to Msg_UpdateWindow which never shows up.
-deadlock-
Comment 8 Jim Mathies [:jimm] 2011-05-03 10:00:51 PDT
cc'cjones. Chris, curious why avoid processing those deferred incalls in the child.
Comment 9 Jim Mathies [:jimm] 2011-05-03 11:07:39 PDT
Hmm, there's a pretty easy widget change that avoids this when the plugin is hidden, but unfortunately the same hang occurs when the plugin is shown again once the alert is dismissed. That's a bit harder to avoid.
Comment 10 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-05-03 11:51:23 PDT
The problem here is that both eval and FocusChange race with UpdateWin, but we're only recognizing the race with FocusChange.

 P                     C
------                ------
                       <- Evaluate
 PAINT                 SETFOCUS
 UpdateWin ->          <- FocusChange

  [UpdateWin races with Evaluate and FocusChange]

 reply FocusChange       (defer UpdateWin)
 reply Evaluate
          /`+==== [!!BUG!! should be doing this, aren't]


I should be able to write a test for this.
Comment 11 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-05-03 12:00:08 PDT
Actually I can't write an IPDL-only test for this.  This situation can only occur with the windows wakeup stuff (we *really* need tests there :/).  /me scratches head...
Comment 12 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-05-03 12:12:10 PDT
(Jim straightened me out; I misread the log, the alert triggered by the eval() actually spins a nested event loop.)  So I think I can write a cross-platform test after all.
Comment 13 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-05-03 12:54:53 PDT
Created attachment 529808 [details] [diff] [review]
Test

Run with
  $objdir/dist/bin/ipdlunittest.exe TestNestedLoopRace

This works fine for me locally :/.  Not sure what's up.
Comment 14 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-05-03 13:19:06 PDT
(In reply to comment #8)
> cc'cjones. Chris, curious why avoid processing those deferred incalls in the
> child.

So, the |continue| statement there is partly meant to loop back around to check |MaybeUndeferIncall();| (among other things).  In the test above, this is happening successfully, but for some reason that doesn't seem to be happening with the plugin stuff.  In your log, I see

"child is ignoring deferred incall in inner RPCChannel::Call while loop!"

several times; where in the code was that printf()?
Comment 15 Jim Mathies [:jimm] 2011-05-03 13:59:48 PDT
(In reply to comment #14)
> (In reply to comment #8)
> > cc'cjones. Chris, curious why avoid processing those deferred incalls in the
> > child.
> 
> So, the |continue| statement there is partly meant to loop back around to check
> |MaybeUndeferIncall();| (among other things).  In the test above, this is
> happening successfully, but for some reason that doesn't seem to be happening
> with the plugin stuff.  In your log, I see

If I remember correctly, that returned early here:

if (mDeferred.top().rpc_remote_stack_depth_guess() < stackDepth)
   return;

I'll debug those stack depth numbers.

> 
> "child is ignoring deferred incall in inner RPCChannel::Call while loop!"
> 
> several times; where in the code was that printf()?

In that final else I added a check of !mDeferred.empty() and printed that if it was true.
Comment 16 Jim Mathies [:jimm] 2011-05-03 15:14:38 PDT
Created attachment 529850 [details] [diff] [review]
spin work around
Comment 17 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-05-04 00:54:56 PDT
Created attachment 529943 [details] [diff] [review]
Need to use the "remote stack depth view" when deciding whether to undefer, too

This ended up being bug 592002 part 2.  Missed a spot :/.
Comment 18 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-05-04 00:57:17 PDT
Created attachment 529945 [details] [diff] [review]
Test for bug 648935.

Some things I learned while hacking on this test
 - it's possible to simulate all (I think) the insanity of the windows message loop stuff with "pure IPDL"
 - the invariants on state are just about to the point where they don't fit in my head anymore.  Probably about time to model-check this code.
Comment 19 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-05-04 00:58:16 PDT
Thanks rshimazu for the great test case!

I'm not assigning this bug to myself pending the fix Jim wants to make for the spin loop.
Comment 20 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-05-04 13:57:24 PDT
First part
http://hg.mozilla.org/mozilla-central/rev/3f5b363ad39b
http://hg.mozilla.org/mozilla-central/rev/5f42997621ee
Comment 21 Jim Mathies [:jimm] 2011-05-06 13:40:02 PDT
Created attachment 530718 [details] [diff] [review]
cleanup v.1

deleting some left over code from the old modal dialogs code.
Comment 22 Jim Mathies [:jimm] 2011-05-06 13:44:37 PDT
Created attachment 530724 [details] [diff] [review]
[backed out]focus change trap v.1

This eliminates some redundant focus related ipc calls when the plugin receives focus. Here's a log with an example of what this prevents:

[InstanceChild] before WM_MOUSEACTIVATE
[PPluginInstanceChild] Sending Msg_PluginFocusChange([TODO])
[PPluginInstanceParent] Received Msg_PluginFocusChange([TODO])
bool __thiscall mozilla::plugins::PluginInstanceParent::AnswerPluginFocusChange(const bool &)
[InstanceParent] before AnswerPluginFocusChange
[PPluginInstanceParent] Sending Msg_SetPluginFocus([TODO])
[PPluginInstanceChild] Received Msg_SetPluginFocus([TODO])
[InstanceChild] msg queue msgs: 203
bool __thiscall mozilla::plugins::PluginInstanceChild::AnswerSetPluginFocus(void)
[InstanceChild] before AnswerSetPluginFocus
[InstanceParent] deferring: '8'
[InstanceParent] deferring: '281'
[InstanceChild] PluginWindowProcInternal: 281
[InstanceChild] PluginWindowProcInternal: 282
[InstanceChild] PluginWindowProcInternal: 7
[InstanceChild] after AnswerSetPluginFocus (set)
[PPluginInstanceChild] Sending reply Reply_SetPluginFocus([TODO])
[PPluginInstanceParent] Received reply Reply_SetPluginFocus([TODO])
[PPluginInstanceParent] Sending reply Reply_PluginFocusChange([TODO])
[InstanceParent] after AnswerPluginFocusChange
[PPluginInstanceChild] Received reply Reply_PluginFocusChange([TODO])
[PPluginInstanceChild] Sending Msg_NPN_GetValue_NPNVWindowNPObject([TODO])
[InstanceChild] after WM_MOUSEACTIVATE
Comment 23 Jim Mathies [:jimm] 2011-05-06 13:50:54 PDT
Created attachment 530726 [details] [diff] [review]
child side paint v.1

This fixes a problem where deferred WM_PAINT events force windows to cascade the paint event up to the parent triggering a CallUpdateWindow. I found some of these 'bounced' paint events in my logs while debugging so I worked up a fix. I'd like to go back and investigate further. This might be the cause of those empty update region paint events we were seeing in bug 543788.
Comment 24 Jim Mathies [:jimm] 2011-05-06 13:52:58 PDT
Created attachment 530728 [details] [diff] [review]
alert fix v.1

This adds support to the child side for receiving the spin event loop async message, and hooks the current RPC stack frame up to nsAppShell so that it can message the child when the gecko event loop spins.
Comment 25 rshimazu 2011-05-08 16:59:18 PDT
As I reported in the beginning, when you set "dom.ipc.plugins.enabled" as false, you will not see this problem.

And today I found another "exception".
When I install an addon "Page Saver" ( http://pearlcrescent.com/products/pagesaver/ ), and enable "Capture Flash Contents"(though I am not sure about English translation) in it, I see no problem even if I keep "dom.ipc.plugins.enabled" as true. I can dismiss an alert generated by the flash content.

But once I disable "Capture Flash Contents" even if "Page Saver" is installed, I see the problem: I can not dismiss an alert generated by the flash content.

Can this information be a clue or a hint to solve this problem?
Comment 26 Jim Mathies [:jimm] 2011-05-09 05:29:40 PDT
(In reply to comment #25)
> As I reported in the beginning, when you set "dom.ipc.plugins.enabled" as
> false, you will not see this problem.
> 
> And today I found another "exception".
> When I install an addon "Page Saver" (
> http://pearlcrescent.com/products/pagesaver/ ), and enable "Capture Flash
> Contents"(though I am not sure about English translation) in it, I see no
> problem even if I keep "dom.ipc.plugins.enabled" as true. I can dismiss an
> alert generated by the flash content.
> 
> But once I disable "Capture Flash Contents" even if "Page Saver" is
> installed, I see the problem: I can not dismiss an alert generated by the
> flash content.
> 
> Can this information be a clue or a hint to solve this problem?

rshimazu, I just fired off some try builds with the fixes posted here, the windows builds should be done in a few hours. If you can, please test with the windows build to confirm the patches here fix the alert problem.

The builds should start showing up here in a few hours:
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-23adfb9e1e13/
Comment 28 Jim Mathies [:jimm] 2011-05-09 13:27:10 PDT
Comment on attachment 530718 [details] [diff] [review]
cleanup v.1

purely dead code removal, easy review.
Comment 29 Jim Mathies [:jimm] 2011-05-09 13:27:49 PDT
Comment on attachment 530724 [details] [diff] [review]
[backed out]focus change trap v.1

cuts down on focus change traffic when the plugin is activated by the mouse.
Comment 30 Jim Mathies [:jimm] 2011-05-09 13:29:33 PDT
Created attachment 531120 [details]
focus test case

handy focus test case I use to test.
Comment 31 Jim Mathies [:jimm] 2011-05-09 13:30:55 PDT
Comment on attachment 530726 [details] [diff] [review]
child side paint v.1

This cuts down on paint traffic when we defer a paint event. This isn't a deal killer for these fixes (we survive just fine without it) but it does cut down on calls to CallUpdateWindow in widget.
Comment 32 Jim Mathies [:jimm] 2011-05-09 13:31:31 PDT
Comment on attachment 530728 [details] [diff] [review]
alert fix v.1

the actual hang fix.
Comment 33 Ben Turner (not reading bugmail, use the needinfo flag!) 2011-05-09 13:44:36 PDT
Comment on attachment 530726 [details] [diff] [review]
child side paint v.1

>-  HWND hWnd;
>+  HWND mWnd;

Nit: I totally don't care, but it is a little weird that this one has 'mWnd' while all the rest use 'hWnd'.
Comment 34 rshimazu 2011-05-09 18:00:10 PDT
Hello, Jim

>try builds:
>https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-23adfb9e1e13/try-win32/

What is the filename of the file which I should try? There are so many files in the directory.
Comment 35 Jim Mathies [:jimm] 2011-05-09 18:20:40 PDT
(In reply to comment #34)
> Hello, Jim
> 
> >try builds:
> >https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-23adfb9e1e13/try-win32/
> 
> What is the filename of the file which I should try? There are so many files
> in the directory.

Save this zip to your desktop:

https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-23adfb9e1e13/try-win32/firefox-6.0a1.en-US.win32.zip

Double click on it to open in explorer and drag the firefox folder out onto your desktop. Then double click to open that folder, and run firefox.exe. It should load up with your current profile. After you're done you can delete both the zip and the folder, your current install of fx should work normally.
Comment 36 rshimazu 2011-05-09 19:25:00 PDT
(In reply to comment #35)
OK I downloaded the file, which works well now. No hang.
Thank you so much.
Comment 37 Benjamin Smedberg [:bsmedberg] 2011-05-16 06:50:32 PDT
Comment on attachment 530724 [details] [diff] [review]
[backed out]focus change trap v.1

Can PluginInstanceParent::AnswerPluginFocusChange nest so that mInAnswerFocusChange be true coming in? If so, we probably need to be a bit more careful about the values there (use mozilla::AutoRestore). If not, add an assertion.
Comment 38 Benjamin Smedberg [:bsmedberg] 2011-05-16 06:55:40 PDT
Comment on attachment 530728 [details] [diff] [review]
alert fix v.1

Make the NS_NOTREACHED a NS_RUNTIME_ABORT in PluginModuleChild
Comment 39 Jim Mathies [:jimm] 2011-05-16 10:39:32 PDT
(In reply to comment #37)
> Comment on attachment 530724 [details] [diff] [review] [review]
> focus change trap v.1
> 
> Can PluginInstanceParent::AnswerPluginFocusChange nest so that
> mInAnswerFocusChange be true coming in? If so, we probably need to be a bit
> more careful about the values there (use mozilla::AutoRestore). If not, add
> an assertion.

AutoRestore is perfect. Didn't know we had that!
Comment 41 Ignas 2011-06-14 01:37:29 PDT
*** Bug 663773 has been marked as a duplicate of this bug. ***
Comment 42 Simona B [:simonab ] 2011-07-27 06:59:13 PDT
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0a1) Gecko/20110725 Firefox/8.0a1
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20100101 Firefox/6.0

Verified issue on Windows 7 on the latest nightly and Firefox 6.0b3 using the test cases from Comment 0. The issue is no longer reproducible.

Setting resolution to VERIFIED FIXED.
Comment 43 rshimazu 2011-08-17 14:36:50 PDT
Today I downloaded Firefox 6.0 and confirmed this problem had been fixed.

Note You need to log in before you can comment on or make changes to this bug.