Closed Bug 648935 Opened 13 years ago Closed 13 years ago

Firefox freezes when an alert box generated by Flash is shown.

Categories

(Core Graveyard :: Plug-ins, defect)

x86
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED
mozilla6

People

(Reporter: rshimazu, Assigned: jimm)

References

()

Details

(Keywords: testcase)

Attachments

(10 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0) Gecko/20100101 Firefox/4.0
Build Identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0) Gecko/20100101 Firefox/4.0

When you set "dom.ipc.plugins.enabled" as true, you can not dismiss the alert box generated by Flash's ExternalInterface.call method. Firefox freezes. 

Reproducible: Always

Steps to Reproduce:
1. Go to the test case: http://www.broadband-xp.com/test/firefox4_alert/alert_test2.html
2. Click the character "A"
3.
Actual Results:  
Even if you click "OK" button, the alert box does not disappear but Firefox freezes.

Expected Results:  
If you click "OK" button, the alert box should disappear.

When you set "dom.ipc.plugins.enabled" as false, you will not see this problem.
For this test, you have to set it as the default.

http://kishiken.com/as/javascript/test.html
is also a test case even though this page is not mine.
Confirmed on Mozilla/5.0 (Windows NT 6.1; rv:2.0.1pre) Gecko/20110410 Firefox/4.0.1pre
Component: General → Plug-ins
Keywords: testcase
Product: Firefox → Core
QA Contact: general → plugins
Version: unspecified → Trunk
Attached file stack (obsolete) —
Status: UNCONFIRMED → NEW
Ever confirmed: true
See Also: → 561075
Interesting hang. Neither process is held up by a system call. The child is running in RPCChannel's WaitForNotify and the parent is spinning around in a javascript loop waiting for the alert to close. The js loop calls ProcessNextNativeEvent so incoming Windows events get processed. But for some reason the system won't allow the window to come to the forefront and accept and deliver mouse events. There doesn't appear to be any deferring of messages going on, I can shake to minimize the main window and the events it receives are delivered to the proper event procedure.

All in all strange hang, I'm not quite sure what's going on yet.
Attachment #525132 - Attachment is obsolete: true
What might be going on here is the event queue for the child window has some stuff in it windows wants delivered. We would defer anything that was non-queued and ignore anything that's queued. In some cases we've ran into with dialog focus queued messages also need to be delivered.
(In reply to comment #4)
> What might be going on here is the event queue for the child window has some
> stuff in it windows wants delivered. We would defer anything that was
> non-queued and ignore anything that's queued. In some cases we've ran into with
> dialog focus queued messages also need to be delivered.

This doesn't appear to be the case. Always spinning the event loop in the child doesn't solve this hang.
I've managed to hack my way around this, but I don't have a clean solution yet. First I setup RPCChannel's WaitForNotify in the child to always call into spin loop. That cleared out queued events that were pending for the plugin window. The other thing I had to do is block update window calls on the plugin from widget in the parent which hung. (There was a delayed paint event that fired after the alert was shown.)
Attached file debug log
Attached are the logs showing how we get into this hang. The child gets caught up waiting on a response to evaluate after deferring an incall from the parent. There's a comment in the RPC code detailing this, I’m not sure of our reasoning here:

http://mxr.mozilla.org/mozilla-central/source/ipc/glue/RPCChannel.cpp#232

The sequence of events is:

child: calls NPN_Evaluate on parent
parent: display alert & display the overlay plus clip the plugin out of the view to display the overlay properly.

(Windows posts focus and paint related events to the plugin windows. Oddly enough, the paint goes to the window in the parent, and the focus goes to the child window in the child process. Neat!)

child: plugin window receives WM_SETFOCUS, sends Msg_PluginFocusChange
parent: plugin window receives WM_PAINT, sends Msg_UpdateWindow
child: receives Msg_UpdateWindow, defers it and goes back to waiting on the NPN_Evaluate reply
parent: receives Msg_PluginFocusChange incall, processes it. 
parent: sits waiting on the response to Msg_UpdateWindow which never shows up.
-deadlock-
cc'cjones. Chris, curious why avoid processing those deferred incalls in the child.
Hmm, there's a pretty easy widget change that avoids this when the plugin is hidden, but unfortunately the same hang occurs when the plugin is shown again once the alert is dismissed. That's a bit harder to avoid.
The problem here is that both eval and FocusChange race with UpdateWin, but we're only recognizing the race with FocusChange.

 P                     C
------                ------
                       <- Evaluate
 PAINT                 SETFOCUS
 UpdateWin ->          <- FocusChange

  [UpdateWin races with Evaluate and FocusChange]

 reply FocusChange       (defer UpdateWin)
 reply Evaluate
          /`+==== [!!BUG!! should be doing this, aren't]


I should be able to write a test for this.
Actually I can't write an IPDL-only test for this.  This situation can only occur with the windows wakeup stuff (we *really* need tests there :/).  /me scratches head...
(Jim straightened me out; I misread the log, the alert triggered by the eval() actually spins a nested event loop.)  So I think I can write a cross-platform test after all.
Attached patch Test (obsolete) — Splinter Review
Run with
  $objdir/dist/bin/ipdlunittest.exe TestNestedLoopRace

This works fine for me locally :/.  Not sure what's up.
(In reply to comment #8)
> cc'cjones. Chris, curious why avoid processing those deferred incalls in the
> child.

So, the |continue| statement there is partly meant to loop back around to check |MaybeUndeferIncall();| (among other things).  In the test above, this is happening successfully, but for some reason that doesn't seem to be happening with the plugin stuff.  In your log, I see

"child is ignoring deferred incall in inner RPCChannel::Call while loop!"

several times; where in the code was that printf()?
(In reply to comment #14)
> (In reply to comment #8)
> > cc'cjones. Chris, curious why avoid processing those deferred incalls in the
> > child.
> 
> So, the |continue| statement there is partly meant to loop back around to check
> |MaybeUndeferIncall();| (among other things).  In the test above, this is
> happening successfully, but for some reason that doesn't seem to be happening
> with the plugin stuff.  In your log, I see

If I remember correctly, that returned early here:

if (mDeferred.top().rpc_remote_stack_depth_guess() < stackDepth)
   return;

I'll debug those stack depth numbers.

> 
> "child is ignoring deferred incall in inner RPCChannel::Call while loop!"
> 
> several times; where in the code was that printf()?

In that final else I added a check of !mDeferred.empty() and printed that if it was true.
Attached patch spin work aroundSplinter Review
This ended up being bug 592002 part 2.  Missed a spot :/.
Attachment #529808 - Attachment is obsolete: true
Attachment #529943 - Flags: review?(benjamin)
Some things I learned while hacking on this test
 - it's possible to simulate all (I think) the insanity of the windows message loop stuff with "pure IPDL"
 - the invariants on state are just about to the point where they don't fit in my head anymore.  Probably about time to model-check this code.
Thanks rshimazu for the great test case!

I'm not assigning this bug to myself pending the fix Jim wants to make for the spin loop.
Attachment #529943 - Flags: review?(benjamin) → review+
Attached patch cleanup v.1Splinter Review
deleting some left over code from the old modal dialogs code.
This eliminates some redundant focus related ipc calls when the plugin receives focus. Here's a log with an example of what this prevents:

[InstanceChild] before WM_MOUSEACTIVATE
[PPluginInstanceChild] Sending Msg_PluginFocusChange([TODO])
[PPluginInstanceParent] Received Msg_PluginFocusChange([TODO])
bool __thiscall mozilla::plugins::PluginInstanceParent::AnswerPluginFocusChange(const bool &)
[InstanceParent] before AnswerPluginFocusChange
[PPluginInstanceParent] Sending Msg_SetPluginFocus([TODO])
[PPluginInstanceChild] Received Msg_SetPluginFocus([TODO])
[InstanceChild] msg queue msgs: 203
bool __thiscall mozilla::plugins::PluginInstanceChild::AnswerSetPluginFocus(void)
[InstanceChild] before AnswerSetPluginFocus
[InstanceParent] deferring: '8'
[InstanceParent] deferring: '281'
[InstanceChild] PluginWindowProcInternal: 281
[InstanceChild] PluginWindowProcInternal: 282
[InstanceChild] PluginWindowProcInternal: 7
[InstanceChild] after AnswerSetPluginFocus (set)
[PPluginInstanceChild] Sending reply Reply_SetPluginFocus([TODO])
[PPluginInstanceParent] Received reply Reply_SetPluginFocus([TODO])
[PPluginInstanceParent] Sending reply Reply_PluginFocusChange([TODO])
[InstanceParent] after AnswerPluginFocusChange
[PPluginInstanceChild] Received reply Reply_PluginFocusChange([TODO])
[PPluginInstanceChild] Sending Msg_NPN_GetValue_NPNVWindowNPObject([TODO])
[InstanceChild] after WM_MOUSEACTIVATE
This fixes a problem where deferred WM_PAINT events force windows to cascade the paint event up to the parent triggering a CallUpdateWindow. I found some of these 'bounced' paint events in my logs while debugging so I worked up a fix. I'd like to go back and investigate further. This might be the cause of those empty update region paint events we were seeing in bug 543788.
Attached patch alert fix v.1Splinter Review
This adds support to the child side for receiving the spin event loop async message, and hooks the current RPC stack frame up to nsAppShell so that it can message the child when the gecko event loop spins.
As I reported in the beginning, when you set "dom.ipc.plugins.enabled" as false, you will not see this problem.

And today I found another "exception".
When I install an addon "Page Saver" ( http://pearlcrescent.com/products/pagesaver/ ), and enable "Capture Flash Contents"(though I am not sure about English translation) in it, I see no problem even if I keep "dom.ipc.plugins.enabled" as true. I can dismiss an alert generated by the flash content.

But once I disable "Capture Flash Contents" even if "Page Saver" is installed, I see the problem: I can not dismiss an alert generated by the flash content.

Can this information be a clue or a hint to solve this problem?
(In reply to comment #25)
> As I reported in the beginning, when you set "dom.ipc.plugins.enabled" as
> false, you will not see this problem.
> 
> And today I found another "exception".
> When I install an addon "Page Saver" (
> http://pearlcrescent.com/products/pagesaver/ ), and enable "Capture Flash
> Contents"(though I am not sure about English translation) in it, I see no
> problem even if I keep "dom.ipc.plugins.enabled" as true. I can dismiss an
> alert generated by the flash content.
> 
> But once I disable "Capture Flash Contents" even if "Page Saver" is
> installed, I see the problem: I can not dismiss an alert generated by the
> flash content.
> 
> Can this information be a clue or a hint to solve this problem?

rshimazu, I just fired off some try builds with the fixes posted here, the windows builds should be done in a few hours. If you can, please test with the windows build to confirm the patches here fix the alert problem.

The builds should start showing up here in a few hours:
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-23adfb9e1e13/
Comment on attachment 530718 [details] [diff] [review]
cleanup v.1

purely dead code removal, easy review.
Attachment #530718 - Flags: review?(benjamin)
Comment on attachment 530724 [details] [diff] [review]
[backed out]focus change trap v.1

cuts down on focus change traffic when the plugin is activated by the mouse.
Attachment #530724 - Flags: review?(benjamin)
Attached file focus test case
handy focus test case I use to test.
Comment on attachment 530726 [details] [diff] [review]
child side paint v.1

This cuts down on paint traffic when we defer a paint event. This isn't a deal killer for these fixes (we survive just fine without it) but it does cut down on calls to CallUpdateWindow in widget.
Attachment #530726 - Flags: review?(bent.mozilla)
Comment on attachment 530728 [details] [diff] [review]
alert fix v.1

the actual hang fix.
Attachment #530728 - Flags: review?(benjamin)
Comment on attachment 530726 [details] [diff] [review]
child side paint v.1

>-  HWND hWnd;
>+  HWND mWnd;

Nit: I totally don't care, but it is a little weird that this one has 'mWnd' while all the rest use 'hWnd'.
Attachment #530726 - Flags: review?(bent.mozilla) → review+
Hello, Jim

>try builds:
>https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-23adfb9e1e13/try-win32/

What is the filename of the file which I should try? There are so many files in the directory.
(In reply to comment #34)
> Hello, Jim
> 
> >try builds:
> >https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-23adfb9e1e13/try-win32/
> 
> What is the filename of the file which I should try? There are so many files
> in the directory.

Save this zip to your desktop:

https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-23adfb9e1e13/try-win32/firefox-6.0a1.en-US.win32.zip

Double click on it to open in explorer and drag the firefox folder out onto your desktop. Then double click to open that folder, and run firefox.exe. It should load up with your current profile. After you're done you can delete both the zip and the folder, your current install of fx should work normally.
(In reply to comment #35)
OK I downloaded the file, which works well now. No hang.
Thank you so much.
Attachment #530718 - Flags: review?(benjamin) → review+
Comment on attachment 530724 [details] [diff] [review]
[backed out]focus change trap v.1

Can PluginInstanceParent::AnswerPluginFocusChange nest so that mInAnswerFocusChange be true coming in? If so, we probably need to be a bit more careful about the values there (use mozilla::AutoRestore). If not, add an assertion.
Attachment #530724 - Flags: review?(benjamin) → review+
Comment on attachment 530728 [details] [diff] [review]
alert fix v.1

Make the NS_NOTREACHED a NS_RUNTIME_ABORT in PluginModuleChild
Attachment #530728 - Flags: review?(benjamin) → review+
(In reply to comment #37)
> Comment on attachment 530724 [details] [diff] [review] [review]
> focus change trap v.1
> 
> Can PluginInstanceParent::AnswerPluginFocusChange nest so that
> mInAnswerFocusChange be true coming in? If so, we probably need to be a bit
> more careful about the values there (use mozilla::AutoRestore). If not, add
> an assertion.

AutoRestore is perfect. Didn't know we had that!
Assignee: nobody → jmathies
Target Milestone: --- → mozilla6
Depends on: 658741
Attachment #530724 - Attachment description: focus change trap v.1 → [backed out]focus change trap v.1
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0a1) Gecko/20110725 Firefox/8.0a1
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20100101 Firefox/6.0

Verified issue on Windows 7 on the latest nightly and Firefox 6.0b3 using the test cases from Comment 0. The issue is no longer reproducible.

Setting resolution to VERIFIED FIXED.
Status: RESOLVED → VERIFIED
Depends on: 675645
No longer depends on: 675645
Depends on: 675645
Today I downloaded Firefox 6.0 and confirmed this problem had been fixed.
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: