Last Comment Bug 834127 - Windows is intermittently not setting WS_VISIBLE on the Plugin Hang UI dialog
: Windows is intermittently not setting WS_VISIBLE on the Plugin Hang UI dialog
Status: VERIFIED FIXED
:
Product: Core
Classification: Components
Component: Plug-ins (show other bugs)
: unspecified
: x86_64 Windows 7
: P2 normal (vote)
: mozilla21
Assigned To: Aaron Klotz [:aklotz]
: Manuela Muntean [Away]
Mentors:
: 851604 (view as bug list)
Depends on: 805591 858800
Blocks:
  Show dependency treegraph
 
Reported: 2013-01-23 20:41 PST by Aaron Klotz [:aklotz]
Modified: 2013-04-05 14:43 PDT (History)
10 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
unaffected
+
verified
+
verified


Attachments
Proposed patch (10.76 KB, patch)
2013-02-13 12:59 PST, Aaron Klotz [:aklotz]
benjamin: review+
bajaj.bhavana: approval‑mozilla‑aurora+
Details | Diff | Splinter Review

Description Aaron Klotz [:aklotz] 2013-01-23 20:41:25 PST
Found by Manuela, I was able to distill this into a repro:

1. Follow the steps to reproduce from https://bugs.adobe.com/jira/browse/SDK-26044
2. Click around in the tab strip
3. We expect the Plugin Hang UI to be shown after the threshold has been reached. Sometimes it does, but sometimes it doesn't show up.

plugin-hang-ui.exe is being started and Spy++ shows that the windows exist, have the correct owner, but are hidden. Windows always initially creates a dialog without WS_VISIBLE and then sets it some time after WM_INITDIALOG. For some reason this isn't always happening.
Comment 1 Masatoshi Kimura [:emk] 2013-01-24 07:01:02 PST
Is that reproducible with protected mode disabled?
Comment 2 Aaron Klotz [:aklotz] 2013-01-29 12:17:28 PST
(In reply to Masatoshi Kimura [:emk] from comment #1)
> Is that reproducible with protected mode disabled?

Yes.
Comment 3 Aaron Klotz [:aklotz] 2013-01-29 12:26:40 PST
plugin-hang-ui.exe is hanging due to implicit SendMessage calls that occur during dialog box creation. Windows is trying to synchronously send messages to other windows (whose threads are hung) to inform them that they are losing activation and keyboard focus.

I've been attempting to attack this from numerous angles. While I've managed to make some changes that reduce the frequency of the hang, it still happens.

These issues wouldn't exist if the dialog didn't have an owner. I consider this option to be the option of last resort. Removing the owner would cause grief because extra measures would need to be taken to ensure that the Plugin Hang UI always appears above Firefox in the Z-order. We get that for free courtesy of the owner-owned relationship.

I still have a few ideas up my sleeve for attacking this without going nuclear on the ownership aspect.
Comment 4 Aaron Klotz [:aklotz] 2013-01-29 12:50:48 PST
Here's the call stack from the hung plugin-hang-ui.exe:

USER32!NtUserShowWindow+0x15
USER32!DialogBox2+0x1ac
USER32!InternalDialogBox+0xe5
USER32!DialogBoxIndirectParamAorW+0x37
USER32!DialogBoxParamW+0x3f
plugin_hang_ui!mozilla::plugins::PluginHangUIChild::Show+0x21 [c:\users\aklotz\src\m-c\dom\plugins\ipc\hangui\pluginhanguichild.cpp @ 320]
plugin_hang_ui!mozilla::plugins::PluginHangUIChild::ShowAPC+0x9 [c:\users\aklotz\src\m-c\dom\plugins\ipc\hangui\pluginhanguichild.cpp @ 340]
ntdll_77400000!RtlDispatchAPC+0x4c
ntdll_77400000!KiUserApcDispatcher+0x25
KERNEL32!WaitForSingleObjectExImplementation+0x75
plugin_hang_ui!mozilla::plugins::PluginHangUIChild::WaitForDismissal+0x45 [c:\users\aklotz\src\m-c\dom\plugins\ipc\hangui\pluginhanguichild.cpp @ 361]
plugin_hang_ui!wmain+0x89 [c:\users\aklotz\src\m-c\dom\plugins\ipc\hangui\pluginhanguichild.cpp @ 396]
plugin_hang_ui!__tmainCRTStartup+0x122 [f:\dd\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 552]
KERNEL32!BaseThreadInitThunk+0xe
ntdll_77400000!__RtlUserThreadStart+0x70
ntdll_77400000!_RtlUserThreadStart+0x1b
Comment 5 Aaron Klotz [:aklotz] 2013-01-29 12:59:28 PST
The parameters on the stack for the ShowWindow call are:
hWnd == (Plugin Hang UI Dialog HWND)
nCmdShow == SW_SHOWNORMAL
Comment 6 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2013-01-30 10:59:33 PST
Here's my educated guess:

DialogBox creates the window with the owner relationship. When this relationship is set up, Windows automatically attaches the input queues. Then some of the activate/showwindow calls are dispatched to the owner window before WM_INITDIALOG. Because the owner window is stuck, the modal dialog loop just sits.

Have you tried creating the window with no owner (in DialogBox) and the setting the owner in the WM_INITDIALOG handler immediately before detaching the event queues? (via SetWindowLong(handle, GWL_HWNDPARENT, foo) I don't know whether this will end up positioning/z-ordering them correctly or not...
Comment 7 Aaron Klotz [:aklotz] 2013-01-30 14:29:03 PST
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #6)

> Have you tried creating the window with no owner (in DialogBox) and the
> setting the owner in the WM_INITDIALOG handler immediately before detaching
> the event queues? (via SetWindowLong(handle, GWL_HWNDPARENT, foo) I don't
> know whether this will end up positioning/z-ordering them correctly or not...

I tried that and there were still hangs in the subsequent SetWindowPos call (which is necessary to notify windows about the state change).

I revisited the idea this afternoon and it looks like this works when combined with some of the other changes I was experimenting with. It looks like I've managed to quash this thing after putting together the right combination of fixes. Patch pending...
Comment 8 Aaron Klotz [:aklotz] 2013-02-04 16:44:05 PST
This is annoying: While the combined patch had shown the best results so far, I eventually hit intermittent hangs in the Plugin Hang UI after it was shown. It got stuck in a call to PeekMessage (!). I fired up LiveKd and dumped the kernel-mode call stack of that thread. It turns out that the PeekMessage implementation in win32k.sys was attempting to send a synchronous, inter-thread message to the "MSCTFIME UI" window in the Flash helper process.

I went back and reproduced the pre-show hangs and it looks like those calls are blocking on the same thing. For some reason Windows is being insistent on messaging that hidden IME window.
Comment 9 Aaron Klotz [:aklotz] 2013-02-13 12:59:03 PST
Created attachment 713586 [details] [diff] [review]
Proposed patch

I think that for now I've exhausted my avenues when it comes to detaching input queues while at the same time maintaining an owner-owned relationship between Firefox and Plugin Hang UI windows. Too many calls result in either the Window manager or DefWindowProc trying to send a synchronous inter-thread message to an IME window owned by a hung flash thread.

Various things that I've tried (individually and together):
- Deferring setting the owner after the dialog has been shown;
- Enumerating all Windows with a child/owned relationship to Firefox and attempting to explicitly detach their input queues;
- Pumping messages in PluginHangUIParent until the Plugin Hang UI has shown;
- Temporarily severing the parent-child relationship between the Firefox window and the plugin-container window for the duration of the hang;
- Trying to prevent WM_MOUSEACTIVATE and WM_NC* messages from propagating to the owner window;
- (Not for production use) Walking the entire thread list and attempting to explicitly detach from everybody else;
- Probably some other ridiculous tactics that my mind has suppressed and therefore can no longer recall.

Instead, I propose that we create the Plugin Hang UI with a NULL owner. This patch uses a NULL owner but also sets the App User Model ID so that the dialog is properly grouped with Firefox on the Windows 7 taskbar.

Testing with try builds over the past few days has demonstrated that as long as the Firefox window stays hung, it won't be promoted above the Plugin Hang UI in the Z-order. In a multiple-window situation I have observed "other" Firefox windows jumping in front, but this behavior is really no different than what would would happen when specifying an owner.

Perhaps this problem can be reevaluated in the future if we are able to gain new insights. For the moment, I believe that a null owner is the only way that we can guarantee that the Plugin Hang UI's input queue won't become entangled with the others.
Comment 10 Brian R. Bondy [:bbondy] 2013-02-14 10:10:16 PST
Comment on attachment 713586 [details] [diff] [review]
Proposed patch

Review of attachment 713586 [details] [diff] [review]:
-----------------------------------------------------------------

I'm fine with this but you may want to pass it by a plugin peer.
Comment 11 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2013-02-14 12:35:19 PST
Comment on attachment 713586 [details] [diff] [review]
Proposed patch

Should we remove the code which explicitly separates the input queues, since they should no longer be automatically attached by the owner relationship?
Comment 12 Aaron Klotz [:aklotz] 2013-02-14 13:44:35 PST
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #11)
> Comment on attachment 713586 [details] [diff] [review]
> Proposed patch
> 
> Should we remove the code which explicitly separates the input queues, since
> they should no longer be automatically attached by the owner relationship?

I removed that code from the WM_INITDIALOG handler as part of the patch.
Comment 13 Aaron Klotz [:aklotz] 2013-02-14 13:45:36 PST
Marking checkin-needed. Try build:
https://tbpl.mozilla.org/?tree=Try&rev=e161f64acac6
Comment 14 Ryan VanderMeulen [:RyanVM] 2013-02-14 15:04:57 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/960f1346d63d
Comment 15 Aaron Klotz [:aklotz] 2013-02-14 15:09:08 PST
Comment on attachment 713586 [details] [diff] [review]
Proposed patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 805591
User impact if declined: Plugin Hang UI will itself intermittently hang
Testing completed (on m-c, etc.): Builds have been tested both by myself and by QA on a variety of Windows platforms
Risk to taking this patch (and alternatives if risky): Low, this patch is pretty simple. An alternative is to disable the Plugin Hang UI via pref change for this version of Firefox.
String or UUID changes made by this patch: None
Comment 16 Ryan VanderMeulen [:RyanVM] 2013-02-15 06:52:13 PST
https://hg.mozilla.org/mozilla-central/rev/960f1346d63d
Comment 17 bhavana bajaj [:bajaj] 2013-02-15 15:13:24 PST
Comment on attachment 713586 [details] [diff] [review]
Proposed patch

Low risk for a new feature in Fx20.Approving on aurora.
Comment 18 Vladan Djeric (:vladan) 2013-02-15 17:51:22 PST
Aaron: This patch does not apply cleanly because Aurora doesn't have bug 833560. Should we uplift that bug as well?
Comment 19 Vladan Djeric (:vladan) 2013-02-15 18:07:24 PST
https://hg.mozilla.org/releases/mozilla-aurora/rev/284d0218eb3d
Comment 20 Manuela Muntean [Away] 2013-03-08 06:52:12 PST
The Plugin Hang UI is shown now, on a Windows 7 64-bit machine, with all these builds:

- latest Nightly (build ID: 20130307030926)
- latest Aurora (build ID: 20130307042016)
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451)

I'll continue testing on other Windows OSs, and come back with the results here.
Comment 21 Manuela Muntean [Away] 2013-03-11 09:18:42 PDT
Here are the results of my investigations:

1) Windows XP 32-bit: - the Plugin Hang UI is shown for builds:

- Nightly (build ID: 20130307030926)
- Aurora (build ID: 20130307042016)
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451)


2) Windows 8 32-bit: 

- Nightly (build ID: 20130307030926) - the Plugin Hang UI isn't shown
- Aurora (build ID: 20130307042016) - the Plugin Hang UI isn't shown
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451) - the Plugin Hang UI is shown

3) Windows 8 64-bit: - the Plugin Hang UI is shown for builds:

- latest Nightly (build ID: 20130311030946)
- latest Aurora (build ID: 20130310042012)
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451)

4) Windows Vista 32-bit: - the Plugin Hang UI is shown for builds:

- latest Nightly (build ID: 20130311030946)
- latest Aurora (build ID: 20130311042011)
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451)

5) Windows 7 32-bit: - the Plugin Hang UI is shown for builds:

- latest Nightly (build ID: 20130311030946)
- latest Aurora (build ID: 20130311042011)
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451)
Comment 22 Manuela Muntean [Away] 2013-03-18 00:46:33 PDT
*** Bug 851604 has been marked as a duplicate of this bug. ***
Comment 23 Manuela Muntean [Away] 2013-03-18 00:55:45 PDT
Marking this verified for Firefox 20 and 21, considering comment 20, comment 21 and that this is an intermittent issue.

Note You need to log in before you can comment on or make changes to this bug.