Closed Bug 834127 Opened 11 years ago Closed 11 years ago

Windows is intermittently not setting WS_VISIBLE on the Plugin Hang UI dialog

Categories

(Core Graveyard :: Plug-ins, defect, P2)

x86_64
Windows 7
defect

Tracking

(firefox19 unaffected, firefox20+ verified, firefox21+ verified)

VERIFIED FIXED
mozilla21
Tracking Status
firefox19 --- unaffected
firefox20 + verified
firefox21 + verified

People

(Reporter: bugzilla, Assigned: bugzilla)

References

Details

Attachments

(1 file)

Found by Manuela, I was able to distill this into a repro:

1. Follow the steps to reproduce from https://bugs.adobe.com/jira/browse/SDK-26044
2. Click around in the tab strip
3. We expect the Plugin Hang UI to be shown after the threshold has been reached. Sometimes it does, but sometimes it doesn't show up.

plugin-hang-ui.exe is being started and Spy++ shows that the windows exist, have the correct owner, but are hidden. Windows always initially creates a dialog without WS_VISIBLE and then sets it some time after WM_INITDIALOG. For some reason this isn't always happening.
Is that reproducible with protected mode disabled?
(In reply to Masatoshi Kimura [:emk] from comment #1)
> Is that reproducible with protected mode disabled?

Yes.
plugin-hang-ui.exe is hanging due to implicit SendMessage calls that occur during dialog box creation. Windows is trying to synchronously send messages to other windows (whose threads are hung) to inform them that they are losing activation and keyboard focus.

I've been attempting to attack this from numerous angles. While I've managed to make some changes that reduce the frequency of the hang, it still happens.

These issues wouldn't exist if the dialog didn't have an owner. I consider this option to be the option of last resort. Removing the owner would cause grief because extra measures would need to be taken to ensure that the Plugin Hang UI always appears above Firefox in the Z-order. We get that for free courtesy of the owner-owned relationship.

I still have a few ideas up my sleeve for attacking this without going nuclear on the ownership aspect.
Here's the call stack from the hung plugin-hang-ui.exe:

USER32!NtUserShowWindow+0x15
USER32!DialogBox2+0x1ac
USER32!InternalDialogBox+0xe5
USER32!DialogBoxIndirectParamAorW+0x37
USER32!DialogBoxParamW+0x3f
plugin_hang_ui!mozilla::plugins::PluginHangUIChild::Show+0x21 [c:\users\aklotz\src\m-c\dom\plugins\ipc\hangui\pluginhanguichild.cpp @ 320]
plugin_hang_ui!mozilla::plugins::PluginHangUIChild::ShowAPC+0x9 [c:\users\aklotz\src\m-c\dom\plugins\ipc\hangui\pluginhanguichild.cpp @ 340]
ntdll_77400000!RtlDispatchAPC+0x4c
ntdll_77400000!KiUserApcDispatcher+0x25
KERNEL32!WaitForSingleObjectExImplementation+0x75
plugin_hang_ui!mozilla::plugins::PluginHangUIChild::WaitForDismissal+0x45 [c:\users\aklotz\src\m-c\dom\plugins\ipc\hangui\pluginhanguichild.cpp @ 361]
plugin_hang_ui!wmain+0x89 [c:\users\aklotz\src\m-c\dom\plugins\ipc\hangui\pluginhanguichild.cpp @ 396]
plugin_hang_ui!__tmainCRTStartup+0x122 [f:\dd\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 552]
KERNEL32!BaseThreadInitThunk+0xe
ntdll_77400000!__RtlUserThreadStart+0x70
ntdll_77400000!_RtlUserThreadStart+0x1b
The parameters on the stack for the ShowWindow call are:
hWnd == (Plugin Hang UI Dialog HWND)
nCmdShow == SW_SHOWNORMAL
Here's my educated guess:

DialogBox creates the window with the owner relationship. When this relationship is set up, Windows automatically attaches the input queues. Then some of the activate/showwindow calls are dispatched to the owner window before WM_INITDIALOG. Because the owner window is stuck, the modal dialog loop just sits.

Have you tried creating the window with no owner (in DialogBox) and the setting the owner in the WM_INITDIALOG handler immediately before detaching the event queues? (via SetWindowLong(handle, GWL_HWNDPARENT, foo) I don't know whether this will end up positioning/z-ordering them correctly or not...
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #6)

> Have you tried creating the window with no owner (in DialogBox) and the
> setting the owner in the WM_INITDIALOG handler immediately before detaching
> the event queues? (via SetWindowLong(handle, GWL_HWNDPARENT, foo) I don't
> know whether this will end up positioning/z-ordering them correctly or not...

I tried that and there were still hangs in the subsequent SetWindowPos call (which is necessary to notify windows about the state change).

I revisited the idea this afternoon and it looks like this works when combined with some of the other changes I was experimenting with. It looks like I've managed to quash this thing after putting together the right combination of fixes. Patch pending...
Priority: -- → P2
This is annoying: While the combined patch had shown the best results so far, I eventually hit intermittent hangs in the Plugin Hang UI after it was shown. It got stuck in a call to PeekMessage (!). I fired up LiveKd and dumped the kernel-mode call stack of that thread. It turns out that the PeekMessage implementation in win32k.sys was attempting to send a synchronous, inter-thread message to the "MSCTFIME UI" window in the Flash helper process.

I went back and reproduced the pre-show hangs and it looks like those calls are blocking on the same thing. For some reason Windows is being insistent on messaging that hidden IME window.
Attached patch Proposed patchSplinter Review
I think that for now I've exhausted my avenues when it comes to detaching input queues while at the same time maintaining an owner-owned relationship between Firefox and Plugin Hang UI windows. Too many calls result in either the Window manager or DefWindowProc trying to send a synchronous inter-thread message to an IME window owned by a hung flash thread.

Various things that I've tried (individually and together):
- Deferring setting the owner after the dialog has been shown;
- Enumerating all Windows with a child/owned relationship to Firefox and attempting to explicitly detach their input queues;
- Pumping messages in PluginHangUIParent until the Plugin Hang UI has shown;
- Temporarily severing the parent-child relationship between the Firefox window and the plugin-container window for the duration of the hang;
- Trying to prevent WM_MOUSEACTIVATE and WM_NC* messages from propagating to the owner window;
- (Not for production use) Walking the entire thread list and attempting to explicitly detach from everybody else;
- Probably some other ridiculous tactics that my mind has suppressed and therefore can no longer recall.

Instead, I propose that we create the Plugin Hang UI with a NULL owner. This patch uses a NULL owner but also sets the App User Model ID so that the dialog is properly grouped with Firefox on the Windows 7 taskbar.

Testing with try builds over the past few days has demonstrated that as long as the Firefox window stays hung, it won't be promoted above the Plugin Hang UI in the Z-order. In a multiple-window situation I have observed "other" Firefox windows jumping in front, but this behavior is really no different than what would would happen when specifying an owner.

Perhaps this problem can be reevaluated in the future if we are able to gain new insights. For the moment, I believe that a null owner is the only way that we can guarantee that the Plugin Hang UI's input queue won't become entangled with the others.
Attachment #713586 - Flags: review?(netzen)
Comment on attachment 713586 [details] [diff] [review]
Proposed patch

Review of attachment 713586 [details] [diff] [review]:
-----------------------------------------------------------------

I'm fine with this but you may want to pass it by a plugin peer.
Attachment #713586 - Flags: review?(netzen) → review+
Attachment #713586 - Flags: review+ → review?(benjamin)
Comment on attachment 713586 [details] [diff] [review]
Proposed patch

Should we remove the code which explicitly separates the input queues, since they should no longer be automatically attached by the owner relationship?
Attachment #713586 - Flags: review?(benjamin) → review+
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #11)
> Comment on attachment 713586 [details] [diff] [review]
> Proposed patch
> 
> Should we remove the code which explicitly separates the input queues, since
> they should no longer be automatically attached by the owner relationship?

I removed that code from the WM_INITDIALOG handler as part of the patch.
Comment on attachment 713586 [details] [diff] [review]
Proposed patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 805591
User impact if declined: Plugin Hang UI will itself intermittently hang
Testing completed (on m-c, etc.): Builds have been tested both by myself and by QA on a variety of Windows platforms
Risk to taking this patch (and alternatives if risky): Low, this patch is pretty simple. An alternative is to disable the Plugin Hang UI via pref change for this version of Firefox.
String or UUID changes made by this patch: None
Attachment #713586 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/960f1346d63d
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Comment on attachment 713586 [details] [diff] [review]
Proposed patch

Low risk for a new feature in Fx20.Approving on aurora.
Attachment #713586 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Aaron: This patch does not apply cleanly because Aurora doesn't have bug 833560. Should we uplift that bug as well?
The Plugin Hang UI is shown now, on a Windows 7 64-bit machine, with all these builds:

- latest Nightly (build ID: 20130307030926)
- latest Aurora (build ID: 20130307042016)
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451)

I'll continue testing on other Windows OSs, and come back with the results here.
Here are the results of my investigations:

1) Windows XP 32-bit: - the Plugin Hang UI is shown for builds:

- Nightly (build ID: 20130307030926)
- Aurora (build ID: 20130307042016)
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451)


2) Windows 8 32-bit: 

- Nightly (build ID: 20130307030926) - the Plugin Hang UI isn't shown
- Aurora (build ID: 20130307042016) - the Plugin Hang UI isn't shown
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451) - the Plugin Hang UI is shown

3) Windows 8 64-bit: - the Plugin Hang UI is shown for builds:

- latest Nightly (build ID: 20130311030946)
- latest Aurora (build ID: 20130310042012)
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451)

4) Windows Vista 32-bit: - the Plugin Hang UI is shown for builds:

- latest Nightly (build ID: 20130311030946)
- latest Aurora (build ID: 20130311042011)
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451)

5) Windows 7 32-bit: - the Plugin Hang UI is shown for builds:

- latest Nightly (build ID: 20130311030946)
- latest Aurora (build ID: 20130311042011)
- latest Beta, Firefox 20 beta 4 (build ID: 20130307075451)
Marking this verified for Firefox 20 and 21, considering comment 20, comment 21 and that this is an intermittent issue.
Status: RESOLVED → VERIFIED
Depends on: 858800
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: