aklotz recently pointed me at a talk by Raymond Chen where he talks about attached input queues. In particular, Windows *automatically* attaches input queues when a window from one thread is parented to a window from another thread. This means that if the plugin-container hangs, the Firefox window will be unable to process input and will hang, and this can happen in a way that is not caught by the existing plugin hang detector.
What I'm going to try is explicitly detaching the input queues when we set the parent relationship for windowed plugins.
Created attachment 688799 [details] [diff] [review]
Detach the input queues, rev. 1
This patch does *something* but causes at least the following regression: when I try to go fullscreen in Flash (youtube), the fullscreen window opens behind the Firefox window.
Aaron, do you have suggestions on what I could try next?
Comment on attachment 688799 [details] [diff] [review]
Detach the input queues, rev. 1
The patch looks good, notwithstanding the side effects.
One of the things that we know about Windows' rationale for the synchronization is that it's used to ensure that things like focus changes are coordinated between two threads.
This is just a shot in the dark, but perhaps the lack of synchronization is interfering with the ability of Firefox to relinquish its foreground state. I'm thinking that we could try to explicitly coordinate the change of foreground. What if you added a call within Firefox to AllowSetForegroundWindow(), passing the PID of the plugin container?
Sorry for the upcoming wall of text. I've been playing around with Spy++ and a debugger. I've noticed a few interesting things:
* The GeckoFPSandboxChildWindow in the Flash helper process is a child of GeckoPluginWindow in plugin-container. Implicitly, the two threads involved have attached input queues.
* Some (but not all) of my message logs show the Flash Fullscreen window being set as active for a moment and then being deactivated again.
* The Flash Fullscreen window does not set a parent or owner, but it is created by the same thread as GeckoFPSandboxChildWindow.
* From what I saw in the debugger, it looks like Flash uses SetWindowPos() with HWND_TOP to place itself at the top of the Z-order.
* The window manager is deciding not to allow HWND_TOP. When the call passes HWND_TOP as hWndInsertAfter, the corresponding WM_WINDOWPOSCHANGING shows hWndInsertAfter being set to the main Firefox HWND.
* SetWindowPos() may set the active window implicitly (and this was confirmed by Spy++).
I also found this:
"SetActiveWindow brings the active window into the foreground only if the thread is the foreground thread... Applications can call AttachThreadInput to allow a set of threads to share the same input state. By sharing input state, the threads share their concept of the active window. By doing this, one thread can always activate another thread's window. This function is also useful for sharing focus state, mouse capture state, keyboard state, and window z-order state among windows created by different threads whose input state is shared."
My theory is that, without the patch, the Firefox main thread, the plugin-container main thread, and the Flash helper main thread are all attached to one another. Furthermore, since the Firefox main thread is foreground, all three threads are sharing those rights and privileges. In this case, any of those threads can call SetActiveWindow on a window created by any of those other threads successfully.
Once we detach Firefox from the other two, the plugin and its helper no longer have foreground privileges. They can activate the Flash fullscreen window but it won't successfully become foreground. Without all 3 threads being attached, I think that SetWindowPos() is insufficient to bring the Fullscreen window to the foreground.
Created attachment 735006 [details]
Potential input queue synchronization hangs
These are main-thread stacks from chrome-hangs lasting at least 5 seconds.
Aaron, could you confirm all 5 of these are caused by input queue synchronization?
I think that the first, fifth, sixth and seventh stacks are not applicable to this bug. I should note however that they are all related to shell functions or common dialogs.
Based on some testing I see three main problems with the use of this api -
1) focus / flicker problems caused by conflicting foreground status between the browser window and the plugin window
* I’m not sure if this is something we can mitigate in code, some heavy debugging of focus events down in widget might shed some light on this.
2) random hangs in modal loops associated with window actions like dragging or resize
* I’m seeing random lockups when manipulating windows or when the plugin window gets clipped by other desktop windows.
* Unfortunately attaching a utility like procdump prevents me from reproducing to get stacks.
* It appears the desktop switches to a synchronous event model for drag/resize loops. This means the desktop gets caught up in these lockups.
* Exiting these lockups requires switching desktop focus using alt-tab. Everything frees up after that.
3) Plugin focus is broken for secondary windows like fullscreen
* we can hand ‘foreground taking rights’ off to flash if we can find the right process handle, but we can’t revoke it.
We could file a support request on this seeking advice if people think it would help. Aaron I can hand that request off to you to manage.
(In reply to Jim Mathies [:jimm] from comment #7)
> We could file a support request on this seeking advice if people think it
> would help. Aaron I can hand that request off to you to manage.
Before doing that I'm going to start by seeing how changes in our code affect the attachments by using the iqvis utility that I wrote last week. See http://dblohm7.ca/blog/2015/03/03/attached-input-queues-on-firefox-for-windows/ for details.