Closed Bug 631518 Opened 13 years ago Closed 13 years ago

Attempting to select submenu items under compiz after lengthy gnome-screensaver lock closes a menu. Fixed by focusing another app

Categories

(Firefox :: Menus, defect)

x86_64
Linux
defect
Not set
normal

RESOLVED WORKSFORME

People

(Reporter: bugs, Assigned: karlt)

References

Details

(Whiteboard: [fixed in compiz 0.8.8])

Attachments

(8 files, 2 obsolete files)

I've run into this bug repeatedly on two different laptops.
Both are running Ubuntu Linux 10.10 x86_64 and Firefox 4 nightly.

One was running Hide Menubar and the other Compact Menu 2, to get the same compact
layout as on Windows / OS X, which is needed on laptops with limited vertical space.

I'm fairly certain I've reproduced this in my clean test profile too, though, with only NoScript installed (which offers a context menu sub menu).

At any rate, on both machines, Firefox would work fine for a while, then suddenly all sub menus would be unselectable.

That means if I right clicked on the context menu, and attempted to access the NoScript submenu, the context menu would immediately close on attempting to mouse over into that submenu.  Or, in Hide Menubar, if I clicked on the Minefield button and hovered over Help or Preferences to get those sub menus, the Minefield button menu would close.  Same thing with Compact Menu.

Sub menus aren't too common in Firefox without addons which may be why I've never had any luck in finding other reported bugs on this.

Anyway. Restarting Firefox fixes things for a while, but it is annoying.
Does this happen after you suspend the system or after the screen saver kicks in (with Firefox having focus before the event)? Does switching to another virtual monitor and back fix the problem?

If so, this may be bug 507269 / bug 616833. It will annoy Linux users (especially before they learn the trick of switching to another virtual window and back), but blocking was rejected as there isn't a known regression window.
Heh. No known regression window and linux being a minority OS ;)
'sok.  I wouldn't want this to block anyway.

As for screensaver, tried activating it, didn't seem to trigger the bug.

The problem with a regression window is that it takes quite a while for the bug to manifest itself, which would make finding the culprit a multi-month endeavor, along with the risk of picking the wrong good/bad revisions just 'cause the bug isn't manifesting itself at the time.
This issue also prevents the awesomebar from working at all. I am going to test without Compiz turned on today.
Behaviour has now changed. Any menu now closes instantly on trying to use it. It doesn't have to be a sub menu.
This could mean it is a different bug.

Anyway. Does seem it might be related to screen saver lock (although could be just idling, and I'd unlocked my screensaver a good half-dozen times today before noticing it).

But.  I tried switching to a virtual term and back, a couple of times, and it did not help.

Restarting compiz (compiz --replace&) *did* fix it.

I'm experiencing this in nightlies on both my ubuntu machines - ubuntu 10.10 and ubuntu 11.04
Ah. Interestingly, it seems to be happening repeatedly now.
I haven't restarted firefox, but every time the screen locks (I have to leave it for a short time, it isn't instantaneous), menus stop working.

compiz --replace& fixes until the screen locks again.

Since it is semi-reproducible, and I can do my own builds w/o too much trouble, lemme know if you want me to slap in debug code or whatnot.
As I mentioned earlier, take a look at bug 507269 (with a link to an Ubuntu bug) and bug 616833. This sounds like a duplicate of those.

If you can solve the issue many of us would be very happy, but there are some suggestions that it may be a Ubuntu/X/Gnome bug as other applications have reported similar symptoms. (Even if the bug is outside of Firefox, it would be nice to have Firefox detect the condition and fix it.)
Compiz does seem important. If I do metacity --replace&
I don't seem to be able to reproduce.
Here's another odd thing.
I had the bright(?) idea to try and see if the menu was slightly offset - similar to the CSS bugs that keep you from getting onto a dropdown menu on websites if there is no CSS transition delaying the menu disappearance.
So I took some screenshots of a working menu, then waited for the bug to reappear.

Well, bug reappears, and I go to gimp to take a screenshot.
I set screenshot on 5 second delay, go back to firefox... and the menu is working again.

Something about using the menus in gimp, or starting screenshot process, made the menu work.  I have no idea why.

Going to try to narrow down circumstances.
This isn't necessarily tied to the screen saver; I can get it to happen just by surfing and switching between applications.

The problem is a focus issue. When you switched to another application and clicked on some UI element that took focus you fixed the bug. Often when I notice this problem I can move my cursor up to some shortcuts I have on a GNOME Panel and see that the focus ring moves to them.
Ah. Yep. You're totally right.
Well, that makes working around it easier at least.

Well, it does seem linked to compiz to me, so I'm going to try poking at various compiz settings.
And, yeah, completely unrelated to mouse position, since you can move the mouse anywhere around the screen after activating the menu, it just closes the instant you try to move the mouse onto it.
Another clue! This is not entirely mouse-based.

If I right click on desktop to pull up a context menu, then use arrow keys to go through it, then try to hit right arrow to access noscript's submenu, the main context menu closes immediately.  No mouse interaction.

There is no such problem once I temporarily "fix" the issue, as suggested, just by switching to any other app for a second.
Trying to figure this out in the context that it works in metacity, but not in compiz.
Compiz guys had some theories...
10:48 < maniac103> nemo: not exactly much ... it just tells is it's not EnterNotify/LeaveNotify/MotionNotify which is the culprit, but more likely FocusIn/FocusOut
10:51 < maniac103> likely they get FocusOut for the popup and are closing it because of that

maniac103 did say he didn't see anything obviously wrong in nsWindow.cpp though...

http://m8y.org/tmp/bug631518/ has the results of running xev against the firefox root window. I'm not attaching those 3 files to the bug, to avoid adding 3 more e-mail notifications for possibly unuseful files.

For each of those files the procedure was the same.  I launched xev, then locked the laptop, unlocked it a few minutes later, clicked on the Minefield button, attempted to enter the menu, then right clicked on a website to open the context menu, and used the arrow key to scroll down to the NoScript entry then hit left arrow.  I then halted logging.

In the broken entry, trying to move mouse into the main menu, and trying to use right on the keyboard to enter the noscript submenu of the context menu, immediately closed the menus.

In the working compiz and metacity, I successfully entered those menus, then closed them.

One thing I noticed different in the logs, was that in the broken entry, the keypress events of down arrow are visible in the logging of the root window.

In the non-broken entries, they are not.  I verified this by rerunning xev live.  Those keystrokes are definitely not visible when things are working properly.
On a hunch I tried 54308 thinking it might have been 54309-54314 that broke it.  No joy, although 54308 does have the old behaviour originally described in the bug, in that right clicking, you can access the main context menu, it is only trying to access sub menus (like noscript) that breaks.

I think I'm just going to have to bisect this.  At least it is fairly reproducible, even if testing a build does take a while.
OS: Linux → Windows Server 2003
OS: Windows Server 2003 → Linux
(In reply to comment #13)
> One thing I noticed different in the logs, was that in the broken entry, the
> keypress events of down arrow are visible in the logging of the root window.
> 
> In the non-broken entries, they are not.  I verified this by rerunning xev
> live.  Those keystrokes are definitely not visible when things are working
> properly.

That's great info, thanks.  When the window manager offers focus to the toplevel window (with WM_TAKE_FOCUS), GDK moves focus to a child window (the sole purpose of which is to receive keyboard events).  This should mean that no keypress events are visible on the toplevel window.  The keypress events you see indicate that this process has not worked as expected.

Can you install the xtrace package and run firefox under that please?  It will produce a (big) log of all X events and requests.
(In reply to comment #13)
> 10:51 < maniac103> likely they get FocusOut for the popup and are closing it
> because of that

Does it look like the main firefox window loses toplevel focus/activation at any stage?  (i.e. does the border stay highlighted?)
(In reply to comment #16)
> (In reply to comment #13)
> > 10:51 < maniac103> likely they get FocusOut for the popup and are closing it
> > because of that
> 
> Does it look like the main firefox window loses toplevel focus/activation at
> any stage?  (i.e. does the border stay highlighted?)

The window always remains focused when I've seen this issue. However, the focus ring will move to components in other windows. So if my cursor is in the URL bar I'll get a blue outline effect, then if I move so that my cursor is over Thunderbird I'll see the selected message there get the dotted border indicating it has keyboard focus. The title bar for Firefox will remain blue when that happens.
As requested in comment #15, http://m8y.org/tmp/bug631518/xtrace.txt has the results
Launched against minefield nightly piped to tee xtrace.txt (forgot to include stderr, hopefully wasn't necessary).
In the new browser window I clicked on the Minefield button once, verified I could enter the menu, then locked the screen and went away for a few minutes, since in the past I learned the bug would not reproduce if I unlocked too quickly.

When I unlocked, I verified the Minefield button menu was no longer accessible, and similarly, a right click on the page to get a context menu.

I then closed the browser.

BTW, sorry for delay in responding.  I thought the bug was not reproducing, then I remembered I had switched to metacity *due* to the bug.

Also, WRT my intent to bisect, I discovered builds from last summer no longer build in natty, so I couldn't really bisect. The one exception is the different behaviour I noted last fall (it worked on the context menu, but not on sub menus of the context menu); I could bisect to find when that changed, if you think it would be useful.

I'm going to setup a build env on my machine that is still running Maverick - hopefully I can bisect more successfully there.
Summary: Attempting to select submenu items closes a menu. Fixed briefly by restart of firefox → Attempting to select submenu items closes a menu. Fixed briefly by using menu in another app
I've finally been able to consistently reproduce this like nemo, so I've done some bisecting:

A build from 20080501 shows the broken awesome bar behavior from Bug #507269 but the menus work fine.

The submenu bug seems to have started back in September. In the 20100916 build submenus would work even though the awesome bar doesn't. In the 20100917 build, hovering over a submenu causes the menu to dismiss.

So pushlog:

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f38ef1080bfe&tochange=268ef4ccb5ff
Blocks: 522956
Depends on: 507269, 616833
Summary: Attempting to select submenu items closes a menu. Fixed briefly by using menu in another app → Attempting to select submenu items under compiz closes a menu. Fixed briefly by using menu in another app
Blocks: 638386
What I see in the log is that, after the successful menu function, something (or things, probably the screensaver) grabs and releases the keyboard twice.  It looks like these grab the keyboard to the root window.  The keyboard is then grabbed (probably to the root window) a third time.

While the keyboard is grabbed, focus moves to a non-root window (perhaps the password dialog).

Focus later ends up on the root window, which means that keyboard events are sent to the window under the pointer (because all windows are inferiors of the root window).

When the pointer moves into a normal GDK window (main Firefox window), that window has "pointer focus".  When the pointer moves into the menu window, even though that window usually does not accept focus, the main Firefox window loses pointer focus.  At that point GDK tells Gecko that the main window has lost focus, so Gecko closes the menu.

This root window focus (and therefore pointer focus) is not a situation that should be happening with a window manager running.  And as indicated in comment 17, compiz still thinks that the Firefox main window has focus.  There must be some interaction between compiz and gnome-screensaver that leads to this situation.
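To make the sequence above concrete, here is a toy model of keyboard delivery under PointerRoot focus. It is only a sketch: the function names and window labels are made up for illustration, and it does not call into Xlib, GDK, or Gecko. It assumes the menu is closed as soon as the main window stops being the keyboard target.

```python
# Toy model (not real Xlib/GDK code) of keyboard delivery under PointerRoot focus.
POINTER_ROOT = "PointerRoot"  # stand-in for focus having fallen back to the root window


def keyboard_target(focus, window_under_pointer):
    """Return the window that would receive key events."""
    if focus == POINTER_ROOT:
        # With root-window focus, key events go to whatever is under the pointer.
        return window_under_pointer
    return focus


def menu_stays_open(focus, pointer_path, main="firefox-main"):
    """Walk the pointer through windows; the menu closes when main loses focus."""
    had_focus = False
    for window in pointer_path:
        has_focus = keyboard_target(focus, window) == main
        if had_focus and not has_focus:
            return False  # main window lost (pointer) focus, so the menu is dismissed
        had_focus = has_focus
    return True
```

With explicit focus on the main window, moving from `"firefox-main"` to `"firefox-menu"` keeps the menu open; with `POINTER_ROOT` the same pointer path closes it, which matches the symptom described in this bug.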

There is some scary looking code in gs-grab-x11.c.
It would be helpful to have gnome-screensaver running with the --debug parameter.  It looks like messages are output to $TMPDIR/gnome_screensaver_debug_XXXXXX.

(When I tried compiz under tigervnc I got only a white display, so I haven't reproduced yet.  I hope to try some other things, but if someone can attach the screensaver debug log, that would be very helpful.)
Managed to reproduce.
Don't see anything too disturbing in the gnome-screensaver log.
Assignee: nobody → karlt
Status: NEW → ASSIGNED
Summary: Attempting to select submenu items under compiz closes a menu. Fixed briefly by using menu in another app → Attempting to select submenu items under compiz after lengthy gnome-screensaver lock closes a menu. Fixed by focusing another app
For what it's worth, I can see the same focus issue with XChat. So this might be more of a compiz bug than a Firefox bug.
https://bugs.launchpad.net/ubuntu/+source/gnome-screensaver/+bug/450021

Related?

If so, goes back a ways.  Reported 2009-10-13
Wonder if a compiz bisect is feasible.
(In reply to comment #20)
> When the pointer moves into a normal GDK window (main Firefox window), that
> window has "pointer focus".  When the pointer moves into the menu window, even
> though that window usually does not accept focus, the main Firefox window loses
> pointer focus.  At that point GDK tells Gecko that the main window has lost
> focus, so Gecko closes the menu.

Maybe I'm not getting things correctly here, but Gecko surely shouldn't close the popup if the popup itself got focus? I have seen KDE doing that (http://quickgit.kde.org/?p=kde-workspace.git&a=commit&h=e28f4994a06c86d052fd054e617f0de4eb3bb9bb), and I don't think that's correct behaviour.

> This root window focus (and therefore pointer focus) is not a situation that
> should be happening with a window manager running.  And as indicated in comment
> 17, compiz still thinks that the Firefox main window has focus.  There must be
> some interaction between compiz and gnome-screensaver that leads to this
> situation.

Can you please elaborate why this situation shouldn't be possible with a WM running? Unlike window mapping, the WM isn't asked when an app calls XSetInputFocus...
Well. There appears to be a workaround.
Enabling Focus Follows Mouse appears to unbreak the buggy behaviour (if you don't mind focus follows mouse, that is).

ccsm->General Options->Focus & Raise Behaviour->Click To Focus [uncheck]

After that, no more problems with menus.  I would bring firefox to foreground, lock for screensaver, wait a while, unlock.  The firefox window would be right under the mouse, went to menu, and it worked fine.

Also, the launchpad bug I linked went away too (not surprising I suppose since moving mouse would actually focus the window).

With click to focus, moving mouse over another window did not focus it, but keystrokes were sent to it.  If the window was a gnome terminal, the terminal cursor changed from an empty square to a filled one.

My ATI machine had Focus follows mouse enabled, which was why I wasn't reproducing it there.
Here are patches for compiz 0.8 and 0.9:
http://fpaste.org/yKdN/
http://sprunge.us/fECK
(In reply to comment #24)
> Maybe I'm not getting things correctly here, but Gecko surely shouldn't close
> the popup if the popup itself got focus?

I'm open for discussion if there's reason to support PointerRoot focus, but a window manager won't set focus to an override-redirect window.

> > This root window focus (and therefore pointer focus) is not a situation that
> > should be happening with a window manager running.  And as indicated in comment
> > 17, compiz still thinks that the Firefox main window has focus.  There must be
> > some interaction between compiz and gnome-screensaver that leads to this
> > situation.
> 
> Can you please elaborate why this situation shouldn't be possible with a WM
> running? Unlike window mapping, the WM isn't asked when an app calls
> XSetInputFocus...

I expected the window manager to maintain either focus on a top-level window or no focus.  Falling into PointerRoot leads to an unhelpful focus mode.
(In reply to comment #24)
> Unlike window mapping, the WM isn't asked when an app calls
> XSetInputFocus...

An app should not call XSetInputFocus except in the situations described in
http://tronche.com/gui/x/icccm/sec-4.html#s-4.2.7
and then the app only can XSetInputFocus to one of its windows (not the root window).

Normally apps use a _NET_ACTIVE_WINDOW client message to request focus from the window manager.
Attached file focus activity recording program (obsolete)
This uses the RECORD extension to track all GrabKeyboard/UngrabKeyboard and SendEvent requests on a display.  Pass a client resource id base to track focus change events seen by that client.
Attached file success log
Log of gnome-screensaver lock and unlock with focus still behaving as expected.

0x00000161 is the root window

client resource id bases:
0x02000000 is compiz
0x01400000 is gnome-screensaver
0x01e00000 is Firefox (the client that has focus prior to locking the screen)
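For reading the logs, a small helper can map a window id back to the client that owns it. This is a sketch only: the resource id mask of 0x001fffff is an assumption (it is consistent with the three bases listed above, but the real value is reported by the X server at connection setup), and `owner_of` is a made-up name.

```python
# Assumption: the server hands out 21 bits of resource id per client.
RESOURCE_ID_MASK = 0x001FFFFF

ROOT_WINDOW = 0x00000161

# Client resource id bases taken from the list above.
CLIENT_BASES = {
    0x02000000: "compiz",
    0x01400000: "gnome-screensaver",
    0x01E00000: "Firefox",
}


def owner_of(xid):
    """Name the client whose resource id range contains xid."""
    if xid == ROOT_WINDOW:
        return "root window"
    return CLIENT_BASES.get(xid & ~RESOURCE_ID_MASK, "unknown client")
```

For example, `owner_of(0x0140044c)` gives `"gnome-screensaver"` and `owner_of(0x01e0008b)` gives `"Firefox"`.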
Attached file focus activity recording program v1.1 (obsolete)
Bug in first version meant SetInputFocus was not recorded.
Attachment #519824 - Attachment is obsolete: true
This records SetInputFocus, etc. even on the client tracked for focus events.
Attached file fail log
Log of gnome-screensaver lock and unlock with focus reverting to the root window.

The difference in behaviour is triggered by gnome-screensaver requesting focus (with _NET_ACTIVE_WINDOW) while it has grabbed the keyboard.  compiz honours the request and transfers focus from Firefox to the gnome-screensaver window:

  0x01400000 SendEvent 33 to 0x00000161 propagate=0 mask=0x00180000
    _NET_ACTIVE_WINDOW window=0x0140044c format=32 0x00000001 0x00000000
  0x02000000 SetInputFocus 0x0140044c revert-to=PointerRoot time=0x00000000
  0x02000000 FocusOut 0x01e0008b NonlinearVirtual WhileGrabbed
  0x02000000 FocusIn 0x0140044c Nonlinear WhileGrabbed
  0x02000000 SendEvent 33 to 0x0140044c propagate=0 mask=0x00000000
    WM_PROTOCOLS protocol=WM_TAKE_FOCUS time=0x155aad4e

gnome-screensaver (appropriately) transfers focus to a child window:

  0x01400000 SetInputFocus 0x0140044d revert-to=Parent time=0x155aad4e
  0x02000000 FocusOut 0x0140044c Inferior WhileGrabbed

gnome-screensaver ungrabs the keyboard:

 0x01400000 UngrabKeyboard time=0x00000000
 0x02000000 FocusOut 0x0140044c Inferior Ungrab

Then I assume the window is closed, and focus reverts to the (ancestor) root window:

 0x02000000 FocusOut 0x0140044c Virtual Normal
 0x02000000 FocusIn 0x00000161 Inferior Normal

compiz still thinks the Firefox window has focus.
I wonder whether that is because the WhileGrabbed focus events were ignored.
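A quick way to spot this failure in a recording is to check where the last FocusIn lands. The sketch below assumes the whitespace-separated line layout of the excerpts above (client, event name, window, detail, mode); the helper names are hypothetical, and the embedded sample is just the focus lines quoted in this comment.

```python
ROOT_WINDOW = 0x00000161

FAIL_LOG = """\
0x02000000 SetInputFocus 0x0140044c revert-to=PointerRoot time=0x00000000
0x02000000 FocusOut 0x01e0008b NonlinearVirtual WhileGrabbed
0x02000000 FocusIn 0x0140044c Nonlinear WhileGrabbed
0x01400000 SetInputFocus 0x0140044d revert-to=Parent time=0x155aad4e
0x02000000 FocusOut 0x0140044c Inferior WhileGrabbed
0x01400000 UngrabKeyboard time=0x00000000
0x02000000 FocusOut 0x0140044c Inferior Ungrab
0x02000000 FocusOut 0x0140044c Virtual Normal
0x02000000 FocusIn 0x00000161 Inferior Normal
"""


def focus_in_events(text):
    """Return (window, detail, mode) for every FocusIn line in the log text."""
    events = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[1] == "FocusIn":
            events.append((int(fields[2], 16), fields[3], fields[4]))
    return events


def ends_on_root(text, root=ROOT_WINDOW):
    """True if the last FocusIn in the log went to the root window."""
    events = focus_in_events(text)
    return bool(events) and events[-1][0] == root
```

Here `ends_on_root(FAIL_LOG)` is true, because the final FocusIn is for 0x00000161.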
Attachment #519830 - Attachment is obsolete: true
(In reply to comment #34)
> Then I assume the window is closed, and focus reverts to the (ancestor) root
> window:
> 
>  0x02000000 FocusOut 0x0140044c Virtual Normal
>  0x02000000 FocusIn 0x00000161 Inferior Normal

When the gnome-screensaver window is closed and compiz gets the UnmapNotify
event, moveInputFocusToOtherWindow() is called.  d->nextActiveWindow is the
gnome-screensaver window, but d->activeWindow is the Firefox window (that was
active prior to screen lock).  focusDefaultWindow() picks the Firefox window
for focus but thinks that it is already focused.
Apparently d->activeWindow is still the Firefox window because, when the FocusIn event was received for the gnome-screensaver window, findTopLevelWindowAtDisplay() did not find it (w=0).

The gnome-screensaver window has override-redirect set.

I don't know that it's appropriate for gnome-screensaver to use
_NET_ACTIVE_WINDOW for an override-redirect window.
When compiz received _NET_ACTIVE_WINDOW, the ClientMessage handler found a CompWindow for the gnome-screensaver window with findWindowAtDisplay().

For InputOutput windows, the distinction between findTopLevelWindowAtDisplay() and findWindowAtDisplay() is that findTopLevelWindowAtDisplay() ignores override-redirect windows.

So the question is:

Should the _NET_ACTIVE_WINDOW handler use findTopLevelWindowAtDisplay() or should the FocusIn handler use findWindowAtDisplay()?
I'm thinking it could be better if the FocusIn handler ensured to unset d->activeWindow if it isn't interested in the new focus window.
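The mismatch between the two lookups can be sketched as a toy state machine. This is not compiz code: `ToyWM` and its methods are invented for illustration, and the real handlers do much more. It only models the two pieces of state discussed here (what the WM believes is active, and where X focus actually is), plus a flag for the first option in the question above (the _NET_ACTIVE_WINDOW handler using the top-level lookup).

```python
POINTER_ROOT = "PointerRoot"


class ToyWM:
    """Simplified stand-in for the focus bookkeeping described above."""

    def __init__(self, override_redirect, active, use_toplevel_lookup=False):
        self.override_redirect = dict(override_redirect)  # window -> bool
        self.active_window = active  # what the WM believes is focused
        self.x_focus = active        # where the X server actually has focus
        self.use_toplevel_lookup = use_toplevel_lookup

    def find_window(self, win):
        return win if win in self.override_redirect else None

    def find_toplevel_window(self, win):
        # Like find_window, but ignores override-redirect windows.
        if win in self.override_redirect and not self.override_redirect[win]:
            return win
        return None

    def net_active_window(self, win):
        lookup = self.find_toplevel_window if self.use_toplevel_lookup else self.find_window
        if lookup(win) is None:
            return  # request ignored
        self.x_focus = win  # models SetInputFocus with revert-to=PointerRoot
        self.focus_in(win)

    def focus_in(self, win):
        if self.find_toplevel_window(win) is not None:
            self.active_window = win  # otherwise the old value goes stale

    def unmap(self, win, default_window):
        del self.override_redirect[win]
        if self.x_focus == win:
            self.x_focus = POINTER_ROOT  # server reverts focus
        if self.active_window != default_window:
            self.x_focus = default_window
            self.active_window = default_window
        # else: the WM thinks default_window is already focused and does nothing


def lock_and_unlock(use_toplevel_lookup):
    wm = ToyWM({"firefox": False, "screensaver": True}, "firefox", use_toplevel_lookup)
    wm.net_active_window("screensaver")
    wm.unmap("screensaver", default_window="firefox")
    return wm.x_focus, wm.active_window
```

In this model, `lock_and_unlock(False)` ends with X focus on `PointerRoot` while the WM still believes Firefox is active, and `lock_and_unlock(True)` leaves focus on Firefox throughout.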
Karl,

thanks for that in-depth investigation. Now that I get the root cause on the silver tablet ;-), making patches was easy.

Can you please try the attached patches?
- patch 1 makes sure to move the input focus away from the root window if it ever happens to get focus
- patch 2 makes the _NET_ACTIVE_WINDOW client message handler use findTopLevelWindowAtDisplay instead of findWindowAtDisplay. As said in the commit comment: Most other WMs don't track override_redirect windows at all, so if they rely on the WM transferring focus to a window it doesn't handle, they're clearly broken.
- patch 3+4 unset activeWindow when some unmanaged window gets focus, so focusDefaultWindow can do the right thing

Can you please give them a try?
Attached patch Fix for 0.9
This was enough to fix the issue here using 0.9.
No longer depends on: 616833
(In reply to comment #42)
> - patch 2 makes the _NET_ACTIVE_WINDOW client message handler use
> findTopLevelWindowAtDisplay instead of findWindowAtDisplay. As said in the
> commit comment: Most other WMs don't track override_redirect windows at all, so
> if they rely on the WM transferring focus to a window it doesn't handle,
> they're clearly broken.

Yes, I agree.  This patch on its own resolves the issue and this is the patch I would recommend to distributions that want a patch to backport.
(In reply to comment #38)
> I'm thinking it could be better if the FocusIn handler ensured to unset
> d->activeWindow if it isn't interested in the new focus window.

(In reply to comment #42)
> - patch 1 makes sure to move the input focus away from the root window if it
> ever happens to get focus

> - patch 3+4 unset activeWindow when some unmanaged window gets focus, so
> focusDefaultWindow can do the right thing

This may be more complicated than I first thought because UnmapNotify is received before the corresponding FocusIn event.  For regular managed windows UnmapNotify does the right thing, but on FocusIn for the root window it would then recalculate (perhaps with less information).

I tried this patch series and saw a behaviour change.  I think you need Firefox 4 to reproduce:

1) Clear the url bar.
2) Type a letter.
   Assuming you have some history, a completion dropdown will show.
3) press Alt-TAB.

Expected behaviour is that the dropdown closes and the window list is shown immediately.  That is still the behavior with patch 2.

With the full patch series, it is necessary to press Alt-TAB twice to get the window list.
(In reply to comment #47)
> I tried this patch series and saw a behaviour change.  I think you need Firefox
> 4 to reproduce:
> 
> 1) Clear the url bar.
> 2) Type a letter.
>    Assuming you have some history, a completion dropdown will show.
> 3) press Alt-TAB.
> 
> Expected behaviour is that the dropdown closes and the window list is shown
> immediately.  That is still the behavior with patch 2.
> 
> With the full patch series, it is necessary to press Alt-TAB twice to get the
> window list.

Yes, this happens because the dropdown window still has a grab active at the time Alt+Tab is pressed and thus the grab attempt done by the switcher plugin fails. With Firefox 3, the dropdown doesn't even close. If you have e.g. a Konsole menu open, you can cycle through the menu items with each tab press.

I don't think the current behaviour is any better or worse than the old behaviour, as the switcher behaviour is inconsistent anyway with open popups. I just checked Kwin: For the Konsole and Firefox 3 cases, it behaves exactly as compiz does. For the Firefox 4 case, it shows the old compiz behaviour.
The behaviour change is caused by patches 1, 3 and 4, BTW. It happens as soon as either 1 or 3+4 are applied.
(In reply to comment #48)
> Yes, this happens because the dropdown window still has a grab active at the
> time Alt+Tab is pressed and thus the grab attempt done by the switcher plugin
> fails.

Firefox 4 has a pointer grab, but not a keyboard grab.
If the switcher needs a pointer grab, I wonder why it succeeded pre-patch-1-or-3+4.

> With Firefox 3, the dropdown doesn't even close.

Yes, Firefox 3 had a keyboard grab.  Having to click to close was considered bad behaviour particularly for autocomplete where the dropdown is not manually opened.
Thanks Danny.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME
Whiteboard: [fixed in compiz 0.8.8]