Last Comment Bug 685457 - [UI][regression] search engine in address bar popup menu drawn twice
: [UI][regression] search engine in address bar popup menu drawn twice
Status: RESOLVED FIXED
: regression
Product: SeaMonkey
Classification: Client Software
Component: Location Bar (show other bugs)
: SeaMonkey 2.3 Branch
: x86 Linux
: -- normal (vote)
: ---
Assigned To: Nobody; OK to take it and work on it
:
:
Mentors:
: 675419 (view as bug list)
Depends on: 626963
Blocks: 547342
  Show dependency treegraph
 
Reported: 2011-09-08 00:59 PDT by Sergey
Modified: 2011-09-24 14:39 PDT (History)
8 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
seabug.gif (28.36 KB, image/gif)
2011-09-08 00:59 PDT, Sergey
no flags Details
Possible branch workaround (1.06 KB, patch)
2011-09-11 15:56 PDT, neil@parkwaycc.co.uk
no flags Details | Diff | Splinter Review
Alternative branch workaround (626 bytes, patch)
2011-09-11 15:58 PDT, neil@parkwaycc.co.uk
iann_bugzilla: review+
iann_bugzilla: approval‑comm‑aurora+
iann_bugzilla: approval‑comm‑beta+
Details | Diff | Splinter Review
NSPR log (14.84 KB, text/plain)
2011-09-12 15:52 PDT, neil@parkwaycc.co.uk
no flags Details
Problematic stack trace (7.19 KB, text/plain)
2011-09-13 11:33 PDT, neil@parkwaycc.co.uk
no flags Details
Trunk workaround (1.47 KB, patch)
2011-09-22 12:58 PDT, neil@parkwaycc.co.uk
mozilla: review+
Details | Diff | Splinter Review

Description Sergey 2011-09-08 00:59:59 PDT
Created attachment 559071 [details]
seabug.gif

User Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:6.0.2) Gecko/20110902 Firefox/6.0.2 SeaMonkey/2.3.3
Build ID: 20110902182936

Steps to reproduce:

1. Download latest seamonkey binary build for Linux from seamonkey-project.org
2. Start it
3. Type a few characters in location bar


Actual results:

Search engine was listed twice with a huge empty space when no results left (see attachment)


Expected results:

No empty space, just single search engine

Additional information:
Attached screenshot is from Seamonkey 2.3.3.
The bug itself appeared somewhere between seamonkey 2.1a2 and 2.1a3 versions.
Comment 1 Edmund Wong (:ewong) 2011-09-08 01:32:15 PDT
Confirming even on trunk build.

Build identifier: Mozilla/5.0 (X11; Linux i686; rv:9.0a1) Gecko/20110903 Firefox/9.0a1 SeaMonkey/2.6a1
Comment 2 neil@parkwaycc.co.uk 2011-09-08 14:24:12 PDT
I can't really debug popups on linux, since my current VM doesn't give me a getty on the text console so that I can gdb the browser on the GUI.

I can at least reproduce the problem at will. As per the description, the problem occurs when there are no results left that match the typed string. When this happens, the autocomplete code does two things: 1) it collapses the tree of results (this is SeaMonkey, so we're not using the richlistbox) 2) it sets the nomatch attribute on the popup. The theme has a rule that shrinks the border between the tree and the search engine when the nomatch attribute is set.

Now just watching the screen I noticed one problem already - when deleting text so that matches appear again, the border width doesn't take effect immediately, instead the tree is painted with the results, then a fraction of a second later the border width changes. This is strange, because if I change the order in which the attributes are set, then the bug continues to occur.

If I temporarily remove the rule that changes the border width, then the bug does not occur. 

If I add a second rule that collapses the tree keyed from the same attribute that changes the border width, then, despite the tree already being collapsed by the XBL, the bug does not occur!

My 2.1 build is suffering from an additional behaviour - when the results reduce (but there are still matches) then the height does not change. But when there are no more results, then the search engine does pop up to the top of the panel, so the collapsed attribute is getting honoured, but the widget does not resize.
Comment 3 Boris Zbarsky [:bz] (still a bit busy) 2011-09-08 14:27:09 PDT
> If I add a second rule that collapses the tree keyed from the same attribute that changes > the border width

When you say "collapse", does that involve a reframe?

The border thing may be due to some things happening off the refresh driver while others happen sync....
Comment 4 Sergey 2011-09-09 22:23:47 PDT
Found the changeset that introduced this bug:

https://hg.mozilla.org/mozilla-central/rev/3f91606bd115

Have no idea what it does anyway.

Hope this helps.
Comment 5 neil@parkwaycc.co.uk 2011-09-10 16:53:42 PDT
I have found another workaround for the bug.

The clearResults method calls scrollToRow(0), if this is removed then the bug does not occur.
Comment 6 neil@parkwaycc.co.uk 2011-09-10 16:56:24 PDT
Ah, that would be because scrollToRow calls GetTreeBody(true)...
Comment 7 neil@parkwaycc.co.uk 2011-09-11 03:59:05 PDT
Flushing layout on all nsTreeBoxObject methods doesn't help.
Comment 8 neil@parkwaycc.co.uk 2011-09-11 09:05:09 PDT
OK, so this is what happens when there are no result left:

Without layout flushing (i.e. disabling bug 547342), the refresh driver resizes the view to its final size.

With layout flushing, getting the box object flushes layout, which shrinks the view by 1 pixel. There is subsequently an expose event which then shrinks the view to its final size.

By comparison, this is what happens when you backspace:

Without layout flushing, the refresh driver resizes the view 1 pixel too small. With layout flushing, there is an expose event that resizes the view 1 pixel too small. Either way there is then a (subsequent) expose event that enlarges the view to its final size.

(The significance of the pixel is that the border is 1px when there are no results but 2px when there are results.)
Comment 9 neil@parkwaycc.co.uk 2011-09-11 09:43:41 PDT
Oh, and the reason the CSS hack works is that we've already set the nomatch attribute on the autocomplete by the time we want to hide the results, so that the layout flush resizes the view to the correct size. (This also explains why the size is temporarily off by 1px at times: when rows are being shown they are added before the attribute is cleared, and when they are being removed the attribute is cleared before they are reamoved.) So another workaround is to set the nomatch attribute only after adjusting the tree's height. That means that the refresh driver gets to resize the view again.
Comment 10 neil@parkwaycc.co.uk 2011-09-11 15:56:09 PDT
Created attachment 559796 [details] [diff] [review]
Possible branch workaround
Comment 11 neil@parkwaycc.co.uk 2011-09-11 15:58:54 PDT
Created attachment 559797 [details] [diff] [review]
Alternative branch workaround
Comment 12 neil@parkwaycc.co.uk 2011-09-11 16:38:57 PDT
OK, so here's a curious thing: if I remove the layout flush introduced by bug 547342, and introduce a different one (e.g. by accessing treeBoxObject.height) then the bug still appears. I put a breakpoint on gtk_window_resize and it does get called twice, once with the intermediate size as discussed and once with the correct size, so somehow the data for the second call is getting lost. Is this a widget bug, or could it even be a gtk2 bug?
Comment 13 Karl Tomlinson (:karlt) 2011-09-11 18:48:47 PDT
(In reply to neil@parkwaycc.co.uk from comment #12)
> I put a breakpoint on
> gtk_window_resize and it does get called twice, once with the intermediate
> size as discussed and once with the correct size, so somehow the data for
> the second call is getting lost.

That sounds odd.

A log with NSPR_LOG_MODULES=Widget:5 in the environment may be useful.

Are we sure that the popup is the wrong size and the problem is not that the window behind the popup is not getting painted?  I guess if the window behind was not repainted, the problem would remain even after the popup closed.
Comment 14 neil@parkwaycc.co.uk 2011-09-12 00:32:05 PDT
(In reply to Karl Tomlinson from comment #13)
> Are we sure that the popup is the wrong size and the problem is not that the
> window behind the popup is not getting painted?  I guess if the window
> behind was not repainted, the problem would remain even after the popup
> closed.

I tried putting an animated image behind the popup, which I was hoping would have triggered repaints if there had been a one-off failure to repaint.
Comment 15 neil@parkwaycc.co.uk 2011-09-12 15:52:42 PDT
Created attachment 559907 [details]
NSPR log
Comment 16 neil@parkwaycc.co.uk 2011-09-12 15:55:33 PDT
Comment on attachment 559907 [details]
NSPR log

>-1276713120[b3c10060]: GrabPointer 1
>-1276713120[b3c10060]: configure event [ac724f30] 208 111 391 134
>-1276713120[b3c10060]: configure event [ac724f30] 208 111 391 134
>-1276713120[b3c10060]: OnEnterNotify: ac724f30
>-1276713120[b3c10060]: WARNING: OpenGL-accelerated layers are not supported on this system.: file /hg/comm/mozilla/widget/src/xpwidgets/nsBaseWidget.cpp, line 849
>-1276713120[b3c10060]: key_press_event_cb
>-1276713120[b3c10060]: key_release_event_cb
>-1276713120[b3c10060]: key_release_event_cb
>-1276713120[b3c10060]: nsWindow::NativeResize [ac724f30] 391 98
>-1276713120[b3c10060]: size_allocate [ac724f30] 0 0 391 98
>-1276713120[b3c10060]: configure event [ac724f30] 208 111 391 98
The first few keypresses give multiple results, requiring 98 pixels of height.

>-1276713120[b3c10060]: key_press_event_cb
>-1276713120[b3c10060]: key_release_event_cb
>-1276713120[b3c10060]: key_release_event_cb
>-1276713120[b3c10060]: key_press_event_cb
>-1276713120[b3c10060]: key_release_event_cb
>-1276713120[b3c10060]: key_release_event_cb
>-1276713120[b3c10060]: key_press_event_cb
>-1276713120[b3c10060]: key_release_event_cb
>-1276713120[b3c10060]: key_release_event_cb
>-1276713120[b3c10060]: nsWindow::NativeResize [ac724f30] 391 97
>-1276713120[b3c10060]: size_allocate [ac724f30] 0 0 391 97
>-1276713120[b3c10060]: nsWindow::NativeResize [ac724f30] 391 25
>-1276713120[b3c10060]: configure event [ac724f30] 208 111 391 97
Then subsequent keypresses remove all of the results. But the configure event shows that it's forgotten to resize all the way down to 25 pixels...

>-1276713120[b3c10060]: key_press_event_cb
>-1276713120[b3c10060]: CaptureRollupEvents ac724f30
>-1276713120[b3c10060]: ReleaseGrabs
Comment 17 Karl Tomlinson (:karlt) 2011-09-12 20:38:46 PDT
Hmm.  I can reproduce.  Under xtrace, I see the ConfigureWindow request to reduce the height from 114 to 113 pixels, but no subsequent request.
Comment 18 Karl Tomlinson (:karlt) 2011-09-12 22:19:55 PDT
It looks like GTK (or specifically gtk_window_move_resize) does not expect
resize requests from within expose signal handlers.

gtk_window_resize stores the requested height in
window->geometry_info->resize_height, and then schedules
gtk_window_move_resize to run off the event loop.

gtk_window_move_resize issues the ConfigureWindow request with
geometry_info->resize_height, and then calls gdk_window_process_updates (for
override-redirect/GTK_WINDOW_POPUP windows, at least), which dispatches expose
signals for child windows that have been invalidated.

On GTK expose signals, we dispatch NS_WILL_PAINT, NS_PAINT, and NS_DID_PAINT events.
One of these events triggers the second gtk_window_resize call.

The second gtk_window_resize stores the new height in
geometry_info->resize_height and schedules a second gtk_window_move_resize event.

Before the second gtk_window_move_resize runs, the first
gtk_window_move_resize continues.  It has completed its resize and so it
clears geometry_info->resize_height.

When the second gtk_window_move_resize runs, there is no resize scheduled in
geometry_info->resize_height.
Comment 19 neil@parkwaycc.co.uk 2011-09-13 11:33:14 PDT
Created attachment 560001 [details]
Problematic stack trace
Comment 20 Karl Tomlinson (:karlt) 2011-09-13 16:00:33 PDT
roc, does any widget code on other platforms have to work around this kind of thing?

It seems the problem only happens when an expose event causes (through WILL_PAINT) a resize of a toplevel popup window.

Some possible solutions I could imagine.

1. Delay gtk_window_resize calls when Resize is called during expose events.
   The gtk_window_resize effects the resize asynchronously anyway, so I would not
   expect the delay to be significant.  The only cost would be the
   extra complexity of managing pending resizes.

2. Block painting during gtk_window_move_resize and update after the function
   has completed.  Could be implemented using check-resize signal handlers on
   the popup mShell widgets.  Implementation would be ugly but perhaps
   simpler than option 1.

3. I thought about trying to flush layout before the gtk_window_move_resize
   call, but the NS_SIZE event is dispatched during gtk_window_move_resize, so
   I doubt this would work.
Comment 21 Robert O'Callahan (:roc) (email my personal email if necessary) 2011-09-14 01:40:54 PDT
(In reply to Karl Tomlinson (:karlt) from comment #20)
> roc, does any widget code on other platforms have to work around this kind
> of thing?

On Windows, we run WILL_PAINT outside of the BeginPaint/EndPaint pair because doing it inside that pair caused similar problems.

> 1. Delay gtk_window_resize calls when Resize is called during expose events.
>    The gtk_window_resize effects the resize asynchronously anyway, so I
> would not
>    expect the delay to be significant.  The only cost would be the
>    extra complexity of managing pending resizes.

That sounds reasonable.

> 2. Block painting during gtk_window_move_resize and update after the function
>    has completed.  Could be implemented using check-resize signal handlers on
>    the popup mShell widgets.  Implementation would be ugly but perhaps
>    simpler than option 1.

That also sounds reasonable.
Comment 22 Ian Neal 2011-09-14 12:38:09 PDT
Comment on attachment 559797 [details] [diff] [review]
Alternative branch workaround

r/a=me assuming a better core fix doesn't make it to the branches (esp beta) before release.
Comment 24 Jens Hatlak (:InvisibleSmiley) 2011-09-16 13:56:13 PDT
*** Bug 675419 has been marked as a duplicate of this bug. ***
Comment 25 neil@parkwaycc.co.uk 2011-09-22 12:58:25 PDT
Created attachment 561853 [details] [diff] [review]
Trunk workaround

This works around the problem by setting the nomatch attribute after we've finished inspecting the popup's tree's box object, and therefore we won't flush layout too early, and the refresh driver will fix things up for us. I don't think this can affect Thunderbird as it never shows the popup for zero results.
Comment 26 neil@parkwaycc.co.uk 2011-09-24 14:39:54 PDT
Pushed changeset cb79942d0feb to mozilla-central.

Note You need to log in before you can comment on or make changes to this bug.