[FIX]combobox display areas are very slow to update

RESOLVED FIXED

Status

()

Core
Layout: Form Controls
RESOLVED FIXED
10 years ago
9 years ago

People

(Reporter: dbaron, Assigned: bz)

Tracking

({perf, regression})

Trunk
x86
Linux
perf, regression
Points:
---
Dependency tree / graph
Bug Flags:
wanted-next +
blocking1.9 -
wanted1.9 -

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 3 obsolete attachments)

The updating of the display area of a combobox is sometimes really delayed -- the updates happen after a number of changes made later have already updated, which confuses me about what changes I've made to a combobox.

I suspect this bug is much easier to reproduce on Linux because combobox painting there is really slow with the native theme stuff.  It's been bugging me for a while.

Steps to reproduce:
 1. load attachment 67802 [details] (or any attachment) from an account with lots of bugzilla permissions, or the testcase that I'll write and attach shortly
 2. click on the the first combobox to focus it
 3. repeatedly press -, <TAB>, -, <TAB>, -, etc., to set the value of each combobox to minus

Actual results: The focus change from one combobox to the next is visible as I'm tabbing (sometimes a bit slow), but the values don't change to minus until after I stop tabbing.

Expected results:  values appear as "-" before or at the same time as the focus ring is drawn on the next combobox

[17 17:30:27] <bz> dbaron: we update the display area off an event
[17 17:30:38] <bz> dbaron: you might be seeing that....
[17 17:30:54] <dbaron> yeah -- that event seems to have annoyingly low priority when my system is doing lots of stuff
[17 17:31:12] <dbaron> I often have to wait to submit to make sure that I set them to what I think I set them to.
[17 17:31:27] <bz> dbaron: hmm
[17 17:31:55] <bz> dbaron: we should fix that...
[17 17:32:30] <bz> dbaron: e.g. make a reflow flush flush it
Flags: blocking1.9?
So for some reason, I can't make a simplified testcase that shows this -- I see it only on the attachment page.  I even tried including the script that's in the attachment page...
Hmm.  So when I download the attachment page locally (save as, complete) I think see the problem.  It goes away when I remove the <iframe>, though.  Do you see something like that?
And I'm seeing X take up most of the time, for what it's worth.
Flags: blocking1.9? → blocking1.9-
Whiteboard: [wanted-1.9]
I have to wonder whether there are several different things going on here...

David, does not putting off the display area on an event in nsComboboxControlFrame help what you're seeing?
For what it's worth, I see most of the symptoms go away if I add

  input, combobox, textarea { border-style: solid }

to the page's stylesheet.  So I think this is a combination of 397303 (slow painting of inset/outset borders) and the issue discussed in bug 377419 comment 38 (us repainting the whole table on any change of any sort inside the table; totally unnecessary in this case).

Just adding that style rule make it take about 4-5x less time to mark a review on a patch over here...
Depends on: 377419
Flags: wanted1.9+
Whiteboard: [wanted-1.9]
The "solid" part was indeed bug 397303.
Depends on: 397303
(In reply to comment #5)
> So I think this is a combination of 397303 (slow
> painting of inset/outset borders) and the issue discussed in bug 377419 comment
> 38 (us repainting the whole table on any change of any sort inside the table;
> totally unnecessary in this case).

Could you check again, now that (bug 397303 and) bug 414298 has been fixed ?
For me, it's still pretty slow, though faster than before.  Dunno about David; he might have been seeing a different problem entirely.
To be more precise, it's still slow, and setting the borders to solid still makes it fast.

And in general, there's no reason to prompt me to check whether my own performance fixes help this sort of thing; I'd tested that a while back and found not much effect...
For what it's worth, when I filed this, I was complaining especially that they were slower than other things (i.e., "very slow" relative to other things rather than in absolute terms), and thus that they were updating out of order, which causes significant user confusion.  Making things faster is good, but as long as we can't guarantee that things update instantly, they should also update in the order the changes were made.
The events in question, for what it's worth, are nsComboboxControlFrame::RedisplayTextEvent.
Flags: wanted1.9-
Flags: wanted1.9+
Flags: wanted-next+
I've seen something similar on Bugzilla pages lately -- when selecting
in the dropdown menu of a flag combobox the menu area isn't repainted
so it looks like it's still posted.  I have to move the mouse slightly
for the repaint to happen.  Is this the same bug?
(I'm using 3.1a1pre nightly builds on Linux)
Created attachment 360540 [details] [diff] [review]
Proposed fix

This makes sure that we redisplay as soon as we can after the point when the event is posted.
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
Attachment #360540 - Flags: superreview?(mats.palmgren)
Attachment #360540 - Flags: review?(mats.palmgren)
Summary: combobox display areas are very slow to update → [FIX]combobox display areas are very slow to update
Blocks: 67752
FYI, make sure you're not seeing bug 339541 when testing this with
+,-,? Bugzilla <select>s.
I run into this bug using the down arrow too, so I don't think I'm seeing bug 339541.
Comment on attachment 360540 [details] [diff] [review]
Proposed fix

The patch looks ok, but I get a bunch of:

###!!! ASSERTION: nsDOMClassInfo not thread-safe: '_mOwningThread.GetThread() == PR_GetCurrentThread()', file dom/src/base/nsDOMClassInfo.cpp, line 1733

when changing the value with the keyboard.
I'd like to know why.
Attachment #360540 - Flags: superreview?(mats.palmgren)
Attachment #360540 - Flags: superreview-
Attachment #360540 - Flags: review?(mats.palmgren)
Attachment #360540 - Flags: review-
I'd like to know why too, because I don't see those assertions over here in my build with this patch.  Care to give me a stack or something to work with here?  And are you sure this patch is what triggers the assertions for you?  None of the code in this patch should have made anything run off the main thread unless something else is severely broken.
Created attachment 361793 [details]
stack for the assertion in comment 16

I recompiled with/without the patch a few times and I always get the
assertions with the patch and never without the patch.
I have re-tested in a different tree with the same result.
It only occurs when using the keyboard.
I also noticed just now that when using the mouse the patch doesn't seem
to help sometimes.  When clicking a new option the dropdown menu does
not close until I move the mouse slightly (I can reproduce this both
with and without the patch).  I'm on x86_64 Linux (KDE).
Er...  What thread is that event dispatch happening on in that case?  I really don't see how this patch would possibly affect that.

I'll spin up a Linux build to check, I guess.

Also, does this patch make the mouse situation _worse_, or just not better?  I wouldn't expect it to make anything related to event handling better.... all it does is run exactly the code we do now, but at a safe script run point instead of from the event loop.
Ok, I just tried my 32-bit Linux build, opened <https://bugzilla.mozilla.org/attachment.cgi?id=360540&action=edit>, tabbed to the review flag dropdowns, and tried changing their value, both by typing +/- and via the up/down arrow keys.

I get no assertions.  I'm pretty sure the assertions have nothing to do with this patch per se, unless you think that somewhere we're removing a script blocker on a non-main thread?  Can you add assertions to your nsScriptBlocker ctor/dtor to ensure that we aren't and see whether we hit them?
For what it's worth I do NOT hit such assertions locally while using keyboard on comboboxes.
(In reply to comment #19)
> Er...  What thread is that event dispatch happening on in that case?  I really
> don't see how this patch would possibly affect that.

Everything happens on the main thread, it's not a threading error.
But the assertion is using the _mOwningThread member which has been
overwritten due to a memory error...

nsContentUtils::AddScriptRunner() returns PR_TRUE both when it
adds it to its internal queue and when it runs it directly:
http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsContentUtils.cpp#4302
in the latter case it doesn't add a strong ref, so the local nsRefPtr
in nsComboboxControlFrame::RedisplayText is the last ref to the event
and thus destroys it when it goes out of scope.

The problem is mRedisplayTextEvent which keeps a plain pointer to the
event.  The current code depends on mRedisplayTextEvent being assigned
before Run() (leading to HandleRedisplayTextEvent() calling Forget()
on it).  As for the overwritten memory, Revoke() on mRedisplayTextEvent
calls RedisplayTextEvent::Revoke which nulls out 'mControlFrame' on the
deleted event... which is actually the _mOwningThread member on a
nsDOMClassInfo object...

> Also, does this patch make the mouse situation _worse_, or just not better?

Neither better nor worse.
Oh, we have callers that don't have a scriptblocker on the stack?  That would be bad, indeed, since HandleRedisplayTextEvent can kill the frame...

I'll go through and audit the callers.
Created attachment 362670 [details] [diff] [review]
Better fix, I think

This is a bit more involved, but basically makes sure that we always have a live script blocker when trying to post this script runner.

The nsHTMLSelectElement changes actually fix a serious issue: it was holding a frame pointer across a call that fired DOM events even without this change...
Attachment #360540 - Attachment is obsolete: true
Should I review the new patch?

In RedisplayText(), it doesn't hurt to be a bit defensive:

    mRedisplayTextEvent = event;
    if (!nsContentUtils::AddScriptRunner(event))
      mRedisplayTextEvent.Forget();

That should avoid having a dangling pointer even if someone forgets
a script blocker.
Forgetting a script runner blocker won't make that function return false, though.

I'll post a new slightly more defensive patch here.
Created attachment 362731 [details] [diff] [review]
Reorder things so that we're safe
Attachment #362670 - Attachment is obsolete: true
Attachment #362731 - Flags: superreview?(mats.palmgren)
Attachment #362731 - Flags: review?(mats.palmgren)
Comment on attachment 362731 [details] [diff] [review]
Reorder things so that we're safe

r+sr=mats with the last of the nits fixed.


>content/html/content/src/nsHTMLSelectElement.cpp
>@@ -236,28 +236,34 @@ nsHTMLSelectElement::InsertOptionsIntoLi
>-    nsISelectControlFrame* selectFrame = GetSelectFrame();
>+    nsISelectControlFrame* selectFrame = nsnull;
>+    nsWeakFrame weakSelectFrame;
> 
>     nsPresContext *presContext = nsnull;
>     if (selectFrame) {
>       presContext = GetPresContext();
>     }

'presContext' is always null.


>@@ -917,16 +924,17 @@ nsHTMLSelectElement::SetOptionsSelectedB
>   nsISelectControlFrame *selectFrame = nsnull;
>   PRBool did_get_frame = PR_FALSE;
>+  nsWeakFrame weakSelectFrame;

Can we remove 'did_get_frame' and just test '!weakSelectFrame.IsAlive()' ?


>layout/forms/nsListControlFrame.cpp
> nsListControlFrame::ComboboxFinish(PRInt32 aIndex)
>+    nsWeakFrame weakFrame(this);
>+
>     if (displayIndex != aIndex) {
>-      mComboboxFrame->RedisplaySelectedText();
>+      mComboboxFrame->RedisplaySelectedText(); // might destroy us
>     }
> 
>-    mComboboxFrame->RollupFromList(); // might destroy us
>+    if (this && mComboboxFrame) {
>+      mComboboxFrame->RollupFromList(); // might destroy us
>+    }

s/this && mComboboxFrame/weakFrame.IsAlive()/
Attachment #362731 - Flags: superreview?(mats.palmgren)
Attachment #362731 - Flags: superreview+
Attachment #362731 - Flags: review?(mats.palmgren)
Attachment #362731 - Flags: review+
> 'presContext' is always null.

Good catch.  I'm actually going to go ahead and remove the prescontext arg to that method, since it's unused (and not needed, since frames know their prescontext) in any case.

> Can we remove 'did_get_frame' and just test '!weakSelectFrame.IsAlive()' ?

Hrm.  That would impact performance if we have no presentation, but so does my code.  I guess the correct test is |did_get_frame || (selectFrame && !weakSelectFrame.IsAlive())|.

> s/this && mComboboxFrame/weakFrame.IsAlive()/

Ugh.  Yes, should have been |weakFrame.IsAlive() && mComboboxFrame|.
Created attachment 362828 [details] [diff] [review]
interdiff addressing the comments
Attachment #362731 - Attachment is obsolete: true
Created attachment 362829 [details] [diff] [review]
Patch updated to comments
Attachment #362829 - Flags: superreview?(mats.palmgren)
Attachment #362829 - Flags: review?(mats.palmgren)
Comment on attachment 362829 [details] [diff] [review]
Patch updated to comments

r+sr=mats

>+    PRBool did_get_frame = PR_FALSE;

I would prefer camel case so we follow the naming convention in this file.
Attachment #362829 - Flags: superreview?(mats.palmgren)
Attachment #362829 - Flags: superreview+
Attachment #362829 - Flags: review?(mats.palmgren)
Attachment #362829 - Flags: review+
Pushed http://hg.mozilla.org/mozilla-central/rev/c7206236523e with that change.
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.