Closed Bug 1724617 Opened 3 years ago Closed 3 years ago

Highlighting effect when hovering over bookmark item is noticeably delayed and Click is not responding when a folder is opened for the first time.

Categories

(Core :: Web Painting, defect)

Firefox 92
Desktop
Windows 10
defect

Tracking

()

VERIFIED FIXED
95 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox-esr91 --- unaffected
firefox90 --- unaffected
firefox91 --- unaffected
firefox92 --- wontfix
firefox93 --- wontfix
firefox94 + verified
firefox95 --- verified

People

(Reporter: alice0775, Assigned: mattwoodrow)

References

(Regression)

Details

(Keywords: nightly-community, perf, regression)

Attachments

(8 files, 2 obsolete files)

Attached video Screen capture.wmv (obsolete) —

Steps to reproduce:

  1. Prepare many bookmarks and folders.
  2. Open a folder and mouse hover over bookmarks items

Actual Results:
Sometimes, the highlighting effect when hovering over bookmark item is noticeably delayed.

Expected Results:
The highlighting effect should appear almost instantaneously.

Would you be able to try and use mozregression to get a refined regression range? I don't think this is likely to be old, since we haven't had it reported previously.

Flags: needinfo?(alice0775)

(In reply to Mark Banner (:standard8) from comment #2)

Would you be able to try and use mozregression to get a refined regression range? I don't think this is likely to be old, since we haven't had it reported previously.

I think this is not a recent regression.

Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(alice0775)
Resolution: --- → WONTFIX
Attached file places.sqlite sample (obsolete) —

Mouse response will get stuck for a long time(10-15seconds).
profiler log: https://share.firefox.dev/3lhzfXs
Screencast: https://youtu.be/I2TkaJt6pzw

Almost certainly, this is the regression range.
Regression window:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=ae11d50c40266fd3413ee79ef86e6ccb1dbbed9c&tochange=e505fffd43c3663a2ed608907ed17830cc526ec9

Keywords: regression
Regressed by: 1722258
Has Regression Range: --- → yes
Attachment #9235287 - Attachment is obsolete: true
Component: Performance → Web Painting
See Also: → 1730362
See Also: → 1729756
Attached file Debug build log

This is the output log of the debug build when a problem occurs on the bad build.

This is the output log of the debug build when a problem occurs on Latest Nightly94.0a1 debug build(20210911212243).

Keywords: perf
Attached file places.sqlite sample
Attachment #9240695 - Attachment is obsolete: true
See Also: → 1731133

Matt, could you look into the regression for this performance (temporary hang-up)?

Summary: Highlighting effect when hovering over bookmark item is noticeably delayed when a folder is opened for the first time. → Highlighting effect when hovering over bookmark item is noticeably delayed and Click is not responding when a folder is opened for the first time.

[Tracking Requested - why for this release]: ui hangs up temporarily.

STR:
0. Use attached places.sqlite

  1. Open bookmark and click なろう
  2. Click a subfolder to expand the subfolder
  3. move mouse on the parent folder
    --- no hover effect. BUG!
  4. Click the other subfolder to expand subfolder
    --- not expand the subfolder, BUG!

Actual Results:
no hover effect.
not expand the subfolder.
ui hangs up temporarily(for 10-15sec)

Expected Results:
Hover effect(highlight) should be provided
Subfolders should expand immediately.

I'm not able to reproduce this at all, on Windows 10. Rendering is maybe a tiny bit slow with the giant bookmarks folder, but it's within a frame or two of what I'd expect

Is the profile from comment 7 from a case where there was a 10 second hang? Rendering looks slow there, but there's no sign of it failing to draw anything.

Flags: needinfo?(matt.woodrow)

See screencast in comment 7.
elapsed time 0:06 - 0:21 , mouse click does not response, no hover highlight.

And the debug build logged(Attachment 9240777 [details]) a lot of WARNING if the problem is occurred.
[Parent 11192, Main Thread] WARNING: Must complete empty transaction when compositing!: file /builds/worker/checkouts/gecko/layout/base/PresShell.cpp:6276

(In reply to Matt Woodrow (:mattwoodrow) from comment #13)

I'm not able to reproduce this at all, on Windows 10. Rendering is maybe a tiny bit slow with the giant bookmarks folder, but it's within a frame or two of what I'd expect

Is the profile from comment 7 from a case where there was a 10 second hang? Rendering looks slow there, but there's no sign of it failing to draw anything.

Have you tested this on Windows 7? I know this bug is for Windows 10 but mine (1725629) was considered a duplicate even though it's for Win 7. Versions 92+ are so bad with that OS and bookmark scrolling, particularly with older computers like Pentium 4 they're unusable. With those I'm now forced to go with Edge or stay with 91esr. Even with modern hardware it's sluggish.

Incidentally, I'm seeing both bugs, the time it takes for the bookmark icons to show and the bookmark highlight to catch up with the mouse

Please test if you can.

See Also: 1729756
See Also: 1730362

(In reply to 202nine from comment #17)

Have you tested this on Windows 7? I know this bug is for Windows 10 but mine (1725629) was considered a duplicate even though it's for Win 7. Versions 92+ are so bad with that OS and bookmark scrolling, particularly with older computers like Pentium 4 they're unusable. With those I'm now forced to go with Edge or stay with 91esr. Even with modern hardware it's sluggish.

Incidentally, I'm seeing both bugs, the time it takes for the bookmark icons to show and the bookmark highlight to catch up with the mouse

Please test if you can.

Are you seeing a 10+ second hang with no response at all, or just general slowness/lag as moving through the highlighted options?

Slowness is somewhat expected on older machines, and we can probably do better if it's particularly problematic (which it seems that it is).

A complete hang for 10 seconds (with relatively fast paints after that as it catches up, as shown in the screencast) sounds like a different bug, and one that I can't reproduce.

Flags: needinfo?(dr2017)

No neither the icon display nor the bookmark highlighting hang, they're just very slow. I've got a very fast newer machine as well and while not quite as slow it's still sluggish.

Flags: needinfo?(dr2017)

The screencast in comment 7 and the incoming dupes are concerning to me. This feels like something we shouldn't let slip any further releases if we can avoid it.

In Bug 1731133, the user reported that a similar issue can be reproduced by enabling Windows High Contrast mode.

I can reproduce slowness on Win10 with high contrast mode turned on:
https://share.firefox.dev/3zYwcsr

I suspect the surface we're creating in gfxWindowsNativeDrawing::BeginNativeDrawing is very large.

I can reproduce the issue with high contrast mode too, the slowdown is definitely significant for that case. However, it seems that some people are reporting issues (like the screencast in this bug) which happen without high contrast mode enabled.

It seems like there's 3 potential separate issues here:

  • 10+ second 'hangs' with no menu highlights painted at all.
    I can't reproduce this at all, and none of the profiles thus far show anything obvious that would show why this is happening.

  • General slowness/lag for menu highlight changes.
    This is likely because we're no longer retaining content for these popup windows and are redrawing the whole thing every time.
    I can't really reproduce this, but that may largely be due to the performance of the systems I'm testing on this. This will almost certainly be worse on older computers.
    One option would be to enable WebRender for these popups (since they have the potential to be large). Another would be to try to optimize a bunch of slow spots, and hope that this is sufficient. We could also investigate doing some sort of simplistic dirty region tracking, trying to reuse the code from WR fallback.

  • Extreme slowness when high contrast is enabled..
    I can reproduce this, appears to be a subset of the above issue, where slow painting is exacerbated by the need to repaint the whole area including unchanged pixels. We're doing alpha recovery and the destination surface is skia, so every menu entry requires an alloc/dealloc of 2 Windows HDC surfaces.
    As above, switching to WebRender for the popups would take us back to only repainting the changed areas, so this would stop being an issue again.
    Another option would be to use a Cairo/HDC surface for the destination (which is what the widget provides), which would remove the need for alpha recovery and temporary surfaces (but increases our reliance on cairo drawing). It might also remove a copy from the skia surface into the widget's cairo surface at the end of each paint.
    We could also try some optimizations, like caching the temporary surfaces to reduce allocations.

Jeff, any preferences on what the best way forward is?

Flags: needinfo?(jmuizelaar)

Matt, did you confirm that the surface sizes are not substantially larger than the window?

Flags: needinfo?(jmuizelaar) → needinfo?(matt.woodrow)

Indeed, the surface sizes are all pretty small, there's just a lot of them.

Flags: needinfo?(matt.woodrow)

You fellows mentioned older, maybe low-performance computers.
I'm seeing this sluggishness on two Win7-64 desktop machines with quad-core i7 and GTX 960 and GTX 1070 graphics.
No other machines around here with Firefox.
The bookmark menu highlighting moves up or down about 20 items per second.
When I open a fresh bookmark submenu, its text appears instantly but its icons appear about 50 icons per second.
During that lagging, Task Manager shows approximately one core of steady CPU usage.
Even though rendering is lagging, I can click any menu item, and its page opens immediately.
Similar behavior with both Windows Classic Theme (my favorite) and various Aero themes.
I tried a clean install of a Firefox nightly into Sandboxie, copied over my huge bookmarks JSON, and saw the same problem.
That all began with FF 92.0.

I managed to reproduce the 'hang' by making menu paints even slower with a sleep.

Each mouse move is triggering a paint request, and also scheduling an nsMenuActivateEvent, which is what changes the highlighted option. Paint events from the OS have higher priority, and when they take longer than 16ms, we end up always prioritizing the paints and never processing the menu activate events.

As long as you keep moving the mouse, we just build up a longer and longer queue of menu activate events. Once you stop moving the mouse, we finally start to process them, one at a time (since each one schedules a paint, and that gets prioritized ahead of the remaining queue), so there's a long replay period where we catch up on all the changes individually.

Increasing the priority of nsMenuActivateEvent to VSYNC priority fixes this particular subset of the problem (painting is still slow, especially with high-contrast, but it's not stuck).

Smaug, any ideas what the best thing to do here is?

Flags: needinfo?(bugs)

None, of the options seems like an obvious winner. That being said, I think it's worth seeing how performance is using WebRender for the popups as well as using a Cairo surface. Both of these choices seem like ok options for a low churn fix.

For these non-retained popups, the refresh driver tick itself does no painting work other than requesting a native paint event from the OS. The resulting OS paint callback (OnPaint for Windows) is where we do the actual (slow) paint work.

I think that is what's breaking our scheduling, and causing the menu updates (as well as favicon rendering, see comment 29) to be processed incredibly slowly.

Depends on D127375

(In reply to Matt Woodrow (:mattwoodrow) from comment #30)

I managed to reproduce the 'hang' by making menu paints even slower with a sleep.

Each mouse move is triggering a paint request, and also scheduling an nsMenuActivateEvent, which is what changes the highlighted option. Paint events from the OS have higher priority, and when they take longer than 16ms, we end up always prioritizing the paints and never processing the menu activate events.

As long as you keep moving the mouse, we just build up a longer and longer queue of menu activate events. Once you stop moving the mouse, we finally start to process them, one at a time (since each one schedules a paint, and that gets prioritized ahead of the remaining queue), so there's a long replay period where we catch up on all the changes individually.

Increasing the priority of nsMenuActivateEvent to VSYNC priority fixes this particular subset of the problem (painting is still slow, especially with high-contrast, but it's not stuck).

Smaug, any ideas what the best thing to do here is?

Just some extra information, we end up basically only processing 3 types of events:

mousemove input events - UI priority - these request a refresh driver tick (a catch-up one usually)
'catch-up' refresh driver ticks - Vsync priority - these request a native OS paint event
native OS paint events - processed ahead of all prioritized events - these take >16ms.

These events take up the full vsync time slice, and as long as the mouse keeps moving, they're all we manage to process. Any normal priority events (like the nsMenuActivateEvent) get queued indefinitely.

(In reply to Matt Woodrow (:mattwoodrow) from comment #30)

Each mouse move is triggering a paint request, and also scheduling an nsMenuActivateEvent, which is what changes the highlighted option. Paint events from the OS have higher priority, and when they take longer than 16ms, we end up always prioritizing the paints and never processing the menu activate events.

I don't quite understand this part. OS level events shouldn't block Gecko events in the AppShell.
Does https://searchfox.org/mozilla-central/rev/25002b534963ad95ff0c1a3dd0f906ba023ddc8e/widget/nsBaseAppShell.cpp#16-19 not work?

Increasing the priority of nsMenuActivateEvent to VSYNC priority fixes this particular subset of the problem (painting is still slow, especially with high-contrast, but it's not stuck).

vsync priority in the Gecko's main thread doesn't affect how OS level events are processed.

Flags: needinfo?(bugs)

(In reply to Olli Pettay [:smaug] from comment #38)

I don't quite understand this part. OS level events shouldn't block Gecko events in the AppShell.
Does https://searchfox.org/mozilla-central/rev/25002b534963ad95ff0c1a3dd0f906ba023ddc8e/widget/nsBaseAppShell.cpp#16-19 not work?

I think we're not entirely starving gecko events, but the vsync/io priority gecko events (mousemove, catch-up refresh tick) in combination with the long-running OS level events (OnPaint) are sufficient to starve the normal priority gecko events.

Assignee: nobody → matt.woodrow
Attachment #9244113 - Attachment description: WIP: Bug 1724617 - WIP: Use WebRender for menu popups. → Bug 1724617 - Use WebRender for menu popups.
Status: NEW → ASSIGNED
Pushed by mwoodrow@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8ff0f5b38b62 Use WebRender for menu popups. r=jrmuizel
Status: ASSIGNED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 95 Branch

That nightly solves the lagging bookmarks menus on my system. Thank you!

Same as with comment #42, today's nightly resolved the bookmark lagging in Windows 7 on both the newer and older hardware.

Nightly 95.0a1 resolves the bookmark issue reported in 1730362. Thanks.

Regressions: 1734346

The patch landed in nightly and beta is affected.
:mattwoodrow, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.
If yes, don't forget to request an uplift for the patches in the regression caused by this fix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(matt.woodrow)

Hi Glenn, I guess the uplift request for this falls under you now. Please request when you're comfortable with this and bug 1734346.

Flags: needinfo?(gwatson)

Comment on attachment 9244113 [details]
Bug 1724617 - Use WebRender for menu popups.

Beta/Release Uplift Approval Request

  • User impact if declined: The bookmarks menu can be very sluggish in some situations (Win7, Win10 high contrast mode) A fair number of users have complained.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Medium
  • Why is the change risky/not risky? (and alternatives if risky): Medium-to-high risk. This completely changes the backend that we use for these menus. It should be ok, and seems to be working fine on Nightly, but who knows what could happen in some weird niche situation.
  • String changes made/needed:
Attachment #9244113 - Flags: approval-mozilla-beta?

Comment on attachment 9244113 [details]
Bug 1724617 - Use WebRender for menu popups.

A bit scary, but we still have a week of Beta left this cycle and this has been picking up a number of dupes. Let's give it a shot and circle back before we build the RC to see if we're still a go for shipping. I also think it would be good if QA could do some targeting testing around pop-up menus to look for any more edge cases. Approved for 94.0b7.

Flags: needinfo?(matt.woodrow)
Flags: needinfo?(gwatson)
Attachment #9244113 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+
QA Whiteboard: [qa-triaged]

Hello! Unfortunately, I could only reproduce the issue while having High Contrast enabled on Windows 7x64 and Windows 10x64 on the same machine with Firefox 92.0a1 (20210807214822). I used STR from comment 12 and the attached places.sqlite.

The issue is no longer reproducible using Firefox 94.0b8 (20211019190240) and 95.0a1 (20211020093007) on the same machine running Windows 7x64 and Windows 10x64 while High Contrast is enabled.

Alice, given the fact that I could not reproduce the issue only with High contrast, can you please also verify if you have the time on the latest beta and nightly to see if the issue is fixed on your end as well? Thank you in advance!

Flags: needinfo?(alice0775)

I managed to reproduce the issue in Nightly94.0a1(20211004095342) Windows10.
And I verified fix this in Firefox94.0b8(20211019190240) and Nightly95.0a1(20211020093007) Windows10.

Status: RESOLVED → VERIFIED
Flags: needinfo?(alice0775)

Thank you, Alice!

Flags: qe-verify+
Regressions: 1758691
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: