Open Bug 1742656 Opened 3 years ago Updated 2 months ago

The WinWindowOcclusionCalc thread uses a visible amount of CPU time

Categories

(Core :: Widget: Win32, defect, P3)

Desktop
Windows
defect

Tracking

()

Tracking Status
firefox-esr91 --- unaffected
firefox94 --- unaffected
firefox95 --- unaffected
firefox96 --- disabled
firefox97 --- disabled
firefox98 --- wontfix
firefox99 --- wontfix
firefox100 --- wontfix

People

(Reporter: florian, Unassigned)

References

(Blocks 2 open bugs, Regression)

Details

(Keywords: power, regression, Whiteboard: [win:power])

When looking at the parent process in about:processes, I see the WinWindowOcclusionCalc thread using about 1% of a CPU core when moving the mouse around above the Firefox window, and about 2% of a CPU core when moving another window.

Here is a profile of it: https://share.firefox.dev/3FF41SL

Flags: needinfo?(sotaro.ikeda.g)

Set release status flags based on info from the regressing bug 1732736

Thanks. From your comment and what I could see in the profile, it seems the CPU time is not used by our code reacting to the events, but rather by the overhead of the operating system sending us these events. In the profile, most of the time is spent in NtUserGetQueueStatus, PeekMessageW, NtUserKillTimer.

So questions about ways to potentially reduce this overhead:

  • Is there another thread that already receives these events and could forward only the interesting ones to the WinWindowOcclusionCalc thread?
  • My understanding is that detecting window occlusion is meant to save power by avoiding graphics operations in windows that are guaranteed to be invisible. Could we run the occlusion detection code only when we would otherwise be doing expensive graphics operations? Maybe that would mean only when vsync is enabled.
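
To make the first question concrete, here is a minimal sketch of what "forward only the interesting ones" could mean. It is not Firefox code: `WinEventInfo` and `ShouldForwardToOcclusionThread` are hypothetical names, and the constants are redefined locally with the values of the corresponding `winuser.h` definitions so the snippet compiles anywhere. It assumes that mouse movement shows up as location-change events for the cursor object, which cannot affect window occlusion.

```cpp
#include <cstdint>

// Local copies of Win32 constants (values as defined in winuser.h).
constexpr std::uint32_t kEventObjectLocationChange = 0x800B;  // EVENT_OBJECT_LOCATIONCHANGE
constexpr std::int32_t kObjIdWindow = 0;                      // OBJID_WINDOW
constexpr std::int32_t kObjIdCursor = -9;                     // OBJID_CURSOR

// Hypothetical, simplified view of what a WinEvent hook callback receives.
struct WinEventInfo {
  std::uint32_t event;     // event id, e.g. kEventObjectLocationChange
  std::uintptr_t hwnd;     // 0 when no window is associated with the event
  std::int32_t idObject;   // which part of the window changed
};

// Hypothetical predicate: true only for events that could change occlusion.
bool ShouldForwardToOcclusionThread(const WinEventInfo& e) {
  // Events without a window (assumed to include cursor moves) are irrelevant.
  if (e.hwnd == 0) {
    return false;
  }
  // Only changes to the window object itself can alter what is covered.
  return e.idObject == kObjIdWindow;
}
```

Note that a filter like this only helps if it runs on a thread that already pays the cost of receiving the events. Filtering on the WinWindowOcclusionCalc thread itself would not remove the OS delivery overhead seen in the profile.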
Flags: needinfo?(sotaro.ikeda.g)
Blocks: 1688997
Flags: needinfo?(sotaro.ikeda.g)

The severity field is not set for this bug.
:jimm, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jmathies)
Severity: -- → S4
Flags: needinfo?(jmathies)

NI for comment 3 that was probably unset by accident.

Flags: needinfo?(sotaro.ikeda.g)
Has Regression Range: --- → yes

Sotaro, should this bug block shipping occlusion culling? Do we know if Chrome has the same overhead?

(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)

Sotaro, should this bug block shipping occlusion culling? Do we know if Chrome has the same overhead?

Judging from its source code, Chrome should have the same overhead. So I think this bug does not block shipping.

Flags: needinfo?(sotaro.ikeda.g)

In the current implementation, the global event hook is unregistered when all Firefox windows are minimized.
One thing we can do to reduce power usage is to add more situations in which the global event hook is unregistered.

Priority: -- → P3

Following Jeff's suggestion, I captured ETW profiles of both Firefox and Chrome. In both cases I had the browser window on the left half of the screen, and the Windows Task Manager on the right half of the screen. I moved the mouse over the Task Manager window during the profile.

Both browsers used non-zero CPU while the mouse was moving over the other window, unless all their windows were minimized. So it's quite possible that we perform similarly to Chrome here, but I still think we can and should do better.

Firefox profile: https://share.firefox.dev/3IJKRN1
Chrome profile: https://share.firefox.dev/3g2Ogub (I don't know which thread is the relevant one)

Sotaro, following what you suggested in comment 8 and my question in comment 3, could we unregister the global event hook when vsync is not enabled?
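
A rough sketch of the policy being asked about, to make the proposal easier to discuss. This is not the existing implementation: `GlobalHookPolicy` and its parameters are hypothetical, and the two callbacks stand in for whatever code actually registers and unregisters the hook (on Windows that would presumably wrap `SetWinEventHook` and `UnhookWinEvent`). The "not all windows minimized" condition is the one described in comment 8; the vsync condition is the proposed addition.

```cpp
#include <functional>
#include <utility>

// Hypothetical helper: keeps the global event hook registered only while
// occlusion results could actually save work.
class GlobalHookPolicy {
 public:
  GlobalHookPolicy(std::function<void()> registerHook,
                   std::function<void()> unregisterHook)
      : mRegister(std::move(registerHook)),
        mUnregister(std::move(unregisterHook)) {}

  // Call whenever the number of non-minimized windows or the vsync state changes.
  void Update(int nonMinimizedWindowCount, bool vsyncEnabled) {
    // Existing condition: at least one window is not minimized.
    // Proposed extra condition (assumption): vsync is enabled.
    const bool wanted = nonMinimizedWindowCount > 0 && vsyncEnabled;
    if (wanted == mRegistered) {
      return;  // No state change, so avoid redundant hook calls.
    }
    if (wanted) {
      mRegister();
    } else {
      mUnregister();
    }
    mRegistered = wanted;
  }

  bool IsRegistered() const { return mRegistered; }

 private:
  std::function<void()> mRegister;
  std::function<void()> mUnregister;
  bool mRegistered = false;
};
```

With a policy like this, the OS would stop delivering global events whenever nothing is being painted, at the cost of needing a fresh occlusion calculation when the hook is registered again.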

Flags: needinfo?(sotaro.ikeda.g)

Thanks for getting the profiles Florian. One thing I noticed is that Chrome seems to have a more efficient event loop for this thread.

In Chrome it's basically just calling RealMsgWaitForMultipleObjectsEx and PeekMessageW, whereas in Firefox we also have calls to KillTimer and GetQueueStatus. These functions account for about 26% of the time on this thread: https://share.firefox.dev/3Gd8trE

Sotaro, is it possible to use a simpler event loop for this thread?

It seems like that, together with unregistering the hook when vsync is not enabled, should go a decent way toward moving this overhead below the noise threshold.

Keywords: power

Too late for a fix in 98.

Whiteboard: [win:power]
Flags: needinfo?(sotaro.ikeda.g)