Closed Bug 1580117 Opened 5 years ago Closed 4 years ago

[macOS] High CPU utilization with restored minimized windows

Categories

(Core :: Graphics, defect, P3)

Unspecified
macOS
defect

Tracking

()

VERIFIED FIXED
83 Branch
Performance Impact medium
Tracking Status
relnote-firefox --- 83+
firefox83 --- verified

People

(Reporter: haik, Assigned: jryans)

References

Details

(Keywords: perf, perf:resource-use)

Attachments

(5 files)

On macOS 10.15 Beta 7 (19A546d), I've noticed that with several windows minimized to the dock, when Firefox is restarted (with "Restore previous session" enabled), the Firefox parent process ("firefox") has very high CPU utilization. In my testing, the load was at least 40% for several minutes. For the tests below, I have ~50 windows minimized with each window having one tab with a static HTML file opened from ~/Desktop/basic.html. I first noticed this problem after restarting with ~10 windows minimized.

I have not tested on 10.14 or earlier releases.

Steps to reproduce:

  1. Open 50 windows each with a single tab with the attached basic.html file loaded via file:/// from the filesystem
  2. Minimize each window, leaving 1 remaining window
  3. Enable "Restore previous session" in Firefox's preferences
  4. Quit Firefox
  5. Start Firefox
  6. Check the CPU utilization of the firefox parent process (using $ top -o cpu).

Profiles collected:

  1. https://perfht.ml/2ZONiqQ (after startup with ~50 windows minimized)
  2. https://perfht.ml/2ZUqKVx (a second profile collected a few minutes later)
  3. https://perfht.ml/2ZVHdZI (another instance of the problem after restarting the browser with ~50 minimized basic.html windows.)

After un-minimizing all ~50 minimized windows, %CPU utilization drops to 0%.

Attached file Static HTML Test page

Uh, is e10s off for these profiles? They do not seem to include content processes...

Flags: needinfo?(haftandilian)

Oh, the content processes are hidden but still available.

In any case... while this is filed under session restore, I don't see evidence in the profile that that's what's causing the CPU spike (only on the order of 10-20 samples are in sessionstore js code, when the entire profile spans 10s of seconds). In fact, I don't see much of anything in terms of samples in the profile that shows the CPU usage...

Florian, can you make anything of this?

Flags: needinfo?(haftandilian) → needinfo?(florian)
Whiteboard: [qf]

Haik - can you get a profile with "," for the threads listing in the profiler?
Also, a dump of the top data (say once per second) and a screenshot of Activity Monitor (CPU and Energy use pages).

Have you (or anyone else) seen this on any other OS version, or other OS? Does 71 Nightly show the same issue?

Thanks!

Flags: needinfo?(haftandilian)

See attached log files and the following profiles collected from the same instance of the problem with all threads included. These were collected on Nightly 71 on macOS Catalina Beta 19A558d.

https://perfht.ml/34SG4pC
https://perfht.ml/34S2nvj

In some of the Activity Monitor samples for the parent process main thread, I noticed OSSpinLockLockSlow appearing a few times in allocator stacks. Looking at the other threads in the profile, I didn't see anything obvious.

If I can collect any more useful data I'll post it.

Flags: needinfo?(haftandilian)

jesup: this requires some analysis. Can you do a first pass?

Flags: needinfo?(rjesup)
Whiteboard: [qf] → [qf:p2:resource]

In the profiles it looks like there's activity all the time in the compositor track, despite nothing changing on screen. Markus, could you please have a look at these profiles when you get a chance?

Flags: needinfo?(florian) → needinfo?(mstange)

The busiest thread is Main Process/MainThread --- and it's only about 10% busy. Compositor is maybe 5%. (And those are of one core)
BTW, it's odd you have only 2 content threads..... can you verify you haven't artificially made it slower? How many cores? 2, 4?

It smells like either a non-nsThread (there are some, such as OS service threads and webrtc threads (not an issue here)), or an OS/kernel/OS-lib issue. Perhaps Markus knows something here

Flags: needinfo?(rjesup)

All these windows think they're animated and have vsync listeners. But then compositing detects that there's nothing to composite. We need to find out why these windows don't stop listening for vsync.

Haik, do you see the same problem with Firefox 70 / 71 (Beta / Nightly) builds?

Flags: needinfo?(haftandilian)

(In reply to Markus Stange [:mstange] [on PTO until Sept 30] from comment #10)

All these windows think they're animated and have vsync listeners. But then compositing detects that there's nothing to composite. We need to find out why these windows don't stop listening for vsync.

Haik, do you see the same problem with Firefox 70 / 71 (Beta / Nightly) builds?

I have reproduced it on Released 69 and Nightly 71 (using the same profile that was created with Release 69).

Flags: needinfo?(haftandilian)

(In reply to Haik Aftandilian [:haik] from comment #11)

(In reply to Markus Stange [:mstange] [on PTO until Sept 30] from comment #10)

All these windows think they're animated and have vsync listeners. But then compositing detects that there's nothing to composite. We need to find out why these windows don't stop listening for vsync.

Haik, do you see the same problem with Firefox 70 / 71 (Beta / Nightly) builds?

I have reproduced it on Released 69 and Nightly 71 (using the same profile that was created with Release 69).

Once I un-minimize all the restored windows that are in the dock, the problem seems to go away and the utilization drops to zero.

See Also: → 1567018
See Also: → 1579212

I still hit this problem and it's pretty severe on my 2018 MacBook Air Dual Core i5. After a session restore with 4 minimized windows, the CPU load from the parent process is 20% while the browser should be idle. The 4 minimized windows have a single tab with a basic page like https://www.craigslist.org and the one un-minimized window has about:blank loaded.

Opening the windows and then minimizing them fixes the problem and makes the load drop to zero.

I found that setting the window docShell.isActive=false from the browser toolbox stops the compositing:

>> BrowserWindowTracker.orderedWindows.filter(w => w.windowState == w.STATE_MINIMIZED).forEach(w => w.docShell.isActive)

SendSetDocShellIsActive()/RecvSetDocShellIsActive() is being called for minimized windows at startup.

Ignoring when it is manually set, from the browser toolbox, the docShell.isActive value is always seems to be true including after minimizing a window. I don't know if that's an error or not:

>> BrowserWindowTracker.orderedWindows.filter(w => w.windowState == w.STATE_MINIMIZED).map(w => w.docShell.isActive)
Array [ true ]

Could these values be out of sync such that the parent thinks the docShell is active when it shouldn't be.

Any ideas on how to debug further?

Component: Session Restore → Graphics
Product: Firefox → Core
Summary: [macOS 10.15] High CPU utilization with many restored minimized windows → [macOS 10.15] High CPU utilization with restored minimized windows
Version: unspecified → Trunk

I've recategorized this is a graphics bug. I think this might be worth looking into from the graphics perspective. It could be a low hanging fruit fix that has a noticeable effect on Mac laptop battery life.

Flags: needinfo?(mstange) → needinfo?(jbonisteel)

@nika, does comment 13 about docShells provide any clues about what might be going wrong here?

Flags: needinfo?(nika)

I wonder if the situation is any different with/without WebRender.

Priority: -- → P3

(In reply to Dzmitry Malyshau [:kvark] from comment #17)

I wonder if the situation is any different with/without WebRender.

It doesn't appear to make a difference. I just tested with WebRender enabled and see the same behavior on two different Mac machines.

Markus - wondering if you have any thoughts on this?

Flags: needinfo?(jbonisteel) → needinfo?(mstange)

(In reply to Haik Aftandilian [:haik] from comment #13)

Could these values be out of sync such that the parent thinks the docShell is active when it shouldn't be.

I'm not sure whether this is unexpected. I did a quick scan, and there doesn't appear to be any logic which would set isActive to false on a chrome docShell when we minimize a window.

A quick test in the browser console with a window suggested that isActive doesn't change (admittedly, this is on linux):

>> docShell.isActive
<- true
>> window.minimize()
<- undefined
>> docShell.isActive
<- true

I think that flag is primarily set by frontend on content tabs to tell gecko which one is currently active, and I'm not sure it's ever set on the toplevel browser window.

Flags: needinfo?(nika)

I'm unable to reproduce this using the steps in comment #0

I believe I am able to reproduce this issue. I used the attached basic.html loaded as file content and followed the steps in comment #0 in a new profile.

OS: macOS 10.14.6
Machine: MacBook Pro (15-inch, 2018) / MacBookPro15,1
GPUs: Intel UHD Graphics 630, Radeon Pro 560X

While I did try 50 windows as per comment #0, 10 windows seemed to be enough to induce the issue for me, so I went with that as I continued testing.

WebRender Enabled

https://share.firefox.dev/39auatu

  • Minimized windows at startup: 20% constant CPU
  • All windows normal state: occasional 0 - 10% CPU

While the windows are minimized, the browser process compositor thread shows 3 - 4 VsyncTimestamp and CompositeToTarget markers for every frame (seems incorrect). 3 of the RenderBackend threads are actively building frames.

When all windows restored to normal state, they all report VsyncTimestamp to the compositor thread, but no compositing happens (seems expected).

When re-minimizing all but 1 window after they were all normal, only 1 window reports VsyncTimestamp, and again no compositing happens (seems expected).

WebRender Disabled

https://share.firefox.dev/3fLljRw

  • Minimized windows at startup: 10% constant CPU
  • All windows normal state: occasional 0 - 10% CPU

While the windows are minimized, the browser process compositor thread shows 3 - 4 VsyncTimestamp, followed by Composite and NoCompositorScreenshot because nothing changed (seems incorrect).

This suggests WebRender disabled is following the same pattern as WebRender enabled, in that several windows are trigger composition, even though nothing has changed.

I've made some progress in isolating the issue.

The CPU usage is (as Markus suspected in comment 10) due to some of the minimized windows (3 out of 10 with this test case on my machine) animating indefinitely.

The animation involved is the tab throbber transform animation. In particular, there seems to be specific styling to animate the throbber after session restore:

  :root[sessionrestored] .tab-throbber[busy]::before {
    ...
    animation: tab-throbber-animation 1.05s steps(30) infinite;
    ...
  }

  @keyframes tab-throbber-animation {
    0% { transform: translateX(0); }
    100% { transform: translateX(-100%); }
  }

If I remove this animation style, the CPU usage goes away.

This does not appear to be a frontend / browser issue: shortly after startup, the busy attribute is correctly removed from all of the tabs. So it seems there's a bug in either the style or graphics pipeline that fails to stop these animations after the busy attribute is removed. In 7 out of 10 windows, the animation is stopped correctly, but for the other 3 windows, the animation runs forever, until the window is un-minimized / restored to normal state.

Summary: [macOS 10.15] High CPU utilization with restored minimized windows → [macOS] High CPU utilization with restored minimized windows

For the WebRender case at least, the root of the issue seems to be that some of the windows go through a timeline like this:

  1. The window minimizes
  2. We're still animating the tab throbber on every vsync
  3. The busy attribute is removed and frames are reconstructed after styling
  4. As part of this the WebRenderAnimationData for the animation is marked for discarding:
WebRenderAnimationData::~WebRenderAnimationData() {
  // It may be the case that nsDisplayItem that created this WebRenderUserData
  // gets destroyed without getting a chance to discard the compositor animation
  // id, so we should do it as part of cleanup here.
  uint64_t animationId = mAnimationInfo.GetCompositorAnimationsId();
  // animationId might be 0 if mAnimationInfo never held any active animations.
  if (animationId) {
    mManager->AddCompositorAnimationsIdForDiscard(animationId);
  }
}
  1. This would normally be sent along by RenderRootStateManager::DiscardCompositorAnimations at the end of the next layer transaction...
  2. ...but we never actually paint the window at this point, since NeedsPaint return false (because the window is minimized and thus not visible)

I can imagine several possible fixes:

  • Immediately call DiscardCompositorAnimations as part of the AnimationData destructor (this seems to work, but might not be ideal for some reason)
  • Change NeedsPaint for macOS windows to allow painting when the compositor is still running, similar to Android windows (a bit silly, since we don't want to paint them really, but it would be enough to get discard message across, and stop the animation, and then the vsync listener)
  • Stop all activity on the minimized windows by setting browser window docshells inactive (similar to bug 1578123 for occluded windows)

Kats, I see you've worked on the ~WebRenderAnimationData() code before - is there a better way to forward animation discards here, without requiring a paint?

Flags: needinfo?(mstange.moz) → needinfo?(kats)

(Copying my comment from #gfx):

I didn't read the whole bug but it seems to me like when the window is minimized we should stop vsync on it, and so even if the compositor animation doesn't get discarded it shouldn't keep animating. If memory serves each widget has its own vsync dispatcher so if a window knows it's minimized, it should be able to stop the vsync events. That seems conceptually like a better approach to me but you probably know details from your investigation that might make that solution complicated.

Flags: needinfo?(kats)

Matt - this would be a good bug to take a look at

Flags: needinfo?(matt.woodrow)

I attempted the approach :kats recommended of having the window stop the VsyncDispatcher when minimized, but this does not fully resolve the issue and seems to fight against the intended way the vsync dispatchers are managed. CompositorVsyncScheduler expects to be the one managing whether the vsync observer (and thus the dispatcher) are enabled for a window, and so it can cause it to become re-enabled. Of course, we could add various flags to prevent that when minimized, but it feels a bit awkward to me.

I also experimented with setting the browser window docshell to inactive when minimized, but curiously that did not seem to stop the compositor work for windows that have already gotten trapped in the compositing loop (perhaps this is different for the WR path I am focused on vs. legacy path that :haik tested in comment 13, I haven't tried it with the legacy rendering path yet).

The current approach I am exploring is for the CompostorBridgeParent / WebRenderBridgeParent to skip compositing when the window is hidden, similar to the existing paused state in those classes. There's one wrinkle at the moment where a few windows get confused and keep their refresh drivers active. I'm hoping to sort that out soonish if I can.

(I haven't assigned this bug to myself yet, as I am not confident in this area and only have a small amount of time for working on this, but anyway, I'll keep trying for now, and we'll see if I get to something workable.)

(In reply to J. Ryan Stinnett [:jryans] (Use needinfo, replies may be slow) from comment #28)

I also experimented with setting the browser window docshell to inactive when minimized, but curiously that did not seem to stop the compositor work for windows that have already gotten trapped in the compositing loop (perhaps this is different for the WR path I am focused on vs. legacy path that :haik tested in comment 13, I haven't tried it with the legacy rendering path yet).

I think this makes sense, it'd stop the refresh driver from painting the document (which should fix the issue from bug 1578123), but wouldn't affect the compositor associated with that window.

The current approach I am exploring is for the CompostorBridgeParent / WebRenderBridgeParent to skip compositing when the window is hidden, similar to the existing paused state in those classes. There's one wrinkle at the moment where a few windows get confused and keep their refresh drivers active. I'm hoping to sort that out soonish if I can.

This sounds like it should work. Can we make AppWindow::OcclusionStateChanged both set the active state on mDocShell, and SendPause/SendResume on mWindow->GetRemoteRenderer?

Flags: needinfo?(matt.woodrow)

(In reply to Matt Woodrow (:mattwoodrow) from comment #29)

(In reply to J. Ryan Stinnett [:jryans] (Use needinfo, replies may be slow) from comment #28)

The current approach I am exploring is for the CompostorBridgeParent / WebRenderBridgeParent to skip compositing when the window is hidden, similar to the existing paused state in those classes. There's one wrinkle at the moment where a few windows get confused and keep their refresh drivers active. I'm hoping to sort that out soonish if I can.

This sounds like it should work. Can we make AppWindow::OcclusionStateChanged both set the active state on mDocShell, and SendPause/SendResume on mWindow->GetRemoteRenderer?

My first approach was to use the existing CompositorWidget::IsHidden() method (after implementing it for the macOS in-process case) as a way for the compositor to ask about the widget state, but this leads to crashes, most likely because of widget state changing on the main thread while the compositor is accessing it at the same time, so that ends up seeming like a bad design.

My next approach is indeed to try sending pause / resume to compositor like you mention. In the macOS case at least, the remote renderer seems to be on the mainChildView, not the app window directly, so it may involve a few contortions to reach it, but that's the basic idea at least.

Assignee: nobody → jryans
Status: NEW → ASSIGNED

This pauses composition when a window becomes fully occluded.

This is particularly important for platforms like macOS, where minimized /
hidden windows historically would remain compositing, potentially on every vsync
if the hidden window contained animations. This should lead to a nice power
savings by skipping this unnecessary work.

This change affects both the WebRender and legacy compositor paths.

As bug 1580117 is believed to be a macOS specific issue, this is currently
implemented for macOS only, but could in the future be generalized to other
platforms as well (likely using size mode changes since other platforms do not
notify on occlusion change).

If the browser window is minimized, it's unlikely that a preloaded browser will
be useful in the near future. This change skips creating one in such a case.

This avoids an issue in the graphics layer where about:newtab content marks
itself as renderLayers = true, and will try to composite content in windows
which are not compositing (because they are hidden).

Depends on D85954

I believe this is nearly ready to land... We just need to agree on an approach for the preloaded browser case, to ensure they don't get stuck in a loop, burning CPU with vsync refresh drivers staying active. In case :mstange is busy, maybe :mattwoodrow can take a look at the discussion in https://phabricator.services.mozilla.com/D85955 and advise on a preferred approach?

Flags: needinfo?(matt.woodrow)

Added some comments in the first patch there!

Flags: needinfo?(matt.woodrow)

Thanks! 😄 I have replied to your comments, please take another look.

Flags: needinfo?(matt.woodrow)
Flags: needinfo?(matt.woodrow)
Attachment #9168046 - Attachment description: Bug 1580117 - Skip preloaded browsers in minimized windows. r=Gijs → Bug 1580117 - Skip preloaded browsers in minimized windows. r=mattwoodrow
Pushed by jryans@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/70b864f4b6a5
Pause composition for occluded windows on macOS. r=mstange
https://hg.mozilla.org/integration/autoland/rev/aba57d4f19cb
Skip preloaded browsers in minimized windows. r=mattwoodrow,Gijs

Backed out 2 changesets (bug 1580117) for mochitest and webrender failures.

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&searchStr=os%2Cx%2C10.14%2Copt%2Cmochitests%2Ctest-macosx1014-64%2Fopt-mochitest-devtools-chrome-e10s%2Cdt3&fromchange=d3b1f72de9dd13a52cc92c4354ec48a22333eb0a&tochange=8fdb8525ddfc3c0a6ed6535b7841bbdb24757c9d&selectedTaskRun=DmKY1hgsRvaEhmf9LeG6-Q.0

Backout link: https://hg.mozilla.org/integration/autoland/rev/8fdb8525ddfc3c0a6ed6535b7841bbdb24757c9d

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=314559029&repo=autoland&lineNumber=19333

[task 2020-09-01T12:03:16.145Z] 12:03:16     INFO - TEST-START | devtools/client/webconsole/test/browser/browser_console_chrome_context_message.js
[task 2020-09-01T12:03:16.609Z] 12:03:16     INFO - GECKO(1621) | JavaScript warning: resource://devtools/shared/builtin-modules.js, line 196: debuggee 'resource://devtools/shared/base-loader.js:289' would run
[task 2020-09-01T12:03:16.629Z] 12:03:16     INFO - GECKO(1621) | JavaScript warning: resource://devtools/shared/builtin-modules.js, line 196: debuggee 'resource://devtools/shared/base-loader.js:289' would run
[task 2020-09-01T12:03:33.860Z] 12:03:33     INFO - TEST-INFO | started process screencapture
[task 2020-09-01T12:03:33.989Z] 12:03:33     INFO - TEST-INFO | screencapture: exit 0
[task 2020-09-01T12:03:33.989Z] 12:03:33     INFO - Buffered messages logged at 12:03:16
[task 2020-09-01T12:03:33.989Z] 12:03:33     INFO - Entering test bound 
[task 2020-09-01T12:03:33.990Z] 12:03:33     INFO - Console message: OpenGL compositor Initialized Succesfully.
[task 2020-09-01T12:03:33.990Z] 12:03:33     INFO - Version: 2.1 INTEL-12.9.22
[task 2020-09-01T12:03:33.990Z] 12:03:33     INFO - Vendor: Intel Inc.
[task 2020-09-01T12:03:33.990Z] 12:03:33     INFO - Renderer: Intel Iris OpenGL Engine
[task 2020-09-01T12:03:33.990Z] 12:03:33     INFO - FBO Texture Target: TEXTURE_2D
[task 2020-09-01T12:03:33.990Z] 12:03:33     INFO - Console message: OpenGL compositor Initialized Succesfully.
[task 2020-09-01T12:03:33.990Z] 12:03:33     INFO - Version: 2.1 INTEL-12.9.22
[task 2020-09-01T12:03:33.991Z] 12:03:33     INFO - Vendor: Intel Inc.
[task 2020-09-01T12:03:33.991Z] 12:03:33     INFO - Renderer: Intel Iris OpenGL Engine
[task 2020-09-01T12:03:33.991Z] 12:03:33     INFO - FBO Texture Target: TEXTURE_2D
[task 2020-09-01T12:03:33.991Z] 12:03:33     INFO - Console message: [JavaScript Warning: "The Web Console logging API (console.log, console.info, console.warn, console.error) has been disabled by a script on this page."]
[task 2020-09-01T12:03:33.991Z] 12:03:33     INFO - Console message: [JavaScript Warning: "debuggee 'resource://devtools/shared/base-loader.js:289' would run" {file: "resource://devtools/shared/builtin-modules.js" line: 196}]
[task 2020-09-01T12:03:33.991Z] 12:03:33     INFO - Console message: [JavaScript Warning: "debuggee 'resource://devtools/shared/base-loader.js:289' would run" {file: "resource://devtools/shared/builtin-modules.js" line: 196}]
[task 2020-09-01T12:03:33.992Z] 12:03:33     INFO - Adding a new tab with URL: data:text/html,<script>console.log("hello from content")</script>
[task 2020-09-01T12:03:33.992Z] 12:03:33     INFO - Buffered messages logged at 12:03:17
[task 2020-09-01T12:03:33.992Z] 12:03:33     INFO - Console message: [JavaScript Error: "The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol." {file: "data:text/html,<script>console.log("hello from content")</script>" line: 0}]
[task 2020-09-01T12:03:33.992Z] 12:03:33     INFO - Tab added and finished loading
[task 2020-09-01T12:03:33.992Z] 12:03:33     INFO - Opening the toolbox
[task 2020-09-01T12:03:33.992Z] 12:03:33     INFO - Toolbox opened and focused
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - Wait for expected message are shown on browser console
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - Console message: [JavaScript Error: "Cu.reportError" {file: "debugger eval code" line: 1}]
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - @debugger eval code:1:4
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - getEvalResult@resource://devtools/server/actors/webconsole/eval-with-debugger.js:243:24
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - exports.evalWithDebugger@resource://devtools/server/actors/webconsole/eval-with-debugger.js:167:14
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - evaluateJS@resource://devtools/server/actors/webconsole.js:1118:38
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - evaluateJSAsync/<@resource://devtools/server/actors/webconsole.js:1012:29
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - DevToolsUtils.executeSoon*exports.executeSoon@resource://devtools/shared/DevToolsUtils.js:47:21
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - evaluateJSAsync@resource://devtools/server/actors/webconsole.js:1009:19
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - handler@resource://devtools/shared/protocol/Actor.js:166:37
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - onPacket@resource://devtools/server/devtools-server-connection.js:379:58
[task 2020-09-01T12:03:33.993Z] 12:03:33     INFO - send/<@resource://devtools/shared/transport/local-transport.js:68:25
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - exports.makeInfallible/<@resource://devtools/shared/ThreadSafeDevToolsUtils.js:103:22
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - DevToolsUtils.executeSoon*exports.executeSoon@resource://devtools/shared/DevToolsUtils.js:47:21
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - send@resource://devtools/shared/transport/local-transport.js:56:21
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - send@resource://devtools/shared/protocol/Front.js:244:30
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - request@resource://devtools/shared/protocol/Front.js:262:10
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - generateRequestMethods/</frontProto[name]@resource://devtools/shared/protocol/Front/FrontClassWithSpec.js:46:19
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - evaluateJSAsync@resource://devtools/client/fronts/webconsole.js:104:42
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - evaluateJSAsync@resource://devtools/client/webconsole/commands.js:40:18
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - async*evaluateExpression/<@resource://devtools/client/webconsole/actions/input.js:105:8
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - async*thunkWithOptions/</<@resource://devtools/client/shared/redux/middleware/thunk-with-options.js:16:9
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - ignore/</<@resource://devtools/client/shared/redux/middleware/ignore.js:31:12
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - createPerformanceMarkerMiddleware/</</<@resource://devtools/client/shared/redux/middleware/performance-marker.js:50:28
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - dispatchEvaluateExpression@resource://devtools/client/webconsole/webconsole-wrapper.js:297:11
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - execute@chrome://mochitests/content/browser/devtools/client/webconsole/test/browser/head.js:286:25
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - @chrome://mochitests/content/browser/devtools/client/webconsole/test/browser/browser_console_chrome_context_message.js:26:10
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - Async*Tester_execTest/<@chrome://mochikit/content/browser-test.js:1069:34
[task 2020-09-01T12:03:33.994Z] 12:03:33     INFO - Tester_execTest@chrome://mochikit/content/browser-test.js:1109:11
[task 2020-09-01T12:03:33.995Z] 12:03:33     INFO - nextTest/<@chrome://mochikit/content/browser-test.js:932:14
[task 2020-09-01T12:03:33.995Z] 12:03:33     INFO - SimpleTest.waitForFocus/waitForFocusInner/focusedOrLoaded/<@chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:1037:23
[task 2020-09-01T12:03:33.995Z] 12:03:33     INFO - 
[task 2020-09-01T12:03:33.995Z] 12:03:33     INFO - Buffered messages finished
[task 2020-09-01T12:03:33.995Z] 12:03:33     INFO - TEST-UNEXPECTED-FAIL | devtools/client/webconsole/test/browser/browser_console_chrome_context_message.js | Uncaught exception - at chrome://mochitests/content/browser/devtools/client/shared/test/shared-head.js:655 - Error: Failed waitFor(): 
[task 2020-09-01T12:03:33.995Z] 12:03:33     INFO - Failed condition: () =>
[task 2020-09-01T12:03:33.995Z] 12:03:33     INFO -     expectedMessages.every(expectedMessage => findMessage(hud, expectedMessage))
[task 2020-09-01T12:03:33.995Z] 12:03:33     INFO - 
[task 2020-09-01T12:03:33.995Z] 12:03:33     INFO - Stack trace:
[task 2020-09-01T12:03:33.995Z] 12:03:33     INFO - waitFor@chrome://mochitests/content/browser/devtools/client/shared/test/shared-head.js:655:11
[task 2020-09-01T12:03:33.995Z] 12:03:33     INFO - async*@chrome://mochitests/content/browser/devtools/client/webconsole/test/browser/browser_console_chrome_context_message.js:28:9
[task 2020-09-01T12:03:33.996Z] 12:03:33     INFO - Async*Tester_execTest/<@chrome://mochikit/content/browser-test.js:1069:34
[task 2020-09-01T12:03:33.996Z] 12:03:33     INFO - Tester_execTest@chrome://mochikit/content/browser-test.js:1109:11
[task 2020-09-01T12:03:33.996Z] 12:03:33     INFO - nextTest/<@chrome://mochikit/content/browser-test.js:932:14
[task 2020-09-01T12:03:33.996Z] 12:03:33     INFO - SimpleTest.waitForFocus/waitForFocusInner/focusedOrLoaded/<@chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:1037:23
[task 2020-09-01T12:03:33.996Z] 12:03:33     INFO - Leaving test bound 
[task 2020-09-01T12:03:34.013Z] 12:03:34     INFO - Removing tab.
[task 2020-09-01T12:03:34.013Z] 12:03:34     INFO - Waiting for event: 'TabClose' on [object XULElement].
[task 2020-09-01T12:03:34.017Z] 12:03:34     INFO - Got event: 'TabClose' on [object XULElement].
[task 2020-09-01T12:03:34.023Z] 12:03:34     INFO - GECKO(1621) | console.warn: "IGNORED REDUX ACTION:" "AUTOCOMPLETE_CLEAR"
[task 2020-09-01T12:03:34.034Z] 12:03:34     INFO - Tab removed and finished closing
[task 2020-09-01T12:03:34.081Z] 12:03:34     INFO - TEST-PASS | devtools/client/webconsole/test/browser/browser_console_chrome_context_message.js | The main process DevToolsServer has no pending connection when the test ends - 
[task 2020-09-01T12:03:34.170Z] 12:03:34     INFO - GECKO(1621) | MEMORY STAT | vsize 7686MB | residentFast 416MB | heapAllocated 142MB
[task 2020-09-01T12:03:34.170Z] 12:03:34     INFO - TEST-OK | devtools/client/webconsole/test/browser/browser_console_chrome_context_message.js | took 17994ms
Flags: needinfo?(jryans)

Thanks for backing out, I'll work on resolving this assertion issue.

Flags: needinfo?(jryans)

I suspect it will be a slow process for me to move forward here... So far, I have tried naively removing the failing assertion (it should not matter on macOS, but may be more critical on other platforms, unsure so far). This avoids the assertion crashes, but there are still various DevTools, browser chrome, and WPT reftest failures.

I'll try looking through these in more detail, but if anyone has theories, please do let me know. 😄

It seems surprising to me that we're getting into the occluded state while run tests in automation (which I assume is what's happening).

We could consider disabling the pause code when running automated tests as a workaround, but it'd be good to understand why this happens.

See Also: → 1664472

Thanks, I'll try to investigate whether that is indeed what's happening. I haven't had much time for this bug recently, but I'd definitely like to see this get over the line and merged, so I'll attempt to make some time in the next week or so.

Ryan, thanks so much for your continued work on this! It will be so awesome once this lands!

Some investigation suggests that indeed, we do get into an occluded state while running tests. It seems to happen during browser startup and shutdown, and perhaps this races with state in the compositor, causing things to get a bit confused. The intermittent test failures here seem to be more likely to occur when the browser is restarted mid-test-run, as e.g. WPT reftests will do for each directory.

A fresh try run suggests that using only window minimize events to pause (rather than the noisier occlusions events that can happen in more situations) avoids the test failures while still solving the core problem here.

Assuming that continues looking good, I'll revise the patches here with that approach.

The new approach using minimized only is looking good:

As part of refreshing the patches, I have added a (default enabled) pref widget.pause-compositor-when-minimized so we can easily control this feature if needed. I noticed today is the soft freeze for 82, so I'll wait until the tree reopens for 83 before trying another landing.

While we wait for 83 and the end of the soft freeze, I have added one more patch to resolve an invalid assertion, so hopefully :mattwoodrow can review or redirect review on that while we wait. 😄

Attachment #9168045 - Attachment description: Bug 1580117 - Pause composition for occluded windows on macOS. r=mstange → Bug 1580117 - Pause composition for minimized windows on macOS. r=mstange

Since the compositor can be paused at any time (due to window state changes),
it's possible (and tests confirm it can happen) that RecvGetSnapshot is called
while the compositor is paused. This change removes the assertion and instead
skips the snapshot work in this case.

Depends on D85955

Pushed by jryans@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/722f3de39067
Pause composition for minimized windows on macOS. r=mstange
https://hg.mozilla.org/integration/autoland/rev/8e12f264d17f
Skip preloaded browsers in minimized windows. r=mattwoodrow,Gijs
https://hg.mozilla.org/integration/autoland/rev/1af7632512d6
Remove invalid assertion from RecvGetSnapshot. r=mattwoodrow
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 83 Branch

Thanks for all the hard work and persistence on this! The difference with the fix is huge.

With a test case on Nightly using wikipedia pages and about 20 minimized windows, the main process CPU utilization drops to around 1% and lower within a minute of launching. Compared to Release (which doesn't have the fix yet) where the main process utilization hovers about 80% until I un-minimize the windows.

(Nightly on this machine gets WebRender enabled but Release does not. With WebRender disabled on Nightly, the the results are the same for me.)

Glad to hear it's working well for you! It definitely makes a huge difference for me: much less wasted CPU time (compositor thread drops to near 0% assuming window content is idle) and thus a large power savings. Also, the CPU fan is quiet again, which is what originally triggered me to investigate in the first place. 😅

Thanks to everyone who helped along the way. 😄

Sounds like a huge difference reducing the CPU usage which would be a welcome change.

Is that something that makes sense to port back from a risk perspective?

Flags: needinfo?(jryans)

Is that something that makes sense to port back from a risk perspective?

The actual code added is quite small, and leverages an existing feature to pause the compositor that has been around for years, so the risk level seems low to me. There is also a pref added to control this specific change as well, so there's an easy way to control it if it is later found to cause issues.

Since I don't work on the Graphics team myself, I am not sure what usual risk tolerance is for these sort of things. Should I just request uplift and see what happens...? Maybe :jrmuizel (or someone else) can advise on the right process to follow.

Flags: needinfo?(jryans) → needinfo?(jmuizelaar)

With just a quick look this seems ok to uplift to me. Matt Woodrow and mstange can speak up if they disagree.

Flags: needinfo?(jmuizelaar)

Release Note Request (optional, but appreciated)
[Why is this notable]: Significant reduction in CPU usage / power consumption for users with minimized windows on macOS
[Affects Firefox for Android]: No
[Suggested wording]: For users on macOS with minimized windows, Firefox now uses much less power and you should see much longer battery life.
[Links (documentation, blog post, etc)]: None so far

relnote-firefox: --- → ?

Comment on attachment 9168045 [details]
Bug 1580117 - Pause composition for minimized windows on macOS. r=mstange

Beta/Release Uplift Approval Request

  • User impact if declined: If declined, users on macOS with minimized windows will continue to see high power usage / low battery life as they have for at least 1 year or longer. This uplift request is an attempt to deliver the fix a little faster.
  • Is this code covered by automated tests?: Yes, but there are some caveats: there are no automated tests for this specific code change, but if something erratic happened (the compositor got stuck in paused state), other existing tests would fail on this, so I think this counts as coverage.
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The actual code added is quite small, and leverages an existing feature to pause the compositor that has been around for years, so the risk level seems low to me. There is also a pref added to control this specific change as well, so there's an easy way to control it if it is later found to cause issues.
  • String changes made/needed: None
Attachment #9168045 - Flags: approval-mozilla-beta?
Attachment #9168046 - Flags: approval-mozilla-beta?
Attachment #9176326 - Flags: approval-mozilla-beta?

Comment on attachment 9168045 [details]
Bug 1580117 - Pause composition for minimized windows on macOS. r=mstange

given this is a longstanding issue I'd prefer to let it ride to 83

One question: how does this handle the pref getting flipped between pause and resume? (Is that even worth worrying about?)

Attachment #9168045 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Attachment #9168046 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Attachment #9176326 - Flags: approval-mozilla-beta? → approval-mozilla-beta-

(In reply to Julien Cristau [:jcristau] from comment #57)

One question: how does this handle the pref getting flipped between pause and resume? (Is that even worth worrying about?)

At the moment, switching the pref to false would mean any already minimized windows during that Firefox session get stuck paused and would not update if you un-minimize them. Restarting Firefox would fix the issue. It's easy to tweak this behaviour as needed e.g. to always resume regardless of the pref value, or perhaps only read the pref once at startup to avoid this kind of issue.

Flags: qe-verify+
QA Whiteboard: [qa-triaged]

Noticed an improvement from builds without the patch pre83.0.
On 83.0 -macOS 10.15 the initial load was lighter, and overall restore "strain" noticed on the Activity Monitor values.

Status: RESOLVED → VERIFIED
Flags: qe-verify+

Looks like this was added to the 83 release notes.

Regressions: 1682043
Performance Impact: --- → P2
Whiteboard: [qf:p2:resource]
See Also: → 1830753
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: