Closed Bug 1781524 Opened 2 years ago Closed 1 year ago

60Hz composites forever after running the browser_contentTheme_in_process_tab.js test on Mac

Categories

(Core :: Graphics, defect)

defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox111 --- ?

People

(Reporter: florian, Unassigned)

References

(Blocks 1 open bug)

Details

+++ This bug was initially created as a clone of Bug #1765387 +++

While working on bug 1742842, I had a try run that shows the browser/base/content/test/contentTheme/browser_contentTheme_in_process_tab.js test causing vsync to remain enabled at the end of the test on Mac (but not other platforms).

I can reproduce consistently locally, but only intermittently on try. Here is a profile of the test: https://share.firefox.dev/3zAJcb2

I tried to reduce the test case and noticed that the 60Hz composite is related to closing the browser window before a 100ms CSS transition on the background-color property is over. If I delay closing the window by 100ms, the composite stops.

The relevant transition is:
https://searchfox.org/mozilla-central/rev/23bf1890e07f780ba70e075bc8f46ffb85d1128c/browser/themes/shared/browser-shared.css#190

:root[sessionrestored] #nav-bar:-moz-lwtheme {
  transition: var(--ext-theme-background-transition);
}

with --ext-theme-background-transition: background-color 0.1s cubic-bezier(.17,.67,.83,.67); (set here)

When looking at the profile (both the on the screenshot track, and the activity on the Renderer thread), it seems the animation keeps being played even after the browser window has been closed (after the 'unload' DOM event). After the animation is over, we keep compositing forever, but stop rendering: https://share.firefox.dev/3cCbNDz

Here is what the reduced test looks like:

add_task(async function test_in_process_tab() {
  let win = await BrowserTestUtils.openNewBrowserWindow();
  await BrowserTestUtils.withNewTab(
    {
      gBrowser: win.gBrowser,
      url: "about:about", // needs to be a page that loads in the parent
    },
    async browser => {
      win.document.getElementById("nav-bar")
	.setAttribute("style",
		      "background-color: red;
                       transition: background-color 0.1s cubic-bezier(.17,.67,.83,.67);");
    }
  );

  // Waiting 100ms here before closing the window makes the test disable vsync.
  //await new Promise(r => setTimeout(r, 100));
  
  ok(true, "done");
  await BrowserTestUtils.closeWindow(win);
});

I don't understand why we need the tab to contain a document loaded in the parent process for this bug to reproduce, but we do.

Hiro, any idea?

Bug 1765387 might have the same root cause.

Flags: needinfo?(hikezoe.birchill)

Animations running on the compositor thread should be destroyed by triggering AddCompositorAnimationsIdForDiscard in the dtor of WebRenderAnimationData which is owned by display item. So, I guess when the browser window gets closed we leak display items?

Note that in the case of popup (bug 1742797), we keep display items so that we needed an implicit discarding the animations there.

Flags: needinfo?(hikezoe.birchill)

(In reply to Hiroyuki Ikezoe (:hiro) from comment #1)

Animations running on the compositor thread should be destroyed by triggering AddCompositorAnimationsIdForDiscard in the dtor of WebRenderAnimationData which is owned by display item. So, I guess when the browser window gets closed we leak display items?

I wouldn't be surprised if we leaked stuff (not sure what / how much exactly). What makes me think that is that we never add the CompositorScreenshotWindowDestroyed marker, and the profiler front-end keeps displaying the screenshot of the already closed window.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #1)

Animations running on the compositor thread should be destroyed by triggering AddCompositorAnimationsIdForDiscard in the dtor of WebRenderAnimationData which is owned by display item. So, I guess when the browser window gets closed we leak display items?

Or maybe, when we close the browser window, we destroy everything for the browser so that there's no chance that the animation id is notified to the compositor, that's a plausible scenario.

Severity: -- → S3
See Also: → 1782134

I can no longer reproduce locally, but try still gave me a profile of the failure: https://share.firefox.dev/3PXO5Ap, and bug 1782825 was filed on this intermittent.

In the profile we keep compositing forever on a window that is already closed. https://share.firefox.dev/3zu3T7b shows the "unload" DOMEvent when the window #31 is closed, that we keep sampling animations on the compositor for 386ms after that, and that we keep compositing in that same window forever.

When running the test locally with --verify, I couldn't reproduce the failure, but I noticed the screenshot tracks continue significantly longer than the lifetime of the related DOM windows, and 8 of them end at once when there's a GCMajor: https://share.firefox.dev/3JsGac5
That makes me think something might be holding an unwanted reference to the window object, and preventing things from being properly cleaned-up when the window is closed.

Markus, any idea about why the profile from try isn't symbolicated? It's an artifact off of mozilla-central, so I thought it would have symbols from the m-c build. Ideas about the bug itself welcome too of course :-).

Blocks: 1782825
Flags: needinfo?(mstange.moz)

(In reply to Florian Quèze [:florian] from comment #4)

Markus, any idea about why the profile from try isn't symbolicated? It's an artifact off of mozilla-central,

It looks like it's actually off of autoland - the build ID link in the profile info panel leads me to this commit. And autoland builds don't seem to upload their symbols.

Flags: needinfo?(mstange.moz)

(In reply to Markus Stange [:mstange] from comment #5)

(In reply to Florian Quèze [:florian] from comment #4)

Markus, any idea about why the profile from try isn't symbolicated? It's an artifact off of mozilla-central,

It looks like it's actually off of autoland - the build ID link in the profile info panel leads me to this commit. And autoland builds don't seem to upload their symbols.

The build log says:

[task 2022-08-04T11:47:07.732Z] 11:47:07     INFO -  Searching Taskcluster index with namespace: gecko.v2.mozilla-central.shippable.revision.8f150040b037fc9f7b6039b84a1b7806e8f6a77d.firefox.macosx64-opt
[task 2022-08-04T11:47:07.732Z] 11:47:07     INFO -  Trying to find artifacts for hg revision c8de8244d50fcfeda2f9d6d85efaa3e0dde717ab on tree mozilla-central.
[task 2022-08-04T11:47:07.733Z] 11:47:07     INFO -  Searching Taskcluster index with namespace: gecko.v2.mozilla-central.shippable.revision.c8de8244d50fcfeda2f9d6d85efaa3e0dde717ab.firefox.macosx64-opt
[task 2022-08-04T11:47:07.733Z] 11:47:07     INFO -  Trying to find artifacts for hg revision 19cb3c811aaecde7e5bcb1826d5e7882f4c9f754 on tree integration/autoland.
[task 2022-08-04T11:47:07.733Z] 11:47:07     INFO -  Searching Taskcluster index with namespace: gecko.v2.autoland.shippable.revision.19cb3c811aaecde7e5bcb1826d5e7882f4c9f754.firefox.macosx64-opt

So it seems after trying 2 m-c commits, it tried one from autoland and used it.

(In reply to Florian Quèze [:florian] from comment #4)

When running the test locally with --verify, I couldn't reproduce the failure, but I noticed the screenshot tracks continue significantly longer than the lifetime of the related DOM windows, and 8 of them end at once when there's a GCMajor: https://share.firefox.dev/3JsGac5
That makes me think something might be holding an unwanted reference to the window object, and preventing things from being properly cleaned-up when the window is closed.

Looking more at this profile, we have the stacks for what's destroying the pres context and closing the compositor bridge, it is cycle collection: https://share.firefox.dev/3zRc2mv

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → DUPLICATE

this was an error to mark as a duplicate.

Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---

I think this is fixed by bug 1803387.

Depends on: 1803387
Status: REOPENED → RESOLVED
Closed: 2 years ago1 year ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.