composites happen at 60Hz until the browser window is closed after showing an arrowpanel
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox104 | --- | fixed |
People
(Reporter: florian, Assigned: hiro)
References
(Depends on 1 open bug, Blocks 2 open bugs)
Details
(Keywords: perf:resource-use, power)
Attachments
(4 files, 2 obsolete files)
I found steps to reproduce for a case that causes composites to happen at 60Hz until the browser window is closed.
- Preparation: in a first browser window, start a download (eg. http://test-debit.free.fr/10485760.rnd), let it progress for a few seconds, then cancel it. This makes the downloads toolbar icon appear on all browser windows.
- Open a new browser window, and close the first browser window.
- Start the profiler.
- Click the downloads toolbar icon, this makes the downloads panel appear. Wait for a few seconds.
- Dismiss the downloads panel. Wait for a few seconds.
- (Optional) Open a new browser window, and close the browser window that was opened at step 2.
- Capture the profile.
I have been able to reproduce consistently on the current Nightly on 2 Windows machines. On my fast-ish development laptop (https://share.firefox.dev/3nPqRkF) and on the Quantum reference laptop (https://share.firefox.dev/30Vk6VH). I have not been able to reproduce on Mac (and I have not tried on Linux yet).
In the profiles, when we composite at 60Hz for no obvious reason, the Screenshots track of the downloads panel is black. Reopening the downloads panel makes the composition rate go down to normal.
After step 2, composition happens normally. After step 4 it still happens normally. After step 5 (hiding the downloads panel), it happens at 60Hz. After step 6 it goes back to normal.
Emilio, Markus, could you please forward this needinfo to someone who can figure this out? I'm not sure if this is in cross platform code or if it's Windows specific.
Reporter | ||
Comment 1•3 years ago
|
||
(In reply to Florian Quèze [:florian] from comment #0)
I have not been able to reproduce on Mac (and I have not tried on Linux yet).
On Linux I can't reproduce the 60Hz composite. The one part of the behavior that exists on Linux too is that the panel has its own Screenshots track that ends only when closing the browser window it was attached to. But there's no black screenshots on that track. https://share.firefox.dev/3cMamj0
Reporter | ||
Updated•3 years ago
|
Comment 2•3 years ago
|
||
I couldn't repro either on Linux... Maybe Sotaro knows what might cause this on Windows? Maybe windows keeps sending us paint events? Maybe his work on occlusion helps popups as well, unclear.
Comment 3•3 years ago
|
||
I currently don't have a Windows machine to reproduce.
Comment 5•3 years ago
•
|
||
:florian, can you check FPS with debug overlay? It could be checked with the following prefs
- gfx.webrender.debug.profiler-ui:FPS
- gfx.webrender.debug.profiler:true
I saw the a lot of "Paint" markers in profiler. But FPS of debug overlay was very low.
It seems like handling of profiler marker is not good. "Paint" marker id set even when rendering does not happen. Rendering happens only when aRender==true.
https://searchfox.org/mozilla-central/rev/7fe9421af35256a95acc4620e9e0b76df7867287/gfx/webrender_bindings/RenderThread.cpp#489
Comment 6•3 years ago
|
||
:mstange, can you comment to "Paint" marker of comment 5?
Reporter | ||
Comment 7•3 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #5)
:florian, can you check FPS with debug overlay? It could be checked with the following prefs
- gfx.webrender.debug.profiler-ui:FPS
- gfx.webrender.debug.profiler:true
I saw the a lot of "Paint" markers in profiler. But FPS of debug overlay was very low.
The debug overlay disappears at the same time as the panel when I close the panel, so I can't see what it would show when the bug occurs.
It seems like handling of profiler marker is not good. "Paint" marker id set even when rendering does not happen. Rendering happens only when aRender==true.
https://searchfox.org/mozilla-central/rev/7fe9421af35256a95acc4620e9e0b76df7867287/gfx/webrender_bindings/RenderThread.cpp#489
I don't know if that marker is accurate or misleading. The main problem in this bug is that activity is happening at 60Hz (eg. the PVsyncBridge::Msg_NotifyVsync IPCs are also a problem).
Comment 8•3 years ago
|
||
FYI the criticality of this issue will increase as we ship bug 1733587 since the panel will now open automatically for new downloads.
Comment 9•3 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #6)
:mstange, can you comment to "Paint" marker of comment 5?
I agree it could be better - we should emit a "Renderer Update #WIN" marker instead of a "Composite" marker if aRender
is false
. However, as Florian said, the real issue still stands: We are waking up at 60Hz. We should stop listening for vsync and let the CPU sleep.
Comment 10•3 years ago
|
||
The Downloads panel has fade-out effect when it closes: https://searchfox.org/mozilla-central/rev/65d4d3399afa79c8de5a0cc11752d2ba7c31edc1/toolkit/content/xul.css#319-326
This opacity animation runs on the compositor. It is possible that we never stop the compositor-side animation if the panel is closed, maybe because we never get to an aRender == true
composite when the animation is stopped.
Hiro, could you take a look?
Comment 11•3 years ago
•
|
||
Looks like I filed this four years ago, in bug 1413763, and then lost track of it. To answer Hiro's question from back then:
(In reply to Hiroyuki Ikezoe (:hiro) from comment #2)
How do we hide the panel?
If we do hide by setting visiblity:hidden, the transform animation keep
running.
The panel is hidden by making its nsView hidden and by closing the view's nsIWidget. The panel widget has its own compositor.
Comment 13•3 years ago
|
||
It is possible that making the hidden panel display:none would stop the compositor animation. This is currently being investigated in bug 1714846.
Reporter | ||
Comment 14•3 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #4)
Florian, can you check if this is a regression?
I can reproduce on the 2020-05-22 build, which is the oldest build where the profiler works.
Assignee | ||
Comment 15•3 years ago
|
||
It looks like the animation (in fact it's two transtions) in question has a finite duration (0.18s), so it sounds odd that the animation keeps running on the compositor.
Assignee | ||
Comment 16•3 years ago
|
||
(In reply to Hiroyuki Ikezoe (:hiro) from comment #15)
It looks like the animation (in fact it's two transtions) in question has a finite duration (0.18s), so it sounds odd that the animation keeps running on the compositor.
Never mind about this comment. I did forget that we use fill-mode: both for transitions to avoid glitches on the compositor, thus it keeps running until an explicit transaction which makes the transitions stop happens.
Comment 17•3 years ago
|
||
The severity field is not set for this bug.
:jimm, could you have a look please?
For more information, please visit auto_nag documentation.
Updated•3 years ago
|
Updated•3 years ago
|
Comment 18•3 years ago
|
||
I think this is fixed by bug 1714846 - Florian, can you confirm? I'm not particularly convinced because I don't see any composite markers at all, so I assume the steps in comment 0 miss, say, which profiler preset you're meant to use or where to look for the markers? https://share.firefox.dev/3EVl5nu is my profile. I did already figure out I had to turn off screenshots because otherwise those trigger markers for the screenshots that seemed like they'd interfere with actually determining what was going on here... but I've no idea if there's anything else I'm missing.
Reporter | ||
Comment 19•3 years ago
|
||
(In reply to :Gijs (he/him) from comment #18)
I think this is fixed by bug 1714846 - Florian, can you confirm? I'm not particularly convinced because I don't see any composite markers at all, so I assume the steps in comment 0 miss, say, which profiler preset you're meant to use or where to look for the markers? https://share.firefox.dev/3EVl5nu is my profile.
I guess I forgot to say "look in the Renderer or compositor thread". Your profile has composites until you close the browser window (the parent process has a DOMWindowClose event when the composite markers stop) https://share.firefox.dev/3s2NcO4
I did already figure out I had to turn off screenshots because otherwise those trigger markers for the screenshots that seemed like they'd interfere with actually determining what was going on here...
I'm curious what made you think this. We create a screenshot marker for every composite, and won't trigger any if there's no composite happening.
Comment 20•3 years ago
•
|
||
(In reply to Florian Quèze [:florian] from comment #19)
(In reply to :Gijs (he/him) from comment #18)
I think this is fixed by bug 1714846 - Florian, can you confirm? I'm not particularly convinced because I don't see any composite markers at all, so I assume the steps in comment 0 miss, say, which profiler preset you're meant to use or where to look for the markers? https://share.firefox.dev/3EVl5nu is my profile.
I guess I forgot to say "look in the Renderer or compositor thread". Your profile has composites until you close the browser window (the parent process has a DOMWindowClose event when the composite markers stop) https://share.firefox.dev/3s2NcO4
So does that mean this is fixed or did we expect the patch to mean that the composites stop when the panel is closed even while that window remains open? If the latter, who is in a good position to investigate why that doesn't happen? If the former, should we close this bug?
I did already figure out I had to turn off screenshots because otherwise those trigger markers for the screenshots that seemed like they'd interfere with actually determining what was going on here...
I'm curious what made you think this. We create a screenshot marker for every composite, and won't trigger any if there's no composite happening.
The composite markers for screenshots seem to appear even when there is no composite marker on the compositor thread - compare https://share.firefox.dev/3IOceqd and https://share.firefox.dev/3pYsTid (different views on the same profile (with screenshots this time) . Perhaps that's a separate bug?
Comment 21•3 years ago
•
|
||
(In reply to :Gijs (he/him) from comment #20)
The composite markers for screenshots seem to appear even when there is no composite marker on the compositor thread - compare https://share.firefox.dev/3IOceqd and https://share.firefox.dev/3pYsTid (different views on the same profile (with screenshots this time) . Perhaps that's a separate bug?
Note that you can Cmd-click threads and get an integrated marker chart of multiple threads: https://share.firefox.dev/3p0PlYM (I also applied a search filter for "composit")
The screenshot markers span the distance between two consecutive composites of the same window.
Comment 22•3 years ago
|
||
Ah, and in this particular case, it looks like we're getting "Composite" markers for the hidden panel, but no updated "CompositorScreenshot" markers - instead, the panel's synthesized "CompositorScreenshot" marker spans the full duration of 4.6 seconds until the parent window is closed. So it seems like we're triggering a composite but then canceling compositing somewhere in the pipeline before we capture a new screenshot. Likely because something notices that the window is closed. But the frequent "Composite" and "CompositeToTarget" markers indicate that we still wake up the CPU. And that's the part that this bug is about.
Reporter | ||
Comment 23•3 years ago
|
||
(In reply to :Gijs (he/him) from comment #20)
So does that mean this is fixed or did we expect the patch to mean that the composites stop when the panel is closed even while that window remains open? If the latter, who is in a good position to investigate why that doesn't happen?
We expect composites for the panel to stop when the panel is closed. Given the previous comments (especially comment 16), I would say Hiro would be in a good position to keep investigating.
Comment 24•3 years ago
|
||
Hiro, based on the last few comments, can you take a look at what's happening in current nightly that keeps us compositing when the downloads panel has been closed, and its content
slot has been marked display: none
? (I just checked and that change does now take effect - though the downloads panel itself seems to still be display: -moz-popup
, it's unclear to me if that's expected - Emilio, did you see that when working on bug 1714846?)
Comment 26•3 years ago
|
||
Yeah, that's kind of expected. In the past when mentioning this to Neil he mentioned that the current design of popups not being display: none
when closed was to avoid recreating native widgets over and over for e.g., the context menu. Maybe the trade-offs are different now?
Updated•3 years ago
|
Assignee | ||
Comment 27•3 years ago
|
||
The animation in question here is on this downloadsPanel, right? As far as I can tell, it's the animation keeps running on the compositor and I can't see display:none style on the element after the popup was closed. Oh now I see what you meant, yeah the display style is: -moz-popup...
Assignee | ||
Comment 28•3 years ago
|
||
I am assuming what Gijs wanted to know is why the animation keeps running after closing the popup. The answer is;
- The animation is a CSS transition running on the compositor, we let CSS transitions keep running on the compositor until an explicit notification comes from the main-thread
- Though this is a guess, in the popup case the notification is tried to be sent after popup closed thus any display list building stuff won't happen thus the notification will never be arrived to the compositor
So, I guess setting transition:none just before closing the popup would solve this issue? (IIRC there's a such popup blah event)
Comment 29•3 years ago
|
||
(In reply to Hiroyuki Ikezoe (:hiro) from comment #27)
The animation in question here is on this downloadsPanel, right? As far as I can tell, it's the animation keeps running on the compositor and I can't see display:none style on the element after the popup was closed. Oh now I see what you meant, yeah the display style is: -moz-popup...
FWIW, this sounds very similar to bug 1739161, where the element is made zero size. In that testcase, the animation ends properly only when we take EndEmptyTransaction
painting path, which is only used with retained display lists.
Retained display lists are not used for popups, so this might be related.
Assignee | ||
Comment 30•3 years ago
|
||
(In reply to Miko Mynttinen [:miko] from comment #29)
(In reply to Hiroyuki Ikezoe (:hiro) from comment #27)
The animation in question here is on this downloadsPanel, right? As far as I can tell, it's the animation keeps running on the compositor and I can't see display:none style on the element after the popup was closed. Oh now I see what you meant, yeah the display style is: -moz-popup...
FWIW, this sounds very similar to bug 1739161, where the element is made zero size. In that testcase, the animation ends properly only when we take
EndEmptyTransaction
painting path, which is only used with retained display lists.
Retained display lists are not used for popups, so this might be related.
That's a different, a problem about bug 1739161 is happening on the main-thread, it keeps restyling on the main-thread, on the other hand in this bug the animation isn't restyled on the main-thread.
Assignee | ||
Comment 31•3 years ago
|
||
I figured out a way to solve this. The way is to call ClearCachedWebrenderResources when the popup gets hidden. I was initially thinking that it's not a good way because of resource recreation cost what Emilio commented in comment 26, but we've already clear the resources when recreating the popup. So I am mostly 100% sure it's okay.
Updated•3 years ago
|
Comment 32•3 years ago
|
||
I'm a bit confused. I think this happens for every panel? Both based on the conversation here (that seems to suggest that this happens due to the transition
that is on the root of the panel/popup node, which isn't being display: none
'd), and because that's what this profile seems to show, AFAICT: https://share.firefox.dev/3sl7By8 , and because that's what the dupe (bug 1413763) says.
But the steps here make me think this is somehow DL-panel specific? Was that intentional when filing it?
Given this isn't DL-panel specific, I'm going to unblock bug 1733587 as it's unrelated, really? People in general are much more likely to open "any arrow panel" either before or after the changes on nightly right now.
AIUI, the outgoing opacity transition is intentional, so I don't think frontend and/or the toolkit css and custom element code can fix this, despite the attempt in bug 1714846, and we might need a fix like the one in attachment 9255857 [details] [diff] [review].
Reporter | ||
Comment 33•3 years ago
|
||
(In reply to :Gijs (he/him) from comment #32)
I'm a bit confused. I think this happens for every panel? Both based on the conversation here (that seems to suggest that this happens due to the
transition
that is on the root of the panel/popup node, which isn't beingdisplay: none
'd), and because that's what this profile seems to show, AFAICT: https://share.firefox.dev/3sl7By8 , and because that's what the dupe (bug 1413763) says.But the steps here make me think this is somehow DL-panel specific? Was that intentional when filing it?
It was intentional, but it was me being confused. I tried a bunch of other panels at the time and they didn't reproduce. Testing a bit more today, I can reproduce with the hamburger panel and the control center / identity panels. I can't reproduce with the profiler or pocket panels.
AIUI, the outgoing opacity transition is intentional, so I don't think frontend and/or the toolkit css and custom element code can fix this
I don't think front-end code can avoid the bug without removing the opacity & transform transitions entirely.
we might need a fix like the one in attachment 9255857 [details] [diff] [review].
Here is a profile with this attachment applied: https://share.firefox.dev/3pd4033 It confirms that this change does fix the bug.
We might be able to write a simple test for this using ChromeUtils.vsyncEnabled()
before/after the hamburger panel has been shown and hidden.
Comment 34•3 years ago
|
||
(In reply to Florian Quèze [:florian] from comment #33)
We might be able to write a simple test for this using
ChromeUtils.vsyncEnabled()
before/after the hamburger panel has been shown and hidden.
This seems to just always return true
for me on Nightly, on Windows, when run from the browser console, both when the panel is open and when it's not. How is it supposed to help here? And, is there an alternative (perhaps using profiler markers) ?
Reporter | ||
Comment 35•3 years ago
|
||
(In reply to :Gijs (he/him) from comment #34)
(In reply to Florian Quèze [:florian] from comment #33)
We might be able to write a simple test for this using
ChromeUtils.vsyncEnabled()
before/after the hamburger panel has been shown and hidden.This seems to just always return
true
for me on Nightly, on Windows, when run from the browser console, both when the panel is open and when it's not.
This always returns true because the browser console uses a CSS animation to make the cursor blink. When testing locally I used a setTimeout to workaround it. Eg. run setTimeout(() => console.log(ChromeUtils.vsyncEnabled()), 1000);
in the console and quickly move the focus away from the console to make the cursor disappear.
Comment 36•3 years ago
|
||
Comment 37•3 years ago
|
||
Setting a needinfo for Florian to come back to this when he's back from PTO, as I didn't really get very far. The patch I attached works, as far as I can tell, and the test passes but (a) it's kinda ugly and (b) the test also passes before the patch, so that's not much use. Despite Florian's extensive help, I didn't get very far trying to figure out why - I tried to use the profiler to find out what was happening in the automated test, but it doesn't seem to include profiles of the gpu process, even though when I check if the GPU process is running by using requestProcInfo
it is indeed running. If I try to collect shutdown profiles by failing the test, the browser crashes, without useful stacks. So I'm a bit lost, and I'm hoping Florian is in a better position to pick this up (but if anyone else has ideas, I'm sure they'd be appreciated too - we should really try to get this fixed sooner rather than later, considering the potential perf and power usage impacts).
Updated•3 years ago
|
Assignee | ||
Comment 38•3 years ago
|
||
I managed to make the test work (on Linux with xul.panel-animations.enabled=true). So what I've noticed is it seems the issue only happens when the popup is closed by a native mouse click event. And I am now unsure though, I've changed to use the download panel instead of the browser menu.
(Though I didn't want to assign this bug to me, but moz-phab forcibly did it)
Updated•3 years ago
|
Reporter | ||
Comment 39•3 years ago
|
||
(In reply to :Gijs (back Jan 4, 2022; he/him) from comment #37)
Setting a needinfo for Florian to come back to this when he's back from PTO, as I didn't really get very far. The patch I attached works, as far as I can tell, and the test passes but (a) it's kinda ugly and (b) the test also passes before the patch, so that's not much use.
Given that Hiro managed to make the test work, do you still need my help?
I tried yesterday Hiro's version of the test on my local Windows build. The test passes with the layout/xul/nsMenuPopupFrame.cpp
fix applied, and fails without. So it looks like it's testing what we expect.
I forced the test to fail to have a full profile of it including the GPU process: https://share.firefox.dev/3HvPUQA
One thing I noticed in this profile is that the initial waitForCondition lasted 3.6s, waiting for vsync to be disabled at the beginning of the test. This is due to vsync notifications being observed by a content process for an about:blank document (https://share.firefox.dev/3zl1Th3), I don't have a good understanding of what's causing that yet, but it's something to figure out in a separate bug.
Reporter | ||
Updated•3 years ago
|
Reporter | ||
Comment 40•3 years ago
|
||
(In reply to Florian Quèze [:florian] from comment #39)
One thing I noticed in this profile is that the initial waitForCondition lasted 3.6s, waiting for vsync to be disabled at the beginning of the test. This is due to vsync notifications being observed by a content process for an about:blank document (https://share.firefox.dev/3zl1Th3), I don't have a good understanding of what's causing that yet, but it's something to figure out in a separate bug.
I filed bug 1748466 for this.
Updated•3 years ago
|
Reporter | ||
Updated•3 years ago
|
Comment 42•3 years ago
|
||
Comment 43•3 years ago
|
||
Backed out changeset 14f484747a66 (Bug 1742797) for causing failures in nsMenuPopupFrame.cpp CLOSED TREE
Log: https://treeherder.mozilla.org/logviewer?job_id=363168402&repo=autoland&lineNumber=6130
https://treeherder.mozilla.org/logviewer?job_id=363169212&repo=autoland&lineNumber=2846
Backout: https://hg.mozilla.org/integration/autoland/rev/6cb156d6c627e91795567571e689d92e65b28e92
Assignee | ||
Comment 44•3 years ago
|
||
Do'h. mView
in nsMenuPopupFrame is a dangling pointer... I will figure out a way to avoid the situation.
Assignee | ||
Comment 45•3 years ago
|
||
Move if (auto* widget = GetWidget()) {}
block before mView
gets destroyed by nsIFrame::DestroyFrom.
Comment 46•3 years ago
|
||
Comment 47•3 years ago
|
||
bugherder |
Comment 48•3 years ago
|
||
Backed out for causing Bug 1748808
Comment 49•3 years ago
|
||
Backout merged to central: https://hg.mozilla.org/mozilla-central/rev/09808a64bedd
Comment 50•3 years ago
|
||
This is really bad for a range of reason. Bumping the priority. Graphics team might have some thoughts here.
Assignee | ||
Updated•3 years ago
|
Assignee | ||
Comment 51•3 years ago
•
|
||
I did investigate two test failures, bug 1748788 and bug 1748808. Now I am pretty sure bug 1748808 is unrelated to this bug. Bug 1748788 is the only one remaining issue to land this bug.
To performance team, I will not spend time for bug 1748788.
Assignee | ||
Comment 52•3 years ago
|
||
A way to avoid bug 1748788 I can think of is to add a new route to invoke ClearAnimationResources and use it in HidePopup, it should just work.
Comment 53•3 years ago
|
||
Hey Bas, do you know somebody who might be able to help with bug 1748788? There's a nice power / CPU usage win patch here that's being blocked by it.
Comment 54•3 years ago
|
||
Jeff would be the best person to figure out who might be able to look at that I think.
Comment 55•3 years ago
|
||
Hiro, is this ready to reland now that bug 1748788 is fixed?
Assignee | ||
Comment 56•3 years ago
|
||
Pushed a new try run to see if there's any suspicious failure or not. https://treeherder.mozilla.org/jobs?repo=try&revision=1a10622da604d539f42a7ae867d03ff8fb0e96de
That one is a try auto run, so I might have to trigger additional tasks there.
Assignee | ||
Comment 57•3 years ago
|
||
The test for this bug, browser_panel_vsync.js failed. Something has been changed since the last land?
Assignee | ||
Comment 58•3 years ago
|
||
It looks like the vsync tick on the compositor thread has been less often than before. Thus even if ChromeUtils.vsyncEnabled() returns false, DownloadsPanel.panel.state
is not yet closed
. With an additional awaiting for popuphidden
the test passes properly. I also confirmed with the additional await the test still properly fails without the proper fix in nsMenuPopupFrame.cpp.
Pushed a new try run; https://treeherder.mozilla.org/jobs?repo=try&revision=cc9d09aa4020c0206b98b7f1b80989e17d15ae6c (but it looks like it hasn't yet been queued)
Assignee | ||
Comment 59•3 years ago
•
|
||
Though bug 1748788 was backed out now, the test browser_panel_vsync.js is quite flaky, I did confirm that resources for compositor animations has been discarded but VsyncSource gets enabled after the discard.
Florian, are you aware of the case where VsyncSource gets re-enabled for some reason?
Updated•3 years ago
|
Reporter | ||
Comment 60•3 years ago
|
||
(In reply to Hiroyuki Ikezoe (:hiro) from comment #59)
Sorry for the late reply, I was on PTO most of last week.
Though bug 1748788 was backed out now, the test browser_panel_vsync.js is quite flaky, I did confirm that resources for compositor animations has been discarded but VsyncSource gets enabled after the discard.
Florian, are you aware of the case where VsyncSource gets re-enabled for some reason?
Flaky in which way? When running locally on my Windows machine, I saw cases where the test timeouts, with the downloads panel still visible and a leftover popuphidden event listener. I haven't been able to reproduce with the profiler running, but my guess is we need to wait for something before sending the click that will trigger the popup closing.
If you are talking about the Linux failures ("TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_panel_vsync.js | vsync should still be off") on the try run from comment 58, I would say the easiest way forward is to get the try server to upload profiles of the failures.
I did that using ./mach try fuzzy browser/base/content/test/performance/browser_panel_vsync.js --env MOZ_PROFILER_STARTUP=1
and here are the results: https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=b821db08fc7f9d9ef7acd4e8aa3191122731d685
Here's one of the profiles from that try run: https://share.firefox.dev/3t2wsGZ
It shows that TestUtils.waitForCondition(() => !ChromeUtils.vsyncEnabled())
("TestUtils - waitForCondition succeeded after 0 retries - undefined") finished before hiddenPromise
("BrowserTestUtils - waitForEvent: popuphidden"). So at the time we do the final ok(!ChromeUtils.vsyncEnabled(), "vsync should still be off");
we don't have a TestUtils.waitForCondition
, and it's normal for vsync to take about 200ms before being disabled after something has visibly changed.
This change https://hg.mozilla.org/try/rev/4899258047a1156aecc42278fd1ba65e190ceb9a seems to be enough to make the test green on try (https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=cccfb7dd33b48a1419826a889a18eff916906fec) minus a few leaks on debug builds that are likely due to running with the profiler.
By the way, it would be nice to provide a message for the second TestUtils.waitForCondition
call at https://hg.mozilla.org/try/rev/a3c46d3d95e0f389944c128485baed798b864bf7#l2.53
Reporter | ||
Comment 61•3 years ago
|
||
(In reply to Florian Quèze [:florian] from comment #60)
When running locally on my Windows machine, I saw cases where the test timeouts, with the downloads panel still visible and a leftover popuphidden event listener. I haven't been able to reproduce with the profiler running,
I haven't been able to get a profile of the failure locally, but TV jobs on try could! https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=cccfb7dd33b48a1419826a889a18eff916906fec&searchStr=win&selectedTaskRun=HOzdo7EMRQWJ--QAjzOgXw.0
This profile contains both a passing run of the test and a failing run that timeouts: https://share.firefox.dev/3MsmFRY
The sequences of DOM events are different in these 2 cases. The passing case has popuphiding/popuphidden events https://share.firefox.dev/3CygCqE, but no mousedown or click events. The failing case has mousedown, click (and event dblclick!) events: https://share.firefox.dev/3HNLWCN but no popuphiding/popuphidden events.
Comment 62•3 years ago
|
||
The CPU consumption issue seems to have gotten worse in 100.0b. Bug 1714846 did make 97+ seem more responsive when a large DL was in progress but it seems to have regressed or something.
Comment 63•2 years ago
|
||
This high CPU util is still present in 102 and 103b.
Reporter | ||
Comment 64•2 years ago
|
||
(In reply to Hiroyuki Ikezoe (:hiro) from comment #52)
A way to avoid bug 1748788 I can think of is to add a new route to invoke ClearAnimationResources and use it in HidePopup, it should just work.
It looks like nobody is currently working on fixing bug 1748788 / bug 1707052, so maybe this suggestion would be the way forward. Can you describe a bit more what "add a new route to invoke ClearAnimationResources" means? Is this something I might be able to do myself? Could this also open a way towards fixing bug 1768495?
Assignee | ||
Comment 65•2 years ago
|
||
This is what I had in my mind. This is just doing a part of what WebRenderLayerManager::ClearCachedResources does.
Though WebRenderLayerManager::ClearCachedResources has a transaction pattern of begin ~ end, honestly I am not sure whether the pattern also is necessary for animation resources or not. I believe Sotaro knows the answer.
Reporter | ||
Comment 66•2 years ago
|
||
(In reply to Florian Quèze [:florian] from comment #61)
(In reply to Florian Quèze [:florian] from comment #60)
When running locally on my Windows machine, I saw cases where the test timeouts, with the downloads panel still visible and a leftover popuphidden event listener. I haven't been able to reproduce with the profiler running,
I haven't been able to get a profile of the failure locally, but TV jobs on try could! https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=cccfb7dd33b48a1419826a889a18eff916906fec&searchStr=win&selectedTaskRun=HOzdo7EMRQWJ--QAjzOgXw.0
This profile contains both a passing run of the test and a failing run that timeouts: https://share.firefox.dev/3MsmFRY
The sequences of DOM events are different in these 2 cases. The passing case has popuphiding/popuphidden events https://share.firefox.dev/3CygCqE, but no mousedown or click events. The failing case has mousedown, click (and event dblclick!) events: https://share.firefox.dev/3HNLWCN but no popuphiding/popuphidden events.
I figured this out. The problem was that we use EventUtils.synthesizeNativeMouseEvent
twice to generate clicks, but that function actually generates a mouse down and mouse up. If we run that twice quickly enough, instead of synthesizing 2 separate clicks, we synthesize a double click, which triggers different behaviors. If I go to the Windows mouse settings and make the double clicks slower, then I can reproduce all the time locally. This also explains why the failures happened more often locally (faster developer machine) and stopped happening when profiling (the profiler slowing things down a little bit).
Using EventUtils.synthesizeKey("VK_ESCAPE"
instead of a second click to close the panel works.
Reporter | ||
Comment 67•2 years ago
|
||
I pushed to try the patch attached in comment 65 with the test fixes I mentioned in comment 60 and 66: https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=384b458b9b6a2b21871e30a361340cfa4f315fff&selectedTaskRun=cFI-Y9slSTKPX-aYdFBXuA.0
It is green, except on the linux swr-wayland jobs where it fails on an assertion: Assertion failure: mGdkWindow (Getting configure for invisible window?), added in bug 1738729.
To see if the assertion was caused by the fix, I pushed again the same test but with the fix commented out. It failed again with the same assertion on the linux swr-wayland jobs: https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=e8c7ac3ac87850073f5e61fd245415eeb1ab154a
It also failed on the Window 7 jobs (that's expected with the fix commented out!), but didn't fail on the Windows 10 jobs, which is puzzling as on my local Windows 10 and 11 machines, the test fails without the fix and passes with the fix.
Martin, how important is this assertion, and could it be changed to a simple return?
Hiro, thanks for the patch! Do you have any idea of why the test doesn't fail on Windows 10 on try when the fix is disabled? And do you think we could still go ahead and request review to reland without having understood this? (and with a skip-if for the linux swr-wayland jobs)
Assignee | ||
Comment 68•2 years ago
|
||
(In reply to Florian Quèze [:florian] from comment #67)
Do you have any idea of why the test doesn't fail on Windows 10 on try when the fix is disabled?
IIRC, closing the popup window by a native click is the way to make the test reliably fail without any fix, so using VK_ESCAPE did cause it?
EventUtils.synthesizeNativeMouseEvent can accept a callback function argument, so I guess if we use it, we can avoid the double click problem?
FWIW, from my comment #38
I managed to make the test work (on Linux with xul.panel-animations.enabled=true). So what I've noticed is it seems the issue only happens when the popup is closed by a native mouse click event. And I am now unsure though, I've changed to use the download panel instead of the browser menu.
(In reply to Florian Quèze [:florian] from comment #67)
And do you think we could still go ahead and request review to reland without having understood this? (and with a skip-if for the linux swr-wayland jobs)
I'd say yes, if the test reliably works on other platforms.
Comment 69•2 years ago
|
||
Just a side-note that the Wayland issue could well be a system bug. Ubuntu 18.04 did not support Wayland officially, so fixes to the display manager don't get backported. So it would be worth retrying once bug 1725245 lands.
Reporter | ||
Comment 70•2 years ago
|
||
(In reply to Hiroyuki Ikezoe (:hiro) from comment #68)
(In reply to Florian Quèze [:florian] from comment #67)
Do you have any idea of why the test doesn't fail on Windows 10 on try when the fix is disabled?
IIRC, closing the popup window by a native click is the way to make the test reliably fail without any fix, so using VK_ESCAPE did cause it?
Yes.
When I push the test to try without the fix and using VK_ESCAPE, there's a failure only on Windows 7.
If I push this using a second click to close the panel, I additionally get TV failures on Windows: https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C3%2C2&revision=09650045400f91e6449c268157c664d780c96d97
EventUtils.synthesizeNativeMouseEvent can accept a callback function argument, so I guess if we use it, we can avoid the double click problem?
My understand of how this observer works is that we would get notified after the click has been dispatched, but waiting for the panel to open already handles that. We would need to wait long enough for the second click to not be considered as a double click by the OS, but that won't be covered by the observer.
(In reply to Florian Quèze [:florian] from comment #67)
And do you think we could still go ahead and request review to reland without having understood this? (and with a skip-if for the linux swr-wayland jobs)
I'd say yes, if the test reliably works on other platforms.
I've been told in #sheriffs that the wayland jobs don't run in CI, and I don't need to add a skip-if annotation. Apparently it's a bug that the job is marked as tier1 on Try.
Reporter | ||
Comment 71•2 years ago
|
||
Reporter | ||
Comment 72•2 years ago
|
||
Comment 73•2 years ago
|
||
(In reply to Florian Quèze [:florian] from comment #67)
I pushed to try the patch attached in comment 65 with the test fixes I mentioned in comment 60 and 66: https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=384b458b9b6a2b21871e30a361340cfa4f315fff&selectedTaskRun=cFI-Y9slSTKPX-aYdFBXuA.0
It is green, except on the linux swr-wayland jobs where it fails on an assertion: Assertion failure: mGdkWindow (Getting configure for invisible window?), added in bug 1738729.
To see if the assertion was caused by the fix, I pushed again the same test but with the fix commented out. It failed again with the same assertion on the linux swr-wayland jobs: https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=e8c7ac3ac87850073f5e61fd245415eeb1ab154a
It also failed on the Window 7 jobs (that's expected with the fix commented out!), but didn't fail on the Windows 10 jobs, which is puzzling as on my local Windows 10 and 11 machines, the test fails without the fix and passes with the fix.Martin, how important is this assertion, and could it be changed to a simple return?
It's somehow expected so don't worry about it.
Comment 74•2 years ago
|
||
Reporter | ||
Comment 75•2 years ago
|
||
Comment 76•2 years ago
|
||
Comment 77•2 years ago
|
||
Backed out for causing failures on /browser_ext_getViews.js
- backout: https://hg.mozilla.org/integration/autoland/rev/51af9cc54189d365c76c625b529bb86cd36dd8e0
- push: https://treeherder.mozilla.org/jobs?repo=autoland&group_state=expanded&revision=a692ca1d84013f272d5e1170e23e3acdef89be08
- failure log: https://treeherder.mozilla.org/logviewer?job_id=384122871&repo=autoland&lineNumber=5004
[task 2022-07-12T17:11:48.760Z] 17:11:48 INFO - TEST-PASS | browser/components/extensions/test/browser/browser_ext_getViews.js | browsing context group is correct - Expected: 1911, Actual: 1911 -
[task 2022-07-12T17:11:48.760Z] 17:11:48 INFO - Buffered messages finished
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - TEST-UNEXPECTED-FAIL | browser/components/extensions/test/browser/browser_ext_getViews.js | view type is valid -
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - Stack trace:
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - chrome://mochikit/content/browser-test.js:test_ok:1400
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - chrome://mochikit/content/tests/SimpleTest/ExtensionTestUtils.js:testHandler:87
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - chrome://mochikit/content/tests/SimpleTest/ExtensionTestUtils.js:testResult:97
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://specialpowers/SpecialPowersChild.jsm:listener:2020
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://specialpowers/SpecialPowersChild.jsm:loadExtension/<:1950
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://specialpowers/SpecialPowersChild.jsm:receiveMessage:253
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - JSActor query*resource://specialpowers/SpecialPowersParent.jsm:resultListener:1120
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/ExtensionCommon.jsm:emit:320
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/Extension.jsm:receiveMessage:2536
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - MessageListener.receiveMessage*resource://gre/modules/Extension.jsm:Extension:2340
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/Extension.jsm:startup:2230
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/addons/XPIProvider.jsm:callBootstrapMethod:1852
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/addons/XPIProvider.jsm:startup:1960
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/addons/XPIProvider.jsm:_install:2037
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/addons/XPIProvider.jsm:install:2026
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/addons/XPIInstall.jsm:_activateAddon:4581
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/addons/XPIInstall.jsm:installTemporaryAddon:4440
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/addons/XPIProvider.jsm:XPIProvider[meth]:3284
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/AddonManager.jsm:installTemporaryAddon:2477
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://gre/modules/AddonManager.jsm:installTemporaryAddon:4142
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://testing-common/ExtensionTestCommon.jsm:startup:182
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - resource://specialpowers/SpecialPowersParent.jsm:receiveMessage/<:1196
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - promise callback*resource://specialpowers/SpecialPowersParent.jsm:receiveMessage:1190
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - JSActor query*resource://specialpowers/SpecialPowersChild.jsm:startup:1975
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - chrome://mochitests/content/browser/browser/components/extensions/test/browser/browser_ext_getViews.js:null:154
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - chrome://mochikit/content/browser-test.js:handleTask:988
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - chrome://mochikit/content/browser-test.js:_runTaskBasedTest:1060
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - chrome://mochikit/content/browser-test.js:Tester_execTest:1195
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - chrome://mochikit/content/browser-test.js:nextTest/<:977
[task 2022-07-12T17:11:48.761Z] 17:11:48 INFO - chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:SimpleTest.waitForFocus/<:1041
[task 2022-07-12T17:11:48.762Z] 17:11:48 INFO - TEST-PASS | browser/components/extensions/test/browser/browser_ext_getViews.js | browsing context group is correct - Expected: 1911, Actual: 1911 -
Comment 78•2 years ago
|
||
Updated•2 years ago
|
Reporter | ||
Comment 79•2 years ago
|
||
(In reply to Natalia Csoregi [:nataliaCs] from comment #77)
Backed out for causing failures on /browser_ext_getViews.js
This was investigated in bug 1748808.
Comment 80•2 years ago
|
||
Assignee | ||
Comment 81•2 years ago
|
||
browser_test_select_popup_position.js is the test caused the crash.
The changes here unlikely cause the crash, at first glance.
Comment 82•2 years ago
•
|
||
These kind of crashes happened also with the land in comment 74, link here. Bug 1779309 was filled for them then.
[task 2022-07-12T22:21:34.354Z] 22:21:34 INFO - PROCESS-CRASH | Main app process exited normally | application crashed [@ mozilla::layers::APZCTreeManagerParent::RecvUpdateZoomConstraints(mozilla::layers::ScrollableLayerGuid const&, mozilla::Maybe<mozilla::layers::ZoomConstraints> const&)]
[task 2022-07-12T22:21:34.354Z] 22:21:34 INFO - Crash dump filename: C:\Users\task_165765759579374\AppData\Local\Temp\tmplwrnxkty.mozrunner\minidumps\61f22c50-d8a2-4c59-9a49-a420248cffc7.dmp
[task 2022-07-12T22:21:34.354Z] 22:21:34 INFO - Operating system: Windows NT
[task 2022-07-12T22:21:34.354Z] 22:21:34 INFO - 10.0.19041
[task 2022-07-12T22:21:34.354Z] 22:21:34 INFO - CPU: amd64
[task 2022-07-12T22:21:34.354Z] 22:21:34 INFO - family 6 model 106 stepping 6
[task 2022-07-12T22:21:34.354Z] 22:21:34 INFO - 8 CPUs
[task 2022-07-12T22:21:34.354Z] 22:21:34 INFO -
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - Crash reason: EXCEPTION_ACCESS_VIOLATION_READ
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - Crash address: 0x0
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - Process uptime: 4 seconds
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO -
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - Thread 8 Compositor (crashed)
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - 0 xul.dll!mozilla::layers::APZCTreeManagerParent::RecvUpdateZoomConstraints(mozilla::layers::ScrollableLayerGuid const&, mozilla::Maybe<mozilla::layers::ZoomConstraints> const&) [APZCTreeManagerParent.cpp:a692ca1d84013f272d5e1170e23e3acdef89be08 : 97 + 0x0]
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - rax = 0x0000000100000023 rdx = 0x0000024b6f4bd880
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - rcx = 0x0000000000000000 rbx = 0x0000024b6f527910
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - rsi = 0x00000082f067f528 rdi = 0x00000082f067f548
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - rbp = 0x0000000000000001 rsp = 0x00000082f067f4c0
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - r8 = 0x00000082f067f548 r9 = 0x00000082f067f530
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - r10 = 0x0000024b7b46e010 r11 = 0x0000024b6f3e8100
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - r12 = 0x00000082f067faaf r13 = 0x0000000069a0c300
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - r14 = 0x0000024b6f4bd880 r15 = 0x0000000000000000
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - rip = 0x00007ffb84b353e8
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - Found by: given as instruction pointer in context
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - 1 xul.dll!mozilla::layers::PAPZCTreeManagerParent::OnMessageReceived(IPC::Message const&) [PAPZCTreeManagerParent.cpp: : 443 + 0x7]
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - rbx = 0x0000024b6f527910 rsi = 0x00000082f067f588
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - rbp = 0x0000000000000001 rsp = 0x00000082f067f4f0
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - r12 = 0x00000082f067faaf r13 = 0x0000000069a0c300
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - r14 = 0x0000024b6f4bd880 r15 = 0x0000000000000000
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - rip = 0x00007ffb84bb17b5
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - 2 xul.dll!mozilla::layers::PCompositorManagerParent::OnMessageReceived(IPC::Message const&) [PCompositorManagerParent.cpp: : 194 + 0xf]
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - rbx = 0x0000024b6f5ebc20 rsi = 0x0000024b6f527900
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - rdi = 0x0000000000000000 rbp = 0x0000000000000001
[task 2022-07-12T22:21:34.367Z] 22:21:34 INFO - rsp = 0x00000082f067f5f0 r12 = 0x00000082f067faaf
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - r13 = 0x0000000069a0c300 r14 = 0x0000024b69a50bb0
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - r15 = 0x0000000000000000 rip = 0x00007ffb84aa9e5f
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - 3 xul.dll!mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) [MessageChannel.cpp:a692ca1d84013f272d5e1170e23e3acdef89be08 : 1749 + 0xf]
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - rbx = 0x0000024b6ea9a200 rsi = 0x0000024b6f527900
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - rdi = 0x0000024b69a50c28 rbp = 0x0000000000000001
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - rsp = 0x00000082f067f6e0 r12 = 0x00000082f067faaf
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - r13 = 0x0000000069a0c300 r14 = 0x0000024b6ea9a200
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - r15 = 0x0000000000000000 rip = 0x00007ffb8472b3c0
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - 4 xul.dll!mozilla::ipc::MessageChannel::DispatchMessage(mozilla::ipc::ActorLifecycleProxy*, mozilla::UniquePtr<IPC::Message,mozilla::DefaultDelete<IPC::Message> >) [MessageChannel.cpp:a692ca1d84013f272d5e1170e23e3acdef89be08 : 1674 + 0xa]
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - rbx = 0x0000024b6ea513a0 rsi = 0x00000082f067f840
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - rdi = 0x0000024b69a50c28 rbp = 0x00000000ffffffff
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - rsp = 0x00000082f067f740 r12 = 0x00000082f067faaf
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - r13 = 0x0000000069a0c300 r14 = 0x0000024b6ea9a200
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - r15 = 0x0000000000000000 rip = 0x00007ffb8472a835
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - 5 xul.dll!mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::ActorLifecycleProxy*, mozilla::ipc::MessageChannel::MessageTask&) [MessageChannel.cpp:a692ca1d84013f272d5e1170e23e3acdef89be08 : 1474 + 0xa]
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - rbx = 0x0000024b6f528400 rsi = 0x0000024b69a50c28
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - rdi = 0x0000024b6ea9a200 rbp = 0x0000000000000000
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - rsp = 0x00000082f067f820 r12 = 0x00000082f067faaf
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - r13 = 0x0000000069a0c300 r14 = 0x0000024b6ea513a0
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - r15 = 0x0000000000000000 rip = 0x00007ffb8472abfe
[task 2022-07-12T22:21:34.368Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - 6 xul.dll!mozilla::ipc::MessageChannel::MessageTask::Run() [MessageChannel.cpp:a692ca1d84013f272d5e1170e23e3acdef89be08 : 1572 + 0xa]
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - rbx = 0x0000024b6ea9a200 rsi = 0x0000024b6f528440
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - rdi = 0x0000024b6f528400 rbp = 0x0000000000000000
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - rsp = 0x00000082f067f870 r12 = 0x00000082f067faaf
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - r13 = 0x0000000069a0c300 r14 = 0x0000024b6ea513a0
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - r15 = 0x0000000000000000 rip = 0x00007ffb8472ae91
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - 7 xul.dll!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:a692ca1d84013f272d5e1170e23e3acdef89be08 : 1199 + 0xc]
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - rbx = 0x00000082f067f9a0 rsi = 0x0000024b69a68300
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - rdi = 0x00000000ffffffff rbp = 0x0000000000000000
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - rsp = 0x00000082f067f8c0 r12 = 0x00000082f067faaf
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - r13 = 0x0000000069a0c300 r14 = 0x0000000000000000
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - r15 = 0x0000000000000000 rip = 0x00007ffb840fb958
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - 8 xul.dll!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:a692ca1d84013f272d5e1170e23e3acdef89be08 : 465 + 0x16]
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - rbx = 0x0000000069abe501 rsi = 0x0000024b69abe5c0
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - rdi = 0x00000082f067fc08 rbp = 0x00000082f067fb50
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - rsp = 0x00000082f067fa80 r12 = 0x00007ffbbe4bfaa0
[task 2022-07-12T22:21:34.369Z] 22:21:34 INFO - r13 = 0x00007ffb9c5a9468 r14 = 0x0000024b69a68300
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - r15 = 0x0000024b69abe5e0 rip = 0x00007ffb840ff928
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - 9 xul.dll!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [MessagePump.cpp:a692ca1d84013f272d5e1170e23e3acdef89be08 : 330 + 0x9]
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - rbx = 0x0000000069abe500 rsi = 0x0000024b69abe5c0
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - rbp = 0x00000082f067fb50 rsp = 0x00000082f067fad0
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - r12 = 0x00007ffbbe4bfaa0 r13 = 0x00007ffb9c5a9468
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - r14 = 0x0000024b69a68300 r15 = 0x0000024b69abe5e0
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - rip = 0x00007ffb8472e0ae
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - 10 xul.dll!MessageLoop::RunHandler() [message_loop.cc:a692ca1d84013f272d5e1170e23e3acdef89be08 : 373 + 0x15]
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - rbx = 0x0000024b69a68300 rsi = 0x00000082f067fc08
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - rdi = 0x000000000000000a rbp = 0x00000082f067fb50
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - rsp = 0x00000082f067fb30 r12 = 0x00007ffbbe4bfaa0
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - r13 = 0x00007ffb9c5a9468 r14 = 0x0000024b69ac5b28
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - r15 = 0x0000024b69ac5b20 rip = 0x00007ffb846ec750
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - 11 xul.dll!MessageLoop::Run() [message_loop.cc:a692ca1d84013f272d5e1170e23e3acdef89be08 : 355 + 0x4]
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - rbx = 0x0000024b69a68300 rsi = 0x00000082f067fc08
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - rdi = 0x000000000000000a rbp = 0x00000082f0600000
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - rsp = 0x00000082f067fb80 r12 = 0x00007ffbbe4bfaa0
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - r13 = 0x00007ffb9c5a9468 r14 = 0x0000024b69ac5b28
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - r15 = 0x0000024b69ac5b20 rip = 0x00007ffb846ec6c8
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - 12 xul.dll!static nsThread::ThreadFunc(void*) [nsThread.cpp:a692ca1d84013f272d5e1170e23e3acdef89be08 : 384 + 0x7]
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - rbx = 0x0000024b69a68300 rbp = 0x00000082f0600000
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - rsp = 0x00000082f067fbd0 r12 = 0x00007ffbbe4bfaa0
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - r13 = 0x00007ffb9c5a9468 r14 = 0x0000024b69ac5b28
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - r15 = 0x0000024b69ac5b20 rip = 0x00007ffb840f8cdd
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.370Z] 22:21:34 INFO - 13 nss3.dll!PR_NativeRunThread(void*) [pruthr.c:a692ca1d84013f272d5e1170e23e3acdef89be08 : 399 + 0xd]
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rbx = 0x0000024b69a30a30 rsi = 0x0000024b69a94a00
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rdi = 0x0000024b69a94b38 rbp = 0x00000082f0600000
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rsp = 0x00000082f067fda0 r12 = 0x00007ffbbe4bfaa0
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - r13 = 0x00007ffb9c5a9468 r14 = 0x0000024b69a30a00
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - r15 = 0x0000024b69a94b48 rip = 0x00007ffb9c496b67
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - 14 nss3.dll!pr_root(void*) [w95thred.c:a692ca1d84013f272d5e1170e23e3acdef89be08 : 139 + 0xc]
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rbx = 0x0000024b6b3b1540 rsi = 0x0000024b6b3b1540
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rdi = 0x0000000000000000 rbp = 0x0000000000000000
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rsp = 0x00000082f067fe20 r12 = 0x0000000000000000
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - r13 = 0x0000000000000000 r14 = 0x0000000000000000
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - r15 = 0x0000000000000000 rip = 0x00007ffb9c4892b1
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - 15 ucrtbase.dll!thread_start<unsigned int (__cdecl*)(void *),1> + 0x41
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rbx = 0x0000024b6b3b1540 rbp = 0x0000000000000000
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rsp = 0x00000082f067fe50 r12 = 0x0000000000000000
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - r13 = 0x0000000000000000 r14 = 0x0000000000000000
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - r15 = 0x0000000000000000 rip = 0x00007ffbbbed1bb2
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - 16 kernel32.dll!BaseThreadInitThunk + 0x13
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rbx = 0x00007ffbbbed1b70 rbp = 0x0000000000000000
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rsp = 0x00000082f067fe80 r12 = 0x0000000000000000
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - r13 = 0x0000000000000000 r14 = 0x0000000000000000
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - r15 = 0x0000000000000000 rip = 0x00007ffbbc727034
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - 17 mozglue.dll!patched_BaseThreadInitThunk(int, void*, void*) [WindowsDllBlocklist.cpp:a692ca1d84013f272d5e1170e23e3acdef89be08 : 572 + 0x28]
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rbx = 0x00007ffbbbed1b70 rbp = 0x0000000000000000
[task 2022-07-12T22:21:34.371Z] 22:21:34 INFO - rsp = 0x00000082f067feb0 r12 = 0x0000000000000000
[task 2022-07-12T22:21:34.372Z] 22:21:34 INFO - r13 = 0x0000000000000000 r14 = 0x0000000000000000
[task 2022-07-12T22:21:34.372Z] 22:21:34 INFO - r15 = 0x0000000000000000 rip = 0x00007ffba4224674
[task 2022-07-12T22:21:34.372Z] 22:21:34 INFO - Found by: call frame info
[task 2022-07-12T22:21:34.372Z] 22:21:34 INFO - 18 ntdll.dll!RtlUserThreadStart + 0x20
[task 2022-07-12T22:21:34.372Z] 22:21:34 INFO - rbx = 0x0000000000000000 rsi = 0x0000000000000000
[task 2022-07-12T22:21:34.372Z] 22:21:34 INFO - rdi = 0x0000000000000000 rbp = 0x0000000000000000
[task 2022-07-12T22:21:34.372Z] 22:21:34 INFO - rsp = 0x00000082f067ff20 r12 = 0x0000000000000000
[task 2022-07-12T22:21:34.372Z] 22:21:34 INFO - r13 = 0x0000000000000000 r14 = 0x0000000000000000
[task 2022-07-12T22:21:34.372Z] 22:21:34 INFO - r15 = 0x0000000000000000 rip = 0x00007ffbbe4e2651
[task 2022-07-12T22:21:34.372Z] 22:21:34 INFO - Found by: call frame info
~~~
Reporter | ||
Comment 83•2 years ago
|
||
It looks like there are 2 different crashes.
One [@ mozilla::layers::APZCTreeManagerParent::RecvUpdateZoomConstraints(mozilla::layers::ScrollableLayerGuid const&, mozilla::Maybe<mozilla::layers::ZoomConstraints> const&)]
, which seems to be a pre-existing bug, as bug 1488886 had the exact same stack. Interestingly, all the recent crash reports there are from Thunderbird. The line is https://searchfox.org/mozilla-central/rev/fa71140041c5401b80a11f099cc0cd0653295e2c/gfx/layers/ipc/APZCTreeManagerParent.cpp#97 mTreeManager->UpdateZoomConstraints(aGuid, aConstraints);
and the crash address is 0, maybe mTreeManager is null because it has already been destroyed?
Another one [@ mozilla::layers::APZCTreeManager::StartAutoscroll(mozilla::layers::ScrollableLayerGuid const&, mozilla::gfx::PointTyped<mozilla::ScreenPixel,float> const&)]
.
The line at https://searchfox.org/mozilla-central/rev/fa71140041c5401b80a11f099cc0cd0653295e2c/gfx/layers/apz/src/APZCTreeManager.cpp#1025 is NotifyAutoscrollRejected(aGuid);
. So I guess https://searchfox.org/mozilla-central/rev/fa71140041c5401b80a11f099cc0cd0653295e2c/gfx/layers/apz/src/APZCTreeManager.cpp#1064-1067 is inlined. The crash address is also 0, and this is on the GPU process.
Reporter | ||
Comment 84•2 years ago
|
||
(In reply to Florian Quèze [:florian] from comment #83)
Another one
[@ mozilla::layers::APZCTreeManager::StartAutoscroll(mozilla::layers::ScrollableLayerGuid const&, mozilla::gfx::PointTyped<mozilla::ScreenPixel,float> const&)]
.
The line at https://searchfox.org/mozilla-central/rev/fa71140041c5401b80a11f099cc0cd0653295e2c/gfx/layers/apz/src/APZCTreeManager.cpp#1025 isNotifyAutoscrollRejected(aGuid);
. So I guess https://searchfox.org/mozilla-central/rev/fa71140041c5401b80a11f099cc0cd0653295e2c/gfx/layers/apz/src/APZCTreeManager.cpp#1064-1067 is inlined. The crash address is also 0, and this is on the GPU process.
I opened a minidump in a debugger to see a better stack, and there's indeed an inlined stack frame. The crashing line is https://searchfox.org/mozilla-central/rev/108c7843696fdf951083889d6fb858953139be96/gfx/layers/apz/src/APZCTreeManager.cpp#1067 (controller->NotifyAsyncAutoscrollRejected(aGuid.mScrollId);
where controller
is NULL)
Reporter | ||
Comment 85•2 years ago
|
||
(In reply to Hiroyuki Ikezoe (:hiro) from comment #81)
browser_test_select_popup_position.js is the test caused the crash.
The changes here unlikely cause the crash, at first glance.
I think we just ran into a way to reproduce bug 1488886.
I noticed that in all cases where the crash was reproduced, the browser\base\content\test\performance\browser.ini manifest (which includes the new test added in this bug, which moves the mouse using EventUtils.synthesizeNativeMouseEvent({type: "click"
) and the gfx\layers\apz\test\mochitest\browser.ini manifest (where we intermittently crash at shutdown) ran in the same job (typically bc6). When gfx\layers\apz\test\mochitest\browser.ini ran in bc4 independently of browser\base\content\test\performance\browser.ini, there was no crash.
I could reproduce the crash on try, and if I add a simple change to our new test to move the mouse to the center of the browser window at the end of the test, I can't reproduce the crashes anymore. There is activity in bug 1779985 to avoid this entire class of issues in the future.
Comment 86•2 years ago
|
||
Comment 87•2 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/fa3e753f8f94
https://hg.mozilla.org/mozilla-central/rev/494979d407ea
https://hg.mozilla.org/mozilla-central/rev/0c27a3d7f914
Updated•2 years ago
|
Description
•