Closed Bug 1779295 Opened 2 years ago Closed 2 years ago

Intermittent browser/base/content/test/performance/browser_panel_vsync.js | Test timed out -

Categories

(Core :: Graphics, defect, P5)

defect

Tracking

()

RESOLVED DUPLICATE of bug 1781881
Tracking Status
firefox-esr91 --- unaffected
firefox-esr102 --- unaffected
firefox102 --- unaffected
firefox103 --- unaffected
firefox104 --- wontfix
firefox105 --- wontfix

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: intermittent-failure, regression)

Filed by: imoraru [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=384119561&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/P7GFJ6KURaaqj2WDPcHeQw/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/P7GFJ6KURaaqj2WDPcHeQw/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1


[task 2022-07-12T16:24:54.286Z] 16:24:54     INFO - TEST-START | browser/base/content/test/performance/browser_panel_vsync.js
[task 2022-07-12T16:25:39.308Z] 16:25:39     INFO - TEST-INFO | started process screentopng
[task 2022-07-12T16:25:39.603Z] 16:25:39     INFO - TEST-INFO | screentopng: exit 0
[task 2022-07-12T16:25:39.604Z] 16:25:39     INFO - Buffered messages logged at 16:24:54
[task 2022-07-12T16:25:39.605Z] 16:25:39     INFO - Entering test bound test_opening_panel_and_closing_should_not_leave_vsync
[task 2022-07-12T16:25:39.606Z] 16:25:39     INFO - Buffered messages finished
[task 2022-07-12T16:25:39.607Z] 16:25:39     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_panel_vsync.js | Test timed out - 
[task 2022-07-12T16:25:39.608Z] 16:25:39     INFO - removing /tmp/dm-ui-test.file
[task 2022-07-12T16:25:39.609Z] 16:25:39     INFO - GECKO(3384) | MEMORY STAT vsizeMaxContiguous not supported in this build configuration.
[task 2022-07-12T16:25:39.610Z] 16:25:39     INFO - GECKO(3384) | MEMORY STAT | vsize 3259MB | residentFast 486MB | heapAllocated 239MB
[task 2022-07-12T16:25:39.611Z] 16:25:39     INFO - TEST-OK | browser/base/content/test/performance/browser_panel_vsync.js | took 45054ms
[task 2022-07-12T16:25:39.612Z] 16:25:39     INFO - checking window state
[task 2022-07-12T16:25:39.613Z] 16:25:39     INFO - GECKO(3384) | Completed ShutdownLeaks collections in process 3384
[task 2022-07-12T16:25:39.618Z] 16:25:39     INFO - TEST-START | Shutdown
[task 2022-07-12T16:25:39.619Z] 16:25:39     INFO - Browser Chrome Test Summary
[task 2022-07-12T16:25:39.620Z] 16:25:39     INFO - Passed:  0
[task 2022-07-12T16:25:39.621Z] 16:25:39     INFO - Failed:  1
[task 2022-07-12T16:25:39.622Z] 16:25:39     INFO - Todo:    0
[task 2022-07-12T16:25:39.623Z] 16:25:39     INFO - Mode:    e10s
[task 2022-07-12T16:25:39.624Z] 16:25:39     INFO - *** End BrowserChrome Test Results ***
[task 2022-07-12T16:25:39.625Z] 16:25:39     INFO - GECKO(3384) | Exiting due to channel error.
[task 2022-07-12T16:25:39.626Z] 16:25:39     INFO - GECKO(3384) | Exiting due to channel error.
[task 2022-07-12T16:25:39.626Z] 16:25:39     INFO - GECKO(3384) | Exiting due to channel error.
[task 2022-07-12T16:25:39.627Z] 16:25:39     INFO - GECKO(3384) | Exiting due to channel error.
[task 2022-07-12T16:25:39.627Z] 16:25:39     INFO - GECKO(3384) | Exiting due to channel error.
[task 2022-07-12T16:25:39.627Z] 16:25:39     INFO - GECKO(3384) | Exiting due to channel error.
[task 2022-07-12T16:25:39.627Z] 16:25:39     INFO - GECKO(3384) | Exiting due to channel error.
[task 2022-07-12T16:25:39.627Z] 16:25:39     INFO - TEST-INFO | Main app process: exit 0
[task 2022-07-12T16:25:39.627Z] 16:25:39     INFO - runtests.py | Application ran for: 0:00:51.898919

:hiro, since you are the author of the regressor, bug 1742797, could you take a look?
For more information, please visit auto_nag documentation.

Flags: needinfo?(hikezoe.birchill)

The root cause of the failure is that it seems on Linux our panels sometimes get randomly closed. I noticed on try the failure on the bc2 jobs: https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=617028520651b7cab9b66ecb58719dccdc48e75a
There it failed with:

TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_panel_vsync.js | Test timed out -
TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_panel_vsync.js | Uncaught exception received from previously timed out test - popuphidden listener on #downloadsPanel not removed before the end of test

Which means the failure happened while we were waiting for the "popuphidden" event. We never received it because the popup had already been closed before. I have a profile showing what happens in this case: https://share.firefox.dev/3z15kLg There's a pair of popuphiding/popuphidden DOM events being dispatched while we wait for vsync to be off after showing the panel, before we have any chance of closing it ourselves. The stack of the popuphiding event is:

mozilla::EventDispatcher::Dispatch(nsISupports*, nsPresContext*, mozilla::WidgetEvent*, mozilla::dom::Event*, nsEventStatus*, mozilla::EventDispatchingCallback*, nsTArray<mozilla::dom::EventTarget*>*)
EventDispatcher::Dispatch popuphiding
EventDispatcher::Dispatch
nsXULPopupManager::FirePopupHidingEvent(nsIContent*, nsIContent*, nsIContent*, nsPresContext*, nsPopupType, bool, bool)
nsXULPopupManager::HidePopup(nsIContent*, bool, bool, bool, bool, nsIContent*)
nsXULPopupManager::Rollup(unsigned int, bool, mozilla::gfx::IntPointTyped<mozilla::LayoutDevicePixel> const*, nsIContent**)
nsWindow::CheckForRollup(double, double, bool, bool)
nsWindow::OnContainerFocusOutEvent(_GdkEventFocus*)
focus_out_event_cb(_GtkWidget*, _GdkEventFocus*)
0x7ff22fee47fb
g_closure_invoke
signal_emit_unlocked_R
g_signal_emit_valist
g_signal_emit
0x7ff23002c534
0x7ff23003f245
0x7ff2300411b0
0x7ff230050bb3
0x7ff230050dbb
0x7ff22fee47fb
g_closure_invoke
signal_emit_unlocked_R
g_signal_emit_valist
g_signal_emit
0x7ff23002c534
0x7ff22fee38ce
0x7ff22f9f4765
0x7ff22fa24f92
g_main_context_dispatch
g_main_context_iterate.isra.26
g_main_context_iteration
nsAppShell::ProcessNextNativeEvent(bool)
nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool)
{virtual override thunk({offset(-8)}, nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool))}
nsThread::ProcessNextEvent(bool, bool*)
NS_ProcessNextEvent(nsIThread*, bool)
mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*)
MessageLoop::Run()
nsBaseAppShell::Run()
nsAppStartup::Run()
XREMain::XRE_mainRun()
XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&)
XREMain::XRE_main
(root)

On TV jobs, there's a second failure, which is actually just a variation of the first one: the unwanted popuphiding event is fired instead of the popupshown event that we are waiting for, which is never fired. Here's a profile of this: https://share.firefox.dev/3yZv1f3
On try, I never saw this failure happening in normal "BC" jobs (and I did more than 70 runs), but I saw it multiple times on TV jobs.

So I worked around the first failure in https://phabricator.services.mozilla.com/D151622 and ignored the variation, as fixing it would require modifying or forking the promisePanelOpened function that we used from browser/components/downloads/test/browser/head.js

Martin, is the stack (or profile) in comment 2 of any use to figure out why we have some popups randomly hiding on Linux?

Flags: needinfo?(stransky)

(In reply to Florian Quèze [:florian] from comment #4)

Martin, is the stack (or profile) in comment 2 of any use to figure out why we have some popups randomly hiding on Linux?

You can run it with MOZ_LOG="Widget:5, WidgetPopup:5" and obtain widget log, there may be logged why there's focus_out_event_cb(). I think it may be caused by test environment/window manager but the log should tell you more.

Flags: needinfo?(stransky)

I pushed another try run and got profiles with the log messages you requested:

Flags: needinfo?(stransky)

Can you reproduce that locally or does it affect the taskcluster testsuite only?
I tired to find browser_panel_vsync.js in the tree but it seems to me missing.

Flags: needinfo?(stransky) → needinfo?(florian)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #7)

Can you reproduce that locally or does it affect the taskcluster testsuite only?

I don't remember being able to reproduce locally.

I tired to find browser_panel_vsync.js in the tree but it seems to me missing.

It comes from the patch stack in bug 1742797 that has just been backed out for a crash on Windows.

Was there anything interesting in the log messages?

Flags: needinfo?(florian)
Flags: needinfo?(stransky)
Summary: Intermittent TV browser/base/content/test/performance/browser_panel_vsync.js | Test timed out - → Intermittent browser/base/content/test/performance/browser_panel_vsync.js | Test timed out -

Set release status flags based on info from the regressing bug 1742797

In reply to Florian Quèze [:florian] from comment #8)

Was there anything interesting in the log messages?

The related part of log is here:

[task 2022-07-18T20:41:01.080Z] 20:41:01     INFO - GECKO(2580) | [Parent 2580: Main Thread]: D/WidgetPopup [7ff8ff1b0000]: nsWindow::SetInputRegion(0, 4)
[task 2022-07-18T20:41:01.098Z] 20:41:01     INFO - GECKO(2580) | [Parent 2580: Main Thread]: D/WidgetPopup [7ff8ff1b0000]: nsWindow::OnWindowStateEvent for 7ff8f97c2c60 changed 0x81 new_window_state 0x80
[task 2022-07-18T20:41:01.099Z] 20:41:01     INFO - GECKO(2580) | [Parent 2580: Main Thread]: D/WidgetPopup [7ff8ff1b0000]: 	early return because no interesting bits changed
[task 2022-07-18T20:41:01.100Z] 20:41:01     INFO - GECKO(2580) | [Parent 2580: Main Thread]: D/WidgetPopup [7ff8ff1b0000]: nsWindow::OnWindowStateEvent for 7ff8f974cf50 changed 0x81 new_window_state 0x80
[task 2022-07-18T20:41:01.100Z] 20:41:01     INFO - GECKO(2580) | [Parent 2580: Main Thread]: D/WidgetPopup [7ff8ff1b0000]: nsWindow::SetHasMappedToplevel() state 1
[task 2022-07-18T20:41:01.101Z] 20:41:01     INFO - GECKO(2580) | [Parent 2580: Main Thread]: D/WidgetPopup [7ff8ff1b0000]: GrabPointer time=0x00000000 retry=1
[task 2022-07-18T20:41:01.104Z] 20:41:01     INFO - GECKO(2580) | [Parent 2580: Main Thread]: D/WidgetPopup [7ff8ff1b0000]: 	quick return because IS_MOZ_CONTAINER(aWidget) is true
[task 2022-07-18T20:41:01.104Z] 20:41:01     INFO - GECKO(2580) | [Parent 2580: Main Thread]: D/Widget [7ff8ff1abc00]: OnContainerFocusOutEvent

Main Firefox window loses focus so we close rollup (non-persistent) popups. But why is the main Firefox unfocused? Is that caused by nsWindow::GrabPointer() called on the popup? It seems so as we gain focus again when the popup is closed but nsWindow::GrabPointer() should not behave that, it should leave focus on the parent (as it does on my test box).

Also this part of the log is weird:

[task 2022-07-18T20:41:00.622Z] 20:41:00     INFO - GECKO(2580) | [Parent 2580: Main Thread]: D/Widget [7ff947dbc400]: OnContainerFocusOutEvent
[task 2022-07-18T20:41:00.625Z] 20:41:00     INFO - GECKO(2580) | [Parent 2580: Main Thread]: D/Widget [7ff947dbc400]: Done with container focus out
[task 2022-07-18T20:41:00.662Z] 20:41:00     INFO - GECKO(2580) | [Parent 2580: Main Thread]: D/Widget [7ff8ff1abc00]: OnContainerFocusInEvent

We just lose/gain focus without any apparent reason. Is that caused test suite or window manager or so?

Flags: needinfo?(stransky)

(In reply to Florian Quèze [:florian] from comment #8)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #7)

Can you reproduce that locally or does it affect the taskcluster testsuite only?

I don't remember being able to reproduce locally.

I tried to reproduce locally on my Ubuntu 22.04 with XWindow. I couldn't see the failure on verity runs.

Flags: needinfo?(hikezoe.birchill)

Set release status flags based on info from the regressing bug 1742797

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.