Open Bug 1983745 Opened 4 months ago Updated 2 months ago

Intermittent Linux Wayland browser/components/sessionstore/test/marionette/test_restore_manually.py | AssertionError: 552 != 500 : Second window has been restored to the correct height

Categories

(Firefox :: Session Restore, defect, P5)

defect

Tracking

()

ASSIGNED

People

(Reporter: intermittent-bug-filer, Assigned: emilio)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, intermittent-testcase, Whiteboard: [fidefe-session-restore] )

Attachments

(2 files)

Filed by: hskupin [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=522824083&repo=try&task=bJgX1alIQgmPTxmjuU24xg.0
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/bJgX1alIQgmPTxmjuU24xg/runs/0/artifacts/public/logs/live_backing.log


[task 2025-08-18T20:10:07.961+00:00] 20:10:07     INFO - TEST-UNEXPECTED-FAIL | browser/components/sessionstore/test/marionette/test_restore_manually.py TestSessionRestoreManually.test_restore | AssertionError: 552 != 500 : Second window has been restored to the correct height.
[task 2025-08-18T20:10:07.961+00:00] 20:10:07     INFO - Traceback (most recent call last):
[task 2025-08-18T20:10:07.962+00:00] 20:10:07     INFO -   File "/task_175554654672914/build/venv/lib/python3.10/site-packages/marionette_harness/marionette_test/testcases.py", line 193, in run
[task 2025-08-18T20:10:07.962+00:00] 20:10:07     INFO -     testMethod()
[task 2025-08-18T20:10:07.962+00:00] 20:10:07     INFO -   File "/task_175554654672914/build/tests/marionette/tests/browser/components/sessionstore/test/marionette/test_restore_manually.py", line 129, in test_restore
[task 2025-08-18T20:10:07.962+00:00] 20:10:07     INFO -     self.assertEqual(

This is the second last failure that currently blocks us from enabling Marionette integration tests on Wayland.

As it looks like there is a 2 pixel gap in at least the expected window height after Firefox was restarted. Is it known that under Wayland the window size can slightly change when restarting the application? Martin do you know?

Blocks: 1977223
Flags: needinfo?(stransky)
Summary: Intermittent Linux Wayland browser/components/sessionstore/test/marionette/test_restore_manually.py | single tracking bug → Intermittent Linux Wayland browser/components/sessionstore/test/marionette/test_restore_manually.py | AssertionError: 552 != 500 : Second window has been restored to the correct height

(In reply to Henrik Skupin [:whimboo][⌚️UTC+2] from comment #1)

This is the second last failure that currently blocks us from enabling Marionette integration tests on Wayland.

As it looks like there is a 2 pixel gap in at least the expected window height after Firefox was restarted. Is it known that under Wayland the window size can slightly change when restarting the application? Martin do you know?

Wayland window is generally sizeable but we can't control its position. When we set the size explicitly we're expected to get it. Is that a toplevel or popup window?

Please run with MOZ_LOG="Widget:5" env variable for toplevel and MOZ_LOG="WidgetPopup:5" for popup windows. The log should contain all related info how the window is processed/sized etc.

Also there's upcoming Wayland protocol related to session restore - https://wayland.app/protocols/wayland-protocols/18 - but it's not implemented yet (https://bugzilla.mozilla.org/show_bug.cgi?id=1959841).

Flags: needinfo?(stransky)

Interesting on bug 1959841 is that it says that KDE already supports it but not Gnome. This sounds similar to bug 1982745 comment 12. Emilio already added patch on that other bug and maybe it will as well fix this issue? Might be as well just the window decoration issue?

Flags: needinfo?(emilio)

Is this a test bug? Is the window really the wrong size? On Linux we restore client sizes rather than outer sizes, because we don't know the size of the decorations before-hand. But no, it's not related to that, since the browser window is not undecorated.

Flags: needinfo?(emilio)

Hard to say. Sarah wrote this test a while ago so maybe she can take a look if it's actually a test issue.

Flags: needinfo?(sclements)

This bug is caused by race condition and nature of Gtk resize logic. We set window window size before it's shown which apparently doesn't work here:

D/Widget [71a184071600]: nsWindow::ResizeInt w:502 h:500
[...]
D/Widget [71a184071600]: nsWindow::Show state 1 frame 
D/Widget [71a184071600]: nsWindow::NativeShow show
[...]
D/Widget [71a184071600]: configure event 0,0 -> 554 x 552 direct mGdkWindow scale 1
[...]
D/Widget moz_container_wayland_size_allocate [71a184071600] 26,23 -> 502 x 500
D/Widget [71a184071600]: nsWindow::OnContainerSizeAllocate 26,23 -> 502 x 500
[...]
V/Widget [71a184071600]: nsWindow::SetEGLNativeWindowSize() 502 x 500 scale 1.000000

So we actually restore the window correctly (change size from 550 to 500) but it takes some time after window show as it's async. That's the core issue here (and with all js size testing code) - size changes are async but js dom code expect the change is sync. We tried to do various tricks here to make the window appear already sized although the actual operation is still pending.

I think the test needs a waiting for the resize and eventually timeout failure if the window is not resized in some time.

Thank you for the investigation Martin! This should be easy to do with the Wait(..).until(..) call that various other tests already make use of.

Sarah, please let us know if you can get this fixed. Thanks.

Flags: needinfo?(sclements)
Whiteboard: [fidefe-session-restore]

I've rewritten this test using Wait(...).until(...) and there's a failure right after the resize now... seems like there's still a 52 pixel difference. I'm not sure where to go from here (and I don't see the same failure on a linux opt build). Here's the try push for the WIP patch.

I've mentioned this to sfoster to see if he can take this over for me while I'm on PTO (if the conclusion that this is still a test issue and needs further changes).

Flags: needinfo?(stransky)
Flags: needinfo?(hskupin)
Flags: needinfo?(sfoster)

That difference may come from CSD decorations and/or titlebar. Is that with or height?

You may try to run with MOZ_GTK_TITLEBAR_DECORATION=none and/or GTK_CSD=0 env variables which disables explicit CSD decorations on our side (but there may be a difference how we set/get the window sizes).

Flags: needinfo?(stransky)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #13)

That difference may come from CSD decorations and/or titlebar. Is that with or height?

The test to update specifically tests the height of the window. So the titlebar and the shadow (?) might influence the reported height.

Flags: needinfo?(hskupin)

Can I run the test locally somehow so I can check what's going on?
Thanks.

Yes, with the following command you can run the test with trace logs enabled for Marionette:

./mach marionette-test --gecko-log=- -vv browser/components/sessionstore/test/marionette/test_restore_manually.py

Okay, will try to look at it. So with the command above I shall see the error here, right?

Flags: needinfo?(stransky)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #18)

Okay, will try to look at it. So with the command above I shall see the error here, right?

Yes. If not then I would assume we have some specific setup for CI and we would have to figure out what's causing it. Thanks for checking!

CI runs without CSD right? because we don't set XDG_CURRENT_DESKTOP, see bug 1431429.

NVM, that doesn't affect wayland. But the patch in comment 11 is broken, it's calling resizeTo(..., 500), but that sets the outer, not inner size. That would also fail on Windows with the titlebar on.

That's an existing issue with the test.

Request a 500x500 inner size rather than outer size, so as to not hit
minimum size limits.

Assignee: nobody → emilio
Status: NEW → ASSIGNED

The try push in comment 24 has a new failure Timed out after 5.0 seconds with message: Non private browsing windows should have been restored. That seems unrelated to the window size thing. Which I suppose means we should land the patch? Or will it just get backed out for the new failure? I can help look into it but it really seems like 2 separate bug unless I'm misunderstanding.

Flags: needinfo?(sfoster) → needinfo?(emilio)

That try push hit bug 1986422, but apparently with that it fails in a subtly different way (locally it doesn't of course...).

Sam, can you reproduce the failure locally with that patch by any chance?

Flags: needinfo?(emilio) → needinfo?(sfoster)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #26)

Sam, can you reproduce the failure locally with that patch by any chance?

test_restore_manually.py passes for me locally with and without the patch on windows 11.
Its only the windows failures we care about there right? The manifest has it skipping some linux builds, but the : Unsupported platform for simulate_os_shutdow failures on linux and macos indicate this shouldn't be running there at all.

test_restore_windows_after_windows_shutdown.py is failing with main (without the patch) in the same way: Timed out after 5.0 seconds with message: Non private browsing windows should have been restored. (duh, the patch only updates the other test so this is expected)

Flags: needinfo?(sfoster) → needinfo?(emilio)

No, it's the Linux Wayland failures we care about :)

Flags: needinfo?(emilio)

To summarize here a filtered view of the try build on Treeherder:
https://treeherder.mozilla.org/jobs?repo=try&revision=120438321620088501cdf673ef7ca8188adb6f3f&searchStr=linux%2Cwayland

And especially it's this failure (as described in the bug's summary as well):
https://treeherder.mozilla.org/logviewer?job_id=525214290&repo=try&task=d1JcIagzSYeiCmglsbh0AQ.0&lineNumber=9413

[task 2025-09-02T11:18:33.746+00:00] 11:18:33     INFO - TEST-UNEXPECTED-FAIL | browser/components/sessionstore/test/marionette/test_restore_manually.py TestSessionRestoreManually.test_restore | AssertionError: 604 != 552 : Second window has been restored to the correct height.

With the patch from Emilio we now only increased both sides by exactly 52px.

Flags: needinfo?(stransky)

Sam, did the additional information help you to reproduce the failure?

Flags: needinfo?(sfoster)

(In reply to Henrik Skupin [:whimboo][⌚️UTC+2] from comment #31)

Sam, did the additional information help you to reproduce the failure?

I'm afraid not. I started a wayland session, and ran the browser/components/sessionstore/test/marionette/test_restore_manually.py test and it passes for me - with and without the patch.

Flags: needinfo?(sfoster)

Sam, you probably need to run this test multiple times in a row to get into this situation. You can use the --run-until-failure argument of the mach command for that. Does it reproduce for you then?

Flags: needinfo?(sfoster)

(In reply to Henrik Skupin [:whimboo][⌚️UTC+2] from comment #34)

Sam, you probably need to run this test multiple times in a row to get into this situation. You can use the --run-until-failure argument of the mach command for that. Does it reproduce for you then?

Actually I'm in the process of setting up a new laptop, and this does fail reliably on that machine. Maybe it was the higher screen resolution? Not sure. But I get (without any patches applied)

AssertionError: 1000 !== 500 : Second window has been restored to the correct height

With emilio's patch:

AssertionError: 604 !== 502 : Second window has been restored to the correct height

And with :scelements' patch:

TimeoutException: Timed out after 5.0 seconds with message: Waiting for resize second window. Expected 500, got 448.

Flags: needinfo?(sfoster)

Emilio, anything you can help with to clarify this above behavior?

Flags: needinfo?(emilio)

Would need to know at least Desktop Environment (GNOME / Ubuntu) and wayland vs. X11. Ideally a pernosco trace could shed light on what's going on there :)

Flags: needinfo?(emilio)

Emilio, maybe bug 1981073 is the underlying culprit here?

Flags: needinfo?(emilio)

Seems unlikely since comment 35 is not running on automation and that has the default behavior (no titlebar), right?

Flags: needinfo?(emilio)

What do you mean with running on automation? Where is the setting to change that behavior?

Flags: needinfo?(emilio)

(In reply to Henrik Skupin [:whimboo][⌚️UTC+2] from comment #40)

What do you mean with running on automation? Where is the setting to change that behavior?

Automation runs without XDG_CURRENT_DESKTOP, so it runs with a titlebar by default: https://searchfox.org/firefox-main/rev/931471037b8a4f16ce3ccfa864310540cac255c1/widget/gtk/nsWindow.cpp#9637

Flags: needinfo?(emilio)

Only conditionally tho...

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: