Closed Bug 1789823 Opened 2 years ago Closed 2 years ago

Restoring window size might not work if there's an ongoing sizemode change

Categories

(Core :: Widget: Gtk, defect)

Firefox 104
defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox-esr91 --- unaffected
firefox-esr102 --- unaffected
firefox104 --- wontfix
firefox105 --- wontfix
firefox106 --- fixed
firefox107 --- fixed

People

(Reporter: whimboo, Assigned: emilio)

References

(Regressed 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(3 files)

Originally filed on GitHub as: https://github.com/mozilla/geckodriver/issues/2042

As reported the issue started to happen for Firefox 104 and there is only the changes from bug 1780212 that were done for this particular release and affecting window position/size handling.

From the related trace logs:

1662588749987	Marionette	DEBUG	0 -> [0,2,"WebDriver:GetWindowRect",{}]
1662588749988	Marionette	DEBUG	0 <- [1,2,null,{"x":0,"y":0,"width":1152,"height":921}]
1662588749988	webdriver::server	DEBUG	<- 200 OK {"value":{"x":0,"y":0,"width":1152,"height":921}}
2022-09-07 22:12:29 INFO Selenium <- {"value":{"x":0,"y":0,"width":1152,"height":921}}
[..]
1662588749989	Marionette	DEBUG	0 -> [0,3,"WebDriver:SetWindowRect",{"x":10,"y":10}]
1662588754990	Marionette	WARN	TimedPromise timed out after 5000 ms: stacktrace:
TimedPromise/<@chrome://remote/content/marionette/sync.js:236:24
TimedPromise@chrome://remote/content/marionette/sync.js:221:10
restoreWindow@chrome://remote/content/marionette/driver.js:3228:9
GeckoDriver.prototype.setWindowRect@chrome://remote/content/marionette/driver.js:1133:13
1662588755492	Marionette	DEBUG	0 <- [1,3,null,{"x":0,"y":0,"width":1152,"height":921}]

Interesting here is the TimedPromise error when restoring the window. So most likely after startup of Firefox its window is maximized, and there are issues with sizemodechange events. I wonder if that is actually related here.

Note that the referenced CI job runs under xvfb. Maybe there is a bug in Firefox?

Emilio could you please have a look? Maybe there is something obvious that you can see? Thanks.

:emilio, since you are the author of the regressor, bug 1780212, could you take a look? Also, could you set the severity field?

For more information, please visit auto_nag documentation.

Flags: needinfo?(emilio)

Can we get some MOZ_LOG=Widget:5 logs perhaps? That should help figure out what's going on.

Flags: needinfo?(emilio) → needinfo?(hskupin)

(Otherwise without a way to repro it's going to be somewhat hard)

I asked the reporter to run the tests with the environment variable set. Lets see when we will get the details. It should hopefully happen soon.

In the meantime lets discuss the actual change from the regressor:
https://hg.mozilla.org/integration/autoland/rev/598d4ad77f18

Formerly we explicitly checked the window state to be normal. But now we only rely on the sizemodechange event. Given the usage of the DebounceCallback which is set to a timeout of 250ms by default I wonder if maybe the event is delayed and we now return too early?

Or could this maybe a problem with xfvb and maybe no window manager being installed? But would it then have been passed before your patch?

Flags: needinfo?(hskupin) → needinfo?(emilio)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #4)

Formerly we explicitly checked the window state to be normal. But now we only rely on the sizemodechange event. Given the usage of the DebounceCallback which is set to a timeout of 250ms by default I wonder if maybe the event is delayed and we now return too early?

Maybe? The X server could take arbitrarily long to respond. Hard to say without being able to reproduce in general :/

Flags: needinfo?(emilio)

We got a reply. Here are the logs for the particular case:
https://github.com/SeleniumHQ/selenium/actions/runs/3042167001/jobs/4900053524#step:10:330

1663040185817	Marionette	DEBUG	0 -> [0,3,"WebDriver:SetWindowRect",{"x":10,"y":10}]
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: nsWindow::SetSizeMode 0
[Parent 2499: Main Thread]: D/Widget [7faac303a000]:     set normal
1663040190818	Marionette	WARN	TimedPromise timed out after 5000 ms: stacktrace:
TimedPromise/<@chrome://remote/content/marionette/sync.js:236:24
TimedPromise@chrome://remote/content/marionette/sync.js:221:10
restoreWindow@chrome://remote/content/marionette/driver.js:3228:9
GeckoDriver.prototype.setWindowRect@chrome://remote/content/marionette/driver.js:1133:13
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: GetScreenBounds 0,0 -> 1152 x 921, unscaled 0,0 -> 1152 x 921
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: nsWindow::Move to 10 10
[Parent 2499: Main Thread]: D/Widget [7faac303a000]:   size state is not normal, bailing
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: GetScreenBounds 0,0 -> 1152 x 921, unscaled 0,0 -> 1152 x 921
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: GetScreenBounds 0,0 -> 1152 x 921, unscaled 0,0 -> 1152 x 921
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: GetScreenBounds 0,0 -> 1152 x 921, unscaled 0,0 -> 1152 x 921
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: GetScreenBounds 0,0 -> 1152 x 921, unscaled 0,0 -> 1152 x 921
1663040191319	Marionette	DEBUG	0 <- [1,3,null,{"x":0,"y":0,"width":1152,"height":921}]

As it looks like Marionette is not able to restore the window state to normal (which is the reason why the timeout promise can be seen) and as result the window's position cannot be moved. Maybe due to a missing window manager the size state is always fullscreen and cannot be restored to normal?

Interesting as well is the following log output:
https://github.com/SeleniumHQ/selenium/actions/runs/3042167001/jobs/4900053524#step:10:236

[Parent 2499: Main Thread]: D/Widget [7faac303a000]: nsWindow::SetSizeMode 0
[Parent 2499: Main Thread]: D/Widget [7faac303a000]:     already set
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: nsWindow::Resize 1152.000000 921.000000
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: nsWindow::ResizeInt w:1152 h:921
[Parent 2499: Main Thread]: D/Widget [7faac303a000]:   ConstrainSize: w:1152 h;921
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: nsWindow::NativeMoveResize move 0 resize 1 to 0,0 -> 1152 x 921
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: nsWindow::DispatchResized() size [1152, 921]
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: nsWindow::SetSizeMode 2
[Parent 2499: Main Thread]: D/Widget [7faac303a000]:     set maximized
[Parent 2499: Main Thread]: D/Widget [7faac303a000]: nsWindow::GetCompositorWidgetInitData

So the window is clearly set into maximized state during startup of Firefox. And in case of maximized or minimized we currently call win.restore() to get the window into normal mode. Hereby a timeout happens after 5s that only gets printed to the log but I actually would like to see gone.

Then we have only a requested change to the window position (x and y) so that we end up with an EventPromise for MozUpdateWindowPos and calling window.moveTo(10, 10). A MozUpdateWindowPos event seems to get fired given that we then do not end-up awaiting the promise and hang in this Marionette command.

Sadly I miss any output for a window size to the Widget logs so it's impossible to tell the position and size of the window. Or do I miss something?

Flags: needinfo?(emilio)

The relevant bit in comment 6 is the "set normal", and then "size state is not normal, bailing". Basically, the compositor ignored our request to unmaximize the window. I think this probably happened before my patch but just got papered over. In fact that makes me thing that your suspected regressor is not really the regressor here. It's more likely that bug 1773665 or similar uncovered this.

But given those logs I don't think this is particularly a firefox bug... Is there any chance to properly bisect this?

Flags: needinfo?(emilio) → needinfo?(hskupin)

The failure happens in GitHub actions only and given by https://github.com/browser-actions/setup-firefox it only supports releases, beta, and DevEdition. There is no way to run a Nightly or a custom build.

Maybe we could replicate the action definition (https://github.com/SeleniumHQ/selenium/blob/trunk/.github/workflows/ci-ruby.yml#L85) for a Python job and then download the Firefox version by using mozdownload and mozinstall. But recent Python jobs for Selenium pass as well and don't seem to show the issue. I'll check with the reporter if we can build a simple GitHub action for Python that might replicate it.

Nevertheless bug 1773665 sounds at least like a good candidate.

So it's basically only the window positioning that is failing. Changing the window size works as expected for these selenium tests. Here the appropriate tests from the Python bindings:

https://github.com/SeleniumHQ/selenium/blob/trunk/py/test/selenium/webdriver/common/window_tests.py#L49-L82

I'm finally able to run a GitHub action job with a given Firefox build:
https://github.com/SeleniumHQ/selenium/actions/runs/3048276798/jobs/4913169728

Emilio, could you provide me a try build that I could use to test? I tried to cherry-pick the changes from bug 1780212 and apply these to the parent commit before bug 1773665 landed, but sadly there are merge conflicts in at least Window.cpp that I'm not able to resolve. Once I have a build I can re-run the Github actions job again.

Flags: needinfo?(hskupin) → needinfo?(emilio)

Wouldn't it be enough to revert bug 1780212, and if that is still broken then revert the other one on top? One patch is written on top of each other.

Flags: needinfo?(emilio) → needinfo?(hskupin)

Without bug 1780212 it works yes, but I thought it's not the underlying issue and you wanted to see if the other bug is the cause. So how to proceed now? Shall we revert the Marionette change for now to fix the regression?

Flags: needinfo?(hskupin) → needinfo?(emilio)

So I tried to run the Selenium tests with just the Marionette patch on bug 1780212 backed out and it works just fine:
https://github.com/SeleniumHQ/selenium/actions/runs/3055646011/jobs/4928938584

So I would like to know if we should go ahead and backout the Marionette patch only or all patches on bug 1780212. If https://hg.mozilla.org/integration/autoland/rev/598d4ad77f18 is enough we should do that ASAP so that we could get the fix into the 105 release (not sure yet if the last RC has already been built or not).

I mean, the marionette change is correct. It seems xvfb is not unmaximizing the window, unless I'm missing something...

I'd be surprised if backing out only the marionette bits works? If so, so be it I guess... But I'm still fairly sure my patch is strictly more correct than the backout :)

Flags: needinfo?(emilio)

So I've got a chance now to test on my local Linux machine with Linux Mint installed. And good news is that I can actually replicate the problem when running Marionette tests under xvfb. Here the command that is necessary to get this specific failure reproduced:

xvfb-run ./mach marionette-test --gecko-log - -vv testing/marionette/harness/marionette_harness/tests/unit/test_window_rect.py

With that I was able run run mozregression locally (ignoring the time intensive task to push to Github and let the action jobs running) and to my surprise and probably yours as well I have to say that this regression has nothing to do with bug 1780212! Instead it's bug 1780372 which is causing the problem:

6:20.13 INFO: Test command result: 0 (build is good)
 6:20.13 INFO: Narrowed integration regression window from [824b18fd, bf01dc27] (3 builds) to [81c9cc81, bf01dc27] (2 builds) (~1 steps left)
 6:20.13 INFO: No more integration revisions, bisection finished.
 6:20.13 INFO: Last good revision: 81c9cc81146b08ed739c7218b206d4bcfed1d9b3
 6:20.13 INFO: First bad revision: bf01dc27a9f3cad7d59ec3c65cd007b1338ba474
 6:20.13 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=81c9cc81146b08ed739c7218b206d4bcfed1d9b3&tochange=bf01dc27a9f3cad7d59ec3c65cd007b1338ba474

Now I'll check if such a build not including this changes also works in the Selenium's Github actions.

Regressed by: 1780372
No longer regressed by: 1780212

Hm, that's not what I actually expected:

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #13)

So I tried to run the Selenium tests with just the Marionette patch on bug 1780212 backed out and it works just fine:
https://github.com/SeleniumHQ/selenium/actions/runs/3055646011/jobs/4928938584

And that doesn't help locally. When I backout the change from mozilla-central I still get a failure (whereby no timeout in restore):

1663246520521	Marionette	DEBUG	3 -> [0,3,"WebDriver:SetWindowRect",{"x":10,"y":10,"height":null,"width":null}]
[Parent 140518: Main Thread]: D/Widget [7fe9eda70000]: nsWindow::SetSizeMode 0
[Parent 140518: Main Thread]: D/Widget [7fe9eda70000]:     set normal
[Parent 140518: Main Thread]: D/Widget [7fe9eda70000]: GetScreenBounds 0,0 -> 1152 x 921, unscaled 0,0 -> 1152 x 921
[Parent 140518: Main Thread]: D/Widget [7fe9eda70000]: nsWindow::Move to 10 10
[Parent 140518: Main Thread]: D/Widget [7fe9eda70000]:   size state is not normal, bailing
[Parent 140518: Main Thread]: D/Widget [7fe9eda70000]: GetScreenBounds 0,0 -> 1152 x 921, unscaled 0,0 -> 1152 x 921
[Parent 140518: Main Thread]: D/Widget [7fe9eda70000]: GetScreenBounds 0,0 -> 1152 x 921, unscaled 0,0 -> 1152 x 921
[Parent 140518: Main Thread]: D/Widget [7fe9eda70000]: GetScreenBounds 0,0 -> 1152 x 921, unscaled 0,0 -> 1152 x 921
[Parent 140518: Main Thread]: D/Widget [7fe9eda70000]: GetScreenBounds 0,0 -> 1152 x 921, unscaled 0,0 -> 1152 x 921
1663246526030	Marionette	DEBUG	3 <- [1,3,null,{"x":0,"y":0,"width":1152,"height":921}]

As the widget log shows we were setting a window state of normal but as it looks like the it wasn't set given that we still bail out.

Flags: needinfo?(emilio)

In my local case I assume that we do not get an actual event which triggers updating the mSizeMode? I'm not seeing a nsWindow::OnWindowStateEvent for ... log entry.

So I assume locally I'm affected by the following change which no longer sets the new state when a restore is requested but only when the event is fired: https://hg.mozilla.org/integration/autoland/rev/bf01dc27a9f3#l1.33

Well, it's not that surprising when you think about it. The restore wasn't being effective, but we kept carrying on as if it was.

In fact seems you were aware of some of these issues back in bug 1520302 :)

I can try to poke a bit to see if there's a sane way of making xvfb u maximize the window.

(In reply to Emilio Cobos Álvarez (:emilio) from comment #21)

Well, it's not that surprising when you think about it. The restore wasn't being effective, but we kept carrying on as if it was.

In fact seems you were aware of some of these issues back in bug 1520302 :)

Oh right. I totally forgot that one.

I can try to poke a bit to see if there's a sane way of making xvfb u maximize the window.

Given that this might take a while or might not be possible I wonder what minimal solution we could temporarily land on central right now and uplift to 105 to get it fixed for those releases. Then we can / could land a proper fix for 107.

I'm currently confused which of these changes actually would need to be backed-out to make Marionette working for me locally and remote in GitHub actions.

I was testing more on the Selenium repository and finally found a solution which seems to work pretty fine. As you said without a window manager we might miss these events. As such I tried to install fluxbox for the relevant GitHub Actions and that actually seems to help:

https://github.com/SeleniumHQ/selenium/pull/11025

I'll talk to the Selenium folks and propose this solution.

I don't think backing out is a good idea, unless it's adding a workaround in marionette. We're doing the right thing, as the window is still maximized.

It seems https://github.com/SeleniumHQ/selenium/pull/11025 is the right fix, but lmk if you disagree or you think there's something we should do on our side.

Flags: needinfo?(emilio)

Ah, I just realized that there is a case where right now we fail to properly restore the window. Let me attach some patches and you can let me know if they fix the issue on your end?

Attached file test-case.

If I run this on a maximized window, I get a maximized popup which is not great.

This is the listener for a size mode change, shouldn't need to set the
size mode again. While at it, match coding style better.

This doesn't change behavior now, but with the next patch it would.

Assignee: nobody → emilio
Status: NEW → ASSIGNED

Can you check if those patches help?

Flags: needinfo?(hskupin)

(It might not, in fact)

Pushed by ealvarez@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c9785cd100fc Remove redundant SetSizeMode call from AppWindow::SizeModeChanged. r=stransky https://hg.mozilla.org/integration/autoland/rev/1c24a2ccce19 More aggressively request size mode changes if there's a state change request in flight. r=stransky

Thanks Emilio! I'll have a look with the binaries as generated by taskcluster. I'll let you know once I have the results.

One point that I missed was as well that all the WebDriver tests as run with Chrome do not show this problem! As such using a window manager might not actually be the required step here.

Marking bug as leave-open for now.

Flags: needinfo?(hskupin)
Keywords: leave-open

Emilio, it doesn't look like that the problem is fixed:
https://github.com/SeleniumHQ/selenium/actions/runs/3082278554/jobs/4981834824

Flags: needinfo?(emilio)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #33)

One point that I missed was as well that all the WebDriver tests as run with Chrome do not show this problem! As such using a window manager might not actually be the required step here.

Well the tests would "pass" if we lied about the actual OS state, right? That's what we used to do and what Chrome probably does. But that's not great.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #34)

Emilio, it doesn't look like that the problem is fixed:
https://github.com/SeleniumHQ/selenium/actions/runs/3082278554/jobs/4981834824

Yeah, unfortunate not unexpected, if we're not getting the sizemode change at all...

Flags: needinfo?(emilio)

I finally got the tests running in Selenium's GitHub actions and that even with the affected 104 release:
https://github.com/SeleniumHQ/selenium/actions/runs/3084846363/jobs/4987477383

Basically needed steps are:

      - name: Install Fluxbox
        run: sudo apt-get -y install fluxbox
[..]
      - name: Start Xvfb
        run: Xvfb :99 &
      - name: Start Fluxbox
        run: fluxbox -display :99 &

I'll prepare a fix for all the Selenium bindings and will re-enable all formerly failing tests, which were broken because of a missing window manager.

Emilio, can you please move this bug to an appropriate component outside of Marionette please? And I assume it can be marked as fixed with an updated bug's summary.

Flags: needinfo?(emilio)
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Component: Marionette → Widget: Gtk
Flags: needinfo?(emilio)
Product: Testing → Core
Resolution: --- → FIXED
Summary: "WebDriver:SetWindowRect" sometimes fails to move the window to the specified coordinates → Restoring window size might not work if there's an ongoing sizemode change
Regressions: 1795804
Regressions: 1797463
Regressed by: 1809694
No longer regressed by: 1809694
Regressions: 1809694
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: