Closed Bug 1874857 Opened 1 year ago Closed 1 year ago

wayland proxy connection failure after applying update, defaulting to xwayland (take 2)

Categories

(Core :: Widget: Gtk, defect)

Tracking

RESOLVED FIXED
124 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox121 --- unaffected
firefox122 --- unaffected
firefox123 + fixed
firefox124 --- fixed

People

(Reporter: emilio, Assigned: stransky)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(5 files)

Seems to still be an issue on the latest nightly. STR: Go to about:restartrequired, click restart.

Flags: needinfo?(stransky)

Don't you need to wait for the current nightly to see a restart ?

Yeah, I think so. Seems properly fixed, actually. Sorry for the noise.

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → WORKSFORME
Flags: needinfo?(stransky)

Maybe not? I just updated from 20240116050321 to 20240117092715, and the fix from https://bugzilla.mozilla.org/show_bug.cgi?id=1874717 was supposed to be in 20240116050321, yet after the update I'm on XWayland, with:

janv. 17 13:13:05 portable-alex Firefox-Nightly.desktop[3120016]: Wayland Proxy error: StartProxyServer(): bind() error : Adresse déjà utilisée
janv. 17 13:13:06 portable-alex Firefox-Nightly.desktop[3120016]: [GFX1-]: glxtest: Could not connect to wayland display, WAYLAND_DISPLAY=/run/user/1000/wayland-proxy-3120016

And 3120016 is the PID of the updated process, i.e., 20240117092715.

Status: RESOLVED → REOPENED
Flags: needinfo?(stransky)
Resolution: WORKSFORME → ---

Set release status flags based on info from the regressing bug 1743144

Depends on: 1875148

Okay, let's add more logging and look at it.

Updated to 20240117145030, still defaulting to Xwayland.

(In reply to Martin Stránský [:stransky] (ni? me) from comment #5)

Okay, let's add more logging and look at it.

Should I run with some MOZ_LOG to capture it?

Assignee: nobody → stransky

I managed to reproduce locally, thanks.

Flags: needinfo?(stransky)
Priority: -- → P2
Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/2916a04fa1be
[Wayland] Don't create wayland proxy if it's already running r=emilio
Status: REOPENED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 123 Branch

Applied updates:

  • 20240118095536 -> 20240120093931
  • 20240120093931 -> 20240121204046

Resulted in (20240120093931):

janv. 22 07:03:59 portable-alex Firefox-Nightly.desktop[3297629]: [3297629] Wayland Proxy [0x7fa7fc395670] Error: StartProxyServer(): bind() error : Adresse déjà utilisée
janv. 22 07:04:00 portable-alex Firefox-Nightly.desktop[3297629]: [GFX1-]: glxtest: Could not connect to wayland display, WAYLAND_DISPLAY=/run/user/1000/wayland-proxy-3297629

Then (20240121204046):

janv. 22 07:04:23 portable-alex Firefox-Nightly.desktop[3297629]: [Parent 3297629, IPC I/O Parent] WARNING: waitid failed pid:3297858 errno:10: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_posix.cc:244
janv. 22 07:04:23 portable-alex Firefox-Nightly.desktop[3297629]: [Parent 3297629, IPC I/O Parent] WARNING: waitid failed pid:3298900 errno:10: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_posix.cc:244
janv. 22 07:04:23 portable-alex Firefox-Nightly.desktop[3299070]: Error: ConnectToCompositor() connect() : Aucun fichier ou dossier de ce type
janv. 22 07:04:23 portable-alex Firefox-Nightly.desktop[3299070]: [3299070] Wayland Proxy [0x7fb4f4594020] Error: StartProxyServer(): bind() error : Adresse déjà utilisée
janv. 22 07:04:23 portable-alex Firefox-Nightly.desktop[3299070]: [GFX1-]: glxtest: Could not connect to wayland display, WAYLAND_DISPLAY=/run/user/1000/wayland-proxy-3299070

So still no pure Wayland after updates :(

Status: RESOLVED → REOPENED
Priority: P2 → --
Resolution: FIXED → ---
Target Milestone: 123 Branch → ---

Yes, looks like we really run the proxy twice in the same process after update/restart:

[9891] WaylandProxy [0x7f666038f890]: Created().
[9891] WaylandProxy [0x7f666038f890]: Init()
[9891] WaylandProxy [0x7f666038f890]: SetupWaylandDisplays() Wayland '/run/user/1000/wayland-0' proxy '/run/user/1000/wayland-proxy-9891'
[9891] WaylandProxy [0x7f666038f890]: Init() finished
[9891] WaylandProxy [0x7f666038f890]: SetWaylandProxyDisplay() WAYLAND_DISPLAY /run/user/1000/wayland-0
[9891] WaylandProxy [0x7f666038f890]: new child connection
WaylandProxy: ProxiedConnection::Init() OK
[9891] WaylandProxy [0x7f08d1a8f120]: Created().
[9891] WaylandProxy [0x7f08d1a8f120]: Init()
[9891] WaylandProxy [0x7f08d1a8f120]: SetupWaylandDisplays() Wayland '/run/user/1000/wayland-proxy-9891' proxy '/run/user/1000/wayland-proxy-9891'
[9891] Wayland Proxy [0x7f08d1a8f120] Error: StartProxyServer(): bind() error : Address already in use
[9891] WaylandProxy [0x7f08d1a8f120]: Init failed, exiting.
[9891] WaylandProxy [0x7f08d1a8f120]: terminated
[9891] WaylandProxy [0x7f08d1a8f120]: SetWaylandDisplay() WAYLAND_DISPLAY /run/user/1000/wayland-proxy-9891
Flags: needinfo?(stransky)

We're missing proxy reset at nsUpdateDriver.cpp / ApplyUpdate().
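
For readers following along: the `bind() error : Address already in use` line in the log above is what any second attempt to bind a Unix socket at an already existing path produces. Below is a minimal Python sketch of that behaviour. It assumes nothing about the actual Gecko code; `try_bind` and `demo_double_bind` are hypothetical helpers, and the path only mirrors the `wayland-proxy-<pid>` naming seen in the log.

```python
import errno
import os
import socket
import tempfile
from typing import Optional, Tuple


def try_bind(path: str) -> Tuple[Optional[socket.socket], int]:
    """Try to bind a listening Unix socket at `path`; return (socket, errno)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
        sock.listen(1)
        return sock, 0
    except OSError as exc:
        sock.close()
        return None, exc.errno


def demo_double_bind() -> Tuple[int, int]:
    """Bind the same per-PID path twice, as two proxy instances in one process would."""
    with tempfile.TemporaryDirectory() as runtime_dir:
        # Hypothetical path, named after the current PID like the sockets in the log.
        path = os.path.join(runtime_dir, f"wayland-proxy-{os.getpid()}")
        first, first_err = try_bind(path)
        second, second_err = try_bind(path)
        for sock in (first, second):
            if sock is not None:
                sock.close()
        return first_err, second_err
```

Since the path is derived from the PID, a process that keeps its PID across a restart and tries to start a second proxy would be expected to hit the second case.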

Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/1110669fd3d4
[Wayland] Use WAYLAND_DISPLAY_COMPOSITOR to save original Wayland compositor display and use it when it's available r=emilio
https://hg.mozilla.org/integration/autoland/rev/ac0fd1778048
[Wayland] Relax Wayland proxy management r=emilio
Status: REOPENED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 124 Branch
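
The idea named in the first commit message above (save the original compositor display and prefer it when available) could look roughly like the Python sketch below. Only the variable names `WAYLAND_DISPLAY` and `WAYLAND_DISPLAY_COMPOSITOR` come from this bug; the functions `compositor_display` and `install_proxy` and their exact logic are assumptions for illustration, not the landed patch.

```python
from typing import MutableMapping, Optional

DISPLAY_VAR = "WAYLAND_DISPLAY"
COMPOSITOR_VAR = "WAYLAND_DISPLAY_COMPOSITOR"


def compositor_display(env: MutableMapping[str, str]) -> Optional[str]:
    """Return the real compositor display, preferring the saved copy if present."""
    return env.get(COMPOSITOR_VAR) or env.get(DISPLAY_VAR)


def install_proxy(env: MutableMapping[str, str], proxy_path: str) -> None:
    """Point WAYLAND_DISPLAY at the proxy, saving the original display only once."""
    if not env.get(COMPOSITOR_VAR) and DISPLAY_VAR in env:
        # First proxy setup: remember where the real compositor lives.
        env[COMPOSITOR_VAR] = env[DISPLAY_VAR]
    env[DISPLAY_VAR] = proxy_path
```

With this shape, a second `install_proxy` call in a restarted process would still resolve the compositor to the original display instead of to the previous proxy path, which is the situation shown in the earlier log where both the Wayland and proxy displays were `/run/user/1000/wayland-proxy-9891`.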

(In reply to Narcis Beleuzu [:NarcisB] from comment #18)

https://hg.mozilla.org/mozilla-central/rev/1110669fd3d4
https://hg.mozilla.org/mozilla-central/rev/ac0fd1778048

Ok, this landed on 20240122155815 which I just updated to. Let's see the next update cycle.

Bad news! Just got the update to 20240123053648, and I see:

janv. 23 12:49:14 portable-alex Firefox-Nightly.desktop[935601]: [935601] Wayland Proxy [0x7f19af694f00] Error: StartProxyServer(): bind() error : Adresse déjà utilisée

That worried me a bit, but the about:support page reports the proper Wayland protocol.

However,

$ sockstat -lcu|grep wayland
root     gnome-shell          4359     unix   /run/user/1000/wayland-0
root     gnome-shell          4359     unix   /run/user/1000/wayland-0
root     Xwayland             9046     unix   @/tmp/.X11-unix/X0
root     Xwayland             9046     unix   /tmp/.X11-unix/X0
root     Xwayland             9046     unix   @/tmp/.X11-unix/X1
root     alacritty            9182     unix   /run/user/1000/Alacritty-wayland-0-9182.sock
root     alacritty            14268    unix   /run/user/1000/Alacritty-wayland-0-14268.sock
root     alacritty            14431    unix   /run/user/1000/Alacritty-wayland-0-14431.sock
root     alacritty            14511    unix   /run/user/1000/Alacritty-wayland-0-14511.sock
root     thunderbird-bin      346186   unix   /run/user/1000/wayland-proxy-346186
root     alacritty            1465933  unix   /run/user/1000/Alacritty-wayland-0-1465933.sock
root     gnome-text-edit      3028109  unix   /run/user/1000/wayland-0
$ ll /run/user/1000/wayland-proxy-*
srwxrwxr-x 1 alex alex 0 janv. 23 07:13 /run/user/1000/wayland-proxy-346186
srwxrwxr-x 1 alex alex 0 janv. 23 07:13 /run/user/1000/wayland-proxy-346437
srwxrwxr-x 1 alex alex 0 janv. 18 09:46 /run/user/1000/wayland-proxy-3933464

There's no proxy socket for PID 935601 (/home/alex/bin/firefox/firefox-bin).
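
The manual check above (comparing `wayland-proxy-<pid>` entries against running processes) can be scripted. This is a small Python sketch for POSIX systems; `pid_alive` and `proxy_sockets` are hypothetical helper names, and the script matches entries by file name only, without checking that they are actually sockets.

```python
import os
import re
from typing import Dict

PROXY_NAME = re.compile(r"^wayland-proxy-(\d+)$")


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def proxy_sockets(runtime_dir: str) -> Dict[int, bool]:
    """Map each wayland-proxy-<pid> entry in runtime_dir to whether <pid> is alive."""
    result: Dict[int, bool] = {}
    for name in os.listdir(runtime_dir):
        match = PROXY_NAME.match(name)
        if match:
            pid = int(match.group(1))
            result[pid] = pid_alive(pid)
    return result
```

Calling `proxy_sockets("/run/user/1000")` on the affected machine would list leftover proxy entries whose process is gone, and a running Firefox PID missing from the result would indicate a missing proxy node.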

After a manual restart, we still see wayland as a protocol but finally the proxy is here:

$ ll /run/user/1000/wayland-proxy-*
srwxrwxr-x 1 alex alex 0 janv. 23 07:13 /run/user/1000/wayland-proxy-346186
srwxrwxr-x 1 alex alex 0 janv. 23 07:13 /run/user/1000/wayland-proxy-346437
srwxrwxr-x 1 alex alex 0 janv. 18 09:46 /run/user/1000/wayland-proxy-3933464
srwxrwxr-x 1 alex alex 0 janv. 23 12:52 /run/user/1000/wayland-proxy-942880
$ sockstat -lcu|grep wayland
root     gnome-shell          4359     unix   /run/user/1000/wayland-0
root     gnome-shell          4359     unix   /run/user/1000/wayland-0
root     Xwayland             9046     unix   @/tmp/.X11-unix/X0
root     Xwayland             9046     unix   /tmp/.X11-unix/X0
root     Xwayland             9046     unix   @/tmp/.X11-unix/X1
root     alacritty            9182     unix   /run/user/1000/Alacritty-wayland-0-9182.sock
root     alacritty            14268    unix   /run/user/1000/Alacritty-wayland-0-14268.sock
root     alacritty            14431    unix   /run/user/1000/Alacritty-wayland-0-14431.sock
root     alacritty            14511    unix   /run/user/1000/Alacritty-wayland-0-14511.sock
root     thunderbird-bin      346186   unix   /run/user/1000/wayland-proxy-346186
root     firefox-bin          942880   unix   /run/user/1000/wayland-proxy-942880
root     alacritty            1465933  unix   /run/user/1000/Alacritty-wayland-0-1465933.sock
root     gnome-text-edit      3028109  unix   /run/user/1000/wayland-0

I'll look at it. The bug is present in beta now, let's backport it or disable proxy cache in beta.

(In reply to Martin Stránský [:stransky] (ni? me) from comment #21)

I'll look at it. The bug is present in beta now, let's backport it or disable proxy cache in beta.

Reproduced again when updating to 20240123215750. Let me know what kind of logging I can enable to help track down the problem.

Thanks, I can reproduce it:

[110262] WaylandProxy [0x7f743ea8e240]: Created().
[110262] WaylandProxy [0x7f743ea8e240]: Init()
[110262] WaylandProxy [0x7f743ea8e240]: SetupWaylandDisplays() Wayland '/run/user/1000/wayland-0' proxy '/run/user/1000/wayland-proxy-110262'
[110262] WaylandProxy [0x7f743ea8e240]: Init() finished
[110262] WaylandProxy [0x7f743ea8e240]: SetWaylandProxyDisplay() WAYLAND_DISPLAY /run/user/1000/wayland-0
[110262] WaylandProxy [0x7f743ea8e240]: new child connection
WaylandProxy: ProxiedConnection::Init() OK

[110262] WaylandProxy [0x7fe63458f780]: Created().
[110262] WaylandProxy [0x7fe63458f780]: Init()
[110262] WaylandProxy [0x7fe63458f780]: SetupWaylandDisplays() Wayland '/run/user/1000/wayland-0' proxy '/run/user/1000/wayland-proxy-110262'
[110262] Wayland Proxy [0x7fe63458f780] Error: StartProxyServer(): bind() error : Address already in use
[110262] WaylandProxy [0x7fe63458f780]: Init failed, exiting.
[110262] WaylandProxy [0x7fe63458f780]: terminated
[110262] WaylandProxy [0x7fe63458f780]: RestoreWaylandDisplay() WAYLAND_DISPLAY restored to wayland-0

From the log it looks like we create the proxy twice in the same process. I wonder how that is possible.

Status: RESOLVED → REOPENED
Flags: needinfo?(stransky)
Resolution: FIXED → ---
Status: REOPENED → ASSIGNED
Flags: needinfo?(stransky)

I think I have a solution: we should unset the Wayland proxy right after connection (and remove the proxy socket file node) so we use it for the primary connection only. That will prevent proxy re-use after execv or any other kind of restart. It also fixes the issues with signal handlers (sigsegv & co).
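
A short Python sketch of why removing the socket file node right after the primary connection should be safe: an already established Unix socket connection keeps working after the path is unlinked, while any later connect to that path fails. This is a generic illustration, not the Gecko implementation; `demo_unlink_after_connect` and the `wayland-proxy-demo` name are hypothetical.

```python
import os
import socket
import tempfile
from typing import Tuple


def demo_unlink_after_connect() -> Tuple[bytes, bool]:
    """Return (data received over the old connection, whether a late connect worked)."""
    with tempfile.TemporaryDirectory() as runtime_dir:
        path = os.path.join(runtime_dir, "wayland-proxy-demo")  # hypothetical name

        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(1)

        # The "primary" connection is made while the node still exists.
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)
        conn, _ = server.accept()

        # Remove the file node; the established connection is unaffected.
        os.unlink(path)

        client.sendall(b"ping")
        data = conn.recv(4)

        # A later client can no longer reach the proxy through the old path.
        late = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            late.connect(path)
            late_connected = True
        except FileNotFoundError:
            late_connected = False
        finally:
            late.close()

        for sock in (conn, client, server):
            sock.close()
        return data, late_connected
```

Unlinking the path also frees it, so a later bind at the same per-PID path would not fail because of a leftover node.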

Duplicate of this bug: 1874107
Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/13c835242a94
[Wayland] Use Wayland proxy for main display connection only r=emilio
https://hg.mozilla.org/integration/autoland/rev/52a9ae4a9d70
[Wayland] Don't use Wayland proxy if --display param is provided r=emilio
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED

Can you please check if the latest patches work for you?
Thanks.

Flags: needinfo?(stransky) → needinfo?(lissyx+mozillians)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #30)

Can you please check if the latest patches work for you?
Thanks.

Do we agree the update I should start from is running 20240128091600 ?

Flags: needinfo?(lissyx+mozillians)

(I'm on 20240125212731 right now)

(In reply to :gerard-majax from comment #32)

(I'm on 20240125212731 right now)

Frankly no idea. Feel free to test updates tomorrow.

\o/

(In reply to :gerard-majax from comment #31)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #30)

Can you please check if the latest patches work for you?
Thanks.

Do we agree the update I should start from is running 20240128091600 ?

Manually downloaded 20240128091600 (the staged update I restarted to was 20240127211938, and a new one was already available, so I probably would have skipped the one with the fix); about:support reports Wayland.

Applied update via our updater, new version is 20240128212206

  • still wayland reported
  • PID is 2630669
$ sockstat -lcu|grep wayland
root     gnome-shell          4359     unix   /run/user/1000/wayland-0
root     gnome-shell          4359     unix   /run/user/1000/wayland-0
root     Xwayland             9046     unix   @/tmp/.X11-unix/X0
root     Xwayland             9046     unix   /tmp/.X11-unix/X0
root     Xwayland             9046     unix   @/tmp/.X11-unix/X1
root     alacritty            9182     unix   /run/user/1000/Alacritty-wayland-0-9182.sock
root     alacritty            14268    unix   /run/user/1000/Alacritty-wayland-0-14268.sock
root     alacritty            14431    unix   /run/user/1000/Alacritty-wayland-0-14431.sock
root     alacritty            14511    unix   /run/user/1000/Alacritty-wayland-0-14511.sock
root     thunderbird-bin      346186   unix   /run/user/1000/wayland-proxy-346186
root     alacritty            1465933  unix   /run/user/1000/Alacritty-wayland-0-1465933.sock
root     firefox-bin          2630669  unix   /run/user/1000/wayland-proxy-2630669
root     gnome-text-edit      3028109  unix   /run/user/1000/wayland-0
root     alacritty            3091677  unix   /run/user/1000/Alacritty-wayland-0-3091677.sock

No error reported, but I don't see /run/user/1000/wayland-proxy-2630669?

(In reply to :gerard-majax from comment #35)

No error reported, but I don't see /run/user/1000/wayland-proxy-2630669?

Yes, that's correct. We delete the proxy socket right after main display init. That means we use the proxy for the main connection only; any other temporary connection (from gfxtest, for instance) uses the plain one. As we run content processes in headless mode, this is not a problem.

Comment on attachment 9373453 [details]
Bug 1874857 [Wayland] Don't create wayland proxy if it's already running r?emilio

Beta/Release Uplift Approval Request

  • User impact if declined: Firefox fails to run in Wayland mode after update.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Low risk as it's quite simple and straightforward.
  • String changes made/needed:
  • Is Android affected?: No
Attachment #9373453 - Flags: approval-mozilla-beta?
Attachment #9375736 - Flags: approval-mozilla-beta?
Attachment #9375737 - Flags: approval-mozilla-beta?
Attachment #9376591 - Flags: approval-mozilla-beta?
Attachment #9376592 - Flags: approval-mozilla-beta?

Comment on attachment 9373453 [details]
Bug 1874857 [Wayland] Don't create wayland proxy if it's already running r?emilio

Approved for 123 beta 5, thanks.

Attachment #9373453 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9375736 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9375737 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9376591 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9376592 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Duplicate of this bug: 1878829