Closed Bug 1820896 Opened 3 years ago Closed 3 years ago

[X11] Firefox hangs at XGetWindowProperty() when running with modified stderr/stdout

Categories

(Core :: Widget: Gtk, defect, P2)

Firefox 110
defect

Tracking

()

VERIFIED FIXED
114 Branch
Tracking Status
firefox112 --- wontfix
firefox113 --- verified
firefox114 --- verified

People

(Reporter: ache, Assigned: stransky)

Details

Attachments

(2 files)

Attached image ff_link_discord_bug.png

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/110.0

Steps to reproduce:

The bug has been present since at least September.
To reproduce the bug, just open a link from Discord or Slack.

I have xdg-desktop-portal-gtk started and working. Opening the link with xdg-open from the shell will NOT reproduce the bug, even though that is exactly what Discord does.

Actual results:

Firefox hangs forever: it never opens a window, but it does set a lock in /tmp/firefox/lock.
You cannot open a new instance of Firefox because “Firefox is already running but not responding ...”. You have to kill it by sending SIGKILL (pkill, kill, whatever).

Expected results:

The instance should open itself and load the address.

There is one workaround: open Firefox before opening the link in Discord/Slack.

There is also one hot fix that may help to understand the bug. To make it work, you can edit the /usr/bin/firefox script that launches Firefox (distro dependent) to redirect standard error to /dev/null (or to any file, in fact). You may need to edit the firefox.desktop from your distribution so that it runs /usr/bin/firefox instead of the binary.

tl;dr:
First, edit /usr/bin/firefox to redirect stderr to /dev/null.
Then check that firefox.desktop executes the script from your distribution and not the binary directly. So you must have Exec=/usr/bin/firefox %u instead of Exec=/usr/lib/firefox %u.

I am attaching a conversation that took place on the Framework Discord. Please don't mind the language.

From that conversation, you can see that I wasn't alone with that bug and that it happens in various distinct environments.

I'm using Arch Linux with Awesome WM and Firefox in its latest version (but the bug has been here for a long time).
Other users reported the bug on Gentoo with i3.

Common point: Linux without a DE, I guess.

> To reproduce the bug, just open a link from Discord or Slack.

I'm sorry, I forgot to say that you must not have any Firefox instance open yet.

To reproduce the bug, open a link from Discord or Slack, but make sure that no instance of Firefox is already open.

I tried to reproduce this issue using the Discord and Slack desktop apps on Ubuntu 18.04, but Firefox always opens without issues, even if I close Firefox and immediately double-click a different Slack link. I also set Firefox Nightly as the default browser, but I couldn't reproduce it there either.

I will set the component for this issue and try to get my hands on an Arch Linux machine.

QA Whiteboard: [qa-not-actionable]
Component: Untriaged → Widget: Gtk
Product: Firefox → Core

OK, so I have updated my computer to Firefox 111.0.1, and so I had to set up my fix again.

But I took the time to test other WMs and DEs.

So now I can confirm that the bug is reproducible with MATE but not reproducible with gnome-shell.
I don't know why, but with gnome-shell it works without the fix.

About firefox.desktop: the file used by xdg-open can be in ~/.local/share/applications or installed system-wide in /usr/share/applications.

But the bug is still here.

If I can be helpful in any way, just ask.

No pressure, this is just to keep everyone at the same level of information.
On the Arch Linux forum, someone encountered the bug: https://bbs.archlinux.org/viewtopic.php?pid=2094112

OK, you should be able to reproduce the bug on any Xorg WM (even GNOME Shell) with this command:

xdg-open https://ache.one >&- 2>&-

Or even just with:
firefox >&- 2>&- # Firefox will not start; Chromium will

I would like to thank the Arch Linux community, who worked to narrow down the bug.

@Darkspirit: I'm sorry, but none of these bugs seem related. The nearest is Bug 1804180, but it seems to be a graphical problem. Here it's a file descriptor problem.

> OK, you should be able to reproduce the bug on any Xorg WM (even GNOME Shell) with this command:

It is also possible to reproduce the problem on Wayland (I tried Sway) with

xdg-open https://ache.one >&- 2>&-

Here are the two possible ways to resolve this issue. I think we should tackle both:

  1. We should ensure that xdg-open calls the subprogram with stdout and stderr open (even if they point to /dev/null). (Out of scope for this ticket.)
  2. We should ask whether Firefox should hang forever whenever it is called with closed stderr and stdout. Personally, I think that hanging is the worst option available. Some acceptable ways to handle this non-ideal usage might be:
    • Ignoring it and continuing (as other browsers do).
    • Displaying a message box.

I will investigate the xdg-open point. On the Firefox side, the ball is in your court; you have my opinion.
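
For point 2, a program could notice the situation at startup by probing descriptors 0-2 before doing anything else. The following is only a minimal Python sketch of such a check, not what Firefox does: the helper name `closed_std_fds` is made up for illustration, and the child process closes its own stdout and stderr to stand in for the shell's `>&- 2>&-`.

```python
import subprocess
import sys

# Code run in a child process so the parent's own stdio stays untouched.
CHILD_CHECK = """
import os

def closed_std_fds():
    # Hypothetical helper: fstat() fails with EBADF on a closed descriptor.
    closed = []
    for fd in (0, 1, 2):
        try:
            os.fstat(fd)
        except OSError:
            closed.append(fd)
    return closed

# Simulate being launched as `prog >&- 2>&-`.
os.close(1)
os.close(2)

# Report the result as a bitmask in the exit status (bit n set = fd n closed).
os._exit(sum(1 << fd for fd in closed_std_fds()))
"""

def closed_mask_in_child() -> int:
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CHECK],
        stdin=subprocess.DEVNULL,   # make sure fd 0 is open in the child
        stdout=subprocess.DEVNULL,  # fds 1 and 2 start open; the child closes them
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode

if __name__ == "__main__":
    print(closed_mask_in_child())
```

With stdout and stderr closed, bits 1 and 2 are set, so the reported mask is 6. A real program could use the same information to continue silently or to show a message instead of hanging.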

Priority: -- → P3
Summary: Firefox hangs when a Discord/Slack link is open and no instance is already open. → [Awesome WM] Firefox hangs when a Discord/Slack link is open and no instance is already open.

Do I understand correctly that Firefox is launched but frozen somewhere? In that case, please try to get a backtrace of the freeze:
https://fedoraproject.org/wiki/How_to_debug_Firefox_problems#Getting_Mozilla_crash_report_from_running_or_frozen_Firefox
Thanks.

Flags: needinfo?(ache)

Firefox doesn't show any window.
It just hangs before anything happens. “Frozen” would have meant that I have an interface; I don't.

Here is the way to reproduce the problem from any Linux setup:

  1. Close every instance of Firefox
  2. Start Firefox with closed stderr and stdout:
firefox >&- 2>&-

Or

xdg-open https://ache.one >&- 2>&-

I used my own website here; any URL works.

Here is the Crash Id: bp-9ac08691-ccfa-4546-b34f-f7cb90230411

https://crash-stats.mozilla.org/report/index/9ac08691-ccfa-4546-b34f-f7cb90230411

Flags: needinfo?(ache)

It freezes in GetDisplayICCProfile().

Summary: [Awesome WM] Firefox hangs when a Discord/Slack link is open and no instance is already open. → [Awesome WM] Firefox hangs when a Discord/Slack link is open and no instance is already open. Frozen @ GetDisplayICCProfile
Summary: [Awesome WM] Firefox hangs when a Discord/Slack link is open and no instance is already open. Frozen @ GetDisplayICCProfile → [Awesome WM][GetDisplayICCProfile()] Firefox hangs when a Discord/Slack link is open and no instance is already open.

Cool!

Following https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/3.5/ICC_color_correction_in_Firefox I tried setting gfx.color_management.mode to 0, but it still freezes.

If I call colormgr (with xiccd started), I get:

$ colormgr get-devices
Object Path:   /org/freedesktop/ColorManager/devices/xrandr_BOE_ache_1000
Owner:         ache
Created:       avril 12 2023, 01:17:26 PM
Modified:      avril 12 2023, 01:17:26 PM
Type:          display
Enabled:       Yes
Embedded:      Yes
Vendor:        BOE
Seat:          seat0
Scope:         temp
Colorspace:    rgb
Device ID:     xrandr-BOE
Profile 1:     icc-32623111d73f7be9bdbfe120cbefe5bd
               /home/ache/.local/share/icc/edid-e8221f81f2d407880136a91143cb6820.icc
Metadata:      OutputEdidMd5=e8221f81f2d407880136a91143cb6820
Metadata:      OutputPriority=primary
Metadata:      XRANDR_name=eDP-1
Metadata:      OwnerCmdline=xiccd

Nothing special in journalctl -u colord.

Since the problem seems to be related to stderr and stdout, here is the output of both when I don't close them (no previous Firefox instance running):

$ firefox > stdout_firefox 2> stderr_firefox

And then:

$ cat stdout_firefox
[GFX1-]: glxtest: VA-API test failed: failed to initialise VAAPI connection.
console.error: "/home/ache/.mozilla/firefox/ry9hhjcq.ache/storage" "to-be-removed" 0 "" "Quota"
console.error: "Cache folder attempt no 1"
console.error: "couldn't find cache folder /home/ache/.mozilla/firefox/ry9hhjcq.ache/storage/to-be-removed"
console.error: "Pinged glean, waiting for submission."
$ cat stderr_firefox
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
*** You are running in background task mode. ***
*** You are running in headless mode.
[ERROR glean_core] Error setting metrics feature config: Json(Error("EOF while parsing a value", line: 1, column: 0))

I can send you the result of `strace firefox` but I'm not sure if it will help.

Ask me if I can do something.

Thanks, I can reproduce firefox >&- 2>&- freeze locally.
A workaround is to use Wayland but that may not be an option for everyone.
Will look at it.

Assignee: nobody → stransky
Priority: P3 → P2
Summary: [Awesome WM][GetDisplayICCProfile()] Firefox hangs when a Discord/Slack link is open and no instance is already open. → [X11] Firefox hangs when running with modified stderr/stdout
Summary: [X11] Firefox hangs when running with modified stderr/stdout → [X11] Firefox hangs at XGetWindowProperty() when running with modified stderr/stdout

Reserve the lower positions of the file descriptors to make sure
we don't reuse stdin/stdout/stderr in case they were closed
before launch.
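
The idea in the description above can be sketched in a few lines. This is a Python illustration under stated assumptions, not the landed C++ code: `reserve_std_fds` is a hypothetical name, and the child process closes fds 1 and 2 itself to imitate a launch with `>&- 2>&-`.

```python
import subprocess
import sys

# Code run in a child process so the parent's own stdio stays untouched.
CHILD_RESERVE = """
import os

def reserve_std_fds():
    # Hypothetical helper. open() returns the lowest free descriptor, so
    # keep opening /dev/null until we are handed something above 2. Every
    # descriptor <= 2 returned on the way was a free std slot and stays
    # filled; the first descriptor above 2 is not needed and is closed.
    while True:
        fd = os.open(os.devnull, os.O_RDWR)
        if fd > 2:
            os.close(fd)
            return

# Simulate being launched as `prog >&- 2>&-`.
os.close(1)
os.close(2)

reserve_std_fds()

# An unrelated open (standing in for e.g. a display connection) must now
# land above the stdin/stdout/stderr range.
fd = os.open(os.devnull, os.O_RDWR)
os._exit(fd)
"""

def first_fd_after_reserving() -> int:
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_RESERVE],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode

if __name__ == "__main__":
    print(first_fd_after_reserving())
```

The exact number reported depends on what else the process has open; the point is only that it can no longer be 0, 1 or 2, so code that writes to those descriptors ends up in /dev/null rather than in some other connection.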

Pushed by stransky@redhat.com: https://hg.mozilla.org/integration/autoland/rev/ead19aef6624 [Linux/X11] Reserve the lower positions of the file descriptors to avoid reuse of stdin/stdout/stderr r=emilio

The bug has a release status flag that shows some version of Firefox is affected, thus it will be considered confirmed.

Status: UNCONFIRMED → NEW
Ever confirmed: true
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 114 Branch

The patch landed in nightly and beta is affected.
:stransky, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox113 to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(stransky)

I don't think it requires any special process, since the “bug” has been here for years.
But it will certainly help people set Firefox as the default browser on Linux, since that was a pain before this fix.

Comment on attachment 9328316 [details]
Bug 1820896 [Linux/X11] Reserve the lower positions of the file descriptors to avoid reuse of stdin/stdout/stderr r?emilio

Beta/Release Uplift Approval Request

  • User impact if declined: Firefox fails when stdout/stderr is closed and fd 1/2 (stdout/stderr) is reopened for a different purpose (the X display connection, for instance) while some hardcoded routines still use the predefined file descriptors to write output to the console. So the display connection may be interfered with by logging, for instance. I suspect some tests may fail for this reason too.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Run Firefox on X11 as

./firefox 2>&- 1>&-

you'll get some sort of freeze/hang/crash.

  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): We replicate what Chrome does and what is already used for child processes, i.e. open 3 fds and redirect them to /dev/null. That makes sure fds 0-2 won't be reused, as they're open.
    open() always uses the lowest free fd.
  • String changes made/needed:
  • Is Android affected?: Yes
Flags: needinfo?(stransky)
Attachment #9328316 - Flags: approval-mozilla-beta?
Flags: qe-verify+
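
The descriptor reuse described in the approval request can be observed outside Firefox. The following is a minimal Python sketch, assuming a POSIX system: the child closes fds 1 and 2 itself (standing in for `>&- 2>&-`), then opens /dev/null as a placeholder for whatever the program opens next, such as the X display connection. The function name is invented for the example.

```python
import subprocess
import sys

# Code run in a child process so the parent's own stdio stays untouched.
CHILD_REUSE = """
import os

# Simulate being launched as `prog >&- 2>&-`.
os.close(1)
os.close(2)

# The next open() gets the lowest free descriptor, which is now the old
# stdout slot. Anything still writing to fd 1 would hit this file instead.
fd = os.open(os.devnull, os.O_RDWR)
os._exit(fd)
"""

def first_fd_after_closing_stdio() -> int:
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_REUSE],
        stdin=subprocess.DEVNULL,   # keep fd 0 occupied in the child
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode

if __name__ == "__main__":
    print(first_fd_after_closing_stdio())
```

The child reports 1: the freshly opened file took over the stdout descriptor, which is the collision the request describes between logging and the display connection.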

Comment on attachment 9328316 [details]
Bug 1820896 [Linux/X11] Reserve the lower positions of the file descriptors to avoid reuse of stdin/stdout/stderr r?emilio

Approved for 113.0b5.

Attachment #9328316 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
QA Whiteboard: [qa-not-actionable] → [qa-not-actionable][qa-triaged]

Verified as fixed in our latest Nightly build. I was able to reproduce this issue in an older build using the ./firefox 2>&- 1>&- command. Thanks for those extra steps, everyone.

QA Whiteboard: [qa-not-actionable][qa-triaged] → [qa-triaged]

Verified as fixed in our latest Beta 113.0b5.

Status: RESOLVED → VERIFIED
QA Whiteboard: [qa-triaged]
Flags: qe-verify+