Closed Bug 1737499 Opened 3 years ago Closed 3 years ago

Startup crash: "RRGetScreenResourcesCurrent: BadRequest (invalid request code or no such operation)"

Categories

(Core :: Widget: Gtk, defect)

Firefox 95
x86_64
Linux
defect

Tracking

()

VERIFIED FIXED
95 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox-esr91 --- unaffected
firefox93 --- unaffected
firefox94 --- unaffected
firefox95 --- fixed

People

(Reporter: 6dnail, Assigned: rmader)

References

(Regression)

Details

(Keywords: nightly-community, regression)

Crash Data

Attachments

(5 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:92.0) Gecko/20100101 Firefox/92.0

Steps to reproduce:

First time simply allowed nightly to update
now, got the current tar file - same result

Actual results:

firefox crash. Current build 20211024212641

Expected results:

firefox 95 should have come up

I thought it started happening on 10/22 or maybe even 10/21; however, it appears the update which started the consistent crashes landed sometime prior to 9 PM on 10/23 (local time). I am at GMT-5 (Central US).

The Bugbug bot thinks this bug should belong to the 'Core::Widget: Gtk' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Widget: Gtk
Product: Firefox → Core

Running mozregression:
4:35.58 INFO: Last good revision: 9226c23abfe35c9cea21fc73726568f86276c8af (2021-10-22)
4:35.58 INFO: First bad revision: 226ea4af4493332137b513239b5c4506ee14dffe (2021-10-23)
When bad, return code is 11.
When it gets down to a specific date and switches to taskcluster, no additional versions are attempted. There are many messages, the last few are as follows:
5:07.08 WARNING: Skipping build 5e4047061e46: Unable to find build info using the taskcluster route 'gecko.v2.mozilla-central.revision.5e4047061e46c5cb86d1ef694bc206fc8f4e7d20.firefox.linux64-opt'
5:07.08 CRITICAL: First build 9226c23abfe3 is missing, but mozregression can't find a build before - so it is excluded, but it could contain the regression!
5:10.82 WARNING: Skipping build 8f78ed673519: Unable to find build info using the taskcluster route 'gecko.v2.mozilla-central.revision.8f78ed673519de2dd9d88ef0e6ee7466eeeee667.firefox.linux64-opt'
5:11.03 WARNING: Skipping build 33d6ea7b0aad: Unable to find build info using the taskcluster route 'gecko.v2.mozilla-central.revision.33d6ea7b0aadaa1fcdb03690d82cf9459764a9d7.firefox.linux64-opt'
5:11.04 CRITICAL: Last build 226ea4af4493 is missing, but mozregression can't find a build after - so it is excluded, but it could contain the regression!
5:14.45 WARNING: Skipping build 54d250f01e8b: Unable to find build info using the taskcluster route 'gecko.v2.mozilla-central.revision.54d250f01e8ba17a8526c1a64b21157da65d9619.firefox.linux64-opt'
5:14.46 INFO: There are no build artifacts on inbound for these changesets (they are probably too old).

It seems the latest crash report submitted (during mozregression) was bp-90203c7a-f1e7-41ce-9997-d9ee60211025.
This error is specific to remote desktop (X11), which is extremely important for a multi-user Ubuntu installation.

Flags: needinfo?(6dnail)

(In reply to Lee McFarland from comment #4)

Running mozregression:
4:35.58 INFO: Last good revision: 9226c23abfe35c9cea21fc73726568f86276c8af (2021-10-22)
4:35.58 INFO: First bad revision: 226ea4af4493332137b513239b5c4506ee14dffe (2021-10-23)

= https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=9226c23abfe35c9cea21fc73726568f86276c8af&tochange=226ea4af4493332137b513239b5c4506ee14dffe

  1. Which Linux distribution, version and desktop environment are you using?

  2. Please open about:support, click on "Copy text to clipboard" and paste it here.
    (From this one for example: mozregression --launch 2021-10-21 -a about:support)

  3. Please attach the console output of $ xrandr.

Crash Signature: [@ X11Error ]
Has Regression Range: --- → yes
OS: Unspecified → Linux
Regressed by: 1640779
Hardware: Unspecified → x86_64
Summary: firefox 95 (nightly) crashes upon start every time since mid last week → Startup crash: "RRGetScreenResourcesCurrent: BadRequest (invalid request code or no such operation)"

Running Ubuntu version 16.04. Desktop is XFCE4.
The about:support is from a mozregression run using good=2021-10-21 because a current version is not able to show about:support due to the crashes.
About:support (of a version running on X11 prior to crash-causing change) has been placed as requested

From the xrandr command:
SZ: Pixels Physical Refresh
*0 1280 x 768 ( 339mm x 203mm ) *60
Current rotation - normal
Current reflection - none
Rotations possible - normal
Reflections possible - none

Description Mesa GLX Indirect (See failure log)
Vendor ID 0x0000
Device ID 0x0000
Driver Vendor mesa/software-unknown
Driver Version 4.0.4.0

IIUC, indirect GLX only supports OpenGL 1.4.
WebRender requires OpenGL 3.2 or GLES 3.

Is this indirect GLX on your computer or are you using Firefox/X11 over SSH?

Can the crash be fixed by starting Firefox with MOZ_ACCELERATED=0 environment variable?
$ MOZ_ACCELERATED=0 ./firefox
$ MOZ_ACCELERATED=0 mozregression --launch 2021-10-24 -a about:support

I think MOZ_ACCELERATED=0 would have no effect, it seems to be blocked already:

HW_COMPOSITING
available by default
blocked by env: Acceleration blocked by platform

bug 1640779 apparently enabled xrandr software vsync for software rendering.
It doesn't check for gfxConfig::IsEnabled(Feature::HW_COMPOSITING) and there is no early return for gfx::gfxVars::UseSoftwareWebRender().

I don't understand indirect GLX. I am not using X11 over SSH (to my knowledge). The clients are running remote desktop from XP systems.

My multi-user Ubuntu system runs the production [current] version of Firefox, except that I personally run a copy of Nightly for general browsing. My copy of Nightly is launched via an icon which runs the command:
/home/neil/firefoxtest/firefox/firefox -no-remote -P test

I just ran
MOZ_ACCELERATED=0 /home/neil/firefoxtest/firefox/firefox -no-remote -P test


[GFX1-]: glxtest: libEGL initialize failed
[GFX1-]: glxtest: X error, error_code=1, request_code=149, minor_code=32
[GFX1-]: glxtest: process failed (exited with status 1)
ExceptionHandler::GenerateDump cloned child 9737
ExceptionHandler::SendContinueSignalToChild sent continue signal to child
ExceptionHandler::WaitForContinueSignal waiting for continue signal...

Now I ran
MOZ_ACCELERATED=0 mozregression --launch 2021-10-24 -a about:support
0:00.44 WARNING: You are using mozregression version 2.3.9, however version 4.0.18 is available.
0:00.44 WARNING: You should consider upgrading via the 'pip install --upgrade mozregression' command.
0:01.80 INFO: Using local file: /home/neil/moztest/2021-10-24--mozilla-central--firefox-95.0a1.en-US.linux-x86_64.tar.bz2
0:01.80 INFO: Running mozilla-central build for 2021-10-24
0:24.70 INFO: Launching /tmp/tmp7jDdB6/firefox/firefox
0:24.70 INFO: Application command: /tmp/tmp7jDdB6/firefox/firefox about:support -profile /tmp/tmp0VIggU.mozrunner
0:24.71 INFO: application_buildid: 20211024212641
0:24.71 INFO: application_changeset: 33d6ea7b0aadaa1fcdb03690d82cf9459764a9d7
0:24.71 INFO: application_name: Firefox
0:24.71 INFO: application_repository: https://hg.mozilla.org/mozilla-central
0:24.71 INFO: application_version: 95.0a1

0:25.56 WARNING: Process exited with code 11

Finally I ran
MOZ_ACCELERATED=0 mozregression --launch 2021-10-22 -a about:support
which did bring up nightly at the about:support page

sorry about the large lettering - I did copy/paste from a command line. Don't know why this would cause some lines to be quite large/bold

Lee, could you post the output of xrandr -v? Comment 8 looks like the server in your case only supports version 1.2 or lower, while we officially require 1.4 (which is some 10 years old or so). RRGetScreenResourcesCurrent() is 1.3 API.

Flags: needinfo?(6dnail)

(In reply to Robert Mader [:rmader] from comment #14)

that command returns
xrandr program version 1.5.0
Server reports RandR version 1.1

Firefox 95 was working until someone entered a possible fix for bug 1640779. Now nightly isn't working. I'm using Ubuntu 16.04, which isn't that far out of date. I have no idea how something like xrandr could be upgraded without going to another release of Ubuntu. Unless the fix which causes this issue is withdrawn/redone, there is probably a significant part of the Ubuntu user community which would not be able to go beyond firefox 94.

Flags: needinfo?(6dnail)

(In reply to Lee McFarland from comment #15)

that command returns
xrandr program version 1.5.0
Server reports RandR version 1.1

This is definitely not what you'd see on a "normal" setup.
Quoting the Debian wiki:

The RandR 1.2 extension first appeared in Xserver 1.3 (i.e DebianTesting, since 2007-08-21).

That's more than 14 years :/

there is probably a significant part of the Ubuntu user community which would not be able to go beyond firefox 94.

I doubt that a significant part of the Ubuntu community uses such setups. To the contrary, I assume it's extremely rare these days. Such a setup would e.g. not offer any WebGL capabilities etc.
AFAIK remote X11 is far behind modern remote desktop alternatives. If you run any modern software, it's unlikely to properly support Xrender - and without proper Xrender support, you likely only get disadvantages with remote X11 compared to VNC or RDP.

I'll create a patch that adds an xrandr server version check to avoid the issue here, but please, move away from such dead tech :/

(In reply to Robert Mader [:rmader] from comment #16)

Thank you,
The patch is much appreciated and needed. As for my situation, if possible I'd like to move the base system forward when the next long term Ubuntu version comes out in April 2022. Typically there are several weeks of significant activity involved in such a move so that would mean a June, maybe May 2022 finish, which is almost certainly after Firefox 95 becomes the official release.

(In reply to Lee McFarland from comment #17)

Sure, will make sure to have a patch in 95 (in the coming days).
Concerning your update plans: AFAICS a simple update to e.g. Ubuntu 22.04 won't change anything on this issue because the root problem is that remote X11 got pretty much abandoned a long time ago (stuck on GL 1.4 / xrandr 1.1). So maybe the switch to 22.04 is a good chance to also move to VNC or RDP - with the nice side effect that it would also support Wayland and thus be more future proof. But I'm starting to repeat myself, sorry :)

While we require libxrandr >= 1.4 to be present, the actual server
version can be lower. On remote X11 it appears to be stuck at 1.1,
crashing when trying to use API from higher version.

There are still setups using remote X11 out there, so add some server
version checks.
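
As an illustration of the idea in the patch description above (not the actual Gecko code, which is C++ and talks to the X server directly), here is a minimal Python sketch that reads the server-side RandR version from `xrandr -v` output and decides whether a 1.3 request such as RRGetScreenResourcesCurrent() is safe to issue. The function names and the `MIN_SERVER_VERSION` constant are hypothetical; only the output format and the 1.3 requirement are taken from this bug.

```python
import re

# RRGetScreenResourcesCurrent() is RandR 1.3 API (see the comment above),
# so the *server* must report at least this version.
# MIN_SERVER_VERSION is a name made up for this sketch.
MIN_SERVER_VERSION = (1, 3)


def parse_server_randr_version(xrandr_output):
    """Return (major, minor) from `xrandr -v` output, or None if not found."""
    match = re.search(r"Server reports RandR version (\d+)\.(\d+)", xrandr_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def server_supports_current_resources(xrandr_output):
    """True only if the server version is known and at least 1.3."""
    version = parse_server_randr_version(xrandr_output)
    # Tuples compare numerically, so (1, 10) correctly counts as newer than (1, 3).
    return version is not None and version >= MIN_SERVER_VERSION


# Output as posted by the reporter in this bug.
reported = "xrandr program version 1.5.0\nServer reports RandR version 1.1\n"
print(server_supports_current_resources(reported))
```

With the reporter's output the check fails, which is the case where the newer API must be skipped. Note that the client library version (1.5.0 here) says nothing about what the server supports.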

Assignee: nobody → robert.mader

We can't use refresh rates far below the default one (60Hz) because
otherwise random CI tests start to fail. However, many users have
screens just below the default rate, e.g. 59.95Hz. So slightly lower
the lower bound.

This is a drive-by optimization with no direct relationship to the bug.

Depends on D129522
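
The bound change described in this second patch can be pictured roughly as follows. This is only a sketch under stated assumptions: the patch description does not say how far the lower bound was moved, so `TOLERANCE_HZ` is a hypothetical value, and `effective_vsync_rate` is a made-up name rather than a Gecko function.

```python
DEFAULT_RATE_HZ = 60.0

# Hypothetical tolerance; the real value used by the patch is not stated here.
TOLERANCE_HZ = 1.0


def effective_vsync_rate(reported_hz):
    """Use the reported rate unless it is well below the default rate."""
    if reported_hz >= DEFAULT_RATE_HZ - TOLERANCE_HZ:
        # Covers screens just under the default, e.g. 59.95 Hz,
        # as well as anything at or above 60 Hz.
        return reported_hz
    # Rates far below the default fall back to the default rate.
    return DEFAULT_RATE_HZ


for rate in (59.95, 30.0, 144.0):
    print(rate, effective_vsync_rate(rate))
```

Under these assumptions a 59.95 Hz screen keeps its own rate, while a 30 Hz report still falls back to 60 Hz.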

People running firefox nightly are inclined to be more up to date on the version of their operating system and related pieces than the general user community. When a crash comes about due to an assumption that the software level of other items is not that many years out from current, it may not be considered significant to the nightly test community, if it's even caught. Over about six months, the version of firefox seen on nightly eventually becomes the release version of firefox. Very soon after this happens, millions of firefox users will begin using this version. This production-level community may be more likely to use older software than the nightly users, and they are less likely to have the knowledge, background or time needed to resolve a browser crash.

Some time back, firefox was changed to make automatic update a default. Many in the general population of firefox users don't know how to revert to a previous version. To them, a hard crash is catastrophic.

People running firefox nightly are inclined to be more up to date on the version of their operating system and related pieces than the general user community.

My point is that it's not about some library version - it's about the used technology. Remote X11 has been deprecated for a very long time now. Its technological level is that of almost two decades ago (OpenGL 2.0 was released in 2004 and is not supported over remote GLX). Firefox can not support WebGL 1.0 on it and most desktop environments probably also don't start.
In other words: there are chances that a regular install of the very first Ubuntu version (4.10 in 2004) was technologically on the same level of what you are using there (GL 1.4 / xrandr 1.1) :/
It's simply dead. Please stop using it.

Pushed by robert.mader@posteo.de:
https://hg.mozilla.org/integration/autoland/rev/19bf6c292686
Check xrandr server version before using it, r=stransky
https://hg.mozilla.org/integration/autoland/rev/fe38b4ed2ef5
Allow refresh rates slightly below the default rate, r=stransky
Status: UNCONFIRMED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 95 Branch

Lee, once the new nightly is available, can you give it a quick test, confirm the issue is fixed, and attach the about:support? Thanks!

Flags: needinfo?(6dnail)

Thanks, it is working now.
In addition to the about:support just attached:
xrandr -v
xrandr program version 1.5.0
Server reports RandR version 1.1

Since this incident I am now actively planning to upgrade when Ubuntu 22.04 comes out. With Firefox, I usually wait a few days after a new release before installing the latest for 'production'. With a whole operating system bundle perhaps an appropriate wait time would be more like two weeks.

Flags: needinfo?(6dnail)
Status: RESOLVED → VERIFIED
See Also: → 1739476
Blocks: 1739476
See Also: 1739476

Comment on attachment 9257567 [details]
WIP: Bug 1737499 - Check xrandr server version before using it

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: This silences some noisy warnings, see bug 1739476
  • User impact if declined: See above
  • Fix Landed on Version: 95
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The newly added code has run on every start since 95, is self-contained, and has not caused any issues so far. The small difference from the original patch is the loading of some symbols. That code is trivial, i.e. it either works or it doesn't, and I tested that it works.
Attachment #9257567 - Flags: approval-mozilla-esr91?

Comment on attachment 9257567 [details]
WIP: Bug 1737499 - Check xrandr server version before using it

Approved for 91.5esr, but I'm going to land it under bug 1739476 since I think that makes more sense from a tracking standpoint.

Attachment #9257567 - Flags: approval-mozilla-esr91? → approval-mozilla-esr91+