Closed Bug 626192 Opened 9 years ago Closed 9 years ago

firefox crashes when attempting this URL (XF86DRIQueryVersion) [@ libGL.so.1.2@0x6d5a0 ][@ libGL.so.1.2@0x73f58 ][@ libGL.so.1.2@0x7714c ][@ libGL.so.1.2@0x5b38e ]

Product: Core
Component: Graphics
Type: defect
Severity: critical
Priority: Not set
Hardware: All
OS: Linux
Status: RESOLVED FIXED
Target Milestone: mozilla5
Tracking Status: blocking2.0 .x+
Reporter: roylance
Assignee: karlt
Keywords: crash, user-doc-needed
Attachments: 7 files, 1 obsolete file

User-Agent:       Mozilla/5.0 (X11; Linux i686; rv:2.0b9) Gecko/20100101 Firefox/4.0b9
Build Identifier: Mozilla/5.0 (X11; Linux i686; rv:2.0b9) Gecko/20100101 Firefox/4.0b9

Trying to navigate to http://abc.net.au/tv/ and http://abc.net.au/tv/guide/ makes Firefox close unexpectedly.

Firefox 3.6.13 works.

Reproducible: Always

Steps to Reproduce:
1. navigate to http://abc.net.au/tv/
2. firefox closes unexpectedly

Actual Results:  
firefox closes unexpectedly

Expected Results:  
web page should open

firefox/3.6.13  works
STR is 100% reproducible for me with a 2011-01-14 nightly on Ubuntu 10.10
bp-635d7364-fce1-486c-965e-5890f2110116
bp-19255e3e-3582-4c2c-ba43-bac862110116
Severity: normal → critical
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: crash
Summary: firefox crashes when attempting this URL → firefox crashes when attempting this URL [@ libGL.so.1.2@0x5b38e ]
Checked the page with Mozilla/5.0 (Windows NT 6.0; rv:2.0b9) Gecko/20100101 Firefox/4.0b9 on a Windows Vista instance in VirtualBox 4.0, and the page loads.
#0  XF86DRIQueryVersion () from /usr/lib/fglrx/libGL.so.1
#1  XF86DRIQueryExtension () from /usr/lib/fglrx/libGL.so.1
#2  ?? () from /usr/lib/fglrx/libGL.so.1
#3  ?? () from /usr/lib/fglrx/libGL.so.1
#4  glXQueryVersion () from /usr/lib/fglrx/libGL.so.1
#5  mozilla::gl::GLXLibrary::EnsureInitialized (this=0x7fffe9714f30) at gfx/thebes/GLContextProviderGLX.cpp:167
#6  mozilla::gl::CreateOffscreenPixmapContext (aSize=, aFormat=..., aShare=-265895936) at gfx/thebes/GLContextProviderGLX.cpp:552
#7  mozilla::gl::GLContextProviderGLX::CreateOffscreen (aSize=..., aFormat=...) at gfx/thebes/GLContextProviderGLX.cpp:672
#8  mozilla::WebGLContext::SetDimensions (this=0x7fffd22c1c00, width=300, height=150) at content/canvas/src/WebGLContext.cpp:480
#9  nsHTMLCanvasElement::UpdateContext (this=0x7fffd24a0f20, aNewContextOptions=0x0) at content/html/content/src/nsHTMLCanvasElement.cpp:611
#10 nsHTMLCanvasElement::GetContext (this=0x7fffd24a0f20, aContextId=, aContextOptions=..., aContext=) at content/html/content/src/nsHTMLCanvasElement.cpp:534
#11 nsIDOMHTMLCanvasElement_GetContext (cx=0x7fffe0a6c000, argc=, vp=0x7fffe53fe270) at dom_quickstubs.cpp:20390
#12 CallJSNative (this=0x7fffffffcf80) at js/src/jscntxtinlines.h:692
...
-> Core/Graphics for further triage.
Component: General → Graphics
Product: Firefox → Core
QA Contact: general → thebes
Hardware: x86 → All
Summary: firefox crashes when attempting this URL [@ libGL.so.1.2@0x5b38e ] → firefox crashes when attempting this URL [@ libGL.so.1.2@0x5b38e ] (XF86DRIQueryVersion)
still crashes at http://abc.net.au/tv/
Mozilla/5.0 (X11; Linux i686; rv:2.0b10) Gecko/20100101 Firefox/4.0b10
built from source this morning
http://abc.net.au/tv/ - loads properly
Mozilla/5.0 (X11; Linux i686; rv:2.0b11) Gecko/20110210 Firefox/4.0b11
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
I still crash in 20110210 4.0b12pre:
bp-cee151a6-abc7-418d-ba00-a7de32110210
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
system for comment 11 is Fedora 14
$ uname -r
2.6.35.11-83.fc14.i686.PAE
nvidia-xconfig:  version 260.19.36
Duplicate of this bug: 633980
Same problem here on Ubuntu 10.10 Self-built, among others on sf.net
bp-b28b7939-f8b9-4197-b6e9-c63ae2110215
However, all crash reports here are on x86_64. Is anyone experiencing it on other platforms?
I am using 32-bit (see comment 13, which refers to comment 11). The system crashed with b9 and b10, but worked with b11.
Crash probably relates to the use of the ATI graphics driver fglrx with nVidia graphics card.
See here:
https://wiki.ubuntu.com/X/Troubleshooting/NvidiaDriverSwitching
for how to switch drivers

Switching drivers resolved the issue for me
Can those who can reproduce please attach their output from glxinfo?
Not much to show:
# glxinfo
name of display: :0.0
Segmentation fault

# lspci
05:00.0 VGA compatible controller: ATI Technologies Inc RV630 [Radeon HD 2600XT]
... no other video devices listed ...

# ldd /usr/bin/glxinfo |grep libGL
        libGLEW.so.1.5 => /usr/lib/libGLEW.so.1.5 (0x00007fb2bf26d000)
        libGLU.so.1 => /usr/lib/libGLU.so.1 (0x00007fb2beffb000)
        libGL.so.1 => /usr/lib/fglrx/libGL.so.1 (0x00007fb2bee21000)

If I force glxinfo to use /usr/lib/mesa/libGL.so.1 instead I get:
# (LD_LIBRARY_PATH=/usr/lib/mesa/ /usr/bin/glxinfo)
name of display: :0.0
X Error of failed request:  BadRequest (invalid request code or no such operation)
  Major opcode of failed request:  155 (GLX)
  Minor opcode of failed request:  19 (X_GLXQueryServerString)
  Serial number of failed request:  14
  Current serial number in output stream:  14
BTW, I wrote a minimal test program for this:
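#include <gtk/gtk.h>
#include <gdk/gdkx.h>   /* GDK_DISPLAY() */
#include <GL/glx.h>     /* glXQueryVersion() */

int main(int argc, char** argv)
{
    gtk_init(&argc, &argv);
    Display *display = GDK_DISPLAY();
    int screen = DefaultScreen(display);
    int gGLXMajorVersion, gGLXMinorVersion;
    /* Same libGL entry point as frame #4 of the Firefox stack. */
    if (!glXQueryVersion(display, &gGLXMajorVersion, &gGLXMinorVersion)) {
... etc ...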

It crashes too...
My next step would be to see if I can find a unique symbol
to probe with dlsym that can be used to blacklist this dll.
Let me know if you have other ideas...
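For reference, the dlsym probe described above could be sketched roughly as follows. This is only an illustration: library_has_symbol is a made-up helper name, not anything from the Mozilla tree, and the symbol to look for is left to the caller. Note that dlopen runs the library's initializers, so the sketch assumes that merely loading the suspect libGL is safe.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper (illustration only).
 * Returns 1 if `symbol` resolves in the library at `path`, 0 if it does not,
 * and -1 if the library itself cannot be loaded. A NULL `path` probes the
 * running program and the libraries it already has loaded. */
int library_has_symbol(const char *path, const char *symbol)
{
    void *handle = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL)
        return -1;

    dlerror();                      /* clear any stale error state */
    (void)dlsym(handle, symbol);    /* the address itself is not needed */
    int found = (dlerror() == NULL);

    dlclose(handle);
    return found;
}
```

On older glibc versions this needs to be linked with -ldl. A caller would pass the path of the libGL.so.1 in question together with whatever symbol turns out to be unique to it.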
Thanks, Mats.  That's in line with my ideas, and looking for a unique dynamic symbol to identify fglrx might be an option.

I was also wondering whether we could use glXGetClientString to identify the libGL.so that is crashing.

http://www.opengl.org/sdk/docs/man/xhtml/glXGetClientString.xml

One possible issue is that "glXGetClientString is available only if the GLX version is 1.1 or greater", though I would hope that the presence of a glXGetClientString symbol means it is safe to use that for a function call.
I assume version 1.1 is old, so we may not need to worry about earlier versions anyway.
Attached patch wallpaper
After digging some more it turns out my system was misconfigured.
I had both the fglrx and radeon kernel modules loaded.  The radeon
one must be a remnant from some earlier configuration.

Fwiw, I did find a unique symbol in the fglrx libGL.so: AtiCallFGLComposite
This patch also looks for "radeon" in "/proc/modules" to detect my
particular misconfiguration.

Using glXGetClientString is also an option; it returns "ATI".
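A rough sketch of the /proc/modules check mentioned above, assuming the file's usual layout where each line starts with the module name followed by a space. modules_text_has_module is a hypothetical helper that matches whole module names, which is stricter than a plain substring search for "radeon"; the attached patch may well do it differently.

```c
#include <string.h>

/* Hypothetical helper (illustration only).
 * Returns 1 if `modules_text` (the contents of /proc/modules) has a line
 * whose first field is exactly `name`, otherwise 0. */
int modules_text_has_module(const char *modules_text, const char *name)
{
    size_t name_len = strlen(name);
    const char *line = modules_text;

    while (line != NULL && *line != '\0') {
        /* Each line starts with the module name followed by a space. */
        if (strncmp(line, name, name_len) == 0 && line[name_len] == ' ')
            return 1;
        line = strchr(line, '\n');
        if (line != NULL)
            line++;                 /* step past the newline */
    }
    return 0;
}
```

The caller is expected to read /proc/modules into a NUL-terminated buffer first; having both "fglrx" and "radeon" present would correspond to the misconfiguration described above.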
Attachment #515226 - Attachment is patch: true
Attached file fglrx crash-on-exit
After fixing the misconfiguration and switching to the fglrx driver,
forcing GL on by setting MOZ_GLX_IGNORE_BLACKLIST works fine
until I exit, at which point it crashes with this stack
(100% reproducible).  This looks like it might be our fault,
possibly some libGL resource we should release or some such.

A properly configured radeon driver seems to work fine,
but is incredibly slow (overall, not just GL).
So, since GL is going to be disabled anyway unless the vendor
string is "NVIDIA Corporation"
http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/GLContextProviderGLX.cpp#186
we could wallpaper these crashes by making an early return if
glXGetClientString says "ATI".

Not sure if it's worth it though, https://crash-stats.mozilla.com/
has about 300 crashes on Linux for the past two weeks matching
"libGL.so".  I suspect there are multiple crashes per user that
suffers from this.
(In reply to comment #22)
> After digging some more it turns out my system was misconfigured.
> I had both the fglrx and radeon kernel modules loaded.  The radeon
> one must be a remnant from some earlier configuration.

Having the radeon module loaded may have resulted in the server having a
different GLX implementation, but still the client libGL should not crash
because of a GLX vendor mismatch.  (The server may be on another machine
even.)

Crashing on GLX vendor mismatch is reason enough to blacklist this libGL IMO.

(In reply to comment #24)
> So, since GL is going to be disabled anyway unless the vendor
> string is "NVIDIA Corporation"
> http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/GLContextProviderGLX.cpp#186
> we could wallpaper these crashes by making an early return if
> glXGetClientString says "ATI".

That would be the way I'd blacklist this.  I fear this crash can happen in
more situations than just having the radeon kernel module loaded.  This
approach also allows us to add a client GLX_VERSION check in the future.

I wonder why we currently blacklist on server string rather than client string?

> Not sure if it's worth it though, https://crash-stats.mozilla.com/
> has about 300 crashes on Linux for the past two weeks matching
> "libGL.so".  I suspect there are multiple crashes per user that
> suffers from this.

Yes, it's hard to interpret crash reports because one user can submit many
reports, skewing the results.  I'm guessing from the way build-ids in some
reports "go back in time", there are two users on nightlies seeing the
libGL.so.1.2@0x5b38e crash.  There is also this one on beta11 with a different
kernel version:

bp-3921d33f-da53-459a-87fb-2285d2110215

There are also a number of different libGL.so.1.2 crash reports that may be
related but it's hard to tell from the failed stack scanning.

beta11:
bp-bfea2a9e-328c-4033-ae7c-5889a2110219
bp-7f95f330-31b4-47cf-bc90-878c62110218

beta12:
bp-07778a18-c26c-488b-ac1a-3208e2110226
bp-df6c44c5-039b-443d-8595-fc2cd2110226
bp-02634b06-91f0-4379-a788-0a8a12110227

nightly:
bp-ea9dab85-1503-49ea-94f8-03b8e2110226

Mats, do you have resources set up to try running something (firefox or
glxinfo) with the fglrx libGL on your machine that had the problem,
but with a different server/DISPLAY, either on another machine or a tigervnc
session or similar?
> (In reply to comment #24)
> I wonder why we currently blacklist on server string rather than client string?

I didn't really know the difference and had to pick one. Feel free to change this.
(In reply to comment #28)
> Mats, do you have resources set up to try running something (firefox or
> glxinfo) with the fglrx libGL on your machine that had the problem,
> but with a different server/DISPLAY, either on another machine or a tigervnc
> session or similar?

Sorry.  Worked out how to do that myself, and it was all bad with ati-driver-installer-11-2 on IA32 and amd64.

https://bugs.launchpad.net/ubuntu/+bug/555158 also gives a situation where this is likely to happen: hybrid intel/amd graphics.
All the crash reports in comment 28 have libatiuki.so.1.0 loaded, indicating ATI libGL.
(In reply to comment #28)
> (In reply to comment #24)
> > we could wallpaper these crashes by making an early return if
> > glXGetClientString says "ATI".
> 
> This
> approach also allows us to add a client GLX_VERSION check in the future.

Well, the version check is of limited use because 11.2 already reports GLX version 1.4.  Vendor specific info can be supplied in the form "<major_version.minor_version><space><vendor-specific info>" but none is provided currently.

GLX_EXTENSIONS may be useful for differentiating versions, but so might dynamic symbols.  I guess we worry about that once the bug is fixed.
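If vendor-specific info ever does show up, splitting the version string could look something like this. A minimal sketch, assuming the "<major_version.minor_version><space><vendor-specific info>" layout quoted above; parse_glx_version is a hypothetical name, not an existing function.

```c
#include <stdio.h>

/* Hypothetical helper (illustration only).
 * Parses "<major>.<minor>[ <vendor-specific info>]".
 * Returns 1 on success and 0 if the string does not start with
 * "major.minor". On success *vendor_info points into `version`, at the
 * text after the separating space (an empty string if nothing follows). */
int parse_glx_version(const char *version, int *major, int *minor,
                      const char **vendor_info)
{
    int consumed = 0;

    if (version == NULL ||
        sscanf(version, "%d.%d%n", major, minor, &consumed) != 2)
        return 0;

    const char *rest = version + consumed;
    if (*rest == ' ')
        rest++;                     /* skip the single separating space */
    *vendor_info = rest;
    return 1;
}
```

With the "1.4" string currently reported, this yields major 1, minor 4 and an empty vendor_info, which is why the version alone cannot tell driver releases apart.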
I prefer glXGetClientString over library symbol testing mainly because that seems the appropriate API.  I have yet to test this patch.
Attachment #515565 - Attachment is obsolete: true
I left the existing glXQueryServerString/NVIDIA check at this stage because that is what has been tested.  (Changing that to a client string test would allow situations, for example, where the client libGL is from NVIDIA but the server is using Mesa.  It might be reasonably likely that the client library wouldn't crash the browser, but we don't know whether the X server might crash.)

I also checked that Mesa and ATI's libGL make no X server requests during glXGetClientString.

If we want to unblacklist ATI drivers in situations where we expect them to work it looks like we should check for an ATIFGLEXTENSION extension on the server.
There's no reason to do that now though as it will be blacklisted anyway (because it is not NVIDIA).
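To make the ordering concrete, here is a hedged sketch of the two vendor checks as plain string tests. The helper names are invented for illustration and exact string comparison is an assumption; this is not the code in the attached patch.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if the client libGL vendor string is the one
 * known to crash ("ATI", as returned by glXGetClientString). */
bool client_vendor_is_blacklisted(const char *client_vendor)
{
    return client_vendor != NULL && strcmp(client_vendor, "ATI") == 0;
}

/* Hypothetical helper: true only for the server vendor that has been
 * tested so far ("NVIDIA Corporation"). */
bool server_vendor_is_whitelisted(const char *server_vendor)
{
    return server_vendor != NULL &&
           strcmp(server_vendor, "NVIDIA Corporation") == 0;
}

/* Intended call order (not compiled here; needs <GL/glx.h> and a Display):
 *   1. client_vendor_is_blacklisted(glXGetClientString(dpy, GLX_VENDOR))
 *      -> return early, before glXQueryVersion() is ever called.
 *   2. server_vendor_is_whitelisted(
 *          glXQueryServerString(dpy, screen, GLX_VENDOR))
 */
```

Keeping the client check first matters because the crash happens inside glXQueryVersion, while glXGetClientString was checked to make no X server requests.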
Assignee: nobody → karlt
Status: REOPENED → ASSIGNED
Attachment #515803 - Flags: review?(bjacob)
Attachment #515803 - Flags: review?(bjacob) → review+
I generated some crashes with libGLs from different ATI driver packages and got these crash signatures.

driver version 32-bit                    64-bit
10.10          [@ libGL.so.1.2@0x6d5a0 ] [@ libGL.so.1.2@0x5b38e ]
10.11          [@ libGL.so.1.2@0x6d620 ] [@ libGL.so.1.2@0x5b3ee ]
10.12          [@ libGL.so.1.2@0x73f58 ] [@ libGL.so.1.2@0x5f92e ]
11.2           [@ libGL.so.1.2@0x7714c ] [@ libGL.so.1.2@0x61dbe ]

The 32-bit 10.10 and 10.12 signatures show up in 20 and 10 beta12 crashes, respectively, from
several different users.
The 32-bit and 64-bit 11.2 signatures also show up in crash reports, but in smaller numbers.

There's definitely a variety of users seeing the crash, and so it is something we'd like to see fixed in a stability release (requiring no beta coverage).
blocking1.9.2: --- → ?
Summary: firefox crashes when attempting this URL [@ libGL.so.1.2@0x5b38e ] (XF86DRIQueryVersion) → firefox crashes when attempting this URL (XF86DRIQueryVersion) [@ libGL.so.1.2@0x6d5a0 ][@ libGL.so.1.2@0x73f58 ][@ libGL.so.1.2@0x7714c ][@ libGL.so.1.2@0x5b38e ]
Whiteboard: [asking for .x]
I'm going to assume Karl meant to request blocking on 2.0 since the 1.9.2 branch doesn't support WebGL or other acceleration.
blocking1.9.2: ? → ---
blocking2.0: --- → ?
blocking2.0: ? → .x+
Yes, thanks dveditz.
WRT comment #20, I adapted your minimal testcase to more recent GDK, and tried it on my system:
01:00.0 VGA compatible controller: ATI Technologies Inc RV730XT [Radeon HD 4670]
ATI Proprietary Linux Driver Version Identifier:8.78.30

Your sample...
http://m8y.org/tmp/glquery.c
gcc `pkg-config --cflags-only-I --libs  gdk-pixbuf-xlib-2.0 gtk+-2.0 gl` glquery.c
./a.out
Screen: 0
Major/Minor GL version: 1 4

no errors.
But then, you suggested it was due to an odd configuration?  So this wouldn't be typical for linux fglrx users?
'cause, I've also had no problems with the webgl test suite.

Basically, wondering if fglrx errors are rare and if it warrants whitelisting like nvidia.  Pretty sure I've run into people reporting nvidia crashes w/ our opengl game too in the past...
Keywords: user-doc-needed
Comment on attachment 515803
blacklist "ATI" glXGetClientString v1.1

We're going to take this in a security/stability (.x) release in the future. When those flags are created, please request approval!
Attachment #515803 - Flags: approval2.0? → approval2.0-
People who want to turn WebGL off to stop these crashes can set the hidden pref "webgl.disabled" to true.
(In reply to comment #38)
> But then, you suggested it was due to an odd configuration?  So this wouldn't
> be typical for linux fglrx users?

This crash shouldn't happen in the most common situation, but it does happen in plenty of sensibly configured situations.

> 'cause, I've also had no problems with the webgl test suite.

Did it pass with 100%, or are you saying it didn't crash?

If firefox exits with status 0 on quit after running the webgl test suite (doesn't crash as in comment 23), then we would like to be able to identify/differentiate the version of the fglrx_dri.so and/or libGL.so.1.2 library (not the kernel module or server modules) but I don't know how to do that.
Heh. AFAIK Firefox doesn't pass 100% with any driver :)

Results: (5385 of 5470 passed, 1 timed out)
http://m8y.org/tmp/fglrx_webgl.txt

Exited cleanly, no crash on shutdown, no errors of any kind.

Yeah, not too sure 'sactly what the crucial info is that differentiates these, but gave the ATI fglrx driver version, card type and glx version string already.
OpenGL itself appears to be provided by:
7.9~git20100924-0ubuntu2 which is presumably 7.9 + some ubuntu patches.
Mozilla/5.0 (X11; Linux i686; rv:2.0b12) Gecko/20100101 Firefox/4.0b12
http://abc.net.au/tv/
worked 06-Mar, crashes today
Steve, do you have hybrid graphics processors on your system?
$ uname -r                                                                     
2.6.35.11-83.fc14.i686.PAE
video card
NVidia - GeForce GT 220
NVidia driver 260.19.36
Sorry, Steve.  We've accidentally hijacked your bug.
The crash Mats was seeing is quite unrelated to your crash.

I've filed Bug 640071 to track your crash.
Can you link to a crash report there (or mention the lack of crash reporter if missing), please?
Whiteboard: [asking for .x]
http://hg.mozilla.org/mozilla-central/rev/fa658a942728
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.2
To Mats / anyone who could reproduce the ATI FGLRX crash (comment 23):
what is your GL_VERSION string ? You can get it by running:

  glxinfo | grep version

I am trying to figure if I can whitelist newer FGLRX drivers based on that.
I've uninstalled FGLRX now, but it reported 1.4 IIRC.

> I am trying to figure if I can whitelist newer FGLRX drivers based on that.

I tried both the Ubuntu packaged driver and the very latest from AMD -
both crashed on exit.  FWIW, I think we could whitelist it (after we
have fixed our crash).
(In reply to comment #49)
> I've uninstalled FGLRX now, but it reported 1.4 IIRC.

OK -- so it wasn't a very recent version. It's going to get blacklisted by my patch which requires OpenGL 3 for FGLRX (bug 645407).
Note that most GLX implementations report "1.4" for all the GLX versions.
The "OpenGL version string" may have been quite different.
For the record, the driver I downloaded directly from the vendor:
http://support.amd.com/us/gpudownload/windows/previous/11/Pages/radeon_linux.aspx?os=Linux%20x86&rev=11.2
Their internal version is 11.2, dated 2/15/2011.
Sorry, I have no record of what the "OpenGL version string" was.
(this is the driver I used in comment 23 that caused a crash on exit)

They have a new version now (11.3), dated 3/29/2011.  I haven't tried it.
http://support.amd.com/us/gpudownload/linux/Pages/radeon_linux.aspx?type=2.4.1&product=2.4.1.3.42&lang=English
Mats: I mean the 'opengl version string' as in `glxinfo | grep version`
So, I installed the latest fglrx driver from AMD, version 11.3.
I tried two different cards, first a low-end Radeon HD 2xxx card:

server glx version string: 1.4
client glx version string: 1.4
GLX version: 1.4
OpenGL version string: 3.3.10600 Compatibility Profile Context
OpenGL shading language version string: 3.30

and then a Radeon HD 5570 card:

server glx version string: 1.4
client glx version string: 1.4
GLX version: 1.4
OpenGL version string: 4.1.10600 Compatibility Profile Context
OpenGL shading language version string: 4.10

Both seem to work fine with MOZ_GLX_IGNORE_BLACKLIST=1, and
there's no crash on exit anymore (to my surprise).
I tried both 4.0 and a local mozilla-central debug build (both x86-64).
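For illustration, the OpenGL 3 requirement for FGLRX mentioned earlier (bug 645407) could be tested against version strings like the ones above roughly as follows. A minimal sketch, assuming the major version is the leading integer of the "OpenGL version string"; the function names are hypothetical and the real check in bug 645407 may differ.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper (illustration only).
 * Returns the leading major number of an OpenGL version string such as
 * "3.3.10600 Compatibility Profile Context", or -1 if it cannot be read. */
int gl_major_version(const char *version)
{
    if (version == NULL)
        return -1;

    char *end = NULL;
    long major = strtol(version, &end, 10);

    /* Require at least one digit, followed by the '.' before the minor. */
    if (end == version || *end != '.' || major < 0 || major > 1000)
        return -1;
    return (int)major;
}

/* Hypothetical policy check: accept FGLRX only when it reports OpenGL 3+. */
bool fglrx_gl_version_ok(const char *version)
{
    return gl_major_version(version) >= 3;
}
```

Both strings reported above (3.3.10600 and 4.1.10600) would pass this check. The GLX version is no help here, since it reads 1.4 in every case.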
That's good to know.  Thanks for testing, Mats.
Crash Signature: [@ libGL.so.1.2@0x6d5a0 ] [@ libGL.so.1.2@0x73f58 ] [@ libGL.so.1.2@0x7714c ] [@ libGL.so.1.2@0x5b38e ]