[FGLRX] glxtest process crashes the X server

RESOLVED FIXED in Firefox 16

Status

Core
Graphics
--
major
RESOLVED FIXED
6 years ago
4 years ago

People

(Reporter: bugzilla-mozilla, Assigned: martin.vogt)

Tracking

(Blocks: 1 bug, {crash})

Trunk
mozilla18
x86_64
Linux
crash
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox16 fixed, firefox17 verified, firefox18 verified)

Details

(Whiteboard: [sg:dos], crash signature, URL)

Attachments

(9 attachments, 3 obsolete attachments)

1.09 KB, application/octet-stream
838 bytes, application/octet-stream
9.75 KB, text/plain
271.86 KB, application/octet-stream
3.71 KB, patch
3.71 KB, patch
4.16 KB, patch
4.16 KB, patch (bjacob: review+)
5.76 KB, patch (karlt: review+)
(Reporter)

Description

6 years ago
After upgrading to iceweasel 6, the browser crashes when opening a URL from another
program (e.g. from the mail client, from the terminal, ...). About one in every
15 iceweasel calls crashes the browser. A URL which worked fine before will
crash the browser the next time; it's not URL-based/related as far as I can see.
Unfortunately the browser crash also terminates the X server.


glxinfo | egrep version\|renderer\|vendor
server glx vendor string: ATI
server glx version string: 1.4
client glx vendor string: ATI
client glx version string: 1.4
GLX version: 1.4
OpenGL vendor string: ATI Technologies Inc.
OpenGL renderer string: ATI Radeon HD 3200 Graphics
OpenGL version string: 2.1 (3.3.11005 Compatibility Profile Context)
OpenGL shading language version string: (null)
I guess Mozilla doesn't officially care about Firefox forks... Try downloading the latest official Firefox from www.firefox.com and reproducing it there. If it is not reproducible with the official version, then you should probably go to http://www.debian.org/Bugs/
This has already been filed as bugs.debian.org/cgi-bin/bugreport.cgi?bug=638208 and I personally asked for it to be filed here.

Actually, this isn't really the kind of thing that would be different in a Firefox fork from Firefox itself, unless the fork does something really weird (can I see the diff / list of patches somewhere?)

In other words I expect that the same (driver) bug would be triggered by Firefox itself and therefore we have to do something about it.

At this point I don't see a much better thing to do than completely blacklisting FGLRX, if we're going to do anything. To help decide that, I'd like this bug to be reproduced. To that effect, can you share details about your system? I would like to know precise versions of:
  - kernel
  - X server
  - FGLRX driver

Also, can you check if the 'glxinfo' command also occasionally crashes the X server? If it doesn't then I'd like to understand why (given that what Firefox does on startup to check GLX information is extremely similar to glxinfo).
Also, how experimental/custom is your set of installed packages for stuff relevant to this (X, FGLRX, kernel)? I just don't want to blacklist FGLRX just based on a bug that happened with, say, an experimental driver or kernel or X server.
To be clear, here's what's going on in Firefox 6. See bug 639842. At startup, Firefox forks a process that creates a GL context and calls glGetString to get OpenGL driver information, to decide whether to blacklist. Doing this in a separate process is needed to work around driver bugs that can crash our process. But the one thing that we can't work around with this approach is a crash of the X server itself.

To work around that, we need a solution that doesn't involve creating a GL context at all. Unfortunately, in that case we only have access to much coarser info. We can tell whether we have FGLRX or not by calling glXQueryServerString, but that's about it. We can't get any kind of version info in this way. That's why I say above that the only thing we could do would be to completely blacklist FGLRX.
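For readers unfamiliar with the pattern, here is a minimal sketch of the "probe in a forked child" idea described above. It is not the actual glxtest code: `ProbeFn`, `run_probe_in_child`, `fake_probe` and `crashing_probe` are hypothetical names, and the probe is a stand-in for the code that would create a GL context and call glGetString. It only illustrates that a crash in the child leaves the parent alive; it cannot protect against the X server itself going down.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical probe signature: writes a short info string into buf and
// returns its length. In a real GLX test this is where the GL context
// would be created and glGetString called.
using ProbeFn = std::size_t (*)(char* buf, std::size_t cap);

// Runs `probe` in a forked child and collects what it reported through a
// pipe. Returns false if the child crashed, failed, or reported nothing.
bool run_probe_in_child(ProbeFn probe, std::string& out) {
  assert(probe != nullptr);
  int fds[2];
  if (pipe(fds) != 0) return false;

  pid_t pid = fork();
  if (pid < 0) {
    close(fds[0]);
    close(fds[1]);
    return false;
  }

  if (pid == 0) {
    // Child: run the risky code, send the result, and leave via _exit so
    // that none of the parent's atexit handlers run here.
    close(fds[0]);
    char buf[512];
    std::size_t n = probe(buf, sizeof buf);
    if (n > sizeof buf) n = sizeof buf;
    if (n > 0) {
      ssize_t written = write(fds[1], buf, n);
      (void)written;
    }
    close(fds[1]);
    _exit(0);
  }

  // Parent: read until EOF, then reap the child and inspect its status.
  close(fds[1]);
  out.clear();
  char buf[512];
  ssize_t n;
  while ((n = read(fds[0], buf, sizeof buf)) > 0)
    out.append(buf, static_cast<std::size_t>(n));
  close(fds[0]);

  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return false;
  return WIFEXITED(status) && WEXITSTATUS(status) == 0 && !out.empty();
}

// Stand-in probe that reports a vendor string (illustration only).
std::size_t fake_probe(char* buf, std::size_t cap) {
  const char* s = "ATI Technologies Inc.";
  std::size_t n = std::strlen(s);
  if (n > cap) n = cap;
  std::memcpy(buf, s, n);
  return n;
}

// Stand-in probe that simulates a driver bug killing the child process.
std::size_t crashing_probe(char*, std::size_t) {
  std::abort();
}
```

With `fake_probe` the parent receives the vendor string; with `crashing_probe` the call simply returns false and the parent keeps running.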
Summary: Iceweasel crashes, also terminating the X-server → [FGLRX] glxtest process crashes the X server
list of debian patches here:
http://ftp.de.debian.org/debian/pool/main/i/iceweasel/iceweasel_6.0-2.debian.tar.gz
(Reporter)

Comment 6

6 years ago
I forgot to point out one thing, sorry.
Iceweasel is already running, with a few tabs open. Opening URLs inside iceweasel (clicking a link, typing into the URL bar, ...) works without any problem.
Only when trying to open an additional URL from "outside" does this crash sometimes happen.
(In reply to bugzilla-mozilla from comment #6)
> I forgot to point out one thing, sorry.
> Iceweasel is already running, with a few tabs open. Opening URLs inside
> iceweasel (clicking a link, typing into the URL bar, ...) works without any
> problem. Only when trying to open an additional URL from "outside" does
> this crash sometimes happen.

What happens is that every time you run firefox, it tries again to create an OpenGL context to get driver info. When you do 'firefox someurl' it does that before checking if another Firefox is already running. That's why it can reproduce the crash even if it's going to just attach to another Firefox in the end. I'm sure you could reproduce this crash just by launching Firefox without any page loaded at all, quitting it, and retrying sufficiently many times.
(In reply to Oleg Romashin (:romaxa) from comment #5)
> list of debian patches here:
> http://ftp.de.debian.org/debian/pool/main/i/iceweasel/iceweasel_6.0-2.debian.
> tar.gz

Thanks, I've now checked that Iceweasel is no different from Firefox in this respect. Method:

$ cd debian/patches
$ grep -Ri glx .
./configure.patch: if test -n "$MOZ_WEBGL_GLX"; then
./configure.patch:        ac_safe=`echo "GL/glx.h" | sed 'y%./+-%__p_%'`
./configure.patch:   echo $ac_n "checking for GL/glx.h""... $ac_c" 1>&6
./configure.patch:-echo "configure:24888: checking for GL/glx.h" >&5
./configure.patch:+echo "configure:25052: checking for GL/glx.h" >&5
./configure.patch: #include <GL/glx.h>
./configure.patch: fi # MOZ_WEBGL_GLX
./debian-hacks/Check-less-things-during-configure-when-using-libxul.patch:@@ -9151,7 +9171,7 @@ if test -n "$MOZ_WEBGL_GLX"; then
./debian-hacks/Check-less-things-during-configure-when-using-libxul.patch: fi # MOZ_WEBGL_GLX

This stuff is just configure checks for GLX headers.
Can you please still answer my questions in comment 2 and comment 3 ?
(Reporter)

Comment 10

6 years ago
(In reply to Benoit Jacob [:bjacob] from comment #2)
I would like to know precise versions of:
>   - kernel
Linux vishnu 3.0.0-1-amd64 #1 SMP Wed Aug 17 04:08:52 UTC 2011 x86_64 GNU/Linux
>   - X server
xorg-server 2:1.10.3-1 (Cyril Brulebois <kibi@debian.org>)
>   - FGLRX driver
fglrx 8.88.7 [Jul 28 2011]
Thanks; can you please also check this:

> Also, can you check if the 'glxinfo' command also occasionally crashes the X
> server?

i.e. just run it a few dozen times...
(Reporter)

Comment 12

6 years ago
To be sure of not losing the entered data, I am splitting the information up into small pieces.
glxinfo has been started 1000 times without any crash
(Reporter)

Comment 13

6 years ago
(In reply to Benoit Jacob [:bjacob] from comment #3)
> Also, how experimental/custom is your set of installed packages for stuff
> relevant to this (X, FGLRX, kernel)? I just don't want to blacklist FGLRX
> just based on a bug that happened with, say, an experimental driver or
> kernel or X server.

All packages are from debian/sid (or debian testing). None of them is from the experimental branch.
(In reply to bugzilla-mozilla from comment #12)
> glxinfo has been started 1000 times without any crash

Thanks, that's really interesting. If glxinfo can run without crashing your X server, so should we.

The next most interesting thing would be if you could run Firefox with APItrace,
https://github.com/apitrace/apitrace
capture the output and attach it to this bug.

Let me know if running APItrace is problematic for you. Then we can provide a custom build of Firefox with built-in logging for this stuff.
(Reporter)

Comment 15

6 years ago
Created attachment 554926 [details]
apitrace result

LD_PRELOAD=/path/to/glxtrace.so /usr/lib/iceweasel/firefox-bin
immediately crashed the server.
Excellent, thanks! Sorry for the delay.

$ ./tracedump ~/Downloads/xulrunner-stub.trace 
0 glXQueryExtension(dpy = 0x7f7be636a000, errorb = NULL, event = NULL) = True
1 glXChooseFBConfig(dpy = 0x7f7be636a000, screen = 0, attribList = {GLX_DRAWABLE_TYPE, GLX_BUFFER_SIZE, GLX_X_RENDERABLE, GLX_USE_GL, 0}, nitems = &73) = {0x7f7be6359a60, 0x7f7be635a160, 0x7f7be635a860, 0x7f7be635b660, 0x7f7be63cafa0, 0x7f7be635bd60, 0x7f7be63ca520, 0x7f7be63596e0, 0x7f7be6359de0, 0x7f7be635a4e0, 0x7f7be635abe0, 0x7f7be635af60, 0x7f7be635b2e0, 0x7f7be635b9e0, 0x7f7be63ca1a0, 0x7f7be63ca8a0, 0x7f7be63cac20, 0x7f7be63598a0, 0x7f7be6359fa0, 0x7f7be635a6a0, 0x7f7be635b4a0, 0x7f7be63cade0, 0x7f7be635bba0, 0x7f7be63ca360, 0x7f7be6359520, 0x7f7be6359c20, 0x7f7be635a320, 0x7f7be635aa20, 0x7f7be635ada0, 0x7f7be635b120, 0x7f7be635b820, 0x7f7be635bf20, 0x7f7be63ca6e0, 0x7f7be63caa60, 0x7f7be6359980, 0x7f7be635a080, 0x7f7be635a780, 0x7f7be635b580, 0x7f7be63caec0, 0x7f7be635bc80, 0x7f7be63ca440, 0x7f7be6359600, 0x7f7be6359d00, 0x7f7be635a400, 0x7f7be635ab00, 0x7f7be635ae80, 0x7f7be635b200, 0x7f7be635b900, 0x7f7be63ca0c0, 0x7f7be63ca7c0, 0x7f7be63cab40, 0x7f7be63597c0, 0x7f7be6359ec0, 0x7f7be635a5c0, 0x7f7be635b3c0, 0x7f7be63cad00, 0x7f7be635bac0, 0x7f7be63ca280, 0x7f7be6359440, 0x7f7be6359b40, 0x7f7be635a240, 0x7f7be635a940, 0x7f7be635acc0, 0x7f7be635b040, 0x7f7be635b740, 0x7f7be635be40, 0x7f7be63ca600, 0x7f7be63ca980, 0x7f7be63cb400, 0x7f7be63cb080, 0x7f7be63cb320, 0x7f7be63cb160, 0x7f7be63cb240}                                                        
2 glXGetVisualFromFBConfig(dpy = 0x7f7be636a000, config = 0x7f7be6359a60) = &{visual = 0x7f7be6378188, visualid = 42, screen = 0, depth = 24, c_class = 4, red_mask = 16711680, green_mask = 65280, blue_mask = 255, colormap_size = 256, bits_per_rgb = 8}
3 glXCreatePixmap(dpy = 0x7f7be636a000, config = 0x7f7be6359a60, pixmap = 54525953, attribList = NULL) = 54525954                                                                                                   
4 glXCreateNewContext(dpy = 0x7f7be636a000, config = 0x7f7be6359a60, renderType = GLX_RGBA_TYPE, shareList = NULL, direct = True) = 0x7f7be63c4180
5 glXMakeCurrent(dpy = 0x7f7be636a000, drawable = 54525954, ctx = 0x7f7be63c4180) = True
6 glGetString(name = GL_VENDOR) = "ATI Technologies Inc."
7 glGetString(name = GL_RENDERER) = "ATI Radeon HD 3200 Graphics"
8 glGetString(name = GL_VERSION) = "2.1 (3.3.11005 Compatibility Profile Context)"
9 glXMakeCurrent(dpy = 0x7f7be636a000, drawable = 0, ctx = NULL) = True
10 glXDestroyContext(dpy = 0x7f7be636a000, ctx = 0x7f7be63c4180)
11 glXDestroyPixmap(dpy = 0x7f7be636a000, pixmap = 54525954)


This means that we cause the X server to crash at this point:

http://hg.mozilla.org/mozilla-central/file/d0700ba932b4/toolkit/xre/glxtest.cpp#l238

  glXMakeCurrent(dpy, None, NULL);
  glXDestroyContext(dpy, context);
  glXDestroyPixmap(dpy, glxpixmap);  // X server crashes here
  XFreePixmap(dpy, pixmap);
  XCloseDisplay(dpy);
  dlclose(libgl);

This is during the final cleanup, after we've obtained the information we need.

There are three things I want to try, to see if they work around this crash:
 1) don't free this pixmap on FGLRX, just leak it. It's going to be freed by the subsequent XCloseDisplay anyway.
 2) use a larger pixmap. Maybe it's too unusual to have a 4x4 pixmap to back a GL context and FGLRX assumes a larger size.
 3) check what glxinfo does.
glxinfo source code is here:
http://cgit.freedesktop.org/mesa/demos/tree/src/xdemos/glxinfo.c

It doesn't use a GLX pixmap at all; instead it uses an X window of size 100x100.
I've launched a try-build:
http://tbpl.allizom.org/?tree=Try&usebuildbot=1&rev=a65faf2267f2

This bug will get notified when it's done (in ~2 hours). Please try this build with the following environment variables:

MOZ_GLXTEST_DONT_DESTROY_PIXMAP=1  - makes us not destroy the pixmap, which is what was crashing on your machine.
MOZ_GLXTEST_PIXMAP_SIZE=32    (or any other value) - makes us use a pixmap of given size (default is 4, try something larger like 32).

It will output to the terminal a message confirming that it's picked up your option, for example:

### Not destroying the pixmap!

and

### Using pixmap size 32
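As a rough illustration of how such debug switches can be read, here is a sketch. This is not the code from the try build: the helper names `dont_destroy_pixmap` and `pixmap_size` are hypothetical, and the upper bound of 4096 is an arbitrary sanity limit assumed for this example. Only the variable names and the default size of 4 come from the description above.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// True if MOZ_GLXTEST_DONT_DESTROY_PIXMAP is set to a non-empty value.
bool dont_destroy_pixmap() {
  const char* v = std::getenv("MOZ_GLXTEST_DONT_DESTROY_PIXMAP");
  return v != nullptr && *v != '\0';
}

// Pixmap size taken from MOZ_GLXTEST_PIXMAP_SIZE. Falls back to the default
// of 4 when the variable is unset, not a number, or out of range.
int pixmap_size() {
  const int kDefault = 4;
  const long kMax = 4096;  // assumed sanity limit, not from the try build
  const char* v = std::getenv("MOZ_GLXTEST_PIXMAP_SIZE");
  if (v == nullptr || *v == '\0') return kDefault;

  errno = 0;
  char* end = nullptr;
  long n = std::strtol(v, &end, 10);
  if (errno != 0 || *end != '\0' || n <= 0 || n > kMax) return kDefault;
  return static_cast<int>(n);
}
```

Falling back to the default on malformed input keeps a typo in the environment from changing behaviour silently in some other direction.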
When the builds are ready, they will be available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-a65faf2267f2

Comment 20

6 years ago
Try run for a65faf2267f2 is complete.
Detailed breakdown of the results available here:
    http://tbpl.allizom.org/?tree=Try&usebuildbot=1&rev=a65faf2267f2
Results (out of 2 total builds):
    success: 1
    warnings: 1
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-a65faf2267f2
(Reporter)

Comment 21

6 years ago
New build still crashes. The number of lines in the apitrace log is the same whether it crashes or not.

no crash:
0 glXQueryExtension(dpy = 0x7f227f862000, errorb = NULL, event = NULL) = True
1 glXQueryVersion(dpy = 0x7f227f862000, maj = &1, min = &4) = True
2 glXChooseFBConfig(dpy = 0x7f227f862000, screen = 0, attribList = {GLX_DRAWABLE_TYPE, GLX_BUFFER_SIZE, GLX_X_RENDERABLE, GLX_USE_GL, 0}, nitems = &73) = {0x7f227f820b40, 0x7f227f821240, 0x7f227f821940, 0x7f227f822740, 0x7f227f8c3080, 0x7f227f822e40, 0x7f227f8c2600, 0x7f227f8207c0, 0x7f227f820ec0, 0x7f227f8215c0, 0x7f227f821cc0, 0x7f227f822040, 0x7f227f8223c0, 0x7f227f822ac0, 0x7f227f8c2280, 0x7f227f8c2980, 0x7f227f8c2d00, 0x7f227f820980, 0x7f227f821080, 0x7f227f821780, 0x7f227f822580, 0x7f227f8c2ec0, 0x7f227f822c80, 0x7f227f8c2440, 0x7f227f820600, 0x7f227f820d00, 0x7f227f821400, 0x7f227f821b00, 0x7f227f821e80, 0x7f227f822200, 0x7f227f822900, 0x7f227f8c20c0, 0x7f227f8c27c0, 0x7f227f8c2b40, 0x7f227f820a60, 0x7f227f821160, 0x7f227f821860, 0x7f227f822660, 0x7f227f8c2fa0, 0x7f227f822d60, 0x7f227f8c2520, 0x7f227f8206e0, 0x7f227f820de0, 0x7f227f8214e0, 0x7f227f821be0, 0x7f227f821f60, 0x7f227f8222e0, 0x7f227f8229e0, 0x7f227f8c21a0, 0x7f227f8c28a0, 0x7f227f8c2c20, 0x7f227f8208a0, 0x7f227f820fa0, 0x7f227f8216a0, 0x7f227f8224a0, 0x7f227f8c2de0, 0x7f227f822ba0, 0x7f227f8c2360, 0x7f227f820520, 0x7f227f820c20, 0x7f227f821320, 0x7f227f821a20, 0x7f227f821da0, 0x7f227f822120, 0x7f227f822820, 0x7f227f822f20, 0x7f227f8c26e0, 0x7f227f8c2a60, 0x7f227f8c34e0, 0x7f227f8c3160, 0x7f227f8c3400, 0x7f227f8c3240, 0x7f227f8c3320}
3 glXGetVisualFromFBConfig(dpy = 0x7f227f862000, config = 0x7f227f820b40) = &{visual = 0x7f227f870188, visualid = 42, screen = 0, depth = 24, c_class = 4, red_mask = 16711680, green_mask = 65280, blue_mask = 255, colormap_size = 256, bits_per_rgb = 8}
4 glXCreatePixmap(dpy = 0x7f227f862000, config = 0x7f227f820b40, pixmap = 54525953, attribList = NULL) = 54525954
5 glXCreateNewContext(dpy = 0x7f227f862000, config = 0x7f227f820b40, renderType = GLX_RGBA_TYPE, shareList = NULL, direct = True) = 0x7f227f8bc180
6 glXMakeCurrent(dpy = 0x7f227f862000, drawable = 54525954, ctx = 0x7f227f8bc180) = True
7 glXGetProcAddress(procName = "glXBindTexImageEXT") = 0x7f2275ee6100
8 glGetString(name = GL_VENDOR) = "ATI Technologies Inc."
9 glGetString(name = GL_RENDERER) = "ATI Radeon HD 3200 Graphics"
10 glGetString(name = GL_VERSION) = "2.1 (3.3.11005 Compatibility Profile Context)"
11 glXMakeCurrent(dpy = 0x7f227f862000, drawable = 0, ctx = NULL) = True
12 glXDestroyContext(dpy = 0x7f227f862000, ctx = 0x7f227f8bc180)
13 glXDestroyPixmap(dpy = 0x7f227f862000, pixmap = 54525954)


crash:
0 glXQueryExtension(dpy = 0x7f1f78062000, errorb = NULL, event = NULL) = True
1 glXQueryVersion(dpy = 0x7f1f78062000, maj = &1, min = &4) = True
2 glXChooseFBConfig(dpy = 0x7f1f78062000, screen = 0, attribList = {GLX_DRAWABLE_TYPE, GLX_BUFFER_SIZE, GLX_X_RENDERABLE, GLX_USE_GL, 0}, nitems = &73) = {0x7f1f78020b40, 0x7f1f78021240, 0x7f1f78021940, 0x7f1f78022740, 0x7f1f780c3080, 0x7f1f78022e40, 0x7f1f780c2600, 0x7f1f780207c0, 0x7f1f78020ec0, 0x7f1f780215c0, 0x7f1f78021cc0, 0x7f1f78022040, 0x7f1f780223c0, 0x7f1f78022ac0, 0x7f1f780c2280, 0x7f1f780c2980, 0x7f1f780c2d00, 0x7f1f78020980, 0x7f1f78021080, 0x7f1f78021780, 0x7f1f78022580, 0x7f1f780c2ec0, 0x7f1f78022c80, 0x7f1f780c2440, 0x7f1f78020600, 0x7f1f78020d00, 0x7f1f78021400, 0x7f1f78021b00, 0x7f1f78021e80, 0x7f1f78022200, 0x7f1f78022900, 0x7f1f780c20c0, 0x7f1f780c27c0, 0x7f1f780c2b40, 0x7f1f78020a60, 0x7f1f78021160, 0x7f1f78021860, 0x7f1f78022660, 0x7f1f780c2fa0, 0x7f1f78022d60, 0x7f1f780c2520, 0x7f1f780206e0, 0x7f1f78020de0, 0x7f1f780214e0, 0x7f1f78021be0, 0x7f1f78021f60, 0x7f1f780222e0, 0x7f1f780229e0, 0x7f1f780c21a0, 0x7f1f780c28a0, 0x7f1f780c2c20, 0x7f1f780208a0, 0x7f1f78020fa0, 0x7f1f780216a0, 0x7f1f780224a0, 0x7f1f780c2de0, 0x7f1f78022ba0, 0x7f1f780c2360, 0x7f1f78020520, 0x7f1f78020c20, 0x7f1f78021320, 0x7f1f78021a20, 0x7f1f78021da0, 0x7f1f78022120, 0x7f1f78022820, 0x7f1f78022f20, 0x7f1f780c26e0, 0x7f1f780c2a60, 0x7f1f780c34e0, 0x7f1f780c3160, 0x7f1f780c3400, 0x7f1f780c3240, 0x7f1f780c3320}
3 glXGetVisualFromFBConfig(dpy = 0x7f1f78062000, config = 0x7f1f78020b40) = &{visual = 0x7f1f78070188, visualid = 42, screen = 0, depth = 24, c_class = 4, red_mask = 16711680, green_mask = 65280, blue_mask = 255, colormap_size = 256, bits_per_rgb = 8}
4 glXCreatePixmap(dpy = 0x7f1f78062000, config = 0x7f1f78020b40, pixmap = 52428801, attribList = NULL) = 52428802
5 glXCreateNewContext(dpy = 0x7f1f78062000, config = 0x7f1f78020b40, renderType = GLX_RGBA_TYPE, shareList = NULL, direct = True) = 0x7f1f780bc180
6 glXMakeCurrent(dpy = 0x7f1f78062000, drawable = 52428802, ctx = 0x7f1f780bc180) = True
7 glXGetProcAddress(procName = "glXBindTexImageEXT") = 0x7f1f6e6e6100
8 glGetString(name = GL_VENDOR) = "ATI Technologies Inc."
9 glGetString(name = GL_RENDERER) = "ATI Radeon HD 3200 Graphics"
10 glGetString(name = GL_VERSION) = "2.1 (3.3.11005 Compatibility Profile Context)"
11 glXMakeCurrent(dpy = 0x7f1f78062000, drawable = 0, ctx = NULL) = True
12 glXDestroyContext(dpy = 0x7f1f78062000, ctx = 0x7f1f780bc180)
13 glXDestroyPixmap(dpy = 0x7f1f78062000, pixmap = 52428802)
Did you try defining the environment variables described in comment 18? Actually, from your APItrace log, it seems that you didn't. It's still crashing in glXDestroyPixmap which isn't called if you define MOZ_GLXTEST_DONT_DESTROY_PIXMAP in this build.
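For anyone comparing these dumps by hand, a small helper along the following lines can answer "which call came last?" and "does this trace contain call X?". It is only a sketch: `call_name`, `last_call` and `trace_contains` are hypothetical helpers, and the parser assumes the "number, space, name, open parenthesis" layout seen in the tracedump output pasted in this bug.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Extracts the function name from one tracedump line such as
// "13 glXDestroyPixmap(dpy = 0x..., pixmap = 54525954)".
// Returns an empty string if the line does not look like a numbered call.
std::string call_name(const std::string& line) {
  std::size_t space = line.find(' ');
  std::size_t paren = line.find('(');
  if (space == std::string::npos || paren == std::string::npos) return "";
  if (space == 0 || paren < space) return "";
  // Everything before the first space must be the call number.
  for (std::size_t i = 0; i < space; ++i)
    if (line[i] < '0' || line[i] > '9') return "";
  return line.substr(space + 1, paren - space - 1);
}

// Returns the name of the last call recorded in a tracedump listing.
std::string last_call(const std::string& dump) {
  std::istringstream in(dump);
  std::string line, last;
  while (std::getline(in, line)) {
    std::string name = call_name(line);
    if (!name.empty()) last = name;
  }
  return last;
}

// Returns true if the listing contains at least one call to `fn`.
bool trace_contains(const std::string& dump, const std::string& fn) {
  std::istringstream in(dump);
  std::string line;
  while (std::getline(in, line))
    if (call_name(line) == fn) return true;
  return false;
}
```

A trace taken with MOZ_GLXTEST_DONT_DESTROY_PIXMAP set should make `trace_contains(dump, "glXDestroyPixmap")` return false.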
(Reporter)

Comment 23

6 years ago
Created attachment 556080 [details]
apitrace

sorry for not reading the steps carefully.

With MOZ_GLXTEST_DONT_DESTROY_PIXMAP=1 and no specific pixmap size set, the trace looks like this:
0 glXQueryExtension(dpy = 0x7ffeaa962000, errorb = NULL, event = NULL) = True
1 glXQueryVersion(dpy = 0x7ffeaa962000, maj = &1, min = &4) = True
2 glXChooseFBConfig(dpy = 0x7ffeaa962000, screen = 0, attribList = {GLX_DRAWABLE_TYPE, GLX_BUFFER_SIZE, GLX_X_RENDERABLE, GLX_USE_GL, 0}, nitems = &73) = {0x7ffeaa920b40, 0x7ffeaa921240, 0x7ffeaa921940, 0x7ffeaa922740, 0x7ffeaa9c3080, 0x7ffeaa922e40, 0x7ffeaa9c2600, 0x7ffeaa9207c0, 0x7ffeaa920ec0, 0x7ffeaa9215c0, 0x7ffeaa921cc0, 0x7ffeaa922040, 0x7ffeaa9223c0, 0x7ffeaa922ac0, 0x7ffeaa9c2280, 0x7ffeaa9c2980, 0x7ffeaa9c2d00, 0x7ffeaa920980, 0x7ffeaa921080, 0x7ffeaa921780, 0x7ffeaa922580, 0x7ffeaa9c2ec0, 0x7ffeaa922c80, 0x7ffeaa9c2440, 0x7ffeaa920600, 0x7ffeaa920d00, 0x7ffeaa921400, 0x7ffeaa921b00, 0x7ffeaa921e80, 0x7ffeaa922200, 0x7ffeaa922900, 0x7ffeaa9c20c0, 0x7ffeaa9c27c0, 0x7ffeaa9c2b40, 0x7ffeaa920a60, 0x7ffeaa921160, 0x7ffeaa921860, 0x7ffeaa922660, 0x7ffeaa9c2fa0, 0x7ffeaa922d60, 0x7ffeaa9c2520, 0x7ffeaa9206e0, 0x7ffeaa920de0, 0x7ffeaa9214e0, 0x7ffeaa921be0, 0x7ffeaa921f60, 0x7ffeaa9222e0, 0x7ffeaa9229e0, 0x7ffeaa9c21a0, 0x7ffeaa9c28a0, 0x7ffeaa9c2c20, 0x7ffeaa9208a0, 0x7ffeaa920fa0, 0x7ffeaa9216a0, 0x7ffeaa9224a0, 0x7ffeaa9c2de0, 0x7ffeaa922ba0, 0x7ffeaa9c2360, 0x7ffeaa920520, 0x7ffeaa920c20, 0x7ffeaa921320, 0x7ffeaa921a20, 0x7ffeaa921da0, 0x7ffeaa922120, 0x7ffeaa922820, 0x7ffeaa922f20, 0x7ffeaa9c26e0, 0x7ffeaa9c2a60, 0x7ffeaa9c34e0, 0x7ffeaa9c3160, 0x7ffeaa9c3400, 0x7ffeaa9c3240, 0x7ffeaa9c3320}
3 glXGetVisualFromFBConfig(dpy = 0x7ffeaa962000, config = 0x7ffeaa920b40) = &{visual = 0x7ffeaa970188, visualid = 42, screen = 0, depth = 24, c_class = 4, red_mask = 16711680, green_mask = 65280, blue_mask = 255, colormap_size = 256, bits_per_rgb = 8}
4 glXCreatePixmap(dpy = 0x7ffeaa962000, config = 0x7ffeaa920b40, pixmap = 65011713, attribList = NULL) = 65011714
5 glXCreateNewContext(dpy = 0x7ffeaa962000, config = 0x7ffeaa920b40, renderType = GLX_RGBA_TYPE, shareList = NULL, direct = True) = 0x7ffeaa9bc180
6 glXMakeCurrent(dpy = 0x7ffeaa962000, drawable = 65011714, ctx = 0x7ffeaa9bc180) = True
7 glXGetProcAddress(procName = "glXBindTexImageEXT") = 0x7ffea0fe6100
8 glGetString(name = GL_VENDOR) = "ATI Technologies Inc."
9 glGetString(name = GL_RENDERER) = "ATI Radeon HD 3200 Graphics"
10 glGetString(name = GL_VERSION) = "2.1 (3.3.11005 Compatibility Profile Context)"
11 glXMakeCurrent(dpy = 0x7ffeaa962000, drawable = 0, ctx = NULL) = True
12 glXDestroyContext(dpy = 0x7ffeaa962000, ctx = 0x7ffeaa9bc180)

With MOZ_GLXTEST_DONT_DESTROY_PIXMAP=1 and a pixmapsize of 128 the browser seemed to take longer (more tries) to crash.
tracedump displays "warning: incomplete call glXMakeCurrent", so the trace has been attached.
(In reply to bugzilla-mozilla from comment #23)
> With MOZ_GLXTEST_DONT_DESTROY_PIXMAP=1 and a pixmapsize of 128 the browser
> seemed to take longer (more tries) to crash.

Did the X server still crash?

If yes it's looking like the only reasonable response is to completely blacklist FGLRX. Indeed, without risking crashing the X server, I can only know whether you have FGLRX, I can't get more specific information.

Before doing that, since that has quite serious implications, I would like to have another report of FGLRX X server crashes on a different machine / setup though.
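To make the trade-off concrete, here is a sketch of what a vendor-only decision could look like. It is an illustration under assumptions, not a proposed patch: the names `looks_like_fglrx`, `GlxDecision` and `decide` are hypothetical, and matching FGLRX on a server vendor string of exactly "ATI" is assumed from the glxinfo output in the description. The string itself would come from glXQueryServerString(dpy, screen, GLX_VENDOR), which does not need a GL context.

```cpp
#include <cassert>
#include <cstring>

// Assumption for illustration: an FGLRX server reports the GLX server
// vendor string "ATI", as in the glxinfo output in this report.
bool looks_like_fglrx(const char* server_vendor) {
  return server_vendor != nullptr && std::strcmp(server_vendor, "ATI") == 0;
}

enum class GlxDecision {
  ProbeWithContext,         // safe enough to create a GL context and query it
  BlacklistWithoutProbing,  // skip context creation entirely
};

// With only the server vendor string available, the choice is all or
// nothing: no driver version can be taken into account here.
GlxDecision decide(const char* server_vendor) {
  return looks_like_fglrx(server_vendor) ? GlxDecision::BlacklistWithoutProbing
                                         : GlxDecision::ProbeWithContext;
}
```

The sketch shows why this would hit every FGLRX user regardless of driver version, which is the serious implication mentioned above.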
(Reporter)

Comment 25

6 years ago
(In reply to Benoit Jacob [:bjacob] from comment #24)
> (In reply to bugzilla-mozilla from comment #23)
> > With MOZ_GLXTEST_DONT_DESTROY_PIXMAP=1 and a pixmapsize of 128 the browser
> > seemed to take longer (more tries) to crash.
> 
> Did the X server still crash?
Yes
> 
> If yes it's looking like the only reasonable response is to completely
> blacklist FGLRX. Indeed, without risking crashing the X server, I can only
> know whether you have FGLRX, I can't get more specific information.
> 
> Before doing that, since that has quite serious implications, I would like
> to have another report of FGLRX X server crashes on a different machine /
> setup though.
I completely agree with you. I tried to modify a lot of parameters in the xorg.conf
file. Unfortunately no adjustment made the X-server survive.

I'm switching to "radeon" instead of "fglrx" driver, having a stable system again.
Whiteboard: (blacklist FGLRX if confirmed on another system)
Confirmed once. This needs to be confirmed on more machines before blacklisting FGLRX. See comment 24
Status: UNCONFIRMED → NEW
Ever confirmed: true
Blocks: 693168
Keywords: crash
Whiteboard: (blacklist FGLRX if confirmed on another system) → [sg:dos](blacklist FGLRX if confirmed on another system)
(Assignee)

Comment 27

5 years ago
Hello,

I can reproduce this too, but only in the case of indirect rendering.
(local rendering works here)
I have attached a modified glxtest.cpp which does not crash when
glXMakeCurrent is given a window instead of a GLXPixmap.

So glxtest does not crash, but the underlying bug, that glXMakeCurrent
crashes xorg in the indirect case when used with a GLXPixmap,
is not solved at all.

I will file a bug report against fglrx for this. (http://ati.cchtml.com/)

Additionally I am uploading a .tgz which demonstrates the crash with
indirect rendering.

regards,

Martin
(Assignee)

Comment 28

5 years ago
Created attachment 651343 [details]
modified glxtest (will still crash)
(Assignee)

Comment 29

5 years ago
Created attachment 651344 [details] [diff] [review]
replace GLXPixmap with window does not crash
(Assignee)

Comment 30

5 years ago
Created attachment 651346 [details]
demo for crash in indirect case (includes mesa libGL)
Yup, replacing the GLXPixmap by an XWindow is the right approach. Please make a cleaned-up patch, set review? bjacob and I'll review it!
Also note that you will be able to remove the GLX 1.3 requirement in GLXtest as it was only needed for GLXPixmaps.
(Assignee)

Comment 33

5 years ago
The crash for the indirect case is reported 
in the Unofficial AMD Linux Bugzilla:

http://ati.cchtml.com/show_bug.cgi?id=585

Should the diff be against glxtest.cpp in toolkit/xre/ ?
(Assignee)

Comment 34

5 years ago
Ok, I made the patch against toolkit/xre for ff10 and the current version.
I have tested the ff10 version, which is in SLES10, and the xorg server
crashes do not occur anymore, so this should be solved.

I have not compiled the other version of the patch, but it applies...
(Assignee)

Comment 35

5 years ago
Created attachment 651699 [details] [diff] [review]
for ff 10
Attachment #651344 - Attachment is obsolete: true
(Assignee)

Comment 36

5 years ago
Created attachment 651701 [details] [diff] [review]
for current checkout (ff14)
Attachment #651699 - Flags: review?(bjacob)
Comment on attachment 651701 [details] [diff] [review]
for current checkout (ff14)

Setting review? :bjacob ensures that I can't forget about it.
Attachment #651701 - Flags: review?(bjacob)
Comment on attachment 651701 [details] [diff] [review]
for current checkout (ff14)

Review of attachment 651701 [details] [diff] [review]:
-----------------------------------------------------------------

Almost! Please make the minor adjustments described below:

::: src.org//toolkit/xre/glxtest.cpp
@@ +171,5 @@
> +  swa.colormap = XCreateColormap(dpy, RootWindow(dpy, vInfo->screen),
> +                                 vInfo->visual, AllocNone);
> +  swa.border_pixel = 0;
> +  win1 = XCreateWindow(dpy, RootWindow(dpy, vInfo->screen),
> +                       10, 10, 200, 200,

Please use a smaller size (and check that it still doesn't crash) to avoid needless memory usage. I would use 16x16, a good compromise between small and avoiding corner cases with extremely small sizes.

@@ +208,3 @@
>    ///// possible. Also we want to check that we're able to do that too without generating X errors.
>    glXMakeCurrent(dpy, None, NULL); // must release the GL context before destroying it
>    glXDestroyContext(dpy, context);

Please explicitly destroy the window -- even though the XCloseDisplay below should, I guess, do it.
Attachment #651701 - Flags: review?(bjacob) → review-
Comment on attachment 651699 [details] [diff] [review]
for ff 10

Let's first land this on mozilla-central, then backport.
Attachment #651699 - Flags: review?(bjacob)
(Assignee)

Comment 40

5 years ago
Created attachment 652708 [details] [diff] [review]
patch for ff10 V2

Changes as suggested by Benoit Jacob
Attachment #651699 - Attachment is obsolete: true
(Assignee)

Comment 41

5 years ago
Created attachment 652709 [details] [diff] [review]
patch for ff14 V2

Changes as suggested by Benoit Jacob
Attachment #651701 - Attachment is obsolete: true
(Assignee)

Comment 42

5 years ago
Created attachment 652710 [details] [diff] [review]
improvement for ff10 V2, which works on nvidia in the remote case too.

Changes as suggested by Benoit Jacob
+ remove the attribs.

On Nvidia in the indirect case (remote PC) the glxtest.cpp aborts
with No FBConfigs found. With this patch it works.
(Assignee)

Comment 43

5 years ago
Created attachment 652711 [details] [diff] [review]
improvement for ff14 V2, which works on nvidia in the remote case too.

Changes as suggested by Benoit Jacob
+ remove the attribs.

On Nvidia in the indirect case (remote PC) the glxtest.cpp aborts
with No FBConfigs found. With this patch it works.
Comment on attachment 652711 [details] [diff] [review]
improvement for ff14 V2, which works on nvidia in the remote case too.

Thanks! Again: without the 'review' flag (under 'details') I too easily forget about your patch. And we will only want to take this patch on mozilla-central, perhaps mozilla-aurora -- not beta/release/esr.
Attachment #652711 - Flags: review?(bjacob)
Comment on attachment 652711 [details] [diff] [review]
improvement for ff14 V2, which works on nvidia in the remote case too.

Review of attachment 652711 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thanks!
Attachment #652711 - Flags: review?(bjacob) → review+
Whiteboard: [sg:dos](blacklist FGLRX if confirmed on another system) → [sg:dos]
https://tbpl.mozilla.org/?tree=Try&rev=b96864829813
That's weird, TBPL reports a memory leak that reproduced with this patch alone.

Here's a try explicitly destroying the Colormap:
https://tbpl.mozilla.org/?tree=Try&rev=a38843ce365c
Still that weird leak, which I can only interpret as a crash, maybe due to antique drivers on the test slave. New try with debug output:

https://tbpl.mozilla.org/?tree=Try&rev=a324307b7f42
glxtest g
X connection to :2.0 broken (explicit kill or server shutdown).

Ouch. This means that GL context creation makes the X server crash, on our test machines with NVIDIA 190.42 driver.
I modified the code around here to be closer to what glxinfo does... hopefully it won't crash the X server anymore...
https://tbpl.mozilla.org/?tree=Try&rev=f7b3ae88c99f
(Assignee)

Comment 51

5 years ago
When I wrote the patch, I thought about:

glXCreateContext(Display*  dpy,XVisualInfo*  vis,GLXContext  shareList,Bool  direct);

and from the manpage:
>http://www.opengl.org/sdk/docs/man/xhtml/glXCreateContext.xml
>Notes:It may not be possible to render to a GLX pixmap with a direct rendering
>      context.

OpenGL supports the call glXIsDirect().

But I haven't found it among the exported symbols in libGL.
So if this call can be loaded with dlsym, it should be possible to work
around the note from glXCreateContext.
Btw: I tested with Nvidia 295.71 and fglrx 12.4.
New try (previous one failed to compile):
https://tbpl.mozilla.org/?tree=Try&rev=feccc1c0eb5f

This one is as close as possible to what the traditional |glxinfo| program does:
http://cvsweb.xfree86.org/cvsweb/xc/programs/glxinfo/glxinfo.c?rev=HEAD&content-type=text/vnd.viewcvs-markup

In particular it uses glXCreateContext.

These functions are not usual symbols in the sense of dlsym, that's why we have to use glXGetProcAddress.

Also, I figured out what's happening here, making this fail only on 'B' (build phase) on our try servers: the B tests run on the build machines, not on the test machines. So the X crash here is specific to whatever driver is on the build machines. The NVIDIA driver I was referring to above was on the test machines.
Green!
Created attachment 655594 [details] [diff] [review]
imitate what glxinfo does to avoid crashing the X server on the build slaves

See above comments. While Martin's patch was fine, it crashes the X server on the build slaves (not on the test slaves). To end these issues, the simplest approach seemed to be to imitate glxinfo as much as possible; that works.
Attachment #655594 - Flags: review?(karlt)
Note that this patch applies on top of Martin's, whence the new code in the context lines.
(In reply to Benoit Jacob [:bjacob] from comment #52)
> In particular it uses glXCreateContext.

http://www.opengl.org/sdk/docs/man/xhtml/glXCreateNewContext.xml says
"This context can be used to render into GLX windows, pixmaps, or pixel buffers."
while http://www.opengl.org/sdk/docs/man/xhtml/glXCreateContext.xml says
"This context can be used to render into both windows and GLX pixmaps."

Maybe the driver was expecting a GLXWindow instead of a Window, or maybe I'm reading too much into that difference, as perhaps there was no such thing as a GLX window when the glXCreateContext documentation was written.
Comment on attachment 655594 [details] [diff] [review]
imitate what glxinfo does to avoid crashing the X server on the build slaves

It's a shame that glxtest is behaving less like the core Gecko code, and so it is not so much testing what we want to test.

However, there may be no other options, given we want to avoid crashing servers.
Bug 772446 should mean we soon no longer need to care about CentOS 5, but I guess there may be other systems out there.
Attachment #655594 - Flags: review?(karlt) → review+
(In reply to Karl Tomlinson (:karlt) from comment #57)
> Comment on attachment 655594 [details] [diff] [review]
> imitate what glxinfo does to avoid crashing the X server on the build slaves
> 
> It's a shame that glxtest is behaving less like the core Gecko code, and so
> it is not so much testing what we want to test.

I had thought and worried about that in the past. I was tempted to make glxtest actually test whatever we end up doing in libxul, to detect any issues. I backed out of that approach because it would only have scaled if glxtest itself could call into the libxul GLContext code. That would have pulled in a lot of dependencies, which in turn would have meant that glxtest had to run much later during startup, risking a startup delay.

So the approach that we had already been following here was to keep glxtest minimal and just query GL strings. Nothing new here.
Oh yes, and running glxtest later during startup would also complicate things: we'd have to manually turn off the crash reporter, etc.
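To illustrate why a small, separate probe is attractive, here is a rough sketch of running a probe in a child process and collecting its output over a pipe, so that a crash in the probe does not take the parent down with it. All names here (ProbeFn, RunProbeInChild, FakeProbe) and the pipe-based reporting format are assumptions for illustration, not the actual glxtest code; FakeProbe stands in for the real GL string queries.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <string>

// Hypothetical probe signature: writes its findings to `fd`.
typedef void (*ProbeFn)(int fd);

// Runs `probe` in a forked child and appends whatever it wrote to `out`.
// Returns true only if the child exited normally with status 0, so a
// crashing probe is reported as a failure instead of killing the caller.
bool RunProbeInChild(ProbeFn probe, std::string& out) {
  assert(probe != nullptr);
  int fds[2];
  if (pipe(fds) != 0) {
    return false;
  }
  pid_t pid = fork();
  if (pid < 0) {
    close(fds[0]);
    close(fds[1]);
    return false;
  }
  if (pid == 0) {
    // Child: run the probe, then exit without running parent cleanup.
    close(fds[0]);
    probe(fds[1]);
    close(fds[1]);
    _exit(0);
  }
  // Parent: read until the child closes its end of the pipe.
  close(fds[1]);
  char buf[256];
  ssize_t n;
  while ((n = read(fds[0], buf, sizeof buf)) > 0) {
    out.append(buf, static_cast<size_t>(n));
  }
  close(fds[0]);
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) {
    return false;
  }
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

// Stand-in probe: a real one would create a GL context and report the
// vendor, renderer and version strings.
void FakeProbe(int fd) {
  const char msg[] = "VENDOR\nATI Technologies Inc.\n";
  if (write(fd, msg, sizeof msg - 1) < 0) {
    _exit(1);
  }
}
```

Calling RunProbeInChild(FakeProbe, out) fills out with the probe's text. Note that this kind of isolation only protects the parent process; it does not help when the driver crashes the X server itself, which is the failure discussed in this bug.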
http://hg.mozilla.org/integration/mozilla-inbound/rev/e461878f0567
http://hg.mozilla.org/integration/mozilla-inbound/rev/920d71aa1d2c
Assignee: nobody → martin.vogt
Target Milestone: --- → mozilla17
Version: 6 Branch → Trunk
Comment on attachment 652711 [details] [diff] [review]
improvement for ff14 V2, which works on nvidia in the remote case too.


[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 639842 (Firefox 6)
User impact if declined: a small percentage of Linux users can't use Firefox at all, as it crashes their X server immediately on startup
Testing completed (on m-c, etc.): m-i
Risk to taking this patch (and alternatives if risky): low risk. Of course, as we noted here, any change can always run into a different driver crash, but at least the new code is very close to the traditional glxinfo program.
String or UUID changes made by this patch: none
Attachment #652711 - Flags: approval-mozilla-beta?
Attachment #652711 - Flags: approval-mozilla-aurora?
Comment on attachment 655594 [details] [diff] [review]
imitate what glxinfo does to avoid crashing the X server on the build slaves

[Approval Request Comment]
See previous comment.
Attachment #655594 - Flags: approval-mozilla-beta?
Attachment #655594 - Flags: approval-mozilla-aurora?

Updated

5 years ago
Target Milestone: mozilla17 → mozilla18
https://hg.mozilla.org/mozilla-central/rev/e461878f0567
https://hg.mozilla.org/mozilla-central/rev/920d71aa1d2c
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Attachment #652711 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #652711 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 655594 [details] [diff] [review]
imitate what glxinfo does to avoid crashing the X server on the build slaves

Since it's very early in the cycle, and the fix is low risk, we can take this for uplift.
Attachment #655594 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #655594 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/9f7f69b1bfa9
https://hg.mozilla.org/releases/mozilla-aurora/rev/5dc0a1568b70

https://hg.mozilla.org/releases/mozilla-beta/rev/fa3a1f0b6796
https://hg.mozilla.org/releases/mozilla-beta/rev/73e45f0fba05
status-firefox16: --- → fixed
status-firefox17: --- → fixed
status-firefox18: --- → fixed
Keywords: verifyme
Is there something I could do here to verify this is fixed? Ubuntu and Fedora systems available. 

Or any issues I should look for on Linux?
(In reply to Virgil Dicu [:virgil] [QA] from comment #66)
> Is there something I could do here to verify this is fixed? Ubuntu and
> Fedora systems available. 
> 
> Or any issues I should look for on Linux?

I suspect this may be difficult to verify as it requires specific hardware (ATI) and driver (Catalyst) on the server machine, and a remote connection (perhaps ssh -Y from the same machine).

Check that this doesn't cause bug 799544 on an older system with Mesa (open source, any hardware) drivers and LIBGL_ALWAYS_SOFTWARE=1 in the environment.
Depends on: 799544
(In reply to Karl Tomlinson (:karlt) from comment #67)
> Check that this doesn't cause bug 799544 on an older system with Mesa (open
> source, any hardware) drivers and LIBGL_ALWAYS_SOFTWARE=1 in the environment.

Ok, verified this on the systems I currently have access to with the following drivers on Firefox 17 beta5:

nouveau -- Gallium 0.4 on NV98 -- 2.1 Mesa 8.0.2 (Ubuntu 12.04)
X.org -- Gallium 0.4 on AMD RS780 -- 2.1 Mesa 8.0.2 (Fedora 17)

No crashes when loading about:support.

I think this is all I can do here with my current hardware. If there is anything left to do, re-set the flag, please.
status-firefox17: fixed → verified
Mozilla/5.0 (X11; Linux i686; rv:18.0) Gecko/18.0 Firefox/18.0

Same result as in comment 68 on Ubuntu and Fedora. Firefox 18 beta 1.
status-firefox18: fixed → verified
mass remove verifyme requests greater than 4 months old
Keywords: verifyme