Closed Bug 659842 Opened 13 years ago Closed 13 years ago

X_GLXMakeCurrent: GLXBadContextTag SIGABRT with indirect Mesa at closedown and on about:support

Product/Component: Core :: Graphics (defect)
Platform: All, Linux
Priority: Not set
Severity: critical
Status: VERIFIED FIXED
Target Milestone: mozilla7
Tracking Status: firefox5 --- wontfix; firefox6 + fixed
Reporter: tonymec
Assigned: bjacob
Keywords: crash, regression, topcrash

Attachments (5 files, 8 obsolete files):
10.64 KB, patch (karlt: review+)
717 bytes, patch (bjacob: review-)
4.68 KB, patch
6.05 KB, application/octet-stream
855 bytes, patch (karlt: review+)

This bug was filed from the Socorro interface and is report bp-db62e4c1-c47f-40ff-8501-748932110525.
============================================================= 
«at closedown»

Mozilla/5.0 (X11; Linux x86_64; rv:7.0a1) Gecko/20110525 Firefox/7.0a1 SeaMonkey/2.2a1pre
ID:20110525003002

also:
bp-3a718eb4-8f48-427c-b637-5aca92110525 «after Ctrl+Q in Safe Mode»
bp-8b03e8d0-b942-4ecb-a2bf-1993e2110525 «after clicking "Restart with Add-ons Disabled" then okaying the popups»

Since installing this build, I get this crash at every closedown of SeaMonkey, even in Safe Mode.

NB: CPU is "amd64" according to Breakpad/Socorro (as shown below) but "GenuineIntel" (model name: Intel(R) Pentium(R) 4 CPU 2.80GHz) according to openSUSE YaST.

Here comes the crash report. Sorry, this nightly was apparently built without symbols.

Signature	libc-2.11.3.so@0x32ab5
UUID	db62e4c1-c47f-40ff-8501-748932110525
Uptime	35.6 minutes
Last Crash	2833 seconds (47.2 minutes) before submission
Install Age	15300 seconds (4.2 hours) since version was first installed.
Install Time	2011-05-25 21:07:23
Product	SeaMonkey
Version	2.2a1pre
Build ID	20110525003002
Release Channel	nightly
Branch	2.2
OS	Linux
OS Version	0.0.0 Linux 2.6.37.6-0.5-desktop #1 SMP PREEMPT 2011-04-25 21:48:33 +0200 x86_64
CPU	amd64
CPU Info	family 15 model 4 stepping 1
Crash Reason	SIGABRT
Crash Address	0x363e
User Comments	at closedown
App Notes 	OpenGL: Mesa Project -- Software Rasterizer -- 1.4 (2.1 Mesa 7.10.2)
WebGL? libGL.so.1? libGL.so.1+
GL Context? GL Context+
WebGL-
X_GLXMakeCurrent: GLXBadContextTag
xpcom_runtime_abort(###!!! ABORT: X_GLXMakeCurrent: GLXBadContextTag: file /builds/slave/comm-cen-trunk-lnx64-ntly/build/mozilla/toolkit/xre/nsX11ErrorHandler.cpp, line 199)
Processor Notes 	
EMCheckCompatibility	False
Winsock LSP	
Adapter Vendor ID	
Adapter Device ID	
Crashing Thread
Frame 	Module 	Signature [Expand] 	Source
0 	libc-2.11.3.so 	libc-2.11.3.so@0x32ab5 	
1 	libc-2.11.3.so 	libc-2.11.3.so@0x33fb5 	
2 	libxul.so 	libxul.so@0x1e7e78f 	
3 	libxul.so 	libxul.so@0x20563c5 	
4 	libxul.so 	libxul.so@0x2056fbf 	
5 	libc-2.11.3.so 	libc-2.11.3.so@0x6b1f 	
6 	libmozalloc.so 	libmozalloc.so@0x517 	
7 	libxul.so 	libxul.so@0x29e022f 	
8 	libxul.so 	libxul.so@0x1e7e177 	
9 	libplds4.so 	libplds4.so@0x202fff 	
10 	ld-2.11.3.so 	ld-2.11.3.so@0xcb10 	
11 	libmozalloc.so 	libmozalloc.so@0x1091 	
12 	libxul.so 	libxul.so@0x16bf038 	
13 	libxul.so 	libxul.so@0x16d7052 	
14 	libGL.so.1.2 	libGL.so.1.2@0xd52d 	
15 	libxul.so 	libxul.so@0x6d6def 	
16 	libxul.so 	libxul.so@0x6d6ea0 	
17 	libxul.so 	libxul.so@0x16d423e 	
18 	libxul.so 	libxul.so@0x20563c5 	
19 	ld-2.11.3.so 	ld-2.11.3.so@0x9225 	
20 	libxul.so 	libxul.so@0x1095f 	
21 	libxul.so 	libxul.so@0x2b80f 	
22 	libxul.so 	libxul.so@0x16678c7 	
23 	libxul.so 	libxul.so@0x43b19 	
24 	libxul.so 	libxul.so@0x4365f 	
25 	libxul.so 	libxul.so@0x2b80f 	
26 	ld-2.11.3.so 	ld-2.11.3.so@0x9531 	
27 	libxul.so 	libxul.so@0x1e7e66c 	
28 	libxul.so 	libxul.so@0x20563c5 	
29 	libxul.so 	libxul.so@0x43b19 	
30 	libxul.so 	libxul.so@0x1e7e78f 	
31 	libxul.so 	libxul.so@0x20563c5 	
32 	libxul.so 	libxul.so@0x2056fbf 	
33 	libxul.so 	libxul.so@0x29e022f 	
34 	libxul.so 	libxul.so@0x16d423e 	
35 	libxul.so 	libxul.so@0x2b80f 	
36 	libxul.so 	libxul.so@0x29e022f 	
37 	libxpcom.so 	libxpcom.so@0x203fff 	
38 	ld-2.11.3.so 	ld-2.11.3.so@0xcb10 	
39 	libxul.so 	libxul.so@0x2b80f 	
40 	libxul.so 	libxul.so@0x6d1e79
The same crash happened when clicking "Reload" on about:support at first startup of this build, after noticing that no extensions were listed (see also bug 659772): bp-a058f651-88d6-401b-854a-b38cc2110525
(In reply to comment #0)
> 1.4 (2.1 Mesa 7.10.2)

This is what is normally displayed with indirect rendering.
Can you check with "LIBGL_DEBUG=verbose glxinfo", please, that it says "direct rendering: No" and see what it says about why it is indirect?

> X_GLXMakeCurrent: GLXBadContextTag

Same issue was reported in bug 645407 comment 10 and 17.

http://www.opengl.org/documentation/specs/glx/glx1.4.pdf says:

"The following error codes may be generated by a faulty GLX implementation,
but would not normally be visible to clients:
GLXBadContextTag A rendering request contains an invalid context tag.
(Context tags are used to identify contexts in the protocol.)"

In bug 429604 comment 0 it was happening when trying to MakeCurrent a context whose drawable had already been destroyed.

I guess this is a regression from bug 645407 because before then Mesa libGL was blacklisted.
Blocks: 645407
Component: General → Graphics
Product: SeaMonkey → Core
QA Contact: general → thebes
Summary: crash [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f] SIGABRT at closedown → X_GLXMakeCurrent: GLXBadContextTag SIGABRT with indirect software Mesa at closedown and on about:support [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f]
Keywords: regression
In reply to comment #2

linux:~ # LIBGL_DEBUG=verbose glxinfo
name of display: :0
display: :0  screen: 0
direct rendering: No
server glx vendor string: SGI
server glx version string: 1.4
server glx extensions:
    GLX_ARB_multisample, GLX_EXT_visual_info, GLX_EXT_visual_rating, 
    GLX_EXT_import_context, GLX_EXT_texture_from_pixmap, GLX_OML_swap_method, 
    GLX_SGI_make_current_read, GLX_SGIS_multisample, GLX_SGIX_hyperpipe, 
    GLX_SGIX_swap_barrier, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer, 
    GLX_MESA_copy_sub_buffer, GLX_INTEL_swap_event
client glx vendor string: Mesa Project and SGI
client glx version string: 1.4
client glx extensions:
    GLX_ARB_get_proc_address, GLX_ARB_multisample, GLX_EXT_import_context, 
    GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_MESA_copy_sub_buffer, 
    GLX_MESA_swap_control, GLX_OML_swap_method, GLX_OML_sync_control, 
    GLX_SGI_make_current_read, GLX_SGI_swap_control, GLX_SGI_video_sync, 
    GLX_SGIS_multisample, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer, 
    GLX_SGIX_visual_select_group, GLX_EXT_texture_from_pixmap, 
    GLX_INTEL_swap_event
GLX extensions:
    GLX_ARB_get_proc_address, GLX_ARB_multisample, GLX_EXT_import_context, 
    GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_MESA_copy_sub_buffer, 
    GLX_OML_swap_method, GLX_SGI_make_current_read, GLX_SGIS_multisample, 
    GLX_SGIX_fbconfig, GLX_SGIX_pbuffer, GLX_EXT_texture_from_pixmap, 
    GLX_INTEL_swap_event
OpenGL vendor string: Mesa Project
OpenGL renderer string: Software Rasterizer
OpenGL version string: 1.4 (2.1 Mesa 7.10.2)
OpenGL extensions:
    GL_ARB_depth_texture, GL_ARB_draw_buffers, GL_ARB_fragment_program, 
    GL_ARB_fragment_program_shadow, GL_ARB_multisample, GL_ARB_multitexture, 
    GL_ARB_occlusion_query, GL_ARB_point_parameters, GL_ARB_point_sprite, 
    GL_ARB_shadow, GL_ARB_shadow_ambient, GL_ARB_texture_border_clamp, 
    GL_ARB_texture_compression, GL_ARB_texture_cube_map, 
    GL_ARB_texture_env_add, GL_ARB_texture_env_combine, 
    GL_ARB_texture_env_crossbar, GL_ARB_texture_env_dot3, 
    GL_ARB_texture_mirrored_repeat, GL_ARB_texture_non_power_of_two, 
    GL_ARB_texture_rectangle, GL_ARB_transpose_matrix, GL_ARB_vertex_program, 
    GL_ARB_window_pos, GL_EXT_abgr, GL_EXT_bgra, GL_EXT_blend_color, 
    GL_EXT_blend_equation_separate, GL_EXT_blend_func_separate, 
    GL_EXT_blend_logic_op, GL_EXT_blend_minmax, GL_EXT_blend_subtract, 
    GL_EXT_copy_texture, GL_EXT_draw_range_elements, GL_EXT_fog_coord, 
    GL_EXT_framebuffer_object, GL_EXT_multi_draw_arrays, GL_EXT_packed_pixels, 
    GL_EXT_paletted_texture, GL_EXT_point_parameters, GL_EXT_polygon_offset, 
    GL_EXT_rescale_normal, GL_EXT_secondary_color, 
    GL_EXT_separate_specular_color, GL_EXT_shadow_funcs, 
    GL_EXT_shared_texture_palette, GL_EXT_stencil_two_side, 
    GL_EXT_stencil_wrap, GL_EXT_subtexture, GL_EXT_texture, GL_EXT_texture3D, 
    GL_EXT_texture_edge_clamp, GL_EXT_texture_env_add, 
    GL_EXT_texture_env_combine, GL_EXT_texture_env_dot3, 
    GL_EXT_texture_lod_bias, GL_EXT_texture_mirror_clamp, 
    GL_EXT_texture_object, GL_EXT_texture_rectangle, GL_EXT_vertex_array, 
    GL_3DFX_texture_compression_FXT1, GL_APPLE_packed_pixels, 
    GL_ATI_draw_buffers, GL_ATI_texture_env_combine3, 
    GL_ATI_texture_mirror_once, GL_ATIX_texture_env_combine3, 
    GL_IBM_texture_mirrored_repeat, GL_INGR_blend_func_separate, 
    GL_MESA_pack_invert, GL_MESA_ycbcr_texture, GL_NV_blend_square, 
    GL_NV_depth_clamp, GL_NV_fragment_program, GL_NV_fragment_program_option, 
    GL_NV_light_max_exponent, GL_NV_point_sprite, GL_NV_texgen_reflection, 
    GL_NV_texture_env_combine4, GL_NV_texture_rectangle, GL_NV_vertex_program, 
    GL_NV_vertex_program1_1, GL_SGIS_generate_mipmap, 
    GL_SGIS_texture_border_clamp, GL_SGIS_texture_edge_clamp, 
    GL_SGIS_texture_lod, GL_SGIX_shadow_ambient, GL_SUN_multi_draw_arrays
glu version: 1.3
glu extensions:
    GLU_EXT_nurbs_tessellator, GLU_EXT_object_space_tess

***
*** WARNING: Direct Rendering is NOT enabled
***


   visual  x  bf lv rg d st colorbuffer ax dp st accumbuffer  ms  cav
 id dep cl sp sz l  ci b ro  r  g  b  a bf th cl  r  g  b  a ns b eat
----------------------------------------------------------------------
0x21 24 tc  0 32  0 r  y  .  8  8  8  8  0 24  8  0  0  0  0  0 0 None
0xc2 24 tc  0 24  0 r  .  .  8  8  8  0  0  0  0  0  0  0  0  0 0 None
0xc3 24 tc  0 24  0 r  .  .  8  8  8  0  0  0  0 16 16 16  0  0 0 Slow
0xc4 24 tc  0 24  0 r  y  .  8  8  8  0  0  0  0  0  0  0  0  0 0 None
0xc5 24 tc  0 24  0 r  y  .  8  8  8  0  0  0  0 16 16 16  0  0 0 Slow
0xc6 24 tc  0 24  0 r  .  .  8  8  8  0  0  0  8  0  0  0  0  0 0 None
0xc7 24 tc  0 24  0 r  .  .  8  8  8  0  0  0  8 16 16 16  0  0 0 Slow
0xc8 24 tc  0 24  0 r  y  .  8  8  8  0  0  0  8  0  0  0  0  0 0 None
0xc9 24 tc  0 24  0 r  y  .  8  8  8  0  0  0  8 16 16 16  0  0 0 Slow
0xca 24 tc  0 24  0 r  .  .  8  8  8  0  0 24  0  0  0  0  0  0 0 None
0xcb 24 tc  0 24  0 r  .  .  8  8  8  0  0 24  0 16 16 16  0  0 0 Slow
0xcc 24 tc  0 24  0 r  y  .  8  8  8  0  0 24  0  0  0  0  0  0 0 None
0xcd 24 tc  0 24  0 r  y  .  8  8  8  0  0 24  0 16 16 16  0  0 0 Slow
0xce 24 tc  0 24  0 r  .  .  8  8  8  0  0 24  8  0  0  0  0  0 0 None
0xcf 24 tc  0 24  0 r  .  .  8  8  8  0  0 24  8 16 16 16  0  0 0 Slow
0xd0 24 tc  0 24  0 r  y  .  8  8  8  0  0 24  8  0  0  0  0  0 0 None
0xd1 24 tc  0 24  0 r  y  .  8  8  8  0  0 24  8 16 16 16  0  0 0 Slow
0xd2 24 tc  0 32  0 r  .  .  8  8  8  8  0  0  0  0  0  0  0  0 0 None
0xd3 24 tc  0 32  0 r  .  .  8  8  8  8  0  0  0 16 16 16 16  0 0 Slow
0xd4 24 tc  0 32  0 r  y  .  8  8  8  8  0  0  0  0  0  0  0  0 0 None
0xd5 24 tc  0 32  0 r  y  .  8  8  8  8  0  0  0 16 16 16 16  0 0 Slow
0xd6 24 tc  0 32  0 r  .  .  8  8  8  8  0  0  8  0  0  0  0  0 0 None
0xd7 24 tc  0 32  0 r  .  .  8  8  8  8  0  0  8 16 16 16 16  0 0 Slow
0xd8 24 tc  0 32  0 r  y  .  8  8  8  8  0  0  8  0  0  0  0  0 0 None
0xd9 24 tc  0 32  0 r  y  .  8  8  8  8  0  0  8 16 16 16 16  0 0 Slow
0xda 24 tc  0 32  0 r  .  .  8  8  8  8  0 24  0  0  0  0  0  0 0 None
0xdb 24 tc  0 32  0 r  .  .  8  8  8  8  0 24  0 16 16 16 16  0 0 Slow
0xdc 24 tc  0 32  0 r  y  .  8  8  8  8  0 24  0 16 16 16 16  0 0 Slow
0xdd 24 tc  0 32  0 r  .  .  8  8  8  8  0 24  8  0  0  0  0  0 0 None
0xde 24 tc  0 32  0 r  .  .  8  8  8  8  0 24  8 16 16 16 16  0 0 Slow
0xdf 24 tc  0 32  0 r  y  .  8  8  8  8  0 24  8 16 16 16 16  0 0 Slow
0xe0 24 dc  0 24  0 r  .  .  8  8  8  0  0  0  0  0  0  0  0  0 0 None
0xe1 24 dc  0 24  0 r  .  .  8  8  8  0  0  0  0 16 16 16  0  0 0 Slow
0xe2 24 dc  0 24  0 r  y  .  8  8  8  0  0  0  0  0  0  0  0  0 0 None
0xe3 24 dc  0 24  0 r  y  .  8  8  8  0  0  0  0 16 16 16  0  0 0 Slow
0xe4 24 dc  0 24  0 r  .  .  8  8  8  0  0  0  8  0  0  0  0  0 0 None
0xe5 24 dc  0 24  0 r  .  .  8  8  8  0  0  0  8 16 16 16  0  0 0 Slow
0xe6 24 dc  0 24  0 r  y  .  8  8  8  0  0  0  8  0  0  0  0  0 0 None
0xe7 24 dc  0 24  0 r  y  .  8  8  8  0  0  0  8 16 16 16  0  0 0 Slow
0xe8 24 dc  0 24  0 r  .  .  8  8  8  0  0 24  0  0  0  0  0  0 0 None
0xe9 24 dc  0 24  0 r  .  .  8  8  8  0  0 24  0 16 16 16  0  0 0 Slow
0xea 24 dc  0 24  0 r  y  .  8  8  8  0  0 24  0  0  0  0  0  0 0 None
0xeb 24 dc  0 24  0 r  y  .  8  8  8  0  0 24  0 16 16 16  0  0 0 Slow
0xec 24 dc  0 24  0 r  .  .  8  8  8  0  0 24  8  0  0  0  0  0 0 None
0xed 24 dc  0 24  0 r  .  .  8  8  8  0  0 24  8 16 16 16  0  0 0 Slow
0xee 24 dc  0 24  0 r  y  .  8  8  8  0  0 24  8  0  0  0  0  0 0 None
0xef 24 dc  0 24  0 r  y  .  8  8  8  0  0 24  8 16 16 16  0  0 0 Slow
0xf0 24 dc  0 32  0 r  .  .  8  8  8  8  0  0  0  0  0  0  0  0 0 None
0xf1 24 dc  0 32  0 r  .  .  8  8  8  8  0  0  0 16 16 16 16  0 0 Slow
0xf2 24 dc  0 32  0 r  y  .  8  8  8  8  0  0  0  0  0  0  0  0 0 None
0xf3 24 dc  0 32  0 r  y  .  8  8  8  8  0  0  0 16 16 16 16  0 0 Slow
0xf4 24 dc  0 32  0 r  .  .  8  8  8  8  0  0  8  0  0  0  0  0 0 None
0xf5 24 dc  0 32  0 r  .  .  8  8  8  8  0  0  8 16 16 16 16  0 0 Slow
0xf6 24 dc  0 32  0 r  y  .  8  8  8  8  0  0  8  0  0  0  0  0 0 None
0xf7 24 dc  0 32  0 r  y  .  8  8  8  8  0  0  8 16 16 16 16  0 0 Slow
0xf8 24 dc  0 32  0 r  .  .  8  8  8  8  0 24  0  0  0  0  0  0 0 None
0xf9 24 dc  0 32  0 r  .  .  8  8  8  8  0 24  0 16 16 16 16  0 0 Slow
0xfa 24 dc  0 32  0 r  y  .  8  8  8  8  0 24  0  0  0  0  0  0 0 None
0xfb 24 dc  0 32  0 r  y  .  8  8  8  8  0 24  0 16 16 16 16  0 0 Slow
0xfc 24 dc  0 32  0 r  .  .  8  8  8  8  0 24  8  0  0  0  0  0 0 None
0xfd 24 dc  0 32  0 r  .  .  8  8  8  8  0 24  8 16 16 16 16  0 0 Slow
0xfe 24 dc  0 32  0 r  y  .  8  8  8  8  0 24  8  0  0  0  0  0 0 None
0xff 24 dc  0 32  0 r  y  .  8  8  8  8  0 24  8 16 16 16 16  0 0 Slow
0x41 32 tc  0 32  0 r  y  .  8  8  8  8  0 24  0  0  0  0  0  0 0 None
Karl: do you think I could (and could you tell me how to) disable high-speed graphics for the time being? (Until yesterday about:support said: "Graphics: 0/2")
I thought this was just related to about:support and possibly WebGL.

I thought about:support should still be saying "Graphics: 0/2".  OpenGL (low-speed) composition landed briefly but was backed out.

If you are seeing "Graphics: 2/2", then try disabling Preferences -> Advanced -> General -> "Use hardware acceleration where available".
(It's a bad name.  What it really means is "Prefer OpenGL over RENDER for layers composition if deemed appropriate".)  I don't know that this is related to your issue though.

I think what you want to do is disable use of OpenGL in about:support, but I don't know how to do that.  Benoit or Matt might know?
In about:config, set webgl.disabled and layers.acceleration.disabled.
about:support still logs "Created offscreen FBO" with webgl.disabled and layers.acceleration.disabled both set to true in a build based on f9a070327df8.
In reply to comment #5:
I have no such "Use hardware acceleration where available" checkbox in SeaMonkey.

What I'm now seeing at the bottom of about:support is:
Graphics
    Adapter Description        Mesa Project -- Software Rasterizer
    Driver Version             1.4 (2.1 Mesa 7.10.2)
    WebGL Renderer             false

and in addition, when sending about:support to the clipboard, it adds the following, which I don't see when viewing the page:
    GPU Accelerated Windows    0/2

Until yesterday (in a build from 17 April) I didn't see the first three lines but I saw the fourth one.

(In reply to comment #6)
> In about:config, set webgl.disabled and layers.acceleration.disabled.
thanks, I'll try that, let's hope it masks these crashes.
(In reply to comment #7)
> about:support still "Created offscreen FBO" with webgl.disabled and
> layers.acceleration.disabled both set to true in a build based on
> f9a070327df8.

Then that's a bug we should fix!
Similar crash report on shutdown on a different system: bp-f328d261-dd63-4b80-8c8e-9b37e2110528
I can't reproduce this in Firefox. As soon as I set webgl.disabled, about:support ceases trying to create GL contexts. Specifically, about:support calls GfxInfoWebGL::GetWebGLParameter() which calls WebGLContext::SetDimensions which checks if WebGL is disabled or blacklisted before calling GLContextProvider::CreateOffscreen().

The original bug report here is about SeaMonkey, not Firefox. Could this be a SeaMonkey-specific bug, or a bug in an older version of this code that SeaMonkey is still using?
Summary: X_GLXMakeCurrent: GLXBadContextTag SIGABRT with indirect software Mesa at closedown and on about:support [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f] → [SeaMonkey] X_GLXMakeCurrent: GLXBadContextTag SIGABRT with indirect software Mesa at closedown and on about:support [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f]
(This bug report is getting confused between two different things, the crash as in comment 10 on the one hand, and the fact that SeaMonkey's about:support does GL stuff even when WebGL is disabled)
I'm still getting this crash at every closedown of SeaMonkey, and Socorro dutifully mentions this bug when displaying them, but no symbols so far. If someone could tell me how to get a SeaMonkey linux-x86_64 trunk build with symbols without compiling it myself, I would gladly do so, just so I could link a stack trace with symbols from this bug.

In reply to comment #12:
I have webgl.disabled = true in about:config, all other webgl prefs defaulted (.force-enabled = false, .force_osmesa = false, .osmesalib = "", .prefer-native-gl = false, .shader_validator = true, .verbose = false). Under layers.acceleration I also have .disabled (user set) = true and .force-enabled (default) = false. At the bottom of about:support I see (ATM, in this build):

Adapter Description      Mesa Project -- Software Rasterizer
Driver Version           1.4 (2.1 Mesa 7.10.2)
WebGL Renderer           false
GPU Accelerated Windows  0/3

If this build does _not_ crash when I close it down later today ("today" for my CEST timezone) to install the next nightly, I'll mention it here.

Mozilla/5.0 (X11; Linux x86_64; rv:7.0a1) Gecko/20110529 Firefox/7.0a1 SeaMonkey/2.2a1pre ID:20110529003002
Adding topcrash keyword: this crash is ATM #2 topcrasher for SeaMonkey 2.2a1pre for all three of 7- 14- and 28-day stats periods (and #1 on Linux, since #1 all over is a Windows-only crash).
Keywords: topcrash
(In reply to comment #12)
> (This bug report is getting confused between two different things, the crash
> as in comment 10 on the one hand,

That is this bug.
Note that the crash in comment 10 is from Firefox, the same crash as initially reported in SeaMonkey.

> and the fact that SeaMonkey's
> about:support does GL stuff even when WebGL is disabled)

That was discovered in trying to use the suggested workaround for this bug.

Firefox's about:support also does GL stuff when WebGL is disabled.

(In reply to comment #11)
> Specifically,
> about:support calls GfxInfoWebGL::GetWebGLParameter() which calls
> WebGLContext::SetDimensions which checks if WebGL is disabled or blacklisted
> before calling GLContextProvider::CreateOffscreen().

I see the blacklist check at
http://hg.mozilla.org/mozilla-central/annotate/d105fc895d91/content/canvas/src/WebGLContext.cpp#l458

but where is SafeToCreateCanvas3DContext() meant to be called or webgl.disabled checked elsewhere?
(In reply to comment #15)
> (In reply to comment #12)
> > (This bug report is getting confused between two different things, the crash
> > as in comment 10 on the one hand,
> 
> That is this bug.
> Note that the crash in comment 10 is from Firefox, the same crash as
> initially reported in SeaMonkey.

[...]

If it's the same crash (but with symbols in Firefox :-) ) then the bug's Summary doesn't need to say "[SeaMonkey]".
Summary: [SeaMonkey] X_GLXMakeCurrent: GLXBadContextTag SIGABRT with indirect software Mesa at closedown and on about:support [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f] → X_GLXMakeCurrent: GLXBadContextTag SIGABRT with indirect software Mesa at closedown and on about:support [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f]
(In reply to comment #15)
> (In reply to comment #11)
> > Specifically,
> > about:support calls GfxInfoWebGL::GetWebGLParameter() which calls
> > WebGLContext::SetDimensions which checks if WebGL is disabled or blacklisted
> > before calling GLContextProvider::CreateOffscreen().
> 
> I see the blacklist check at
> http://hg.mozilla.org/mozilla-central/annotate/d105fc895d91/content/canvas/src/WebGLContext.cpp#l458
> 
> but where is SafeToCreateCanvas3DContext() meant to be called or
> webgl.disabled checked elsewhere?

Ah! Sorry about that. I had the blacklisting in mind and forgot to check where we were checking for webgl.disabled. That is indeed in SafeToCreateCanvas3DContext() which is called from GetContext(). The bug here is that about:support bypasses GetContext, so it doesn't honor webgl.disabled.
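
To make the bypass concrete, here is a minimal sketch of the two call paths. It is a toy model, not Gecko code: `Prefs`, `ModelWebGLContext`, `GetContextPath` and `DirectPath` are hypothetical names, and the model's `SetDimensions` deliberately performs no pref check of its own, which is the assumption that reproduces the problem described above.

```cpp
// Toy model only: these types and functions are hypothetical stand-ins,
// not the real Gecko classes.
struct Prefs {
    bool webglDisabled = false;  // models the webgl.disabled pref
};

struct ModelWebGLContext {
    const Prefs& prefs;
    bool glContextCreated = false;

    explicit ModelWebGLContext(const Prefs& p) : prefs(p) {}

    // Models the step that creates a GL context. In this model it does
    // not look at webglDisabled at all.
    bool SetDimensions(int width, int height) {
        (void)width;
        (void)height;
        glContextCreated = true;
        return true;
    }
};

// Models the content path: the pref is checked before any GL work.
bool GetContextPath(ModelWebGLContext& ctx) {
    if (ctx.prefs.webglDisabled) {
        return false;
    }
    return ctx.SetDimensions(16, 16);
}

// Models the about:support path: it goes straight to SetDimensions,
// so the pref is never consulted.
bool DirectPath(ModelWebGLContext& ctx) {
    return ctx.SetDimensions(16, 16);
}
```

With `webglDisabled` set, `GetContextPath` refuses and creates nothing, while `DirectPath` still ends up with a GL context in this model.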
Attached patch: check if webgl is disabled (obsolete)
Here's a patch that should work (didn't try yet).

Also the reason why I failed to reproduce is probably that I did something wrong on my end (changed not only disabled but also force-enabled which I need to bypass the blacklist).
Attachment #536116 - Flags: review?(karlt)
(In reply to comment #18)
> Created attachment 536116 [details] [diff] [review]
> check if webgl is disabled
> 
> Here's a patch that should work (didn't try yet).

I tried it, it works here.
when you get reviews, please ask for approval on the patch for aurora for 6. thanks.
Comment on attachment 536116 [details] [diff] [review]
check if webgl is disabled

>+  // we need to check for webgl.disabled here, because otherwise it's only checked in GetContext() which
>+  // we're bypassing here. We can't call SafeToCreateCanvas3DContext here because it's whitelisting chrome altogether,
>+  // and here we're chrome and we don't want to be whitelisted.

This feels to me like giving different meanings to webgl.disabled in different places.

If it is not safe to use WebGL in about:support, then in which part of chrome would it be safe to use?

It is time to change the meaning of webgl.disabled, so that it disables webgl everywhere?
Or do we need separate prefs?  Perhaps a global disable-opengl pref?
(In reply to comment #21)
> It is time to change the meaning of webgl.disabled,

Sorry, I meant "Is it time to ..."
(In reply to comment #22)
> (In reply to comment #21)
> > It is time to change the meaning of webgl.disabled,
> 
> Sorry, I meant "Is it time to ..."

I think I agree with you. I don't see why code was written in such a way that webgl.disabled is not honored in chrome.
Mozilla/5.0 (X11; Linux x86_64; rv:7.0a1) Gecko/20110604 Firefox/7.0a1 SeaMonkey/2.2a1pre ID:20110604003003

Got the "about:support reload" variant of this crash for the second time (the first was comment #1): bp-b373949c-f5cc-4e84-a8cf-a05de2110604; on restart, all extensions were enabled, see bug 659772 comment #24

I'm still getting the other variant of this bug at every shutdown.
See Also: → 659772
Mozilla/5.0 (X11; Linux x86_64; rv:7.0a1) Gecko/20110607 Firefox/7.0a1 SeaMonkey/2.4a1 ID:20110607003003

My latest closedown had no crash, but it was after very little activity: after a series of startup crashes, I invoked "seamonkey -P default -url about:addons", uninstalled one extension (QuickFolders 2.6), shut down (with crash), restarted the same, checked that the addon was gone, shut down (no crash), and restarted again (with my usual multitab homepage). If the next closedown doesn't crash I'll comment again.

(Don't rush and deduce that it was all the addon's fault: in the past I've had this crash even in Safe Mode.)
Tony, we've not checked in any patch yet here, so don't expect anything to be fixed yet. I'll try to implement the idea in comment 21 ASAP.
(In reply to comment #26)
> Tony, we've not checked in any patch yet here, so don't expect anything to
> be fixed yet. I'll try to implement the idea in comment 21 ASAP.

No prob; just telling what I'm seeing, since AFAICT from Socorro reports most or all of the reports making this the #1 Linux topcrash are from me (and BTW, I don't crash for the pleasure of crashing ;-), it's just that as a nightly tester I close down SeaMonkey at least once a day, often more than that).
Bug 645407 comment 17 says he got this crash also with r300g, so it's not specific to software Mesa.

Maybe we're really doing something wrong here. Will make you a tryserver build that prints more debugging info.
Tryserver build: http://tbpl.mozilla.org/?tree=Try&rev=caf49abe7cdd

Once the builds are done, they will be available at:
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-caf49abe7cdd

Please run like this:

MOZ_X_SYNC=1 ./firefox -P -no-remote 2>&1 | tee logfile.txt

The MOZ_X_SYNC ensures that the X error occurs synchronously wrt debug output. The 2>&1 is because I print the debug info to stderr. Please then (compress and) attach logfile.txt.
This SafeToCreateCanvas3DContext method was apparently a remnant of the days when WebGL was called Canvas3D. This patch removes it; instead, the webgl.disabled pref is checked in SetDimensions, which is where the other prefs are checked and where we potentially create GL contexts. The prefs-checking code is moved up a bit so that it runs before we do anything about GL ContextFormats. Chrome canvases are no longer special-cased. The about:support code no longer does anything special: it just creates a WebGL canvas and checks whether that succeeds.
Attachment #536116 - Attachment is obsolete: true
Attachment #536116 - Flags: review?(karlt)
Attachment #538022 - Flags: review?(karlt)
(In reply to comment #28)
> Bug 645407 comment 17 says he got this crash also with r300g, so it's not
> specific to software Mesa.

Yes, but FYI that was also indirect, so it would be the same client-side code running.  (The device-dependent part would be on the server.)

> Maybe we're really doing something wrong here.

I wonder whether the context might need to be cleared (or something) when the drawable is destroyed.  Do we do that?
Summary: X_GLXMakeCurrent: GLXBadContextTag SIGABRT with indirect software Mesa at closedown and on about:support [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f] → X_GLXMakeCurrent: GLXBadContextTag SIGABRT with indirect Mesa at closedown and on about:support [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f]
Comment on attachment 538022 [details] [diff] [review]
kill SafeToCreateCanvas3DContex, check webgl.disabled in SetDimensions

Empty patch :)
Attachment #538022 - Flags: review?(karlt)
(In reply to comment #29)
> Tryserver build: http://tbpl.mozilla.org/?tree=Try&rev=caf49abe7cdd
> 
> Once the builds are done, they will be available at:
> https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-caf49abe7cdd
> 
> Please run like this:
> 
> MOZ_X_SYNC=1 ./firefox -P -no-remote 2>&1 | tee logfile.txt
> 
> The MOZ_X_SYNC ensures that the X error occurs synchronously wrt debug
> output. The 2>&1 is because I print the debug info to stderr. Please then
> (compress and) attach logfile.txt.

Is that instead of, or in addition to, the --sync command-line option to "make X calls synchronous"? I assumed "in addition to".

The crash report is at bp-733dd75f-a0cb-487e-b02f-8fc622110608

I usually experience this crash with SeaMonkey; this time I used your try-build of Firefox in an ad-hoc profile, viewed some pages including about:, about:support and about:addons, and got the expected closedown crash.
Sorry about the empty patch, here is the real patch.
Attachment #538022 - Attachment is obsolete: true
Attachment #538161 - Flags: review?(karlt)
(In reply to comment #34)
> > The MOZ_X_SYNC ensures that the X error occurs synchronously wrt debug
> > output. The 2>&1 is because I print the debug info to stderr. Please then
> > (compress and) attach logfile.txt.
> 
> Is that instead of, or in addition to, the --sync command-line option to
> "make X calls synchronous"? I assumed "in addition to".

I didn't know about --sync. The description sounds exactly like MOZ_X_SYNC does, but I don't know how/where it's implemented so I don't even know if it works. MOZ_X_SYNC at least works for sure.

> 
> The crash report is at bp-733dd75f-a0cb-487e-b02f-8fc622110608
> 
> I usually experience this crash with SeaMonkey; this time I used your
> try-build of Firefox in an ad-hoc profile, viewed some pages including
> about:, about:support and about:addons, and got the expected closedown crash.

Thanks for the log. It tells us that a WebGL context was created but failed to initialize (WebGL was unavailable), then the WebGL context was successfully destroyed, but then on shutdown you got a crash in glXMakeCurrent() called from GLContext::MarkDestroyed() called from GLContextProviderGLX::Shutdown().

(In reply to comment #32)
> I wonder whether the context might need to be cleared (or something) when
> the drawable is destroyed.  Do we do that?

I don't think we do: I don't even know how we would learn when the drawable is destroyed? But since ultimately we are the ones who are destroying the drawable, this should be a matter of doing things in the right order.
Notice how GLContextProviderGLX::Shutdown() is just doing:

  gGlobalContext = nsnull;

I.e. destroying the gGlobalContext. This happens during XPCOM shutdown, which seems to be too late, perhaps later than when the drawable is destroyed.

Whatever the benefits of having this "global context" are, I'm not sure they offset the cost...
http://www.opengl.org/sdk/docs/man/xhtml/glXMakeCurrent.xml
"To release the current context without assigning a new one, call glXMakeCurrent with drawable set to None and ctx set to NULL."

If the current context depends on a drawable, then I expect this needs to be done before the drawable is destroyed.
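
A minimal sketch of that ordering argument, using a toy stand-in for the server rather than real GLX calls. `ToyServer`, `kNone` and both shutdown helpers are hypothetical names, and the rule that the toy raises an error when the previously bound drawable has vanished is an assumption chosen to mirror the symptom in this bug, not a statement about how Mesa or the X server behave internally.

```cpp
#include <set>
#include <stdexcept>

// Toy stand-in for a GLX server. Names and error rules are illustrative
// assumptions; no real GLX call is made here.
constexpr int kNone = 0;  // plays the role of "no drawable"

class ToyServer {
public:
    int CreateDrawable() {
        int id = nextId_++;
        drawables_.insert(id);
        return id;
    }

    void DestroyDrawable(int id) { drawables_.erase(id); }

    // MakeCurrent(kNone) models releasing the current context, i.e. the
    // documented glXMakeCurrent(dpy, None, NULL) form.
    void MakeCurrent(int drawable) {
        // Assumption of the toy: switching away from a binding whose
        // drawable no longer exists is reported as an error.
        if (current_ != kNone && drawables_.count(current_) == 0) {
            throw std::runtime_error("bad context tag (toy model)");
        }
        if (drawable != kNone && drawables_.count(drawable) == 0) {
            throw std::runtime_error("bad drawable (toy model)");
        }
        current_ = drawable;
    }

private:
    std::set<int> drawables_;
    int current_ = kNone;
    int nextId_ = 1;
};

// Release first, then destroy the drawable. Returns true on a clean shutdown.
bool ShutdownReleasingFirst(ToyServer& server, int drawable) {
    try {
        server.MakeCurrent(kNone);
        server.DestroyDrawable(drawable);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}

// Destroy the drawable first, then try to release. Fails in this model.
bool ShutdownDestroyingFirst(ToyServer& server, int drawable) {
    try {
        server.DestroyDrawable(drawable);
        server.MakeCurrent(kNone);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

In this model only the release-then-destroy order shuts down cleanly. If the real stack behaves similarly, the release would have to happen before whatever destroys the drawable during shutdown.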
(In reply to comment #36)
[...]
> I didn't know about --sync. The description sounds exactly like MOZ_X_SYNC
> does, but I don't know how/where it's implemented so I don't even know if it
> works. MOZ_X_SYNC at least works for sure.
[...]
It's listed under "X11 options" by seamonkey -h or firefox -h, and it is not specific to Mozilla applications: e.g. gvim has it too (at least when built with GTK2 GUI). I think it is a GTK setting (the Vim help says "Look in the GTK documentation for how they are used", talking of that option and several others). It is also mentioned by some X error messages, which tell you to set that option and then run the app under gdb with a breakpoint at (some particular symbol) if you want to get a stack trace or examine some variables at the moment the error is triggered.
-- Well, I used both.
Comment on attachment 538161 [details] [diff] [review]
kill SafeToCreateCanvas3DContex, check webgl.disabled in SetDimensions

(In reply to comment #30)
> This patch removes it, instead the
> webgl.disabled pref is checked in SetDimensions which is where the other
> prefs are checked and where we potentially create GL contexts.

Looks good.

> The prefs-checking code is moved up a bit to occur before we do anything
> about GL ContextFormats.

If WebGLContext::SetDimensions() is going to fail, then wouldn't it make sense
to still DestroyResourcesAndContext() to clear old resources?
i.e. shouldn't the check be after DestroyResourcesAndContext()?

> Chrome canvases are no longer special-cases.

Makes sense to me.

> The about:support code no longer does anything special, it just creates a
> WebGL canvas and check if that succeeds.

I didn't see this change.  Was that meant to be included in this patch, or is
this a statement about what has already happened?
Or was the "special" thing about about:support that you refer to, the fact
that a WebGL context was created even when WebGL is disabled?
(In reply to comment #39)
> -- Well, I used both.

Yes, both is fine.  --sync is implemented in GDK; MOZ_X_SYNC is implemented in Gecko.  An advantage of MOZ_X_SYNC is that it works in child (plugin and, in the future, content) processes, though that is not relevant here.
Crash Signature: [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f]
(In reply to comment #40)
> Comment on attachment 538161 [details] [diff] [review] [review]
> > The prefs-checking code is moved up a bit to occur before we do anything
> > about GL ContextFormats.
> 
> If WebGLContext::SetDimensions() is going to fail, then wouldn't it make
> sense
> to still DestroyResourcesAndContext() to clear old resources?
> i.e. shouldn't the check be after DestroyResourcesAndContext()?

You're entirely right, thanks for spotting this! New patch calls DestroyResourcesAndContext() as soon as we're done with the early success cases.


> > The about:support code no longer does anything special, it just creates a
> > WebGL canvas and check if that succeeds.
> 
> I didn't see this change.  Was that meant to be included in this patch, or is
> this a statement about what has already happened?
> Or was the "special" thing about about:support that you refer to, the fact
> that a WebGL context was created even when WebGL is disabled?

Sorry, what I wrote here was totally confused. Just ignore it :-)
Attachment #538161 - Attachment is obsolete: true
Attachment #538161 - Flags: review?(karlt)
Attachment #538421 - Flags: review?(karlt)
Attachment #538421 - Flags: review?(karlt) → review+
This 1-line patch should silence the X error. The question is, do we want to do that?

My understanding is that we're calling GLContext::MarkDestroyed() during XPCOM shutdown, which is very late to do such a thing; the GL context has gone bad by the time we reach this point, probably because the underlying surface is already destroyed.

So if we don't do this, then what are our alternatives?
 - probably the cleanest fix would be to find where exactly we're destroying our X resources and to destroy this global GL context right before that.
 - alternatively we could decide that the idea of a "global GL context" was a bad idea?
Attachment #538423 - Flags: review?(karlt)
(In reply to comment #43)
> So if we don't do this, then what are our alternatives?
>  - probably the cleanest fix would be to find where exactly we're destroying

Perhaps a real fix on the horizon:

* X display closedown is done by MOZ_gdk_display_close, called at 2 places in nsAppRunner.cpp:
  http://mxr.mozilla.org/mozilla-central/ident?i=MOZ_gdk_display_close

* XPCOM shutdown is done by NS_ShutdownXPCOM, also called in nsAppRunner.cpp:
  http://mxr.mozilla.org/mozilla-central/ident?i=NS_ShutdownXPCOM&tree=mozilla-central&filter=

Patch coming...
Crash Signature: [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f] → [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f]
...though one could also decide that it's not worth complexifying nsAppRunner.cpp just to avoid a stupid X error! So please consider the 1-line patch.
The X display is closed after the ScopedXPCOMStartup is destroyed in XRE_main, so that is not causing the drawable to get destroyed.  Even if it were, the lack of the display would cause different problems; the GLXBadContextTag error would not even be received.  Also remember this happens on reload of about:support (before shutdown).
Most crashes in Mesa's glXMakeCurrent seem to involve unbinding the old current context rather than making new context current.

I'm suspicious that the drawable of the old current context may have been destroyed.

What ensures that a context is not current when its drawable gets destroyed?
(In reply to comment #47)
> What ensures that a context is not current when its drawable gets destroyed?

The GLX docs state:

If /draw/ is destroyed after glXMakeContextCurrent is called, then subsequent rendering commands will be processed and the context state will be updated, but the frame buffer state becomes undefined. If /read/ is destroyed after glXMakeContextCurrent then pixel values read from the framebuffer (e.g., as result of calling glReadPixels, glCopyPixels or glCopyColorTable) are undefined. If the X Window underlying the GLXWindow /draw/ or /read/ drawable is destroyed, rendering and readback are handled as above.

So it should be valid to have a current GL context whose GLX or X drawables have been destroyed.  This could well be a mesa bug.
Mozilla/5.0 (X11; Linux x86_64; rv:7.0a1) Gecko/20110610 Firefox/7.0a1 SeaMonkey/2.4a1 ID:20110610003052

The latest nightly where I crashed at every closedown @ libc-2.11.3.so@0x32ab5 was dated 2011-06-08; but see bp-4158b2a9-c75e-40c5-8ec1-eab7f2110610 which happened also at closedown (might be unrelated though). I had a few non-crashing closedowns with yesterday's nightly (June 9). I don't remember any specific X package updates but I wouldn't swear that there hasn't been any: I'm applying all patches from the openSUSE 11.4 "Update-Test" repository as soon as they are published.
New try build with a LOT more debugging output based on above discussion:
http://tbpl.mozilla.org/?tree=Try&rev=8b7d6ea3affb

Once the builds are done, they will be available at:
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-8b7d6ea3affb

Run with:

MOZ_X_SYNC=1 ./firefox -P -no-remote 2>&1 | tee logfile.txt

I'm interested in both a shutdown crash, and a non-shutdown crash.

Note that this time the logfile.txt might get really big, so perhaps compress it before attaching (or just pipe through xz in the command line).
Sorry, previous build failed, this one should work:
http://tbpl.mozilla.org/?tree=Try&rev=bbb566273556

Once the builds are done, they will be available at:
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-bbb566273556
Landed the patch making about:support honor webgl.disabled:
http://hg.mozilla.org/mozilla-central/rev/cacfd85ffb49
Assignee: nobody → bjacob
Target Milestone: --- → mozilla7
This file (minefield.log.bz2) is a compressed log from a short run of the try-build from comment #51 in an ad-hoc "almost fresh" profile used only for tests of new Firefox builds (and not very often at that).

The closedown crash dump is available at bp-16a69069-72ca-493a-90d9-72b342110611
Attachment #538148 - Attachment is obsolete: true
Thanks a lot. Still processing the log; note that I made a mistake here,

gfxXlibSurface::~gfxXlibSurface()
{
    FUNCTION_DEBUG_HELPER
    printf_stderr("XXXXX this = %p, drawable = %d\n", this, mDrawable);
    if (mPixmapTaken) {
        XFreePixmap (mDisplay, mDrawable);
	 printf_stderr("XXXXX freed drawable = %d\n", this, mDrawable);
    }
}

So the "freed drawable" lines have wrong information,  but it doesn't matter as the above "XXXXX this = %p, drawable = %d\n" line gives the same information anyway.
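
For reference, the mistake is that the second format string has a single %d conversion but is passed two arguments, so the conversion consumes the `this` pointer instead of the drawable. A minimal sketch of a corrected line follows. It uses std::snprintf as a stand-in for printf_stderr, and the DrawableId alias and FormatFreedDrawable helper are hypothetical names introduced only for illustration:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for an X11 drawable ID (Xlib's XID is an unsigned long).
using DrawableId = unsigned long;

// Builds the "freed drawable" log line with exactly one conversion per
// argument, so the value printed really is the drawable.
std::string FormatFreedDrawable(DrawableId drawable) {
    char buf[64];
    std::snprintf(buf, sizeof(buf), "XXXXX freed drawable = %lu\n", drawable);
    return std::string(buf);
}
```

Calling FormatFreedDrawable(mDrawable) and writing the result to stderr would give the line that the debugging build was meant to print.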
The place where it starts going wrong is "Error resizing offscreen framebuffer -- framebuffer not complete". Still working on the log...
(In reply to comment #48)
> If theX Window underlying the GLXWindow /draw/ or /read/
> drawable is destroyed, ren-dering and readback are handled as above.
> 
> So it should be valid to have a current GL context whose GLX or X drawables
> have been destroyed.  This could well be a mesa bug.

Thanks, Vlad.  I have a little trouble reconciling that with "If the previous
context of the calling thread has unflushed commands, and the previous
drawable is no longer valid, GLXBadCurrentDrawable is generated" but at least
BadContextTag is not appropriate for such a situation.

I tried modifying glxtest (as attached) to check that the GLX implementation
handled this situation, but it did not detect any problem. 

Here I get a SIGSEGV, which is a bit different to the error reported in this
bug, but happens in the same situations:

OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: Mesa DRI R600 (JUNIPER 68A0) 20090101  TCL DRI2
OpenGL version string: 1.4 (2.1 Mesa 7.10.2)

#4  <signal handler called>
#5  0x00007f428dd54ed0 in xcb_glx_get_string_string_length () from /usr/lib64/libxcb-glx.so.0
#6  0x00007f429003b987 in __glXGetString (dpy=<value optimized out>, opcode=<value optimized out>, contextTag=9961477, name=7939) at glx_query.c:82
#7  0x00007f4290038436 in __indirect_glGetString (name=7939) at single2.c:685
#8  0x00007f429001cb0a in indirect_bind_context (gc=0x7f422ee6eda0, old=0x7f422ee6d180, draw=62915964, read=62915964) at indirect_glx.c:156
#9  0x00007f4290019f01 in MakeContextCurrent (dpy=0x7f428da40000, draw=62915964, read=62915964, gc_user=<value optimized out>) at glxcurrent.c:263
#10 0x00007f4299c85750 in mozilla::gl::GLContextGLX::MakeCurrentImpl (this=0x7f426df53800, aForce=0) at /home/karl/moz/dev/gfx/thebes/GLContextProviderGLX.cpp:439
I think we have two separate problems here:
 1. That we get these incomplete framebuffer errors
 2. That we crash after this error

I made a new tryserver build to understand these problems.
http://tbpl.mozilla.org/?tree=Try&rev=260494d04975

When the builds are ready they will be at
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-260494d04975

This time, please run with this additional environment variable:
MOZ_GL_DEBUG_VERBOSE=1 MOZ_X_SYNC=1 ./firefox -P -no-remote 2>&1 | tee logfile.txt

This will record all GL calls and GL errors, which is likely to help understand the incomplete framebuffer errors. Also this new build has the patch from bug 654424 applied, which gives a lot more information about that.
Also, notice that about:support should now honor webgl.disabled. If you want to reproduce the crash, either go to some other WebGL page, or don't disable it.
Attached file bzipped log, see comment #58 to #60 (obsolete) —
(In reply to comment #59)
> Also, notice that about:support should now honor webgl.disabled. if you want
> to reproduce the crash, either go to some other webgl page, or don't disable.

Mozilla/5.0 (X11; Linux x86_64; rv:7.0a1) Gecko/20110613 Firefox/7.0a1

This is a test: I'm running this build of Nightly (big N), which is not a nightly (small n) but a try-build, in a test profile which differs very little from the default; in particular, webgl.disabled is false. (I see a pref named webgl.verbose which is also false by default, if you want it true next time, tell me).

Crash dump bp-31e79929-8c55-45dc-a987-e0bd72110613
Attachment #538690 - Attachment is obsolete: true
This unfortunately doesn't have MOZ_GL_DEBUG_VERBOSE=1, see comment 58, can you please retry with it? It's really useful here.

No need for webgl.verbose, that is only to help JS developers fix their code.
Attached file bzipped log see comment #61 and #62 (obsolete) —
(In reply to comment #61)
> This unfortunately doesn't have MOZ_GL_DEBUG_VERBOSE=1, see comment 58, can
> you please retry with it? It's really useful here.

Oops sorry; here it is.
Attachment #538977 - Attachment is obsolete: true
Thanks a lot. This part of the log already shows a clear Mesa bug here:

[gl:0x2a41c00] > void mozilla::gl::GLContext::fFramebufferRenderbuffer(GLenum, GLenum, GLenum, GLuint)
parameters:
  attachmentPoint = 0x8d00
  renderbuffer = 16
[gl:0x2a41c00] < void mozilla::gl::GLContext::fFramebufferRenderbuffer(GLenum, GLenum, GLenum, GLuint) [0x0000]
[gl:0x2a41c00] > GLenum mozilla::gl::GLContext::fCheckFramebufferStatus(GLenum)
[gl:0x2a41c00] < GLenum mozilla::gl::GLContext::fCheckFramebufferStatus(GLenum) [0x0000]
framebuffer info:
  default framebuffer. No FBO is currently bound.

Here, FramebufferRenderbuffer succeeds (see the [0x0000], that means no GL error) which implies that a FBO is currently bound (and indeed just above in the log we called BindFramebuffer)... but actually no FBO is bound (the code behind that is calling GetIntegerv(FRAMEBUFFER_BINDING) and gets the value 0 meaning no FBO).
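
The contradiction described above can be written down as a small check. This is only a sketch: the GLFuncs table and AttachAndCheckBinding are hypothetical names standing in for the real GL entry points, and the enum values are the standard OpenGL ones repeated so that it compiles without GL headers. It relies on the GL rule that attaching a renderbuffer while framebuffer 0 is bound raises GL_INVALID_OPERATION.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Standard OpenGL enum values, repeated so the sketch needs no GL headers.
constexpr uint32_t kGL_NO_ERROR            = 0x0000;
constexpr uint32_t kGL_FRAMEBUFFER         = 0x8D40;
constexpr uint32_t kGL_RENDERBUFFER        = 0x8D41;
constexpr uint32_t kGL_DEPTH_ATTACHMENT    = 0x8D00;  // the 0x8d00 seen in the log
constexpr uint32_t kGL_FRAMEBUFFER_BINDING = 0x8CA6;

// Hypothetical table of GL entry points; a real caller would forward these
// to glFramebufferRenderbuffer, glGetError and glGetIntegerv.
struct GLFuncs {
    std::function<void(uint32_t, uint32_t, uint32_t, uint32_t)> framebufferRenderbuffer;
    std::function<uint32_t()> getError;
    std::function<int32_t(uint32_t)> getInteger;
};

// Returns true when the driver's answers are self-consistent, false when it
// reports a successful attach while claiming that no FBO is bound.
bool AttachAndCheckBinding(const GLFuncs& gl, uint32_t renderbuffer) {
    gl.framebufferRenderbuffer(kGL_FRAMEBUFFER, kGL_DEPTH_ATTACHMENT,
                               kGL_RENDERBUFFER, renderbuffer);
    const bool attachSucceeded = (gl.getError() == kGL_NO_ERROR);
    const int32_t boundFbo = gl.getInteger(kGL_FRAMEBUFFER_BINDING);
    // Success plus binding 0 is the contradictory state seen in the log.
    return !(attachSucceeded && boundFbo == 0);
}
```

With a table whose getError returns 0 and whose getInteger returns 0, as in the log excerpt, the function returns false.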

--> for sure we have to blacklist the (current versions of the) swrast driver.

That leaves open 2 questions:
 1) didn't we have similar bugs on other drivers too? Karl?
 2) we should recover more gracefully from that i.e. the X errors we're currently getting suggest that perhaps we don't clean up appropriately from that error point.
(In reply to comment #64)
> Thank a lot. This part of the log already shows a clear Mesa bug here:
[...]

If you want to report the bug upstream (e.g. at bugzilla.novell.com), I have the following installed:

OS: openSUSE Linux 11.4 (version: Final, architecture: x86_64)

Software packages (among others, of course):
xorg-x11-driver-video 7.6-53.58.1 ("intel" driver in service)
Mesa 7.10.2-7.3.1
DirectFB-Mesa 1.4.5-14.2

Hardware devices (among others, of course):
Motherboard: Intel/Fujitsu Scenic W620 (handling display, network, PCI)
Framebuffer Device: Intel(r)915G/915GV/910GL Graphics Controller
I expect there may well be similar bugs with all drivers and indirect mesa because the client-side code is the same.  But (at least part of) the server-side code is different, so I'm not sure.

Bug 664066 is making it hard for me to test the Try builds, but I seem to have GL_FRAMEBUFFER_COMPLETE after CheckFramebufferStatus.

Knowing how to recover is tricky if the library doesn't follow defined behaviour, so I'm not sure we should try too hard.

However, I did see X_GLXMakeCurrent: GLXBadCurrentWindow suggesting that we didn't flush the previous context before its underlying drawable was destroyed.

I can try Mesa 7.10.3 with r600 when bug 664066 is fixed.
Attached patch block swrast (obsolete) — Splinter Review
Attachment #539276 - Flags: review?(karlt)
Note, I've been told by a developer that swrast is deprecated anyways and the new supported Mesa software renderers are llvmpipe and softpipe. They will be unblocked as soon as we unblock Gallium.
I edited the code to simulate that bug here, in the hope to reproduce the X error, but I didn't manage to reproduce it or any crash (NVIDIA driver here).
A Mesa developer has replied,
https://bugs.freedesktop.org/show_bug.cgi?id=38312#c3

and requires more information that can be obtained by running this build
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-49c288664dcd

with
MOZ_GL_DEBUG_VERBOSE=1 MOZ_X_SYNC=1 ./firefox -P -no-remote 2>&1 | tee logfile.txt
(In reply to comment #71)
> A Mesa developer has replied,
> https://bugs.freedesktop.org/show_bug.cgi?id=38312#c3

Good, I've CCed myself to that bug.

> 
> and requires more information that can be obtained by running this build
> https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.
> com-49c288664dcd
> 
> with
> MOZ_GL_DEBUG_VERBOSE=1 MOZ_X_SYNC=1 ./firefox -P -no-remote 2>&1 | tee
> logfile.txt

You'll have it tomorrow (in 12 hours or so), I have to sleep and I have an appointment with the doctor in the morning.
Comment on attachment 539276 [details] [diff] [review]
block swrast

It's hard to tell whether this issue is specific to the Software Rasterizer when indirect or whether it is a bug with the Software Rasterizer in general because we don't even get this far when direct.  But that means there's no harm in blocking it in general.
Attachment #539276 - Flags: review?(karlt) → review+
Here it is, the minefield.log.bz2 from
  (download the linux64 try build linked in comment #71)
  rm -Rvf firefox
  tar -jxvf firefox-7.0a1.en-US.linux-x86_64.tar.bz2
  MOZ_GL_DEBUG_VERBOSE=1 MOZ_X_SYNC=1 ./firefox/firefox --sync -P virgin -no-remote 2>&1|tee minefield.log
  bzip2 -kvf minefield.log

The corresponding closedown crash dump is bp-1b0d2c69-bb93-4f9c-a6f2-9d7992110615
Attachment #538986 - Attachment is obsolete: true
Thanks. That shows that GetIntegerv does not generate an error, and explicitly sets the result to 0 (does not just leave it uninitialized). Replying to the Mesa bug.
Jose's comments 21 and 24 on https://bugs.freedesktop.org/show_bug.cgi?id=38312 suggest that if we don't release the GL context before destroying it, then its destruction is postponed to a later point in time. But we are immediately destroying its drawable, and indeed we must do that, otherwise we'd potentially be leaking drawables. So, in order to have well-defined behavior, this patch releases the GL context and checks with an NS_ABORT_IF_FALSE that that indeed happened. (See the glXMakeCurrent man page.) This hopefully fixes the X errors we're getting here.
Attachment #539555 - Flags: review?(karlt)
https://bugs.freedesktop.org/show_bug.cgi?id=38312#c27 confirms that the 'release GL context' patch fixes the crash. Thanks Tony!
Comment on attachment 539555 [details] [diff] [review]
release GL context before destroying it

I don't understand well enough to know whether this is resolving a bug on our side or working around a bug in Mesa, but if it fixes the problem, great.

I notice that MarkDestroyed already makes the context current (to free resources, I assume), so there is no reason to skip this if another context were current.
Attachment #539555 - Flags: review?(karlt) → review+
(In reply to comment #78)
> Comment on attachment 539555 [details] [diff] [review] [review]
> release GL context before destroying it
> 
> I don't understand well enough to know whether this is resolving a bug on
> our side or working around a bug in Mesa, but if it fixes the problem, great.

Bug on our side, we were relying on undefined behavior, were lucky with the NVIDIA driver and sometimes unlucky with Mesa. See comment 76. See the glXDestroyContext man page, it says:

  If GLX rendering context ctx is not current to any thread,
  glXDestroyContext  destroys it immediately.  Otherwise, ctx is destroyed
  when it becomes not current to any thread.  In either	case, the resource ID
  referenced by	ctx is freed immediately.

In other words, if we want glXDestroyContext to have the well-defined behavior of destroying the context before future X commands take effect, we must first release the GL context before calling it. We were failing to do that, but we were destroying the drawable immediately after that call, and as a result, the context was outliving its underlying drawable.
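
To make the ordering concrete, here is a toy model of the rule quoted from the man page. It is not GLX code: ToyGlx and its members are hypothetical names, and the three methods only mimic what glXMakeCurrent(dpy, None, NULL), glXDestroyContext and the drawable teardown do to the lifetime of the objects.

```cpp
#include <cassert>

// Toy model of the glXDestroyContext rule: destruction is immediate only if
// the context is not current, otherwise it is deferred until release.
struct ToyGlx {
    bool contextAlive = true;     // server-side context still exists
    bool contextCurrent = true;   // context is current to this thread
    bool destroyPending = false;  // destroy was requested while current
    bool drawableAlive = true;    // pixmap or window still exists

    // Models glXMakeCurrent(dpy, None, NULL).
    void ReleaseCurrent() {
        contextCurrent = false;
        if (destroyPending) {
            contextAlive = false;
            destroyPending = false;
        }
    }

    // Models glXDestroyContext(dpy, ctx).
    void DestroyContext() {
        if (contextCurrent) {
            destroyPending = true;   // deferred: the context lives on
        } else {
            contextAlive = false;    // immediate
        }
    }

    // Models freeing the underlying pixmap or window.
    void DestroyDrawable() { drawableAlive = false; }

    bool ContextOutlivesDrawable() const { return contextAlive && !drawableAlive; }
};

// Old order: destroy the context while it is current, then free the drawable.
bool OldTeardownLeavesDanglingContext() {
    ToyGlx glx;
    glx.DestroyContext();
    glx.DestroyDrawable();
    return glx.ContextOutlivesDrawable();
}

// Patched order: release first, then destroy, then free the drawable.
bool FixedTeardownLeavesDanglingContext() {
    ToyGlx glx;
    glx.ReleaseCurrent();
    glx.DestroyContext();
    glx.DestroyDrawable();
    return glx.ContextOutlivesDrawable();
}
```

In this model the old order leaves a live context whose drawable is gone, while the patched order does not, which is the situation described in the paragraph above.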

> I notice that MarkDestroyed already makes the context current (to free
> resources i assume), so there is no reason to skip this if another context
> were current.

But precisely, we want the GL context to NOT be current by the time glXDestroyContext is called on it.
Comment on attachment 539555 [details] [diff] [review]
release GL context before destroying it

Please approve this bug for Firefox 6. While we've had this crash since Firefox 4 (see original report) it's getting a lot worse in Firefox 6 as many Mesa setups got whitelisted. It's risk-free and only 3 lines.
Attachment #539555 - Flags: approval-mozilla-aurora?
Attachment #538423 - Flags: review?(karlt) → review-
(In reply to comment #77)
> https://bugs.freedesktop.org/show_bug.cgi?id=38312#c27 confirms that the
> 'release GL context' patch fixes the crash. Thanks Tony!

I just reported a crash I saw, and then ran builds of which you sent me the links. Nothing hard in that. You were the one who tracked down the error, discussed it with the Mesa people, wrote a patch, and (IIUC) are now pushing at the wheel to get the fix into both trunk and aurora repositories. In a few days (hopefully), the bug will be FIXED and then I'll be entitled to say: Merci Benoît!
Comment on attachment 539555 [details] [diff] [review]
release GL context before destroying it

Clearing approval request since this doesn't seem to be landed on mozilla-central yet. Once it's landed, and has baked for a bit, please re-request approval.
Attachment #539555 - Flags: approval-mozilla-aurora?
Landed:
http://hg.mozilla.org/mozilla-central/rev/60182c83c925

Tony, your work capturing logs and running apitrace was crucial, so really thanks.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Please fix the build warnings you caused in opt builds.
What warnings?
There are warnings in webgl code,
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1308326872.1308330305.12966.gz&fulltext=1
but I don't see new ones.
>-  glXDestroyContext(dpy, context);

>+  glXMakeCurrent(dpy, None, NULL); // must release the GL context before destroying it

I think you still want to destroy the context in glxtest.

(In reply to comment #84)
> Please fix the build warnings you caused in opt builds.

Ms2ger: a more specific comment would be more helpful.

I suspect this is "success" set but not used.
It could be silenced by putting its declaration in "ifdef DEBUG".
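
A minimal sketch of that suggestion, assuming the variable is only read by a debug-only assertion. ReleaseCurrentContext, SKETCH_ABORT_IF_FALSE and MarkDestroyedSketch are hypothetical stand-ins for the real call, for NS_ABORT_IF_FALSE and for the patched function:

```cpp
#include <cassert>

// Hypothetical stand-in for glXMakeCurrent(dpy, None, NULL); counts calls so
// the effect is observable without a GL context.
static int gReleaseCalls = 0;
bool ReleaseCurrentContext() {
    ++gReleaseCalls;
    return true;
}

// Hypothetical debug-only assertion, in the spirit of NS_ABORT_IF_FALSE:
// it checks the condition in DEBUG builds and compiles to nothing otherwise.
#ifdef DEBUG
#define SKETCH_ABORT_IF_FALSE(cond, msg) assert((cond) && msg)
#else
#define SKETCH_ABORT_IF_FALSE(cond, msg) ((void)0)
#endif

void MarkDestroyedSketch() {
    // The call always happens; only the variable that captures its result is
    // debug-only, so opt builds have no "set but not used" variable.
#ifdef DEBUG
    bool success =
#endif
    ReleaseCurrentContext();
    SKETCH_ABORT_IF_FALSE(success, "failed to release GL context");
}
```

The point of the pattern is that the release call itself is outside the #ifdef, so behavior is the same in debug and opt builds.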
I stopped having this crash in SeaMonkey some time ago, see comment #49.

The current Firefox nightly, built before the fix was landed, still has the bug:
Mozilla/5.0 (X11; Linux x86_64; rv:7.0a1) Gecko/20110617 Firefox/7.0a1 ID:20110617030741 crash: bp-a66426e7-b788-49b8-ab78-d8a252110617 and bp-5ad044e9-d0e8-4a20-b315-26f022110617

Now let's find a Firefox "Nightly" tinderbox-build whose source was pulled later than comment #87... there is none yet... wait until there is one available...
Mozilla/5.0 (X11; Linux x86_64; rv:7.0a1) Gecko/20110617 Firefox/7.0a1 ID:20110617184644 (Built from http://hg.mozilla.org/mozilla-central/rev/5b56da7babb9)... no crash.

I'm setting this bug VERIFIED on the assumption that the fix applies across all applications and platforms. People, if you want to REOPEN, first make sure that you observe the crash in a build whose 14-digit timestamp ("Build ID" as shown in the crash report) is later than comment #87, and then paste your bp-something crash ID in your reopening comment.
Status: RESOLVED → VERIFIED
Thanks Benoît! But your job isn't finished yet. I'm setting Fx6-affected on the basis of comment #80, and "when the fix will have baked a little on trunk" comment #82 won't apply anymore.
Comment on attachment 539555 [details] [diff] [review]
release GL context before destroying it

Requesting aurora approval for this patch + the followup fix http://hg.mozilla.org/mozilla-central/rev/5b56da7babb9
Attachment #539555 - Flags: approval-mozilla-aurora?
Attachment #539276 - Attachment is obsolete: true
Blocks: 624935
Attachment #539555 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Fx6-fixed according to comment #92

According to Socorro this signature is seen:
- on Fx 6.0a2 dated 2011-06-23 to 2011-06-25 with comment "html5test"
- on Fx 5.0 dated 2011-06-15 (the latest beta?)
- on Fx4 before that.

Only on Linux64 but I suppose on i686 the offset in libxul would be different.

I suppose Fx4 is at EOL. Is this crash fix worth porting to Fx5, or will the release-branch fix have to wait for Fx6 release in 6 weeks or so?
On Firefox 5, this should only be affecting users with indirect Mesa libGL and either connecting to a display with NVIDIA GLX or forcing webgl on for their OpenGL.  That's probably a small enough proportion of situations that it is not worth the risk of making a change to Firefox 5.
(In reply to comment #94)
> On Firefox 5, this should only be affecting users with indirect Mesa libGL
> and either connecting to a display with NVIDIA GLX or forcing webgl on for
> their OpenGL.  That's probably a small enough proportion of situations that
> it is not worth the risk of making a change to Firefox 5.

https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A5.0&platform=linux&query_search=signature&query_type=contains&reason_type=contains&date=06%2F29%2F2011%2012%3A53%3A15&range_value=4&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=libc-2.11.3.so%400x32ab5

Twelve crashes in four weeks (of which six less than a week old), with five different Build IDs, and each a different install time. Crash #60 to #79 (ex-aequo) on page 1 of 13 for Fx5 on Linux this week. This said... Linux crashes (and especially linux-x86_64 crashes) are of course nowhere near the number of Windows crashes; maybe too small a sample to draw statistically valid conclusions.
The same signature in this case doesn't mean anything more than abort was called with the same libc.  There are a few different changes there but none mention X_GLXMakeCurrent: GLXBadContextTag.  The most common looks like Bug 640908.
Summary: X_GLXMakeCurrent: GLXBadContextTag SIGABRT with indirect Mesa at closedown and on about:support [@ libc-2.11.3.so@0x32ab5] [@ libxul.so@0x1e7e78f] → X_GLXMakeCurrent: GLXBadContextTag SIGABRT with indirect Mesa at closedown and on about:support