Closed Bug 658826 Opened 13 years ago Closed 8 years ago

Stalling in D3DCompile(), called from ANGLE's libGLESv2, called from WebGL (irradiance demo)

Component: Core :: Graphics: CanvasWebGL
Type: defect
Version: 2.0 Branch
Platform: x86, Windows 7
Priority: Not set
Severity: critical
Status: RESOLVED WORKSFORME
Target Milestone: mozilla10
Reporter: bugs
Assignee: Unassigned
Keywords: hang, regression
Whiteboard: webgl-angle [gfx-noted]
Attachments: 4 files, 1 obsolete file

User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0a1) Gecko/20110521 Firefox/6.0a1
Build Identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0a1) Gecko/20110521 Firefox/6.0a1

Nightly not responding (clean profile)

http://codeflow.org/webgl/irradiance/

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0a1) Gecko/20110521 Firefox/6.0a1

Graphics
Adapter Description: NVIDIA GeForce GT 240
Vendor ID: 10de (=Gainward)
Device ID: 0ca3 (=GeForce GT240 1024MB GDDR5 GS)
Adapter RAM: 1024
Adapter Drivers: nvd3dumx,nvwgf2umx,nvwgf2umx nvd3dum,nvwgf2um,nvwgf2um
Driver Version: 8.17.12.7527
Driver Date: 5-13-2011
Direct2D Enabled: true
DirectWrite Enabled: true (6.1.7601.17563)
ClearType Parameters: ClearType parameters not found
WebGL Renderer: Google Inc. -- ANGLE -- OpenGL ES 2.0 (ANGLE 0.0.0.611)
GPU Accelerated Windows: 1/1 Direct3D 10

Reproducible: Always

Steps to Reproduce:
1. open the url
2. I get "not responding"


Actual Results:  
"not responding"
I don't get the Mozilla crash reporter

Expected Results:  
execute the WebGL code (or the Mozilla crash reporter should be visible after about 60 seconds)

http://forums.mozillazine.org/viewtopic.php?p=10828709#p10828709
Regression window:
Works:
http://hg.mozilla.org/mozilla-central/rev/b758d7b3e139
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b13pre) Gecko/20110301 Firefox/4.0b13pre ID:20110301140506
Hang:
http://hg.mozilla.org/mozilla-central/rev/c0b114d35e7b
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b13pre) Gecko/20110301 Firefox/4.0b13pre ID:20110301142845
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b758d7b3e139&tochange=c0b114d35e7b

Graphics
  Adapter Description: ATI Radeon HD 4300/4500 Series
  Vendor ID: 1002
  Device ID: 954f
  Adapter RAM: 512
  Adapter Drivers: aticfx64 aticfx64 aticfx32 aticfx32 atiumd64 atidxx64 atiumdag atidxx32 atiumdva atiumd6a atitmm64
  Driver Version: 8.850.0.0
  Driver Date: 4-19-2011
  Direct2D Enabled: true
  DirectWrite Enabled: true (6.1.7601.17563)
  ClearType Parameters: Gamma: 2200 Pixel Structure: RGB ClearType Level: 50 Enhanced Contrast: 50 
  WebGL Renderer: Google Inc. -- ANGLE -- OpenGL ES 2.0 (ANGLE 0.0.0.611)
  GPU Accelerated Windows: 1/1 Direct3D 10
Severity: normal → critical
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: hang, regression
Version: unspecified → Trunk
Google Chrome hangs as well.
I have informed the author (Florian Bösch) about it.
No abnormal behavior here, renders as intended in both FF and Chrome, does not hang.

FF: Official release, 4.01
Chromium: Official release, 11.0.696.68 (84545)
OS: Ubuntu 10.10
GFX: GTX-460
Driver: nvidia-current via VDPAU PPA version: 270.41.06

$ glxinfo | grep -i version
server glx version string: 1.4
client glx version string: 1.4
GLX version: 1.4
OpenGL version string: 4.1.0 NVIDIA 270.41.06
OpenGL shading language version string: 4.10 NVIDIA via Cg compiler
Attached file stack to stalling location (obsolete) —
Confirmed here, today's Nightly (ANGLE r653), Windows 7.

Attached is a call stack showing where we are stalled. It's in D3DCompiler_43.dll, called from ANGLE's gl::Program::linkVaryings()
Actually, the previous stack was inaccurate. Stepping through the code shows that it's stalling inside of D3DCompile(), called from Program::compileToBinary(), as shown in attached stack.
Attachment #535126 - Attachment is obsolete: true
This attachment gives the values of the parameters we're passing to D3DCompile() that make it stall. In particular, the HLSL shader source is:


float2 vec2(float x0, float x1)
{
    return float2(x0, x1);
}
float4 vec4(float2 x0, float x1, float x2)
{
    return float4(x0, x1, x2);
}
float4 vec4(float3 x0, float x1)
{
    return float4(x0, x1);
}
// Varyings

static float4 gl_Color[1] = {float4(0, 0, 0, 0)};
Daniel, this looks like a bug in the DirectX SDK, present at least in the June 2010 version. Do you think there might be a work-around? If not, I'm afraid this is a WONTFIX.
Summary: WebGL - irradiance - Nightly not responding → Stalling in D3DCompile(), called from ANGLE's libGLESv2, called from WebGL (irradiance demo)
(In reply to comment #1)
> Regression window:
> Works:
> http://hg.mozilla.org/mozilla-central/rev/b758d7b3e139
> Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b13pre) Gecko/20110301
> Firefox/4.0b13pre ID:20110301140506
> Hang:
> http://hg.mozilla.org/mozilla-central/rev/c0b114d35e7b
> Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b13pre) Gecko/20110301
> Firefox/4.0b13pre ID:20110301142845
> Pushlog:
> http://hg.mozilla.org/mozilla-central/
> pushloghtml?fromchange=b758d7b3e139&tochange=c0b114d35e7b

This is really surprising, as the changes in this regression window don't seem related at all to where it's stalling. Could it just be that before these changes, the demo was behaving differently e.g. not trying to compile this shader?
Florian, the regression window is where webgl.getSupportedExtensions() got implemented. Could it be that your demo only tries to compile the shader in question depending on certain conditions?
This sounds similar to http://code.google.com/p/angleproject/issues/detail?id=146 and some threads on the angleproject mailing list where certain shaders take an inordinately long time to compile with the d3d compiler.  Chrome uses a build option to reduce the d3d compiler optimization level to help with this, but that can have other undesirable side effects.  We don't have a good way around this currently.
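To tell a slow compile from a true stall, it can help to time the call. Below is a minimal C++ sketch of such a measurement. The helper name timeMilliseconds is hypothetical and is not part of ANGLE; the callable stands in for whatever wraps the D3DCompile() call.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not ANGLE code): runs `compile` once and returns the
// elapsed wall-clock time in milliseconds, measured with a monotonic clock.
template <typename Fn>
double timeMilliseconds(Fn &&compile)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(compile)();  // e.g. a lambda wrapping the shader compile
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Note that this only helps with slow-but-finite compiles: if the compiler never returns, neither does the helper.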
(In reply to comment #11)
> Chrome uses a build option to
> reduce the d3d compiler optimization level to help with this, but that can
> have other undesirable side effects.

Interesting. Can you give me a link to that code and perhaps also to any bug/discussion around it? I'd consider doing the same in Firefox.
Nope, it'll preload and compile all shaders independently of the extensions. The one thing it does alternate is the texture format, depending on whether texture_float is supported or not.

The loading sequence calls getExtension('OES_texture_float') and then proceeds to load the shaders.

This is the entry point http://hg.codeflow.org/webgl_part2_sky/file/b79966b55f5b/main.js#l23

Which triggers the resource loader for the shaders that eventually get to the shader loader: http://hg.codeflow.org/webgl_part2_sky/file/b79966b55f5b/glee/shader.js
(In reply to comment #12)
> (In reply to comment #11)
> > Chrome uses a build option to
> > reduce the d3d compiler optimization level to help with this, but that can
> > have other undesirable side effects.
> 
> Interesting. Can you give me a link to that code and perhaps also to any
> bug/discussion around it? I'd consider doing the same in Firefox.

This was part of r591. See http://code.google.com/p/angleproject/source/detail?r=591# and note that the GYP build system defines ANGLE_COMPILE_OPTIMIZATION_LEVEL as D3DCOMPILE_OPTIMIZATION_LEVEL0. The review URL, http://codereview.appspot.com/4275063, has a little bit of discussion.
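For readers who want to see how a build-time define like this is typically consumed, here is a minimal C++ sketch. It is an illustration under assumptions, not ANGLE's actual source: the helper name shaderCompileFlags is hypothetical, the fallback level is assumed, and the two flag values are mirrored from d3dcompiler.h only so that the snippet builds without the DirectX SDK.

```cpp
#include <cstdint>

// Values mirrored from d3dcompiler.h so this sketch compiles anywhere.
// A real Windows build would include <d3dcompiler.h> instead.
#ifndef D3DCOMPILE_OPTIMIZATION_LEVEL0
#define D3DCOMPILE_OPTIMIZATION_LEVEL0 (1 << 14)
#define D3DCOMPILE_OPTIMIZATION_LEVEL3 (1 << 15)
#endif

// The build system (a gyp 'defines' entry, or DEFINES in a Makefile) may
// override this macro. The fallback chosen here is an assumption.
#ifndef ANGLE_COMPILE_OPTIMIZATION_LEVEL
#define ANGLE_COMPILE_OPTIMIZATION_LEVEL D3DCOMPILE_OPTIMIZATION_LEVEL3
#endif

// Hypothetical helper: merges caller-supplied flags with the configured
// optimization level. The result would go into D3DCompile()'s Flags1 argument.
constexpr std::uint32_t shaderCompileFlags(std::uint32_t extraFlags)
{
    return extraFlags | static_cast<std::uint32_t>(ANGLE_COMPILE_OPTIMIZATION_LEVEL);
}
```

Building with -DANGLE_COMPILE_OPTIMIZATION_LEVEL=D3DCOMPILE_OPTIMIZATION_LEVEL0 (or the equivalent gyp define) would then switch every compile to the lowest optimization level without touching the code.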
(In reply to comment #7)
> Created attachment 535131 [details]
> local variables when before the stalling D3DCompile() call
> 
> This attachment gives  the values of the parameters we're passing to
> D3DCompile(), making it stall. In particular, the HLSL shader source is:
> 
> 
> float2 vec2(float x0, float x1)
> {
>     return float2(x0, x1);
> }
> float4 vec4(float2 x0, float x1, float x2)
> {
>     return float4(x0, x1, x2);
> }
> float4 vec4(float3 x0, float x1)
> {
>     return float4(x0, x1);
> }
> // Varyings
> 
> static float4 gl_Color[1] = {float4(0, 0, 0, 0)};

This can't possibly be the complete HLSL shader, as there is no main function. Perhaps someone can provide the original GLSL text?
Oh --- I got that from the MSVC debugger; probably it truncated the string, I'm not very used to it.
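One way around debugger string truncation is to write the full source to a file just before the compile call. A minimal C++ sketch follows; dumpShaderSource is a hypothetical helper, not an existing ANGLE function, and the output path is whatever the caller chooses.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical debugging helper (not ANGLE code): writes the complete HLSL
// source to `path` so it can be inspected without debugger truncation.
// Returns true if the file was opened and all bytes were written.
bool dumpShaderSource(const std::string &path, const char *hlsl, std::size_t length)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out)
        return false;
    out.write(hlsl, static_cast<std::streamsize>(length));
    return static_cast<bool>(out.flush());
}
```

Calling it with the same pointer and length that are about to be passed to D3DCompile() would capture exactly what the compiler sees.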
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0a1) Gecko/20111003 Firefox/10.0a1
Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:10.0a1) Gecko/20111003 Firefox/10.0a1

webgl.prefer-native-gl;true = works
WebGL Renderer = NVIDIA Corporation -- GeForce GT 240/PCI/SSE2 -- 3.3.0


webgl.prefer-native-gl;false = not responding
WebGL Renderer = Google Inc. -- ANGLE (NVIDIA GeForce GT 240) -- OpenGL ES 2.0 (ANGLE 0.0.0.774)



Adapter Description = NVIDIA GeForce GT 240
Vendor ID = 10de
Device ID = 0ca3
Adapter RAM = 1024
Adapter Drivers = nvd3dumx,nvwgf2umx,nvwgf2umx nvd3dum,nvwgf2um,nvwgf2um
Driver Version = 8.17.12.8026
Driver Date = 8-3-2011
Adapter RAM (GPU #2) = Unknown
Adapter Drivers (GPU #2) = Unknown
Direct2D Enabled = true
DirectWrite Enabled = true (6.1.7601.17563)
ClearType Parameters = ClearType parameters not found
GPU Accelerated Windows = 1/1 Direct3D 10
Daniel explained to me the reason why Chrome is unaffected: they disable optimization in the D3D compiler by defining ANGLE_COMPILE_OPTIMIZATION_LEVEL to D3DCOMPILE_OPTIMIZATION_LEVEL0. Patch coming.
This comes from ANGLE's build_angle.gyp:

  'target_defaults': {
    'defines': [
      'ANGLE_DISABLE_TRACE',
      'ANGLE_COMPILE_OPTIMIZATION_LEVEL=D3DCOMPILE_OPTIMIZATION_LEVEL0',
    ],
  },

I also copied the ANGLE_DISABLE_TRACE bit, as it only disables a tracing feature that we don't use, so there's no reason to differ from the upstream gyp file here.
Attachment #564343 - Flags: review?(jmuizelaar)
Comment on attachment 564343 [details] [diff] [review]
use same defines as in ANGLE's own gyp file

Review of attachment 564343 [details] [diff] [review]:
-----------------------------------------------------------------

Putting this in all of the files is silly
Attachment #564343 - Flags: review?(jmuizelaar) → review-
+CC Ted and Kyle

What is the Good Way of adding some DEFINES in a Makefile in a way that propagates to other Makefiles in sub-directories?
The only way to do that correctly would be to define it in configure. You could ostensibly "export DEFINES" in a Makefile, but if you build in a subdirectory they won't take effect.
Comment on attachment 564343 [details] [diff] [review]
use same defines as in ANGLE's own gyp file

ergo, this should be good enough for you. rerequesting review!
Attachment #564343 - Flags: review- → review?
Attachment #564343 - Flags: review? → review?(jmuizelaar)
And Ted, please add this to my already long "we wouldn't have had this problem with cmake or any sane buildsystem-generator language" list.
Comment on attachment 564343 [details] [diff] [review]
use same defines as in ANGLE's own gyp file

Indeed.
Attachment #564343 - Flags: review?(jmuizelaar) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/65556aa65339

Nice hex value. This means that I have good karma and the push is going to be green.
https://hg.mozilla.org/mozilla-central/rev/65556aa65339
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
(In reply to Benoit Jacob [:bjacob] from comment #25)
> And Ted, please add this to my already long "we wouldn't have had this
> problem with cmake or any sane buildsystem-generator language" list.

We wouldn't have this problem with a non-recursive make setup either. We're all aware that's a problem...
The irradiance test URL still makes the browser stop responding on my machine with the latest hourly build, which includes this patch.

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0a1) Gecko/20111004 Firefox/10.0a1
http://hg.mozilla.org/mozilla-central/rev/3c9e65a1a5bb



  Graphics
  Adapter Description: ATI Radeon HD 3200 Graphics
  Vendor ID: 1002
  Device ID: 9610
  Adapter RAM: 256
  Adapter Drivers: aticfx64 aticfx64 aticfx32 aticfx32 atiumd64 atidxx64 atiumdag atidxx32 atiumdva atiumd6a atitmm64
  Driver Version: 8.881.0.0
  Driver Date: 7-28-2011
  Adapter RAM (GPU #2): Unknown
  Adapter Drivers (GPU #2): Unknown
  Direct2D Enabled: true
  DirectWrite Enabled: true (6.1.7601.17563)
  ClearType Parameters: DISPLAY1 [ Gamma: 2200 Pixel Structure: RGB ClearType Level: 100 Enhanced Contrast: 300 ] DISPLAY4 [ Gamma: 2200 Pixel Structure: RGB ClearType Level: 100 Enhanced Contrast: 50 ]
  WebGL Renderer: Google Inc. -- ANGLE (ATI Radeon HD 3200 Graphics) -- OpenGL ES 2.0 (ANGLE 0.0.0.774)
  GPU Accelerated Windows: 1/1 Direct3D 10
FWIW, the latest dev build of Chrome also stops responding, but it shows a nice dialog box offering to 'wait' or 'kill page', whereas Firefox just hangs until you kill the browser with Task Manager.

Chrome:
16.0.899.0 dev-m
(In reply to Jim Jeffery not reading bug-mail 1/2/11 from comment #31)
> FWIW latest dev build of Chrome also goes not responding, but they give a
> nice dialog box asking to 'wait' or 'kill page' rather than like Firefox
> just hangs until you kill the browser with task manager.
> 
> Chrome:
> 16.0.899.0 dev-m

Oh yes, I forgot about that but Daniel told me that this change (disabling D3DCompile optimization) didn't fix the problem in all cases, but it still helps in many cases.

Reopening this bug, but again this doesn't mean that this change was not useful.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I was able to reproduce this bug in Firefox 8.0 Beta (clean profile).

OS: Windows 7 x64 SP1

Since this seems to be related to Direct3D, video hardware info may be important.

ATI Radeon HD 5850
1GB VRAM
Driver Version 11.9 (9/28/2011 - Latest available at time of writing)

To reproduce:
1. Navigate to URL

Shortly after loading the page, Firefox becomes unresponsive. Its CPU usage remains high and its memory usage increases at a constant rate (I watched the memory usage increase for a minute or two.).

I've attached a stack trace taken by navigating to the problem URL, letting Firefox hang, and breaking into the debugger shortly after.
Depends on: 791733
Whiteboard: webgl-angle
Update:
The demo no longer works; it now reports "A shader failed to compile."
So I currently cannot test whether the "not responding" problem still exists.
Updated my test for this problem: http://codeflow.org/issues/unroll-problem/

It's also available as a conformance test here: https://www.khronos.org/registry/webgl/sdk/tests/conformance/glsl/misc/large-loop-compile.html?webglVersion=1 which passes for both Chrome and Firefox on Linux and OS X.
Does this still reproduce?
Flags: needinfo?(bugs)
Whiteboard: webgl-angle → webgl-angle [gfx-noted]
Version: Trunk → 2.0 Branch
I have retested it with Firefox 49 and 52 (32-bit and 64-bit) on Windows 10 and Windows 7.
It is ok for me.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 8 years ago
Flags: needinfo?(bugs)
Resolution: --- → WORKSFORME