Closed Bug 749459 Opened 12 years ago Closed 12 years ago

The desktop web runtime fails to create WebGL contexts because it can't load the ANGLE libEGL.dll

Categories

(Firefox Graveyard :: Webapp Runtime, defect, P1)

Tracking

(blocking-kilimanjaro:+)

VERIFIED FIXED
Firefox 15

People

(Reporter: jsmith, Assigned: benjamin)

Details

(Whiteboard: [topapps], [blocking-webrtdesktop1+], [appreview-blocker])

Attachments

(5 files, 1 obsolete file)

The 3D context test (http://www.khronos.org/registry/webgl/specs/latest/) fails in the desktop web runtime. Strangely enough, it does not fail in desktop Firefox on Nightly. This bug tracks the investigation and any fixes needed to get the 3D context test to pass in the web runtime.
Benoit & Joe - Do you know what the 3D context WebGL test on html5test.com does? Do you have any test cases I could use that relate to the 3D context test with WebGL?

I'm trying to figure out the root cause of why the test fails in the web runtime but passes in desktop Firefox.
Summary: WebGL 3D Context HTML 5 Test is Failing in the Desktop Runtime, Does Not Fail in Desktop Firefox in Nightly → The desktop web runtime does not support WebGL 3D graphics
Dug into this issue. It looks like 3D graphics are not supported in the web runtime. This makes at least one top app I know of, Biodigital Human, unusable in the web runtime. See the attached screenshot for an example of the implications.
Whiteboard: [topapps]
Whiteboard: [topapps] → [topapps], [marketplace-beta=]
Why is WebGL not enabled in WebRT? Where is the code for WebRT, so we can check what's going on?

Are you sure that this is not just an issue with a blacklisted graphics driver? (Set the preference webgl.force-enabled=true)
Am I doing this correctly?

http://ed.agadak.net/app/

You can install the app from that page then there'll be a link for "webgl 3d" on the second line which is just a local copy of:
https://developer.mozilla.org/samples/webgl/sample5/index.html

(It links to http://ed.agadak.net/app/gl/ which seems to work fine for me in a native app.)
(In reply to Edward Lee :Mardak from comment #5)
> Am I doing this correctly?
> 
> http://ed.agadak.net/app/
> 
> You can install the app from that page then there'll be a link for "webgl
> 3d" on the second line which is just a local copy of:
> https://developer.mozilla.org/samples/webgl/sample5/index.html
> 
> (It links to http://ed.agadak.net/app/gl/ which seems to work fine for me in
> a native app.)

Weird. That does not work for me. On Desktop Firefox, it works. On the desktop runtime, I get an error saying "Unable to initialize WebGL. Your browser may not support it".
Attached image WebGL FF vs. WebRT
Where can I get a WebRT build to test myself?
Where does the source code live?
The webapprt is a binary that loads libxul and then a xul window with the web page. http://mxr.mozilla.org/mozilla-central/source/webapprt/win/webapprt.cpp
http://mxr.mozilla.org/mozilla-central/source/webapprt/content/webapp.xul

Does the top-level XUL <window> need some attribute to trigger hardware acceleration and WebGL? Or is there some logging in the WebGL code that can be enabled with a pref to see where it's failing?
This issue also occurs with another top app, Tinkercad. I can confirm it happens on another person's machine as well.
Third person confirmed. We should also check whether this is a Windows 7-specific issue, because I know Ed is running Mac, and on Mac, WebGL works.
Change comment 11 - I saw this happen on Mac OS X 10.5 too.
Blocks: 737571
I have no clue about XUL. Clearly, the best way to debug this is to set a breakpoint in WebGLContext::SetDimensions and step through it: where does it fail?
blocking-kilimanjaro: --- → +
Could you please point me to build instructions for WebRT on Linux? I would like to debug this.
Unfortunately, we haven't yet implemented the runtime for Linux (bug 745018).
Making a windows build now. Trying 32bit as that was unspecified.
(In reply to Jason Smith [:jsmith] from comment #12)
> Change comment 11 - I saw this happen on Mac OS X 10.5 too.

Oh, WebGL is blacklisted on Mac OS 10.5.

Anyway, I have a Windows debug build now. I do see webapprt-stub.exe but when I run it, it complains that it can't open webapp.ini. Trying to figure how to run a Web app.
Got it, installed 'ed agadak' from above link, got a desktop icon pointing apparently to a new exe file, ran it, clicked the webgl link, reproduced the WebGL context creation failure. Debugging now.
Alright, that was fairly easy debugging. It should really be Windows-specific (see comment 17).

What's happening is that it fails to load libEGL.dll. We ship that DLL together with Firefox, so firefox.exe naturally finds it, as it's in the same directory.

Here's the full list of DLLs that we ship with Firefox, that are needed for WebGL:
libEGL.dll
libGLESv2.dll
D3DCompiler_43.dll
d3dx9_43.dll

The best way to solve this bug is to make Web Apps able to find these DLLs where they currently are. That should be possible as they have to load xul.dll anyways, and these DLLs are in the same directory as libxul.
Summary: The desktop web runtime does not support WebGL 3D graphics → The desktop web runtime fails to create WebGL contexts because it doesn't find libEGL.dll (which is in same directory as xul.dll)
bsmedberg helped us figure out some other library loading issues; cc:ing him for his insight into this one.
Where is this library built, and how is it loaded by xul.dll? Normally direct dependencies of xul.dll should be listed in dependentlibs.list (built from xpcom/stub/Makefile.in). But if this library is dynamically loaded we should just load GRE_DIR/dllname.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #21)
> Where is this library built,

It's built by gfx/angle/src/libEGL/Makefile.in and the companion libGLESv2.dll library is built by gfx/angle/src/libGLESv2/Makefile.in.

They rely on DirectX SDK libraries that are copied from their system location into dist/bin by gfx/angle/Makefile.in.

> and how is it loaded by xul.dll?

We are doing basically

  PR_LoadLibrary("libEGL.dll");

at GLLibraryEGL.cpp:101.

By the way, that code also allows specifying an explicit path with the gfx.angle.egl.path preference.

> Normally
> direct dependencies of xul.dll should be listed in dependentlibs.list (built
> from xpcom/stub/Makefile.in). But if this library is dynamically loaded we
> should just load GRE_DIR/dllname.

It is dynamically loaded (PR_LoadLibrary).
It is not correct or especially safe to load a library without a full pathname, although we've mostly fixed this in gecko. This code *should* be using a correct default. I'm a bit surprised that we allowed a pref-controlled path to end up in release code: what's the use-case for that?

We should be following the first codepath (inside the do loop) at http://hg.mozilla.org/mozilla-central/annotate/89e9b9213670/gfx/gl/GLLibraryEGL.cpp#l66 in all cases, and if we fail to get the pref we should default to getting NS_GRE_DIR and using that. That doesn't account for the oddity of LoadApitraceLibrary, which I don't understand.

The fallback code at http://hg.mozilla.org/mozilla-central/annotate/89e9b9213670/gfx/gl/GLLibraryEGL.cpp#l96 (which is what we use by default currently) doesn't load libGLESv2.dll first, and we need to do that explicitly.
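The load order described above can be sketched as follows. This is a minimal illustration, not the actual patch: `LoadAngleLibraries` and the `LoadFn` callback are hypothetical names, and the callback stands in for whatever really loads a library by full path (in Gecko, something like `nsIFile::Load` or `PR_LoadLibrary`). The directory argument is assumed to be the resolved NS_GRE_DIR.

```cpp
#include <filesystem>
#include <functional>

namespace fs = std::filesystem;

// Hypothetical loader callback: takes a full path, returns true on success.
// It is injected so that the sketch does not depend on a real DLL loader.
using LoadFn = std::function<bool(const fs::path&)>;

// Load the ANGLE libraries from greDir by full path, dependency first.
// Returns true only if both loads succeed.
bool LoadAngleLibraries(const fs::path& greDir, const LoadFn& load) {
    // libEGL.dll depends on libGLESv2.dll, so GLESv2 has to be loaded first.
    static const char* const kLoadOrder[] = {"libGLESv2.dll", "libEGL.dll"};
    for (const char* name : kLoadOrder) {
        // Always pass a full path; never rely on the DLL search path.
        if (!load(greDir / name)) {
            return false;  // stop at the first failure
        }
    }
    return true;
}
```

Passing the loader in as a callback keeps the ordering logic separate from the platform call, so the order can be checked without a Windows machine.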
https://marketplace.mozilla.org/en-US/app/tinkercad/

Affected app; adding it here so we know when to retest.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #23)
> It is not correct or especially safe to load a library without a full
> pathname, although we've mostly fixed this in gecko. This code *should* be
> using a correct default. I'm a bit surprised that we allowed a
> pref-controlled path to end up in release code: what's the use-case for that?

Frankly, I just found out about that pref while looking at the code to debug this, so at least it's not widely used. Maybe the use case was to allow using a different OpenGL ES implementation than the one we ship, but this is such a niche use case that it's fine to have to manually replace files to achieve that. So let's remove this pref.

CC'ing Vlad.

> 
> We should be following the first codepath (inside the do loop) at
> http://hg.mozilla.org/mozilla-central/annotate/89e9b9213670/gfx/gl/
> GLLibraryEGL.cpp#l66 in all cases, and if we fail to get the pref we should
> default to getting NS_GRE_DIR and using that. That doesn't account for the
> oddity of LoadApitraceLibrary, which I don't understand.

CC'ing George for LoadApitraceLibrary.

On Windows, we want to load the DLLs from our own dist/bin directory. On other platforms (typically, Android) we want to load the DLLs from system locations (like /system/lib).

Currently, every time we load OpenGL DLLs from system locations, we do it with PR_LoadLibrary without specifying a full path. Why is that unsafe? How else can we achieve that?

> 
> The fallback code at
> http://hg.mozilla.org/mozilla-central/annotate/89e9b9213670/gfx/gl/
> GLLibraryEGL.cpp#l96 (which is what we use by default currently) doesn't
> load libGLESv2.dll first, and we need to do that explicitly.

The order in which libEGL.dll and libGLESv2.dll are loaded, doesn't matter.
> The order in which libEGL.dll and libGLESv2.dll are loaded, doesn't matter.

Are you sure? The comment at http://hg.mozilla.org/mozilla-central/annotate/89e9b9213670/gfx/gl/GLLibraryEGL.cpp#l76 says the exact opposite.

If we know at build time whether we are looking for system libraries or self-built libraries we can just switch this around at build time. Although if they are system libs, why not just link directly with them instead of doing this dance with loadlibrary? Are they so expensive to load that the delay-load is worth it?
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #26)
> > The order in which libEGL.dll and libGLESv2.dll are loaded, doesn't matter.
> 
> Are you sure? The comment at
> http://hg.mozilla.org/mozilla-central/annotate/89e9b9213670/gfx/gl/
> GLLibraryEGL.cpp#l76 says the exact opposite.

Oh, I hadn't noticed that. OK, that makes sense, sorry.

> 
> If we know at build time whether we are looking for system libraries or
> self-built libraries we can just switch this around at build time.

We do know at build-time whether we want to load the DLLs from our own directory or from system directories: #ifdef XP_WIN tells us that.

So, what is the proper way of loading these libraries from system directories (on non-Windows platforms)?

> Although
> if they are system libs, why not just link directly with them instead of
> doing this dance with loadlibrary? Are they so expensive to load that the
> delay-load is worth it?

I haven't measured it or even checked if that works. A typical problem with OpenGL is that many OpenGL functions are not exposed by the OpenGL libraries in the normal way and the only way to load them is to use a dedicated function such as eglGetProcAddress in EGL  (in particular, dlsym() can't be used to load OpenGL functions in general). So I'm not sure that linking to OpenGL ES libraries will work.
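For illustration, one common way to cope with this is to try the library's export table first and fall back to the dedicated query function. The sketch below only models that pattern: `ResolveGLSymbol`, `LookupFn` and `GenericFn` are hypothetical names, the two callbacks stand in for a dlsym-style lookup and an eglGetProcAddress-style lookup, and nothing here is claimed to be what Gecko does.

```cpp
#include <functional>
#include <string>

// Generic function-pointer type, standing in for what a GL loader returns.
using GenericFn = void (*)();

// Hypothetical lookup callback: returns nullptr when the name is unknown.
using LookupFn = std::function<GenericFn(const std::string&)>;

// Try the export table first (dlsym / GetProcAddress style), then fall back
// to the dedicated query function (eglGetProcAddress style).
// Returns nullptr if neither source knows the symbol.
GenericFn ResolveGLSymbol(const std::string& name,
                          const LookupFn& exportLookup,
                          const LookupFn& procLookup) {
    if (GenericFn fn = exportLookup(name)) {
        return fn;
    }
    return procLookup(name);
}
```

Static linking only covers the first of these two lookups, which is why it may not be enough on its own.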
Priority: -- → P1
Still waiting on answers to my questions in comment 27.
Can someone from our desktop WebRT engineering team (Felipe?, Myk?) address Benoit's questions in comment 27?

Also, got a question for the GFX guys - Should we block or not block the 1st release of the WebRT release with not having WebGL support due to this bug?
Well, it's up to you guys to decide how important WebGL is for you. But if it's at all important, we should just fix this bug, because it should be very easy.
Assignee: nobody → benjamin
Status: NEW → ASSIGNED
I'm still a bit puzzled about the interceptor, but other than that I can whip up a patch for this very easily.
No longer blocks: 737571
(In reply to Jason Smith [:jsmith] from comment #29)

> Also, got a question for the GFX guys - Should we block or not block the 1st
> release of the WebRT release with not having WebGL support due to this bug?

The Product team marked this blocking. WebRT is Gecko. Not having major Gecko features in the WebRT is not OK. Thus, this blocks.
Whiteboard: [topapps], [marketplace-beta=] → [topapps], [blocking-webrtdesktop1+]
(In reply to Asa Dotzler [:asa] from comment #32)
> (In reply to Jason Smith [:jsmith] from comment #29)
> 
> > Also, got a question for the GFX guys - Should we block or not block the 1st
> > release of the WebRT release with not having WebGL support due to this bug?
> 
> The Product team marked this blocking. WebRT is Gecko. Not having major
> Gecko features in the WebRT is not OK. Thus, this blocks.

Sounds good. Flagging as a blocking bug for the 1st release of the desktop web runtime.
Target Milestone: --- → M1
Attached patch Load EGL more correctly, rev. 1 (obsolete) — Splinter Review
Attachment #624117 - Flags: review?(bjacob)
Attachment #624117 - Flags: feedback?(gwright)
This passed tryserver, but I'm not sure how good our test coverage is for this kind of thing: https://tbpl.mozilla.org/?tree=Try&rev=d19d1e7da6ee
Comment on attachment 624117 [details] [diff] [review]
Load EGL more correctly, rev. 1

Review of attachment 624117 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLLibraryEGL.cpp
@@ +93,5 @@
> +    // On non-Windows (Android) we use system copies of libEGL. We look for
> +    // the APITrace lib, libEGL.so, and libEGL.so.1 in that order.
> +
> +    if (!mEGLLibrary)
> +        mEGLLibrary = LoadApitraceLibrary();

Shouldn't this be in an #ifdef ANDROID block? I don't think this will compile as is.
Comment on attachment 624117 [details] [diff] [review]
Load EGL more correctly, rev. 1

Yeah, I'll fix that on checkin. It doesn't matter in practice because this code is currently only built on Windows and Android.
Comment on attachment 624117 [details] [diff] [review]
Load EGL more correctly, rev. 1

Review of attachment 624117 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLLibraryEGL.cpp
@@ +59,5 @@
>  #ifdef XP_WIN
> +    if (!mEGLLibrary) {
> +        // On Windows, the EGL and GLESv2 libraries are shipped with libxul and
> +        // we should look for them there. We have to load the libs in this
> +        // order, because libEGL.dll depends on libGLESv2.dll.

"in this order" in this sentence is a bit surprising because earlier in this sentence you say "EGL and GLESv2" but they have to be loaded in the opposite order. I'd rephrase the beginning into "GLESv2 and EGL".

@@ +71,3 @@
>  
> +        rv = dirService->Get(NS_GRE_DIR, NS_GET_IID(nsIFile),
> +                             getter_AddRefs(libraryFile));

I am very unfamiliar with this file/directory API. Any pointer to documentation or headers?
Attachment #624134 - Flags: review?(bjacob)
Comment on attachment 624134 [details] [diff] [review]
Load EGL more correctly, rev. 1.1

Review of attachment 624134 [details] [diff] [review]:
-----------------------------------------------------------------

R=me but a comment mostly for Vlad about the existing code:

::: gfx/gl/GLLibraryEGL.cpp
@@ +79,4 @@
>  
> +        libraryFile->Load(&glesv2lib);
> +
> +        // Intentionally leak glesv2lib

I'm really not a fan of intentionally leaking. AIUI, this means that on Windows, if we open a page that does WebGL and close it, our vsize will stay increased by the GLESv2 library size. Could matter on 32-bit. Can we not leak?
Attachment #624134 - Flags: review?(bjacob) → review+
By the way, this is quite well covered by unit tests:
 - on Android, failure to load EGL or GLESv2 would mean failure to start at all
 - on Windows, failure would cause the WebGL mochitest to fail with 'cannot create a WebGL context'.
Attachment #624117 - Attachment is obsolete: true
Attachment #624117 - Flags: review?(bjacob)
Attachment #624117 - Flags: feedback?(gwright)
As far as I can see we never unload the EGL library, and leaving gles in memory isn't really avoidable as long as EGL is loaded.
OK. We could unload the EGL library though. I don't know.
I think that would be very hard and potentially crashy (dead pointers), and then we'd have to pay a penalty to reload it the next time. Although I'd be willing to see data proving me wrong! ;-)
https://hg.mozilla.org/integration/mozilla-inbound/rev/2716f09884b1

Unable to set the TM to Firefox 15...
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #46)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/2716f09884b1
> 
> Unable to set the TM to Firefox 15...

Until we get this moved over to the firefox product, use M1 (which is already set).
https://hg.mozilla.org/mozilla-central/rev/2716f09884b1
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Whiteboard: [topapps], [blocking-webrtdesktop1+] → [topapps], [blocking-webrtdesktop1+], [qa+]
Component: Desktop Runtime → Webapp Runtime
Product: Web Apps → Firefox
Target Milestone: M1 → Firefox 15
Requesting to reopen. This does not work on the current nightly build on both Windows 7 64-bit and Windows XP. I did two different test cases:

Test Case 1

1. Install Ed's app here - http://ed.agadak.net/app
2. Launch Ed's app
3. Launch the WebGL demo in Ed's app

Result: Error reported saying unable to initialize WebGL

Test Case 2

1. Install Tinkercad
2. Launch Tinkercad

Result: A "Get WebGL Supported browser to design with Tinkercad" message is shown. Clicking it says that my browser does not support WebGL.
Re-tested just to be sure for today's build - This definitely isn't fixed. I am getting the same issues mentioned in comment 49. Reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [topapps], [blocking-webrtdesktop1+], [qa+] → [topapps], [blocking-webrtdesktop1+]
I tested the nightly against ed.agadak.net and it seems to work fine... I'm not sure what to do next here...
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #51)
> I tested the nightly against ed.agadak.net and it seems to work fine... I'm
> not sure what to do next here...

Tested directly in Nightly? Or installed the app via 'install ed agadak' and tested within the app? It doesn't work for me in the app, using May 22nd's Nightly, though it does work directly in the browser.
(Win7, btw)
Installed app test.
So WebGL is blocked by default for me in Nightly, but I have it force-enabled through about:config. WebGL kinda-sorta works in my browser set up like this.

Installing the app version, I don't think my force-enabled preference carried over to the app runtime, so it gives me the "unable to initialize" error.
<bsmedberg> edmorley: go to http://ed.agadak.net/app/
<bsmedberg> first click "webgl_3d" and make sure canvas3d is working
<bsmedberg> then go back and choose "install ed_agadak"
<bsmedberg> then open ed_agadak from your start menu and report whether canvas3d is working there

Works here, the cube animates in the app window. I can post about support if need be.
Strange. Tested again, still seeing the same issue. Graphics info:

Adapter Description: Intel(R) HD Graphics Family
Adapter Description (GPU #2): NVIDIA Quadro 1000M
WebGL Renderer: Google Inc. -- ANGLE (Intel(R) HD Graphics Family) -- OpenGL ES 2.0 (ANGLE 1.0.0.1041)
GPU Accelerated Windows: 2/2 Direct3D 9
AzureBackend: direct2d
(In reply to Jim Mathies [:jimm] from comment #55)
> Works here, the cube animates in the app window. I can post about support if
> need be.

Could you post your about:support graphics info here?
(In reply to Jason Smith [:jsmith] from comment #57)
> (In reply to Jim Mathies [:jimm] from comment #55)
> > Works here, the cube animates in the app window. I can post about support if
> > need be.
> 
> Could you post your about:support graphics info here?

Graphics

Adapter Description: NVIDIA Quadro NVS 295
Vendor ID: 0x10de
Device ID: 0x06fd
Adapter RAM: 256
Adapter Drivers: nvd3dumx,nvwgf2umx,nvwgf2umx nvd3dum,nvwgf2um,nvwgf2um
Driver Version: 8.17.12.9635
Driver Date: 3-22-2012
Direct2D Enabled: true
DirectWrite Enabled: true (6.1.7601.17789)
ClearType Parameters: Gamma: 2200 Pixel Structure: RGB ClearType Level: 100 Enhanced Contrast: 50
WebGL Renderer: Google Inc. -- ANGLE (NVIDIA Quadro NVS 295) -- OpenGL ES 2.0 (ANGLE 1.0.0.1041)
GPU Accelerated Windows: 1/1 Direct3D 10
AzureBackend: direct2d

User prefs are all pretty much default. Nothing gfx-related has been changed.
(In reply to Jason Smith [:jsmith] from comment #56)
> Strange. Tested again, still seeing the same issue. Graphics info:
> 
> Adapter Description: Intel(R) HD Graphics Family
> Adapter Description (GPU #2): NVIDIA Quadro 1000M
> WebGL Renderer: Google Inc. -- ANGLE (Intel(R) HD Graphics Family) -- OpenGL
> ES 2.0 (ANGLE 1.0.0.1041)
> GPU Accelerated Windows: 2/2 Direct3D 9
> AzureBackend: direct2d

Oh, you have Optimus. We have lots of trouble getting the blacklisting to work correctly in this case. This is a known issue, orthogonal to WebRT. The best path toward resolving it will be to get a Direct3D 11 back-end for ANGLE. This is a long-term project (maybe late 2012). Not clear yet who's going to do that work.
(In reply to Benoit Jacob [:bjacob] from comment #59)
> (In reply to Jason Smith [:jsmith] from comment #56)
> > Strange. Tested again, still seeing the same issue. Graphics info:
> > 
> > Adapter Description: Intel(R) HD Graphics Family
> > Adapter Description (GPU #2): NVIDIA Quadro 1000M
> > WebGL Renderer: Google Inc. -- ANGLE (Intel(R) HD Graphics Family) -- OpenGL
> > ES 2.0 (ANGLE 1.0.0.1041)
> > GPU Accelerated Windows: 2/2 Direct3D 9
> > AzureBackend: direct2d
> 
> Oh, you have Optimus. We have lots of trouble getting the blacklisting to
> work correctly in this case. This is a known issue, orthogonal to WebRT. The
> best path toward resolving it will be to get a Direct3D 11 back-end for
> ANGLE. This is a long-term project (maybe late 2012). Not clear yet who's
> going to do that work.

Oh okay. So why does this issue then only happen in the web runtime, but not on desktop firefox?
(In reply to Jason Smith [:jsmith] from comment #60)
> Oh okay. So why does this issue then only happen in the web runtime, but not
> on desktop firefox?

That's weird. Do you have relevant modified preferences in your desktop firefox?
(In reply to Benoit Jacob [:bjacob] from comment #61)
> (In reply to Jason Smith [:jsmith] from comment #60)
> > Oh okay. So why does this issue then only happen in the web runtime, but not
> > on desktop firefox?
> 
> That's weird. Do you have relevant modified preferences in your desktop
> firefox?

Don't think so. Just tried testing on a fresh profile on the latest nightly - still getting a reproduction of this issue.
Whiteboard: [topapps], [blocking-webrtdesktop1+] → [topapps], [blocking-webrtdesktop1+], [appreview-blocker]
(In reply to Benoit Jacob [:bjacob] from comment #59)
> (In reply to Jason Smith [:jsmith] from comment #56)
> > Strange. Tested again, still seeing the same issue. Graphics info:
> > 
> > Adapter Description: Intel(R) HD Graphics Family
> > Adapter Description (GPU #2): NVIDIA Quadro 1000M
> > WebGL Renderer: Google Inc. -- ANGLE (Intel(R) HD Graphics Family) -- OpenGL
> > ES 2.0 (ANGLE 1.0.0.1041)
> > GPU Accelerated Windows: 2/2 Direct3D 9
> > AzureBackend: direct2d
> 
> Oh, you have Optimus. We have lots of trouble getting the blacklisting to
> work correctly in this case. This is a known issue, orthogonal to WebRT. The
> best path toward resolving it will be to get a Direct3D 11 back-end for
> ANGLE. This is a long-term project (maybe late 2012). Not clear yet who's
> going to do that work.

btw, I have a different graphics setup to Jason and have the same issue:
Adapter Description: AMD Radeon HD 6800 Series
WebGL Renderer: Google Inc. -- ANGLE (AMD Radeon HD 6800 Series) -- OpenGL ES 2.0 (ANGLE 1.0.0.1041)
GPU Accelerated Windows: 1/1 Direct3D 10
AzureBackend: direct2d

I quickly hacked an app to have about:support as its origin, and it confirms that the only difference in the graphics details compared to Nightly is:
WebGL Renderer: false

There are no modified graphics preferences in the app or Nightly (other than one that I assume is set automatically, "gfx.direct3d.prefer_10_1: true", and it's the same in both).
Are there any gfx.blacklist.* preferences? If yes, reset them.
(In reply to Benoit Jacob [:bjacob] from comment #64)
> Are there any gfx.blacklist.* preferences? If yes, reset them.

There aren't.  The only gfx. (or webgl.) modified preference in either Nightly or the app is "gfx.direct3d.prefer_10_1".
(In reply to Andrew Williamson [:eviljeff] from comment #65)
> (In reply to Benoit Jacob [:bjacob] from comment #64)
> > Are there any gfx.blacklist.* preferences? If yes, reset them.
> 
> There aren't.  The only gfx. (or webgl.) modified preference in either
> Nightly or the app is "gfx.direct3d.prefer_10_1".

Same here. Random question - How did you get about:support to be the origin of the application? I'd like to take a look at my about:support info as well.
(In reply to Jason Smith [:jsmith] from comment #66)
> Same here. Random question - How did you get about:support to be the origin
> of the application? I'd like to take a look at my about:support info as well.

I edited webapp.json and changed "origin" in it to be "about:support".  You might need to delete launch_path also, depending on the app.  (Obviously back up the original webapp.json if you want to use the app again without deleting and reinstalling)
(In reply to Andrew Williamson [:eviljeff] from comment #67)
> (In reply to Jason Smith [:jsmith] from comment #66)
> > Same here. Random question - How did you get about:support to be the origin
> > of the application? I'd like to take a look at my about:support info as well.
> 
> I edited webapp.json and changed "origin" in it to be "about:support".  You
> might need to delete launch_path also, depending on the app.  (Obviously
> back up the original webapp.json if you want to use the app again without
> deleting and reinstalling)

Gotcha. That's weird. I'll have to test that scenario. Anyways, I'm also seeing WebGL Renderer as false in the desktop web runtime.
WebGL works for me in the 'ed agadak' app, i.e. I can't reproduce the failure, on a ThinkPad W520 configured to stay on the NVIDIA card. I'll try ASAP in default Optimus mode.
Tried with Optimus: it's working for me. The 'ed agadak' app has WebGL. So does Firefox. Both are now using the Intel GPU. I don't see what else I can try to reproduce.
(In reply to Benoit Jacob [:bjacob] from comment #70)
> Tried with Optimus: it's working for me. The 'ed agadak' app has WebGL. So
> does Firefox. Both are now using the Intel GPU. I don't see what else I can
> try to reproduce.

Weird. I think at this point, we'll need to debug this directly on a machine that WebGL isn't working on. I tried again - still not working for me. Is there any other diagnostic information I could provide to better catch the root cause of this problem?
I ran Ed's app against the latest debug build on the failing machine in the QA testing lab, and I noted that the following messages appeared in the console when I loaded the WebGL test:

WARNING: Couldn't load libEGL.dll, canvas3d will be disabled.: file e:/builds/moz2_slave/m-cen-w32-dbg/build/gfx/gl/GLLibraryEGL.cpp, line 87
JavaScript warning: http://ed.agadak.net/app/gl/webgl-demo.js, line 67: WebGL: Error during ANGLE OpenGL ES initialization

So I installed Visual Studio 2010 and configured it to use Mozilla's symbol server, but then I got stuck looking for a debug nightly and ran out of time.  (I just found it in https://ftp.mozilla.org/pub/firefox/nightly/2012-05-25-mozilla-central-debug/, but at this point I'm on the train home.)

I'll pick this back up on Wednesday, May 30, when I'm back in Mountain View.

Or is there someone else in Mountain View who can get to it sooner?
Many thanks Myk, this is already interesting as it shows that the issue on this machine is the exact same issue we had originally in this bug and not, as I was afraid of in comment 59, an issue in our blacklisting logic. I confirm that the next steps you outline (running this in a debugger) are the right thing to do.

Side note --- this warning message is an archaeological jewel, it still uses the term 'canvas3d' which was the ancestor of WebGL.
(In reply to Benoit Jacob [:bjacob] from comment #73)
> Many thanks Myk, this is already interesting as it shows that the issue on
> this machine is the exact same issue we had originally in this bug and not,
> as I was afraid of in comment 59, an issue in our blacklisting logic. I
> confirm that the next steps you outline (running this in a debugger) are the
> right thing to do.
> 
> Side note --- this warning message is an archaeological jewel, it still uses
> the term 'canvas3d' which was the ancestor of WebGL.

Should we open a separate bug for this issue then, knowing that it's different? Then move this bug to verified fixed, given that we've confirmed the original bug found is fixed?
Note that in comment 19 I listed 4 DLLs that must be found in order for this to work. These are normally present in the same directory as xul.dll. If any of these 4 DLLs is not found, that will probably give this "Couldn't load libEGL.dll" message i.e. the filename in this message may be misleading.
(In reply to Jason Smith [:jsmith] from comment #74)
> Should we open a separate bug for this issue then, knowing that it's
> different? Then move this bug to verified fixed, given that we've confirmed
> the original bug found is fixed?

Sorry if my comment 73 was unclear, but to the contrary, I was trying to say that the remaining issue _is_ exactly the same as we were originally debugging here, and _not_ a separate issue as I feared.
Summary: The desktop web runtime fails to create WebGL contexts because it doesn't find libEGL.dll (which is in same directory as xul.dll) → The desktop web runtime fails to create WebGL contexts because it can't load the ANGLE libEGL.dll
Update: I struggled again today to get MSVC to load symbols for a nightly debug build, probably because I'm not familiar enough with the tool.  In the end, I installed build tools and kicked off a build, to which I'll return tomorrow morning.
Doing your own build will indeed work much better. In fact, I didn't even know that we had any debug symbols for our Nightly debug builds.
We do not, in fact, have debug symbols for the nightly debug builds. (Whether they're useful at all is another question, then.)
Yeah, in this case I was suggesting to use the symbol server and a regular (non-debug) nightly build.
(In reply to Ted Mielczarek [:ted] from comment #79)
> We do not, in fact, have debug symbols for the nightly debug builds.

Hmm, "Using the Mozilla symbol server" says:

    Even with Mozilla symbol server, debugging release builds is not always
    easy as the debugger will not be able to show you the content of all
    variables and the execution path can seem strange. To avoid this problem,
    try using a debug Firefox nightly from tinderbox instead.

    - https://developer.mozilla.org/en/Using_the_Mozilla_symbol_server

Perhaps I misunderstand, but I interpreted that to mean that the symbol server has symbols for the "debug Firefox nightly" builds in firefox/nightly/.  Although "from tinderbox" is confusing, as is the document's link to firefox/tinderbox-builds/.  But I tried a variety of builds from both locations, and MSVC couldn't find symbols for any of them.


In any case, my debug build on the machine in the QA lab finished yesterday evening, but when I tested it this morning, it didn't exhibit the problem.

Meanwhile, the Nightly build on the machine had prompted me about an update, and I blithely updated it, after which I decided to retest it, and it too worked correctly.

So I took a closer look at it, and it turns out to be from the Elm branch, on the nightly-elm update channel.  That branch merged in bsmedberg's fix on May 17 <https://hg.mozilla.org/projects/elm/pushloghtml?startID=381&endID=382>, so Elm builds from May 18 onward should have it.  But the nightly-elm update channel doesn't seem to get updated every day.  The latest build in it is currently from May 27.

So my theory is that when Jason tested the latest Nightly build on May 23, he was testing a pre-May 18 build from nightly-elm, and that's why it exhibited the problem.

We could confirm that by consulting a log of update channel changes, if there is such a thing.  But I also double-checked by installing the Elm build from May 15, which exhibited the problem, and then updating to the latest available Elm build, which doesn't.  And then I installed the latest build from the regular nightly channel, which also worked correctly.

I know folks have confirmed the problem on other machines, but at this point I think it's reasonable to mark this bug fixed and open new ones for specific instances in which there is still a discrepancy between Firefox's behavior and the runtime's behavior, as the confirmed problem on all machines that was initially reported in this bug has been resolved.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
(In reply to Myk Melez [:myk] [@mykmelez] from comment #81)
> So I took a closer look at it, and it turns out to be from the Elm branch,
> on the nightly-elm update channel.  That branch merged in bsmedberg's fix on
> May 17
> <https://hg.mozilla.org/projects/elm/pushloghtml?startID=381&endID=382>, so
> Elm builds from May 18 onward should have it.  But the nightly-elm update
> channel doesn't seem to get updated every day.  The latest build in it is
> currently from May 27.
> 
> So my theory is that when Jason tested the latest Nightly build on May 23,
> he was testing a pre-May 18 build from nightly-elm, and that's why it
> exhibited the problem.
>

Actually, no, that's not it. I installed an Elm build on that machine only after I had finished testing the WebGL fix, in order to test a different feature (prefetch elimination with snappy). I've tested nightly builds across multiple days, and someone else has confirmed this problem. I also attached a screenshot of this issue occurring with the 5/30 Nightly build on a very close to vanilla installation of Windows XP SP3. I don't think this is a case of my having had an incorrect build.
 
> We could confirm that by consulting a log of update channel changes, if
> there is such a thing.  But I also double-checked by installing the Elm
> build from May 15, which exhibited the problem, and then updating to the
> latest available Elm build, which doesn't.  And then I installed the latest
> build from the regular nightly channel, which also worked correctly.

See my comment above.

> 
> I know folks have confirmed the problem on other machines, but at this point
> I think it's reasonable to mark this bug fixed and open new ones for
> specific instances in which there is still a discrepancy between Firefox's
> behavior and the runtime's behavior, as the confirmed problem on all
> machines that was initially reported in this bug has been resolved.

Sorry, but I still consistently reproduce this problem, as the screenshot I attached shows. Andy is also able to reproduce it, so I know this isn't specific to me. We've tested this multiple times, and I think it's evident that the problem is still here.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Oh, this screenshot shows it's running in VirtualBox? WebGL is blacklisted on virtual machine drivers. You can, however, bypass the blacklist by going to about:config and setting webgl.force-enabled to true.
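
For repeated testing inside a VM, the same preference can also be set without visiting about:config, by adding it to the profile's user.js file. A minimal sketch in Python, assuming a hypothetical profile directory path; the helper name `force_enable_webgl` is made up for illustration, and whether the web runtime's profile picks up user.js the same way Firefox does is an assumption I have not verified:

```python
from pathlib import Path

# The pref named in the comment above, in user.js syntax.
PREF_LINE = 'user_pref("webgl.force-enabled", true);'


def force_enable_webgl(profile_dir):
    """Append the webgl.force-enabled pref to user.js if it is not already there.

    profile_dir is a hypothetical path to a profile directory.
    Returns True if the file was modified, False if the pref was already present.
    """
    user_js = Path(profile_dir) / "user.js"
    existing = user_js.read_text(encoding="utf-8") if user_js.exists() else ""
    if PREF_LINE in existing:
        return False  # already set; leave the file alone
    # Keep any existing prefs and make sure our line starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    user_js.write_text(existing + separator + PREF_LINE + "\n", encoding="utf-8")
    return True
```

Calling it twice on the same directory only writes the line once, so it should be safe to run before every test pass.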
(In reply to Benoit Jacob [:bjacob] from comment #84)
> Oh, this screenshot shows it's running in VirtualBox? WebGL is blacklisted
> on virtual machine drivers. You can however bypass the blacklist by going
> to about:config and setting webgl.force-enabled to true.

Okay. Re-tested on my Win 7 64-bit machine; see the screenshot. This was on Nightly with the 5/31 build.
Just to be sure: are you testing with 64-bit builds or 32-bit builds?
(In reply to Joe Drew (:JOEDREW!) from comment #87)
> Just to be sure: are you testing with 64-bit builds or 32-bit builds?

I installed the 32-bit build: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-15.0a1.en-US.win32.installer.exe
I just tested on a Windows 7 machine in the QA Lab running today's 32-bit nightly build, and I was able to reproduce the behavior.  However, I couldn't confirm that the behavior has the same cause, because I didn't have access to debug messages.

And I couldn't get a debug build to work.  When I tried to run it, the OS complained that msvcr100d.dll was missing.  But (re)installing the Visual C++ 2010 Redistributable Package, which is supposed to solve that problem, didn't.

The issue needs more investigation.  But it isn't useful to continue to iterate in this bug, which is about a failure on all Windows installations, when that original problem has been resolved.  That is especially true because the original problem was a blocker, and it isn't yet clear whether the remaining issue is one.

So I have filed bug 760323 to investigate the remaining issue.  Let's iterate on it there.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Marking as verified given that other people have confirmed this to be fixed, and a followup bug is being tracked for the remaining issues as to why the problem is happening on certain machines.
Status: RESOLVED → VERIFIED
Not to belabor the point, but:

(In reply to Myk Melez [:myk] [@mykmelez] from comment #81)
> (In reply to Ted Mielczarek [:ted] from comment #79)
> > We do not, in fact, have debug symbols for the nightly debug builds.
> 
> Hmm, "Using the Mozilla symbol server" says:
> 
>     Even with Mozilla symbol server, debugging release builds is not always
>     easy as the debugger will not be able to show you the content of all
>     variables and the execution path can seem strange. To avoid this problem,
>     try using a debug Firefox nightly from tinderbox instead.

This documentation is wrong. I'm not sure who put it there. We've never had symbols for debug tinderbox builds.

(In reply to Myk Melez [:myk] [@mykmelez] from comment #89)
> And I couldn't get a debug build to work.  When I tried to run it, the OS
> complained that msvcr100d.dll was missing.  But (re)installing the Visual
> C++ 2010 Redistributable Package, which is supposed to solve that problem,
> didn't.

The debug CRT DLLs are not included in the redistributable package. In fact, they're explicitly *not* redistributable. You need to install the matching compiler (or possibly an SDK, I forget if they include debug CRTs).
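
To tell quickly whether a "missing DLL" complaint is about a debug CRT, the naming convention can be checked mechanically: the debug CRT DLLs carry a trailing "d" (msvcr100d.dll), while the redistributable ones do not (msvcr100.dll). A rough sketch in Python; the helper name and the regular expression are hypothetical illustrations and only cover the msvcr/msvcp naming of this generation of Visual C++:

```python
import re

# Matches e.g. msvcr100.dll, msvcp100.dll and their debug variants
# msvcr100d.dll, msvcp100d.dll (case-insensitive, as file names are on Windows).
_CRT_RE = re.compile(r"^msvc[rp](\d+)(d?)\.dll$", re.IGNORECASE)


def classify_crt(dll_name):
    """Return 'debug', 'release', or None if the name is not an msvcr/msvcp CRT DLL."""
    match = _CRT_RE.match(dll_name)
    if match is None:
        return None
    # A non-empty second group means the trailing "d" of a debug CRT was present.
    return "debug" if match.group(2) else "release"
```

If this reports 'debug' for the DLL named in the error dialog, reinstalling the redistributable package will not help, for the reason given above.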
Flags: in-moztrap?(jsmith)
QA Contact: desktop-runtime → jsmith
I seem to be having this exact same issue in XULRunner 13, 14 and 15 on Windows. It only seems to be fixed in the latest trunk. It also worked correctly in XULRunner 12.

We're using WebGL in a XULRunner context (more accurately, through an iframe element). It too throws the "WebGL: Error during ANGLE OpenGL ES initialization" error. Copying the mentioned files to the stub executable's directory fixes this problem, but significantly slows down application startup.

Please ask if you need any more information on this problem, or if I need to file a separate bug for this issue.
Tom, you're likely running into bug 760323 which is only fixed in version >= 16 (will be in Aurora channel today/tomorrow).
Flags: in-moztrap?(jsmith)
@Benoit: Thank you. I had noticed this and am already using a workaround. I will keep an eye out for FF16.
Product: Firefox → Firefox Graveyard