Closed Bug 660070 Opened 13 years ago Closed 13 years ago

Use EGL_CONTEXT_LOST to detect driver resets in WebGL on Windows with ANGLE

Categories

(Core :: Graphics: CanvasWebGL, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla11

People

(Reporter: bjacob, Assigned: drs)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 4 obsolete files)

This is the low-hanging-fruit part of bug 656824. Windows is easiest to handle because:
 1) ANGLE has EGL_CONTEXT_LOST to tell us when a driver reset occurs
 2) On Windows, besides WebGL, we don't use GL/GLES
 3) For D3D contexts, this is already a solved problem (thanks Bas)

So it's just a matter of detecting EGL_CONTEXT_LOST in WebGL and then properly lose the WebGL context.
Assignee: nobody → dsherk
This patch is pretty sub-optimal, but seems to be the only sane way to do it that I could think of. Before I explain, some notes:

* You can't test this in debug mode, or the whole browser just crashes whenever you use my test case (whereas it keeps going with WebGL disabled in release mode). There might be some way around this, but I wrote/tested this patch in release mode and just sort of guessed what was going on behind the scenes.
* This patch only applies to Windows, and of course only EGL.

So what this actually implements is informing a script through "webglcontextlost" that the context has been lost when the implementation receives an EGL_CONTEXT_LOST error. It uses some of the ARB_robustness code and is thus dependent on the ARB_robustness patch. 

Now, to actually generate this EGL_CONTEXT_LOST error, you have to try to switch the context and it has to be gone. The problem is, our EGL code only switches the context if the current one is different than the one we're trying to set (i.e. if they're the same we just stop there and pretend it succeeded). So if the user isn't switching pages or anything like that, they will never generate this error even if the context is indeed lost.

There were 2 workarounds I thought of:
1) Always force a context switch, even if we're already on that context. I tossed this out because it's terrible and slow.
2) I looked at the angle code and saw that the MakeCurrent() code there checks if the device is null or has been lost, so I figured I could test that myself in our code to predict what result MakeCurrent() will give. If we see the same conditions that would make MakeCurrent() error, then I run MakeCurrent() anyways (even if I normally wouldn't because the context is already current) and check for an error. 

I chose the second method, but the problem is that thebes has no concept of an actual EGLDisplay inside it, and instead it's typedef'd to a void pointer. So my (crappy) solution was to just copy the header for EGLDisplay over to thebes, and replace any tokens that we don't have a definition for with a void pointer within its class def. This works, but is incredibly hacky.
Attachment #565426 - Flags: review?(bjacob)
No longer blocks: 656824
Depends on: 656824
Cleared up bitrot.
Attachment #565426 - Attachment is obsolete: true
Attachment #565426 - Flags: review?(bjacob)
Attachment #570182 - Flags: review?(bjacob)
Catches the EGL_CONTEXT_LOST error which occurs after driver resets, and sends a
WebGL context the canvas event webglcontextlost when this occurs. Forces a
context switch during DrawArrays and DrawElements, which could potentially be a
performance regression.

Tested on Windows, may work on Android although untested. Will push to try once it re-opens for checkins.
Attachment #570182 - Attachment is obsolete: true
Attachment #570182 - Flags: review?(bjacob)
Attachment #573724 - Flags: review?(bjacob)
Rewrote for the second time, this time it abuses the robustness timer to do random forced MakeCurrent() if the context provider is EGL. I simulate a CONTEXT_GUILTY_CONTEXT_RESET_ARB, reason being that we don't want to allow the user to respawn the context since we have no information on whether it was guilty or not. Thus, we have to assume the worst case, which is that any context which receives this error is guilty of causing it.
Attachment #573724 - Attachment is obsolete: true
Attachment #573724 - Flags: review?(bjacob)
Attachment #577829 - Flags: review?(bjacob)
Small change to MakeCurrent() behavior.
Attachment #577829 - Attachment is obsolete: true
Attachment #577829 - Flags: review?(bjacob)
Attachment #578081 - Flags: review?(bjacob)
Comment on attachment 578081 [details] [diff] [review]
Patch v3.1, handling EGL_CONTEXT_LOST.

Review of attachment 578081 [details] [diff] [review]:
-----------------------------------------------------------------

Neat! Like this much better than previous versions.
Attachment #578081 - Flags: review?(bjacob) → review+
https://hg.mozilla.org/mozilla-central/rev/66f61be58aa3
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla11
Blocks: 707860
There doesn't seem to be a reason to include nsContentUtils.h here; compiles fine without.
Attachment #583149 - Flags: review?(bugzilla)
Was this patch meant for this ticket and is that the correct patch?
Comment on attachment 583149 [details] [diff] [review]
Remove nsContentUtils.h include

Review of attachment 583149 [details] [diff] [review]:
-----------------------------------------------------------------

Clearing from review queue, doesn't seem relevant to this bug anyways.
Attachment #583149 - Flags: review?(bugzilla)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: