Closed
Bug 1290831
Opened 7 years ago
Closed 6 years ago
Hit MOZ_CRASH(Unexpected error with MOZ_GL_DEBUG_ABORT_ON_ERROR. (Run with MOZ_GL_DEBUG_ABORT_ON_ERROR=0 to disable)) at GLContext.h:759
Categories
(Core :: Graphics: CanvasWebGL, defect)
People
(Reporter: cbook, Assigned: cleu)
Details
(Keywords: assertion, Whiteboard: [gfx-noted])
Attachments
(4 files, 1 obsolete file)
6.48 KB, text/plain
3.09 KB, text/plain
6.46 KB, text/plain
58 bytes, text/x-review-board-request (cleu: review+, gchang: approval-mozilla-aurora+, gchang: approval-mozilla-beta+)
Found via bughunter; it seems to affect beta through nightly.
Reproduced with the latest m-c debug tinderbox build on Windows.
Steps to reproduce:
-> http://www.anj.fyi/behold/
-> Load
Hit MOZ_CRASH(Unexpected error with MOZ_GL_DEBUG_ABORT_ON_ERROR. (Run with MOZ_GL_DEBUG_ABORT_ON_ERROR=0 to disable)) at c:\builds\moz2_slave\m-cen-w32-d-000000000000000000\build\src\gfx\gl\GLContext.h:759
Reporter
Comment 1•7 years ago
[Tracking Requested - why for this release]: bughunter
tracking-firefox49: --- → ?
tracking-firefox50: --- → ?
Updated•7 years ago
Whiteboard: [gfx-noted]
Comment 3•7 years ago
I can't reproduce this on Windows 10 with the build: http://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-central-win64-debug/1470060589/
Reporter
Comment 4•7 years ago
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #3)
> I can't reproduce with the build:
> http://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-central-win64-debug/1470060589/
> at win10.

It seems you may need to reload the page several times via Shift+reload. That works for me to trigger the crash on Windows 7.
Flags: needinfo?(cbook)
Reporter
Comment 5•7 years ago
about:support (it's a Win7 VM on Fusion 8 on a Mac 10.11 MBP).

Graphics
--------
Features
Compositing: Basic
Asynchronous Pan/Zoom: wheel input enabled; touch input enabled
WebGL Renderer: Google Inc. -- ANGLE (Software Adapter Direct3D11 vs_5_0 ps_5_0)
WebGL2 Renderer: WebGL creation failed:
* Refused to create native OpenGL context because of blacklist entry: FEATURE_FAILURE_UNKNOWN_DEVICE_VENDOR
* Exhausted GL driver options.
Hardware H264 Decoding: No; Hardware video decoding disabled or blacklisted
Audio Backend: wasapi
Direct2D: Blocked for your graphics card because of unresolved driver issues.
DirectWrite: false (6.2.9200.17568)
GPU #1
Active: Yes
Description: VMware SVGA 3D
Vendor ID: 0x15ad
Device ID: 0x0405
Driver Version: 8.15.1.33
Driver Date: 10-16-2015
Drivers: vm3dum vm3dum_10
Subsys ID: 040515ad
RAM: 384
Diagnostics
AzureCanvasAccelerated: 0
AzureCanvasBackend: skia
AzureContentBackend: cairo
AzureFallbackCanvasBackend: cairo
Decision Log
D3D11_COMPOSITING: Blocklisted; failure code BLOCKLIST_FEATURE_FAILURE_UNKNOWN_DEVICE_VENDOR
D3D9_COMPOSITING: Blocklisted; failure code BLOCKLIST_FEATURE_FAILURE_UNKNOWN_DEVICE_VENDOR
DIRECT2D: unavailable by default: Direct2D requires Direct3D 11 compositing
D3D11_HW_ANGLE: unavailable by default: D3D11 compositing is disabled
disabled by env: D3D11 compositing is disabled
---------------------
Jerry or Jeff, any luck reproducing this?
status-firefox49: --- → affected
status-firefox50: --- → affected
status-firefox51: --- → ?
tracking-firefox51: --- → ?
Flags: needinfo?(jgilbert)
Flags: needinfo?(hshih)
David or Milan, can you help? Not sure how serious this is. Hitting MOZ_CRASH is usually not good. If I don't need to worry about it for 49, let's defer it to 51.
Flags: needinfo?(milan)
Flags: needinfo?(dbolter)
Comment 9•7 years ago
I still can't reproduce this on Windows 7. I will try a Windows 7 VM on Fusion 8 later.
Flags: needinfo?(hshih)
Comment 10•7 years ago
I ran this URL in bughunter on Windows 7 again and found only one aurora debug crash @ gl::GLContext::AfterGLCall(char const *) [GLContext.h:264adddee81b : 758 + 0x22]
It may be intermittent. If needed, I can try to reproduce it more aggressively.
First comment - we weren't allowing this configuration (software ANGLE for WebGL, basic compositor) until bug 1271770, which landed on nightly in 50, then got uplifted to 49 on July 27th. I would imagine that to be the cause - exposing an existing bug because of the additional number of people that can now run WebGL, rather than a new bug. Based on the debug stack - perhaps VMWare doesn't do mipmaps properly.
Flags: needinfo?(milan) → needinfo?(jgilbert)
Updated•7 years ago
Flags: needinfo?(dbolter)
Milan, that sounds like a potentially good thing (more people able to run WebGL). Deferring this to 50 at this point as we are heading into 49 beta 8 now.
Updated•6 years ago
Flags: needinfo?(jgilbert) → needinfo?(howareyou322)
Comment 13•6 years ago
Michael, please help reproduce this crash in a Windows 7 VM. I think it might take a different code path after bug 1271770 or bug 1297965.
Flags: needinfo?(howareyou322) → needinfo?(cleu)
Assignee
Comment 14•6 years ago
OK, I will prepare a Windows 7 VM running under VMWare Fusion.
Assignee
Comment 15•6 years ago
Hi Tomcat,

I have set up a Windows 7 VM with the same VMWare SVGA driver version as in the about:support you posted.

But I cannot reproduce the crash. Can you pull the latest mozilla-central code and test it again?

In addition, what is the build number of your VMWare Fusion?

I think this issue may be related to the GFX emulation inside VMWare Fusion.
Flags: needinfo?(cleu) → needinfo?(cbook)
Reporter
Comment 16•6 years ago
It crashed after a long time with a mozilla-central tinderbox build (debug, Windows 7) based on https://hg.mozilla.org/mozilla-central/rev/9baec74b3db1bf005c66ae2f50bafbdb02c3be38
Flags: needinfo?(cbook)
Reporter
Comment 17•6 years ago
(In reply to Michael Leu[:lenzak800](UTC+8)[PTO 10/6 ~ 10/13] from comment #15)
> Hi Tomcat,
>
> I have set up a windows 7 VM with same VMWare SVGA driver version as the
> about:support you posted.
>
> But I cannot reproduce the crash, can you pull the latest mozilla-central
> code and test it again?
>
> In addition, what is the build number of your VMWare fusion?
>
> I think maybe this issue is related to the GFX emulation inside VMWare
> Fusion.

Hi Michael, it's Version 8.5.0 (4352717) Fusion Mac. I also attached an m-c stack from today.
Assignee
Comment 18•6 years ago
OK, I will open it and wait for several minutes.
BTW, here is my about:support.

Graphics
Features
Compositing: Basic
Asynchronous Pan/Zoom: wheel input enabled; touch input enabled
WebGL Renderer: Google Inc. -- ANGLE (Software Adapter Direct3D11 vs_4_1 ps_4_1)
WebGL2 Renderer: Google Inc. -- ANGLE (Software Adapter Direct3D11 vs_4_1 ps_4_1)
Hardware H264 Decoding: No; Hardware video decoding disabled or blacklisted
Audio Backend: unknown
Direct2D: Blocked for your graphics card because of unresolved driver issues.
DirectWrite: false (6.1.7601.17514)
GPU #1
Active: Yes
Description: VMware SVGA 3D
Vendor ID: 0x15ad
Device ID: 0x0405
Driver Version: 8.15.1.33
Driver Date: 10-16-2015
Drivers: vm3dum vm3dum_10
Subsys ID: 040515ad
RAM: 8
Diagnostics
AzureCanvasAccelerated: 0
AzureCanvasBackend: skia
AzureContentBackend: cairo
AzureFallbackCanvasBackend: cairo
failures: [GFX1-]: Refresh driver waiting for the compositor for1.04809 seconds.
Decision Log
D3D11_COMPOSITING: Blocklisted; failure code BLOCKLIST_FEATURE_FAILURE_UNKNOWN_DEVICE_VENDOR
D3D9_COMPOSITING: Blocklisted; failure code BLOCKLIST_FEATURE_FAILURE_UNKNOWN_DEVICE_VENDOR
DIRECT2D: unavailable by default: Direct2D requires Direct3D 11 compositing
D3D11_HW_ANGLE: unavailable by default: D3D11 compositing is disabled
disabled by env: D3D11 compositing is disabled
Assignee
Comment 19•6 years ago
OK, I can reproduce it after moving the VM to the same VMWare Fusion build.
It seems that only 32-bit Windows 7 has this issue.
Comment 20•6 years ago
Track 51+ as it can be reproduced.
Updated•6 years ago
Assignee: nobody → cleu
Assignee
Comment 21•6 years ago
I found that this crash always happens after a buffer allocation failure.

This website has a texture with dimensions 8192*8192, which requires 256MB of memory.

This allocation, which uses calloc, sometimes fails and returns nullptr here:
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp?q=Unable+to+allocate+buffer+during+conversion&redirect_type=single#253

But both the VM's system memory and its GFX memory are large enough for the allocation, so I am still trying to figure out why it fails and why it only happens intermittently.
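As a quick check of the 256MB figure above, here is a minimal C++ sketch of the arithmetic. It assumes 4 bytes per pixel (for example RGBA with 8 bits per channel); the comment only states the dimensions and the total, so that pixel size is an assumption, and `TextureByteSize` and `ToMiB` are hypothetical helper names, not Gecko functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes needed for a tightly packed 2D texture.
// bytesPerPixel = 4 is an assumption (e.g. RGBA, 8 bits per channel).
inline std::uint64_t TextureByteSize(std::uint64_t width,
                                     std::uint64_t height,
                                     std::uint64_t bytesPerPixel)
{
    assert(bytesPerPixel > 0);
    return width * height * bytesPerPixel;
}

// Hypothetical helper: convert a byte count to whole MiB (rounds down).
inline std::uint64_t ToMiB(std::uint64_t bytes)
{
    return bytes / (1024ull * 1024ull);
}
```

With these assumptions, `ToMiB(TextureByteSize(8192, 8192, 4))` evaluates to 256, which matches the size quoted in the comment.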
Assignee
Comment 22•6 years ago
OK, I think I found the reason.

calloc returns nullptr when no contiguous block of the required size is available.

Since 256MB is a very large request, it is prone to failure.

This also explains why it only happens on 32-bit Windows: 64-bit has a much larger address space.

So it may be a problem with our error handling. We should have bailed out of the texture operations before hitting MOZ_CRASH.
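The point about calloc can be illustrated with a small C++ sketch of a fallible, zero-filled allocation whose caller must check for null. `TryAllocZeroed`, `ZeroedBuffer` and `FreeDeleter` are hypothetical names for illustration only; this is not the code in TexUnpackBlob.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Deleter so that calloc'ed memory is released with free().
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};
using ZeroedBuffer = std::unique_ptr<std::uint8_t, FreeDeleter>;

// Hypothetical helper: returns a zero-filled buffer of count * elemSize
// bytes, or an empty pointer if the allocator cannot satisfy the request
// (for example because no contiguous range of that size is free).
inline ZeroedBuffer TryAllocZeroed(std::size_t count, std::size_t elemSize)
{
    assert(elemSize > 0);
    return ZeroedBuffer(
        static_cast<std::uint8_t*>(std::calloc(count, elemSize)));
}
```

A caller would test the result (`if (!buf) { /* report out-of-memory and stop */ }`) before using it. For scale, 256 MiB is one sixteenth of a full 4 GiB 32-bit address space, so a single contiguous range of that size is much harder to find there than in a 64-bit process.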
Comment 23•6 years ago
(In reply to Michael Leu[:lenzak800](UTC+8)[PTO 10/6 ~ 10/13] from comment #22)
> OK, I think I found the reason.
>
> calloc returns nullptr when no consecutive required memory space available.
>
> Since 256MB is a very big value, it will be fail-prone.
>
> It also explains why only 32bit windows happens because 64bit has a much
> larger address space.
>
> So it may be a problem of our error handling, we should have bailed out from
> the texture operations before it hit MOZ_CRASH.

Good catch, Michael.
Assignee
Comment 24•6 years ago
So here is how the crash happens.

1. The website calls TexImage2D, requesting a large buffer.
2. In certain environments, we use CPU-side conversion in ConvertIfNeeded.
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp?q=ConvertIfNeeded&redirect_type=direct#171
3. We cannot allocate such a large buffer on the CPU side, so calloc returns null and an OOM exception is thrown.
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp?q=ConvertIfNeeded&redirect_type=direct#253
4. The website still calls GenerateMipmap with this faulty texture, so we get an INVALID_OPERATION exception, and then we crash because it is a debug build with MOZ_GL_DEBUG_ABORT_ON_ERROR set.

This whole flow seems normal: we have told the website that we're out of memory, but it still uses the texture, so we raise more errors, and the crash happens because of the debug configuration.

Jeff, do you have any suggestions about this one?
Maybe it is just a normal reaction instead of a bug.
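The four steps above can be sketched as a toy C++ model. Everything here is hypothetical and heavily simplified (`ToyFlow`, `ToyGLError` and `WouldAbort` are not Gecko names); it only mirrors the sequence described in the comment, including the assumption that the driver answers INVALID_OPERATION when asked to generate mipmaps for a texture it never received.

```cpp
#include <cassert>

// Toy model of the four steps; all names are hypothetical, not Gecko's.
enum class ToyGLError { NoError, InvalidOperation };

struct ToyFlow {
    bool driverHasImage = false;  // did the driver ever receive level 0?
    bool pageSawOOM = false;      // was out-of-memory reported to the page?

    // Steps 1-3: the upload needs a CPU-side conversion buffer first.
    void TexImage2D(bool conversionAllocSucceeds)
    {
        if (!conversionAllocSucceeds) {
            pageSawOOM = true;    // reported; the driver is never called
            return;
        }
        driverHasImage = true;
    }

    // Step 4: the call is forwarded to the driver, which rejects it if it
    // never received the image (modeled on the error named in the comment).
    ToyGLError GenerateMipmap() const
    {
        return driverHasImage ? ToyGLError::NoError
                              : ToyGLError::InvalidOperation;
    }
};

// Models the debug-only check: any unexpected driver error is fatal when
// the abort-on-error switch is enabled.
inline bool WouldAbort(ToyGLError driverError, bool abortOnError)
{
    return abortOnError && driverError != ToyGLError::NoError;
}
```

In this model, a failed upload followed by `GenerateMipmap()` makes `WouldAbort(..., true)` return true, while the same sequence with the switch disabled only surfaces an error, which is the distinction the comment draws between debug and non-debug configurations.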
Flags: needinfo?(jgilbert)
Comment 25•6 years ago
Great run-down.

#4 should not happen. We should be validating that it's valid to send glGenerateMipmap to the driver before we call it.

The fix is to repair our GenerateMipmap validation.
Flags: needinfo?(jgilbert)
Assignee
Comment 26•6 years ago
Assignee
Comment 27•6 years ago
Comment on attachment 8804598 [details] [diff] [review]
Prevent BaseImageInfo being initialized when TexOrSubImage fails.

Review of attachment 8804598 [details] [diff] [review]:
-----------------------------------------------------------------

I found that we do have some checks when we call GenerateMipmap. The problem is that if blob->TexOrSubImage fails and returns false before it reaches the real GL part (DoTexSubImage), it is not treated as a failure, because glError is 0 since we never reached a real GL call.

https://dxr.mozilla.org/mozilla-central/source/dom/canvas/WebGLTextureUpload.cpp#1423

So we continue executing and initialize the ImageInfo inside WebGLTexture, which makes later calls think it is a valid texture.

This patch adds checks for blob->TexOrSubImage and bails out if it returns false, which prevents the ImageInfo from being falsely initialized.

Later texture calls will then bail out through their own checks.
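The difference between inspecting only glError and also checking the boolean result can be sketched as follows. This is a simplified C++ model with hypothetical stand-in types (`ToyBlob`, `ToyTexture`); it is not the actual patch, and the real TexOrSubImage takes many more parameters.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the real WebGLTexture or TexUnpackBlob.
using ToyGLenum = unsigned int;

struct ToyBlob {
    bool conversionFails = false;  // models the CPU-side allocation failing
    ToyGLenum driverError = 0;     // what the driver reports if reached

    // Returns false if it bailed out before any GL call. In that case
    // *outGLError stays 0, because the driver was never asked anything.
    bool TexOrSubImage(ToyGLenum* outGLError) const
    {
        *outGLError = 0;
        if (conversionFails)
            return false;
        *outGLError = driverError;
        return true;
    }
};

struct ToyTexture {
    bool imageInfoDefined = false;  // client-side bookkeeping

    // Before: only glError is inspected, so an early bail-out looks fine.
    void UploadUnchecked(const ToyBlob& blob)
    {
        ToyGLenum glError;
        blob.TexOrSubImage(&glError);
        if (glError != 0)
            return;
        imageInfoDefined = true;
    }

    // After: the boolean result is checked before anything else.
    void UploadChecked(const ToyBlob& blob)
    {
        ToyGLenum glError;
        if (!blob.TexOrSubImage(&glError))
            return;
        if (glError != 0)
            return;
        imageInfoDefined = true;
    }

    // Later calls validate against the bookkeeping before touching GL.
    bool CanGenerateMipmap() const { return imageInfoDefined; }
};
```

With a blob whose conversion fails, `UploadUnchecked` leaves the texture looking valid, so a later mipmap request would pass validation, whereas `UploadChecked` leaves it undefined and the later request is rejected by its own check.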
Attachment #8804598 - Flags: review?(jgilbert)
Comment on attachment 8804598 [details] [diff] [review]
Prevent BaseImageInfo being initialized when TexOrSubImage fails.

Review of attachment 8804598 [details] [diff] [review]:
-----------------------------------------------------------------

Feels like we should do similar error reporting as we do for out of memory scenarios, rather than just return?
Assignee
Comment 29•6 years ago
(In reply to Milan Sreckovic [:milan] from comment #28)
> Comment on attachment 8804598 [details] [diff] [review]
> Prevent BaseImageInfo being initialized when TexOrSubImage fails.
>
> Review of attachment 8804598 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Feels like we should do similar error reporting as we do for out of memory
> scenarios, rather than just return?

It has already thrown an exception here, so the website should know it ran out of memory.
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp?q=ConvertIfNeeded&redirect_type=direct#253

The problem is that it continues to initialize ImageInfo after the exception is thrown, so later WebGL calls don't know the texture is faulty.

Since it has already thrown the OOM exception, I think we can just return.
Too late for 50, we should plan to fix this in 51.
Comment 32•6 years ago
Comment on attachment 8804598 [details] [diff] [review]
Prevent BaseImageInfo being initialized when TexOrSubImage fails.

Review of attachment 8804598 [details] [diff] [review]:
-----------------------------------------------------------------

This is the right idea, but I didn't really make the guarantee tight when I wrote this before. I uploaded a patch that expands on yours.

::: dom/canvas/WebGLTextureUpload.cpp
@@ +1336,5 @@
>
> GLenum glError;
> + if (!blob->TexOrSubImage(isSubImage, needsRespec, funcName,
> + this, target, level, driverUnpackInfo,
> + xOffset, yOffset, zOffset, &glError)) {

{ goes on its own line for multi-line conditionals.
Attachment #8804598 - Flags: review?(jgilbert) → review-
Updated•6 years ago
Attachment #8804598 - Attachment is obsolete: true
Updated•6 years ago
Attachment #8811103 - Flags: review?(cleu)
Assignee
Comment 33•6 years ago
mozreview-review
Comment on attachment 8811103 [details]
Bug 1290831 - Clarify TexUnpackBlob::TexOrSubImage's fallibility and update callers. -

https://reviewboard.mozilla.org/r/93320/#review93480

Thanks for your suggestion. :)
Attachment #8811103 - Flags: review?(cleu) → review+
Comment 34•6 years ago
Pushed by jgilbert@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/971de933cd5d
Clarify TexUnpackBlob::TexOrSubImage's fallibility and update callers. - r=cleu
Reporter
Comment 35•6 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/971de933cd5d
Status: NEW → RESOLVED
Closed: 6 years ago
status-firefox53: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Comment 36•6 years ago
Hi Michael, could you please nominate this for uplift to Beta51 and Aurora52 if the patch is not too risky?
Flags: needinfo?(cleu)
Assignee
Comment 37•6 years ago
Comment on attachment 8811103 [details]
Bug 1290831 - Clarify TexUnpackBlob::TexOrSubImage's fallibility and update callers. -

Approval Request Comment
[Feature/regressing bug #]: Bug 1290831
[User impact if declined]: Users will encounter unstable WebGL when we perform CPU-side texture processing.
[Describe test coverage new/current, TreeHerder]: Tested by hand; no crash seen in a debug build.
[Risks and why]: Low; it only slightly modifies the operation validation process.
[String/UUID change made/needed]: N/A
Flags: needinfo?(cleu)
Attachment #8811103 - Flags: approval-mozilla-aurora?
Comment 38•6 years ago
Hi Michael,
Since WebGL 2 will ship in 51, isn't this worth uplifting to Beta51?
Flags: needinfo?(cleu)
Comment 40•6 years ago
Comment on attachment 8811103 [details]
Bug 1290831 - Clarify TexUnpackBlob::TexOrSubImage's fallibility and update callers. -

Approval Request Comment
[Feature/regressing bug #]: webgl2
[User impact if declined]:
[Describe test coverage new/current, TreeHerder]:
[Risks and why]:
[String/UUID change made/needed]:
Attachment #8811103 - Flags: approval-mozilla-beta?
Comment 41•6 years ago
Comment on attachment 8811103 [details]
Bug 1290831 - Clarify TexUnpackBlob::TexOrSubImage's fallibility and update callers. -

Fix an issue related to WebGL 2. Beta51+ and Aurora52+. Should be in 51 beta 2.
Attachment #8811103 - Flags: approval-mozilla-beta?
Attachment #8811103 - Flags: approval-mozilla-beta+
Attachment #8811103 - Flags: approval-mozilla-aurora?
Attachment #8811103 - Flags: approval-mozilla-aurora+
Comment 42•6 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/867d03558dc2
Comment 43•6 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-beta/rev/2bf05af8c02d
Updated•6 years ago