Closed Bug 1376399 Opened 7 years ago Closed 7 years ago

Hard crash in FF 54 and Nightly 56 on Windows 7

Categories

(Core :: Graphics: CanvasWebGL, defect)

54 Branch
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla56
Tracking Status
firefox-esr52 56+ fixed
firefox55 --- wontfix
firefox56 --- verified

People

(Reporter: floooh, Assigned: svargas)

References

Details

(Keywords: regression)

Crash Data

Attachments

(1 file, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:56.0) Gecko/20100101 Firefox/56.0
Build ID: 20170626030209

Steps to reproduce:

1) on a Windows7 machine open this asm.js/WebGL2 demo: http://floooh.github.io/oryol-samples/
2) if it doesn't crash, try again (after at most 5..10x the browser should crash hard)
3) see below for crash report ids (the problem seems to be related to copying dynamic vertex data)


Actual results:

Here are some crash report id's (they all point to rx::CopyNativeVertexData<T>)

bp-a9f6c7df-71c9-40d0-8691-a75840170627

bp-35e35929-6855-452f-ab41-036361170627

bp-3e6bce40-0cc6-48a6-8815-7331b1170627

bp-def8b0d9-1f87-4440-b492-698151170627
The crash is also reproducible with WebGL2 disabled in about:config:

bp-e6d49e48-de08-4835-925f-b70970170627

bp-58502de6-490f-4223-8efd-87cc90170627

bp-c7ce115a-39ca-415f-93b8-cac290170627
Component: Untriaged → Canvas: WebGL
Product: Firefox → Core
Erm, I just noticed I posted a link to the toplevel samples page, sorry.

The crashing demo is this: 

http://floooh.github.io/oryol-samples/asmjs/Dragons.html

Webassembly version is here (same problem)

http://floooh.github.io/oryol-samples/wasm/Dragons.html
Depends on: 1341131
Crash Signature: [@ rx::CopyNativeVertexData<T>]
Here is the call stack when running http://floooh.github.io/oryol-samples/wasm/Dragons.html on Win 10 Nightly debug version.

It hits the assertion of 
ASSERT(!bufferD3D ||
        ElementsInBuffer(attrib, static_cast<unsigned int>(bufferD3D->getSize())) >=
        static_cast<int>(totalCount));

I get 1024 from ElementsInBuffer() and the totalCount is 2603.


>	libGLESv2.dll!rx::VertexDataManager::reserveSpaceForAttrib(const rx::TranslatedAttribute & translatedAttrib, int count, int instances) Line 440	C++
 	libGLESv2.dll!rx::VertexDataManager::storeDynamicAttribs(std::vector<rx::TranslatedAttribute,std::allocator<rx::TranslatedAttribute> > * translatedAttribs, const std::bitset<16> & dynamicAttribsMask, int start, int count, int instances) Line 393	C++
 	libGLESv2.dll!rx::VertexArray11::updateDirtyAndDynamicAttribs(rx::VertexDataManager * vertexDataManager, const gl::State & state, int start, int count, int instances) Line 215	C++
 	libGLESv2.dll!rx::Renderer11::applyVertexBuffer(const gl::State & state, unsigned int mode, int first, int count, int instances, rx::TranslatedIndexData * indexInfo) Line 1698	C++
 	libGLESv2.dll!rx::Renderer11::genericDrawElements(rx::Context11 * context, unsigned int mode, int count, unsigned int type, const void * indices, int instances, const gl::IndexRange & indexRange) Line 4522	C++
 	libGLESv2.dll!rx::Context11::drawElements(unsigned int mode, int count, unsigned int type, const void * indices, const gl::IndexRange & indexRange) Line 171	C++
 	libGLESv2.dll!gl::Context::drawElements(unsigned int mode, int count, unsigned int type, const void * indices, const gl::IndexRange & indexRange) Line 1654	C++
 	libGLESv2.dll!gl::DrawElements(unsigned int mode, int count, unsigned int type, const void * indices) Line 791	C++
 	libGLESv2.dll!glDrawElements(unsigned int mode, int count, unsigned int type, const void * indices) Line 227	C++
 	xul.dll!mozilla::gl::GLContext::raw_fDrawElements(unsigned int mode, int count, unsigned int type, const void * indices) Line 1087	C++
 	xul.dll!mozilla::gl::GLContext::fDrawElements(unsigned int mode, int count, unsigned int type, const void * indices) Line 1100	C++
 	xul.dll!mozilla::WebGLContext::DrawElements(unsigned int mode, int vertCount, unsigned int type, __int64 byteOffset, const char * funcName) Line 811	C++
 	xul.dll!mozilla::dom::WebGL2RenderingContextBinding::drawElements(JSContext * cx, JS::Handle<JSObject *> obj, mozilla::WebGL2Context * self, const JSJitMethodCallArgs & args) Line 10371	C++
Status: UNCONFIRMED → NEW
Ever confirmed: true
ANGLE seems to have changed a lot on VertexDataManager. We need to check if the update of ANGLE [Bug 1371190] would solve the issue.
> I get 1024 from ElementsInBuffer() and the totalCount is 2603.

This should be the vertex buffer update with per-instance data which happens per frame. This has 1024 vertices with 16 floats each (12 floats for a 4x3 transposed model matrix, and 4 floats with location info into an RGBA32F texture with character skin matrices).

Before rendering, the instance vertex buffer is updated with a call to glBufferSubdata(), only with the required size for rendering the current number of dragons.

When updating the data, I'm alternating between two buffers (I found that this is the best way to update dynamic data on WebGL).

The totalCount=2603 is strange, the instance update code should never update more than 1024 elements.

A similar update happens for the RGBA32F texture.

I'm using the same instancing approach with dynamic instance data in other demos which work fine though. What's new is the texture update (but why would this affect the vertex update). Here are the other instancing demos:

http://floooh.github.io/oryol/asmjs/Instancing.html

http://floooh.github.io/oryol-samples/asmjs/BulletPhysicsBasic.html

http://floooh.github.io/oryol-samples/asmjs/BulletPhysicsCloth.html

Cheers and thanks for looking into this!
(In reply to Daosheng Mu[:daoshengmu] from comment #4)
> ANGLE seems to have changed a lot on VertexDataManager. We need to check if
> the update of ANGLE [Bug 1371190] would solve the issue.

After using the installation with ANGLE3118 update from https://treeherder.mozilla.org/#/jobs?repo=try&revision=fbd66914dfc861882d2f55d6337ad916dc5556c0&selectedJob=109469196, the demo can work sometimes but still has crash intermittently.(In reply to Daosheng Mu[:daoshengmu] from comment #4)
> ANGLE seems to have changed a lot on VertexDataManager. We need to check if
> the update of ANGLE [Bug 1371190] would solve the issue.

After using the installation with ANGLE3118 update from https://treeherder.mozilla.org/#/jobs?repo=try&revision=fbd66914dfc861882d2f55d6337ad916dc5556c0&selectedJob=109469196, the demo can work sometimes but still has crash intermittently.
Assignee: nobody → svargas
WebGLContext::DrawElements(GLenum mode, GLsizei vertCount, GLenum type,
                           WebGLintptr byteOffset, const char* funcName)

It looks like it happens when DrawElements() checks the index buffer range for the vertex buffer. As below, we can see the index range is 2734 ~ 5336, so the range count is 2603. But, the element count in vertex buffer is only 1024.

vertCount = 9456
index_start = 2734
index_end = 5336

indexRange 2603 = index_end - index_start + 1;
ANGLE is passing primcount=0 for non-instanced draw calls, where it should be passing primcount=1. (primcount=0 would be a no-op)

It looks like upstream is fixed.

One such 0=>1 change upstream is:
https://github.com/google/angle/commit/fd456445c8830576240a32b71d73c2efcb443a63
This bug is caused by the value '0' being passed into the draw / validate elements functions instead of '1' in ANGLE.

Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8c78b457cee229ec3eb57dd6534cd0085d7549fe
Attachment #8885075 - Flags: review?(jgilbert)
Comment on attachment 8885075 [details] [diff] [review]
0001-Bug-1376399-Hard-crash-in-FF-54-and-Nightly-56-on-Wi.patch

Review of attachment 8885075 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/angle/src/libGLESv2/entry_points_gles_2_0.cpp
@@ -782,4 @@
>      if (context)
>      {
>          IndexRange indexRange;
> -        if (!ValidateDrawElements(context, mode, count, type, indices, 0, &indexRange))

Do the same for ValidateDrawArrays.
Attachment #8885075 - Flags: review?(jgilbert) → review-
Attachment #8885078 - Flags: review?(jgilbert) → review+
I'm happy you found the reason for the crash!

Is there a chance that the current public Firefox will get an update for this or will the fix need to go through the usual release-steps?
We'll see what releng thinks. I think we'll be able to uplift to beta, though.
Keywords: checkin-needed
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/451af4e965fb
Hard crash in FF 54 and Nightly 56 on Windows 7 - r=jgilbert
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/451af4e965fb
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Flags: qe-verify+
I have reproduced this issue using an affected Nightly build (56.0a1, 2017-06-26).
- https://crash-stats.mozilla.com/report/index/9c879d4e-656d-4b1f-a056-24b250170824

The demo from comment 2 is not crashing anymore, on Beta 56.0b5 (20170821193225) running Windows 7 x64/x86.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Blocks: 1341131
No longer depends on: 1341131
Comment on attachment 8885078 [details] [diff] [review]
0001-Bug-1376399-Hard-crash-in-FF-54-and-Nightly-56-on-Wi.patch

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: crash fix
User impact if declined: bug 1376399
Fix Landed on Version: 56
Risk to taking this patch (and alternatives if risky): minimal
String or UUID changes made by this patch: none
Attachment #8885078 - Flags: approval-mozilla-esr52?
Comment on attachment 8885078 [details] [diff] [review]
0001-Bug-1376399-Hard-crash-in-FF-54-and-Nightly-56-on-Wi.patch

wow that's a bad commit message...

crash fix, esr52.4+
Attachment #8885078 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
See Also: → CVE-2017-7845
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: