WebGL crash in MeShade [@ run_vertex_stage]

RESOLVED FIXED in mozilla1.9.3a5

Status

()

--
critical
RESOLVED FIXED
8 years ago
7 years ago

People

(Reporter: bjacob, Assigned: bjacob)

Tracking

({crash})

Trunk
mozilla1.9.3a5
x86_64
Linux
crash
Points:
---
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

(crash signature)

Attachments

(1 attachment, 3 obsolete attachments)

Comment hidden (empty)
(Assignee)

Comment 1

8 years ago
Talking about this WebGL page:

http://spidergl.org/meshade/

When I click on "Load" I sometimes get this crash (using OSMesa):

#0  0x0000000000000000 in ?? ()
#1  0x00007fffca51d08b in run_vertex_stage () from /usr/lib/libOSMesa.so
#2  0x00007fffca510dca in _tnl_run_pipeline () from /usr/lib/libOSMesa.so
#3  0x00007fffca511b94 in _tnl_draw_prims () from /usr/lib/libOSMesa.so
#4  0x00007fffca511e96 in _tnl_vbo_draw_prims () from /usr/lib/libOSMesa.so
#5  0x00007fffca508702 in vbo_validated_drawrangeelements () from /usr/lib/libOSMesa.so
#6  0x00007fffca50886d in vbo_exec_DrawElements () from /usr/lib/libOSMesa.so
#7  0x00007ffff5d4c50e in mozilla::WebGLContext::DrawElements (this=0x7fffc95f9320, mode=4, 
    count=2946, type=5123, offset=0)
    at ../../../../../mozilla-central/content/canvas/src/WebGLContextGL.cpp:974
#8  0x00007ffff648c4d3 in nsICanvasRenderingContextWebGL_DrawElements (cx=0x7fffe0b29000, argc=4, 
    vp=0x7fffc71f9738) at dom_quickstubs.cpp:23666
#9  0x00007ffff4c72629 in js_Interpret (cx=0x7fffe0b29000)
    at ../../../../mozilla-central/js/src/jsops.cpp:2199
#10 0x00007ffff4c858e6 in js_Invoke (cx=0x7fffe0b29000, argc=1, vp=0x7fffc71f9038, flags=0)
    at ../../../../mozilla-central/js/src/jsinterp.cpp:831
#11 0x00007ffff4c85c4c in js_InternalInvoke (cx=0x7fffe0b29000, obj=0x7fffc8a90800, 
    fval=140737013998144, flags=0, argc=1, argv=0x7fffe3051ad8, rval=0x7fffffffd008)
    at ../../../../mozilla-central/js/src/jsinterp.cpp:882
#12 0x00007ffff4bfa270 in JS_CallFunctionValue (cx=0x7fffe0b29000, obj=0x7fffc8a90800, 
    fval=140737013998144, argc=1, argv=0x7fffe3051ad8, rval=0x7fffffffd008)
    at ../../../../mozilla-central/js/src/jsapi.cpp:4922
#13 0x00007ffff5f4fa74 in nsJSContext::CallEventHandler (this=0x7fffe0b2fac0, 
    aTarget=0x7fffc95edc00, aScope=0x7fffc8a90800, aHandler=0x7fffe3b9e240, aargv=0x7fffc7cf0fc8, 
    arv=0x7fffffffd150) at ../../../../mozilla-central/dom/base/nsJSEnvironment.cpp:2197
#14 0x00007ffff5f8a2c8 in nsGlobalWindow::RunTimeout (this=0x7fffc95edc00, 
    aTimeout=0x7fffe3068f20) at ../../../../mozilla-central/dom/base/nsGlobalWindow.cpp:8572
#15 0x00007ffff5f8af73 in nsGlobalWindow::TimerCallback (aTimer=0x7fffe3068f80, 
---Type <return> to continue, or q <return> to quit---
    aClosure=0x7fffe3068f20) at ../../../../mozilla-central/dom/base/nsGlobalWindow.cpp:8916
#16 0x00007ffff6ac2fe0 in nsTimerImpl::Fire (this=0x7fffe3068f80)
    at ../../../../mozilla-central/xpcom/threads/nsTimerImpl.cpp:427
#17 0x00007ffff6ac323c in nsTimerEvent::Run (this=0x7fffc7ca7bb0)
    at ../../../../mozilla-central/xpcom/threads/nsTimerImpl.cpp:519
#18 0x00007ffff6abbbfd in nsThread::ProcessNextEvent (this=0x7fffec0380d0, mayWait=0, 
    result=0x7fffffffd3ec) at ../../../../mozilla-central/xpcom/threads/nsThread.cpp:547
#19 0x00007ffff6a503f4 in NS_ProcessNextEvent_P (thread=0x7fffec0380d0, mayWait=0)
    at nsThreadUtils.cpp:250
#20 0x00007ffff698b964 in mozilla::ipc::MessagePump::Run (this=0x7fffec0dc340, 
    aDelegate=0x7fffec055840) at ../../../../mozilla-central/ipc/glue/MessagePump.cpp:118
#21 0x00007ffff6b305af in MessageLoop::RunInternal (this=0x7fffec055840)
    at ../../../../mozilla-central/ipc/chromium/src/base/message_loop.cc:216
#22 0x00007ffff6b30534 in MessageLoop::RunHandler (this=0x7fffec055840)
    at ../../../../mozilla-central/ipc/chromium/src/base/message_loop.cc:199
#23 0x00007ffff6b304c5 in MessageLoop::Run (this=0x7fffec055840)
    at ../../../../mozilla-central/ipc/chromium/src/base/message_loop.cc:173
#24 0x00007ffff68416f5 in nsBaseAppShell::Run (this=0x7fffe3c73080)
    at ../../../../../mozilla-central/widget/src/xpwidgets/nsBaseAppShell.cpp:175
#25 0x00007ffff65b3c91 in nsAppStartup::Run (this=0x7fffe35d3420)
    at ../../../../../../mozilla-central/toolkit/components/startup/src/nsAppStartup.cpp:192
#26 0x00007ffff56c3d87 in XRE_main (argc=4, argv=0x7fffffffe008, aAppData=0x7fffec0250f0)
    at ../../../../mozilla-central/toolkit/xre/nsAppRunner.cpp:3612
#27 0x000000000040214c in main (argc=4, argv=0x7fffffffe008)
    at ../../../../mozilla-central/browser/app/nsBrowserApp.cpp:158

Updated

8 years ago
Severity: normal → critical
Keywords: crash
Summary: WebGL crash in MeShade → WebGL crash in MeShade [@ run_vertex_stage]
(Assignee)

Comment 2

8 years ago
Created attachment 447905 [details] [diff] [review]
Fix buffer validation issues

This patch fixes this crash and many other issues, all revolving around the validation of buffers (both index arrays and vertex attrib arrays) before calls to glDrawArrays and glDrawElements.
Assignee: nobody → bjacob
Status: NEW → ASSIGNED
Attachment #447905 - Flags: review?(vladimir)
Comment on attachment 447905 [details] [diff] [review]
Fix buffer validation issues

Looks like a good start, but a few comments:

> class WebGLContext :
>     public nsICanvasRenderingContextWebGL,
>     public nsICanvasRenderingContextInternal,
>     public nsSupportsWeakReference
> {
> public:
>@@ -319,16 +344,18 @@ protected:
>     nsRefPtrHashtable<nsUint32HashKey, WebGLShader> mMapShaders;
>     nsRefPtrHashtable<nsUint32HashKey, WebGLFramebuffer> mMapFramebuffers;
>     nsRefPtrHashtable<nsUint32HashKey, WebGLRenderbuffer> mMapRenderbuffers;
> 
> public:
>     // console logging helpers
>     static void LogMessage (const char *fmt, ...);
>     static void LogMessage(const char *fmt, va_list ap);
>+
>+    friend class WebGLBuffer;

^ No need; see below why.

>-    PRBool Deleted() { return mDeleted; }
>-    GLuint GLName() { return mName; }
>-    PRUint32 ByteLength() { return mByteLength; }
>+    PRBool Deleted() const { return mDeleted; }
>+    GLuint GLName() const { return mName; }
>+    PRUint32 ByteLength() const { return mByteLength; }

>+    GLenum target() const { return mTarget; }
>+    const void *dataCopy() const { return mDataCopy; }

These should be capitalized to match the others, and probably called something like Target and Data.  No need to call it DataCopy -- the fact that it's a copy doesn't really matter.

>-    void SetByteLength(GLuint len) {
>+    void SetData(GLenum target, GLuint len, const void* data = nsnull) {
>+        free(mDataCopy);
>+        mDataCopy = nsnull;
>+        mTarget = target;
>         mByteLength = len;
>+        if (mTarget == LOCAL_GL_ELEMENT_ARRAY_BUFFER) {
>+            mDataCopy = malloc(mByteLength);
>+            if (data)
>+                memcpy(mDataCopy, data, mByteLength);
>+            else
>+                memset(mDataCopy, 0, mByteLength);
>+        }

It might be worth using realloc() here, but maybe it doesn't matter; also might be worth checking to see if the old length is the same as the new length before doing a free/malloc.  The realloc would take care of that.  Maybe something like:

{
  mByteLength = len;
  if (mTarget == LOCAL_GL_ELEMENT_ARRAY_BUFFER) {
    mDataCopy = realloc(mDataCopy, mByteLength);
    if (data)
      memcpy(mDataCopy, data, mByteLength);
    else
      memset(mDataCopy, 0, mByteLength);
  }
}

Also note that mTarget should not be set here -- there's no need to pass it in, even.  It should be checked and set when BindBuffer is called, which is missing from your patch.  e.g. it should check to see if the buffer's target is NONE, and if so just set it; otherwise it should check to make sure that it matches, and raise an error if not.

>+    void SetSubDataCopy(GLuint byteOffset, GLuint byteLen, const void* data) {

^ May as well call this SetSubData to match SetData

>+        if (mTarget != LOCAL_GL_ELEMENT_ARRAY_BUFFER) {
>+            return;
>+        }

^ No need for the { }

>+        memcpy((void*) (size_t(mDataCopy)+byteOffset), data, byteLen);
>+    }
>+
>+    PRBool ValidateBuffers(WebGLContext* context, GLuint count, GLenum type, GLuint byteOffset) const;

ValidateBuffers really shouldn't be on here -- it should be on the context itself.  However, what should be on here
is a function to get the highest unsigned byte/unsigned short in a given range.  The results of those computations should
also be cached for any given offset,length pair so that we don't have to walk the array each time.

Not sure what the right structure to cache them in would be; probably a nsDataHashtable with a struct key that's something like

struct {
  size_t offset;
  size_t length;
}

with a separate hashtable for UNSIGNED_BYTE and UNSIGNED_SHORT.  Could put the type in the hash struct also, but given that an array is likely to only be used as one or the other, we may as well skip it.

So, something like:

   GLuint MaxUnsignedByteValueInRange(GLuint offset, GLuint length);
   GLuint MaxUnsignedShortValueInRange(GLuint offset, GLuint length);

And then the result of that would just get passed to ValidateBuffers, because once we have that value it's the same as if we were in DrawArrays as far as validation goes.


>     // XXX what happens if BufferData fails? We probably shouldn't
>     // update our size here then, right?

^ this actually made me think of another problem (well, think again of the other problem); see my mail to the private webgl mailing list.  I'll file a followup bug for fixing this.

>+    if (!mBoundElementArrayBuffer->ValidateBuffers(this, count, type, byteOffset)) {
>+        return ErrorInvalidOperation("DrawElements: bound vertex attribute buffers do not have sufficient "
>+                                     "data for given indices from the bound element array");
>+    }

^ See above; just calculate the highest index in the appropriate range for the given type, and then just call ValidateBuffers.
Attachment #447905 - Flags: review?(vladimir) → review-
From IRC:

10:13 < bjacob> vlad: you said we dont even need a mTarget member in WebGLBuffer but if we dont have one, how will we test in BindBuffer() that the target is correct? We'll have to do a glGet call and that will be slow, right? Though even with mTarget members we'll still have to do a lookup. What did you have in mind?
10:22 < bjacob> vlad: actually it looks like there's not even a glGet to determine the target of an existing buffer!

Nope, note that above I said "mTarget should not be set here", meaning it should not be set as part of the SetData call -- instead, it should be set/checked on the WebGLBuffer object at BindBuffer time.
(Assignee)

Comment 5

8 years ago
Created attachment 448376 [details] [diff] [review]
Fix buffer validation issues
Attachment #447905 - Attachment is obsolete: true
Attachment #448376 - Flags: review?(vladimir)
(Assignee)

Comment 6

8 years ago
This new version addresses, I hope, all of your comments, except that I didn't implement the caching, explanation below.

I changed a lot of stuff in class WebGLBuffer, hope you like it like that.

> It might be worth using realloc() here

You're right! I was hesitating because of the inherent cost of being conservative (realloc preserves the buffer data) but on the other hand it can realistically offer optimization opportunities to the allocator, as opposed to free()+malloc().

Since realloc() allowed cleaner/shorter code on our side, I went for it.

> ValidateBuffers really shouldn't be on here

I was just following (my understanding of) your comment (removed by the patch):

    // XXXmark fix validation
    // XXX either GLushort or GLubyte; just put this calculation as
       a method on the array object

But I agree that it should be put in the context like the existing ValidateBuffers()! New patch does that.

> However, what should be on here is a function to get
> the highest unsigned byte/unsigned short in a given range.

Right, in my first patch this was a global function in WebGLValidate.cpp, now it's a method here.

> And then the result of that would just get passed to ValidateBuffers,
> because once we have that value it's the same as if we were in DrawArrays
> as far as validation goes.

Yes, that's what I was doing!

> ^ this actually made me think of another problem (well, think again of
> the other problem); see my mail to the private webgl mailing list.

I'm still not subscribed to it.

***

Now discussing caching. I see what you mean but there are a couple of issues to solve. One is, what to do in the case where the user uses a lot of (count,offset) pairs? One simple solution is that when the cache has more than N entries, we clear it.

A second question is whether using a hash table is really the right data structure? In a lot of cases the hash table will have only 1 entry, and in that case, a hash table would be less efficient than e.g. a plain array of key-value pairs.

Finally, maybe it's just more efficient to compute, at the time of the BufferData call, an auxiliary helper array that allows to later compute very quickly the max index for a given (count,offset) pair? For example, using binary partitioning we can make the max computation in a sub-array have only log complexity (or something like that... perhaps log^2).

Maybe it's better to defer this optimization talk to a later date.
(Assignee)

Comment 7

8 years ago
Created attachment 448392 [details] [diff] [review]
Fix buffer validation issues

New patch fixes minor style issues.
Attachment #448376 - Attachment is obsolete: true
Attachment #448392 - Flags: review?(vladimir)
Attachment #448376 - Flags: review?(vladimir)
Comment on attachment 448392 [details] [diff] [review]
Fix buffer validation issues

Looks great, with one bug:

> NS_IMETHODIMP
>+WebGLContext::BufferSubData_buf(GLenum target, GLsizei byteOffset, js::ArrayBuffer *wb)
> {
> ...
>-    gl->fBufferSubData(target, offset, wb->byteLength, wb->data);
>+    boundBuffer->SetByteLength(wb->byteLength);

^ This is wrong for BufferSubData -- the array is already the correct length, this would set it to a wrong/shorter length (it would set it to the length of the passed-in "sub" segment).


> NS_IMETHODIMP
>+WebGLContext::BufferSubData_array(GLenum target, GLsizei byteOffset, js::TypedArray *wa)
> ...
>+    boundBuffer->SetByteLength(wa->byteLength);

^ same thing

I agree about the caching -- file a followup bug and let's figure that out?
Attachment #448392 - Flags: review?(vladimir) → review+
(Assignee)

Comment 9

8 years ago
Created attachment 448575 [details] [diff] [review]
Fix buffer validation issues, updated

Thanks for spotting this! Here's an updated patch. Carried r+ forward.
Attachment #448392 - Attachment is obsolete: true
Attachment #448575 - Flags: review+
(Assignee)

Updated

8 years ago
Keywords: checkin-needed
Comment on attachment 448575 [details] [diff] [review]
Fix buffer validation issues, updated

un-carrying-forward, this is a nontrivial update
Attachment #448575 - Flags: review+ → review?(vladimir)
> I agree about the caching -- file a followup bug and let's figure that out?

bug 569431
http://hg.mozilla.org/mozilla-central/rev/4d20c3eeed53
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED

Updated

8 years ago
Keywords: checkin-needed
Target Milestone: --- → mozilla1.9.3a5
Flags: in-testsuite? → in-testsuite+
Crash Signature: [@ run_vertex_stage]
You need to log in before you can comment on or make changes to this bug.