Last Comment Bug 754669 - Update WebGL tests to r17794
: Update WebGL tests to r17794
Product: Core
Classification: Components
Component: Canvas: WebGL (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla15
Assigned To: Benoit Jacob [:bjacob] (mostly away)
: Milan Sreckovic [:milan]
Depends on:
Blocks: 749497
  Show dependency treegraph
Reported: 2012-05-13 06:27 PDT by Benoit Jacob [:bjacob] (mostly away)
Modified: 2012-05-24 09:34 PDT (History)
5 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

remove legacy cast to int32 in vertexAttribPointer (937 bytes, patch)
2012-05-22 15:13 PDT, Benoit Jacob [:bjacob] (mostly away)
bzbarsky: review+
Details | Diff | Splinter Review
fix stack corruption (1.27 KB, patch)
2012-05-23 12:27 PDT, Benoit Jacob [:bjacob] (mostly away)
jmuizelaar: review+
Details | Diff | Splinter Review

Description Benoit Jacob [:bjacob] (mostly away) 2012-05-13 06:27:31 PDT
Upstream is:

This has some important fixes since our last sync, but many of the important recently added tests are not there. If we want the most exhaustive tests, we'll have to switch back to trunk.

Comment 1 Benoit Jacob [:bjacob] (mostly away) 2012-05-22 15:13:55 PDT
Created attachment 626204 [details] [diff] [review]
remove legacy cast to int32 in vertexAttribPointer

Needed to pass updated 1.0.1 tests.
Comment 2 Benoit Jacob [:bjacob] (mostly away) 2012-05-22 15:14:37 PDT
Comment 3 Boris Zbarsky [:bz] (still a bit busy) 2012-05-22 17:49:19 PDT
Comment on attachment 626204 [details] [diff] [review]
remove legacy cast to int32 in vertexAttribPointer

Comment 4 Benoit Jacob [:bjacob] (mostly away) 2012-05-22 19:22:30 PDT
The Mac oranges on this tryserver run are crashes while running conformance/misc/invalid-passed-params.html. Need a mac to test locally.
Comment 5 Benoit Jacob [:bjacob] (mostly away) 2012-05-23 10:59:31 PDT
Can't reproduce on MacBook Air with 10.7. This suggests a NVIDIA-specific driver bug (the test slaves have NVIDIA).
Comment 6 Vladimir Vukicevic [:vlad] [:vladv] 2012-05-23 11:11:10 PDT
Note that Mo at google mentioned that 10.7.4 regressed graphics drivers badly, breaking the webgl conformance suite.  Wonder if we're hitting something like that?
Comment 7 Benoit Jacob [:bjacob] (mostly away) 2012-05-23 11:19:47 PDT
Nope, the crash occurs on both our 10.6 and 10.7 mac slaves.
Comment 8 Benoit Jacob [:bjacob] (mostly away) 2012-05-23 11:26:53 PDT
Actually, Linux 32bit slaves crash too, in the same test. Not the 64bit ones though. Anyway, this adds more suspiscion of a NVIDIA driver bug.
Comment 9 Benoit Jacob [:bjacob] (mostly away) 2012-05-23 11:29:07 PDT
Googlers -- Have you seen NVIDIA driver crashes in invalid-passed-params.html in recent revisions of 1.0.1 tests?
Comment 10 Benoit Jacob [:bjacob] (mostly away) 2012-05-23 11:35:48 PDT
The last log line is from

  shouldGenerateGLError(context, context.NO_ERROR, "context.bindTexture(context.TEXTURE_2D, tex)");

So the crash occurs in the next line, which is

  shouldGenerateGLError(context, context.INVALID_VALUE, "context.texImage2D(context.TEXTURE_2D, 0, context.RGBA, -16, -16, 0, context.RGBA, context.UNSIGNED_BYTE, null)");

We should be catching negative sizes in texImage2D...
Comment 11 Benoit Jacob [:bjacob] (mostly away) 2012-05-23 11:42:15 PDT
Yes we do check for negative sizes.
Comment 12 Benoit Jacob [:bjacob] (mostly away) 2012-05-23 12:23:18 PDT
TL;DR: not a driver bug. Pure Mozilla bug. The reason why it was finicky to reproduce was stack corruption.

I was adding some printf debugging to push to tryserver, ran it locally just to test... and boom! Got the crash. The stack shows I was crashing in NSPR printf implementation:

(gdb) bt
#0  0x00007ffff7e92dc0 in cvt_s (ss=0x7fffffff7d80, str=0x4000 <Address 0x4000 out of bounds>, width=0, prec=-1, flags=0)
    at /hack/mozilla-central/nsprpub/pr/src/io/prprf.c:370
#1  0x00007ffff7e94429 in dosprintf (ss=0x7fffffff7d80, fmt=0x7ffff5a55fb2 ": width and height must be >= 0", ap=0x7fffffff8248)
    at /hack/mozilla-central/nsprpub/pr/src/io/prprf.c:998
#2  0x00007ffff7e94bc9 in PR_vsnprintf (out=0x7fffffff7e10 "PUO\312\377\177", outlen=1024, 
    fmt=0x7ffff5a55fb0 "%s: width and height must be >= 0", ap=0x7fffffff8248) at /hack/mozilla-central/nsprpub/pr/src/io/prprf.c:1202
#3  0x00007ffff3d5196c in mozilla::WebGLContext::GenerateWarning (this=0x7fffdec2d400, 
    fmt=0x7ffff5a55fb0 "%s: width and height must be >= 0", ap=0x7fffffff8248)
    at /hack/mozilla-central/content/canvas/src/WebGLContextUtils.cpp:56
#4  0x00007ffff3d51d8e in mozilla::WebGLContext::ErrorInvalidValue (this=0x7fffdec2d400, 
    fmt=0x7ffff5a55fb0 "%s: width and height must be >= 0") at /hack/mozilla-central/content/canvas/src/WebGLContextUtils.cpp:144
#5  0x00007ffff3d55796 in mozilla::WebGLContext::ValidateLevelWidthHeightForTarget (this=0x7fffdec2d400, target=3553, level=0, 
    width=-16, height=-16, info=0x7ffff5a50fcb "texImage2D") at /hack/mozilla-central/content/canvas/src/WebGLContextValidate.cpp:436
#6  0x00007ffff3d4b1d3 in mozilla::WebGLContext::TexImage2D_base (this=0x7fffdec2d400, target=3553, level=0, internalformat=6408, 
    width=-16, height=-16, srcStrideOrZero=0, border=0, format=6408, type=5121, data=0x0, byteLength=0, jsArrayType=-1, 
    srcFormat=mozilla::WebGLTexelConversions::Auto, srcPremultiplied=false)
    at /hack/mozilla-central/content/canvas/src/WebGLContextGL.cpp:5582

And the cause is a regression from bug 728017 which refactored how we do this validation: the new code is:

    if (width < 0 || height < 0) {
        ErrorInvalidValue("%s: width and height must be >= 0");
        return false;

Spot the bug? Printf-like call with a %s but without matching extra argument!

This would explain why the stacks we got were useless.
Comment 13 Benoit Jacob [:bjacob] (mostly away) 2012-05-23 12:27:13 PDT
Created attachment 626547 [details] [diff] [review]
fix stack corruption

New try:
Comment 14 Benoit Jacob [:bjacob] (mostly away) 2012-05-23 14:58:04 PDT
Green \o/
Comment 15 Benoit Jacob [:bjacob] (mostly away) 2012-05-23 15:08:16 PDT
Landed the stack corruption fix on a CLOSED TREE as somehow, this preexisting stack corruption started causing crashes with my earlier push today.

Keep open for the other patches, which I'll land once the tree reopens.
Comment 16 Benoit Jacob [:bjacob] (mostly away) 2012-05-23 19:22:00 PDT
Remaining patches:

Can close this bug now!
Comment 18 Ed Morley [:emorley] 2012-05-24 09:34:10 PDT
And the first changeset:

Note You need to log in before you can comment on or make changes to this bug.