Closed Bug 682335 (CVE-2011-3003) Opened 13 years ago Closed 13 years ago

Investigate crash [@ WebGLContext::BufferSubData_array]

Categories

(Core :: Graphics: CanvasWebGL, defect)

6 Branch
All
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla9
Tracking Status
firefox7 + fixed
firefox8 + fixed
firefox9 + fixed
status1.9.2 --- unaffected
status1.9.1 --- unaffected

People

(Reporter: bsterne, Assigned: bsterne)

Details

(Whiteboard: [sg:critical?][qa+])

Crash Data

Attachments

(5 files)

Ben Hawkes reported the following to security@m.o yesterday.  There are 31MB of uncompressed testcases (4MB compressed) so, for the purposes of this bug, I am going to remove the testcases beyond the one where he observes the crash. Otherwise I won't be able to attach the PoC to Bugzilla.

I'll separately follow up with Ben to see if there is a way we can get the testcase generator and perhaps integrate it into our existing fuzzers.

------

Hi,

I'm getting a reliable crash in WebGLContext::BufferSubData_array in Firefox 6.0 while doing some basic fuzzing of WebGL/GLESSL. I'm having some trouble finding the root-cause (presumably its a use-after-free), but since the crash looks exploitable I thought I'd get you guys involved early to help investigate further.

The attached test cases are run sequentially from "harness01.html" (starting from test 3750.html) and I reliably see the crash at 4169.html on a Linux system using the proprietary NVidia driver. Stack trace as follows:

Program received signal SIGSEGV, Segmentation fault.
0xf675e700 in mozilla::WebGLBuffer::CopySubDataIfElementArray
(this=0x54ff8680, byteOffset=558000083, byteLength=16948,
data=0xb7e01000)
    at /usr/include/bits/string3.h:52
52        return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
(gdb) x/i 0xf675e700
=> 0xf675e700 <mozilla::WebGLBuffer::CopySubDataIfElementArray(GLuint,
GLuint, void const*)+32>:        rep movsb %ds:(%esi),%es:(%edi)
(gdb) i r esi edi ecx
esi            0xb7e01000       -1210052608
edi            0x214267d3       558000083
ecx            0x4234   16948
(gdb) x/bx 0x214267d3
0x214267d3:     Cannot access memory at address 0x214267d3
(gdb) bt
#0  0xf675e700 in mozilla::WebGLBuffer::CopySubDataIfElementArray
(this=0x54ff8680, byteOffset=558000083, byteLength=16948,
data=0xb7e01000)
    at /usr/include/bits/string3.h:52
#1  0xf674e540 in mozilla::WebGLContext::BufferSubData_array
(this=0x1c3f32e0, target=34963, byteOffset=558000083, wa=0x53b70600)
    at /usr/local/google/source/firefox/firefox32/mozilla-release/content/canvas/src/WebGLContextGL.cpp:560
#2  0xf6b80ca8 in nsIDOMWebGLRenderingContext_BufferSubData
(cx=0xe90b3290, argc=3, vp=0xf19ff0e8)
    at ../../../../dist/include/CustomQS_WebGL.h:208
#3  0xf72fe8f9 in CallCompiler::generateNativeStub() () from
/source/firefox/firefox32/mozilla-release/objdir-ff-dbg32/dist/bin/libxul.so
#4  0xf72fa26f in js::mjit::ic::NativeCall (f=..., ic=0x54ff8680)
    at /usr/local/google/source/firefox/firefox32/mozilla-release/js/src/methodjit/MonoIC.cpp:1026
#5  0xe49acf28 in ?? ()
#6  0xf72aba8c in EnterMethodJIT (cx=0x0, safePoint=0xe49661dc)
    at /usr/local/google/source/firefox/firefox32/mozilla-release/js/src/methodjit/MethodJIT.cpp:685
#7  CheckStackAndEnterMethodJIT (cx=0x0, safePoint=0xe49661dc)
    at /usr/local/google/source/firefox/firefox32/mozilla-release/js/src/methodjit/MethodJIT.cpp:715
#8  js::mjit::JaegerShotAtSafePoint (cx=0x0, safePoint=0xe49661dc)
    at /usr/local/google/source/firefox/firefox32/mozilla-release/js/src/methodjit/MethodJIT.cpp:742
#9  0xf7362522 in js::Interpret (cx=0xe90b3290, entryFrame=0xf19ff068,
inlineCallCount=1, interpMode=js::JSINTERP_NORMAL)
    at /usr/local/google/source/firefox/firefox32/mozilla-release/js/src/jsinterp.cpp:6347
#10 0xf71788c1 in js::RunScript (cx=0xe90b3290, script=0x537ff900,
fp=0xf19ff068)
    at /usr/local/google/source/firefox/firefox32/mozilla-release/js/src/jsinterp.cpp:613
#11 0xf71794a5 in js::Invoke (cx=0xe90b3290, argsRef=...,
option=js::INVOKE_NORMAL)
    at /usr/local/google/source/firefox/firefox32/mozilla-release/js/src/jsinterp.cpp:694
etc

Note: I was only able to reproduce this on 32-bit builds, my 64-bit build would not trigger this crash.

Thanks,
Ben
Have not investigated but based on the stack it's plausibly sg:critical until proven otherwise. Benoit, please check this out.
Assignee: nobody → bjacob
Whiteboard: [sg:critical?]
I am on linux x86-64 (debian squeeze) with the NVIDIA 280.13 driver, and I can't reproduce this crash at all, not even in Firefox 6.0.2, 32-bit version.

Without me being able to reproduce here are some things that could move this forward:
 1. Can you reproduce this crash in Nightly?
 2. Can you run this with APItrace,
      https://github.com/apitrace/apitrace
    and attach here the resulting trace?
The very large byteLength and the fact that this only happens on 32bit suggests this might be a duplicate of bug 665070. If this crash persists in Nightly then it's not a duplicate of bug 665070.
Also, if this is bug 665070 or any other out-of-memory-related bug, then running this in a debug build with MOZ_GL_DEBUG_VERBOSE=1 will show it (this prints GL error codes and GL_OUT_OF_MEMORY value will appear).
bsterne: reassigning to you to follow up with the reporter on comment 4
Assignee: bjacob → bsterne
Whiteboard: [sg:critical?] → [sg:critical?] might be dupe of 665070
(In reply to Daniel Veditz from comment #6)
> bsterne: reassigning to you to follow up with the reporter on comment 4

He's copied on the bug.

Ben, can you respond to comment 4?  Are you able to reproduce this using Nightly?  We are trying to determine if this is a dupe of a bug we already fixed.
Hi! I've reproduced the crash on Nightly 9.0a1 (2011-09-08). It is still crashing on test case 4169.html during a memcpy to an invalid destination. I've hosted the apitrace file here:

http://sota.gen.nz/webgl_crash_trace.tar.gz

I don't have access to bug 665070, so I can't comment on the likelihood of it being related to this crash.

Cheers!
Thanks, but:

$ ./build/apitrace/tracedump ~/firefox.trace
error: unsupported trace format version 97
error: failed to open /home/bjacob/firefox.trace

It seems that we have incompatible versions of APItrace. Could you please run tracedump on your side, and compress and attach to this bug the resulting text file?
The fact that it was reproduced in Nightly means it's not a dupe of 665070.
Whiteboard: [sg:critical?] might be dupe of 665070 → [sg:critical?]
Trace dump output uploaded here:

http://sota.gen.nz/webgl_crash_trace_dump.tar.gz
TL;DR: we have a driver bug here. What driver is this?


Here are all the Buffer-related calls in this trace:

376196 glGenBuffers(n = 1, buffer = &3)
376276 glGetBufferParameteriv(target = GL_ARRAY_BUFFER, pname = GL_BUFFER_SIZE, params = &0)
376291 glBindBuffer(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 3)
376447 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 0)
376479 glGetBufferParameteriv(target = GL_ARRAY_BUFFER, pname = GL_BUFFER_SIZE, params = &0)
376482 glIsBuffer(buffer = 3) = true
376496 glIsBuffer(buffer = 3) = true
376518 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 0)
376543 glBufferData(target = GL_ELEMENT_ARRAY_BUFFER, size = 956231260, data = NULL, usage = GL_STATIC_DRAW)
376544 glGetError() = GL_NO_ERROR
376603 glGetBufferParameteriv(target = GL_ARRAY_BUFFER, pname = GL_BUFFER_USAGE, params = &0)
376824 glGetBufferParameteriv(target = GL_ELEMENT_ARRAY_BUFFER, pname = GL_BUFFER_SIZE, params = &0)


Here is the interesting part:

376518 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 0)
376543 glBufferData(target = GL_ELEMENT_ARRAY_BUFFER, size = 956231260, data = NULL, usage = GL_STATIC_DRAW)
376544 glGetError() = GL_NO_ERROR

BindBuffer with buffer=0 unbind any bound buffer. So the glBufferData call should fail and produce a GL error:

http://www.khronos.org/opengles/sdk/docs/man/xhtml/glBufferData.xml

"GL_INVALID_OPERATION is generated if the reserved buffer object name 0 is bound to target."

So this is a bug in this OpenGL implementation/driver.

What driver is this?

Next to that we must also have a bug on our side. Indeed, our WebGL BufferData functions do check that a buffer is bound, and generate an error if no buffer is bound. So here, apparently they believe that a buffer is bound but none is. This could be another bug in the category "Mistake in keeping track of GL state".
(In reply to Benoit Jacob [:bjacob] from comment #12)
> TL;DR: we have a driver bug here. What driver is this?
> 
> 
> Here are all the Buffer-related calls in this trace:
> 
> 376196 glGenBuffers(n = 1, buffer = &3)
> 376276 glGetBufferParameteriv(target = GL_ARRAY_BUFFER, pname =
> GL_BUFFER_SIZE, params = &0)
> 376291 glBindBuffer(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 3)
> 376447 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 0)
> 376479 glGetBufferParameteriv(target = GL_ARRAY_BUFFER, pname =
> GL_BUFFER_SIZE, params = &0)
> 376482 glIsBuffer(buffer = 3) = true
> 376496 glIsBuffer(buffer = 3) = true
> 376518 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 0)
> 376543 glBufferData(target = GL_ELEMENT_ARRAY_BUFFER, size = 956231260, data
> = NULL, usage = GL_STATIC_DRAW)
> 376544 glGetError() = GL_NO_ERROR
> 376603 glGetBufferParameteriv(target = GL_ARRAY_BUFFER, pname =
> GL_BUFFER_USAGE, params = &0)
> 376824 glGetBufferParameteriv(target = GL_ELEMENT_ARRAY_BUFFER, pname =
> GL_BUFFER_SIZE, params = &0)
> 
> 
> Here is the interesting part:
> 
> 376518 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 0)
> 376543 glBufferData(target = GL_ELEMENT_ARRAY_BUFFER, size = 956231260, data
> = NULL, usage = GL_STATIC_DRAW)
> 376544 glGetError() = GL_NO_ERROR

Oops. The target is different: GL_ARRAY_BUFFER versus GL_ELEMENT_ARRAY_BUFFER. Need to think more.
(In reply to Brandon Sterne (:bsterne) from comment #0)
> Program received signal SIGSEGV, Segmentation fault.
> 0xf675e700 in mozilla::WebGLBuffer::CopySubDataIfElementArray
> (this=0x54ff8680, byteOffset=558000083, byteLength=16948,
> data=0xb7e01000)
>     at /usr/include/bits/string3.h:52
> 52        return __builtin___memcpy_chk (__dest, __src, __len, __bos0
> (__dest));
> (gdb) x/i 0xf675e700
> => 0xf675e700 <mozilla::WebGLBuffer::CopySubDataIfElementArray(GLuint,
> GLuint, void const*)+32>:        rep movsb %ds:(%esi),%es:(%edi)
> (gdb) i r esi edi ecx
> esi            0xb7e01000       -1210052608
> edi            0x214267d3       558000083
> ecx            0x4234   16948
> (gdb) x/bx 0x214267d3
> 0x214267d3:     Cannot access memory at address 0x214267d3

Pardon my ignorance but how does this show that we're writing at an invalid destination? Here edi is just the offset, not the absolute address. We're writing to es:edi, not to edi. Right? So at this point we don't know that we're doing an invalid write, this could also be an invalid read. right?
...thinking about this, maybe the answer is that es itself is 0. That would mean that mData is null. Looking at the code, this can happen if it failed to allocate and we don't seem to properly check for that.
When mData is null, we want to avoid dereferencing (pointers based on) it. Rather than adding mData!=0 checks everywhere, we already have mByteLength!=0 checks in most places so I used that. When mData allocation fails, mByteLength gets set to 0 so that our bookkeeping is consistent (a buffer that failed to allocate has size 0). Should remove the crash.
Attachment #559455 - Flags: review?(jmuizelaar)
Comment on attachment 559455 [details] [diff] [review]
set byteLength to 0 on mData allocation failure

Should we be using a safer data type for this?
Attached file Minimal testcase
This is a really minimal testcase. But since it relies on OOM and WebGL doesn't allow buffer allocations bigger than 2G, it's still hard to reproduce on 64bit systems with lots of virtual memory (note that VRAM is virtualized). So in order to reproduce easily, I patched Firefox to make 100x bigger allocations in CopyDataIfElementArray and in ZeroDataIfElementArray in WebGLBuffer, in WebGLContext.h.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #17)
> Should we be using a safer data type for this?

Yes, we should use something like nsTArray<PRUint8>.

The reason for void* was that a WebGL buffer, already in the ELEMENT_ARRAY_BUFFER case that's relevant here, can be either an array of uint8's or an array of uint16's with the possibility of uint32's in the future.

But this doesn't really matter to the code here, and using nsTArray<PRUint8> and letting it handle resizing and querying its length instead of storing mByteLength separately would have prevented this bug.

Please still r+ this patch as it is minimal and so will be easier to approve for beta. Will write the long-term-solution patch.
(In reply to Benoit Jacob [:bjacob] from comment #19)
> But this doesn't really matter to the code here, and using nsTArray<PRUint8>
> and letting it handle resizing and querying its length instead of storing
> mByteLength separately would have prevented this bug.

Ah, wait, no, there is a difficulty.

We only want to store a copy of the WebGLBuffer contents (mData) in the ELEMENT_ARRAY_BUFFER case.

In the ARRAY_BUFFER case, storing it would be useless and typically a lot more expensive. That's why we don't do it. But we _still_ need to store the byteLength!

I think the full solution is we want to have both:
 * always store mByteLength as an integer (as we currently do)
 * store mData as a nsTArray
 * there would not quite be a redundancy between mByteLength and mData.Length() as  mByteLength would be the length of the actual GPU-side buffer while mData.Length() would be the size of the copy we have on our side. So they're different concepts. Of course their values are the same unless we are in an error case.
I confirm that the current patch fixes the crash on my testcase.
Comment on attachment 559455 [details] [diff] [review]
set byteLength to 0 on mData allocation failure

Benoit, and I talked about a mechanism for making this easier to test. We'll follow up with a link to that bug.
Attachment #559455 - Flags: review?(jmuizelaar) → review+
Indeed, the problem is that any crashtest for this bug would have to reproduce OOM conditions, which on 64bit machines with lots of slow virtual memory is indistinguishable from a plain timeout.
http://hg.mozilla.org/mozilla-central/rev/9f664f2ac12c
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Comment on attachment 559455 [details] [diff] [review]
set byteLength to 0 on mData allocation failure

Please approve for beta. It's a really bad bug. It allows to implement POKE() if you can trigger an OOM with bufferData, which is not hard to do on 32bit machines.
Attachment #559455 - Flags: approval-mozilla-beta?
Attachment #559455 - Flags: approval-mozilla-aurora?
Target Milestone: --- → mozilla9
Attachment #559455 - Flags: approval-mozilla-beta?
Attachment #559455 - Flags: approval-mozilla-beta+
Attachment #559455 - Flags: approval-mozilla-aurora?
Attachment #559455 - Flags: approval-mozilla-aurora+
qa- as no QA fix verification needed
Whiteboard: [sg:critical?] → [sg:critical?][qa-]
Again not understanding the qa- on a crash bug with a reproducible minimal testcase.
Whiteboard: [sg:critical?][qa-] → [sg:critical?]
Sorry, I made a mistake. I thought I was commenting on a different bug (triaging 131 bugs in a couple hours has a high likelihood of human error).
Whiteboard: [sg:critical?] → [sg:critical?][qa+]
Brandon, could you help us verify this bug on your environment? We've been trying to get a hold of a machine with similar specs, but I haven't been able to reproduce the crash as described nor see a difference on a build with the fix.
Juan, as I explained in comment 18, the key to reproducing this is to use a 32bit system with not too much RAM (1G would work well). Alternatively, to reproduce this on my 64bit system, I patched Firefox to make 100x bigger allocations there. If you want, I can make you such a patch.
For time sake, it would be great if someone who is already set up to reproduce this bug could verify the fix.
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #34)
> For time sake, it would be great if someone who is already set up to
> reproduce this bug could verify the fix.

Benoit, would it be possible for you to share the build you already generated? or throw something up on Try which QA can test?
pushed the alloc-100x-more to try:

https://tbpl.mozilla.org/?tree=Try&usebuildbot=1&rev=b6363b4d9bcd

builds will be available in a few hours at

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-b6363b4d9bcd

The QA steps from there are run the PoC in this build and note that it doesn't crash, but does run out of memory (I think if you check in the Web console you'll see a WebGL warning like "bufferData: out of memory". You will need to set webgl.verbose=true in about:config).
Alias: CVE-2011-3003
There was a mistake in the PoC which caused it fail on OSX and Linux due to a path issue. I have attached a diff to fix this.
Group: core-security
rforbes-bugspam-for-setting-that-bounty-flag-20130719
Flags: sec-bounty+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: