Closed Bug 791905 (CVE-2013-0763) Opened 12 years ago Closed 11 years ago

Heap-use-after-free in Mesa, triggerable by resizing a WebGL canvas

Categories

(Core :: Graphics: CanvasWebGL, defect)

x86_64
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
firefox18 --- fixed
firefox19 + fixed
firefox20 + fixed
firefox-esr10 --- unaffected
firefox-esr17 --- fixed

People

(Reporter: inferno, Unassigned)

References

Details

(Keywords: csectype-uaf, sec-critical, sec-vector, Whiteboard: [asan][adv-main18+][adv-esr17+])

Attachments

(5 files, 1 obsolete file)

Attached file Testcase (obsolete) —
Run the testcase from content/canvas/test/webgl/conformance/more/functions/ directory. Here is my OpenGL info.
OpenGL version string: 2.1 Mesa 8.0.2

Unsymbolized - 
==28427== ERROR: AddressSanitizer heap-use-after-free on address 0x7f299c105380 at pc 0x4c1336 bp 0x7fff93b8a980 sp 0x7fff93b8a748
WRITE of size 1 at 0x7f299c105380 thread T0
    #0 0x4c1335 (/tmp/firefox/firefox+0x4c1335)
    #1 0x7f29a15438c1 (/usr/lib/x86_64-linux-gnu/dri/swrast_dri.so+0x738c1)
0x7f299c105380 is located 0 bytes inside of 64-byte region [0x7f299c105380,0x7f299c1053c0)
freed by thread T0 here:
    #0 0x4c4a10 (/tmp/firefox/firefox+0x4c4a10)
    #1 0x7f29a11c4c19 (/usr/lib/x86_64-linux-gnu/dri/libdricore.so+0x155c19)
previously allocated by thread T0 here:
    #0 0x4c4d4a (/tmp/firefox/firefox+0x4c4d4a)
    #1 0x7f29a10f9d60 (/usr/lib/x86_64-linux-gnu/dri/libdricore.so+0x8ad60)
Shadow byte and word:
  0x1fe533820a70: fd
  0x1fe533820a70: fd fd fd fd fd fd fd fd
More shadow bytes:
  0x1fe533820a50: fd fd fd fd fd fd fd fd
  0x1fe533820a58: fd fd fd fd fd fd fd fd
  0x1fe533820a60: fa fa fa fa fa fa fa fa
  0x1fe533820a68: fa fa fa fa fa fa fa fa
=>0x1fe533820a70: fd fd fd fd fd fd fd fd
  0x1fe533820a78: fd fd fd fd fd fd fd fd
  0x1fe533820a80: fa fa fa fa fa fa fa fa
  0x1fe533820a88: fa fa fa fa fa fa fa fa
  0x1fe533820a90: fd fd fd fd fd fd fd fd
Stats: 214M malloced (217M for red zones) by 289321 calls
Stats: 30M realloced by 12288 calls
Stats: 174M freed by 171411 calls
Stats: 42M really freed by 47020 calls
Stats: 420M (107600 full pages) mmaped in 105 calls
  mmaps   by size class: 8:212979; 9:40955; 10:8190; 11:10235; 12:2048; 13:2048; 14:1536; 15:256; 16:384; 17:1312; 18:144; 19:40; 20:20; 
  mallocs by size class: 8:221586; 9:38394; 10:7643; 11:13561; 12:2266; 13:2004; 14:1530; 15:310; 16:504; 17:1333; 18:136; 19:37; 20:17; 
  frees   by size class: 8:122227; 9:28108; 10:4313; 11:10535; 12:1422; 13:1409; 14:1332; 15:256; 16:443; 17:1283; 18:32; 19:36; 20:15; 
  rfrees  by size class: 8:34255; 9:5161; 10:1300; 11:4626; 12:388; 13:372; 14:332; 15:83; 16:229; 17:263; 18:6; 19:4; 20:1; 
Stats: malloc large: 1523 small slow: 1623
==28427== ABORTING

Symbolized -
==27925== ERROR: AddressSanitizer heap-use-after-free on address 0x7ff8472f3c80 at pc 0x4c1336 bp 0x7fffa43d7d20 sp 0x7fffa43d7ae8
WRITE of size 1 at 0x7ff8472f3c80 thread T0
    #0 0x4c1335 in memcpy ??:?
    #1 0x7ff84cf438c1 in ?? ??:0
0x7ff8472f3c80 is located 0 bytes inside of 64-byte region [0x7ff8472f3c80,0x7ff8472f3cc0)
freed by thread T0 here:
    #0 0x4c4a10 in __interceptor_free ??:?
    #1 0x7ff84cbc4c19 in ?? ??:0
previously allocated by thread T0 here:
    #0 0x4c4d4a in __interceptor_posix_memalign ??:?
    #1 0x7ff84caf9d60 in ?? ??:0
Shadow byte and word:
  0x1fff08e5e790: fd
  0x1fff08e5e790: fd fd fd fd fd fd fd fd
More shadow bytes:
  0x1fff08e5e770: fd fd fd fd fd fd fd fd
  0x1fff08e5e778: fd fd fd fd fd fd fd fd
  0x1fff08e5e780: fa fa fa fa fa fa fa fa
  0x1fff08e5e788: fa fa fa fa fa fa fa fa
=>0x1fff08e5e790: fd fd fd fd fd fd fd fd
  0x1fff08e5e798: fd fd fd fd fd fd fd fd
  0x1fff08e5e7a0: fa fa fa fa fa fa fa fa
  0x1fff08e5e7a8: fa fa fa fa fa fa fa fa
  0x1fff08e5e7b0: fd fd fd fd fd fd fd fd
Stats: 214M malloced (218M for red zones) by 289327 calls
Stats: 30M realloced by 12281 calls
Stats: 174M freed by 171411 calls
Stats: 42M really freed by 46444 calls
Stats: 420M (107600 full pages) mmaped in 105 calls
  mmaps   by size class: 8:212979; 9:40955; 10:8190; 11:10235; 12:2048; 13:2048; 14:1536; 15:256; 16:384; 17:1312; 18:144; 19:40; 20:20;
  mallocs by size class: 8:221603; 9:38387; 10:7646; 11:13555; 12:2265; 13:2004; 14:1530; 15:310; 16:503; 17:1333; 18:137; 19:37; 20:17;
  frees   by size class: 8:122240; 9:28103; 10:4313; 11:10529; 12:1420; 13:1409; 14:1332; 15:256; 16:443; 17:1283; 18:32; 19:36; 20:15;
  rfrees  by size class: 8:33919; 9:5096; 10:1273; 11:4499; 12:374; 13:370; 14:328; 15:83; 16:231; 17:260; 18:6; 19:4; 20:1;
Stats: malloc large: 1524 small slow: 1614
==27925== ABORTING
Component: General → Canvas: WebGL
Product: Firefox → Core
I can't reproduce, with Mesa Git + LLVMpipe and with the NVIDIA driver.

What driver is this with (please paste about:support Graphics)
specifically: valgrind doesn't see any error when running this on Mesa Git + LLVMpipe.
Oh, comment 0 has swrast_dri.so. "swrast" is the antiquated Mesa software renderer, now replaced by softpipe and llvmpipe. I guess we might as well blacklist it at this point. We just dropped the built-in OSMesa support anyway (OSMesa was one of last things to use swrast).
oh... you _are_ on llvmpipe like me. Can you get a stack trace to the use-after-free?
(In reply to Benoit Jacob [:bjacob] (away at Khronos meeting until Thursday) from comment #5)
> oh... you _are_ on llvmpipe like me. Can you get a stack trace to the
> use-after-free?

Sorry i couldn't get a good stack here due to lack of symbolized mesa library and ASAN couldn't unwind up the stack here. Can you try with asanified firefox builds from https://people.mozilla.com/~choller/firefox/asan/ and using a Xvfb screen 1280x1024x24 ? Also, check out the last line in the repro testcase which might be indicating the bug.
That was an issue in ASAN stack unwinding as i suspected. Here is the valgrind stack.

==22580== Invalid read of size 8
==22580==    at 0x4C2D108: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==22580==    by 0x257C48C1: ??? (in /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
==22580==    by 0x257C4CB1: ??? (in /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
==22580==    by 0x257C52F7: ??? (in /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
==22580==    by 0x257C586A: lp_setup_flush (in /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
==22580==    by 0x257BEE09: llvmpipe_flush (in /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
==22580==    by 0x257BEE5D: llvmpipe_finish (in /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
==22580==    by 0x257BEF3A: llvmpipe_flush_resource (in /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
==22580==    by 0x257A8F3A: ??? (in /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
==22580==    by 0x2632A3BD: ??? (in /usr/lib/x86_64-linux-gnu/dri/libgallium.so)
==22580==    by 0x25E67AF5: _mesa_readpixels (in /usr/lib/x86_64-linux-gnu/dri/libdricore.so)
==22580==    by 0x25E687AA: _mesa_ReadnPixelsARB (in /usr/lib/x86_64-linux-gnu/dri/libdricore.so)
==22580==  Address 0x21001e68 is 8 bytes inside a block of size 64 free'd
==22580==    at 0x4C2A82E: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==22580==    by 0x25F07C19: _mesa_free_parameter_list (in /usr/lib/x86_64-linux-gnu/dri/libdricore.so)
==22580==    by 0x25EF7D6F: _mesa_delete_program (in /usr/lib/x86_64-linux-gnu/dri/libdricore.so)
==22580==    by 0x25EF7DFA: _mesa_reference_program_ (in /usr/lib/x86_64-linux-gnu/dri/libdricore.so)
==22580==    by 0x2632153E: ??? (in /usr/lib/x86_64-linux-gnu/dri/libgallium.so)
==22580==    by 0x2631EE6E: st_validate_state (in /usr/lib/x86_64-linux-gnu/dri/libgallium.so)
==22580==    by 0x2632C1FD: ??? (in /usr/lib/x86_64-linux-gnu/dri/libgallium.so)
==22580==    by 0x25E687AA: _mesa_ReadnPixelsARB (in /usr/lib/x86_64-linux-gnu/dri/libdricore.so)
==22580==    by 0x25E687C9: _mesa_ReadPixels (in /usr/lib/x86_64-linux-gnu/dri/libdricore.so)
==22580==    by 0x87E2901: mozilla::gl::GLContext::raw_fReadPixels(int, int, int, int, unsigned int, unsigned int, void*) (GLContext.h:2632)
==22580==    by 0x87E1B9A: mozilla::gl::GLContext::fReadPixels(int, int, int, int, unsigned int, unsigned int, void*) (GLContext.h:1345)
==22580==    by 0x9AE659C: mozilla::gl::GLContext::ReadPixelsIntoImageSurface(gfxImageSurface*) (GLContext.cpp:2050)
Both the valgrind and asan logs seem to indicate a use-after-free, but down in the driver? Is this our bug or someone else's?
Whiteboard: [asan]
Assigning to Benoit while we're here. Reassign as necessary.
Assignee: nobody → bjacob
This sounds like it's a really embarrassing bug wrt setting height to a negative number.
(In reply to Jeff Gilbert [:jgilbert] from comment #10)
> This sounds like it's a really embarrassing bug wrt setting height to a
> negative number.

I do agree that fixing the negative height (last line in repro) should fix the root cause and we don't have to worry about how driver behaves. Even if you guys are not able to reproduce, i can always try a potential patch on my box to make sure that it is fixed (where it is reliably reproducing).
> try { gl.setAttribute("height", "-39"); } catch(e) {}

What is this supposed to do? The WebGL context does not have a "height" attribute. Is this a specific of the wrapGLcontext stuff used here? Does this somehow resize the canvas?

Above I had not placed the file in the right directory -- now that I have, I can reproduce the use-after-free in valgrind:

==3613== Invalid read of size 8
==3613==    at 0x402F198: memcpy@@GLIBC_2.14 (mc_replace_strmem.c:839)
==3613==    by 0x30FCA2ED: try_update_scene_state (lp_setup.c:827)
==3613==    by 0x30FC8CFB: begin_binning (lp_setup.c:197)
==3613==    by 0x30FC8F93: execute_clears (lp_setup.c:262)
==3613==    by 0x30FC90CB: set_scene_state (lp_setup.c:310)
==3613==    by 0x30FC91C5: lp_setup_flush (lp_setup.c:342)
==3613==    by 0x30FBB083: llvmpipe_flush (lp_flush.c:55)
==3613==    by 0x30FBB0FB: llvmpipe_finish (lp_flush.c:89)
==3613==    by 0x30FBB1DB: llvmpipe_flush_resource (lp_flush.c:128)
==3613==    by 0x30F998E3: llvmpipe_get_transfer (lp_texture.c:601)
==3613==    by 0x31130099: pipe_get_transfer (u_inlines.h:389)
==3613==    by 0x3113126B: st_MapRenderbuffer (st_cb_fbo.c:679)
==3613==  Address 0x5cfced8 is 8 bytes inside a block of size 64 free'd
==3613==    at 0x402C4B3: free (vg_replace_malloc.c:444)
==3613==    by 0x31035869: _mesa_align_free (imports.c:181)
==3613==    by 0x3111E3F6: _mesa_free_parameter_list (prog_parameter.c:88)
==3613==    by 0x31117D7E: _mesa_delete_program (program.c:395)
==3613==    by 0x3120651C: st_delete_program (st_cb_program.c:175)
==3613==    by 0x31117F92: _mesa_reference_program_ (program.c:460)
==3613==    by 0x311F84DB: _mesa_reference_program (program.h:102)
==3613==    by 0x311F8598: st_reference_fragprog (st_program.h:259)
==3613==    by 0x311F86C0: update_fp (st_atom_shader.c:88)
==3613==    by 0x311F4765: st_validate_state (st_atom.c:177)
==3613==    by 0x31207389: st_readpixels (st_cb_readpixels.c:52)
==3613==    by 0x310BCA13: _mesa_ReadnPixelsARB (readpix.c:875)
==3613==
Our SetDimensions methods take signed width and height, but the canvas spec says these are unsigned attributes. Using unsigned here would solve the problem, as we would then go on to note that the resulting huge positive value exceeds the max texture size.
...but then, we have a ton of code around here that assumes signed sizes (like gfxIntSize) so it's safer to just add a check here.
Comment on attachment 663089 [details] [diff] [review]
reject negative canvas size

Review of attachment 663089 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/canvas/src/WebGLContext.cpp
@@ +338,5 @@
>  NS_IMETHODIMP
>  WebGLContext::SetDimensions(int32_t width, int32_t height)
>  {
> +    if (width < 0 || height < 0) {
> +        GenerateWarning("Canvas size is too large (seems like a negative value wrapped)");

This seems like a weird way to say that we don't allow negative values, as IIRC, the javascript width/height values are signed. I would rather this just say that we don't allow negative width or height.
Attachment #663089 - Flags: review?(jgilbert) → review+
Comment on attachment 663089 [details] [diff] [review]
reject negative canvas size

Review of attachment 663089 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/canvas/src/WebGLContext.cpp
@@ +339,5 @@
>  WebGLContext::SetDimensions(int32_t width, int32_t height)
>  {
> +    if (width < 0 || height < 0) {
> +        GenerateWarning("Canvas size is too large (seems like a negative value wrapped)");
> +        return NS_ERROR_OUT_OF_MEMORY;

As above, I don't actually think this is the right error to throw, since this should be invalid rather than too-large, unless the WebGL spec (or similar) requires these values to be interpreted as unsigned.
(In reply to Jeff Gilbert [:jgilbert] from comment #17)
> Comment on attachment 663089 [details] [diff] [review]
> reject negative canvas size
> 
> Review of attachment 663089 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: content/canvas/src/WebGLContext.cpp
> @@ +338,5 @@
> >  NS_IMETHODIMP
> >  WebGLContext::SetDimensions(int32_t width, int32_t height)
> >  {
> > +    if (width < 0 || height < 0) {
> > +        GenerateWarning("Canvas size is too large (seems like a negative value wrapped)");
> 
> This seems like a weird way to say that we don't allow negative values, as
> IIRC, the javascript width/height values are signed. I would rather this
> just say that we don't allow negative width or height.

The canvas width/height attributes are actually unsigned:

http://www.whatwg.org/specs/web-apps/current-work/multipage/the-canvas-element.html#the-canvas-element

That's why I felt the need to phrase this in this convoluted way.
(In reply to Jeff Gilbert [:jgilbert] from comment #18)
> Comment on attachment 663089 [details] [diff] [review]
> reject negative canvas size
> 
> Review of attachment 663089 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: content/canvas/src/WebGLContext.cpp
> @@ +339,5 @@
> >  WebGLContext::SetDimensions(int32_t width, int32_t height)
> >  {
> > +    if (width < 0 || height < 0) {
> > +        GenerateWarning("Canvas size is too large (seems like a negative value wrapped)");
> > +        return NS_ERROR_OUT_OF_MEMORY;
> 
> As above, I don't actually think this is the right error to throw, since
> this should be invalid rather than too-large, unless the WebGL spec (or
> similar) requires these values to be interpreted as unsigned.

Assuming we agree that width and height should be handled as unsigned values, I suppose that adresses your comment.

It sucks that we have this discrepancy between the specs mandating unsigned and our c++ funcs wanting signed, but that is quite nontrivial to fix and I wanted a quick fix here. This is nontrivial because we pass along these values to other functions that expect signed values, like ResizeOffscreenFBO. gfxIntSize is signed.
FWIW -- the reason why this hasn't landed yet is that this causes crashes on tryserver. It seems that we are actually doing a resize to a negative size somewhere in aboutSupport.js, and this patch causes the |gl| pointer to be null in this case.
Hi Benoit, since you were reproduce the crash, did you try with your patch on and made sure that test does not crash. I tried it on my local and it still crashes [can't believe since the patch and bug seemed straightforward]. I also blew away my build dir and rechecked with your patch applied, it still crashes.
There is something else going on here. The bug does not reproduce without the setAttribute line, however positive values work as well. i tested with 39, 184 and it crashes the same way.
I hadn't tried --- trying now.

One thing that I found strange in your testcase is that it seemed to set width/height attributes on the WebGL context, but only the canvas (as opposed to the WebGL context) has width/height attributes. But I assumed that there was something hidden in the wrapper you're using around the WebGL context.
OK, the problem still reproduces with this patch.

Also, the crash I had seen on tryserver was unrelated to this patch (was caused by patch on bug 791521).

I'll land the patch anyways as it seems like a useful thing, but this bug stays open.

Has this been reported to Mesa?
No it is not reported to Mesa. Please report it as needed.
Benoit, please request approval if this applies cleanly on branches.
Would be nice to get this fix (which looks completely minimal and safe) into Firerfox 16 and ESR-10. Please request approvals unless you need a different patch on older branches.
Tracking for the upcoming ESR that will ship with FF 16, please submit the mozilla-beta/mozilla-aurora approval nominations with risk assessment for landing this late in beta cycle.
@ comment 31: unfortunately, the bug is still open -- the patch that landed, while nice to have, does not close the issue.
Comment on attachment 663089 [details] [diff] [review]
reject negative canvas size

Sorry for the delayed request.

Important: while nice to have, this patch does not avoid the driver bug here. The bug is still unfixed.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): driver bug
User impact if declined: while this is nice to have, we don't actually know that this protects the user from any bad thing.
Testing completed (on m-c, etc.): m-c since last week
Risk to taking this patch (and alternatives if risky): no risk
String or UUID changes made by this patch: none
Attachment #663089 - Flags: approval-mozilla-aurora?
Ian, this is a sec-critical Mesa bug and we don't know any work-around for it.
The Mesa bug tracker does not seem to allow filing hidden bugs, so let's stay here. CC'ing our friends at Google.
I remember testing this with Chromium (I work in chrome security team) and it didn't crash there. zmo@ would know if we have workarounds for this.
Ah OK! Could you please CC me on the corresponding Chromium security bug if there is one? My Google account is jacob.benoit.1
(In reply to Benoit Jacob [:bjacob] from comment #37)
> Ah OK! Could you please CC me on the corresponding Chromium security bug if
> there is one? My Google account is jacob.benoit.1

I haven't filed a chromium bug yet. Firstly because we were first thinking this is related to the negative height issue which seemed firefox code issue. Now, i tried the testcase again and it still does not look to crash in Chromium, so i dont know if it affects us. Will let zmo@ triage this and then we can file a tracking chromium security bug if needed.
OK, I see.

I suppose that the real test is not whether the browser (or in Chrome's case, the content process or GPU process) crashes, but rather whether Valgrind or ASan can detect an error.
(In reply to Benoit Jacob [:bjacob] from comment #39)
> OK, I see.
> 
> I suppose that the real test is not whether the browser (or in Chrome's
> case, the content process or GPU process) crashes, but rather whether
> Valgrind or ASan can detect an error.

Exactly. By crash i meant crash in an asanified build [i tried both with and without --use-gl=osmesa cmd line flags]. we have a large archive of our builds here - https://commondatastorage.googleapis.com/chromium-browser-asan/index.html :)
Hm, notice that I have only reproduced the error in LLVMpipe, which is a specific Mesa driver. I don't know if it reproduces in OSMesa. What were you using in comment 0?
OK, so here are the line numbers in Mesa Git:

==14095== Invalid read of size 8
==14095==    at 0x402F288: memcpy@@GLIBC_2.14 (mc_replace_strmem.c:877)
==14095==    by 0x2EE75C81: try_update_scene_state (lp_setup.c:827)
==14095==    by 0x2EE74689: begin_binning (lp_setup.c:197)
==14095==    by 0x2EE74921: execute_clears (lp_setup.c:262)
==14095==    by 0x2EE74A59: set_scene_state (lp_setup.c:310)
==14095==    by 0x2EE74B53: lp_setup_flush (lp_setup.c:342)
==14095==    by 0x2EE66707: llvmpipe_flush (lp_flush.c:55)
==14095==    by 0x2EE6674C: llvmpipe_finish (lp_flush.c:82)
==14095==    by 0x2EE66828: llvmpipe_flush_resource (lp_flush.c:121)
==14095==    by 0x2EE45128: llvmpipe_get_transfer (lp_texture.c:611)
==14095==    by 0x2EFE55FB: pipe_get_transfer (u_inlines.h:417)
==14095==    by 0x2EFE6959: st_MapRenderbuffer (st_cb_fbo.c:745)
==14095==  Address 0x19438e78 is 8 bytes inside a block of size 64 free'd
==14095==    at 0x402C563: free (vg_replace_malloc.c:447)
==14095==    by 0x2EEE8FE1: _mesa_align_free (imports.c:179)
==14095==    by 0x2EFD39FA: _mesa_free_parameter_list (prog_parameter.c:88)
==14095==    by 0x2EFCD252: _mesa_delete_program (program.c:392)
==14095==    by 0x2F0BE587: st_delete_program (st_cb_program.c:170)
==14095==    by 0x2EFCD466: _mesa_reference_program_ (program.c:457)
==14095==    by 0x2F0B08B3: _mesa_reference_program (program.h:102)
==14095==    by 0x2F0B0970: st_reference_fragprog (st_program.h:265)
==14095==    by 0x2F0B0AF7: update_fp (st_atom_shader.c:93)
==14095==    by 0x2F0AB72E: st_validate_state (st_atom.c:203)
==14095==    by 0x2F0BF49D: st_readpixels (st_cb_readpixels.c:52)
==14095==    by 0x2EF728C6: _mesa_ReadnPixelsARB (readpix.c:767)

In the call stack to the free() (the second one) we see that a ReadPixels resulted in some program refcounting work (_mesa_reference_program_) which decides that the refcount hit 0 hence deletes the program: program.c:457 is this:

      deleteFlag = ((*ptr)->RefCount == 0);
      /*_glthread_UNLOCK_MUTEX((*ptr)->Mutex);*/

      if (deleteFlag) {
         ASSERT(ctx);
457->    ctx->Driver.DeleteProgram(ctx, *ptr);
      }

It's really strange that readPixels would really want to delete a program object, so I suppose that that is the real bug.
This readPixels call in Mesa is caused by Firefox's software compositor (which is still default on linux) reading back the WebGL canvas (checked in GDB). Confirmed that the testcase setting the height attribute does result in a canvas resizing, which causes Mesa readPixels to be called again and that's when the error occurs.
Attached file testcase updated
Attachment #661996 - Attachment is obsolete: true
In the testcase, calling gl.finish() before setting the height attribute removes the error; so we may be on track to find a work-around.
...nope, tried putting glFinish() before glClear() and before glReadPixels(), the error is still present. no workaround in sight.
Summary: Heap-use-after-free with webgl (negative canvas height) → Heap-use-after-free in Mesa, triggerable by resizing a WebGL canvas
Do you know which program is being deleted? Could you explicitly defer calling glDeleteProgram against it as a workaround?
This program deletion is not apparently the result of a glDeleteProgram call that we issue. Even calling glFinish after every gl call, I still get this error happening in places such as glClear or glReadPixels calls.
Attachment #663089 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(In reply to Benoit Jacob [:bjacob] from comment #27)
> http://hg.mozilla.org/integration/mozilla-inbound/rev/3d81919584ff

Should this be marked reso/fixed ?
As said above (comment 32, comment 33) this patch was only a 'nice to have' patch and does not resolve the issue here. We don't have a fix for this bug as of yet.
Un-assigning from myself to reflect the reality that I've not been working on this. If you think that this is so important that I should work on this instead of other things I'm working on, let me know.
Assignee: bjacob → nobody
If we are going to do more investigation here, here is what could be done:

APItrace Firefox reproducing this, and then replay the trace and check if that reproduces the heap use-after-free. If it does, we then have a testcase independent of Firefox which we can then try to minimize.
So far, it seems that we have only reproduced this on LLVMpipe, and the stack trace found by Valgrind seems very LLVMpipe-specific. So if it comes to blacklisting, we could at least consider blacklisting LLVMpipe only.
This is being tracked for Firefox 17 and seems to be very much at risk since no one is assigned to it. What should we be doing here in the short term?
We have 2 options in the short term:
 - live with the vulnerability (we have no idea at this point of whether it is exploitable AFAIK)
 - or blacklist LLVMpipe. This won't affect too many users.
(In reply to Benoit Jacob [:bjacob] from comment #56)
>  - live with the vulnerability (we have no idea at this point of whether it
We'll be taking this option, wontfixing for 17.
blacklisting llvmpipe is probably fine -- performance would be pretty sucky anyway, wouldn't it be better to simply not offer false hope?
Depends on: 816695
Filed bug 816695 to blocklist the driver.
(In reply to Daniel Veditz [:dveditz] from comment #58)
> blacklisting llvmpipe is probably fine -- performance would be pretty sucky
> anyway, wouldn't it be better to simply not offer false hope?

LLVMpipe is a fast enough software renderer to be useful in some cases (either very fast machine, or not-too-demanding WebGL code).

Before blacklisting let's ask again the Mesa developers CC'd here if they know a work-around for this bug? Somehow, it seems to only be affecting Firefox and not Chrome, which lets me hope that a work-around may exist.
Let me expand a little more on why I am a bit reluctant to blacklist llvmpipe and find it worth my time to work around its bugs:

 - It is the fastest open-source software GL renderer in existence, so it is our only reasonable option if we want to offer a software fallback for WebGL. Note that the Chrome team found it important enough to offer a software fallback for WebGL, that they licensed a commercial GL renderer for that.

 - Being open source AND running entirely on the CPU side, it is particularly easy to discover its bugs, for example it can be run entirely in Valgrind, recompiled with Address Sanitizer, etc, so that the fact that we get more bug reports about it does not necessarily imply that it is buggier than other drivers. It just is the driver that we can understand the best. Blacklisting it risks just papering over the most easy to see of all driver bugs. Part of this argument can be turned around though: as it runs entirely as a library in our process (as opposed to, say, running parts on the GPU), llvmpipe bugs can affect us more severely (e.g. more opportunities to corrupt our heap).

 - It is only used by a small minority of our users at the moment: it is virtually unused outside of desktop Linux (1%) and even there, only used by a small minority of Linux users. so the overall proportion of our users using it is certainly no more than 0.1%. And indeed, if I churn over the crash-data CSV files,

bjacob:~/crash-stats$ zcat 20121128-pub-crashdata.csv.gz | wc -l
405231
bjacob:~/crash-stats$ zcat 20121128-pub-crashdata.csv.gz | grep llvmpipe | wc -l
55
bjacob:~/crash-stats$ zcat 20121128-pub-crashdata.csv.gz | grep llvmpipe | grep Linux | wc -l
55

So we find that only 0.01% of our crash reports are on llvmpipe, and that all of them are on Linux. Then again, that argument can be turned around as evidence that we may as well blacklist it by default and tell the few tech-savvy people using it who care about getting WebGL that they can force-enable it.
Attached file apitrace
This allows to reproduce without using Firefox, like this:

LD_PRELOAD=/hack/mesa/build/linux-x86_64-debug/gallium/targets/libgl-xlib/libGL.so.1
LD_LIBRARY_PATH=/hack/mesa/build/linux-x86_64-debug/gallium/targets/libgl-xlib
valgrind --smc-check=all-non-file ../apitrace/build/glretrace -v
firefox.2.trace

Shows that the issue occurs in glBlitFramebuffer... now I have a sparkle of hope: that suggests we should try without MSAA...
Attached file apitrace without MSAA
Nope.... /o\
this apitrace has MSAA disabled, and the same error still occurs, this time in a MakeCurrent call apparently.
Also tried disabling the share group, because of a warning pointing in that direction... no effect.

I'm leaning in favor of blacklisting (reluctantly!) because for all I can understand, this is looking like a reference-counting bug, and if that is true, this could allow arbitrarily bad memory access bugs. Also, I'm long past the point where I can justify spending this time on this driver professionally :-P Let's give the Mesa devs a couple days and maybe blacklist this coming Monday.
Filed Mesa security bug (which are a thing since today): https://bugs.freedesktop.org/show_bug.cgi?id=57733
(In reply to Benoit Jacob [:bjacob] from comment #64)
> Also tried disabling the share group, because of a warning pointing in that
> direction... no effect.
> 
> I'm leaning in favor of blacklisting (reluctantly!) because for all I can
> understand, this is looking like a reference-counting bug, and if that is
> true, this could allow arbitrarily bad memory access bugs. Also, I'm long
> past the point where I can justify spending this time on this driver
> professionally :-P Let's give the Mesa devs a couple days and maybe
> blacklist this coming Monday.

It seems clear we should blacklist then, unfortunately.
(In reply to Jeff Gilbert [:jgilbert] from comment #67)
> (In reply to Benoit Jacob [:bjacob] from comment #64)
> > Also tried disabling the share group, because of a warning pointing in that
> > direction... no effect.
> > 
> > I'm leaning in favor of blacklisting (reluctantly!) because for all I can
> > understand, this is looking like a reference-counting bug, and if that is
> > true, this could allow arbitrarily bad memory access bugs. Also, I'm long
> > past the point where I can justify spending this time on this driver
> > professionally :-P Let's give the Mesa devs a couple days and maybe
> > blacklist this coming Monday.
> 
> It seems clear we should blacklist then, unfortunately.

For the record, we did last week (as you know --- you reviewed the patch ;-) )
So I did! Shall we mark this fixed then?

Also, it looks like though my response tried to reset all the flags, it didn't do anything. While strange, this is good.
Yes, I believe that we can close this bug: it seems llvmpipe specific, and llvmpipe is now blacklisted.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
I believe we need to ship a blocklist with ESR10 for this to take effect.
Mesa is already blocked altogether in ESR10. See widget/src/xpwidgets/GfxInfoX11.cpp.
Whiteboard: [asan] → [asan][adv-main18+][adv-esr17+]
Alias: CVE-2013-0763
Group: core-security
rforbes-bugspam-for-setting-that-bounty-flag-20130719
Flags: sec-bounty+
Blocks: 899802
You need to log in before you can comment on or make changes to this bug.