Last Comment Bug 827106 - (CVE-2013-0796) freeing unallocated address with webgl
(CVE-2013-0796)
: freeing unallocated address with webgl
Status: RESOLVED FIXED
[asan][adv-main20+][adv-esr1705+]
: sec-critical, sec-vector
Product: Core
Classification: Components
Component: Canvas: WebGL (show other bugs)
: Trunk
: x86 Linux
: -- normal (vote)
: mozilla21
Assigned To: Benoit Jacob [:bjacob] (mostly away)
:
Mentors:
Depends on: 801158
Blocks:
  Show dependency treegraph
 
Reported: 2013-01-06 06:37 PST by miaubiz
Modified: 2013-11-25 13:26 PST (History)
17 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
+
wontfix
+
fixed
+
fixed
20+
fixed
20+
unaffected
unaffected


Attachments
asan repro (1.26 KB, text/plain)
2013-01-06 06:37 PST, miaubiz
no flags Details
asan log linux (1.44 KB, text/plain)
2013-01-06 06:37 PST, miaubiz
no flags Details
cherry-pick angle r1638 (34.57 KB, patch)
2013-02-01 11:30 PST, Benoit Jacob [:bjacob] (mostly away)
jgilbert: review+
lukasblakk+bugs: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta-
Details | Diff | Splinter Review
turn on the clamping (1.26 KB, patch)
2013-02-01 11:30 PST, Benoit Jacob [:bjacob] (mostly away)
jgilbert: review+
lukasblakk+bugs: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta-
Details | Diff | Splinter Review

Description miaubiz 2013-01-06 06:37:25 PST
Created attachment 698424 [details]
asan repro

I load this:

<html>
  <head>
    <script id="vshader" type="x-shader/x-vertex">
      void main()
      {
        gl_Position = vec4(0,0,0,0);
      }
    </script>

    <script id="fshader" type="x-shader/x-fragment">
      precision mediump float;
      uniform vec4 uni[1];
      void main()
      {
        vec4 c = vec4(0,0,0,0);
        for (int ii = -11; ii < -10; ++ii) {
          c += uni[ii];
        }
        gl_FragColor = vec4(c.r, c.g, c.b, 0);
      }
    </script>
    <script>
      function shader(gl, program, shaderType, shaderId) {
        var shaderSource = document.getElementById(shaderId).text
        var shader = gl.createShader(shaderType);
        gl.shaderSource(shader, shaderSource);
        gl.compileShader(shader);
        gl.attachShader(program, shader)
      }

      function aProgram(gl) {
        var program = gl.createProgram();
        shader(gl, program, gl.VERTEX_SHADER, "vshader")
        shader(gl, program, gl.FRAGMENT_SHADER, "fshader")
        gl.linkProgram(program);
        gl.useProgram(program);
        return program;
      }
      var canvas = document.createElement('canvas')
      var gl = canvas.getContext('experimental-webgl')
      aProgram(gl)
      gl.drawArrays(gl.TRIANGLES, 0, 1)
    </script>
  </head>
  <body>
  </body>
</html>


and it says this:

ATTENTION: default value of option force_s3tc_enable overridden by environment.
=================================================================
==9645== ERROR: AddressSanitizer attempting free on address which was not malloc()-ed: 0x7fffbcd97780
    #0 0x435a70 in free ??:0
    #1 0x7fffc0d2359c in ?? ??:0
0x7fffbcd97780 is located 127 bytes to the right of 0-byte region [0x7fffbcd97701,0x7fffbcd97701)
freed by thread T0 here:
    #0 0x5a1105ffffffff
    #1 0x5a11063cd9567f
    #2 0x7ffeffffffff
    #3 0xbcd9827f
    #4 0x7fffbcd9767f in  
    #5 0x7ffeffffffff
    #6 0xfffffffe
    #7 0xfffffffd
    #8 0xfffffffc
    #9 0xfffffffb
previously allocated by thread T0 here:
    #0 0x435ba1 in __interceptor_calloc ??:0
    #1 0x7fffc0d23766 in ?? ??:0
Stats: 114M malloced (136M for red zones) by 294525 calls
Stats: 4M realloced by 16786 calls
Stats: 53M freed by 157098 calls
Stats: 0M really freed by 0 calls
Stats: 288M (73768 full pages) mmaped in 66 calls
  mmaps   by size class: 8:262128; 9:24573; 10:12285; 11:8188; 12:3072; 13:1536; 14:768; 15:384; 16:960; 17:128; 18:32; 19:16; 20:8; 21:2; 24:2;
  mallocs by size class: 8:252481; 9:22353; 10:8092; 11:6586; 12:1960; 13:1014; 14:679; 15:272; 16:939; 17:111; 18:20; 19:9; 20:5; 21:2; 24:2;
  frees   by size class: 8:133911; 9:11537; 10:4458; 11:4249; 12:926; 13:770; 14:493; 15:169; 16:487; 17:83; 18:9; 19:3; 20:3;
  rfrees  by size class:
Stats: malloc large: 149 small slow: 1554
==9645== ABORTING


on linux, intel driver
Comment 1 miaubiz 2013-01-06 06:37:50 PST
Created attachment 698425 [details]
asan log linux
Comment 2 Daniel Veditz [:dveditz] 2013-01-09 10:26:07 PST
A stack w/symbols would be helpful...
Comment 3 miaubiz 2013-01-10 10:52:41 PST
chrome is also affected http://crbug.com/169054
Comment 4 Matt Wobensmith [:mwobensmith][:matt:] 2013-01-10 15:47:12 PST
I've tried multiple attempts using ASan builds from 2013-01-06 through 2012-01-10 on Linux and Mac - no crash for me.

If you can still repro, can you set up the crash log so that it includes symbols? 


./firefox-bin 2>&1 | <your_depot>/llvm/projects/compiler-rt/lib/asan/scripts/asan_symbolize.py | c++filt
Comment 5 Matt Wobensmith [:mwobensmith][:matt:] 2013-01-10 15:47:53 PST
^ 2012 --> 2013

Thank you.
Comment 6 miaubiz 2013-01-10 23:54:06 PST
sorry :( the symbolizer, it does nothing. I think it's because it's the graphics driver thing.

original stack: 

==25962== ERROR: AddressSanitizer attempting free on address which was not malloc()-ed: 0x7fffac0f7880
    #0 0x435a70 (/home/user/firefox/firefox+0x435a70)
    #1 0x7fffaf72259c (/usr/lib/x86_64-linux-gnu/dri/libglsl.so+0x3a59c)

which driver are you using on linux?
Comment 7 Matt Wobensmith [:mwobensmith][:matt:] 2013-01-11 15:49:44 PST
What happens when you run the symbolizer? It can take time... in fact, at times it's quite slow, but it does work.

Graphic drivers shouldn't have anything do with it. FWIW, I use a VMware image and various flavors of Linux.

The main thing is that in order to assess this bug, we need to reproduce it and/or we need a better stack trace. Any additional help you can provide would be very much appreciated.
Comment 8 miaubiz 2013-01-14 11:55:43 PST
Matt: sorry for taking a long time to get back to you. it only crashes for me with intel driver on firefox. on other drivers it says it can't get a webgl context. are you able to get a webgl context with your drivers?

here is unsymbolized output:

=================================================================
==32090== ERROR: AddressSanitizer attempting free on address which was not malloc()-ed: 0x7fffba4f3e80
    #0 0x4359e0 (/home/user/firefox/firefox+0x4359e0)
    #1 0x7fffb01b159c (/usr/lib/x86_64-linux-gnu/dri/libglsl.so+0x3a59c)
0x7fffba4f3e80 is located 127 bytes to the right of 0-byte region [0x7fffba4f3e01,0x7fffba4f3e01)
freed by thread T0 here:
    #0 0x5a1105ffffffff
    #1 0x5a11063a4f2d7f
    #2 0x7ffeffffffff
    #3 0xb5eef47f
    #4 0x7fffb5eeea7f (+0xaf2a7f)
    #5 0x7ffeffffffff
    #6 0xfffffffe
    #7 0xfffffffd
    #8 0xfffffffc
    #9 0xfffffffb
previously allocated by thread T0 here:
    #0 0x7ffe
    #1 0x301b9765
Stats: 246M malloced (251M for red zones) by 330120 calls
Stats: 34M realloced by 19901 calls
Stats: 180M freed by 187855 calls
Stats: 47M really freed by 89085 calls
Stats: 456M (116818 full pages) mmaped in 108 calls
  mmaps   by size class: 8:212979; 9:32764; 10:12285; 11:12282; 12:3072; 13:1536; 14:1280; 15:384; 16:704; 17:1280; 18:48; 19:40; 20:24; 21:2; 24:2; 
  mallocs by size class: 8:264109; 9:35211; 10:9332; 11:12709; 12:2571; 13:1732; 14:1594; 15:365; 16:1056; 17:1331; 18:41; 19:42; 20:23; 21:2; 24:2; 
  frees   by size class: 8:142772; 9:23355; 10:5225; 11:10236; 12:1412; 13:1239; 14:1408; 15:250; 16:578; 17:1296; 18:28; 19:36; 20:20; 
  rfrees  by size class: 8:72409; 9:6632; 10:1854; 11:6209; 12:513; 13:444; 14:340; 15:113; 16:369; 17:191; 18:6; 19:4; 20:1; 
Stats: malloc large: 1441 small slow: 2014
==32090== ABORTING

here is symbolized output:

=================================================================
==32090== ERROR: AddressSanitizer attempting free on address which was not malloc()-ed: 0x7fffba4f3e80
    #0 0x4359e0 in free ??:0
    #1 0x7fffb01b159c in ?? ??:0
0x7fffba4f3e80 is located 127 bytes to the right of 0-byte region [0x7fffba4f3e01,0x7fffba4f3e01)
freed by thread T0 here:
    #0 0x5a1105ffffffff
    #1 0x5a11063a4f2d7f
    #2 0x7ffeffffffff
    #3 0xb5eef47f
addr2line: '': No such file
    #4 0x7fffb5eeea7f in  
    #5 0x7ffeffffffff
    #6 0xfffffffe
    #7 0xfffffffd
    #8 0xfffffffc
    #9 0xfffffffb
previously allocated by thread T0 here:
    #0 0x7ffe
    #1 0x301b9765
Stats: 246M malloced (251M for red zones) by 330120 calls
Stats: 34M realloced by 19901 calls
Stats: 180M freed by 187855 calls
Stats: 47M really freed by 89085 calls
Stats: 456M (116818 full pages) mmaped in 108 calls
  mmaps   by size class: 8:212979; 9:32764; 10:12285; 11:12282; 12:3072; 13:1536; 14:1280; 15:384; 16:704; 17:1280; 18:48; 19:40; 20:24; 21:2; 24:2;
  mallocs by size class: 8:264109; 9:35211; 10:9332; 11:12709; 12:2571; 13:1732; 14:1594; 15:365; 16:1056; 17:1331; 18:41; 19:42; 20:23; 21:2; 24:2;
  frees   by size class: 8:142772; 9:23355; 10:5225; 11:10236; 12:1412; 13:1239; 14:1408; 15:250; 16:578; 17:1296; 18:28; 19:36; 20:20;
  rfrees  by size class: 8:72409; 9:6632; 10:1854; 11:6209; 12:513; 13:444; 14:340; 15:113; 16:369; 17:191; 18:6; 19:4; 20:1;
Stats: malloc large: 1441 small slow: 2014
==32090== ABORTING


fwiw, here is the chromium fix:

https://codereview.chromium.org/11884007/
Comment 9 Milan Sreckovic [:milan] 2013-01-14 12:59:54 PST
The chromium fix is to add SH_CLAMP_INDIRECT_ARRAY_BOUNDS to compile options.  What does that do?
Comment 10 Benoit Jacob [:bjacob] (mostly away) 2013-01-14 13:13:47 PST
It's a new shader compiler option that causes the compiler to inject code into the shader to ensure that no out of bounds accesses can be made into arrays. It's something we've always known we needed in the WebGL WG so I'm really glad that it's happening. We should do the same asap regardless of whether it avoids the present bug.
Comment 11 Benoit Jacob [:bjacob] (mostly away) 2013-01-14 13:22:14 PST
https://code.google.com/p/chromium/issues/detail?id=169054#c16 thinks that this won't actually fix the present bugs, and Chrome devs now seem to be favoring fixing this in Mesa (which they ship in Chrome OS). In that case we'll have to look for a fix on our own, or blacklist.
Comment 12 Benoit Jacob [:bjacob] (mostly away) 2013-01-14 13:27:01 PST
Alternatively, since this is a bad free(), and we ship our own memory allocator which is being used there, have people already talked about just hardening our memory allocator (jemalloc) against bad free()'s? Longer-term project, of course.
Comment 13 Milan Sreckovic [:milan] 2013-01-14 14:14:08 PST
Benoit, can you help make an assessment if this is critical/high/moderate security issue?
Comment 14 Benoit Jacob [:bjacob] (mostly away) 2013-01-14 14:15:28 PST
Unless there is something specific that I don't know about our memory allocator (jemalloc) that makes it sturdy in the face of bad free()'s,  all bad free()'s should be considered sec-critical, since they can corrupt the internal structures of the allocator.
Comment 15 Benoit Jacob [:bjacob] (mostly away) 2013-01-15 10:56:08 PST
Mesa bug: https://bugs.freedesktop.org/show_bug.cgi?id=59429
Comment 16 David Bolter [:davidb] 2013-01-17 13:58:00 PST
Milan can you own this for now - it looks like you are helping figure out traction.
Comment 17 Benoit Jacob [:bjacob] (mostly away) 2013-01-23 11:22:08 PST
Based on https://code.google.com/p/chromium/issues/detail?id=169054#c37 it actually seems like the ANGLE array index clamping (comment 9) would actually fix it.

So let's do that; but that probably depends on the ANGLE update landing first (this is a recent ANGLE change, it might or might not apply reasonably on the old ANGLE we're still using).
Comment 18 Milan Sreckovic [:milan] 2013-01-23 11:35:01 PST
(In reply to Benoit Jacob [:bjacob] from comment #17)
> Based on https://code.google.com/p/chromium/issues/detail?id=169054#c37 it
> actually seems like the ANGLE array index clamping (comment 9) would
> actually fix it.
> 
> So let's do that; but that probably depends on the ANGLE update landing
> first (this is a recent ANGLE change, it might or might not apply reasonably
> on the old ANGLE we're still using).

I always thought I had a talent for suggesting an answer even when I don't understand the question :-)  Just to be clear, we will wait for the new ANGLE to land, then try a change - but there is still a change that needs to be made.  In other words, this depends on 801158, but is not a duplicate of it.  Just checking.
Comment 19 Benoit Jacob [:bjacob] (mostly away) 2013-01-23 12:55:49 PST
Correct.
Comment 20 Daniel Veditz [:dveditz] 2013-01-24 13:19:37 PST
This sounds like something that's been in there rather than in recent code. Am I safe in assuming we'll need this on the ESR-17 branch and possibly b2g18?
Comment 21 Daniel Veditz [:dveditz] 2013-01-31 13:29:21 PST
So the angle update has happened according to bug 801158. Is this bug fixed?
Comment 22 Daniel Veditz [:dveditz] 2013-01-31 16:25:53 PST
Do we use the Mesa driver on b2g? Would it be different on ARM even if we did?

I'm concerned that the ANGLE update in bug 801158 is huge -- could we safely take that on the ESR branch? is there a smaller band-aide we could cherry-pick?
Comment 23 Benoit Jacob [:bjacob] (mostly away) 2013-02-01 09:37:49 PST
(In reply to Daniel Veditz [:dveditz] from comment #22)
> Do we use the Mesa driver on b2g? Would it be different on ARM even if we
> did?

We don't. On B2G we use the regular Adreno Android driver.

> 
> I'm concerned that the ANGLE update in bug 801158 is huge -- could we safely
> take that on the ESR branch? is there a smaller band-aide we could
> cherry-pick?

There might be a possibility of a small cherry-pick. I haven't investigated that. To be clear, I've not been working on the present bug since I didn't realize I was the assignee (since 3 days ago). Will look into that.
Comment 24 Benoit Jacob [:bjacob] (mostly away) 2013-02-01 10:28:27 PST
(In reply to Daniel Veditz [:dveditz] from comment #21)
> So the angle update has happened according to bug 801158. Is this bug fixed?

The ANGLE update only brought the option to fix this bug by enabling a new shader compilation option. We have to do that now.
Comment 25 Benoit Jacob [:bjacob] (mostly away) 2013-02-01 10:40:39 PST
The ANGLE revision that brings initial support is r1638:
https://code.google.com/p/angleproject/source/detail?r=1638

It landed on *December 21*.

To have this working on Windows we also need r1719, r1733, r1734, but that's irrelevant to the present security bug discussion which is on Linux.

Unfortunately the ANGLE revision that landed last week is an older revision, r1561, from *December 6*.

Now that sucks: we need another ANGLE update, even on central, or we need to cherry-pick even on central.
Comment 26 Benoit Jacob [:bjacob] (mostly away) 2013-02-01 10:42:21 PST
So let's cherry-pick only r1638 and turn on the array clamping only on linux for now. Next, let's file follow-up bugs to update ANGLE, and enable array clamping everywhere.
Comment 27 Benoit Jacob [:bjacob] (mostly away) 2013-02-01 11:30:04 PST
Created attachment 709160 [details] [diff] [review]
cherry-pick angle r1638
Comment 28 Benoit Jacob [:bjacob] (mostly away) 2013-02-01 11:30:43 PST
Created attachment 709161 [details] [diff] [review]
turn on the clamping
Comment 29 Benoit Jacob [:bjacob] (mostly away) 2013-02-01 11:42:22 PST
(In reply to Benoit Jacob [:bjacob] from comment #26)
> Next, let's file follow-up bugs to update ANGLE

Bug 837214

> and enable array clamping everywhere.

Bug 837213
Comment 30 Benoit Jacob [:bjacob] (mostly away) 2013-02-04 16:39:53 PST
Review/triaging ping: Chromium people on https://code.google.com/p/chromium/issues/detail?id=169054 seem particularly interested in making sure that they have this fixed in time for "Pwnium 3" --- I assume that's a security/exploit contest? Anyway, just a heads up --- other people are taking this one very seriously.
Comment 31 Benoit Jacob [:bjacob] (mostly away) 2013-02-04 17:29:35 PST
awsum thx
https://tbpl.mozilla.org/?tree=Try&rev=ae554b1f2dae
Comment 32 Benoit Jacob [:bjacob] (mostly away) 2013-02-05 08:23:36 PST
had forgotten hg add:
https://tbpl.mozilla.org/?tree=Try&rev=872000faade4
Comment 33 Benoit Jacob [:bjacob] (mostly away) 2013-02-05 14:04:57 PST
Causing the shader compiler on Mac OSX 10.6 to crash :-(
https://tbpl.mozilla.org/php/getParsedLog.php?id=19457145&tree=Try#error0

For the present bug let's make the fix linux-only. We'll figure Mac later (bug 837213).
Comment 34 Benoit Jacob [:bjacob] (mostly away) 2013-02-05 14:08:33 PST
https://tbpl.mozilla.org/?tree=Try&rev=06d6174495bb
Comment 35 Benoit Jacob [:bjacob] (mostly away) 2013-02-05 17:50:05 PST
The errors on the last push look like random **** that didn't happen on the previous try push and couldn't be caused by the diff between them; ignoring and pushing.
Comment 37 Benoit Jacob [:bjacob] (mostly away) 2013-02-05 18:00:31 PST
Comment on attachment 709160 [details] [diff] [review]
cherry-pick angle r1638

[Approval Request Comment]
Bug caused by (feature/regressing bug #): ever since we ran WebGL on Linux Intel driver
User impact if declined: sec-crit for Linux + Intel GPU users
Testing completed (on m-c, etc.): m-i
Risk to taking this patch (and alternatives if risky): Heh, that is actually a fairly big ANGLE patch. So it could regress anything in ANGLE. Fortunately we have substantial tests for that (in the WebGL mochitests in M1) but what we can't know is how this will interact with drivers. The above-mentioned crash on Mac OSX 10.6 gives a good example of what could go wrong --- as this features rewrites WebGL shaders, it can trigger existing bugs in the drivers' shader compilers that we didn't hit before. On the other hand, in the second patch here we are not enabling this on Windows and not enabling this on Mac. So the riskiness here is mostly concentrated on desktop Linux and Android, and we know that this patch fixes a very big bug for many Linux users.
String or UUID changes made by this patch: none
Comment 38 Benoit Jacob [:bjacob] (mostly away) 2013-02-05 18:00:59 PST
Comment on attachment 709161 [details] [diff] [review]
turn on the clamping

[Approval Request Comment]
See previous comment.
Comment 39 Benoit Jacob [:bjacob] (mostly away) 2013-02-05 18:04:21 PST
For ESR17, I believe that we should just blacklist Mesa as we did in ESR10. Filed bug 838413.
Comment 41 Lukas Blakk [:lsblakk] use ?needinfo 2013-02-06 12:29:46 PST
I don't see us taking such risks on beta so will only approve for aurora uplift at this time but am nominating for 19? tracking so we can talk about it in triage.
Comment 42 Alex Keybl [:akeybl] 2013-02-07 16:56:22 PST
This fix is too large for the FF19 timeframe. Beta 6 (going to build on Monday) is our final beta. If we were super concerned about this, we'd take bug 838413 on FF19, but we don't think that's necessary.
Comment 43 miaubiz 2013-02-11 03:50:08 PST
@dveditz: sorry, just realized there is a needinfo from me on this. I can't confirm this either way atm. sorry :|
Comment 44 Benoit Jacob [:bjacob] (mostly away) 2013-02-11 09:18:50 PST
Erm, the ANGLE cherry-pick does not apply cleanly at all on aurora --- I'm getting conflicts in 10+ files.

Given that, I propose abandoning the aurora nom. Acceptable?
Comment 45 Benoit Jacob [:bjacob] (mostly away) 2013-02-11 09:20:03 PST
Ah wait, it might not be so bad --- trying a backport now.
Comment 46 Benoit Jacob [:bjacob] (mostly away) 2013-02-11 11:00:58 PST
https://hg.mozilla.org/releases/mozilla-aurora/rev/9acedf5c6da5
https://hg.mozilla.org/releases/mozilla-aurora/rev/d2f68c7e8153

Seems to be working so far...
Comment 47 Lukas Blakk [:lsblakk] use ?needinfo 2013-02-13 11:04:50 PST
Batch edit: Bugs still affected on b2g18 after 2/13 merge to v1.0.1 branch are affected on v1.0.1 branch.
Comment 48 Benoit Jacob [:bjacob] (mostly away) 2013-03-04 06:40:46 PST
Fixed on ESR17 by blacklisting Mesa there, see bug 838413.

B2G is unaffected as this was a Mesa (GL driver) bug i.e. desktop Linux.

Note You need to log in before you can comment on or make changes to this bug.