Closed Bug 805814 Opened 12 years ago Closed 12 years ago

Mesa heap buffer overflow, exposed by WebGL, mirroring Chrome issue 145525

Categories

(Core :: Graphics: CanvasWebGL, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla20
Tracking Status
firefox17 + wontfix
firefox18 + fixed
firefox19 + fixed
firefox20 + fixed
firefox-esr10 --- unaffected
firefox-esr17 18+ fixed

People

(Reporter: bjacob, Assigned: bjacob)

References

Details

(5 keywords, Whiteboard: [adv-main18+][adv-esr17+])

Attachments

(4 files)

Attached file Testcase
This is http://code.google.com/p/chromium/issues/detail?id=145525 originally reported by miaubiz.


VULNERABILITY DETAILS
heap buffer overflow in gpu process with webgl 

VERSION
Chrome Version: dev
Operating System: 64bit linux

REPRODUCTION CASE
<html>
  <head>
    <script id="vshader" type="x-shader/x-vertex">
      attribute vec4 vPosition;
      void main()
      {
        gl_Position = vPosition;
      }
    </script>

    <script id="fshader" type="x-shader/x-fragment">
      precision mediump float;
      uniform float color[3];
      void main()
      {
        gl_FragColor = vec4(color[0], 0, 0, 0);
      }
    </script>
    <script>
      function shader(gl, program, shaderType, shaderId) {
        var shaderSource = document.getElementById(shaderId).text
        var shader = gl.createShader(shaderType);
        gl.shaderSource(shader, shaderSource);
        gl.compileShader(shader);
        gl.attachShader(program, shader)
      }

      function loadProgram(gl) {
        var program = gl.createProgram();
        shader(gl, program, gl.VERTEX_SHADER, "vshader")
        shader(gl, program, gl.FRAGMENT_SHADER, "fshader")
        gl.linkProgram(program);
        return program;
      }

      var gl = document.createElement('canvas').getContext('experimental-webgl')
      var program = loadProgram(gl)

      var elemLoc = gl.getUniformLocation(program, "color[2]");
      var value = gl.getUniform(program, elemLoc);
    </script>
  </head>
  <body>
  </body>
</html>


FOR CRASHES, PLEASE INCLUDE THE FOLLOWING ADDITIONAL INFORMATION
Type of crash: gpu + asan
Crash State: 

==21671== ERROR: AddressSanitizer heap-buffer-overflow on address 0x7fffe06962b8 at pc 0x55555f2a4226 bp 0x7fffffff7b20 sp 0x7fffffff78e8
WRITE of size 1 at 0x7fffe06962b8 thread T0
    #0 0x55555f2a4225 in __interceptor_memcpy ??:0
    #1 0x7fffe967640b in ?? ??:0
0x7fffe06962b8 is located 4 bytes to the right of 52-byte region [0x7fffe0696280,0x7fffe06962b4)
allocated by thread T0 here:
    #0 0x55555f2a78e1 in __interceptor_calloc ??:0
    #1 0x7fffe74936f7 in ?? ??:0
Group: core-security
Whiteboard: sec-critical
Discussion seems to be still going on on the Chromium bug, but from what I gather the favored approach there is to make additional restrictions on shaders and enforce them in the ANGLE shader compiler.
Comment 11 on the Chromium bug says:

client glx vendor string: Mesa Project and SGI
client glx version string: 1.4
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on llvmpipe (LLVM 0x300)
OpenGL version string: 2.1 Mesa 8.0.2
OpenGL shading language version string: 1.20
And in comment 13 on the Chromium bug, we see that the bug is not specific to one particular driver:

OpenGL vendor string: Tungsten Graphics, Inc
OpenGL renderer string: Mesa DRI Intel(R) Ivybridge Desktop 
OpenGL version string: 2.1 Mesa 8.0.2
OpenGL shading language version string: 1.30
Whiteboard: sec-critical
Benoit can you own this bug? What can we do?
Assignee: nobody → bjacob
Assuming that if this is a driver bug we have to work around then it affects all branches. Please correct if this was unmasked by some recent WebGL code change.
Benoit - what needs to be done here?  We are going to build on final beta today - so we need to know our options asap.
Flags: needinfo?(bjacob)
At this point in the cycle it probably makes sense to just ship with the vulnerability this time around. Also, the Chromium bug shows that they decided to ship with it in Chrome 22.

I need to have a conversation with Chromium devs to understand what is the status on their side. All I can see in their bug is a patch to Mesa that they will ship in Chrome OS, I can't see a fix in desktop Chromium.
Flags: needinfo?(bjacob)
Thank you, let's keep at this for 18 in that case.
there will be no patch for desktop chromium.

"The way we're currently handling many of the Mesa issues is to patch them in Chrome OS (because it is our responsibility to do so) and letting Linux distributions patch their own Mesa libs

In fact, I just took a Mesa update to my Ubuntu 12.04 a couple of days ago. It's a fix for your older issue (the buffer overflow) so hopefully Ubuntu will continue to catch up and patch this OOB read too! "
Thanks miaubiz.

Do you know what the thinking of the Chromium security team is there --- is it that the vulnerability is not considered exploitable, or is it something else?

Also can you confirm that it is known that the bug cannot be worked around by the browser engine?
@bjacob:

I couldn't figure it out. and there's 78 comments on there so copy pasting a large chunk of them is infeasible.
it's clearly exploitable though.  ben...@gmail.com is cc:d since comment 66 so you should have gotten a copy of atleast the more recent comments there.

the number of uniforms the shader will use is definitely not available to the browser engine. if the browser passes the invalid shader off to mesa, then it is too late to react once mesa has compiled the shader. 

the alternative is to make a conservative estimate of the number of uniforms a shader will use, and prevent anything close to the maximum from reaching mesa. the conservative number of allowed uniforms will then end up disallowing valid high end WebGL content from running. this is much more complex than blacklisting the drivers outright until mesa is patched, and the result is presumably not much better.
Ken explained details in today's WebGL conference call.

What's happening is that when some parts of a uniform array are not actually used by the shaders, the driver feels free to optimize them away and not allocate storage for them; but it will still allow addressing them, effectively allowing to address illegal memory in the program's address space.

The solution chosen by the Chrome team, with which I agree, is to implement in the ANGLE shader compiler the packing algorithm mentioned in a non-normative way in the GLSL spec. In this way, shaders will be rejected if they declare very large arrays that should be rejected but aren't rejected by the driver because it takes advantage of the knowledge that large parts of them are unused. From what I gather from the conversation on the Chrome bug, this is considered enough to prevent the present testcase.

The support for doing this check in ANGLE landed in r1317. We must update out ANGLE copy and use that.
Depends on: 816670
ChromeOS has assigned CVE-2012-5129 to the Mesa bug. We'll need a different CVE to cover the ANGLE workaround (though if Google assigns one we can re-use that), but our advisory should mention this is related to CVE-2012-5129.

We're running out of time to get an ANGLE update into Firefox 18, what's the ETA on that?
I didn't realize that we still wanted the ANGLE update in firefox 18. Jeff Gilbert is working on the ANGLE update (bug 801158), but for the beta channel, a wholesale ANGLE update is too risky. So the only thing we can do about this for 18 is to cherry pick the particular ANGLE revision (1317) that fixes this. Let me see if it applies cleanly.
This patch is `svn diff -c 1317` + adding a cpp file to two of our makefiles.
Attachment #688237 - Flags: review?(jgilbert)
This should cause the testcase to be rejected before being passed to the Mesa shader compiler, avoiding the problem.
Attachment #688239 - Flags: review?(jgilbert)
This is not working... needs some debugging.
The new packing restrictions code is actually working beautifully. The problem was that I apparently attached the wrong testcase here. The present bug is about bad things that happen when a shader tries to use too many variables, but the testcase here doesn't do anything like that. Also, I ran it in valgrind against current m-c and 2 different versions and it doesn't cause any V error.

Now looking for the right testcase in Chromium bug...
Attached file repro case
I still can't reproduce this in valgrind, as when I run in valgrind I hit an assertion failure before I get to the point where it would reproduce:

Assertion failure: !thing->compartment()->scheduledForDestruction, at /hack/mozilla-central/js/src/gc/Marking.cpp:108

Will try the ASan builds...
But, good news! Now that we have the right testcase, the program linking does fail with this WebGL warning:

[11:36:11.894] Error: WebGL: linkProgram: a fragment shader used in this program failed to compile, with this log:
ERROR: too many uniforms

 @ file:///home/bjacob/Downloads/zl.html:35

Which is exactly the message that the new ANGLE packing validation generates.

So regardlesss of whether I'm able to reproduce (I'm busy atm and may not find time to), it is guaranteed that with the patch we are safe.
Comment on attachment 688237 [details] [diff] [review]
import angle r1317 (only that revision)

Review of attachment 688237 [details] [diff] [review]:
-----------------------------------------------------------------

Gross, but I guess we need it. We're not going to be able to backport the update to ANGLE's HEAD.

I didn't really review the variable packer, but I can if we need to. R+ otherwise.

::: gfx/angle/src/compiler/VariablePacker.cpp
@@ +104,5 @@
> +        case SH_SAMPLER_2D_RECT_ARB:
> +            return 1;
> +        default:
> +            ASSERT(false);
> +            return 100000;

Seems like an arbitrary choice. I believe this attempts to make it impossible to back the variable, which should trigger the error condition. (Which is what we want)

@@ +191,5 @@
> +
> +bool VariablePacker::CheckVariablesWithinPackingLimits(int maxVectors, const TVariableInfoList& in_variables)
> +{
> +    ASSERT(maxVectors > 0);
> +    maxRows_ = maxVectors;

This is such a weird syntax.

@@ +197,5 @@
> +    bottomNonFullRow_ = maxRows_ - 1;
> +    TVariableInfoList variables(in_variables);
> +
> +    // As per GLSL 1.017 Appendix A, Section 7 variables are packed in specific
> +    // order by type, then by size of array, largest first.

Doing things by the spec. Love it.

::: gfx/angle/tests/compiler_tests/VariablePacker_test.cpp
@@ +1,1 @@
> +//

This file seems like cruft. Eventually, would it be useful to adapt these ANGLE tests into our CPP test stuff?
Attachment #688237 - Flags: review?(jgilbert) → review+
Attachment #688239 - Flags: review?(jgilbert) → review+
Can we get a sec-approval nomination here in preparation for landing on m-c? Would be great to get this landed on branches Monday, in preparation for beta 4.
Comment on attachment 688237 [details] [diff] [review]
import angle r1317 (only that revision)

[Security approval request comment]
How easily can the security issue be deduced from the patch?
Would be very difficult for this patch, though it's possible there are breadcrumbs in comments somewhere upstream in ANGLE.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
No.

Which older supported branches are affected by this flaw?
All branches since Firefox 4.

If not all supported branches, which bug introduced the flaw?
WebGL.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Presumably we'd pull this into Aurora 19 and Beta 18, possibly ESR 17?

How likely is this patch to cause regressions; how much testing does it need?
Unlikely to cause regressions. We have good test coverage.
Attachment #688237 - Flags: sec-approval?
Comment on attachment 688239 [details] [diff] [review]
turn on packing restrictions

[Security approval request comment]
How easily can the security issue be deduced from the patch?
Not easily, but there's a reason we're adding what we're adding. This plugs a hole that's not well known to others, but fairly well known to WebGL people.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
No, just that the patch adds a single line.

Which older supported branches are affected by this flaw?
All.

If not all supported branches, which bug introduced the flaw?

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
It depends on how difficult it is to backport the ANGLE portion. It should be possible, though. Targets are likely Aurora 19, Beta 18, ESR 17.

How likely is this patch to cause regressions; how much testing does it need?
Unlikely to cause regressions given our test coverage.
Attachment #688239 - Flags: sec-approval?
Status: NEW → ASSIGNED
Comment on attachment 688239 [details] [diff] [review]
turn on packing restrictions

sec-approval+.

Please prepare patches for affected branches and nominate them so we can try to get this into 18 and 19 too.
Attachment #688239 - Flags: sec-approval? → sec-approval+
Attachment #688237 - Flags: sec-approval? → sec-approval+
Please land today on mozilla-central and prepare branch patches for landing tomorrow. We'd like this to make it into Firefox 18 Beta 4.
Tracking for ESR 17 that will ship with Firefox 18 - nominate asap.
Comment on attachment 688239 [details] [diff] [review]
turn on packing restrictions

Note this approval request covers the *two* patches here.

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: This is sec-critical
Fix Landed on Version: 20
Risk to taking this patch (and alternatives if risky): Low risk. The custom mozilla change here (the present patch) is tiny and zero-risk. The ANGLE patch (the other patch attached to this bug, also covered by this approval request) looks big, but it was reviewed upstream by ANGLE devs and is already used in Chromium. It's something that we want anyways because it makes sense. It's unlikely that a bug in there would be worse than the sec-critical vulnerability that it's fixing.
String or UUID changes made by this patch: None.
Attachment #688239 - Flags: approval-mozilla-esr17?
Attachment #688239 - Flags: approval-mozilla-beta?
Attachment #688239 - Flags: approval-mozilla-aurora?
Setting the tracking-firefox-esr10: --- → 18+ as esr10 is marked affected as well
Attachment #688239 - Flags: approval-mozilla-esr17?
Attachment #688239 - Flags: approval-mozilla-esr17+
Attachment #688239 - Flags: approval-mozilla-beta?
Attachment #688239 - Flags: approval-mozilla-beta+
Attachment #688239 - Flags: approval-mozilla-aurora?
Attachment #688239 - Flags: approval-mozilla-aurora+
There is no chance that we can backport the fix to ESR10. What I can counter-offer, is blacklisting Mesa on ESR10.
Wait, we have already recently blacklisted WebGL on Mesa on ESR10. I just verified: at GfxInfoX11.cpp:297 we have:

    if (mIsMesa) {
        if (aFeature == nsIGfxInfo::FEATURE_WEBGL_OPENGL) {
            *aStatus = nsIGfxInfo::FEATURE_BLOCKED_DRIVER_VERSION;
            aSuggestedDriverVersion.AssignLiteral("Not Mesa");
        }

So ESR10 is unaffected.
https://hg.mozilla.org/mozilla-central/rev/23eb16f8bf3b
https://hg.mozilla.org/mozilla-central/rev/7e67e86f81d6
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Whiteboard: [adv-main18+][adv-esr17+]
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: