Closed Bug 814407 Opened 12 years ago Closed 12 years ago

Use after free in Mesa 8.0 after calling texImage2D on texture already attached to FBO in texture-attachment-formats.html test

Categories

(Core :: Graphics, defect)

x86
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla20
Tracking Status
firefox17 --- affected
firefox18 + verified
firefox19 + verified
firefox20 + verified
firefox-esr10 --- unaffected
firefox-esr17 18+ fixed

People

(Reporter: wgianopoulos, Assigned: bjacob)

References

()

Details

(Keywords: sec-critical, Whiteboard: [adv-main18+][adv-esr17+][qa-])

Attachments

(2 files)

The WebGL conformance test now fails under fedora 18 with the latest xorg drivers and trunk Firefox on both NVDIA and AMD/ATI graphics.

I have this issue on 4 different systems crash reports and graphics info in follow-on comments.
bp-0bd24467-e147-4238-9e3a-366762121122

about:support graphics section is:

Adapter DescriptionX.Org -- Gallium 0.4 on AMD RS780
Device IDGallium 0.4 on AMD RS780
Driver Version2.1 Mesa 8.0.4GPU
Accelerated Windows1/1
OpenGLVendor IDX.OrgWebGL
RendererX.Org -- Gallium 0.4 on AMD RS780
AzureCanvasBackend cairo
AzureContentBackend none
AzureFallbackCanvasBackend none
bp-cdf263d1-9fa8-47cd-a17e-b25802121122

about:support graphics section is:

Adapter Descriptionnouveau -- Gallium 0.4 on NVC1
Device IDGallium 0.4 on NVC1
Driver Version2.1 Mesa 8.0.4
GPU Accelerated Windows1/1
OpenGLVendor ID nouveau
WebGL Renderer nouveau -- Gallium 0.4 on NVC1
AzureCanvasBackend cairo
AzureContentBackend none
AzureFallbackCanvasBackend none
bp-f3e08f2a-42ad-4622-8cfb-5c0572121122

about:support graphics section is:

Adapter Description X.Org R300 Project -- Gallium 0.4 on ATI RV410
Device ID Gallium 0.4 on ATI RV410
Driver Version 2.1 Mesa 8.0.4
GPU Accelerated Windows 0/3
Basic Vendor ID X.Org R300 Project
WebGL Renderer X.Org R300 Project -- Gallium 0.4 on ATI RV410
AzureCanvasBackend cairo
AzureContentBackend none
AzureFallbackCanvasBackend none
bp-c2b57357-0706-4162-ba9c-0df4c2121122

about:support graphics section is:

Adapter Description X.Org -- Gallium 0.4 on AMD SUMO
Device ID Gallium 0.4 on AMD SUMO
Driver Version 2.1 Mesa 8.0.4
GPU Accelerated Windows 0/1 Basic Blocked for your graphics card because of unresolved driver issues.
Vendor ID X.Org
WebGL Renderer X.Org -- Gallium 0.4 on AMD SUMO
AzureCanvasBackend cairo
AzureContentBackend none
AzureFallbackCanvasBackend none
With the same hardware, the test completes without crashing under Windows 7.
Oh and BTW I said this before. but having a way to copy paste in a bug readable form jsut the graphics section of about:support in the clipboard so that is can be pasted into a bug sould be jsue do much more helpful.
This page is the test runner, that runs hundreds of conformance tests. As a starting point it would be very useful to know which one crashes.
(In reply to Bill Gianopoulos [:WG9s] from comment #1)
> bp-0bd24467-e147-4238-9e3a-366762121122

segfault at code-segment-looking address deep in radeon driver. The stack looks smashed.

(In reply to Bill Gianopoulos [:WG9s] from comment #2)
> bp-cdf263d1-9fa8-47cd-a17e-b25802121122

null deref in readpixels. Looks like a legitimate bug in our code: in WebGLContext::Readpixels() we apparently should check that data!=null.

(In reply to Bill Gianopoulos [:WG9s] from comment #3)
> bp-f3e08f2a-42ad-4622-8cfb-5c0572121122

same as comment #1

(In reply to Bill Gianopoulos [:WG9s] from comment #4)
> bp-c2b57357-0706-4162-ba9c-0df4c2121122

same as comment #1
Comment on attachment 684430 [details] [diff] [review]
check for null buffer data in readpixels (should never happen, but apparently does)

Review of attachment 684430 [details] [diff] [review]:
-----------------------------------------------------------------

Bizarre.
Attachment #684430 - Flags: review?(jgilbert) → review+
http://hg.mozilla.org/integration/mozilla-inbound/rev/f9a5cb869c1e

Let's actually close this bug; please open a separate bug for the radeon failures; it would greatly help if you could identify the precise crashing test page as said in comment 7.
Assignee: nobody → bjacob
Target Milestone: --- → mozilla20
(In reply to Benoit Jacob [:bjacob] from comment #11)
> http://hg.mozilla.org/integration/mozilla-inbound/rev/f9a5cb869c1e
> 
> Let's actually close this bug; please open a separate bug for the radeon
> failures; it would greatly help if you could identify the precise crashing
> test page as said in comment 7.
Well you know we are supposed to be running this test on all builds. If we actually ran the latest version instead of some old version there would be no question here.  I have no idea why I should have to identify this it clearly fails the WebGL conformance test with a the browser crashes issue.  Why is it yup to me to identify the actual reason?
I was going to include this patch in my personal builds that I make universally available at https://www.wg9s.com/mozilla/firefox/  but because of this odd attitude about this, I an NOT inclined to do so.
(In reply to Bill Gianopoulos [:WG9s] from comment #12)
> Why is it yup to me to identify the actual reason?

I didn't mean to look like I was assigning work to you. I was simply pointing out what would be the most useful thing to do to get this investigated. Otherwise, hopefully someone else has time to look into this.
We can certainly keep this bug open to save you the effort of filing another bug for the AMD issue, if you wish.
Whiteboard: [leave open]
With the patch here applied, the conformance test still crashes with NVIDIA graphics.
I'd be interested in a new crash link then.
If I had a way to produce one on a non-official build ...
I know. Let's wait until it hits Nightly.
(In reply to Benoit Jacob [:bjacob] from comment #17)
> I'd be interested in a new crash link then.

OK I updated the URL link to be the test that actually causes the crash.  It is the exact same test that triggers the crash in the AMD/ATI driver, so the root cause of both might be the same.

New link with today;s nightly which includes your previous fix is:

bp-af7e1011-25c6-47db-b29d-c0e112121123
Ah, so the good thing with bugs that reproduce with two different Mesa Gallium drivers, is they tend to also reproduce with llvmpipe, which means I could reproduce.

But I could only reproduce with Mesa 8.0.3, not with Mesa Git, indicating that the bug is already fixed by Mesa devs. Might not be worth reporting.

C++ stack:

(gdb) bt
#0  0x00007fed287f183d in nanosleep () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007fed287f16dc in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:138
#2  0x00007fed2303f714 in ah_crap_handler (signum=11) at /hack/mozilla-inbound/toolkit/xre/nsSigHandlers.cpp:87
#3  0x00007fed230495e0 in nsProfileLock::FatalSignalHandler (signo=11, info=0x7fffb5431ff0, context=0x7fffb5431ec0)
    at /hack/mozilla-inbound/obj-firefox-debug/toolkit/profile/nsProfileLock.cpp:190
#4  <signal handler called>
#5  0x00007fed2988208d in _mesa_get_format_info (format=1515870810) at src/mesa/main/formats.c:1530
#6  0x00007fed2988217e in _mesa_get_format_bits (format=1515870810, pname=3410) at src/mesa/main/formats.c:1571
#7  0x00007fed29885bfe in renderbuffer_exists (ctx=0x7fecf8086000, fb=0x7fecf80e5800, format=6144, reading=1 '\001')
    at src/mesa/main/framebuffer.c:797
#8  0x00007fed29885d51 in _mesa_source_buffer_exists (ctx=0x7fecf8086000, format=6144) at src/mesa/main/framebuffer.c:848
#9  0x00007fed298c8273 in _mesa_error_check_format_type (ctx=0x7fecf8086000, format=6408, type=5121, drawing=0 '\000')
    at src/mesa/main/readpix.c:706
#10 0x00007fed298c878d in _mesa_ReadnPixelsARB (x=0, y=0, width=2, height=2, format=6408, type=5121, bufSize=2147483647, 
    pixels=0x7fed0f4bd450) at src/mesa/main/readpix.c:817
#11 0x00007fed298c8a68 in _mesa_ReadPixels (x=0, y=0, width=2, height=2, format=6408, type=5121, pixels=0x7fed0f4bd450)
    at src/mesa/main/readpix.c:883
#12 0x00007fed2393653f in mozilla::gl::GLContext::raw_fReadPixels (this=0x7fecf80e3000, x=0, y=0, width=2, height=2, format=6408, 
    type=5121, pixels=0x7fed0f4bd450) at ../../../dist/include/GLContext.h:2640
#13 0x00007fed23935275 in mozilla::gl::GLContext::fReadPixels (this=0x7fecf80e3000, x=0, y=0, width=2, height=2, format=6408, 
    type=5121, pixels=0x7fed0f4bd450) at ../../../dist/include/GLContext.h:1355
#14 0x00007fed23945ec3 in mozilla::WebGLContext::ReadPixels (this=0x7fecfcdcb400, x=0, y=0, width=2, height=2, format=6408, type=5121, 
    pixels=0x7fffb54328d0, rv=...) at /hack/mozilla-inbound/content/canvas/src/WebGLContextGL.cpp:3253
#15 0x00007fed24a88474 in mozilla::dom::WebGLRenderingContextBinding::readPixels (cx=0x7fed04dcaf00, obj=..., self=0x7fecfcdcb400, 
    argc=7, vp=0x7fed10dff3c8) at /hack/mozilla-inbound/obj-firefox-debug/dom/bindings/WebGLRenderingContextBinding.cpp:6749
#16 0x00007fed24a906a4 in mozilla::dom::WebGLRenderingContextBinding::genericMethod (cx=0x7fed04dcaf00, argc=7, vp=0x7fed10dff3c8)
    at /hack/mozilla-inbound/obj-firefox-debug/dom/bindings/WebGLRenderingContextBinding.cpp:9625
#17 0x00007fed25786356 in js::CallJSNative (cx=0x7fed04dcaf00, 
    native=0x7fed24a9056b <mozilla::dom::WebGLRenderingContextBinding::genericMethod(JSContext*, unsigned int, JS::Value*)>, args=...)


JS stack:
(gdb) call DumpJSStack()
0 anonymous() ["https://www.khronos.org/registry/webgl/sdk/tests/conformance/resources/webgl-test-utils.js":694]
    this = [object Window]
1 anonymous() ["https://www.khronos.org/registry/webgl/sdk/tests/conformance/resources/webgl-test-utils.js":735]
    this = [object Window]
2 anonymous() ["https://www.khronos.org/registry/webgl/sdk/tests/conformance/resources/webgl-test-utils.js":748]
    this = [object Object]
3 testFormat() ["https://www.khronos.org/registry/webgl/sdk/tests/conformance/textures/texture-attachment-formats.html":113]
    this = [object Window]
4 <TOP LEVEL> ["https://www.khronos.org/registry/webgl/sdk/tests/conformance/textures/texture-attachment-formats.html":183]
    this = [object Window]
This is a use-after-free inside Mesa. May be security sensitive.

(gdb) frame
#7  0x00007fed29885bfe in renderbuffer_exists (ctx=0x7fecf8086000, fb=0x7fecf80e5800, format=6144, reading=1 '\001')
    at src/mesa/main/framebuffer.c:797
797              ASSERT(_mesa_get_format_bits(readBuf->Format, GL_RED_BITS) > 0 ||
(gdb) p/x *readBuf
$8 = {Mutex = {__data = {__lock = 0x5a5a5a5a, __count = 0x5a5a5a5a, __owner = 0x5a5a5a5a, __nusers = 0x5a5a5a5a, __kind = 0x5a5a5a5a, 
      __spins = 0x5a5a5a5a, __list = {__prev = 0x5a5a5a5a5a5a5a5a, __next = 0x5a5a5a5a5a5a5a5a}}, __size = {0x5a <repeats 40 times>}, 
    __align = 0x5a5a5a5a5a5a5a5a}, ClassID = 0x5a5a5a5a, Name = 0x5a5a5a5a, RefCount = 0x5a5a5a5a, Width = 0x5a5a5a5a, 
  Height = 0x5a5a5a5a, Purgeable = 0x5a, AttachedAnytime = 0x5a, NumSamples = 0x5a, InternalFormat = 0x5a5a5a5a, 
  _BaseFormat = 0x5a5a5a5a, Format = 0x5a5a5a5a, Delete = 0x5a5a5a5a5a5a5a5a, AllocStorage = 0x5a5a5a5a5a5a5a5a}
Group: core-security
Summary: WebGL conformance test crashes on Linux → Use after free in Mesa 8.0 (fixed in Mesa Git) triggered by glReadPixels in texture-attachment-formats.html test
Adding some dump()'s to the test reveals with framebuffer configuration causes the crash:

format = ALPHA
type = UNSIGNED_BYTE
channels = -1,-1,-1,3
bits = 0,0,0,8
mustBeFramebufferComplete = false

So this is a crash with framebuffers having only alpha, not actual color (RGB), in their color buffer.
Removing it, we crash subsequently in

format = RGB
type = UNSIGNED_BYTE
channels = 0,1,2,-1
bits = 8,8,8,0
mustBeFramebufferComplete = false

But we don't crash in

format = RGBA
type = UNSIGNED_BYTE
channels = 0,1,2,3
bits = 8,8,8,8
mustBeFramebufferComplete = true

Could it be that Mesa 8.0 required the color attachment to be RGBA?
Nope, in fact this has nothing to do with the choice of format: it always crashes on the _second_ format test. Which fits well with the fact that it's a use-after-free.
Found a work-around.

The bug happens when a texture is already bound to a framebuffer object, and we re-upload an image to it with texImage2D.

The work-around is to re-attach it to the FBO (framebufferTexture2D) immediately after the texImage2D call.

CC'ing a few Mesa and Chromium people.
Summary: Use after free in Mesa 8.0 (fixed in Mesa Git) triggered by glReadPixels in texture-attachment-formats.html test → Use after free in Mesa 8.0 after calling texImage2D on texture already attached to FBO in texture-attachment-formats.html test
I have also verified that using AMD/ATI graphics, the entire suite runs to completion if I disable this one test.  Using NVDIA graphics I also have to disable textures/texture-size-limit.html (see bug 814716).
This works around the problem for me here on Mesa 8.0.3 llvmpipe.
 * without patch: crash in mesa as described above
 * with patch: no crash. The test runs to completion with a few test failures.

Try builds:
https://tbpl.mozilla.org/?tree=Try&rev=fede02ff2e83
Attachment #684785 - Flags: review?(jgilbert)
My limited understanding is that Use-after-free bugs are sec-critical.
Keywords: sec-critical
(In reply to Benoit Jacob [:bjacob] from comment #29)
> Created attachment 684785 [details] [diff] [review]
> reattach textures when a texture image is replaced

I have verified that with this patch included, this test runs to completion on both nouveau and the Xorg AMD/ATI driver.
Please use ASan builds to run the conformance test suite! 

If this is a use-after-free it is discoverable with an ASan build. GDB will NOT tell you - not without some debugging - that this is a use-after-free bug and such bugs stay then undiscovered.

Instructions: https://developer.mozilla.org/en-US/docs/Thread_Sanitizer 
If you don't want to compile by yourself, download a build at: https://people.mozilla.com/~choller/firefox/asan/
(In reply to Christoph Diehl [:cdiehl] from comment #32)
> Please use ASan builds to run the conformance test suite! 
> 
> If this is a use-after-free it is discoverable with an ASan build. GDB will
> NOT tell you - not without some debugging - that this is a use-after-free
> bug and such bugs stay then undiscovered.

In some cases it will strongly suggest so: in comment 23 we see that we are crashing as we are dereferencing pointers stored in some structure filled with 0x5a bytes. These 0x5a bytes are how jemalloc marks free'd memory in debug builds.
This issue seems to be prestnt in firefos=x 17 relase version so setting affected flags appropriately.
Hmm auto correct strikes again.  Trying to say it is present in Firefox 17 release.  So, I set affected flags as I thought appropriately.
Oh I should mention that my daily builds at http://www.wg9s.com/mozilla/firefox/ include these patches.
Comment on attachment 684785 [details] [diff] [review]
reattach textures when a texture image is replaced

Review of attachment 684785 [details] [diff] [review]:
-----------------------------------------------------------------

Nit.

::: content/canvas/src/WebGLContextGL.cpp
@@ +5308,5 @@
> +        framebuffer = framebuffer->getNext())
> +    {
> +        if (framebuffer->ColorAttachment().Texture() == tex) {
> +            framebuffer->FramebufferTexture2D(
> +              LOCAL_GL_FRAMEBUFFER, LOCAL_GL_COLOR_ATTACHMENT0,

2-space indents in 4-space land. Should probably stick to at least 4-space even for this line-overflow stuff.
Attachment #684785 - Flags: review?(jgilbert) → review+
Comment on attachment 684785 [details] [diff] [review]
reattach textures when a texture image is replaced

[Security approval request comment]
How easily can the security issue be deduced from the patch?
Quite easily.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
No, but the patch itself is pretty explicit about that.

Which older supported branches are affected by this flaw?
Every version since Firefox 6 enabled WebGL on Mesa.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Should be trivial to backport.

How likely is this patch to cause regressions; how much testing does it need?
Not likely to cause a regression. WebGL conformance tests run as part of mochitest-1 should be extensive enough.
Attachment #684785 - Flags: sec-approval?
Comment on attachment 684785 [details] [diff] [review]
reattach textures when a texture image is replaced

Sec-approval+ on or after December 15. Please don't check it in before then and we should have nominated patches for all affected branches.
Attachment #684785 - Flags: sec-approval? → sec-approval+
Comment on attachment 684785 [details] [diff] [review]
reattach textures when a texture image is replaced

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: this is sec:crit
User impact if declined: sec:crit affecting certain linux users
Fix Landed on Version: will land on m-c around Dec 15
Risk to taking this patch (and alternatives if risky): see above sec-approval comment. Low risk.
String or UUID changes made by this patch: none
Attachment #684785 - Flags: approval-mozilla-esr17?
Attachment #684785 - Flags: approval-mozilla-beta?
Attachment #684785 - Flags: approval-mozilla-aurora?
Whiteboard: [leave open] → [leave open][embargoed till 12/15]
(In reply to Al Billings [:abillings] from comment #40)
> Comment on attachment 684785 [details] [diff] [review]
> reattach textures when a texture image is replaced
> 
> Sec-approval+ on or after December 15. Please don't check it in before then


OK. I am therefore beginning tomorrow, no longer including these patches in my builds that I make available for download.  I cannot continue to provide builds that include patches that I don't provide and still be consistent with this being open source.  If this had been for a shorter term, then that would not have been as much of an issue.
Whiteboard: [leave open][embargoed till 12/15] → [leave open]
Correction to earlier comment - we're going to be approving/landing these changes on 12/10 instead of 12/15 to make sure they make it into our fourth beta.
Attachment #684785 - Flags: approval-mozilla-esr17?
Attachment #684785 - Flags: approval-mozilla-esr17+
Attachment #684785 - Flags: approval-mozilla-beta?
Attachment #684785 - Flags: approval-mozilla-beta+
Attachment #684785 - Flags: approval-mozilla-aurora?
Attachment #684785 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/8d394324e01c
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
ESR10 is unaffected because Mesa is blacklisted there.
Whiteboard: [adv-main18+][adv-esr17+]
Keywords: verifyme
Bill, as I wrote in bug 814716, if you have affected configs/drivers, please run the latest trunk/Aurora/beta builds against them to make sure we've addressed the crash(es). Thank you.
Keywords: verifyme
Whiteboard: [adv-main18+][adv-esr17+] → [adv-main18+][adv-esr17+][qa-]
I have verified that this issue is resolved in current trunk, aurora and beta builds.  HTe actual builds I used to perform the test are:

Mozilla/5.0 (X11; Linux i686 on x86_64; rv:21.0) Gecko/20130112 Firefox/21.0 ID:20130112030947
Mozilla/5.0 (X11; Linux i686 on x86_64; rv:20.0) Gecko/20130112 Firefox/20.0 ID:20130112042018
Mozilla/5.0 (X11; Linux i686 on x86_64; rv:19.0) Gecko/20100101 Firefox/19.0 ID:20130109111322
Great work, Bill. Setting to verified for these three branches. Thank you.
Bill: thanks for verifying; it is intentional that the "correct way" patch wasn't backported: it was not necessary to fix the bug, and we try to only take necessary patches on beta.
(In reply to Benoit Jacob [:bjacob] from comment #53)
> Bill: thanks for verifying; it is intentional that the "correct way" patch
> wasn't backported: it was not necessary to fix the bug, and we try to only
> take necessary patches on beta.

Was this comment meant for Bug 814716?
Oh, yes. Too much bugmail.
Isn't this fixed everywhere now?  Does it still need to be hidden?
Group: core-security
Depends on: 880734
Blocks: 978422
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: