Closed Bug 963962 (CVE-2014-1528) Opened 10 years ago Closed 10 years ago

crash in sse2_composite_src_x888_8888

Categories

(Core :: Graphics: CanvasWebGL, defect)

28 Branch
x86
Windows 8.1
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla31
Tracking Status
firefox27 --- unaffected
firefox28 --- wontfix
firefox29 + fixed
firefox30 + fixed
firefox31 + fixed
firefox-esr24 --- unaffected
b2g18 --- unaffected
b2g-v1.1hd --- unaffected
b2g-v1.2 --- unaffected
b2g-v1.3 --- unaffected
b2g-v1.3T --- unaffected
b2g-v1.4 --- unaffected
b2g-v2.0 --- unaffected

People

(Reporter: jujjyl, Assigned: jgilbert)

References

Details

(4 keywords, Whiteboard: [adv-main29+])

Crash Data

Attachments

(3 files, 1 obsolete file)

1. Visit https://dl.dropboxusercontent.com/u/40949268/emcc/bugs/Tests_d_crashes/Tests_d.html

Observed: Web page crashes to desktop. See traces

https://crash-stats.mozilla.com/report/index/2f8f3cc4-f715-4961-835c-190282140125
https://crash-stats.mozilla.com/report/index/67e3f68d-8e07-4aa3-8f2d-171622140125

Reproduced on newest Macbook Pro line laptop running Windows 8.1 in bootcamp. Did not reproduce on a two year old Macbook Air line laptop running Windows 7 in bootcamp.
Jeff, how bad is this?
Hey, I think I might have been running this with the prefer-native-gl pref enabled, I can't remember for certain. That's worth checking out if the bug doesn't reproduce otherwise.
Flags: needinfo?(jmuizelaar)
Jukka, can you confirm this with or without that pref enabled so we can know which it was?
Without being able to reproduce this it's hard to say much.
Flags: needinfo?(jmuizelaar)
Keywords: sec-high
The crashes look exploitable as a best guess, without being able to repro.
Attached file Tests_d.html (obsolete) —
Hi Jukka - could you do us a favor and zip up the test file and all of its dependent files (JS, etc.) and attach them to this bug? The dropbox web site is very slow and sometimes inaccessible, and it would be useful for us to have the files available for us to stage ourselves. Thank you.
Flags: needinfo?(jujjyl)
Hey thanks for poking with the needinfo - I had missed the earlier question, sorry Al!

I tested again on latest Nightly as of today and confirmed that webgl.prefer-native-gl is set to false. The crash still occurs - tested on both Nightly 64-bit and 32-bit on Windows. An offline .zip version is available for download at https://dl.dropboxusercontent.com/u/40949268/emcc/bugs/Tests_d_crashes/Tests_d_crashes.zip
Flags: needinfo?(jujjyl)
On the same system:
  - an old Aurora 27.0a2 (2013-12-01) did not crash.
  - Firefox Stable 27.0.1 does not crash.
but
  - Aurora 29.0a2 (2013-03-04) does crash, and
  - Firefox Beta 28.0 does crash.
Version: 29 Branch → 28 Branch
Marking the affected versions with flags and requesting tracking given that dveditz says it's potentially exploitable in comment #5 and we know it's a 28 regression.
Attached file Tests_d_crashes.zip
Attachment #8382390 - Attachment is obsolete: true
At this stage, we're about to go to final 28 beta and there's not even speculative work here.  As it's a high, not critical, wontfixing for 28.
Went through following issue using build:
- http://ftp.mozilla.org/pub/mozilla.org/firefox/releases/28.0b8/win32/en-US/

Using the proof of concept from comment #0, ran the following tests on actual Windows hardware:

- Lenovo X1 Carbon with Windows 8.1 -> no crashes (Passed)
- Surface Pro 2 with Windows 8.1 -> no crashes (Passed)
At this point, it appears that we probably need to configure a MacBook with dual boot Windows 8.1 in order to repro. This will take some time.
Group: gfx-core-security
Matt, I'll do this sometime this weekend and will add the results.
I installed Windows 8.1 on OSX 10.9.2 using VMware Fusion. Windows 8.1 was fully updated and patched.

Used the following build for the verification process:
- http://ftp.mozilla.org/pub/mozilla.org/firefox/releases/28.0b9/win32/en-US/

- Loaded the website from comment #0 under Windows 8.1 in the OSX VMware Fusion VM without any issues (tried 10 times)
- Loaded the website from comment #0 under a dedicated Windows 8.1 Surface Pro 2 without any issues (tried 10 times)

I'm not sure if this is a problem, but under OSX, the "Blue" square doesn't load at the end of the test. The browser doesn't crash and completes downloading/loading but the blue square isn't appearing the same way it does under a dedicated Win 8.1 machine.
Thanks Kamil. What we need is Windows 8.1 running via BootCamp. One of us will have to bite the bullet and install it. :)
Note that this signature suddenly jumped up really much on Nightly over the weekend, see more reports at https://crash-stats.mozilla.com/report/list?signature=sse2_composite_src_x888_8888

The spike started 2014-03-08 so maybe some checkin on the 7th made us hit this more frequently - or is the spike a different bug?
This is startup crash that has jumped to #4 on Nightly.
Keywords: topcrash-win
Tracy is going to check the per-build stats and build a regression range link for comment 19/20.
Flags: needinfo?(twalker)
Retested this today with latest Nightly, some observations:
  - The tests_d.html crash rate is 100%, ran perhaps ~times.
  - Created a new profile with default settings to exclude any pref, but still crashes.
  - Checked that all add-ons are disabled when testing, does not relate to any of those (had geckoprofiler.xpi and IndexedDB Browser add-ons installed but both disabled while testing)

Also, the tests_d.html contains multiple unit tests, and I tried to track down the crash into a single test, but it looks like if I just run any single test in that suite, the crash does not occur. It is somehow caused by a sequence of multiple unit tests ran in order. I'll play with this more to find a minimal sequence of tests from tests_d.html that produces the crash.
It seems to have happened in this range: 

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b0e7f63c2138&tochange=d01bf8596d3b

Jonathan it may be one of your changes.
Flags: needinfo?(twalker)
Flags: needinfo?(jwatt)
It does seem likely that bug 979853 caused the increase in crashes, but bug 981020 should have fixed the issue there. I'm not seeing any crashes reported on crash-stats.mozilla.com with a build ID later than 2014030803, so hopefully that is the case.
Flags: needinfo?(jwatt)
Comment #24 is probably right, I haven't look at by-buildid at this time, but I trust jwatt did get it correctly. And I see numbers go down in by-crash-date numbers as well.
That's expected though, right? As in it consistently crashed for you before bug 979853 (presumably) caused the spike in crashes.
I've bisected this down to the following commit:

Jukka@GOZILLA /e/mozilla-central
$ hg bisect -b
The first bad revision is:
mchangeset:   152820:175163f053f7
user:        Jeff Gilbert <jgilbert@mozilla.com>
date:        Thu Oct 31 09:52:24 2013 -0400
summary:     Bug 924241 - Don't force-present after post-resize clear. r=bjacob

Does that help?
Stepping through the code, I see that cairo is painting out-of-bounds to the destination buffer in the compositing function. What happens is that the tests_d.html page resizes the canvas during its operation, and the new code added in the above commit ends up being called twice, first to resize from 640x480 to 621x480, and then a second time to resize from 621x480 to 621x449. After that the code goes to call gfxContext::Paint(gfxFloat alpha), which calls into cairo_paint_with_alpha(..), but in the cairo function, the paint rectangle size is still the old 640x480, and in the compositing loop, the a write to dst pointer gives the access violation. _Presumably_ the paint dst pointer is now allocated to smaller size (621x449 < 640x480) and the painting walks out of bounds past the end of the surface.

Perhaps there is some kind of cairo-related surface update/resize that's missing here? Jeff, what do you make of this?
Flags: needinfo?(jgilbert)
I can reproduce this, at least. Let me try to dig into this.
Ok, so this is what it looks like. We map into the CanvasLayer's D3D Texture, which has sizeA. But the frame we got from the producer is of sizeB. We're copying pixels as if sizeB is always equal to sizeA.

Indeed, I trip this assert that I added:

        DataSourceSurface* frameData = shareSurf->GetData();
        MOZ_ASSERT(frameData->GetSize().width == mBounds.width);
        MOZ_ASSERT(frameData->GetSize().height == mBounds.height);
        // Scope for DrawTarget, so it's destroyed before Unmap.

In `mapDt->DrawSurface`, we pass in `drawRect` (the rect we get from `frameData`) as both `aDest` and `aSrc`. We might want to fix this for correctness, but this shouldn't be the main bug.

Our bug should be above this, though, in the Factory::CreateDrawTargetForData line, where we wrap our mapped D3DTexture in a DrawTarget of the size of `frameData`. Oops.
Flags: needinfo?(jgilbert)
We should consider taking something like this (sans printfs) so we catch this earlier. That said, it doesn't help me that might right now, since MSVC doesn't seem to pick up on the fault it generates. I'm guessing this is because we're in the middle of ASM.js stuff, and haven't reset our exception handlers.
Is my theory about ASM.js eating my segfault correct?
Flags: needinfo?(luke)
With just patch 2, the testcase completes without crashing on my '13 retinabook pro running Win8.1, which can reproduce the testcase crashing on trunk.
Attachment #8390921 - Flags: review?(bas) → review+
Attachment #8390914 - Attachment description: patch 1: Try to setfault earlier and more predictably → patch ?: Try to setfault earlier and more predictably
Attachment #8390921 - Attachment description: patch 2: Fix use of CreateDrawTargetForData in CanvasLayerD3D9/10 → patch 1: Fix use of CreateDrawTargetForData in CanvasLayerD3D9/10
(In reply to Jeff Gilbert [:jgilbert] from comment #32)
I'm not sure exactly what you're asking.  The asm.js fault handler only "eats" (that is, safely resumes) faults that (1) come from within asm.js-generated code and (2) occur at known heap offsets, so likely it is not likely to be interacting here.
Flags: needinfo?(luke)
(In reply to Jeff Gilbert [:jgilbert] from comment #33)
> Is my theory about ASM.js eating my segfault correct?

While I was testing, I had the same issue that I could not at first get Visual Studio to break at the segfault. This was because the segfault is caught as a SEH exception in nsCrashOnException.cpp:35. To be able to break at the point where the segfault write occurred, one can 
   1. Attach Visual Studio to the process, and add a breakpoint to nsCrashOnException.cpp:35, inside the MOZ_SEH_EXCEPT block. Although this is a bit after-the-fact already.
   2. Possibly comment out the MOZ_SEH_TRY and MOZ_SEH_EXCEPT blocks in that file:line. I did not try this, but assuming there are no other exception handlers above in the call stack, that should work.
   3. Go to Debug->Exceptions..., and tick all the checkboxes in the "Thrown" column. (I think the actual exception is a Win32 Exception, but does not hurt to add them all)

I think that unconditional SEH try catch is a bit useless for debug builds and it trips up people trying to debug. I was scratching my head on this for a while, assuming at first that there was a C exit() being called somewhere as a response to the error, since Firefox quit so fast after the crash. Perhaps that try catch should be conditionally compiled out in "ac_add_options --enable-debug" builds? Would we ever send out crash reports from such builds?
Whoa, we should definitely not be using SEH try/catch.  It's totally broken on Win64 when JITs are used so, with bug 844196, we'll probably switch breakpad to use "vectored exception handlers" (AddVectoredExceptionHandler) which are run before SEH even starts.  (Right now breakpad uses SetUnhandledExceptionFilter which runs *after* all SEH has been tried.)
Jeff, sorry if I missed some information but is there a reason why the patch has not been checked in + uplifted? Thanks
Flags: needinfo?(jgilbert)
No one has asked for sec-approval to check in, which would need to be done since this is a sec-high. I'm not sure why that hasn't happened yet.
Comment on attachment 8390921 [details] [diff] [review]
patch 1: Fix use of CreateDrawTargetForData in CanvasLayerD3D9/10

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
This should only be a vector, but I think it's pretty easy to use.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
No.

Which older supported branches are affected by this flaw?
29+

If not all supported branches, which bug introduced the flaw?
bug 979853

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Easy, but not yet prepared.

How likely is this patch to cause regressions; how much testing does it need?
Not risky.
Attachment #8390921 - Flags: sec-approval?
Flags: needinfo?(jgilbert)
Assignee: nobody → jgilbert
Comment on attachment 8390921 [details] [diff] [review]
patch 1: Fix use of CreateDrawTargetForData in CanvasLayerD3D9/10

Giving sec-approval+ for trunk. We should get branch patches.

You say this affects 29 and greater but the status flags say this is "won't fix" for 28, which means 28 is affected. Is 28 affected?
Attachment #8390921 - Flags: sec-approval? → sec-approval+
(In reply to Al Billings [:abillings] from comment #42)
> Comment on attachment 8390921 [details] [diff] [review]
> patch 1: Fix use of CreateDrawTargetForData in CanvasLayerD3D9/10
> 
> Giving sec-approval+ for trunk. We should get branch patches.
> 
> You say this affects 29 and greater but the status flags say this is "won't
> fix" for 28, which means 28 is affected. Is 28 affected?

Yeah, my bad. It's affected from 28+.
Attachment #8390921 - Flags: checkin?(ryanvm)
Comment on attachment 8390921 [details] [diff] [review]
patch 1: Fix use of CreateDrawTargetForData in CanvasLayerD3D9/10

https://hg.mozilla.org/integration/mozilla-inbound/rev/0ac3e1179211
Attachment #8390921 - Flags: checkin?(ryanvm) → checkin+
Jeff, could you fill the uplift requests for aurora & beta? Thanks
Flags: needinfo?(jgilbert)
https://hg.mozilla.org/mozilla-central/rev/0ac3e1179211
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
Comment on attachment 8390921 [details] [diff] [review]
patch 1: Fix use of CreateDrawTargetForData in CanvasLayerD3D9/10

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 979853
User impact if declined: sec-high
Testing completed (on m-c, etc.): on m-c 
Risk to taking this patch (and alternatives if risky): very low
String or IDL/UUID changes made by this patch: none
Attachment #8390921 - Flags: approval-mozilla-beta?
Attachment #8390921 - Flags: approval-mozilla-aurora?
Flags: needinfo?(jgilbert)
Attachment #8390921 - Flags: approval-mozilla-beta?
Attachment #8390921 - Flags: approval-mozilla-beta+
Attachment #8390921 - Flags: approval-mozilla-aurora?
Attachment #8390921 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/30c27cdcf400

This needs some branch-specific love on the beta patch.
Flags: needinfo?(jgilbert)
Group: gfx-core-security
Whiteboard: [adv-main29+]
Alias: CVE-2014-1528
Blocks: 979853
Keywords: regression
Flags: in-testsuite?
Group: core-security
Flags: in-testsuite?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: