Closed Bug 1366446 Opened 2 years ago Closed 2 years ago

AddressSanitizer: heap-use-after-free /home/worker/workspace/build/src/gfx/skia/skia/src/opts/SkBitmapProcState_opts_SSSE3.cpp:291:9 in ProcessTwoPixelPairs<false>

Categories

(Core :: Graphics, defect, P1, critical)

53 Branch
defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox-esr45 --- unaffected
firefox-esr52 --- unaffected
firefox53 --- wontfix
firefox54 + fixed
firefox55 + fixed

People

(Reporter: bc, Assigned: lsalzman)

References

(Blocks 1 open bug, )

Details

(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [post-critsmash-triage][adv-main54+])

Crash Data

Attachments

(4 files, 5 obsolete files)

Attached file asan report
1. http://www.lovedroids.com/
2. AddressSanitizer: heap-use-after-free /home/worker/workspace/build/src/gfx/skia/skia/src/opts/SkBitmapProcState_opts_SSSE3.cpp:291:9 in ProcessTwoPixelPairs<false>

Fedora and Ubuntu. On Windows I see low and high exploitable EXCEPTION_ACCESS_VIOLATION_READ at `anonymous namespace'::S32_generic_D32_filter_DX_SSSE3<int> BitmapProcShaderContext::shadeSpan SkARGB32_Shader_Blitter::blitRect antifilldot8 antifillrect

https://crash-stats.mozilla.com/signature/?_sort=-date&signature=%60anonymous%20namespace%27%27%3A%3AProcessTwoPixelPairs%3CT%3E&date=%3E%3D2017-05-13T05%3A45%3A00.000Z&date=%3C2017-05-20T05%3A45%3A00.000Z which also rates this as low to high exploitable on Windows back to at least 53.
milan: do you know who could take a look at this ?
Flags: needinfo?(milan)
Group: core-security → gfx-core-security
Assignee: nobody → lsalzman
Flags: needinfo?(milan)
The link you gave is a 404? Do we have any other testcase?
Flags: needinfo?(bob)
The site still exists and loads. I got a Assertion failure: mDestroyed, at /mozilla/builds/nightly/mozilla/gfx/layers/composite/ImageLayerComposite.cpp:44  with a build I did late late night fwiw. That was with a proxy located in the scl3 colo. I just loaded it myself via my ISP and it also loads.
Flags: needinfo?(bob)
O_o careful opening that seizure-inducing site, but it appears to open from the west coast (California and BC).
Initial investigation results: This is not actually a bug in the Skia code itself. The entire source surface that is being supplied to the canvas is bogus. The source seems to be an HTMLVideoElement, which by the time we get it from, has already been deleted. It is just unfortunate downwind fallout that the downstream Skia code is the first one trying to access it.
The particulars of this are that we're trying to draw an HTML video element to the canvas. It pulls a layers image temporarily from the HTML video element for this purpose. The problem here is that SurfaceFromElementResult has a RefPtr to this image, and yet its scope is confined only to the branch in which the source surface is queried.

So by the time we get down to the DrawSurface below, layers may have determined that nothing is referencing the image and decide to free it out from under us, creating a nice race condition.

The solution is thus to keep the reference around until we actually need to draw it.

It looks like this bug is thus confined to E10s only. The reference to mLayersImage in SurfaceFromElementResult was added in bug 1221822, so the bug was presumably introduced then. However, that change was in 45, so this affects all released versions. :(
Attachment #8871426 - Flags: review?(jmuizelaar)
Status: NEW → ASSIGNED
Has Regression Range: --- → yes
Has STR: --- → yes
Priority: -- → P1
It appears we have several other potential instances of this bug where the reference to the layers image is discarded despite keeping a reference to the source surface.

https://dxr.mozilla.org/mozilla-central/source/dom/canvas/CanvasRenderingContext2D.cpp?q=CanvasRenderingContext2D.cpp%3A2613&redirect_type=direct#2613

https://dxr.mozilla.org/mozilla-central/source/dom/canvas/ImageBitmap.cpp?q=ImageBitmap.cpp%3A390&redirect_type=direct#390

https://dxr.mozilla.org/mozilla-central/source/dom/canvas/ImageBitmap.cpp?q=ImageBitmap.cpp%3A390&redirect_type=direct#732

There are two ways I can think of to solve these...

1) rely on user data on the source surface to keep the layers image alive
2) force an actual copy of the data instead of pointing into the shared layers data which can go away at any time

Doing it via user data might be another way to solve the original instance reported in this bug if we force all SurfaceFromElement results to set this.
Attached file test_crash.html (obsolete) —
I have a test case that does repro the crash but it does require two slightly NSFW videos (from the site linked above). I tried to repro with other videos but could not.

This test case may need to run for up to five minutes sometimes to trigger the crash (sometimes only a few seconds)
Attached video 1.webm (obsolete) —
Attached file test_case.zip
Attachment #8871513 - Attachment is obsolete: true
Attachment #8871514 - Attachment is obsolete: true
I strongly suspect bug 1364192 is just an instance of this bug, just triggering down in the SSE2 bits due to either differences in CPU hardware or slightly differing canvas setup. Though, I don't know offhand how to reproduce the SSE2 signature, unlike this one. At least the stack trace from those reports matches the pathway through canvas drawImage. So depending, we may just want to dup that bug onto this one.
This is an alternate way of fixing both the original instance of the bug and all other instances of this bug. We ensure the layers Image actually keeps a reference so long as the SourceSurface lives.
Attachment #8871809 - Flags: review?(jmuizelaar)
Comment on attachment 8871809 [details] [diff] [review]
ensure layers Image always lives as long as the SourceSurface it hands out

So this approach can't work as-is because of a reference cycle. The chain of references is:

SharedRGBImage -> TextureClient -> BufferTextureData -> DrawTargetSkia -> SourceSurfaceSkia

So having the SourceSurfaceSkia point back at the TextureClient or the SharedRGBImage means a cycle is created. It would seem superficially that we at least need to point back to the TextureClient so that lifetime management of the BufferTextureData. The TextureClient may deallocate the BufferTextureData if it goes away.

So the strategy of forcing the SourceSurfaceSkia to be a copy in some cases, via some small amount of extra plumbing to SharedRGBImage::GetAsSourceSurface, would be safer without impacting performance for the common case of drawing video to canvas.
Attachment #8871809 - Flags: review?(jmuizelaar) → review-
Upon further investigation, SharedRGBImage::GetAsSourceSurface() won't actually create a reference cycle so long as we reference the TextureClient instead of the SharedRGBImage in the SourceSurface's user data.

DrawTargetSkia only contains a *weak* reference to its associated SourceSurfaceSkia snapshot, while the SourceSurfaceSkia contains a normal reference to the DrawTargetSkia. So the chains of references look more like:

1) SharedRGBImage -> TextureClient -> BufferTextureData -> DrawTargetSkia
2) SharedRGBImage -> SourceSurfaceSkia -> TextureClient

... with us adding that link from SourceSurfaceSkia to TextureClient in 2) via user-data. This fixes the lifetime issue and should thus handle all instances of this bug extent in the canvas 2d code, without need for any spot fixing.
Attachment #8871809 - Attachment is obsolete: true
Attachment #8872077 - Flags: review?(jmuizelaar)
[Tracking Requested - why for this release]:
long-term sec-high, now with a patch and good analysis
Fixes namespace issue with UserDataKey.
Attachment #8872077 - Attachment is obsolete: true
Attachment #8872077 - Flags: review?(jmuizelaar)
Attachment #8872191 - Flags: review?(jmuizelaar)
tracking for 54/55 as sec-high.  wontfix for esr45 since there are no more planned releases from that branch.
Attachment #8871426 - Flags: review?(jmuizelaar) → review+
Attachment #8872191 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8872191 [details] [diff] [review]
ensure layers TextureClient always lives as long as the SourceSurface using it

[Security approval request comment]
> How easily could an exploit be constructed based on the patch?
You would have to rely on the fact that that drawing video frames into a 2D canvas can sometimes write into memory that has been subsequently unmapped, which will in most cases simply crash, but in the worst case, if the memory is remapped by something else in the brief span of time between unmapping and drawing, maybe being able to overwrite that memory with the contents of your video frame. Given that this is a very hard to trigger race to the point it goes mostly unnoticed in release builds, I'm not sure it is very easy, but I am not an experienced blackhat...

> Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
The comments don't label it as a use-after-free, but they do, for posterity and to ensure nobody goes and deletes the code, at least point out that the extra references to the surface, that the canvas is using, are necessary to keep the surface alive while it is in use. Not quite a bulls-eye, but if someone were keen and really really hunting for things that could be turned into a use-after-free, maybe they could figure it out from that clue.
 
> Which older supported branches are affected by this flaw?
This is a preexisting bug that affects everything from 45 on up. However, E10s must be enabled for it to trigger, so a fix is only really necessary for those releases that actually default E10s to on.

> Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Patch seems to apply to 53+. 52 would require some easy fixup due to bit-rot.
How likely is this patch to cause regressions; how much testing does it need?
Attachment #8872191 - Flags: sec-approval?
Comment on attachment 8872191 [details] [diff] [review]
ensure layers TextureClient always lives as long as the SourceSurface using it

Approval Request Comment
[Feature/Bug causing the regression]: Pre-existing condition.
[User impact if declined]: Crashes and possible security issues when drawing video to canvas 2d.
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no 
[List of other uplifts needed for the feature/fix]:
[Is the change risky?]: Not really
[Why is the change risky/not risky?]: Just prevents a surface from getting freed while it is being used. As far as testing runs are concerned, this has not caused any observable leaks, which could be the only potential risk, but as noted, testing did not find any.
[String changes made/needed]: none
Attachment #8872191 - Flags: approval-mozilla-beta?
Attachment #8871426 - Attachment is obsolete: true
NI? release management to see if we have time to take this on Beta and ESR52 as well.
Flags: needinfo?(rkothari)
Flags: needinfo?(lhenry)
We will gtb b13 on Thursday so yes we can take it.
Flags: needinfo?(rkothari)
Comment on attachment 8872191 [details] [diff] [review]
ensure layers TextureClient always lives as long as the SourceSurface using it

Please nominate a patch for ESR52 approval, to be checked in after mozilla-central.
sec-approval+
Attachment #8872191 - Flags: sec-approval?
Attachment #8872191 - Flags: sec-approval+
Attachment #8872191 - Flags: approval-mozilla-beta?
Attachment #8872191 - Flags: approval-mozilla-beta+
Good news. On trying to rebase the patch for 52, it appears that the naughty code in SharedRGBImage doesn't actually exist. It appears that was only added in bug 1332952, which landed in 53. While the potential for the bug existed prior to 53 if SharedRGBImage was used in this new way, no code actually existed to trigger it yet in this way until 53.So we should actually be safe on 52 and below, and thus only need for beta.
https://hg.mozilla.org/mozilla-central/rev/61bea8a3b2db
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Whiteboard: [post-critsmash-triage]
Group: gfx-core-security → core-security-release
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main54+]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.