Closed Bug 1366446 Opened 4 years ago Closed 4 years ago
Sanitizer: heap-use-after-free /home/worker/workspace/build/src/gfx/skia/skia/src/opts/Sk Bitmap Proc State _opts _SSSE3 .cpp:291:9 in Process Two Pixel Pairs<false>
32.57 KB, text/plain
685.83 KB, application/x-zip-compressed
34.03 KB, text/plain
2.02 KB, patch
|Details | Diff | Splinter Review|
1. http://www.lovedroids.com/ 2. AddressSanitizer: heap-use-after-free /home/worker/workspace/build/src/gfx/skia/skia/src/opts/SkBitmapProcState_opts_SSSE3.cpp:291:9 in ProcessTwoPixelPairs<false> Fedora and Ubuntu. On Windows I see low and high exploitable EXCEPTION_ACCESS_VIOLATION_READ at `anonymous namespace'::S32_generic_D32_filter_DX_SSSE3<int> BitmapProcShaderContext::shadeSpan SkARGB32_Shader_Blitter::blitRect antifilldot8 antifillrect https://crash-stats.mozilla.com/signature/?_sort=-date&signature=%60anonymous%20namespace%27%27%3A%3AProcessTwoPixelPairs%3CT%3E&date=%3E%3D2017-05-13T05%3A45%3A00.000Z&date=%3C2017-05-20T05%3A45%3A00.000Z which also rates this as low to high exploitable on Windows back to at least 53.
milan: do you know who could take a look at this ?
4 years ago
Assignee: nobody → lsalzman
The link you gave is a 404? Do we have any other testcase?
The site still exists and loads. I got a Assertion failure: mDestroyed, at /mozilla/builds/nightly/mozilla/gfx/layers/composite/ImageLayerComposite.cpp:44 with a build I did late late night fwiw. That was with a proxy located in the scl3 colo. I just loaded it myself via my ISP and it also loads.
O_o careful opening that seizure-inducing site, but it appears to open from the west coast (California and BC).
Initial investigation results: This is not actually a bug in the Skia code itself. The entire source surface that is being supplied to the canvas is bogus. The source seems to be an HTMLVideoElement, which by the time we get it from, has already been deleted. It is just unfortunate downwind fallout that the downstream Skia code is the first one trying to access it.
The particulars of this are that we're trying to draw an HTML video element to the canvas. It pulls a layers image temporarily from the HTML video element for this purpose. The problem here is that SurfaceFromElementResult has a RefPtr to this image, and yet its scope is confined only to the branch in which the source surface is queried. So by the time we get down to the DrawSurface below, layers may have determined that nothing is referencing the image and decide to free it out from under us, creating a nice race condition. The solution is thus to keep the reference around until we actually need to draw it. It looks like this bug is thus confined to E10s only. The reference to mLayersImage in SurfaceFromElementResult was added in bug 1221822, so the bug was presumably introduced then. However, that change was in 45, so this affects all released versions. :(
Attachment #8871426 - Flags: review?(jmuizelaar)
It appears we have several other potential instances of this bug where the reference to the layers image is discarded despite keeping a reference to the source surface. https://dxr.mozilla.org/mozilla-central/source/dom/canvas/CanvasRenderingContext2D.cpp?q=CanvasRenderingContext2D.cpp%3A2613&redirect_type=direct#2613 https://dxr.mozilla.org/mozilla-central/source/dom/canvas/ImageBitmap.cpp?q=ImageBitmap.cpp%3A390&redirect_type=direct#390 https://dxr.mozilla.org/mozilla-central/source/dom/canvas/ImageBitmap.cpp?q=ImageBitmap.cpp%3A390&redirect_type=direct#732 There are two ways I can think of to solve these... 1) rely on user data on the source surface to keep the layers image alive 2) force an actual copy of the data instead of pointing into the shared layers data which can go away at any time Doing it via user data might be another way to solve the original instance reported in this bug if we force all SurfaceFromElement results to set this.
4 years ago
Attachment #8871426 - Flags: feedback+
I have a test case that does repro the crash but it does require two slightly NSFW videos (from the site linked above). I tried to repro with other videos but could not. This test case may need to run for up to five minutes sometimes to trigger the crash (sometimes only a few seconds)
I strongly suspect bug 1364192 is just an instance of this bug, just triggering down in the SSE2 bits due to either differences in CPU hardware or slightly differing canvas setup. Though, I don't know offhand how to reproduce the SSE2 signature, unlike this one. At least the stack trace from those reports matches the pathway through canvas drawImage. So depending, we may just want to dup that bug onto this one.
This is an alternate way of fixing both the original instance of the bug and all other instances of this bug. We ensure the layers Image actually keeps a reference so long as the SourceSurface lives.
Attachment #8871809 - Flags: review?(jmuizelaar)
Comment on attachment 8871809 [details] [diff] [review] ensure layers Image always lives as long as the SourceSurface it hands out So this approach can't work as-is because of a reference cycle. The chain of references is: SharedRGBImage -> TextureClient -> BufferTextureData -> DrawTargetSkia -> SourceSurfaceSkia So having the SourceSurfaceSkia point back at the TextureClient or the SharedRGBImage means a cycle is created. It would seem superficially that we at least need to point back to the TextureClient so that lifetime management of the BufferTextureData. The TextureClient may deallocate the BufferTextureData if it goes away. So the strategy of forcing the SourceSurfaceSkia to be a copy in some cases, via some small amount of extra plumbing to SharedRGBImage::GetAsSourceSurface, would be safer without impacting performance for the common case of drawing video to canvas.
Attachment #8871809 - Flags: review?(jmuizelaar) → review-
I flipped some prefs to get multiple processes with Spider and now get this asan report which closely matches the existing windows crashes. https://crash-stats.mozilla.com/signature/?_sort=-date&signature=%60anonymous%20namespace%27%27%3A%3AS32_generic_D32_filter_DX_SSSE3%3CT%3E&date=%3E%3D2017-05-20T17%3A32%3A00.000Z&date=%3C2017-05-27T17%3A32%3A00.000Z
Upon further investigation, SharedRGBImage::GetAsSourceSurface() won't actually create a reference cycle so long as we reference the TextureClient instead of the SharedRGBImage in the SourceSurface's user data. DrawTargetSkia only contains a *weak* reference to its associated SourceSurfaceSkia snapshot, while the SourceSurfaceSkia contains a normal reference to the DrawTargetSkia. So the chains of references look more like: 1) SharedRGBImage -> TextureClient -> BufferTextureData -> DrawTargetSkia 2) SharedRGBImage -> SourceSurfaceSkia -> TextureClient ... with us adding that link from SourceSurfaceSkia to TextureClient in 2) via user-data. This fixes the lifetime issue and should thus handle all instances of this bug extent in the canvas 2d code, without need for any spot fixing.
[Tracking Requested - why for this release]: long-term sec-high, now with a patch and good analysis
Fixes namespace issue with UserDataKey.
tracking for 54/55 as sec-high. wontfix for esr45 since there are no more planned releases from that branch.
Attachment #8871426 - Flags: review?(jmuizelaar) → review+
Attachment #8872191 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8872191 [details] [diff] [review] ensure layers TextureClient always lives as long as the SourceSurface using it [Security approval request comment] > How easily could an exploit be constructed based on the patch? You would have to rely on the fact that that drawing video frames into a 2D canvas can sometimes write into memory that has been subsequently unmapped, which will in most cases simply crash, but in the worst case, if the memory is remapped by something else in the brief span of time between unmapping and drawing, maybe being able to overwrite that memory with the contents of your video frame. Given that this is a very hard to trigger race to the point it goes mostly unnoticed in release builds, I'm not sure it is very easy, but I am not an experienced blackhat... > Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? The comments don't label it as a use-after-free, but they do, for posterity and to ensure nobody goes and deletes the code, at least point out that the extra references to the surface, that the canvas is using, are necessary to keep the surface alive while it is in use. Not quite a bulls-eye, but if someone were keen and really really hunting for things that could be turned into a use-after-free, maybe they could figure it out from that clue. > Which older supported branches are affected by this flaw? This is a preexisting bug that affects everything from 45 on up. However, E10s must be enabled for it to trigger, so a fix is only really necessary for those releases that actually default E10s to on. > Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Patch seems to apply to 53+. 52 would require some easy fixup due to bit-rot. How likely is this patch to cause regressions; how much testing does it need?
Comment on attachment 8872191 [details] [diff] [review] ensure layers TextureClient always lives as long as the SourceSurface using it Approval Request Comment [Feature/Bug causing the regression]: Pre-existing condition. [User impact if declined]: Crashes and possible security issues when drawing video to canvas 2d. [Is this code covered by automated tests?]: yes [Has the fix been verified in Nightly?]: yes [Needs manual test from QE? If yes, steps to reproduce]: no [List of other uplifts needed for the feature/fix]: [Is the change risky?]: Not really [Why is the change risky/not risky?]: Just prevents a surface from getting freed while it is being used. As far as testing runs are concerned, this has not caused any observable leaks, which could be the only potential risk, but as noted, testing did not find any. [String changes made/needed]: none
Attachment #8871426 - Attachment is obsolete: true
NI? release management to see if we have time to take this on Beta and ESR52 as well.
We will gtb b13 on Thursday so yes we can take it.
Comment on attachment 8872191 [details] [diff] [review] ensure layers TextureClient always lives as long as the SourceSurface using it Please nominate a patch for ESR52 approval, to be checked in after mozilla-central. sec-approval+
Good news. On trying to rebase the patch for 52, it appears that the naughty code in SharedRGBImage doesn't actually exist. It appears that was only added in bug 1332952, which landed in 53. While the potential for the bug existed prior to 53 if SharedRGBImage was used in this new way, no code actually existed to trigger it yet in this way until 53.So we should actually be safe on 52 and below, and thus only need for beta.
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main54+]
You need to log in before you can comment on or make changes to this bug.