Closed
Bug 1119019
Opened 9 years ago
Closed 9 years ago
crash in mozilla::layers::SharedSurfaceTextureHost::Lock()
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
RESOLVED
FIXED
mozilla38
Tracking | Status | |
---|---|---|
firefox36 | + | fixed |
firefox37 | + | fixed |
firefox38 | + | fixed |
firefox-esr31 | --- | unaffected |
b2g-v1.4 | --- | unaffected |
b2g-v2.0 | --- | unaffected |
b2g-v2.0M | --- | unaffected |
b2g-v2.1 | --- | unaffected |
b2g-v2.1S | --- | unaffected |
b2g-v2.2 | --- | unaffected |
b2g-master | --- | unaffected |
People
(Reporter: jesup, Assigned: nical)
References
()
Details
(Keywords: crash, csectype-uaf, sec-critical, Whiteboard: [adv-main36+] gfx-noted)
Crash Data
Attachments
(4 files)
15.95 KB,
text/plain
|
Details | |
4.48 KB,
patch
|
sotaro
:
review+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
3.93 KB,
patch
|
abillings
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
3.93 KB,
patch
|
abillings
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-adf37f72-5201-47ea-ada3-087122141231. ============================================================= Highly repeatable at http://diskdetective.org Hit the icon that looks like an open book, scroll around in the help (and other techniques as well). The addresses from a crashreporter search aren't good.... Milan; please find the right people
This is use after free, right? Kats, can you reproduce this on any of the devices you have?
Reporter | ||
Comment 2•9 years ago
|
||
I hit it tons (hard to use the help popups on that page or scroll around) on my Nexus 10 (Android 5.0.1)
Updated•9 years ago
|
Whiteboard: gfx-noted
Comment 3•9 years ago
|
||
I hit https://crash-stats.mozilla.com/report/index/7659a782-627b-40fb-b635-25c862150109 using the STR in comment 0 on aurora. Will try on a local build to debug.
Comment 4•9 years ago
|
||
I ran a local debug build with gdb attached, and using the STR ended up triggering the MOZ_ASSERT at http://mxr.mozilla.org/mozilla-central/source/gfx/gl/SharedSurface.h?rev=ccd27160dd97#106 which caused Fennec to crash. The assert was triggered on the compositor thread, so grabbed the backtraces from both the compositor thread and the Gecko thread, they are attached. I'm not sure if this is enough to debug what's going on but I can get more info if needed since I seem to be able to reliably repro.
Comment 5•9 years ago
|
||
Guessing nical is the person to look at this...
Flags: needinfo?(nical.bugzilla)
Assignee | ||
Comment 6•9 years ago
|
||
The SharedSurface is being used after it is freed. Looks like it might be the same issue as bug 1098227 and duplicates btw. Luckily Kats's stack trace has more helpful info than the stacks in the related bugs.
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → nical.bugzilla
Flags: needinfo?(nical.bugzilla)
Assignee | ||
Comment 7•9 years ago
|
||
What seems to be happening is that the SharedSurface object is deleted on the main thread ~ShSurfHandle which happens in CanvasClient::Clear, without any form of synchronization with the TextureClient/Host memory management protocol. Normally TextureClient is supposed to fully own its underlying data (the SharedSurface in our case) but it's not the case for the SharedSurfaceTextureClient/Host. The right way to fix this would be to have the SharedSurfaceTextureClient own the SharedSurface, but in that case, if the SharedSurface needs to be deleted on the main thread, the only way to ensure that the deallocation happens race-free is to make it synchronous (TextureClient synchronously waits for the host to be deallocated) which is sad and slow), or to resurect the logic we used to have where the data would be deleted along with the TextureChild. If the SharedSurface can be deleted on the compositor thread, things are simpler and we can follow the same deallocation protocol as with most TextureClient types (but I have a feeling it can't). Otherwise we can hack around it and have a special ForceSynchronousDeallocation() which would only be called when the CanvasClient is cleared if the TextureClient is a SharedSurfaceTextureClient. Paying the cost of Synchronous deallocation is acceptable if it happens only on "rare" occasions like when tearing down the layer (vs every frame).
Updated•9 years ago
|
Keywords: csectype-uaf,
sec-critical
Comment 8•9 years ago
|
||
Assuming this affects 36+
Reporter | ||
Comment 9•9 years ago
|
||
Sec-crit with understanding of how to resolve it - is there a plan/eta for resolving it?
Comment 10•9 years ago
|
||
I think we should patch this by having the ShSurfTexClient hold a ShSurfHandle, and make it destruct synchronously. This would solve the immediate problem, and we can have room to breathe before working on a permanent solution. We want to get rid of the ShSurfClient/Host pair, but the TexClients we use won't own their own data until we supplant ShSurf with TexClient.
Assignee | ||
Comment 11•9 years ago
|
||
This should prevent the SharedSurface to be deallocated before the TextureHost since the latter is synchronously deallocated before the ShSurfHandle. This patch attempts to only do synchronous deallocation when clearing CanvasClient since the race is unlikely to happen anywhere else (unless the client side can overproduce more than one frame in advance but I think that Matt Woodrow made it impossible, I can't remember in which bug). If this is not enough we can make the destruction of all TextureClients used by CanvasClientSurfaceStream synchronous, but performance will suffer.
Attachment #8553088 -
Flags: review?(sotaro.ikeda.g)
Comment on attachment 8553088 [details] [diff] [review] Synchronously deallocate the last TextureClient when destroying CanvasClientSurfaceStream Review of attachment 8553088 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/layers/composite/TextureHost.h @@ +551,5 @@ > PTextureParent* mActor; > TextureFlags mFlags; > int mCompositableCount; > + // whether client is reponsible for deallocating the shared data > + bool mClientDeallocation; Either TextureHost.h shouldn't be part of this patch or TextureHost.cpp should be?
Assignee | ||
Comment 13•9 years ago
|
||
Comment on attachment 8553088 [details] [diff] [review] Synchronously deallocate the last TextureClient when destroying CanvasClientSurfaceStream Review of attachment 8553088 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/layers/composite/TextureHost.h @@ +551,5 @@ > PTextureParent* mActor; > TextureFlags mFlags; > int mCompositableCount; > + // whether client is reponsible for deallocating the shared data > + bool mClientDeallocation; Oops, this is a leftover from a different attempt at fixing the issue. Please ignore this chunk.
Comment 14•9 years ago
|
||
Comment on attachment 8553088 [details] [diff] [review] Synchronously deallocate the last TextureClient when destroying CanvasClientSurfaceStream Review of attachment 8553088 [details] [diff] [review]: ----------------------------------------------------------------- As a short term fix. It seems good. ::: gfx/layers/client/CanvasClient.cpp @@ +405,5 @@ > > +void > +CanvasClientSharedSurface::ClearSurfaces() > +{ > + if (mFrontTex && (mFront || mPrevFront)) { Is there a case that mPrevFront is related to mFrontTex? Even when there is not relation ship between mFrontTex and mPrevFront, synchronous ipc seems to indirectly fix the problem.
Attachment #8553088 -
Flags: review?(sotaro.ikeda.g) → review+
Assignee | ||
Comment 15•9 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #14) > Is there a case that mPrevFront is related to mFrontTex? > Even when there is not relation ship between mFrontTex and mPrevFront, > synchronous ipc seems to indirectly fix the problem. I was being a bit conservative about triggering the synchronous safe-guard, with the reasoning that ::Clear could happen just after ::update in which case the SharedSurface recently stored in mPrevFront might still be in use on the compositor side. The timing is a bit unlikely, it feels safer to me to just rule out this possibility.
Assignee | ||
Comment 16•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/57c76122bb27
Comment 17•9 years ago
|
||
AFAICT, this should have had sec-approval first. Please fill it out retroactively. https://wiki.mozilla.org/Security/Bug_Approval_Process
Flags: needinfo?(nical.bugzilla)
Comment 18•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/57c76122bb27
Status: NEW → RESOLVED
Closed: 9 years ago
status-b2g-v1.4:
--- → ?
status-b2g-v2.0:
--- → ?
status-b2g-v2.1:
--- → ?
status-b2g-v2.2:
--- → affected
status-b2g-master:
--- → fixed
status-firefox-esr31:
--- → ?
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Assignee | ||
Comment 20•9 years ago
|
||
Comment on attachment 8553088 [details] [diff] [review] Synchronously deallocate the last TextureClient when destroying CanvasClientSurfaceStream [Security approval request comment] How easily could an exploit be constructed based on the patch? I suppose it would be quite hard. This bug can cause the compositor thread to try to read form an already-destroyed texture and crash (no writes). Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No Which older supported branches are affected by this flaw? Gecko 34 and later, but something recent must have made the race condition more likely to happen. b2g is not impacted. If not all supported branches, which bug introduced the flaw? My guess would be Bug 1066280 but it's hard to be sure. Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? We should uplift the fix just to be on the safe side when we have confirmed that it does fix the issue. How likely is this patch to cause regressions; how much testing does it need? I think that the patch isn't risky. Let's give it a few days to bake in central to be sure.
Flags: needinfo?(nical.bugzilla)
Attachment #8553088 -
Flags: sec-approval?
Assignee | ||
Updated•9 years ago
|
Assignee | ||
Comment 21•9 years ago
|
||
Assignee | ||
Comment 22•9 years ago
|
||
Comment 23•9 years ago
|
||
Comment on attachment 8553088 [details] [diff] [review] Synchronously deallocate the last TextureClient when destroying CanvasClientSurfaceStream sec-approval+ given (which is good since it went in anyway). I'll mark the Aurora and Beta patches for uplift. We should do those soon.
Attachment #8553088 -
Flags: sec-approval? → sec-approval+
Comment 24•9 years ago
|
||
Comment on attachment 8554581 [details] [diff] [review] Patch upliftable to aurora [Triage Comment]
Attachment #8554581 -
Flags: approval-mozilla-aurora+
Comment 25•9 years ago
|
||
Comment on attachment 8554582 [details] [diff] [review] Patch upliftable to beta Approving patches for branches for sec-critical issue.
Attachment #8554582 -
Flags: approval-mozilla-beta+
Updated•9 years ago
|
status-firefox-esr31:
--- → unaffected
Comment 26•9 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/d077aaf352ea https://hg.mozilla.org/releases/mozilla-beta/rev/6601b8da1750
Updated•9 years ago
|
Updated•9 years ago
|
Whiteboard: gfx-noted → [adv-main36+] gfx-noted
Updated•9 years ago
|
Group: core-security → core-security-release
Updated•8 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•