UAF crash in libvpx 1.6.1

RESOLVED FIXED in Firefox -esr52

Status

()

defect
P1
normal
Rank:
8
RESOLVED FIXED
2 years ago
10 months ago

People

(Reporter: drno, Assigned: drno)

Tracking

({crash, csectype-uaf, sec-high})

Trunk
mozilla60
Points:
---
Dependency tree / graph
Bug Flags:
qe-verify -

Firefox Tracking Flags

(firefox-esr5259+ fixed, firefox57 wontfix, firefox58 wontfix, firefox59+ fixed, firefox60+ fixed)

Details

(Whiteboard: [post-critsmash-triage][adv-main59+][adv-esr52.7+])

Attachments

(3 attachments, 1 obsolete attachment)

Assignee

Description

2 years ago
When trying to start a screen share in Google Hangouts with Firefox shows a pretty clear UAF on ASAN builds.
Assignee

Updated

2 years ago
Rank: 8
Assignee

Comment 1

2 years ago
Posted patch vpx_patch.txt (obsolete) — Splinter Review
This patch appears to fix the problem on ASAN builds.
Assignee

Comment 2

2 years ago
What is the best way to report and land this patch upstream in the libvpx project without disclosing it right away?
Flags: needinfo?(jzern)
Flags: needinfo?(johannkoenig)
Flags: needinfo?(giles)
Attachment #8938783 - Attachment is patch: true

Comment 3

2 years ago
Our security related bug template is https://bugs.chromium.org/p/webm/issues/entry?template=Security

Using that template will create a bug with restricted visibility.

Instructions for submitting a patch are https://www.webmproject.org/code/contribute/submitting-patches/

The patch will not have restricted visibility. I don't believe we have a process for hiding patches.
Flags: needinfo?(johannkoenig)
Assignee

Comment 4

2 years ago
Thanks Johann for the quick response.

I created https://bugs.chromium.org/p/webm/issues/detail?id=1482 for tracking purpose.
Flags: needinfo?(jzern)
Flags: needinfo?(giles)
Nils: can you add me to bug 1482?  You can use either my mozilla.com address or rjesup@webrtc.org if that helps.  THanks
Flags: needinfo?(drno)
Attachment #8938783 - Flags: review?(johannkoenig)
Attachment #8938783 - Flags: review?(giles)
FYI I'm patch author.  It certainly seems to avoid the ASAN failures (under rr and not under rr).

Comment 7

2 years ago
Please add marpan@google.com (or marpan@webrtc.org) or post the patch in the webm bugtracker if you don't want to upload it to gerrit.
Tracking for 58 and 59 since this is sec-high and it is getting to be late in 58 beta.
Comment on attachment 8938783 [details] [diff] [review]
vpx_patch.txt

Review of attachment 8938783 [details] [diff] [review]:
-----------------------------------------------------------------

Ok, the two breaks are replaced by an "early" return still inside the encoder config loop, with the ctx->iface->init call conditionalized to not run when it wouldn't before. So we get cleanup before returning on either code path where a break would have happened.

That's not worse than what we had before, but the code is still confusing, and I think the patch isn't entirely correct. In the case where the dsf check fails, the patch will now call vpx_codec_destroy() when it wouldn't have before, which means a double-free inside ctx->iface for either vp8 or vp9. Unless we're currently leaking it instead further up the call chain; I didn't find any destructors associated with the vectors the ctx are passed from. So bad now or bad later.

Also, the new return doesn't go through the ctx-- after the end of the loop like the old one. I don't understand what that does. The code seems to be treating the case where the mr_get_mem_loc() call fails differently, but I don't understand why that affects where in the ctx stack we are, or why this INVALID_PARAM is different from the other guards higher in the function. Maybe the code is just wrong. In any case it ends up calling SAVE_STATUS with a different ctx pointer. I'm also unclear why it's safe to call this unconditionally since vpx_codec_vp9_cx appears to set it to NULL. Does it get fixed up somewhere? 

Good job writing a minimal change, but based on the above I'm not confident we've addressed the issues here. Please explain or update the patch to address comments. Is there a trace where we can see what code path was taken when the referenced memory was freed?

Thanks for looking at this and sorry for the review delay; was on vacation until today.
Attachment #8938783 - Flags: review?(giles) → review-
Johann, re #7 I don't see an existing account for marpan. Unfortunately bugzilla doesn't have an invite system, so they can't be added until then register an account themselves. We don't post security-sensitive patches to public trackers until we've landed them, so it will have to be here.

NB libaom has the same code, although it doesn't affect Firefox since we don't currently provide an encoding interface for AV1.
Flags: needinfo?(johannkoenig)

Comment 11

2 years ago
Ralph, the chromium bug tracker is restricted as well and Randell posted a copy of the patch there. I've added you to that bug.
Flags: needinfo?(johannkoenig)

Comment 12

2 years ago
Bug 1426449 is not visible to me. How is it related to this issue?
Flags: needinfo?(rjesup)
johann - that bug can cause a config to vp8 that fails in the encode initialization (ts_target_bitrates empty in an encoder with temporal layers)
Flags: needinfo?(rjesup) → needinfo?(johannkoenig)
(In reply to Ralph Giles (:rillian) | needinfo me from comment #9)
> Comment on attachment 8938783 [details] [diff] [review]
> vpx_patch.txt
> 
> Review of attachment 8938783 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Ok, the two breaks are replaced by an "early" return still inside the
> encoder config loop, with the ctx->iface->init call conditionalized to not
> run when it wouldn't before. So we get cleanup before returning on either
> code path where a break would have happened.

So, context is everything, and splinter here hid the important bit.  Right after ctx-- at the bottom, at exit from the function, it does SAVE_STATUS(ctx, res).  The old code (which tripped ASAN) would in some cases write to ctx[-1] in the SAVE_STATUS.  If it hit the first break on the first iteration, ctx is still the initial value.  It would then do ctx-- (now pointing to invalid memory -- ctx[-1], and call SAVE_STATUS which would trip ASAN.

If it failed init(), it would destroy the codec for all ctx's that had been inited -- good.  it decrements ctx to where it's now the initial value.  it then breaks, does ctx-- (now pointing to ctx[-1], and does SAVE_STATUS -- boom again.

ctx-- only makes sense in the case of success, and then only because it does ctx++ (etc) right before that at the end of the loop (this is UGLY code; it desperately should be rewritten but I was making a minimal-fix-it change).  (Johann - you can take my patch, but I advise rewriting the function to have a saner loop.)

Also, if the first break happened on an iteration after the first, it would have leaked codecs that had been inited (didn't call vpx_codec_destroy).  Note that we were only hitting a failure in init().  In the first break case, ctx->iface isn't set yet, so vpx_codec_destroy() will just return VPX_CODEC_ERROR.
Attachment #8938783 - Flags: review- → review?(giles)

Updated

2 years ago
Flags: needinfo?(johannkoenig)
Comment on attachment 8938783 [details] [diff] [review]
vpx_patch.txt

Review of attachment 8938783 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/libvpx/libvpx/vpx/src/vpx_encoder.c
@@ +112,2 @@
>  
>          if (res) {

Ok, so the final ctx-- just undoes the ctx++ at the end of the loop block. Thanks. I missed that before, and thanks for the extra context.

Looking at the code some more, in the case that the dsf check fails, res will be VPX_CODEC_INVALID_PARAM so we'll enter this block without ctx->iface->init() having been called for the current ctx. I think that *might* avoid the double-free, since vpx_codec_destroy checks for NULL ctx->priv before destroying the interface.

We can't just change this conditional to `res && res != VPX_CODEC_INVALID_PARAM` because ctx->iface->init can return that error. I'm now comfortable with this landing.
Attachment #8938783 - Flags: review?(giles) → review+
Assignee

Comment 16

Last year
Just for awareness: there is still some discussion in the Chromium bug going on if this needs co-ordination with other projects, or if Firefox is the only affected product and we can roll-out on our own schedule.
Flags: needinfo?(drno)
Group: core-security → media-core-security
Assignee

Comment 17

Last year
A patch for this problem has landed in the public libvpx code https://chromium.googlesource.com/webm/libvpx/+/5b6ae020b6a972b67b59b3dbf7cf9cbd3140a80d

We assume that means we can go ahead and land this in Firefox as well.
Assignee

Comment 18

Last year
Comment on attachment 8938783 [details] [diff] [review]
vpx_patch.txt

[Security approval request comment]
How easily could an exploit be constructed based on the patch? 
Since this only triggers with simulcast turned on and not enough bandwidth being available I think it is non trivial.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? 
No

Which older supported branches are affected by this flaw?
All branches are affected. But we landed other patches already to prevent the bandwidth problem from getting us into this bug here.

If not all supported branches, which bug introduced the flaw?
N/A

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
The patch should apply cleanly to all branches as they all use the same version of libvpx.

How likely is this patch to cause regressions; how much testing does it need?
Very unlikely as Firefox should no longer be able to get into the situation which triggered this problem in the first place. We manually verified on local ASAN builds that screen sharing in Google Hangouts no longer was triggering this issue with the patch applied. And since it is hard to manually get into the required low bandwidth condition, plus we don't know the exact bandwidth constraints needed, I think this would eat a lot of time from a manual test verification for very little return.
Attachment #8938783 - Flags: sec-approval?
See Also: → 1433158
Comment on attachment 8938783 [details] [diff] [review]
vpx_patch.txt

sec-approval+
Attachment #8938783 - Flags: sec-approval? → sec-approval+
Assignee

Comment 21

Last year
This patch combines all the fixes from the libvpx repository upstream into a single commit for easy uplifting.
Attachment #8938783 - Attachment is obsolete: true
Attachment #8946078 - Flags: review?(giles)
Assignee

Comment 22

Last year
Comment on attachment 8946078 [details] [diff] [review]
bug1426988.patch

[Security approval request comment]
How easily could an exploit be constructed based on the patch? 
Since this only triggers with simulcast turned on and not enough bandwidth being available I think it is non trivial.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? 
No

Which older supported branches are affected by this flaw?
All branches are affected. We landed other patches already in bug 1426449 to prevent the bandwidth problem from getting us into this bug here in Fx 58. But ESR doesn't use the same surrounding/invoking code, thus ESR doesn't have the same level of protection.

If not all supported branches, which bug introduced the flaw?
N/A

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
The patch should apply cleanly to all branches as they all use the same version of libvpx.

How likely is this patch to cause regressions; how much testing does it need?
Very unlikely as Firefox should no longer be able to get into the situation which triggered this problem in the first place. We manually verified on local ASAN builds that screen sharing in Google Hangouts no longer was triggering this issue with the patch applied. And since it is hard to manually get into the required low bandwidth condition, plus we don't know the exact bandwidth constraints needed, I think this would eat a lot of time from a manual test verification for very little return.
Attachment #8946078 - Flags: sec-approval?
Assignee

Comment 23

Last year
Comment on attachment 8946078 [details] [diff] [review]
bug1426988.patch

Approval Request Comment
[Feature/Bug causing the regression]: Looks like libvpx 1.3.0 import in bug 918550 landed this code.
[User impact if declined]: Crashes in certain WebRTC calls.
[Is this code covered by automated tests?]: No.
[Has the fix been verified in Nightly?]: We have only manually tried it with ASAN builds. Patch hasn't landed in Nightly, because I would prefer to land the version update to libvpx 1.7.0 in Nightly (bug 1433158) instead.
[Needs manual test from QE? If yes, steps to reproduce]: No, since no this is not a functional change and we only check for a clean exit in an unlikely error condition (which should no longer happen in 58 or 59 since bug 1426449 landed already), plus there isn't anything to verify except Firefox no longer crashing.
[List of other uplifts needed for the feature/fix]: N/A
[Is the change risky?]: Not really.
[Why is the change risky/not risky?]: It only ensures that in an error case the code cleanly exits (so far the exit was not clean) - so no functional change.
[String changes made/needed]: N/A
Attachment #8946078 - Flags: approval-mozilla-beta?
Assignee

Comment 24

Last year
Since the patch for bug 1426449 made it into 58 I don't think we need to uplift this into release. 

But since bug 1426449 didn't apply to ESR52 I think we should consider this to get into ESR52.

And for 60 I would propose to import libvpx 1.7.0 in bug 1433158 instead of this patch.

Al: what do you think about that proposal?
Flags: needinfo?(abillings)
What about 59? I'd like the next ESR52 and mainline releases to be somewhat in synch. If we take 1.7.0 for ESR52.7, I'd like it to be in 59 as well.
Flags: needinfo?(abillings)
Flags: needinfo?(drno)
Assignee

Comment 26

Last year
Sorry my proposal in comment #24 obviously wasn't as clear as it should have been. Another more structured attempt:

58 - has the patches from bug 1426449 and we will leave it at that

59 - has the patches from bug 1426449 already, but additionally we will also land the latest patch from this bug 1426988 (which combines the security patches from libvpx 1.7.0)

60 - has the patches from bug 1426449 already, but instead of landing the patches from this bug we will land the libvpx update to 1.7.0 over in bug 1433158 - this should avoid merge conflicts when landing bug 1433158

ESR52 - the patches from bug 1426449 don't aplly, uplifting a whole libvpx update into an ESR release so late seem risky, so we only land the patch in this bug 1426988
Flags: needinfo?(drno)
Attachment #8946078 - Flags: review?(giles) → review+
OK, you would need to request uplift to ESR as well.
Flags: needinfo?(drno)
Comment on attachment 8946078 [details] [diff] [review]
bug1426988.patch

Giving sec-approval+ and beta approval as well.
Attachment #8946078 - Flags: sec-approval?
Attachment #8946078 - Flags: sec-approval+
Attachment #8946078 - Flags: approval-mozilla-beta?
Attachment #8946078 - Flags: approval-mozilla-beta+
Assignee

Comment 29

Last year
Comment on attachment 8946078 [details] [diff] [review]
bug1426988.patch

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: it is sec-high.

User impact if declined: potential Firefox crashes.

Fix Landed on Version: Not landed yet.

Risk to taking this patch (and alternatives if risky): It is pretty low risk as it only cleans up memory management in case of failing to initialize the video encoder. The scenario under which that happens is pretty rare.

String or UUID changes made by this patch: N/A

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Flags: needinfo?(drno)
Attachment #8946078 - Flags: approval-mozilla-esr52?
https://hg.mozilla.org/releases/mozilla-beta/rev/737f7d2ba619

I'd kind of like to land this on trunk too just in case we run into any issues with the libvpx upgrade. Do you foresee any major issues with doing that? I would think that update would just clobber this fix anyway.
Flags: needinfo?(drno)
Assignee

Comment 32

Last year
Sorry about that. I should have known better to compile after modifying the patch.

I just added the missing includes. Carrying forward all approvals.

Going to check if this compiles on ESR52.
Flags: needinfo?(drno)
Attachment #8946803 - Flags: review+
Assignee

Comment 33

Last year
Patch doesn't apply to ESR52. Working on a patch for that.
Assignee

Comment 34

Last year
The same patch for ESR52
Attachment #8946947 - Flags: review+
Assignee

Comment 35

Last year
I think you are right we can land this on central and later over-write it with libvpx 1.7.0.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/f3bc73c747e8
Status: NEW → RESOLVED
Closed: Last year
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Attachment #8946078 - Flags: approval-mozilla-esr52?
Comment on attachment 8946947 [details] [diff] [review]
bug1426988_for_esr52.patch

Fixes a sec-high, let's get this on ESR52 too.
Attachment #8946947 - Flags: approval-mozilla-esr52+
Group: media-core-security → core-security-release
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main59+][adv-esr52.7+]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.