Closed Bug 1426449 Opened 6 years ago Closed 6 years ago

Crash in webrtc::SimulcastRateAllocator::GetAllocation

Categories

(Core :: WebRTC, defect, P1)

58 Branch
Unspecified
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla59
Tracking Status
firefox-esr52 --- unaffected
firefox57 --- wontfix
firefox58 + fixed
firefox59 + fixed

People

(Reporter: miketaylr, Assigned: jesup)

References

()

Details

(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [adv-main58+][post-critsmash-triage])

Crash Data

Attachments

(1 file, 1 obsolete file)

This bug was filed from the Socorro interface and is
report bp-3c4447e1-962c-4aa2-9e48-356a90171220.
=============================================================

Top 10 frames of crashing thread:

0 XUL webrtc::SimulcastRateAllocator::GetAllocation media/webrtc/trunk/webrtc/modules/video_coding/utility/simulcast_rate_allocator.cc:131
1 XUL webrtc::vcm::VideoSender::UpdateEncoderParameters media/webrtc/trunk/webrtc/modules/video_coding/video_sender.cc:203
2 XUL webrtc::vcm::VideoSender::UpdateChannelParemeters media/webrtc/trunk/webrtc/modules/video_coding/video_sender.cc:222
3 XUL webrtc::ViEEncoder::ReconfigureEncoder media/webrtc/trunk/webrtc/video/vie_encoder.cc:427
4 XUL webrtc::ViEEncoder::EncodeVideoFrame media/webrtc/trunk/webrtc/video/vie_encoder.cc:553
5 XUL webrtc::ViEEncoder::EncodeTask::Run media/webrtc/trunk/webrtc/video/vie_encoder.cc:105
6 XUL rtc::TaskQueue::TaskContext::RunTask media/webrtc/trunk/webrtc/base/task_queue_gcd.cc:53
7  @0x7fff5c91ee87 
8  @0x7fff5c933355 
9  @0x7fff5c926238 

=============================================================

This just happened to me 3 times in a row before I gave up on screen-sharing in hangouts.

STR:

1. open hangouts.google.com, and start a video chat with someone
2. initiate a screen-sharing session
3. wait a few seconds

I had one hard browser crash, and 2 tab crashes doing this. I'm also connected to an external monitor if it makes a difference.

(not sure if this counts as a regression, since webrtc hangouts was just enabled for Firefox like today...)
(6 out of 8 comments from crash reports mention screen sharing)
Bunch of the crashes in the last month have UAF addresses
Group: media-core-security
Rank: 5
Priority: -- → P1
I'll check if a ASAN run on Linux will help me better what is going on here.
Assignee: nobody → drno
I was able to repro one assertion about bandwidth and out of bounds access violation when initializing VP8 encoders. My current guess is this caused by Hangouts creating 3 encoder for the outgoing video from your camera and then adding 3 more encoders for he screen share video.
Note that in a debug build I hit an assertion before any crash:
simulcast_rate_allocator.cc:143
    RTC_DCHECK_LE(tl_allocation_sum_kbps, expected_allocated_bitrate_kbps);
(2000 vs 200)
I don't think that's the cause of this UAF; will rebuild ASAN without DEBUG
To me it looks like these crashes are caused by not exiting ViEEncoder::ReconfigureEncoder() after function calls returned errors like here: https://searchfox.org/mozilla-central/rev/22c55eb7b7e6494a8615a7af3b613ff899d2cdba/media/webrtc/trunk/webrtc/video/vie_encoder.cc#411
and here https://searchfox.org/mozilla-central/rev/22c55eb7b7e6494a8615a7af3b613ff899d2cdba/media/webrtc/trunk/webrtc/video/vie_encoder.cc#423

The later one I hit a couple of times. But I have not figured out how or why these function calls sometimes fail.
Bug 1426988 was provoked by this error - vp8 config fails, and that causes a UAF in the vp8 code.  I believe it does *not* need to be fixed to fix the hangouts problem.

The problem that caused the failure of config was using 3 temporal layers (like for live video) instead of 2 (for screencast) -- the webrtc.org code defaults to 2 for screencasts, and 3 for live video.
Assignee: drno → rjesup
See Also: → 1426988
works without patch to libvpx.  Disabling DCHECK needed to avoid assertion in debug; innocuous it appears
Attachment #8940046 - Flags: review?(drno)
[Tracking Requested - why for this release]:

This blocks a major feature of Google Hangouts (and Meet once they enable that for more than Mozilla).

Simple fix
Comment on attachment 8940046 [details] [diff] [review]
set the correct number of temporal layers for screencasts

Approval Request Comment
[Feature/Bug causing the regression]: Support for temporal layers Bug 1390202

[User impact if declined]: Hangouts Screenshares will crash or do Bad Things (UAF)

[Is this code covered by automated tests?]: not this particular case - would need to test simulcast screenshares

[Has the fix been verified in Nightly?]: by me, yes (local build)

[Needs manual test from QE? If yes, steps to reproduce]: Yes: Start hangout.  Use ... menu to start screenshare.

[List of other uplifts needed for the feature/fix]: none

[Is the change risky?]: No

[Why is the change risky/not risky?]: Trivial change to avoid encoder config failure (assumptions about the combination of screencast and temporal layers)

[String changes made/needed]: none

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
Very hard, virtually impossible

Do comments in the patch, the check-in comment, or tests included in the patch  paint a bulls-eye on the security problem? No - the actual problem is in libvpx


Which older supported branches are affected by this flaw? 57 and above

If not all supported branches, which bug introduced the flaw? 57

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?  Trivial - should apply with no changes

How likely is this patch to cause regressions; how much testing does it need?  No likelihood of regressions.
Attachment #8940046 - Flags: sec-approval?
Attachment #8940046 - Flags: approval-mozilla-beta?
Comment on attachment 8940046 [details] [diff] [review]
set the correct number of temporal layers for screencasts

Review of attachment 8940046 [details] [diff] [review]:
-----------------------------------------------------------------

I think 1 and 2 need to be swapped.

::: media/webrtc/signaling/src/media-conduit/VideoConduit.cpp
@@ +641,5 @@
>        // Oddly, though this is a 'bps' array, nothing really looks at the
>        // values, just the size of the array to know the number of temporal
>        // layers.
> +      video_stream.temporal_layer_thresholds_bps.resize(
> +        (mConduit->mCodecMode != webrtc::VideoCodecMode::kScreensharing) ? 1 : 2);

This looks to me likes the order of 1 and 2 is swapped here.

According to https://searchfox.org/mozilla-central/rev/652fbd6270de0d3ec424d2b88f8375ff546f949f/media/webrtc/trunk/webrtc/media/engine/simulcast.cc#221 it should be 1 for screencast and 2 for everything else.

The other question is if we need to fill it with actual bitrate values like in the above referenced example?

::: media/webrtc/trunk/webrtc/modules/video_coding/utility/simulcast_rate_allocator.cc
@@ +139,5 @@
>        allocated_bitrates_bps.SetBitrate(simulcast_id, tl_index,
>                                          layer_rate_kbps * 1000);
>        tl_allocation_sum_kbps += layer_rate_kbps;
>      }
> +#ifndef WEBRTC_MOZILLA_BUILD

Have you checked if this is still needed after using the correct size for the array? Because if yes I'm concerned that we are doing something wrong with the bandwidth calculations on our end.
Attachment #8940046 - Flags: review?(drno) → review-
Track 58+ as blocker for Google Hangouts.
(In reply to Randell Jesup [:jesup] from comment #10)
> [Is this code covered by automated tests?]: not this particular case - would
> need to test simulcast screenshares

Please file a followup on creating one when this is landed and uplifted. Sounds from comment 7 like a simple gtest could verify that VideoConduit creates the expected number of layers for a screenshare configuration.
Tracking for 59 as well. We could still get a fix into 58 though it's pretty last minute, since it seems like a bad issue for Hangouts.
I have updated patches (there are more hidden issues once the above is fixed).  Once I have them clean I'll put them back up.
there are two more fixes needed once this is corrected...
Attachment #8941195 - Flags: review?(drno)
Attachment #8940046 - Attachment is obsolete: true
Attachment #8940046 - Flags: sec-approval?
Attachment #8940046 - Flags: approval-mozilla-beta?
Blocks: 1429216
Gerry, fyi we may want to land this for beta 16 if it is reviewed and has sec-approval by Thursday. What do you think?
Flags: needinfo?(gchang)
Comment on attachment 8941195 [details] [diff] [review]
set the correct number of temporal layers for screencasts

Review of attachment 8941195 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM with the one modification below to make easier to maintain.

::: media/webrtc/signaling/src/media-conduit/VideoConduit.cpp
@@ +676,5 @@
> +      // number of temporal layers.
> +      // For VideoEncoderConfig::ContentType::kScreen, though, in
> +      // video_codec_initializer.cc it uses [0] to set the target bitrate
> +      // for the screenshare.
> +      video_stream.temporal_layer_thresholds_bps.clear();

I think we should move the video_stream.temporal_layer_thresholds_bps.clear() call before this if condition. That makes it easier to read and maintain.
Attachment #8941195 - Flags: review?(drno) → review+
I created https://bugzilla.mozilla.org/show_bug.cgi?id=1429224 for a test (no linking it directly here to not paint a target onto this sec bug until disclosed).
Comment on attachment 8941195 [details] [diff] [review]
set the correct number of temporal layers for screencasts

Approval Request Comment
[List of other uplifts needed for the feature/fix]: libvpx bug, bug 1429216, and bug 1429219

[Is the change risky?]: Not very risky

[Why is the change risky/not risky?]: Corrects number of temporal layers.  The other bugs are needed - two are to handle "what if InitEncode fails", the other is to avoid it failing (along with this one).  We could take just this and bug 1429219, but then if there were other ways to make InitEncode fail it would UAF.  Alternatively, we could take this and 1429219 and make the RTC_DCHECK(success) in vie_encoder.cc into RTC_CHECK(success) -- i.e. make it a safe release assert if InitEncode fails.

Each change is pretty simple.

[String changes made/needed]: none

[Security approval request comment]
How easily could an exploit be constructed based on the patch?  Pretty hard

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Not really, but if someone plays with "what would this patch change" they might find the UAF.  Not really different than any bug that lands under a closed-to-public bug.

Which older supported branches are affected by this flaw?  Probably all except ESR52.  ESR doesn't support temporal layers.

If not all supported branches, which bug introduced the flaw?  Bug 1390202 (FF 57)

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?  Trivial

How likely is this patch to cause regressions; how much testing does it need?  I'd want it smoke-tested using Google Hangouts (not Google Meet - you'll need to use a non-mozilla.com google login), but chance of regressions is very low.
Attachment #8941195 - Flags: sec-approval?
Attachment #8941195 - Flags: approval-mozilla-beta?
I'm tempted land this (and bug 1429219 which is not a sec bug) under bug 1429219 (or another bug) and not the other two issues which are both "what if InitEncode fails?" with a switch to RTC_CHECK(success) since then the other two aren't needed.  This also avoids needing to coordinate the landing with upstream.
Note: without the vpx patch, if someone can make the encode-init fail, vpx will write out-of-bounds, but before there's much risk from that we'll release-assert while unwinding from the init.
I'd like to give this sec-approval+ and then beta+ to make the last beta but I need Gerry's approval to do so.
Hi Al,
Yes. I would like to include this in beta 16.
Flags: needinfo?(gchang)
Comment on attachment 8941195 [details] [diff] [review]
set the correct number of temporal layers for screencasts

sec-approval+ and beta+ with Gerry's approval for the beta.
Attachment #8941195 - Flags: sec-approval?
Attachment #8941195 - Flags: sec-approval+
Attachment #8941195 - Flags: approval-mozilla-beta?
Attachment #8941195 - Flags: approval-mozilla-beta+
I can update the patch to force the RTC_CHECK(success), and re-verify all is sane before we land it.
Stress-tested it in a debug build, and with rr (note that in rr screenshare doesn't actually work, due to shmem + rr issues, but the single-core execution was good at triggering the encoder-init issues).  All is happy (with RTC_CHECK(success)).
https://hg.mozilla.org/mozilla-central/rev/a2d59268abd2
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Whiteboard: [adv-main58+]
Flags: qe-verify-
Whiteboard: [adv-main58+] → [adv-main58+][post-critsmash-triage]
Group: media-core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.