Closed
Bug 1426449
Opened 7 years ago
Closed 7 years ago
Crash in webrtc::SimulcastRateAllocator::GetAllocation
Categories
(Core :: WebRTC, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla59
People
(Reporter: miketaylr, Assigned: jesup)
References
()
Details
(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [adv-main58+][post-critsmash-triage])
Crash Data
Attachments
(1 file, 1 obsolete file)
4.15 KB,
patch
|
drno
:
review+
abillings
:
approval-mozilla-beta+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-3c4447e1-962c-4aa2-9e48-356a90171220.
=============================================================
Top 10 frames of crashing thread:
0 XUL webrtc::SimulcastRateAllocator::GetAllocation media/webrtc/trunk/webrtc/modules/video_coding/utility/simulcast_rate_allocator.cc:131
1 XUL webrtc::vcm::VideoSender::UpdateEncoderParameters media/webrtc/trunk/webrtc/modules/video_coding/video_sender.cc:203
2 XUL webrtc::vcm::VideoSender::UpdateChannelParemeters media/webrtc/trunk/webrtc/modules/video_coding/video_sender.cc:222
3 XUL webrtc::ViEEncoder::ReconfigureEncoder media/webrtc/trunk/webrtc/video/vie_encoder.cc:427
4 XUL webrtc::ViEEncoder::EncodeVideoFrame media/webrtc/trunk/webrtc/video/vie_encoder.cc:553
5 XUL webrtc::ViEEncoder::EncodeTask::Run media/webrtc/trunk/webrtc/video/vie_encoder.cc:105
6 XUL rtc::TaskQueue::TaskContext::RunTask media/webrtc/trunk/webrtc/base/task_queue_gcd.cc:53
7 @0x7fff5c91ee87
8 @0x7fff5c933355
9 @0x7fff5c926238
=============================================================
This just happened to me 3 times in a row before I gave up on screen-sharing in hangouts.
STR:
1. open hangouts.google.com, and start a video chat with someone
2. initiate a screen-sharing session
3. wait a few seconds
I had one hard browser crash, and 2 tab crashes doing this. I'm also connected to an external monitor if it makes a difference.
(not sure if this counts as a regression, since webrtc hangouts was just enabled for Firefox like today...)
Reporter | ||
Comment 1•7 years ago
|
||
(6 out of 8 comments from crash reports mention screen sharing)
Assignee | ||
Comment 2•7 years ago
|
||
Bunch of the crashes in the last month have UAF addresses
Group: media-core-security
Keywords: csectype-uaf,
sec-high
Updated•7 years ago
|
Rank: 5
Priority: -- → P1
Comment 3•7 years ago
|
||
I'll check if a ASAN run on Linux will help me better what is going on here.
Assignee: nobody → drno
Comment 4•7 years ago
|
||
I was able to repro one assertion about bandwidth and out of bounds access violation when initializing VP8 encoders. My current guess is this caused by Hangouts creating 3 encoder for the outgoing video from your camera and then adding 3 more encoders for he screen share video.
Assignee | ||
Comment 5•7 years ago
|
||
Note that in a debug build I hit an assertion before any crash:
simulcast_rate_allocator.cc:143
RTC_DCHECK_LE(tl_allocation_sum_kbps, expected_allocated_bitrate_kbps);
(2000 vs 200)
I don't think that's the cause of this UAF; will rebuild ASAN without DEBUG
Comment 6•7 years ago
|
||
To me it looks like these crashes are caused by not exiting ViEEncoder::ReconfigureEncoder() after function calls returned errors like here: https://searchfox.org/mozilla-central/rev/22c55eb7b7e6494a8615a7af3b613ff899d2cdba/media/webrtc/trunk/webrtc/video/vie_encoder.cc#411
and here https://searchfox.org/mozilla-central/rev/22c55eb7b7e6494a8615a7af3b613ff899d2cdba/media/webrtc/trunk/webrtc/video/vie_encoder.cc#423
The later one I hit a couple of times. But I have not figured out how or why these function calls sometimes fail.
Assignee | ||
Comment 7•7 years ago
|
||
Bug 1426988 was provoked by this error - vp8 config fails, and that causes a UAF in the vp8 code. I believe it does *not* need to be fixed to fix the hangouts problem.
The problem that caused the failure of config was using 3 temporal layers (like for live video) instead of 2 (for screencast) -- the webrtc.org code defaults to 2 for screencasts, and 3 for live video.
Assignee: drno → rjesup
See Also: → 1426988
Assignee | ||
Comment 8•7 years ago
|
||
works without patch to libvpx. Disabling DCHECK needed to avoid assertion in debug; innocuous it appears
Attachment #8940046 -
Flags: review?(drno)
Assignee | ||
Comment 9•7 years ago
|
||
[Tracking Requested - why for this release]:
This blocks a major feature of Google Hangouts (and Meet once they enable that for more than Mozilla).
Simple fix
status-firefox57:
--- → wontfix
status-firefox58:
--- → affected
status-firefox59:
--- → affected
status-firefox-esr52:
--- → ?
tracking-firefox58:
--- → ?
Assignee | ||
Comment 10•7 years ago
|
||
Comment on attachment 8940046 [details] [diff] [review]
set the correct number of temporal layers for screencasts
Approval Request Comment
[Feature/Bug causing the regression]: Support for temporal layers Bug 1390202
[User impact if declined]: Hangouts Screenshares will crash or do Bad Things (UAF)
[Is this code covered by automated tests?]: not this particular case - would need to test simulcast screenshares
[Has the fix been verified in Nightly?]: by me, yes (local build)
[Needs manual test from QE? If yes, steps to reproduce]: Yes: Start hangout. Use ... menu to start screenshare.
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: No
[Why is the change risky/not risky?]: Trivial change to avoid encoder config failure (assumptions about the combination of screencast and temporal layers)
[String changes made/needed]: none
[Security approval request comment]
How easily could an exploit be constructed based on the patch?
Very hard, virtually impossible
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No - the actual problem is in libvpx
Which older supported branches are affected by this flaw? 57 and above
If not all supported branches, which bug introduced the flaw? 57
Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Trivial - should apply with no changes
How likely is this patch to cause regressions; how much testing does it need? No likelihood of regressions.
Attachment #8940046 -
Flags: sec-approval?
Attachment #8940046 -
Flags: approval-mozilla-beta?
Comment 11•7 years ago
|
||
Comment on attachment 8940046 [details] [diff] [review]
set the correct number of temporal layers for screencasts
Review of attachment 8940046 [details] [diff] [review]:
-----------------------------------------------------------------
I think 1 and 2 need to be swapped.
::: media/webrtc/signaling/src/media-conduit/VideoConduit.cpp
@@ +641,5 @@
> // Oddly, though this is a 'bps' array, nothing really looks at the
> // values, just the size of the array to know the number of temporal
> // layers.
> + video_stream.temporal_layer_thresholds_bps.resize(
> + (mConduit->mCodecMode != webrtc::VideoCodecMode::kScreensharing) ? 1 : 2);
This looks to me likes the order of 1 and 2 is swapped here.
According to https://searchfox.org/mozilla-central/rev/652fbd6270de0d3ec424d2b88f8375ff546f949f/media/webrtc/trunk/webrtc/media/engine/simulcast.cc#221 it should be 1 for screencast and 2 for everything else.
The other question is if we need to fill it with actual bitrate values like in the above referenced example?
::: media/webrtc/trunk/webrtc/modules/video_coding/utility/simulcast_rate_allocator.cc
@@ +139,5 @@
> allocated_bitrates_bps.SetBitrate(simulcast_id, tl_index,
> layer_rate_kbps * 1000);
> tl_allocation_sum_kbps += layer_rate_kbps;
> }
> +#ifndef WEBRTC_MOZILLA_BUILD
Have you checked if this is still needed after using the correct size for the array? Because if yes I'm concerned that we are doing something wrong with the bandwidth calculations on our end.
Attachment #8940046 -
Flags: review?(drno) → review-
Comment 13•7 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #10)
> [Is this code covered by automated tests?]: not this particular case - would
> need to test simulcast screenshares
Please file a followup on creating one when this is landed and uplifted. Sounds from comment 7 like a simple gtest could verify that VideoConduit creates the expected number of layers for a screenshare configuration.
Updated•7 years ago
|
tracking-firefox59:
--- → ?
Comment 14•7 years ago
|
||
Tracking for 59 as well. We could still get a fix into 58 though it's pretty last minute, since it seems like a bad issue for Hangouts.
Assignee | ||
Comment 15•7 years ago
|
||
I have updated patches (there are more hidden issues once the above is fixed). Once I have them clean I'll put them back up.
Assignee | ||
Comment 16•7 years ago
|
||
there are two more fixes needed once this is corrected...
Attachment #8941195 -
Flags: review?(drno)
Assignee | ||
Updated•7 years ago
|
Attachment #8940046 -
Attachment is obsolete: true
Attachment #8940046 -
Flags: sec-approval?
Attachment #8940046 -
Flags: approval-mozilla-beta?
Comment 17•7 years ago
|
||
Gerry, fyi we may want to land this for beta 16 if it is reviewed and has sec-approval by Thursday. What do you think?
Flags: needinfo?(gchang)
Comment 18•7 years ago
|
||
Comment on attachment 8941195 [details] [diff] [review]
set the correct number of temporal layers for screencasts
Review of attachment 8941195 [details] [diff] [review]:
-----------------------------------------------------------------
LGTM with the one modification below to make easier to maintain.
::: media/webrtc/signaling/src/media-conduit/VideoConduit.cpp
@@ +676,5 @@
> + // number of temporal layers.
> + // For VideoEncoderConfig::ContentType::kScreen, though, in
> + // video_codec_initializer.cc it uses [0] to set the target bitrate
> + // for the screenshare.
> + video_stream.temporal_layer_thresholds_bps.clear();
I think we should move the video_stream.temporal_layer_thresholds_bps.clear() call before this if condition. That makes it easier to read and maintain.
Attachment #8941195 -
Flags: review?(drno) → review+
Comment 19•7 years ago
|
||
I created https://bugzilla.mozilla.org/show_bug.cgi?id=1429224 for a test (no linking it directly here to not paint a target onto this sec bug until disclosed).
Assignee | ||
Comment 20•7 years ago
|
||
Comment on attachment 8941195 [details] [diff] [review]
set the correct number of temporal layers for screencasts
Approval Request Comment
[List of other uplifts needed for the feature/fix]: libvpx bug, bug 1429216, and bug 1429219
[Is the change risky?]: Not very risky
[Why is the change risky/not risky?]: Corrects number of temporal layers. The other bugs are needed - two are to handle "what if InitEncode fails", the other is to avoid it failing (along with this one). We could take just this and bug 1429219, but then if there were other ways to make InitEncode fail it would UAF. Alternatively, we could take this and 1429219 and make the RTC_DCHECK(success) in vie_encoder.cc into RTC_CHECK(success) -- i.e. make it a safe release assert if InitEncode fails.
Each change is pretty simple.
[String changes made/needed]: none
[Security approval request comment]
How easily could an exploit be constructed based on the patch? Pretty hard
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Not really, but if someone plays with "what would this patch change" they might find the UAF. Not really different than any bug that lands under a closed-to-public bug.
Which older supported branches are affected by this flaw? Probably all except ESR52. ESR doesn't support temporal layers.
If not all supported branches, which bug introduced the flaw? Bug 1390202 (FF 57)
Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Trivial
How likely is this patch to cause regressions; how much testing does it need? I'd want it smoke-tested using Google Hangouts (not Google Meet - you'll need to use a non-mozilla.com google login), but chance of regressions is very low.
Attachment #8941195 -
Flags: sec-approval?
Attachment #8941195 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 21•7 years ago
|
||
I'm tempted land this (and bug 1429219 which is not a sec bug) under bug 1429219 (or another bug) and not the other two issues which are both "what if InitEncode fails?" with a switch to RTC_CHECK(success) since then the other two aren't needed. This also avoids needing to coordinate the landing with upstream.
Assignee | ||
Comment 22•7 years ago
|
||
Note: without the vpx patch, if someone can make the encode-init fail, vpx will write out-of-bounds, but before there's much risk from that we'll release-assert while unwinding from the init.
Comment 23•7 years ago
|
||
I'd like to give this sec-approval+ and then beta+ to make the last beta but I need Gerry's approval to do so.
Comment 24•7 years ago
|
||
Hi Al,
Yes. I would like to include this in beta 16.
Flags: needinfo?(gchang)
Comment 25•7 years ago
|
||
Comment on attachment 8941195 [details] [diff] [review]
set the correct number of temporal layers for screencasts
sec-approval+ and beta+ with Gerry's approval for the beta.
Attachment #8941195 -
Flags: sec-approval?
Attachment #8941195 -
Flags: sec-approval+
Attachment #8941195 -
Flags: approval-mozilla-beta?
Attachment #8941195 -
Flags: approval-mozilla-beta+
Assignee | ||
Comment 26•7 years ago
|
||
I can update the patch to force the RTC_CHECK(success), and re-verify all is sane before we land it.
Assignee | ||
Comment 27•7 years ago
|
||
Stress-tested it in a debug build, and with rr (note that in rr screenshare doesn't actually work, due to shmem + rr issues, but the single-core execution was good at triggering the encoder-init issues). All is happy (with RTC_CHECK(success)).
Assignee | ||
Comment 28•7 years ago
|
||
![]() |
||
Comment 29•7 years ago
|
||
uplift |
![]() |
||
Comment 30•7 years ago
|
||
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Updated•7 years ago
|
Whiteboard: [adv-main58+]
Updated•7 years ago
|
Flags: qe-verify-
Whiteboard: [adv-main58+] → [adv-main58+][post-critsmash-triage]
Updated•7 years ago
|
Group: media-core-security → core-security-release
Updated•7 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•