Closed Bug 1211389 Opened 9 years ago Closed 9 years ago

Crash in nr_ice_candidate_mark_done when setting up WebRTC calls

Categories

(Core :: WebRTC: Networking, defect, P1)

Unspecified
All
defect

Tracking

()

RESOLVED FIXED
mozilla44
Tracking Status
firefox43 --- unaffected
firefox44 --- fixed
firefox-esr38 --- unaffected

People

(Reporter: standard8, Assigned: bwc)

References

Details

(Keywords: crash, sec-high)

Crash Data

Attachments

(2 files, 1 obsolete file)

This bug was filed from the Socorro interface and is 
report bp-39a18db1-14c9-45c9-8430-609602151005.
=============================================================

STR:

1) Using latest nightly for both ends, set up for a conversation via Firefox Hello or Talky.io
2) Join the rooms, granting permissions

Expect Results

- Call joins successfully

Actual Results

- Firefox either crashes or the content process crashes if in e10s.

Given the stack location, this seems to be a regression from bug 1208096.
Stack from https://crash-stats.mozilla.com/report/index/39a18db1-14c9-45c9-8430-609602151005

0 	XUL 	nr_ice_candidate_mark_done 	media/mtransport/third_party/nICEr/src/ice/ice_candidate.c
1 	XUL 	nr_ice_candidate_destroy 	media/mtransport/third_party/nICEr/src/ice/ice_candidate.c
2 	XUL 	nr_ice_component_destroy 	media/mtransport/third_party/nICEr/src/ice/ice_component.c
3 	XUL 	nr_ice_media_stream_destroy 	media/mtransport/third_party/nICEr/src/ice/ice_media_stream.c
4 	XUL 	nr_ice_remove_media_stream 	media/mtransport/third_party/nICEr/src/ice/ice_ctx.c
5 	XUL 	mozilla::NrIceMediaStream::Close() 	media/mtransport/nricemediastream.cpp
6 	XUL 	mozilla::NrIceCtx::SetStream(unsigned long, mozilla::NrIceMediaStream*) 	media/mtransport/nricectx.cpp
7 	XUL 	mozilla::PeerConnectionMedia::ActivateOrRemoveTransport_s(unsigned long, unsigned long, std::string const&, std::string const&, std::vector<std::string, std::allocator<std::string> > const&) 	media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.cpp
8 	XUL 	mozilla::runnable_args_memfn<mozilla::RefPtr<mozilla::PeerConnectionMedia>, void (mozilla::PeerConnectionMedia::*)(unsigned long, unsigned long, const std::basic_string<char>&, const std::basic_string<char>&, const std::vector<std::basic_string<char>, std::allocator<std::basic_string<char> > >&), unsigned long, unsigned long, std::basic_string<char>, std::basic_string<char>, std::vector<std::basic_string<char>, std::allocator<std::basic_string<char> > > >::Run 	media/mtransport/runnable_utils.h
9 	XUL 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp
Crash Signature: [@ nr_ice_candidate_mark_done] → [@ nr_ice_candidate_mark_done] [@ nr_ice_media_stream_destroy ]
I am having no luck reproducing this bug. Do the two ends need to be on different networks? Looking at the stack, I'm not seeing how this can happen unless undefined behavior has somehow been introduced, or the debug symbols are wrong, since we null-check everything being used on that line.
Group: core-security
This is being caused by the teardown logic for candidates destroying the "piggyback" srflx candidate before the relay candidate it is piggybacking on. This is an extremely narrow window UAF. Working on a fix.
Keywords: sec-high
FWIW, I'm working on Mac with both ends of the call on the same machine. Its affecting me 100% of the time that I've tried it so far.
Assignee: nobody → docfaraday
Status: NEW → ASSIGNED
Rank: 5
Priority: -- → P1
Comment on attachment 8669664 [details] [diff] [review]
Remove unnecessary call to nr_ice_candidate_mark_done

Review of attachment 8669664 [details] [diff] [review]:
-----------------------------------------------------------------

I've double checked every callsite of nr_ice_candidate_destroy, and the only one that looks like it might be called when the candidate is in INITIALIZING is the one causing this crash. So, I've just removed the call to nr_ice_candidate_mark_done from nr_ice_candidate_destroy.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=cfb196fd6069
Attachment #8669664 - Flags: review?(drno)
(In reply to Mark Banner (:standard8) from comment #5)
> FWIW, I'm working on Mac with both ends of the call on the same machine. Its
> affecting me 100% of the time that I've tried it so far.

I'm guessing you're far enough from the TURN servers that the relay candidates aren't initialized by the time offer/answer concludes. I'm able to repro about 1 time out of 10 if I'm running on ASAN.
Group: media-core-security
I just tested the patch locally a few times and it seems to fix the issue :-)
Good to hear, although try is not happy. I do know why though, should not be hard to fix.
Attachment #8669664 - Attachment is obsolete: true
Attachment #8669664 - Flags: review?(drno)
Comment on attachment 8669873 [details] [diff] [review]
Make absolutely sure the relay->srflx pointer doesn't dangle

Review of attachment 8669873 [details] [diff] [review]:
-----------------------------------------------------------------

https://treeherder.mozilla.org/#/jobs?repo=try&revision=306ce0f3b9de
Attachment #8669873 - Flags: review?(drno)
Group: core-security
Comment on attachment 8669873 [details] [diff] [review]
Make absolutely sure the relay->srflx pointer doesn't dangle

Review of attachment 8669873 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM
Attachment #8669873 - Flags: review?(drno) → review+
Comment on attachment 8669873 [details] [diff] [review]
Make absolutely sure the relay->srflx pointer doesn't dangle

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

   It would be pretty hard. The window of opportunity between the free and the UAF is extremely narrow, but since we're using a function pointer on the freed object we have the potential for executing arbitrary code.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

   Not exactly, but it does hint strongly that there exists a UAF bug with relay candidates and their piggybacked srflx candidates.

Which older supported branches are affected by this flaw?

   None.

If not all supported branches, which bug introduced the flaw?

   Bug 1208096

How likely is this patch to cause regressions; how much testing does it need?

   Any time we mess with nICEr can be dicey. Fortunately, this only needs to land on m-c.
Attachment #8669873 - Flags: sec-approval?
Can you verify that the current patch (or binary from the try run) works for you?
Flags: needinfo?(standard8)
Adding Valentin who can also reproduce this
The second crash signature [@ nr_ice_media_stream_destroy ] is gone with the Try build, it doesn't crash anymore when initiating a session between two participants.
(In reply to Byron Campen [:bwc] from comment #16)

> Which older supported branches are affected by this flaw?
> 
>    None.

If this is trunk only, you don't need approval, though I'm happy to give it. 

https://wiki.mozilla.org/Security/Bug_Approval_Process
Attachment #8669873 - Flags: sec-approval? → sec-approval+
The new patch seems to work fine as well - I tested before applying it, got crashes, I then applied it and did a number of runs and it worked fine. I also then unapplied it and got the crashes again.

So this seems fine as far as I can tell.
Flags: needinfo?(standard8)
Keywords: checkin-needed
(In reply to Al Billings [:abillings] from comment #20)
> (In reply to Byron Campen [:bwc] from comment #16)
> 
> > Which older supported branches are affected by this flaw?
> > 
> >    None.
> 
> If this is trunk only, you don't need approval, though I'm happy to give it. 
> 
> https://wiki.mozilla.org/Security/Bug_Approval_Process

   The changeset that causes this bug needs to be uplifted, so this one will need to tag along.
FWIW: I'm hitting a crash with this bug's signature when trying to use BrowserStack [which uses WebRTC for sending a remote view of another browser over the wire].

The crash happens after I've selected a browser for BrowserStack to show me, while it's preparing for a session. Sample crash report: bp-358b3e22-8a18-4e31-83b5-cc5e92151008

I'm hoping the patch that just landed fixes this; I'll double-check after this fix makes it into a Nightly.
Flags: needinfo?(dholbert)
https://hg.mozilla.org/mozilla-central/rev/b185293e7db6
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
Group: media-core-security → core-security-release
(In reply to Daniel Holbert [:dholbert] from comment #24)
> FWIW: I'm hitting a crash with this bug's signature when trying to use BrowserStack
> [...] I'm hoping the patch that just landed fixes this; I'll double-check

Yup, no longer crashing when logging into BrowserStack, using current Nightly 44.0a1 (2015-10-19). Hooray!
Flags: needinfo?(dholbert)
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: