Closed Bug 942950 Opened 6 years ago Closed 6 years ago

Use after free/double free bug when pruning a newly initialized TURN candidate.

Categories

(Core :: WebRTC: Networking, defect, major)

defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla29
Tracking Status
firefox26 --- unaffected
firefox27 --- fixed
firefox28 --- fixed
firefox29 + fixed
firefox-esr24 --- unaffected
b2g18 --- unaffected
b2g-v1.1hd --- unaffected
b2g-v1.2 --- unaffected
b2g-v1.3 --- fixed

People

(Reporter: bwc, Assigned: bwc)

Details

(Keywords: csectype-uaf, sec-critical, Whiteboard: [qa-])

Attachments

(2 files, 1 obsolete file)

When a TURN candidate is finished initializing (ie; the TURN allocation has completed), it is then run through the pruning logic via this callback:

http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_candidate.c#756

If this TURN candidate is found to be redundant, it is then pruned from the candidate list, and destroyed, meaning that the following accesses are unsafe:

http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_candidate.c#756

In addition, if we encounter this failure:

http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_candidate.c#763

The code at the abort: label will call done_cb a second time; the pruning logic will probably destroy the candidate a second time.

http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_candidate.c#790
FWIW, I cannot find any crash reports in any of the following functions (these are the ones that look like they might crash due to this bug). This is not too surprising, since it is a very tight race if we don't fail when calling nr_turn_client_get_mapped_address.

nr_ice_candidate_destroy
nr_ice_component_maybe_prune_candidate
nr_ice_initialize_finished_cb
nr_turn_client_get_mapped_address
nr_ice_turn_allocated_cb
Whiteboard: sec-uaf
Whiteboard: sec-uaf
Attachment #8339413 - Flags: review?(ekr)
Comment on attachment 8339413 [details] [diff] [review]
Fix bug where a newly allocated (in the sense that a TURN allocation has been formed) relayed candidate could be used after being pruned, when redundant.

Review of attachment 8339413 [details] [diff] [review]:
-----------------------------------------------------------------

I agree that this fix is correct.

Please confirm that this cannot happen absent trickle ICE. If so,
that has two implications:

1. This only occurs in Aurora
2. It does not require fast uplift to the resip nICEr repo.

Please do not land until you have verified these issues.

Then you will need to get sec approval for the landing
Attachment #8339413 - Flags: review?(ekr) → review+
Also, I think it would be good to null out cand here and add
a check in the error return area to future proof against other
stuff like this a bit by substituting a null deref for other
issues.
Also, I think it would be good to null out cand here and add
a check in the error return area to future proof against other
stuff like this a bit by substituting a null deref for other
issues.
(In reply to Eric Rescorla (:ekr) from comment #3)
> Comment on attachment 8339413 [details] [diff] [review]
> Fix bug where a newly allocated (in the sense that a TURN allocation has
> been formed) relayed candidate could be used after being pruned, when
> redundant.
> 
> Review of attachment 8339413 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I agree that this fix is correct.
> 
> Please confirm that this cannot happen absent trickle ICE. If so,
> that has two implications:
> 
> 1. This only occurs in Aurora
> 2. It does not require fast uplift to the resip nICEr repo.

I'm pretty sure trickle ICE doesn't factor in on this one; if we have two identical TURN servers configured, and the allocations come back with the same relay address, the fact that we haven't started pairing things up yet doesn't save us. I will look at the code pre-trickle though to be sure.
(In reply to Eric Rescorla (:ekr) from comment #5)
> Also, I think it would be good to null out cand here and add
> a check in the error return area to future proof against other
> stuff like this a bit by substituting a null deref for other
> issues.

Agreed.
Ok, so looking at the nICEr code at resiprocate, it appears that cand->done_cb does not carry the same risk of causing candidates to go away that it does in our codebase. cnad->done_cb can cause an app callback to fire, which could _potentially_ translate into a synchronous call to nr_ice_media_stream_get_attributes, which then calls the pruning logic, but this will only happen once the last candidate has been initialized. Since cand2 has not been initialized yet, I think we're ok in the older code.
Byron,

I believe that per c8 the first affected version is 27 (where we added
trickle). Please verify and then file for sec-approval and uplift to
the appropriate branches.
Flags: needinfo?(docfaraday)
Yep, I'm on it. Just need to get an asan build of the old code going, and then try to hack together a test-case that causes the bug, and then verify the bug/fix with an asan build on the latest code. Lots of waiting on builds.
Flags: needinfo?(docfaraday)
Null out cand, and add a little more checking for future-proofing. Alter commit/code comments to avoid making the sec bug obvious.
Attachment #8339413 - Attachment is obsolete: true
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

Review of attachment 8346801 [details] [diff] [review]:
-----------------------------------------------------------------

Is this enough checking?
Attachment #8346801 - Flags: review?(ekr)
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

   Someone would need to dig down into what the done_cb function pointer actually points at, and then do a little reading to determine that this patch fixes a memory error. Having done this, it would be easy to trigger the bug, but the window of opportunity is so narrow that it would probably be difficult to exploit in any controlled fashion (but I'm not an exploit expert).

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

   Not really. It looks fairly innocuous, although if someone attempted to inspect the ticket and found that it was inaccessible, learning that the patch fixes some sort of security bug, I suspect it would not take too long to figure it out.

Which older supported branches are affected by this flaw?

   27 and later for sure, 26 and earlier are almost certainly not affected.

If not all supported branches, which bug introduced the flaw?

   842549

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

   The patch should apply cleanly to the old code, and if it does not, backports will be trivial.

How likely is this patch to cause regressions; how much testing does it need?

   It is pretty unlikely. The standard suite of test-cases should be fine.
Attachment #8346801 - Flags: sec-approval?
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

sec-approval+ for trunk.

We should get Aurora and Beta patches so we don't ship an effected version too.
Attachment #8346801 - Flags: sec-approval? → sec-approval+
I'll try applying the patch on 27 and 28.
Patch applies fine on current mozilla-aurora and beta.
Attachment #8346801 - Flags: approval-mozilla-beta+
Attachment #8346801 - Flags: approval-mozilla-aurora+
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

Review of attachment 8346801 [details] [diff] [review]:
-----------------------------------------------------------------

Yes, this appears to add the checks that EKR requested.
Attachment #8346801 - Flags: review?(ekr) → review+
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

Review of attachment 8346801 [details] [diff] [review]:
-----------------------------------------------------------------

Let's go ahead and check into inbound, and if all goes well, we can land on aurora and beta.
Attachment #8346801 - Flags: checkin?(adam)
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

https://hg.mozilla.org/integration/mozilla-inbound/rev/29516a7c4545
Attachment #8346801 - Flags: checkin?(adam) → checkin+
Setting needinfo on myself so I can remember to come back here and commit to aurora and beta later tomorrow if the m-i landing goes smoothly.
Flags: needinfo?(adam)
landed on central https://hg.mozilla.org/mozilla-central/rev/29516a7c4545
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Byron, does this need verification by QA or is this sufficiently covered by tests?
Flags: needinfo?(docfaraday)
This is caused by very similar circumstances as bug 938857, and would require a similar test setup as described here: https://bugzilla.mozilla.org/show_bug.cgi?id=938857#c49
Flags: needinfo?(docfaraday)
(In reply to Byron Campen [:bwc] from comment #26)
> This is caused by very similar circumstances as bug 938857, and would
> require a similar test setup as described here:
> https://bugzilla.mozilla.org/show_bug.cgi?id=938857#c49

Thanks Byron, marking this [qa-] as with bug 938857.
Whiteboard: [qa-]
Group: core-security
You need to log in before you can comment on or make changes to this bug.