Use after free/double free bug when pruning a newly initialized TURN candidate.

RESOLVED FIXED in Firefox 27, Firefox OS v1.3

Status

()

Core
WebRTC: Networking
--
major
RESOLVED FIXED
5 years ago
3 years ago

People

(Reporter: bwc, Assigned: bwc)

Tracking

({csectype-uaf, sec-critical})

Trunk
mozilla29
csectype-uaf, sec-critical
Points:
---
Bug Flags:
in-testsuite ?

Firefox Tracking Flags

(firefox26 unaffected, firefox27 fixed, firefox28 fixed, firefox29+ fixed, firefox-esr24 unaffected, b2g18 unaffected, b2g-v1.1hd unaffected, b2g-v1.2 unaffected, b2g-v1.3 fixed)

Details

(Whiteboard: [qa-])

Attachments

(2 attachments, 1 obsolete attachment)

(Assignee)

Description

5 years ago
When a TURN candidate is finished initializing (ie; the TURN allocation has completed), it is then run through the pruning logic via this callback:

http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_candidate.c#756

If this TURN candidate is found to be redundant, it is then pruned from the candidate list, and destroyed, meaning that the following accesses are unsafe:

http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_candidate.c#756

In addition, if we encounter this failure:

http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_candidate.c#763

The code at the abort: label will call done_cb a second time; the pruning logic will probably destroy the candidate a second time.

http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_candidate.c#790
(Assignee)

Comment 1

5 years ago
FWIW, I cannot find any crash reports in any of the following functions (these are the ones that look like they might crash due to this bug). This is not too surprising, since it is a very tight race if we don't fail when calling nr_turn_client_get_mapped_address.

nr_ice_candidate_destroy
nr_ice_component_maybe_prune_candidate
nr_ice_initialize_finished_cb
nr_turn_client_get_mapped_address
nr_ice_turn_allocated_cb

Updated

5 years ago
Whiteboard: sec-uaf
Keywords: csectype-uaf, sec-critical
Whiteboard: sec-uaf
(Assignee)

Comment 2

5 years ago
Created attachment 8339413 [details] [diff] [review]
Fix bug where a newly allocated (in the sense that a TURN allocation has been formed) relayed candidate could be used after being pruned, when redundant.
(Assignee)

Updated

5 years ago
Attachment #8339413 - Flags: review?(ekr)

Comment 3

5 years ago
Comment on attachment 8339413 [details] [diff] [review]
Fix bug where a newly allocated (in the sense that a TURN allocation has been formed) relayed candidate could be used after being pruned, when redundant.

Review of attachment 8339413 [details] [diff] [review]:
-----------------------------------------------------------------

I agree that this fix is correct.

Please confirm that this cannot happen absent trickle ICE. If so,
that has two implications:

1. This only occurs in Aurora
2. It does not require fast uplift to the resip nICEr repo.

Please do not land until you have verified these issues.

Then you will need to get sec approval for the landing
Attachment #8339413 - Flags: review?(ekr) → review+

Comment 4

5 years ago
Also, I think it would be good to null out cand here and add
a check in the error return area to future proof against other
stuff like this a bit by substituting a null deref for other
issues.

Comment 5

5 years ago
Also, I think it would be good to null out cand here and add
a check in the error return area to future proof against other
stuff like this a bit by substituting a null deref for other
issues.
(Assignee)

Comment 6

5 years ago
(In reply to Eric Rescorla (:ekr) from comment #3)
> Comment on attachment 8339413 [details] [diff] [review]
> Fix bug where a newly allocated (in the sense that a TURN allocation has
> been formed) relayed candidate could be used after being pruned, when
> redundant.
> 
> Review of attachment 8339413 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I agree that this fix is correct.
> 
> Please confirm that this cannot happen absent trickle ICE. If so,
> that has two implications:
> 
> 1. This only occurs in Aurora
> 2. It does not require fast uplift to the resip nICEr repo.

I'm pretty sure trickle ICE doesn't factor in on this one; if we have two identical TURN servers configured, and the allocations come back with the same relay address, the fact that we haven't started pairing things up yet doesn't save us. I will look at the code pre-trickle though to be sure.
(Assignee)

Comment 7

5 years ago
(In reply to Eric Rescorla (:ekr) from comment #5)
> Also, I think it would be good to null out cand here and add
> a check in the error return area to future proof against other
> stuff like this a bit by substituting a null deref for other
> issues.

Agreed.
(Assignee)

Comment 8

5 years ago
Ok, so looking at the nICEr code at resiprocate, it appears that cand->done_cb does not carry the same risk of causing candidates to go away that it does in our codebase. cnad->done_cb can cause an app callback to fire, which could _potentially_ translate into a synchronous call to nr_ice_media_stream_get_attributes, which then calls the pruning logic, but this will only happen once the last candidate has been initialized. Since cand2 has not been initialized yet, I think we're ok in the older code.

Comment 9

5 years ago
Byron,

I believe that per c8 the first affected version is 27 (where we added
trickle). Please verify and then file for sec-approval and uplift to
the appropriate branches.
Flags: needinfo?(docfaraday)
(Assignee)

Comment 10

5 years ago
Yep, I'm on it. Just need to get an asan build of the old code going, and then try to hack together a test-case that causes the bug, and then verify the bug/fix with an asan build on the latest code. Lots of waiting on builds.
Flags: needinfo?(docfaraday)
status-b2g18: --- → unaffected
status-b2g-v1.1hd: --- → unaffected
status-b2g-v1.2: --- → unaffected
status-firefox27: --- → affected
status-firefox28: --- → affected
status-firefox29: --- → affected
status-firefox-esr24: --- → unaffected
tracking-firefox29: --- → +
(Assignee)

Comment 12

5 years ago
Created attachment 8346786 [details] [diff] [review]
(DO NOT CHECK IN) Direct drive test that triggers the bug. Got as close to the real world flow as I could without involving actual network IO.

Direct-drive test case
(Assignee)

Comment 13

5 years ago
Created attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

Null out cand, and add a little more checking for future-proofing. Alter commit/code comments to avoid making the sec bug obvious.
(Assignee)

Updated

5 years ago
Attachment #8339413 - Attachment is obsolete: true
(Assignee)

Comment 14

5 years ago
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

Review of attachment 8346801 [details] [diff] [review]:
-----------------------------------------------------------------

Is this enough checking?
Attachment #8346801 - Flags: review?(ekr)
(Assignee)

Comment 15

5 years ago
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

   Someone would need to dig down into what the done_cb function pointer actually points at, and then do a little reading to determine that this patch fixes a memory error. Having done this, it would be easy to trigger the bug, but the window of opportunity is so narrow that it would probably be difficult to exploit in any controlled fashion (but I'm not an exploit expert).

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

   Not really. It looks fairly innocuous, although if someone attempted to inspect the ticket and found that it was inaccessible, learning that the patch fixes some sort of security bug, I suspect it would not take too long to figure it out.

Which older supported branches are affected by this flaw?

   27 and later for sure, 26 and earlier are almost certainly not affected.

If not all supported branches, which bug introduced the flaw?

   842549

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

   The patch should apply cleanly to the old code, and if it does not, backports will be trivial.

How likely is this patch to cause regressions; how much testing does it need?

   It is pretty unlikely. The standard suite of test-cases should be fine.
Attachment #8346801 - Flags: sec-approval?
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

sec-approval+ for trunk.

We should get Aurora and Beta patches so we don't ship an effected version too.
Attachment #8346801 - Flags: sec-approval? → sec-approval+
status-firefox26: --- → unaffected
(Assignee)

Comment 17

5 years ago
I'll try applying the patch on 27 and 28.
(Assignee)

Comment 18

5 years ago
Patch applies fine on current mozilla-aurora and beta.
Attachment #8346801 - Flags: approval-mozilla-beta+
Attachment #8346801 - Flags: approval-mozilla-aurora+

Comment 19

5 years ago
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

Review of attachment 8346801 [details] [diff] [review]:
-----------------------------------------------------------------

Yes, this appears to add the checks that EKR requested.
Attachment #8346801 - Flags: review?(ekr) → review+
(Assignee)

Comment 20

5 years ago
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

Review of attachment 8346801 [details] [diff] [review]:
-----------------------------------------------------------------

Let's go ahead and check into inbound, and if all goes well, we can land on aurora and beta.
Attachment #8346801 - Flags: checkin?(adam)

Comment 21

5 years ago
Comment on attachment 8346801 [details] [diff] [review]
Avoid calling done_cb in the wrong order, or multiple times.

https://hg.mozilla.org/integration/mozilla-inbound/rev/29516a7c4545
Attachment #8346801 - Flags: checkin?(adam) → checkin+

Comment 22

5 years ago
Setting needinfo on myself so I can remember to come back here and commit to aurora and beta later tomorrow if the m-i landing goes smoothly.
Flags: needinfo?(adam)
landed on central https://hg.mozilla.org/mozilla-central/rev/29516a7c4545
Status: NEW → RESOLVED
Last Resolved: 5 years ago
status-firefox29: affected → fixed
Flags: in-testsuite?
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
https://hg.mozilla.org/releases/mozilla-aurora/rev/8f7687732fd1
https://hg.mozilla.org/releases/mozilla-beta/rev/b45852a5f6df
status-b2g-v1.3: --- → fixed
status-firefox27: affected → fixed
status-firefox28: affected → fixed
Flags: needinfo?(adam)
Byron, does this need verification by QA or is this sufficiently covered by tests?
Flags: needinfo?(docfaraday)
(Assignee)

Comment 26

5 years ago
This is caused by very similar circumstances as bug 938857, and would require a similar test setup as described here: https://bugzilla.mozilla.org/show_bug.cgi?id=938857#c49
Flags: needinfo?(docfaraday)
(In reply to Byron Campen [:bwc] from comment #26)
> This is caused by very similar circumstances as bug 938857, and would
> require a similar test setup as described here:
> https://bugzilla.mozilla.org/show_bug.cgi?id=938857#c49

Thanks Byron, marking this [qa-] as with bug 938857.
Whiteboard: [qa-]
Group: core-security
You need to log in before you can comment on or make changes to this bug.