Closed Bug 946733 Opened 6 years ago Closed 6 years ago
Crash in nr
_turn _stun _ctx _cb
(DO NOT CHECK IN!) Patch that sets the minimum TURN allocation duration extremely high, to show that we UAF if the duration from the TURN server is too low.
1.67 KB, patch
|Details | Diff | Splinter Review|
2.44 KB, patch
|Details | Diff | Splinter Review|
3.46 KB, patch
|Details | Diff | Splinter Review|
So, I was doing the following thing, while using https://opentokrtc.com/, in the Paris office: - A Linux machine, using a wired connection, running Chromium 30.0.1599.114 and today's Firefox Nightly. One tab in Firefox, two tabs in Chrome, all in the same room. - A macbook, running a build from yesterday of Chromium, connected to the "Mozilla" wifi network, also in the same room. All the browsers were connected to a room (four participants). After reloading multiple times the Chrome tabs, I tried to reload the Firefox tab, and after a couple seconds trying to connect, it crashed. The bug report with stacks: https://crash-stats.mozilla.com/report/index/920a4655-1911-4b63-b005-d9f582131205
Byron and I looked at this and Byron points out that the TURN code has some confusion about the STUN ctx lifetime in that it calls a callback which destroys the STUN ctx and then references the STUN ctx. This can cause UAFs or potentially double-frees in a number of places. The right fix is just to make the nr_turn_stun_ctx_ lifetime match that of the nr_turn_client_ctx rather than have it destroyed in nr_turn_client_cancel(). We should have a patch later today. Al, Alex, this code is currently running in FF 25 and this is the first crash we have seen in this area. However, this kind of memory error is of course a potential security risk. How would you like us to proceed?
ekr - what versions are affected by this? All versions with TURN support? Can you set the affected flags? Thanks
yeah, everything with TURN, starting in 25. I set the version flag. is that the right one?
Version: unspecified → 25 Branch
We should get a security rating on this. How bad is this really?
I get the impression that an attacker could easily trigger these bugs if they deployed a malicious TURN server, and got us to use it (trivial to carry out).
I think this patch should plug this error, and others.
(In reply to Al Billings [:abillings] from comment #4) > We should get a security rating on this. How bad is this really? TBH, we're not entirely sure. We don't currently have: - A clear path to make this happen - A way to exploit it given that it's happening. But I agree with Byron that there probably is some way to make this happen remotely (though some of the obvious angles I think will not work.) That would then need to be turned into an exploit and the timing for that is probably pretty short.
Cancel the STUN client ctxs (cancel timers, and set state appropriately)
Al, new thoughts?
Comment on attachment 8343402 [details] [diff] [review] Simplify turn client ctx lifecycle to fix several use-after-free and double-free bugs. Review of attachment 8343402 [details] [diff] [review]: ----------------------------------------------------------------- lgtm with nit below. 1. I would like a second review from Adam, or, if he isn't around tomorrow, Martin. 2. Please add a unit test that has a TURN failure on it (e.g., wrong credentials) and make sure it works properly as does the entire unit test suite.
Sure, I can add that test case; I don't expect to see any failures though as I've tried the wrong credentials test under asan.
That's good to hear, but I think it would be good to have a test to make sure it still works.
Yep. If it is easy I'll add some other simple rainy day stuff.
Dan, Based on the above, how would you prefer we handle this in terms of landing timing and branch selection?
Eric, Make a patch and nominate for sec-approval. We should be able to go in straight away. We'll want to take it on Aurora, Beta, and ESR24 after that.
Or not ESR24, rather, since it is unaffected.
Comment on attachment 8343402 [details] [diff] [review] Simplify turn client ctx lifecycle to fix several use-after-free and double-free bugs. Review of attachment 8343402 [details] [diff] [review]: ----------------------------------------------------------------- Looks like a straight shot to me. r++ from me (blame ekr on the double count). Note the request for unit tests from ekr.
Make it easy to see a real-world example of one possible failure case.
Comment on attachment 8343402 [details] [diff] [review] Simplify turn client ctx lifecycle to fix several use-after-free and double-free bugs. [Security approval request comment] How easily could an exploit be constructed based on the patch? If you want to see one possible example, apply https://bugzilla.mozilla.org/attachment.cgi?id=8344015&action=edit, and run ice_unittest under asan with some credentials to a real TURN server. The allocation lifetime will never be high enough to satisfy the checking code, and will cause failures and UAF. It would be trivial to stand up a misconfigured TURN server that would respond with something like a lifetime of 5 seconds, causing this same failure. As for actually exploiting this failure, all of these structs have bunches of function pointers in them, and it is not out of the question that you might be able to engineer a situation where you call one of them on a previously freed struct (I see some potential for double calls to some of the same function pointers that get us to the failure case in the first place). However, the window of opportunity is pretty narrow here, because no pointers to these structs are left behind once the callback for handling the allocate response returns. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Uh. I guess they do now? Which older supported branches are affected by this flaw? Reasonably sure this affects all versions with TURN support. When did this land again? I see a rewrite that landed in April, and ekr says that the TURN code preceding that didn't work at all. If not all supported branches, which bug introduced the flaw? The rewrite happened in bug 786235, changeset e69693180dff. Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? The backports should be pretty easy. How likely is this patch to cause regressions; how much testing does it need? Since we are keeping the failed STUN/TURN context stuff around until the candidates go away, instead of deleting them when they fail, it is possible that these failed contexts could interfere with the continued operation of TURN in some way. However, we are already talking about rainy-day testing, which we don't do much of at all right now...
It landed in TURN 25.
I have verified that the UAF goes away when applying the lifecycle patch, and no asan failures remain. The expected unit-test failures are present in cases where TURN must work to pass the test.
Bryon, You may want to change the summary of the patch to not explictly say you're fixing a use-after-free or double-free bug. Historical, this calls undo attention to the bug. At least that is the thinking.
(In reply to Byron Campen [:bwc] from comment #19) > Do comments in the patch, the check-in comment, or tests included in the > patch paint a bulls-eye on the security problem? > > Uh. I guess they do now? We ask so you can think about changing things like that. Can you change the comment so it's just "Bug 946733 - Simplify turn client ctx lifecycle"?
Comment on attachment 8344040 [details] [diff] [review] turn_client_ctx_lifecycle.patch -- with fixed commit comment. same patch as above w/ correct checkin comment.
Attachment #8344040 - Attachment description: turn_client_ctx_lifecycle.patch → turn_client_ctx_lifecycle.patch -- with fixed commit comment.
Comment on attachment 8343402 [details] [diff] [review] Simplify turn client ctx lifecycle to fix several use-after-free and double-free bugs. [Triage Comment] Please shorted the check-in comment to Bug 946733. Simplify turn client ctx lifecycle With that a=dveditz for aurora and sec-approval+. Please don't mention sec-approval+ in the check-in comment, but you need the a=dveditz part for Aurora check-in
Comment on attachment 8344040 [details] [diff] [review] turn_client_ctx_lifecycle.patch -- with fixed commit comment. Moving flags from other patch
The plan looks good to me. Given the possibility to remotely provoke the UAF, let's push it wherever makes sense. We should make specific decisions about 26 and 25 (given 26 about to go out, probably no point in 25). Right now both are marked wontfix
Randell, we built 26 already and it ships within the next two work days. This isn't getting fixed on 26. We should get this onto other branches though.
https://hg.mozilla.org/releases/mozilla-aurora/rev/f1a8671a0b7e It looks like this needs a rebased patch for trunk (28).
If you attach a patch for 28 I can land it.
Gimme a sec.
This patch applies cleanly for me...
Ah I see. It applies cleanly to mozilla-central, but not to mozilla-inbound. It looks like a conflict with bug 906968, which is on inbound but not central.
In case it isn't clear, I definitely am not going to rebase over that gigantic patch myself. ;)
Yeah, I'm on it.
Looks like the windows builds insist on C89, which wants this declared at the beginning.
Attachment #8343402 - Attachment is obsolete: true
Attachment #8344040 - Attachment is obsolete: true
Windows bustage fix. https://hg.mozilla.org/releases/mozilla-aurora/rev/c896d8ee21f0
Oops, I failed to notice Byron posted a patch. With this push, it should be matched up to his patch. https://hg.mozilla.org/releases/mozilla-aurora/rev/37f2d83153a5
Patch for 28.
Target Milestone: --- → mozilla28
Is this needed on b2g26 (v1.2)?
No, since we don't support peerconnections in 1.2
You need to log in before you can comment on or make changes to this bug.