Crash in mozilla::DataChannelConnection::SctpDtlsOutput

RESOLVED FIXED in Firefox -esr45

Status

()

defect
P1
critical
Rank:
10
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: jesup, Assigned: jesup)

Tracking

({crash, csectype-uaf, sec-high})

49 Branch
mozilla54
x86
Windows
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr45 fixed, firefox52 fixed, firefox-esr52 fixed, firefox53 fixed, firefox54 fixed)

Details

(crash signature)

Attachments

(1 attachment)

+++ This bug was initially created as a clone of Bug #1294095 +++

We still have a small number of crashes in SctpDtlsOutput() on retransmissions:
https://crash-stats.mozilla.com/report/index/4ba148e2-49a5-4c2c-a7e9-476a82161019

This appears to be a "regular" retransmission, not an association abort.  I think this is the same base cause as bug 1294095, which closed the big hole caused by the ordering of aborts and timer shutdown.  The remaining failures may be cases where a normal retransmit hits the window.

We still could use the wrapper-object patch I wrote for bug 1294095. It does require locking on all the calls, and makes me a little concerned about deadlocks, but it should be safe.

See some of the previous discussion in bug 1294095 comment 38 and related comments before and after.

Note that we're still waiting on an sctp update from upstream; however it's unclear if that would actually help this.
michael - thoughts?
Flags: needinfo?(tuexen)
(In reply to Randell Jesup [:jesup] from comment #1)
> michael - thoughts?

Does this occur only on Windows?
Flags: needinfo?(tuexen)
(In reply to Michael Tüxen from comment #2)
> (In reply to Randell Jesup [:jesup] from comment #1)
> > michael - thoughts?
> 
> Does this occur only on Windows?

yes
Hmm. That could be a hint that it is a Windows specific locking problem. We are not testing much on Windows.
I can try to see if we can use a Firefox version on Windows to do some testing. Not sure. We normally don't have Windows...
Do you have an idea how the peer variable becomes NULL? If the retransmission is not killing the association, it must be the upper layer, right?
Flags: needinfo?(rjesup)
Randell: FYI: I still have to address:
https://github.com/sctplab/usrsctp/pull/101
not sure if it is related, but it is related to timers...
DataChannelConnection::Destroy()

https://dxr.mozilla.org/mozilla-central/source/netwerk/sctp/datachannel/DataChannel.cpp?q=DataChannelConnection%3A%3ADestroy%28%29&redirect_type=single#254

and DestroyOnSTS right after it.  

Note that DestroyOnSTS runs on another thread (see bug 87167 for why), but it holds a reference to the connection, such that we can't destroy the connection until DestroyOnSTS has finished.  The ordering of usrsctp_deregister_address and usrsctp_close are not guaranteed, however, in the current impl.  If it will help, I could move deregister_address to come before close always; I also probably can have it always happen *after* usrsctp_close.   However for the previous bug here this wasn't the issue apparently, it was the ordering of the callbacks we got.
Flags: needinfo?(rjesup)
Assignee: nobody → rjesup
Rank: 10
Randell, any update on this? Can we move it forward in any way?
Flags: needinfo?(rjesup)
speculative fix or attempt to mitigate - since Bug 1294095 landed, the crash rate is way down and of the crashes since then, all but one were 0xfffffffff
Attachment #8828115 - Flags: review?(tuexen)
Attachment #8828115 - Flags: review?(tuexen) → review+
There is a timer issue when calling usrsctp_deregister_address(), see the report in https://github.com/sctplab/usrsctp/pull/101, however, the fix there is wrong. Will fix it soon. Could this be related to the issue here?
I don't think that exact issue is going on here... but it might be related.
Flags: needinfo?(rjesup)
Hi Randell, I see a r+ patch here, does it need to be sec approved or are you waiting for the fix on the bug Michael talks about?
Flags: needinfo?(rjesup)
Thanks; I should probably land the speculative fix - it can't hurt.
Flags: needinfo?(rjesup)
https://hg.mozilla.org/mozilla-central/rev/2affff77f467
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
I'm confused by why this landed without sec-approval, but it looks like we need to move forward with branch requests at this point at least.
Ugh.  it's been sitting so long I'd forgotten it didn't have s-a long ago, and when I did it I was sick.  It's virtually certainly not exploitable, but it is a UAF.  filing uplift requests.  (also note that it's quite unclear that this actually fixes this problem; this basically just makes things more consistent.)
Comment on attachment 8828115 [details] [diff] [review]
make shutdown ordering for sctp deregister_address consistent

Approval Request Comment
[Feature/Bug causing the regression]: long-standing very-low-frequency problem

[User impact if declined]: security risk until this rides the trains to release.  Hard to see how this gets exploited, as we can't find a way to trigger the problem.

[Is this code covered by automated tests?]: Yes, though they don't trigger this issue

[Has the fix been verified in Nightly?]: We have no STR, and very low-freq

[Needs manual test from QE? If yes, steps to reproduce]:  no

[List of other uplifts needed for the feature/fix]: none

[Is the change risky?]: no

[Why is the change risky/not risky?]: collapses two timing cases down to one consistent ordering

[String changes made/needed]: none
Attachment #8828115 - Flags: approval-mozilla-beta?
Attachment #8828115 - Flags: approval-mozilla-aurora?
Comment on attachment 8828115 [details] [diff] [review]
make shutdown ordering for sctp deregister_address consistent

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: 
Fix Landed on Version: 54
Risk to taking this patch (and alternatives if risky): Low risk - mostly just guarantees ordering of shutdown

String or UUID changes made by this patch:  none

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #8828115 - Flags: approval-mozilla-esr45?
Comment on attachment 8828115 [details] [diff] [review]
make shutdown ordering for sctp deregister_address consistent

Post-landing s-a request (sorry!)

[Security approval request comment]
How easily could an exploit be constructed based on the patch? very hard.  We've been unable to reproduce it

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Not really, but the comment does mention this it to obtain consistent ordering, which might be a clue.  We've been unable to find any way this could cause a UAF, however, nor reproduce it leveraging this knowledge.

Which older supported branches are affected by this flaw? All

If not all supported branches, which bug introduced the flaw? N/A

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Trivial.

How likely is this patch to cause regressions; how much testing does it need? Very unlikely, no testing.
Attachment #8828115 - Flags: sec-approval?
Should have leave-opened it
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---
Comment on attachment 8828115 [details] [diff] [review]
make shutdown ordering for sctp deregister_address consistent

let's get this on aurora/beta now it's on trunk
Attachment #8828115 - Flags: approval-mozilla-beta?
Attachment #8828115 - Flags: approval-mozilla-beta+
Attachment #8828115 - Flags: approval-mozilla-aurora?
Attachment #8828115 - Flags: approval-mozilla-aurora+
Comment on attachment 8828115 [details] [diff] [review]
make shutdown ordering for sctp deregister_address consistent

*sigh* Sec-approval+
Attachment #8828115 - Flags: sec-approval? → sec-approval+
(In reply to Al Billings [:abillings] from comment #23)
> Comment on attachment 8828115 [details] [diff] [review]
> make shutdown ordering for sctp deregister_address consistent
> 
> *sigh* Sec-approval+

The dangers of checking in while sick.. Apologies
I left them set to affected on purpose since this bug is leave-open and the patch that landed wasn't guaranteed to fix it.
Comment on attachment 8828115 [details] [diff] [review]
make shutdown ordering for sctp deregister_address consistent

Fix a sec-high. ESR45+.
Attachment #8828115 - Flags: approval-mozilla-esr45? → approval-mozilla-esr45+
Jesup, since Ryan landed this and the patch wasn't guaranteed to fix this, do we have any insight now if this really fix the problem. Just wondering before i do the uplift to esr45
Flags: needinfo?(rjesup)
The crashes are so rare it will take a while to be sure.  There are no new issues reported due to the landing.  There have been no crashes since the landing, but in the 2 weeks before it the only crashes were in 49bx builds and one in 51.0.1
Flags: needinfo?(rjesup)
Ok, I'm going to call this fixed, and we can reopen if needed.
Status: REOPENED → RESOLVED
Closed: 3 years ago2 years ago
Keywords: leave-open
Resolution: --- → FIXED
Group: media-core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.