UAF in sigslot code (MutexAutoLock::MutexAutoLock)

RESOLVED INCOMPLETE

Status

()

defect
P1
critical
Rank:
10
RESOLVED INCOMPLETE
3 years ago
a year ago

People

(Reporter: jesup, Assigned: bwc)

Tracking

({crash, csectype-uaf, sec-critical})

Trunk
mozilla47
x86
Windows NT
Points:
---

Firefox Tracking Flags

(firefox42 wontfix, firefox44 wontfix, firefox45+ fixed, firefox46+ fixed, firefox47+ fixed, firefox48 fixed, firefox49+ fixed, firefox-esr3845+ fixed)

Details

(Whiteboard: [adv-main45+][adv-esr38.7+], crash signature)

Attachments

(1 attachment, 1 obsolete attachment)

This bug was filed from the Socorro interface and is 
report bp-4cae62bb-dbad-4a1c-b4bb-64ada2151215.
=============================================================

UAF (0xffffffffe5e5e5e9)
Group: media-core-security
backlog: --- → webrtc/webaudio+
Rank: 10
(Assignee)

Comment 1

3 years ago
This looks like a dispatch landing on a PeerConnectionMedia that has been freed.
(Assignee)

Comment 2

3 years ago
The signature matches SignalUpdateDefaultCandidate.
(Assignee)

Comment 3

3 years ago
I'm not having any luck finding a way for SignalUpdateDefaultCandidate to fire after the PeerConnectionMedia has been destroyed.
FYI, this was in 42 in case things have changed
I've asked Byron to take a look at the Fx42 code to see if there's a way this could have happened in Fx42 that can't happen now.  (We've landed a lot of code changes since 42.)
Assignee: nobody → docfaraday
(Assignee)

Comment 6

3 years ago
I've looked at 42, doesn't seem to be different enough to allow this to happen either.
Group: core-security
Once we know what versions this affects , can you request tracking? Thanks.
I see a bunch of crashes on 43, and one on 44, in the last 28 days, but not more recent than that. Maybe this can be closed as incomplete?
I'd like to wait and reassess when Fx44 has more usage.  BTW, I don't this crash is actually being hit that frequently.  You have to look very closely at the stack to know if it's this bug.
"More Reports" for this bug catches a tun of bugs in different areas; you have to check the stack to know which one is which.  I think I've seen 1 hit for this particular bug total, in 42.  I'll check the recent stacks again for more hits from this, but it's time-consuming.

We desperately need search terms for "on the crashing thread's stack" in crashstats
In the last month:
43.0.1: https://crash-stats.mozilla.com/report/index/a8c5433b-3790-4bf9-a6c5-963492160126
43.0b2: https://crash-stats.mozilla.com/report/index/4cfb1a41-51e2-4b60-ba80-efe0e2160120
It's so low frequency we're not likely to see hits outside of release (though we got one beta)
(Assignee)

Comment 12

3 years ago
(Assignee)

Comment 13

3 years ago
Comment on attachment 8714364 [details] [diff] [review]
Assert if PCM is destroyed improperly

Review of attachment 8714364 [details] [diff] [review]:
-----------------------------------------------------------------

This might catch this bug when it happens, but it is hard to be sure.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5aee4562c256
Attachment #8714364 - Flags: review?(rjesup)
Attachment #8714364 - Flags: review?(rjesup) → review+
Once it's stable on Nightly, likely we want to uplift this to Aurora and Beta.  If this does catch it, it converts it into a safe crash.
(Assignee)

Comment 15

3 years ago
Comment on attachment 8714364 [details] [diff] [review]
Assert if PCM is destroyed improperly

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

   Not very easily at all; we aren't even sure this will catch this bug.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

   No.

Which older supported branches are affected by this flaw?

   It looks like 42-44 are definitely affected, maybe more.   

If not all supported branches, which bug introduced the flaw?

   Unknown.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

   No, but this should be low-risk.

How likely is this patch to cause regressions; how much testing does it need?

   Very unlikely, but it might cause things to crash more reliably when things go horribly wrong.
Attachment #8714364 - Flags: sec-approval?
(Assignee)

Updated

3 years ago
Keywords: leave-open
Comment on attachment 8714364 [details] [diff] [review]
Assert if PCM is destroyed improperly

sec-approval+ because...why not?
Attachment #8714364 - Flags: sec-approval? → sec-approval+
(Assignee)

Updated

3 years ago
Keywords: checkin-needed
Tracking since this is sec-critical and will likely need uplift. 

Randall can you check to see if ESR38 is affected?
Flags: needinfo?(rjesup)
There's no easy way to tell barring a hit; this is a defense-in-depth patch.  Likely any bug goes back to 38, but there's no easy way to tell without a different interface to crash-stats to look at a much wider timeframe - and then looking at the stacks of hundreds of crashes to see if any of them are from sigslot.

I'd suggest taking this safely-assert patch for 38
Flags: needinfo?(rjesup)
Byron, could you fill the uplift request to aurora, beta & esr38? Thanks
Flags: needinfo?(docfaraday)
(Assignee)

Comment 22

3 years ago
Comment on attachment 8714364 [details] [diff] [review]
Assert if PCM is destroyed improperly

[Approval Request Comment]
User impact if declined: 

   This is mostly a diagnostic patch, but if it does catch this problem, it converts a potential unsafe UAF crash into a safe crash.

Fix Landed on Version:

   47

Risk to taking this patch (and alternatives if risky): 

   Very low risk, although it might cause more frequent (safe) crashes than are happening now.

String or UUID changes made by this patch: 

   None.

[Feature/regressing bug #]:

   Unknown.

[Describe test coverage new/current, TreeHerder]:

   We have no test case that is able to reproduce this problem, and are unsure how this problem could happen.

[Risks and why]: 

   This patch adds a release assert for a case that should never happen, but it may happen more often than we think. There's no way to know for sure until we see some data.
Flags: needinfo?(docfaraday)
Attachment #8714364 - Flags: approval-mozilla-esr38?
Attachment #8714364 - Flags: approval-mozilla-beta?
Attachment #8714364 - Flags: approval-mozilla-aurora?
Comment on attachment 8714364 [details] [diff] [review]
Assert if PCM is destroyed improperly

Improve the situation, taking it.
Should be in 45 beta 5.
Attachment #8714364 - Flags: approval-mozilla-esr38?
Attachment #8714364 - Flags: approval-mozilla-esr38+
Attachment #8714364 - Flags: approval-mozilla-beta?
Attachment #8714364 - Flags: approval-mozilla-beta+
Attachment #8714364 - Flags: approval-mozilla-aurora?
Attachment #8714364 - Flags: approval-mozilla-aurora+
Whiteboard: [adv-main45+][adv-esr38.7+]
(Assignee)

Comment 25

3 years ago
Not seeing this crash on 45 still. I do see another crash on 44 that might be related:

https://crash-stats.mozilla.com/report/index/e31270dc-61d1-491f-b0b7-9e4f12160229

This one definitely looks like it is caused by using the PeerConnectionMedia after it has started tearing down, possibly for the same reason. It also looks like this might be caused if the following code doesn't actually do what it is supposed to:

http://mxr.mozilla.org/mozilla-release/source/media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.cpp#794

I don't see how that could cause this bug though.
(Assignee)

Comment 26

3 years ago
I'm also not seeing the release assertion we added in crash-stats.
How long do we want to keep this open?
Flags: needinfo?(rjesup)
Only crash I see in the last month was a single 44b1 crash.   (19 crashes, but all are other sources than this except one).  44b1 wouldn't have byron's mitigation (MOZ_RELEASE_ASSERT); a search for PeerConnectionMedia in crash-stats doesn't find anything pointing to the release assert.

I think we need more time to know, given low frequency of hits.  We could close this and needinfo ourselves to come back and check it.  What we landed is mitigation, though, not really a core fix.
Flags: needinfo?(rjesup)
I can go either way. If you'd prefer to leave it open, that's fine.
(Assignee)

Comment 31

3 years ago
I have a hypothesis as to what is going on. Another patch coming soon.
(Assignee)

Comment 32

3 years ago
MozReview-Commit-ID: 5Mq8CBmSDsu
(Assignee)

Updated

3 years ago
Attachment #8714364 - Attachment is obsolete: true
(Assignee)

Comment 33

3 years ago
Comment on attachment 8744906 [details] [diff] [review]
Add an assertion

Review of attachment 8744906 [details] [diff] [review]:
-----------------------------------------------------------------

Adding an assertion to check another possible cause.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e74fe785f9f0
Attachment #8744906 - Flags: review?(drno)
Comment on attachment 8744906 [details] [diff] [review]
Add an assertion

Review of attachment 8744906 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM
Attachment #8744906 - Flags: review?(drno) → review+
(Assignee)

Comment 35

3 years ago
Comment on attachment 8744906 [details] [diff] [review]
Add an assertion

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

   Not easily at all.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

   Not really.

Which older supported branches are affected by this flaw?

   Unknown, this is just an attempt to diagnose.

If not all supported branches, which bug introduced the flaw?

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

   No, but they should be easy if we need them.

How likely is this patch to cause regressions; how much testing does it need?

   Extremely unlikely. At most, it will convert an unsafe crash/undefined behavior into a safe crash.
Attachment #8744906 - Flags: sec-approval?
I'm confused on why you're asking for sec-approval on a bug we shipped a fix for a month and a half ago. 

Is this a new few for the core issue and supersedes the previous fix?
Flags: needinfo?(docfaraday)
(Assignee)

Comment 37

3 years ago
Both of these patches have been investigatory, not fixes.
Flags: needinfo?(docfaraday)
(In reply to Byron Campen [:bwc] (PTO Apr 15) from comment #37)
> Both of these patches have been investigatory, not fixes.

I was going by comment 28:

> What we landed is mitigation, though, not really a core fix.
Attachment #8744906 - Flags: sec-approval? → sec-approval+
Tracking for 49. We may want to mark this as still affected for older versions if uplifting this latest patch is a good idea. Or, it's usually more clear to open a new bug.
https://hg.mozilla.org/mozilla-central/rev/f241151c6169
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Oops, didn't catch that leave-open.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I don't see any crashes containing PeerConnectionMedia on Nightly, for what it is worth.
What's left to do here. It is 1 1/2 months later and this bug comes up in all security triage meetings as an open issue.
Flags: needinfo?(docfaraday)
No crashes in the last month again.  I'm going to close as incomplete pending new data
Status: REOPENED → RESOLVED
Last Resolved: 3 years ago3 years ago
Flags: needinfo?(docfaraday)
Resolution: --- → INCOMPLETE
Group: media-core-security
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.