Closed
Bug 1315248
Opened 8 years ago
Closed 7 years ago
Crash in mozilla::dom::UDPSocketParent::ConnectInternal
Categories
(Core :: DOM: Core & HTML, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla54
People
(Reporter: calixte, Assigned: drno)
References
Details
(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [post-critsmash-triage][adv-main52+])
Crash Data
Attachments
(3 files)
59 bytes,
text/x-review-board-request
|
schien
:
review+
jcristau
:
approval-mozilla-aurora+
jcristau
:
approval-mozilla-beta+
|
Details |
2.42 KB,
patch
|
jesup
:
review+
gchang
:
approval-mozilla-aurora+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
2.39 KB,
patch
|
drno
:
review+
jcristau
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-782e6b03-4a79-4938-8f94-9417d2161031. ============================================================= This crash (7 crashes) appeared on nightly the 2016-10-31 just after the patch [1] to fix bug 1198381 which modified the file xpcom/threads/nsThread.cpp which is in the backtrace. [1] https://hg.mozilla.org/mozilla-central/rev?node=17b19eb241db07356004d211c4d23a7646fd8b09
Reporter | ||
Updated•8 years ago
|
Flags: needinfo?(afarre)
Comment 1•8 years ago
|
||
It really doesn't feel like it's related to bug 1198381. The changes to nsThread all happen before a runnable is executed, while this call stack is in the middle of the execution of a runnable. I'll look a bit closer on Monday though and ask around for ideas.
Flags: needinfo?(afarre)
Updated•8 years ago
|
Priority: -- → P3
Comment 2•7 years ago
|
||
this signature is increasing in frequency after firefox 51 was released - it's also quite common on os x.
status-firefox51:
--- → affected
status-firefox53:
--- → affected
status-firefox54:
--- → affected
OS: Windows 10 → All
Hardware: Unspecified → All
See Also: → 1194259
Assignee | ||
Comment 3•7 years ago
|
||
To add some background: WebRTC attempts a fake connect of a UDP socket to find out the IP address of the default interface on a machine. I failed to find any code change in WebRTC which could explain this. Now up to version 52 that feature only gets triggered if user has set a userpref to not disclose all of his addresses. From version 52 onward that same feature is also being used as a fallback method if other means of gathering the local IP addresses have failed. So the most logic explanation for the spike is that more users appear to set this pref (maybe via an add-on)?! From looking at the code it looks like |mSocket| is null. Which either means the socket got closed, before we call connect() on it, or the socket wasn't even successfully created before we call connect() on it. Neither of which really makes sense to me. I spotted bug 1336586 while looking at our code. Potentially that could cause this, but I'm not 100% convinced.
Comment 4•7 years ago
|
||
I apparently hit this crash just now on Firefox Nightly on Windows 10: https://crash-stats.mozilla.com/report/index/08ac33be-79af-46ae-9a26-c93502170209 I had reopened Firefox with a full set of tabs from my previous session, and I was clicking through and closing a bunch of tabs in series. Some of the tabs were closed as they were starting to load, so it's possible there's a race with something.
Comment 5•7 years ago
|
||
Does anything guarantee that DoConnect() is processed before RecvClose (in case child explicitly closes the connect very early) or ActorDestroy (in case child process crashes or something)? DoConnect() is called asynchronously. The relevant code might be coming from bug 1194259
Flags: needinfo?(rjesup)
Comment 6•7 years ago
|
||
Oh, relevant! I have dom.ipc.processCount set to 16.
Comment 7•7 years ago
|
||
experienced this issue on Linux/amd64: f64d4244-a80f-4fa1-af4b-aab1c2170209
Comment 8•7 years ago
|
||
drno: can you answer Olli's question? I suspect the answer is no
Flags: needinfo?(rjesup) → needinfo?(drno)
Comment hidden (mozreview-request) |
Comment 10•7 years ago
|
||
mozreview-review |
Comment on attachment 8836279 [details] Bug 1315248: ensure socket is available for connect and send. https://reviewboard.mozilla.org/r/111754/#review113056 ::: dom/network/UDPSocketParent.cpp:342 (Diff revision 1) > UDPSocketParent::RecvOutgoingData(const UDPData& aData, > const UDPSocketAddr& aAddr) > { > - MOZ_ASSERT(mSocket); > + if (!mSocket) { > + MOZ_ASSERT(false, "socket is closed"); > + return IPC_FAIL(this, "Attempted to send data on closed socket"); curious, is this really a reason to kill the child process?
Assignee | ||
Comment 11•7 years ago
|
||
WebRTC's networking API's pretty much guarantee that close() can not be called before we have received the result for a connect() call, as nICEr (our ICE stack) assumes to operate on a sync socket API. Further more I assume that IPC calls in the parent get executed in the order which they were received. That's why I failed to see how we actually end up hitting this crash. But ActorDestroy() is probably the answer how this can happen.
Flags: needinfo?(drno)
Assignee | ||
Comment 12•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8836279 [details] Bug 1315248: ensure socket is available for connect and send. https://reviewboard.mozilla.org/r/111754/#review113056 > curious, is this really a reason to kill the child process? No. You are right. I forgot what IPC_FAIL() means.
Comment hidden (mozreview-request) |
Comment 14•7 years ago
|
||
mozreview-review |
Comment on attachment 8836279 [details] Bug 1315248: ensure socket is available for connect and send. https://reviewboard.mozilla.org/r/111754/#review113180 `RecvConnect`, `RecvJoinMulticast` and `RecvLeaveMulticast` should also do a nullptr check on `mSocket` as well. ::: dom/network/UDPSocketParent.cpp:341 (Diff revision 2) > mozilla::ipc::IPCResult > UDPSocketParent::RecvOutgoingData(const UDPData& aData, > const UDPSocketAddr& aAddr) > { > - MOZ_ASSERT(mSocket); > + if (!mSocket) { > + MOZ_ASSERT(false, "socket is closed"); We should use NS_WARN_IF instead of MOZ_ASSERT, since there might be close or IPC actor error happened simultaneously. Aserting on chrome process seems overkill to me.
Attachment #8836279 -
Flags: review?(schien) → review+
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(drno)
Comment 19•7 years ago
|
||
Pushed by drno@ohlmeier.org: https://hg.mozilla.org/integration/autoland/rev/6a3ba8bfb34c ensure socket is available for connect and send. r=schien
Comment 20•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/6a3ba8bfb34c
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Comment 21•7 years ago
|
||
Please request Aurora/Beta approval on this when you get a chance.
Assignee | ||
Comment 22•7 years ago
|
||
Comment on attachment 8836279 [details] Bug 1315248: ensure socket is available for connect and send. Approval Request Comment [Feature/Bug causing the regression]: Bug 1194259 plus probably some timing related changes in later releases. [User impact if declined]: Crashes on pages using webrtc if the page gets closed (?) [Is this code covered by automated tests?]: No, as we don't know exactly how to reproduce it and it is most likely highly timing sensitive. [Has the fix been verified in Nightly?]: No, as we don't know how to reproduce it. [Needs manual test from QE? If yes, steps to reproduce]: Not clear how to reproduce. [List of other uplifts needed for the feature/fix]: N/A [Is the change risky?]: No [Why is the change risky/not risky?]: The patch adds a safeguard against a null pointer reference. [String changes made/needed]: N/A
Flags: needinfo?(drno)
Attachment #8836279 -
Flags: approval-mozilla-beta?
Attachment #8836279 -
Flags: approval-mozilla-aurora?
Comment 23•7 years ago
|
||
Comment on attachment 8836279 [details] Bug 1315248: ensure socket is available for connect and send. avoid null deref in udp code, beta52+, aurora53+
Attachment #8836279 -
Flags: approval-mozilla-beta?
Attachment #8836279 -
Flags: approval-mozilla-beta+
Attachment #8836279 -
Flags: approval-mozilla-aurora?
Attachment #8836279 -
Flags: approval-mozilla-aurora+
Comment 24•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/aafe44390b1b
Comment 25•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/0d1e2aa8c0c2
Comment 26•7 years ago
|
||
had to push https://hg.mozilla.org/releases/mozilla-beta/rev/d0d5f1187d755b0d1fc65baaecd0ca2e5d16d842 as follow up to fix a bustage (thanks to ryanvm to point me to the right direction)
Assignee | ||
Comment 27•7 years ago
|
||
(In reply to Carsten Book [:Tomcat] from comment #26) > had to push > https://hg.mozilla.org/releases/mozilla-beta/rev/ > d0d5f1187d755b0d1fc65baaecd0ca2e5d16d842 as follow up to fix a bustage > (thanks to ryanvm to point me to the right direction) Ups. Sorry for that. And thanks for fixing it!
Comment 28•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-esr52/rev/0d1e2aa8c0c2 https://hg.mozilla.org/releases/mozilla-esr52/rev/d0d5f1187d75
status-firefox-esr52:
--- → fixed
Comment 29•7 years ago
|
||
hi, there are still crashes on 52.0b7: https://crash-stats.mozilla.com/report/index/ea0d347b-0be1-4cf9-bc43-57eba2170217 do we need to reopen?
Flags: needinfo?(drno)
Assignee | ||
Comment 30•7 years ago
|
||
After verifying that my patch actually has made it into beta 7 I have to admit that I'm puzzled on what is going on here. Actually the list of crash reports shows a few crash addresses like this 0xffffffffe5e5e5e5, which if I'm not mistaken indicates user after free. While looking around in the code I noticed that UDPSocketParent::RecvConnect() dispatches to the STS thread. But none of the other functions in UDPSocketParent do the same dispatch. So my best guess right now is that we are seeing some threading issue here. In fact nsUDPSocket::InitWithAddress() and nsUDPSocket::Connect() show the same pattern: InitWithAddress() appears to be running on any thread, but Connect() checks for STS. schien: do you have any thoughts on this?
Status: RESOLVED → REOPENED
Flags: needinfo?(drno) → needinfo?(schien)
Resolution: FIXED → ---
Comment 31•7 years ago
|
||
marking this as security-sensitive as precaution as well...
Group: core-security
Updated•7 years ago
|
Group: core-security → dom-core-security
Comment 32•7 years ago
|
||
Thanks philipp - was was about to do the same :-) This smacks of something involving threading or messages; perhaps passing a 'this' pointer to a WrapRunnable without ensuring a lifetime long enough that the runnable is certain to have been destroyed. It also could be thread coherence - accessing 'this' members (and changing them) from the STS thread and the PBackground thread. Something like RecvConnect() or sendConnectResponse() -- if 'this' has gone away before the runnable runs, boom. You can do WrapRunnable(RefPtr<USPSocketParent>(this), ...
Comment 33•7 years ago
|
||
Agree with @jesup, this looks fishy when I skim through UDPSocketParent.cpp. We should try hold additional refcount for that WrapRunnable.
Flags: needinfo?(schien)
Updated•7 years ago
|
Flags: needinfo?(drno)
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(drno)
Comment 34•7 years ago
|
||
(In reply to Nils Ohlmeier [:drno] from comment #30) > Actually the list of crash reports shows a few crash addresses like this > 0xffffffffe5e5e5e5, which if I'm not mistaken indicates user after free. FYI crash addresses in Breakpad are always treated as 64-bit even on dumps from 32-bit CPUs, so this is just 0xe5e5e5e5 sign-extended to 64-bit, which is almost certainly some sort of poison pattern.
Comment 35•7 years ago
|
||
This is a sec-high in 52, which is also an ESR release. Any chance of a fix while we're still in beta?
Flags: needinfo?(drno)
Keywords: csectype-uaf,
sec-high
Assignee | ||
Comment 36•7 years ago
|
||
Flags: needinfo?(drno)
Attachment #8840079 -
Flags: review?(rjesup)
Updated•7 years ago
|
Attachment #8840079 -
Flags: review?(rjesup) → review+
Assignee | ||
Comment 37•7 years ago
|
||
Try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5844222b8a8feb6277a252317933b19216ba5846
Assignee | ||
Comment 38•7 years ago
|
||
Comment on attachment 8840079 [details] [diff] [review] bug1315248_part2.patch [Security approval request comment] How easily could an exploit be constructed based on the patch? It's probably pretty hard as it appears to be some race condition between getting a PeerConnection to gather IP addresses and shutting it down at the right moment. What makes it easier is that you can try that in a loop without user interaction until your code hits it. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No. Which older supported branches are affected by this flaw? All releases up to Firefox 46. If not all supported branches, which bug introduced the flaw? Bug 1194259 added support for UDP connect. Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? The patch applies to aurora. I have a patch for beta. How likely is this patch to cause regressions; how much testing does it need? I think the likelihood of regressions is pretty low, since it only holds a handle to an object to ensure the object still exists when the runable gets executed. As we don't know how to reproduce the problem we can really test the problem itself.
Attachment #8840079 -
Flags: sec-approval?
Comment 39•7 years ago
|
||
Comment on attachment 8840079 [details] [diff] [review] bug1315248_part2.patch sec-approval+ for trunk. We'll want this on branches ASAP as well. Can you please nominate branch patches?
Attachment #8840079 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Comment 40•7 years ago
|
||
Comment on attachment 8840079 [details] [diff] [review] bug1315248_part2.patch Approval Request Comment [Feature/Bug causing the regression]: Bug 1194259 in Fx 46 [User impact if declined]: Crashes the browser if WebRTC is used and tab gets closed. [Is this code covered by automated tests?]: No, we don't know how to reproduce it. [Has the fix been verified in Nightly?]: No. [Needs manual test from QE? If yes, steps to reproduce]: No STR known. [List of other uplifts needed for the feature/fix]: N/A [Is the change risky?]: Low risk [Why is the change risky/not risky?]: I think it is not risky as it only adds a refpointer to keep the object alive until the runnables are executed. [String changes made/needed]: N/A
Attachment #8840079 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 41•7 years ago
|
||
This is just a separate patch for beta (because attachment 8840079 [details] [diff] [review] does not apply cleanly).
Attachment #8840187 -
Flags: review+
Assignee | ||
Comment 42•7 years ago
|
||
Comment on attachment 8840187 [details] [diff] [review] bug1315248_part2_beta.patch Approval Request Comment [Feature/Bug causing the regression]: Bug 1194259 in Fx 46 [User impact if declined]: Crashes the browser if WebRTC is used and tab gets closed. [Is this code covered by automated tests?]: No, we don't know how to reproduce it. [Has the fix been verified in Nightly?]: No. [Needs manual test from QE? If yes, steps to reproduce]: No STR known. [List of other uplifts needed for the feature/fix]: N/A [Is the change risky?]: Low risk [Why is the change risky/not risky?]: I think it is not risky as it only adds a refpointer to keep the object alive until the runnables are executed. [String changes made/needed]: N/A
Attachment #8840187 -
Flags: approval-mozilla-beta?
Comment 43•7 years ago
|
||
Comment on attachment 8840079 [details] [diff] [review] bug1315248_part2.patch Fix a crash. Aurora53+.
Attachment #8840079 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Updated•7 years ago
|
Keywords: checkin-needed
Comment 44•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/4fff02e2a6a2d1d8f3ea81c7ab3f30374b9e03c9
Keywords: checkin-needed
Comment 45•7 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/08c817237480
Comment 46•7 years ago
|
||
Comment on attachment 8840187 [details] [diff] [review] bug1315248_part2_beta.patch uaf fix, should be in 52.0b9
Attachment #8840187 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 47•7 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/500322a38bc8
Comment 48•7 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-esr52/rev/500322a38bc8
Comment 49•7 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/4fff02e2a6a2
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 50•7 years ago
|
||
Check back on crash stats once this has been in the public for some time.
Flags: needinfo?(drno)
Updated•7 years ago
|
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]
Updated•7 years ago
|
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main52+]
Assignee | ||
Comment 51•7 years ago
|
||
Since the landing of the first patched lowered the frequency of the crashes in beta to a really low volume I guess we need to wait for 52 to become release to judge if it is really fixed now with both patches.
Updated•7 years ago
|
status-firefox-esr45:
--- → unaffected
tracking-firefox52:
--- → ?
tracking-firefox53:
--- → ?
tracking-firefox54:
--- → ?
tracking-firefox-esr52:
--- → ?
Updated•7 years ago
|
Group: dom-core-security → core-security-release
Assignee | ||
Comment 53•7 years ago
|
||
Since 52 becoming release I only see crashes for 51.0.1 or for older beta's of 52. So it looks good. I'll check back in two weeks to see if this really disappeared from crash stats for good this time.
Assignee | ||
Comment 54•7 years ago
|
||
No crashes in 52 since release. Now I'm convinced we fixed it properly.
Flags: needinfo?(drno)
Updated•7 years ago
|
Group: core-security-release
Updated•5 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•