Closed Bug 1685354 Opened 4 years ago Closed 3 years ago

Crash in [@ mozilla::NrUdpSocketIpc::close_i]

Categories

(Core :: WebRTC, defect)

Unspecified
Android
defect

Tracking

()

RESOLVED FIXED
93 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox-esr91 93+ fixed
firefox92 --- wontfix
firefox93 + fixed

People

(Reporter: kbrosnan, Assigned: bwc)

Details

(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [post-critsmash-triage][sec-survey][adv-main93+r][adv-esr91.2+r])

Crash Data

Attachments

(1 file)

Crash report: https://crash-stats.mozilla.org/report/index/90227726-b809-472c-ab86-215580210106

Reason: SIGSEGV /SEGV_MAPERR

Top 10 frames of crashing thread:

0 libxul.so mozilla::NrUdpSocketIpc::close_i dom/media/webrtc/transport/nr_socket_prsock.cpp:1558
1 libxul.so mozilla::detail::runnable_args_base< dom/media/webrtc/transport/runnable_utils.h:41
2 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1197
3 libxul.so NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:513
4 libxul.so mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:332
5 libxul.so MessageLoop::Run ipc/chromium/src/base/message_loop.cc:309
6 libxul.so nsThread::ThreadFunc xpcom/threads/nsThread.cpp:442
7 libnss3.so _pt_root nsprpub/pr/src/pthreads/ptthread.c:201
8 libc.so libc.so@0x4285d 
9 libc.so libc.so@0x1af71 

This signature shows evidence of memory corruption. The address for all crashes are 0xe5e5e61d

Group: core-security → media-core-security

Could this also be socket process shutdown related?

Flags: needinfo?(docfaraday)

I doubt it; this is happening on release and beta, which do not run socket process by default. We also see no crashes on nightly, so running socket process might be preventing this problem.

I think that what is happening here is a race between this code:

https://searchfox.org/mozilla-central/rev/8432d4fe31245ae9121231d7c0335aaa342675d2/dom/media/webrtc/transport/nr_socket_prsock.cpp#1098

And this code:

https://searchfox.org/mozilla-central/rev/8432d4fe31245ae9121231d7c0335aaa342675d2/dom/media/webrtc/transport/nr_socket_prsock.cpp#1559

We cannot manipulate the RefPtr in this way since it is not itself threadsafe. We may need to rewrite this code to make use of NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY similar to MediaTransportHandler, so we can implement the "destruction tour" pattern so that resources are released on the appropriate threads before the NrUdpSocketIpc is itself destroyed.

Flags: needinfo?(docfaraday)
Assignee: nobody → docfaraday

Hey Byron, can you give us an update on this? Are we waiting on the socket process rollout?

Flags: needinfo?(docfaraday)

I am seeing signs that this is causing intermittent "YOU ARE LEAKING THE WORLD" errors, which is both plausible and time-consuming to diagnose. I have not had enough time to figure out what is going wrong.

Flags: needinfo?(docfaraday)

Mochitest try looks about like it normally does.

Comment on attachment 9238498 [details]
Bug 1685354: Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY instead of runnables inside d'tor. r?mjf

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: It would probably be kinda difficult to work out.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
  • Which older supported branches are affected by this flaw?: all
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: No
  • If not, how different, hard to create, and risky will they be?: I don't think they will be particularly hard, but it is difficult to say for sure.
  • How likely is this patch to cause regressions; how much testing does it need?: This lifecycle stuff is very subtle, so there's always a chance of problems.
Attachment #9238498 - Flags: sec-approval?

Not a very common crash, and doesn't seem to happen in latest builds. Only two crashes in Fx90, none in Fx91. In the last 6 months the earliest version showing this crash is Firefox 80: We don't need to worry about back-porting this to ESR-78. It's largely an Android crash (80%) so maybe we just aren't seeing it on 78.x. Unless we see some crashing I don't think we need to back-port to ESR-91 either.

This isn't a "stop-ship" type bug so we've missed 92 also, which is just as well since we're not seeing any crashes and we'll want some testing to make sure nothing got broken.

Comment on attachment 9238498 [details]
Bug 1685354: Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY instead of runnables inside d'tor. r?mjf

sec-approval = dveditz

Attachment #9238498 - Flags: sec-approval? → sec-approval+

wpt looks fine, seeing a new failure on mochitest though. Digging into whether that is in the base revision.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=71567df660de093c5c4f4dc1f6904cc3ab7f4e86

Hmm. No failures on base revision. This looks like an mDNS-related test though, which could be effected by transient external network stuff. Let's see what happens if I retrigger both at the same time.

Looks transient.

Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY instead of runnables inside d'tor. r=mjf
https://hg.mozilla.org/integration/autoland/rev/870bd38662bed0edddfbb804411a6927fc8f8348
https://hg.mozilla.org/mozilla-central/rev/870bd38662be

Group: media-core-security → core-security-release
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 93 Branch
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]

As part of a security bug pattern analysis, we are requesting your help with a high level analysis of this bug. It is our hope to develop static analysis (or potentially runtime/dynamic analysis) in the future to identify classes of bugs.

Please visit this google form to reply.

Flags: needinfo?(docfaraday)
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][sec-survey]

Please nominate this for ESR91 approval when you get a chance.

Comment on attachment 9238498 [details]
Bug 1685354: Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY instead of runnables inside d'tor. r?mjf

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration:
  • User impact if declined: Potential UAF. Maybe exploitable, maybe not.
  • Fix Landed on Version: 93
  • Risk to taking this patch: Medium
  • Why is the change risky/not risky? (and alternatives if risky): Anything with non-trivial lifecycle changes carries at least some risk.
  • String or UUID changes made by this patch: None
Flags: needinfo?(docfaraday)
Attachment #9238498 - Flags: approval-mozilla-esr91?

Comment on attachment 9238498 [details]
Bug 1685354: Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY instead of runnables inside d'tor. r?mjf

Approved for 91.2esr.

Attachment #9238498 - Flags: approval-mozilla-esr91? → approval-mozilla-esr91+
No longer blocks: webrtc-triage
Whiteboard: [post-critsmash-triage][sec-survey] → [post-critsmash-triage][sec-survey][adv-main93+r]
Whiteboard: [post-critsmash-triage][sec-survey][adv-main93+r] → [post-critsmash-triage][sec-survey][adv-main93+r][adv-esr91.2+r]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: