Assertion failure: IsAcquired() && mOwningThread == PR_GetCurrentThread(), at xpcom/threads/BlockingResourceBase.cpp:369
Categories
(Core :: WebRTC: Networking, defect, P1)
Tracking
()
People
(Reporter: bc, Assigned: jesup)
References
()
Details
(Keywords: assertion, csectype-race, sec-high, Whiteboard: [post-critsmash-triage][adv-main66+][adv-esr60.6+])
Attachments
(2 files)
2.42 KB,
patch
|
drno
:
review+
tuexen
:
review+
lizzard
:
approval-mozilla-beta+
RyanVM
:
approval-mozilla-release-
RyanVM
:
approval-mozilla-esr60+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
1.98 KB,
patch
|
drno
:
review+
|
Details | Diff | Splinter Review |
Linux x86_64 Nightly. wait a few seconds.
Linux x86_64 Beta. wait a minute or two.
May require the live show to be playing in the page. It just stopped being reproducible when the "Watch Live now" started throwing HLS.js error networkError. Appears to be using https://videojs.com/. I wasn't able to reproduce with any examples at videojs.com. I'm going to file any way in the event the error is temporary and we will be able to reproduce later.
Opt doesn't appear to crash.
- Assertion failure: IsAcquired() && mOwningThread == PR_GetCurrentThread(), at xpcom/threads/BlockingResourceBase.cpp:369
#01: mozilla::DataChannelConnection::ReceiveCallback(socket*, void*, unsigned long, sctp_rcvinfo, int) (netwerk/sctp/datachannel/DataChannel.cpp:2317)
#02: mozilla::receive_cb(socket*, sctp_sockstore, void*, unsigned long, sctp_rcvinfo, int, void*) (netwerk/sctp/datachannel/DataChannel.cpp:241)
#03: sctp_invoke_recv_callback (netwerk/sctp/src/netinet/sctputil.c:4859)
#04: sctp_add_to_readq (netwerk/sctp/src/netinet/sctputil.c:4968)
#05: sctp_notify_send_failed (netwerk/sctp/src/netinet/sctputil.c:3150)
#06: sctp_ulp_notify (:?)
#07: sctp_release_pr_sctp_chunk (netwerk/sctp/src/netinet/sctputil.c:5119)
#08: sctp_t3rxt_timer (:?)
#09: sctp_timeout_handler (netwerk/sctp/src/netinet/sctputil.c:1763)
#10: sctp_handle_tick (netwerk/sctp/src/netinet/sctp_callout.c:167)
#11: start_thread (pthread_create.c:?)
#12: __GI___clone (:?)
#13: ??? (???:???)
jesup: I made this security senstive because of bug 1493347. Could you take a look and see if this is really security sensitive?
Previously this url had thrown:
Assertion failure: false (MOZ_ASSERT_UNREACHABLE: Should always be able to find the appropriate sample description! Malformed mp4?), at /builds/worker/workspace/build/src/dom/media/mp4/Index.cpp:260
Assertion failure: false (MOZ_ASSERT_UNREACHABLE: Should always be able to find the appropriate sample description! Malformed mp4?), at z:/build/build/src/dom/media/mp4/Index.cpp:260
#01: mozilla::MP4TrackDemuxer::GetNextSample() [dom/media/mp4/MP4Demuxer.cpp:419]
#02: mozilla::MP4TrackDemuxer::GetSamples(int) [dom/media/mp4/MP4Demuxer.cpp:480]
#03: mozilla::TrackBuffersManager::DoDemuxVideo() [dom/media/mediasource/TrackBuffersManager.cpp:1396]
#04: mozilla::TrackBuffersManager::CodedFrameProcessing() [dom/media/mediasource/TrackBuffersManager.cpp:1366]
#05: mozilla::TrackBuffersManager::SegmentParserLoop() [dom/media/mediasource/TrackBuffersManager.cpp:809]
#06: nsresult mozilla::detail::RunnableMethodImpl<mozilla::TrackBuffersManager ,void (mozilla::TrackBuffersManager::)(),1,mozilla::RunnableKind::Standard>::Run() [xpcom/threads/nsThreadUtils.h:1161]
#07: mozilla::TaskQueue::Runner::Run() [xpcom/threads/TaskQueue.cpp:200]
#08: nsThreadPool::Run() [xpcom/threads/nsThreadPool.cpp:243]
#09: nsThread::ProcessNextEvent(bool,bool *) [xpcom/threads/nsThread.cpp:1154]
#10: NS_ProcessNextEvent(nsIThread *,bool) [xpcom/threads/nsThreadUtils.cpp:468]
#11: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate *) [ipc/glue/MessagePump.cpp:304]
#12: MessageLoop::RunHandler() [ipc/chromium/src/base/message_loop.cc:309]
#13: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:291]
#14: nsThread::ThreadFunc(void *) [xpcom/threads/nsThread.cpp:452]
#15: PR_NativeRunThread [nsprpub/pr/src/threads/combined/pruthr.c:406]
#16: static unsigned int pr_root(void *) [nsprpub/pr/src/md/windows/w95thred.c:138]
#17: ucrtbase.dll + 0x1c4ce
#18: KERNEL32.DLL + 0x13dc4
#19: static void patched_BaseThreadInitThunk(int, void *, void *) [mozglue/build/WindowsDllBlocklist.cpp:724]
#20: ntdll.dll + 0x73691
On Windows and Linux.
Reporter | ||
Updated•6 years ago
|
Updated•6 years ago
|
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 1•6 years ago
|
||
I presume from comment 0 that you need to run a debug build to see the crash; I tried an opt build (with audio off!!) and didn't have problems (plus bob indicated it may not repro anymore; it may have been caused by an ad).
Reporter | ||
Comment 2•6 years ago
|
||
It never crashed opt only the fatal assertion on debug. I was able to reproduce it today. It requires the "Watch Live now" video to be playing and visible before the assertion will fire.
Assignee | ||
Comment 3•6 years ago
|
||
I caught this in debug build by letting it run long enough.
124 Thread 0x7fff925fd700 (LWP 25991) "SCTP timer" 0x00007fffe0b3a558 in mozilla::OffTheBooksMutex::AssertCurrentThreadOwns (
this=0x7fff81b2b468) at /home/jesup/src/mozilla/inbound/xpcom/threads/BlockingResourceBase.cpp:369
It's asserting we don't own the mutex when invoked on the timer thread (connection timeout). Normally all callbacks with data occur on the thread we provided input on, where we lock the mutex before passing it in.
We can just lock on the output (since all input is on one thread), or do a TryLock() on callbacks (to catch the case of being called from the timer thread). Locking in the callbacks seems to make more sense, unless there are other requirements about usrsctp_* functions being called from one thread (or without overlapping with other calls). I'll try that first; if it doesn't work the TryLock() should
Our tests don't catch this since it depends on a connection timing out.
I suspect this is also the cause of the other long-standing SCTP sec bug
Assignee | ||
Comment 4•6 years ago
|
||
Assignee | ||
Updated•6 years ago
|
Comment 5•6 years ago
|
||
I don't believe I have a decent enough overview regarding the threads used in that code to review. But anything you want me to test?
Comment 6•6 years ago
|
||
Assignee | ||
Comment 7•6 years ago
|
||
Assignee | ||
Comment 8•6 years ago
|
||
Note: this will be a prime candidate for any 65 point release. If we had 1 more week, I'd push to get it into 65.0 release
Updated•6 years ago
|
Comment 9•6 years ago
|
||
I "tested" it for two days with a constant data channel connection using Threema Web... if that helps. Nothing to report so far. :)
Updated•6 years ago
|
Updated•6 years ago
|
Assignee | ||
Comment 10•6 years ago
|
||
sec-high - missing lock means that the association failure could get processed on a different thread without locks, and so you can get nasty collisions on the shared structures, including UAFs.
This may be related to bug 1400563, which also involves the SCTP timer thread
Comment 11•6 years ago
|
||
Assignee | ||
Comment 12•6 years ago
|
||
Comment on attachment 9038020 [details] [diff] [review]
Move some Mutexes
Security Approval Request
How easily could an exploit be constructed based on the patch?
Not especially easy, but I wouldn't guarantee it's very hard. Knowing there's a missing mutex on that gives a big hint, and likely they could find the timer callback possibility eventually, and then they have to figure out what that message causes to happen under lock that could mess up the environment. However, this may be the cause of the long-standing clear UAF that involves the SCTP timer, so maybe.
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
Yes
Which older supported branches are affected by this flaw?
all
If not all supported branches, which bug introduced the flaw?
None
Do you have backports for the affected branches?
No
If not, how different, hard to create, and risky will they be?
Trivial to create
How likely is this patch to cause regressions; how much testing does it need?
Regressions are unlikely, and would be deadlocks. Alternative patch would be to lock the connection in the callback (from the timer) if not already locked (i.e. on the timer thread). Much messier and puts a bigger bullseye on it
Beta/Release Uplift Approval Request
Feature/Bug causing the regression
None
User impact if declined
security risk
Is this code covered by automated tests?
No
Has the fix been verified in Nightly?
No
Needs manual test from QE?
No
If yes, steps to reproduce
Browse to the URL and let it sit, in a Debug build. I manually verified the fix locally after reproing the problem.
List of other uplifts needed
None
Risk to taking this patch
Low
Why is the change risky/not risky? (and alternatives if risky)
Just moves a lock
String changes made/needed
none
ESR Uplift Approval Request
If this is not a sec:{high,crit} bug, please state case for ESR consideration
Security bug
User impact if declined
Security risk
Fix Landed on Version
TBD
Risk to taking this patch
Low
Why is the change risky/not risky? (and alternatives if risky)
Just moves a lock
String or UUID changes made by this patch
none
Comment 13•6 years ago
|
||
I'm not convinced we need to take this for a 65 dot release, but I'll track it accordingly for now. Note that we'd also need to dot release ESR 60.5 as well if we decided to go that route.
Comment 15•6 years ago
|
||
Updated•6 years ago
|
Comment 16•6 years ago
|
||
We're going to let this ride with Fx66/60.6esr.
Updated•6 years ago
|
Assignee | ||
Comment 17•6 years ago
|
||
Assignee | ||
Comment 18•6 years ago
|
||
ping on the approvals for beta and esr
Comment 19•6 years ago
|
||
We'll get the ESR one when we're triaging the requests for 60.6 soon.
Comment 20•6 years ago
|
||
Thanks - I'm waiting for this to land on trunk before uplift. Should make it just fine into the beta 8 build on Thursday.
![]() |
||
Comment 21•6 years ago
|
||
![]() |
||
Comment 22•6 years ago
|
||
Backed out for frequent uaf:
https://hg.mozilla.org/mozilla-central/rev/47048ef82b50e3130220d86b09f5addf0b58906e
Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunnable&group_state=expanded&selectedJob=227981486&revision=2161b075ce64dd9455574f1342c735bf1c631770
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=227907346&repo=mozilla-inbound
SUMMARY: AddressSanitizer: heap-use-after-free z:\build\build\src\netwerk\sctp\src\netinet\sctp_input.c:3900 in sctp_handle_stream_reset_response
Assignee | ||
Comment 23•6 years ago
|
||
Comment 24•6 years ago
|
||
Assignee | ||
Comment 25•6 years ago
|
||
![]() |
||
Comment 26•6 years ago
|
||
Comment 27•6 years ago
|
||
![]() |
||
Comment 28•6 years ago
|
||
uplift |
Updated•6 years ago
|
Updated•6 years ago
|
Comment 29•6 years ago
|
||
Hello,
I have reproduced this issue with the asan debug build 65.0b12(buildID: 20190117232427).
Confirming this issue as verified fixed on 66.0b10(buildID:20190221023827) and 67.0a1(buildID:20190221102221). Verified on Windows 10x64 and Ubuntu 16.04x64.
Comment 30•6 years ago
|
||
![]() |
||
Comment 31•6 years ago
|
||
uplift |
Updated•6 years ago
|
Comment 32•6 years ago
|
||
Hello,
Confirming this issue as verified fixed on ESR 60.6.0-build1. Verified on Windows 10x64 and Ubuntu 16.04.
Updated•6 years ago
|
Updated•5 years ago
|
Description
•