Closed
Bug 1466577
Opened 6 years ago
Closed 6 years ago
Crash in arena_dalloc | Allocator<T>::free | MOZ_Z_inflateEnd
Categories
(Core :: Networking: WebSockets, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla63
People
(Reporter: MatsPalmgren_bugz, Assigned: michal)
References
Details
(4 keywords, Whiteboard: [necko-triaged][post-critsmash-triage][adv-main62+][adv-esr60.2+])
Crash Data
Attachments
(1 file, 1 obsolete file)
11.48 KB,
patch
|
u408661
:
review+
dveditz
:
approval-mozilla-beta+
dveditz
:
approval-mozilla-esr60+
dveditz
:
sec-approval+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-d2714192-2567-45ce-af04-afb540180602.
=============================================================
MOZ_CRASH Reason: MOZ_RELEASE_ASSERT((mapelm->bits & ((size_t)0x01U)) != 0) (Double-free?)
All crashes are on Android.
Top 10 frames of crashing thread:
0 libmozglue.so arena_dalloc memory/build/mozjemalloc.cpp:3502
1 libmozglue.so Allocator<MozJemallocBase>::free memory/build/malloc_decls.h:40
2 libmozglue.so MOZ_Z_inflateEnd modules/zlib/src/inflate.c:1285
3 libxul.so mozilla::net::PMCECompression::~PMCECompression netwerk/protocol/websocket/WebSocketChannel.cpp:811
4 libxul.so nsAutoPtr<mozilla::net::PMCECompression>::assign xpcom/base/nsAutoPtr.h:45
5 libxul.so mozilla::net::WebSocketChannel::StopSession xpcom/base/nsAutoPtr.h:129
6 libxul.so wcsrtombs
7 libxul.so nsTimerImpl::InitCommon xpcom/threads/nsTimerImpl.cpp:416
8 libmozglue.so mozilla::detail::MutexImpl::unlock mozglue/misc/Mutex_posix.cpp:181
9 libxul.so nsTimerImpl::InitHighResolutionWithCallback xpcom/threads/nsTimerImpl.cpp:488
=============================================================
Updated•6 years ago
|
Group: core-security → network-core-security
Comment 1•6 years ago
|
||
By itself this doesn't appear exploitable (release assert), but it's definitely new in 60. Would it be better to unhide and treat as a topcrash/stability bug?
Flags: needinfo?(mats)
Reporter | ||
Comment 2•6 years ago
|
||
If it's a double-free then I guess that it'd be kinda random
if the condition is true or not? I haven't analyzed the code though.
I'm not sure unhiding it would get more people that knows this code
looking at it. We might as well CC everyone we think might be able
to help.
Flags: needinfo?(mats)
Assignee | ||
Updated•6 years ago
|
Assignee: nobody → michal.novotny
Priority: -- → P2
Whiteboard: [necko-triaged]
Assignee | ||
Comment 3•6 years ago
|
||
We call inflateEnd() only in the destructor and only if mInflater was successfully initialized with inflateInit2(), so I don't see how we could call it twice.
Assignee: michal.novotny → nobody
Priority: P2 → P3
Reporter | ||
Comment 4•6 years ago
|
||
Perhaps adding canaries in the relevant code might help pin down the bug?
https://searchfox.org/mozilla-central/source/mfbt/Poison.h#82
Before and after the regular members in PMCECompression for example:
https://searchfox.org/mozilla-central/source/netwerk/protocol/websocket/WebSocketChannel.cpp#946
then sprinkle Check() calls on those canaries liberally in the relevant code.
Before WebSocketChannel::mPMCECompressor too maybe?
Flags: needinfo?(michal.novotny)
Updated•6 years ago
|
Updated•6 years ago
|
Comment 5•6 years ago
|
||
Looking at the Fennec crashes under WebSocketChannel::StopSession it's mostly this signature (release asserts) but also a handful of nearby crashes with UAF addresses. Because of those I don't think we can count on the release asserts to always protect us from exploits.
https://crash-stats.mozilla.com/search/?proto_signature=~WebSocketChannel%3A%3AStopSession&product=FennecAndroid&version=60.0.2&version=60.0.1&version=60.0&date=%3E%3D2018-05-14T17%3A46%3A18.000Z&date=%3C2018-06-14T17%3A46%3A18.000Z&_sort=-address&_sort=-date&_facets=signature&_facets=proto_signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=address&_columns=moz_crash_reason#facet-signature
Keywords: csectype-uaf,
sec-high
Assignee | ||
Comment 6•6 years ago
|
||
I've added the checks to the code path where it's crashing according to crash-stats reports.
Assignee: nobody → michal.novotny
Flags: needinfo?(michal.novotny)
Attachment #8986461 -
Flags: review?(mcmanus)
Assignee | ||
Comment 7•6 years ago
|
||
There seems to be race condition when stopping session on the socket thread using AbortSession() and calling Close() on the main thread. When the socket thread nulls out mTransaction we can call StopSession() on the main thread from Close(). Comment in StopSession() says that it shouldn't be called after StartWebsocketData() was called. Jason, Patrick, is the purpose of mTransaction check at https://searchfox.org/mozilla-central/rev/8aaaaf0f98cfec766f8be28c390bc0dcfbb0ea03/netwerk/protocol/websocket/WebSocketChannel.cpp#2354 to check that StartWebsocketData() hasn't been called yet?
I think the right thing to do here is to synchronize this code using a lock (having several atomic bools doesn't seem to be enough) and
- call StopSession() if StartWebsocketData() hasn't been called yet
- send fin message if websocket channel is alive
- do nothing and return a failure if StopSession() was already called on the socket thread
Flags: needinfo?(mcmanus)
Flags: needinfo?(jduell.mcbugs)
Comment 8•6 years ago
|
||
I'm not sure of the answer to the question you asked.. your approach seems reasonable - if you need more eyes I would suggest looping nick in.
Flags: needinfo?(mcmanus)
Assignee | ||
Comment 9•6 years ago
|
||
The new lock protects access to mStopped and mDataStarted. This ensures that Close() won't call StopSession() once the websocket data started (StopSession() is then expected to be called only on socket thread) and also that data cannot start at the same time when Close() is being executed on the main thread.
Flag mStopped is now set iff the session is really stopped. It used to be set together with mRequestedClose in AbortSession() when fin packet was queued, which was IMO wrong.
Redundant mTCPClosed=true were removed from OnInputStreamReady() because the flag is set at the beginning of AbortSession().
Attachment #8986461 -
Attachment is obsolete: true
Attachment #8986461 -
Flags: review?(mcmanus)
Flags: needinfo?(jduell.mcbugs)
Attachment #8991829 -
Flags: review?(hurley)
Comment 10•6 years ago
|
||
Comment on attachment 8991829 [details] [diff] [review]
diff-bug1466577-v2.patch
Review of attachment 8991829 [details] [diff] [review]:
-----------------------------------------------------------------
Looks pretty good overall, but I have a couple questions/concerns to address before landing.
::: netwerk/protocol/websocket/WebSocketChannel.cpp
@@ +2412,4 @@
> // normally this should be called on socket thread, but it is ok to call it
> // from OnStartRequest before the socket thread machine has gotten underway
>
> + MOZ_ASSERT(mStopped);
I know this only happens on debug builds, but should we (debug-only) lock the mutex before accessing? This is the only case of unprotected access, and it just has a bad smell.
@@ -3612,5 @@
> - mRequestedClose = true;
> - mScriptCloseReason = reason;
> - mScriptCloseCode = code;
> -
> - if (!mTransport || mConnecting != NOT_CONNECTING) {
Can you explain why this condition was removed? The log lines (which you left in) appear to be correct only if !mTransport, but I'm not seeing anything that ensures that's the case. If we can assume it to be so, could you add a comment explaining why?
Attachment #8991829 -
Flags: review?(hurley) → review+
Assignee | ||
Comment 11•6 years ago
|
||
(In reply to Nicholas Hurley [:nwgh][:hurley] from comment #10)
> Comment on attachment 8991829 [details] [diff] [review]
> diff-bug1466577-v2.patch
>
> Review of attachment 8991829 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Looks pretty good overall, but I have a couple questions/concerns to address
> before landing.
>
> ::: netwerk/protocol/websocket/WebSocketChannel.cpp
> @@ +2412,4 @@
> > // normally this should be called on socket thread, but it is ok to call it
> > // from OnStartRequest before the socket thread machine has gotten underway
> >
> > + MOZ_ASSERT(mStopped);
>
> I know this only happens on debug builds, but should we (debug-only) lock
> the mutex before accessing? This is the only case of unprotected access, and
> it just has a bad smell.
It's not needed because mStopped is atomic. It's just a sanity check that DoStopSession() is called properly after setting mStopped to true and this is done under a lock due to synchronization of StopSession(), AbortSession() and Close() methods. Once set to true, it never reverts back, so no lock is needed here.
> @@ -3612,5 @@
> > - mRequestedClose = true;
> > - mScriptCloseReason = reason;
> > - mScriptCloseCode = code;
> > -
> > - if (!mTransport || mConnecting != NOT_CONNECTING) {
>
> Can you explain why this condition was removed? The log lines (which you
> left in) appear to be correct only if !mTransport, but I'm not seeing
> anything that ensures that's the case. If we can assume it to be so, could
> you add a comment explaining why?
The condition has changed to "if (mDataStarted)" above in the patch. It's equivalent, but IMO easier to understand. As described in comment #7, if the data already started, try to send fin packet, otherwise call DoStopSession().
Assignee | ||
Comment 12•6 years ago
|
||
Comment on attachment 8991829 [details] [diff] [review]
diff-bug1466577-v2.patch
[Security approval request comment]
How easily could an exploit be constructed based on the patch?
Close() method must be called by the script at the same time when server aborts the connection. It's probably doable, but the result is double free of an object that performs compression. I don't think it can be easily used for anything else than crashing the browser.
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
It's obvious that the patch solves a race condition, but as I noted above, I don't think this is really a security problem.
Which older supported branches are affected by this flaw?
Race condition in Websocket closing was probably from the beginning. This particular crash is possible since landing bug 792831.
If not all supported branches, which bug introduced the flaw?
See above.
Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
It shouldn't be hard to backport the patch if it's not applicable.
How likely is this patch to cause regressions; how much testing does it need?
I don't expect regressions but it's hard to say. Try push looked good.
Attachment #8991829 -
Flags: sec-approval?
Updated•6 years ago
|
Blocks: 792831
status-firefox63:
--- → affected
tracking-firefox62:
--- → +
tracking-firefox63:
--- → +
tracking-firefox-esr60:
--- → 62+
Keywords: csectype-uaf → csectype-race
Comment 13•6 years ago
|
||
Comment on attachment 8991829 [details] [diff] [review]
diff-bug1466577-v2.patch
[Triage Comment]
sec-approval=dveditz
a=dveditz for beta and esr60. Please land on branches soon so we have bake time.
Attachment #8991829 -
Flags: sec-approval?
Attachment #8991829 -
Flags: sec-approval+
Attachment #8991829 -
Flags: approval-mozilla-esr60+
Attachment #8991829 -
Flags: approval-mozilla-beta+
Assignee | ||
Updated•6 years ago
|
Keywords: checkin-needed
Comment 14•6 years ago
|
||
Keywords: checkin-needed
Comment 15•6 years ago
|
||
uplift |
Comment 16•6 years ago
|
||
Group: network-core-security → core-security-release
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Updated•6 years ago
|
Flags: qe-verify-
Whiteboard: [necko-triaged] → [necko-triaged][post-critsmash-triage]
Comment 17•6 years ago
|
||
This needs a rebased patch for ESR60 uplift.
Updated•6 years ago
|
Flags: needinfo?(aryx.bugmail)
Comment 18•6 years ago
|
||
uplift |
Flags: needinfo?(aryx.bugmail)
Updated•6 years ago
|
Whiteboard: [necko-triaged][post-critsmash-triage] → [necko-triaged][post-critsmash-triage][adv-main62+][adv-esr60.2+]
Updated•5 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•