Valgrind warning about use after free in SocketAcceptTask

VERIFIED FIXED in B2G C4 (2jan on)

Status

()

VERIFIED FIXED
6 years ago
3 years ago

People

(Reporter: bent.mozilla, Assigned: mrbkap)

Tracking

({csectype-uaf, sec-high})

18 Branch
B2G C4 (2jan on)
ARM
Gonk (Firefox OS)
csectype-uaf, sec-high
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(blocking-b2g:tef+, firefox20 wontfix, firefox21 wontfix, firefox22 wontfix, firefox-esr17 wontfix, b2g18 fixed, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 fixed)

Details

Attachments

(2 attachments, 4 obsolete attachments)

==228== Invalid write of size 4
==228==    at 0x5BFC8E4: mozilla::ipc::SocketAcceptTask::Run() (in /system/b2g/libxul.so)
==228==    by 0x5C47FA1: MessageLoop::RunTask(Task*) (in /system/b2g/libxul.so)
==228==    by 0x5C48F9F: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (in /system/b2g/libxul.so)
==228==    by 0x5C49017: MessageLoop::DoDelayedWork(base::TimeTicks*) (in /system/b2g/libxul.so)
==228==    by 0x5C5BB7D: base::MessagePumpLibevent::Run(base::MessagePump::Delegate*) (in /system/b2g/libxul.so)
==228==    by 0x5C47F3D: MessageLoop::RunInternal() (in /system/b2g/libxul.so)
==228==    by 0x5C4801D: MessageLoop::Run() (in /system/b2g/libxul.so)
==228==    by 0x5C5104D: base::Thread::ThreadMain() (in /system/b2g/libxul.so)
==228==    by 0x5C5C065: ThreadFunc(void*) (in /system/b2g/libxul.so)
==228==    by 0x483AE17: __thread_entry (in /system/lib/libc.so)
==228==  Address 0xbae2bc8 is 48 bytes inside a block of size 84 free'd
==228==    at 0x48062B0: free (vg_replace_malloc.c:446)
==228==    by 0x60E3DA7: moz_free (in /system/b2g/libxul.so)
==228==    by 0x5BFCAB3: mozilla::ipc::UnixSocketImpl::~UnixSocketImpl() (in /system/b2g/libxul.so)
==228==    by 0x5BFBE79: mozilla::ipc::DeleteInstanceRunnable<mozilla::ipc::UnixSocketImpl>::Run() (in /system/b2g/libxul.so)
==228==    by 0x5C235DF: nsThread::ProcessNextEvent(bool, bool*) (in /system/b2g/libxul.so)
==228==    by 0x5C00A27: NS_ProcessNextEvent_P(nsIThread*, bool) (in /system/b2g/libxul.so)
==228==    by 0x5B05EE3: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (in /system/b2g/libxul.so)
==228==    by 0x5C47F3D: MessageLoop::RunInternal() (in /system/b2g/libxul.so)
==228==    by 0x5C4801D: MessageLoop::Run() (in /system/b2g/libxul.so)
==228==    by 0x5A81337: nsBaseAppShell::Run() (in /system/b2g/libxul.so)
==228==    by 0x59D5A91: nsAppStartup::Run() (in /system/b2g/libxul.so)
==228==    by 0x5381871: XREMain::XRE_mainRun() (in /system/b2g/libxul.so)

Looks like there is a Cancel method on SocketAcceptTask and SocketConnectTask but mImpl is used before checking mCanceled in Run(). Some of these other task classes hold mImpl members and are not cancelable, so I don't know how those will hold up either.

This may be the cause of bug 832385 since the first thing this Run() method does is set a random pointer to null.
(In reply to ben turner [:bent] from comment #0)
> ==228==  Address 0xbae2bc8 is 48 bytes inside a block of size 84 free'd

FWIW this is exactly the offset of the first word of mResultVal in IDBRequest.
Blocking since this results in random behaviour that could go as far as exploitable crash.
blocking-b2g: tef? → tef+
status-b2g18: --- → affected
status-b2g18-v1.0.0: --- → wontfix
status-b2g18-v1.0.1: --- → affected
Assignee: nobody → bent.mozilla
The problem is that in UnixSocket.cpp, we keep multiple bare pointers to UnixSocketImpl instances in tasks that run asynchronously on the IO Thread. The implementation pointer can be deleted before the task is run, and we have a good way to check this at the moment.
Created attachment 718798 [details] [diff] [review]
patch v1

NOTE: This patch is against the b2g18 branch.

I haven't had a chance to fully test this patch, but wanted to get it in the bug. It attempts to fully clean up the destruction path for UnixSocketImpls. The basic idea is this:

We introduce two state bits: mShuttingDownMainThread and mShuttingDownIOThread. When CloseSocket is called (on the main thread), we set mShuttingDownMain to true, send an event to the IO thread and start enforcing that no more events are sent from the main thread to that Impl (for completeness, we also null out mImpl). Once the event runs on the IO thread, we clean up our state to stop receiving "data ready" events and set a bit that means that no more Accept calls are re-queued. From this point on, the Impl will be untouched on the IO thread. We then proxy the delete back to the main thread.

It's worth noting that this approach is similar to what happened on trunk, though that uses a lock (though it only locks around the get and set of a boolean, so might as well not) and this approach is a little more paranoid.
Assignee: kyle → mrbkap
Status: NEW → ASSIGNED
Attachment #718798 - Flags: review?(kyle)
Attachment #718798 - Flags: review?(bent.mozilla)
Oh, I realized after writing my comment that mShuttingDownMainThread is now redundant with mConsumer being non-null, so I removed it in favor of that.
Comment on attachment 718798 [details] [diff] [review]
patch v1

Review of attachment 718798 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great!

::: ipc/unixsocket/UnixSocket.cpp
@@ +419,5 @@
>    SocketAcceptTask(UnixSocketImpl* aImpl) : mCanceled(false), mImpl(aImpl) { }
> +
> +  virtual void Cancel()
> +  {
> +    mCanceled = true;

This should only happen on the IO thread, right? Let's assert (I thought we did already...).
Attachment #718798 - Flags: review?(bent.mozilla) → review+
Is this a sec-critical rated issue? Could this be triggered by web content?
(In reply to Al Billings [:abillings] from comment #7)
> Is this a sec-critical rated issue? Could this be triggered by web content?

Well, on B2G, all one has to do is disable bluetooth from the settings app. I don't know how we rate that kind of thing...
Keywords: sec-high
Keywords: csec-uaf
Created attachment 719243 [details] [diff] [review]
patch v2

I'm sorry, I couldn't refrain from fixing Read... This gets most of the way there as far as my limited testing can tell. Still missing is a patch to make sure all callers of SendSocketData come from the main thread.
Attachment #718798 - Attachment is obsolete: true
Attachment #718798 - Flags: review?(kyle)
Attachment #719243 - Flags: review?(kyle)
Attachment #719243 - Flags: review?(bent.mozilla)
Are you moving the changed to BluetoothOppManager to another bug, or was that supposed to be in this one?
Created attachment 719286 [details] [diff] [review]
patch v2.1

The last patch was overeager in removing the calls to StopWatchingFileDescriptor in the "socket failed" case. I assumed that the socket failing would prevent any more events coming in on it, but that is not the case. This adds them back. Testcase: Send a file. Try to open an application. Note no application opens.
Attachment #719243 - Attachment is obsolete: true
Attachment #719243 - Flags: review?(bent.mozilla)
Attachment #719286 - Flags: review?(kyle)
Attachment #719286 - Flags: review?(bent.mozilla)
Created attachment 719287 [details] [diff] [review]
Fix for the OPP manager

The OPP manager was playing fast and loose with threads: calling into the UnixSocket code from a third thread! Properly synchronizing a third thread would have forced us into the land of locks and even more difficult problems than the ones that we're facing in this bug. Fortunately, the third thread isn't even necessary: we can use the IO thread, meaning that all the synchronization is done for us. One note: it'd be really good for someone to double check that canceling a file transfer can't result in us sending data after we call CloseSocket. I didn't have time to check it fully, but there are MOZ_ASSERTs thrown in there.
Attachment #719287 - Flags: review?(kyle)
Attachment #719287 - Flags: review?(bent.mozilla)
Comment on attachment 719287 [details] [diff] [review]
Fix for the OPP manager

Bringing in echou on the OPP manager review since he's the expert on that particular subject. :)
Attachment #719287 - Flags: review?(echou)
This patch represents a really scary change right on a deadline. I've been testing sending and receiving files, but I don't have a bluetooth headset to test and I'm sure that I haven't stress tested things like error conditions enough. Can other people help out here?
Keywords: qawanted
(fwiw: if this lands 2/29 I honestly doubt anybody outside of moz will notice, except for maybe me and I'm over it)
To be clear, the result of this bug is that turning bluetooth off and on a few times in a row causes the parent process to crash. I'm not going to try to assign a severity to that.
+Matt

I'll go hunt for a QA owner.

Updated

6 years ago
Duplicate of this bug: 845990
Comment on attachment 719286 [details] [diff] [review]
patch v2.1

Review of attachment 719286 [details] [diff] [review]:
-----------------------------------------------------------------

r=me.

I'm running this through valgrind as we speak.

::: ipc/unixsocket/UnixSocket.cpp
@@ +406,5 @@
> +  SocketAcceptTask(UnixSocketImpl* aImpl) : mImpl(aImpl) { }
> +
> +  virtual void Cancel()
> +  {
> +    mImpl = nullptr;

MOZ_ASSERT(!NS_IsMainThread()) here since this is only ever supposed to be called on IO thread.

@@ +675,2 @@
>          }
> +        else if (errno == EAGAIN || errno == EWOULDBLOCK) {

Nit: else-after-continue :)

@@ +687,5 @@
> +      // cause us to end up back here.
> +      mReadWatcher.StopWatchingFileDescriptor();
> +      mWriteWatcher.StopWatchingFileDescriptor();
> +
> +      nsRefPtr<SocketCloseTask> t = new SocketCloseTask(this);

SocketCloseTask assumes that it has a mConsumer... Do you need to check that? You added a |if (!mConsumer)| check elsewhere here...
Attachment #719286 - Flags: review?(bent.mozilla) → review+
Comment on attachment 719287 [details] [diff] [review]
Fix for the OPP manager

Review of attachment 719287 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/unixsocket/UnixSocket.cpp
@@ +611,5 @@
> +{
> +  MOZ_ASSERT(!NS_IsMainThread());
> +  MOZ_ASSERT(mImpl && !mImpl->IsShutdownOnIOThread());
> +
> +  mImpl->QueueWriteData(aData);

I don't know how we can guarantee that this is safe. Isn't mImpl set (and unset) on the main thread?
(Greg to cc: since he's tracking bug 845990 from over here)
Comment on attachment 719287 [details] [diff] [review]
Fix for the OPP manager

bent and I have a plan for how to fix this.
Attachment #719287 - Attachment is obsolete: true
Attachment #719287 - Flags: review?(echou)
Attachment #719287 - Flags: review?(bent.mozilla)
(In reply to ben turner [:bent] from comment #19)\
> SocketCloseTask assumes that it has a mConsumer... Do you need to check
> that? You added a |if (!mConsumer)| check elsewhere here...

It has a check in the !IsShutdownOnMainThread call.
Created attachment 719691 [details] [diff] [review]
patch v2.2
Attachment #719286 - Attachment is obsolete: true
Attachment #719691 - Flags: review?(bent.mozilla)
Created attachment 719694 [details] [diff] [review]
OPP manager patch v2

This no longer races to get mImpl from the OppManager. It also no longer races to set mLastCommand off the main thread.

The relationship between the classes here makes hacking the code here really weird.
Attachment #719694 - Flags: review?(echou)
Attachment #719694 - Flags: review?(bent.mozilla)
Attachment #719691 - Flags: review?(bent.mozilla) → review+
Comment on attachment 719694 [details] [diff] [review]
OPP manager patch v2

Review of attachment 719694 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/bluetooth/BluetoothOppManager.cpp
@@ +124,5 @@
>  public:
> +  ReadFileTask(UnixSocketImpl* aImpl,
> +               nsIInputStream* aInputStream,
> +               uint32_t aRemoteMaxPacketSize)
> +    : mImpl(aImpl),

Assert mImpl

@@ +997,5 @@
> +  // responses come back from the socket.
> +  nsRefPtr<SetLastCommandRunnable> runnable(
> +      new SetLastCommandRunnable(ObexRequestCode::Put));
> +  nsresult rv = NS_DispatchToMainThread(runnable);
> +  NS_ENSURE_SUCCESS_VOID(rv);

This will leak req, let's put that in an autoptr

::: dom/bluetooth/BluetoothOppManager.h
@@ +73,5 @@
>    bool ExtractBlobHeaders();
>  
> +  void SetLastCommand(int aCommand)
> +  {
> +    mLastCommand = aCommand;

This needs a main thread assertion, maybe just move to the cpp.
Attachment #719694 - Flags: review?(bent.mozilla) → review+
This needs to be checked into Mozilla-B2g18_v1_0_1 only (I think!)
Keywords: checkin-needed
https://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/656066ad44e6
https://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/3b499b6b21e6

Leaving open for the m-c fix to land.
status-b2g18-v1.0.1: affected → fixed
status-firefox20: --- → wontfix
status-firefox21: --- → wontfix
status-firefox22: --- → affected
Keywords: checkin-needed
status-firefox-esr17: --- → wontfix
Oh, let's clone and have a new one for trunk after we figure out what's going on in bug 830290.
This was a branch-only fix. Let's track bug 846615 for anything that needs to happen on trunk (starting with me actually verifying that there is a real problem on trunk).
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
status-firefox22: affected → wontfix

Comment 32

6 years ago
with one other tester, we ran a quick testpass against 03-01 nightly:

Gecko  http://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/69e249bf251c
Gaia   f46906e594b613571bfcc8f146d60bffd42e5a5b
BuildID 20130301070202
Version 18.0

partial results of the testrun are here: https://moztrap.mozilla.org/runtests/run/898/env/305/

summary:

accepting phonecall via bt headset - OK
rejecting phonecall via bt headset  - OK
pairing bt device - ok
disconnecting bt device - OK
searching for devices when BT switch on  - OK

however we saw issues where bt file tranfer.
1.) in one case FFOS on the device crashed while trying to send a file from unagi (happened once);
2.) not able to establish connection between macbook air and unagi device to send a file from unagi to laptop (laptop says "failed to establish connection')

3.) in some cases though, after a repair, we can send files to laptop to unagi
 Note that we'll be investigating this more when we have more time but this is the case for now.
I'll wait for the followup bugs to be filed for the items in comment 32, then we can remove qawanted on the bug.
Target Milestone: --- → B2G C4 (2jan on)
John, have you written up the follow up bugs for this yet?  If so, can you please list them?
Flags: needinfo?(jhammink)

Comment 35

6 years ago
Hi,

Here are a few bugs currently related to this issue:

https://bugzilla.mozilla.org/show_bug.cgi?id=847963 -  sometimes cannot transfer files to a mac

https://bugzilla.mozilla.org/show_bug.cgi?id=823803 -  [Open_][bluetooth] Bluetooth file which The test machine to accept not be able to view, when accept completed-

To be clear, I want to try and test the above scenario on the latest nightly, to see where things are.
Flags: needinfo?(jhammink)

Comment 36

6 years ago
Tested BT File transfer, and and BT Headset (answering call) on the following builds.

accepting phonecall via bt headset - OK
terminating phonecall via bt headset - OK
pairing bt device - ok
disconnecting bt device - OK
searching for devices when BT switch on  - OK


Tested on both 1.0.1. and 1.1 channels.
Gecko  http://hg.mozilla.org/releases/mozilla-b2g18/rev/99680a32f297
Gaia   de3e5b9205e6cb1a6bd0858a98d159272ad96d11
BuildID 20130312070202
Version 18.0


Also tested on
Gecko  http://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/cdf058e47197
Gaia   1a6e1cd7715a5192b32db4b7127c5ae5b8162a7a
BuildID 20130312070203
Version 18.0
Resolution: FIXED → WORKSFORME

Updated

6 years ago
Keywords: qawanted
Thanks John.
Resolution: WORKSFORME → FIXED

Updated

6 years ago
Status: RESOLVED → VERIFIED
Comment on attachment 719694 [details] [diff] [review]
OPP manager patch v2

Review of attachment 719694 [details] [diff] [review]:
-----------------------------------------------------------------

Remove review request since it has been resolved.
Attachment #719694 - Flags: review?(echou)

Updated

3 years ago
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.