Closed Bug 843868 Opened 11 years ago Closed 11 years ago

Non-pointer usage of sockaddr struct in UnixSocketImpl causing memory to be stomped

Categories

(Core :: DOM: Device Interfaces, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla22
blocking-b2g tef+
Tracking Status
firefox20 --- wontfix
firefox21 --- wontfix
firefox22 --- fixed
b2g18 --- fixed
b2g18-v1.0.0 --- wontfix
b2g18-v1.0.1 --- fixed

People

(Reporter: bent.mozilla, Assigned: qdot)

References

Details

Attachments

(2 files, 8 obsolete files)

11.04 KB, patch
tzimmermann
: review+
Details | Diff | Splinter Review
7.26 KB, patch
Details | Diff | Splinter Review
==229== Invalid write of size 1
==229==    at 0x480B014: memcpy (mc_replace_strmem.c:883)
==229==    by 0x5CB171B: (anonymous namespace)::RilConnector::CreateAddr(bool, int&, sockaddr*, char const*) (Ril.cpp:132)
==229==    by 0x5CB2F99: mozilla::ipc::UnixSocketImpl::Connect() (UnixSocket.cpp:524)
==229==    by 0x5CB308B: mozilla::ipc::SocketConnectTask::Run() (UnixSocket.cpp:452)
==229==    by 0x5CFF039: MessageLoop::RunTask(Task*) (message_loop.cc:333)
==229==    by 0x5D00037: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (message_loop.cc:341)
==229==    by 0x5D00CF9: MessageLoop::DoWork() (message_loop.cc:441)
==229==    by 0x5D12CDF: base::MessagePumpLibevent::Run(base::MessagePump::Delegate*) (message_pump_libevent.cc:311)
==229==    by 0x5CFEFE5: MessageLoop::RunInternal() (message_loop.cc:215)
==229==    by 0x5CFF0B5: MessageLoop::Run() (message_loop.cc:208)
==229==    by 0x5D080D5: base::Thread::ThreadMain() (thread.cc:156)
==229==    by 0x5D131D9: ThreadFunc(void*) (platform_thread_posix.cc:39)
==229==  Address 0x9957b9a is 2 bytes after a block of size 88 alloc'd
==229==    at 0x4806A4C: malloc (vg_replace_malloc.c:270)
==229==    by 0x61A5F23: moz_xmalloc (mozalloc.cpp:54)
==229==    by 0x5CB29B7: mozilla::ipc::UnixSocketConsumer::ConnectSocket(mozilla::ipc::UnixSocketConnector*, char const*, int) (mozalloc.h:201)
==229==    by 0x5CB18DF: mozilla::ipc::RilConsumer::RilConsumer(unsigned long, mozilla::dom::workers::WorkerCrossThreadDispatcher*) (Ril.cpp:189)
==229==    by 0x5867D05: mozilla::dom::gonk::SystemWorkerManager::RegisterRilWorker(unsigned int, JS::Value const&, JSContext*) (SystemWorkerManager.cpp:515)
==229==    by 0x5CE7AA3: NS_InvokeByIndex_P (xptcinvoke_arm.cpp:160)
==229==    by 0x594FABF: XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) (XPCWrappedNative.cpp:3085)
==229==    by 0x59549F3: XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) (XPCWrappedNativeJSOps.cpp:1417)
==229==    by 0x5F857C9: js::InvokeKernel(JSContext*, JS::CallArgs, js::MaybeConstruct) (jscntxtinlines.h:327)
==229==    by 0x5F7F08F: js::Interpret(JSContext*, js::StackFrame*, js::InterpMode) (jsinterp.cpp:2344)
==229==    by 0x5F84F3F: js::RunScript(JSContext*, js::StackFrame*) (jsinterp.cpp:324)
==229==    by 0x5F864D7: js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value*, JS::Value*) (jsinterp.cpp:381)

It looks like the size written in RilConnector::CreateAddr for sockaddr_un is bigger than the size of sockaddr.

All this code needs to be cleaned up so that we don't accidentally write more than we should.
Attached patch Basic idea (obsolete) — Splinter Review
This is the basic idea... I think it will crash the phone if it lands like this though since we're violating the size limit already.
Attachment #716948 - Flags: review?(kyle)
Attachment #716948 - Flags: review?(kyle) → review+
I'll apply the patch and run here to test, as I'm curious what portion is overstepping bounds anyways.
The warning fingers RilConnector::CreateAddr on Gonk, so the sockaddr_un with its string member is too big.
And yes, this does cause a crash loop. \o/

Working on it now.
Oops.

So UnixSocketImpl has a sockaddr member. That's bad. sockaddr is just a common base pointer type, but should usually have a sockaddr_un or sockaddr_in or something like that under it as its actual implementation. So that's why we're hopping that boundary.

Turning this into a member pointer that is passed and filled by the connector classes should fix it.
BTW, this only happens on m-c due to the fact that RIL moved to using UnixSocket as part of MultiRIL on m-c. That said, this unixsocket code is still used in bluetooth on b2g18, and could /feasibly/ be causing crashes there. Not quite sure if it should track/block yet.
Assignee: nobody → kyle
Summary: Valgrind memory error in RilConnector::CreateAddr → Non-pointer usage of sockaddr struct in UnixSocketImpl causing memory to be stomped
So, removing the sockaddr member of UnixSocketImpl is easy. It was only used to resolve out addresses that we can easily just cache. The hard part is figuring out what to do about the way the UnixSocketConnector classes are architected now. The idea was to abstract off the struct types. The problem being if we pass a sockaddr* into CreateSocketAddr to be filled, we'll fill with an pointer that's most likely larger, and we don't have a way to destruct properly afterward without leaking. I'm looking at moving ownership of the sockaddr* to the Connector classes, but this is going to require property shuffling and some new classes.

All this because C doesn't have polymorphism. Yup.
Ok. sockaddr_sco is only 8 bytes, sockaddr_rc is 10 bytes. So, while not a good idea, the currently bluetooth implementation on b2g18 shouldn't have memory corruptions based on this specific bug.
Ok. First shot at cleaning this up. Moved sockaddr_* storage into connector classes, so they can store the types they'll need as members, and deal with cleanup. The freaky part here is returning pointers to these members so we can fill in the sockaddr* parameters for connect/bind/etc... Not sure how kosher that is. It's not documented here yet, if that's all I need to do, I can do it in the next round.

Not putting this on full review yet because I'm pretty sure this'll break bluetooth. I just wedged some values into accept() for server sockets, which means bluetooth won't store off the client's address, and will have issues referring to it later. Will fix this ASAP, just wanted to get a patch in now to start on general strategy review.
Attachment #717467 - Flags: feedback?(bent.mozilla)
Attachment #717467 - Flags: feedback?(tzimmermann)
Comment on attachment 717467 [details] [diff] [review]
Patch 1 (v1) - WIP: Moving sockaddr type ownership to connectors

Not gonna work, accept suffers the same problem as everything else in this version. Fixing.
Attachment #717467 - Flags: feedback?(tzimmermann)
Attachment #717467 - Flags: feedback?(bent.mozilla)
Not sure I'm super thrilled with exposing the bare pointers the way I am, but it seems to do the trick.
Attachment #723033 - Flags: review?(bent.mozilla)
Attachment #723033 - Flags: review?(tzimmermann)
Attachment #716948 - Attachment is obsolete: true
Attachment #717467 - Attachment is obsolete: true
Comment on attachment 723033 [details] [diff] [review]
Patch 1 (v2) - Move sockaddr type ownership to connectors

Review of attachment 723033 [details] [diff] [review]:
-----------------------------------------------------------------

Hey!

Before I review the patch, let me suggest a different solution to the problem. The parameter aAddr that is passed to CreateAddr is too small for holding the Unix path of the RIL socket. Why not just introduce 'union sockaddr_any' which holds all sockaddr types we support.

> union sockaddr_any {
>   sockaddr_storage storage; // address-family only
>   sockaddr_in in;
>   sockaddr_in6 in6;
>   sockaddr_local local;
>   sockaddr_sco sco;
>   sockaddr_rc rc;
>   // ... others
> };

This would become the type for aAddr and callers of CreateAddr can retrieve the data from it (or use it directly everywhere). This might need a bit more memory on the stack or the classes that use it, but the interfaces would be much cleaner and we wouldn't need those dubious mAddr fields in the connector classes.
I like that solution far better than what I've got currently. I just get a bit obsessive with pushing the component specific portions down to their own classes. :) 

I'll implement this today and resubmit. Thanks!
Attachment #723033 - Flags: review?(tzimmermann)
Attachment #723033 - Flags: review?(bent.mozilla)
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #15)
> Created attachment 724646 [details] [diff] [review]
> Patch 1 (v4) - Change sockaddr* to be a union of all possible sockaddr types

Hey Kyle, how about initializing mAddr to 0 in ctor? Just came by. :)
Comment on attachment 724646 [details] [diff] [review]
Patch 1 (v4) - Change sockaddr* to be a union of all possible sockaddr types

Review of attachment 724646 [details] [diff] [review]:
-----------------------------------------------------------------

There is one comment about a missing test for AF_INET, the rest are merely improvements of the overall code. They should probably be fixed later with a second patch. Regarding the memset of mAddr, I suggest to only set the family to AF_UNSPEC (which is 0). This way we don't have the overhead of a full memset, but can distinguish valid addresses from invalid ones.

::: dom/bluetooth/BluetoothUnixSocketConnector.cpp
@@ +161,5 @@
>      struct sockaddr_rc addr_rc;
>      aAddrSize = sizeof(addr_rc);
> +    aAddr.rc.rc_family = AF_BLUETOOTH;
> +    aAddr.rc.rc_channel = mChannel;
> +    memcpy(&aAddr.rc.rc_bdaddr, &bd_address_obj, sizeof(bdaddr_t));

I prefer to use variable names as arguments to sizeof(). This way the type can be changed without breaking code.

@@ +167,5 @@
>    case BluetoothSocketType::SCO:
>      struct sockaddr_sco addr_sco;
>      aAddrSize = sizeof(addr_sco);
> +    aAddr.sco.sco_family = AF_BLUETOOTH;
> +    memcpy(&aAddr.sco.sco_bdaddr, &bd_address_obj, sizeof(bdaddr_t));

Same as above.

::: ipc/ril/Ril.cpp
@@ +122,5 @@
>      MOZ_ASSERT(!aIsServer);
>  
>  #if defined(MOZ_WIDGET_GONK)
> +    strcpy((char*)&aAddr.un.sun_path, aAddress);
> +    aAddr.un.sun_family = AF_LOCAL;

I suggest to check the length of aAddress against sun_path. Or it this is already done somewhere, an assert could be used.

@@ +135,5 @@
>  
> +    aAddr.in.sin_family = hp->h_addrtype;
> +    aAddr.in.sin_port = htons(RIL_TEST_PORT + mClientId);
> +    memcpy(&aAddr.in.sin_addr, hp->h_addr, hp->h_length);
> +    aAddrSize = sizeof(sockaddr_in);

This block only works with AF_INET, so there should be a check at the beginning. Or even better, you can skip the gethostbyname call entirely and use INADDR_LOOPBACK as your address. This should give the same results. You can r me with this fixed.

@@ +140,1 @@
>  #endif

For ifdef blocks like this I suggest to only set the address family from within the ifdef branches, and use C or C++ to decide which code to execute. Here is an example

>>>

#ifdef MOZ_WIDGET_GONK
af = AF_LOCAL
#else
hp = gethostname()
af = hp->h_addrtype
#endif

switch (af)
case AF_LOCAL:
  ...
case AF_INET:
  ...
case AF_INET6:
  ...
// etc
}

>>>>

This will make sure that the compiler sees most of the code and can detect errors. I think the compilers are smart enough to not generate code for the unused cases.

::: ipc/unixsocket/UnixSocket.cpp
@@ +664,1 @@
>  

Calls to connect and accept are interuptible. They should be enclosed by TEMP_FAILURE_RETRY.

::: ipc/unixsocket/UnixSocket.h
@@ +105,2 @@
>                            const char* aAddress) = 0;
>  

I think this function should probably return a boolean to indicate success.
Attachment #724646 - Flags: review?(tzimmermann) → review+
Comment on attachment 725078 [details] [diff] [review]
Patch 1 (v5) - Change sockaddr* to be a union of all possible sockaddr types

Whatever portion of my brain is responsible for putting breaks/returns in switch statements seems to have gone missing. :|
Attachment #725078 - Attachment is obsolete: true
I realize this was already r+'d, but with my dodgy patch record lately and the fact I cleaned up most of the non-nits mentioned, I'd like this to get one more once over. I ran it on desktop and the phone, and also through desktop valgrind using the AF_LOCAL branch to test socket address formation. Leak no longer seen (after confirming that it was seen on desktop without patch).
Attachment #725201 - Flags: review?(tzimmermann)
Comment on attachment 725201 [details] [diff] [review]
Patch 1 (v6) - Change sockaddr* to be a union of all possible sockaddr types

Review of attachment 725201 [details] [diff] [review]:
-----------------------------------------------------------------

Nice. Thanks for fixing most of the other problems as well.
Attachment #725201 - Flags: review?(tzimmermann) → review+
This blocks another tef+ bug so tef+.
blocking-b2g: --- → tef+
Going to need some changes to uplift. I /think/ we can get away with just removing the Ril.cpp patch, but might require more. Will check it out.
Removed Ril changes, fixed a couple of bad merge points.
Comment on attachment 725611 [details] [diff] [review]
B2G18 - Patch 1 (v6) - Change sockaddr* to be a union of all possible sockaddr types

Accidentally included fabrice's single task patch :|
Attachment #725611 - Attachment is obsolete: true
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/60261728879e - so far, Linux opt and every opt flavor of Android have refused to compile it.
Depends on: 851770
any further update on this bug? as this is blocking another tef+ bug
https://hg.mozilla.org/mozilla-central/rev/d9aa9a732b06
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Backed out for bustage (yes, I pushed the b2g18 patch).
https://hg.mozilla.org/releases/mozilla-b2g18/rev/866413d0f783
https://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/7202939c0a24

https://tbpl.mozilla.org/php/getParsedLog.php?id=20883178&tree=Mozilla-B2g18_v1_0_1

In file included from ../../../ipc/unixsocket/UnixSocket.cpp:7:0:
../../../ipc/unixsocket/UnixSocket.h:30:3: error: 'sockaddr_in' does not name a type
../../../ipc/unixsocket/UnixSocket.h:31:3: error: 'sockaddr_in6' does not name a type
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: