Last Comment Bug 699222 - Add File Socket functionality to IO thread for B2G RIL comms
: Add File Socket functionality to IO thread for B2G RIL comms
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: DOM: Device Interfaces (show other bugs)
: Trunk
: All Gonk (Firefox OS)
: -- normal (vote)
: mozilla11
Assigned To: Kyle Machulis [:qdot]
:
: Andrew Overholt [:overholt]
Mentors:
Depends on:
Blocks: b2g-telephony
  Show dependency treegraph
 
Reported: 2011-11-02 13:57 PDT by Kyle Machulis [:qdot]
Modified: 2011-12-16 13:52 PST (History)
9 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP for IPC Thread communication with cell phone radio (15.07 KB, patch)
2011-11-28 15:46 PST, Kyle Machulis [:qdot]
kyle: review+
Details | Diff | Splinter Review
Part 1 (v1): Add MOZ_B2G_RIL configure flag (2.30 KB, patch)
2011-12-02 14:10 PST, Philipp von Weitershausen [:philikon]
khuey: review+
Details | Diff | Splinter Review
Part 2 (v1): RIL IPC implementation (15.86 KB, patch)
2011-12-02 14:11 PST, Philipp von Weitershausen [:philikon]
no flags Details | Diff | Splinter Review
Part 2 (v2): RIL IPC implementation (15.71 KB, patch)
2011-12-05 03:26 PST, Philipp von Weitershausen [:philikon]
no flags Details | Diff | Splinter Review
Part 2 (v3): RIL IPC implementation (18.60 KB, patch)
2011-12-07 01:10 PST, Philipp von Weitershausen [:philikon]
cjones.bugs: review+
Details | Diff | Splinter Review

Description Kyle Machulis [:qdot] 2011-11-02 13:57:58 PDT
Make the RIL thread code talk to the POSIX socket provided by the b2g-dialer-daemon.
Comment 1 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-11-02 14:01:30 PDT
In bug 698621 we're having our existing IO thread talk to the ril socket, and then ferry complete packets over to the ril worker.  Did that plan change?  If so, why?
Comment 2 Kyle Machulis [:qdot] 2011-11-02 14:02:24 PDT
Also depends on the outside task of getting the radio socket account change daemon finished and integrated. See

https://github.com/kmachulis-mozilla/b2g-dialer-daemon/issues/21
Comment 3 Kyle Machulis [:qdot] 2011-11-02 14:03:54 PDT
This is a subportion of that bug. I was just creating this for the file socket implementation part, I thought bug 698621 was about getting the tri-thread comms working?
Comment 4 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-11-02 14:05:19 PDT
That's fine --- what confused me was that we discussed doing the socket IO on the IO thread, but this bug title says "worker thread" which might mean either the IO thread or the ril Web Worker thread.
Comment 5 Kyle Machulis [:qdot] 2011-11-02 14:07:02 PDT
My fault. Fixed.
Comment 6 Kyle Machulis [:qdot] 2011-11-28 15:46:48 PST
Created attachment 577401 [details] [diff] [review]
WIP for IPC Thread communication with cell phone radio

Attaching the current work in progress for the RIL. Hasn't been tested on phone yet due to issues listed at https://bugzilla.mozilla.org/show_bug.cgi?id=698621#c7 but the general idea is there. Still need to clean out printfs. Would be nice to leave in network code for desktop testing, can #if 0 it before final landing.
Comment 7 Philipp von Weitershausen [:philikon] 2011-12-02 14:10:33 PST
Created attachment 578714 [details] [diff] [review]
Part 1 (v1): Add MOZ_B2G_RIL configure flag
Comment 8 Philipp von Weitershausen [:philikon] 2011-12-02 14:11:26 PST
Created attachment 578716 [details] [diff] [review]
Part 2 (v1): RIL IPC implementation
Comment 9 Philipp von Weitershausen [:philikon] 2011-12-05 00:01:05 PST
Landed part 1 on inbound to unblock all the other RIL related patches:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8f00dd36208d
Comment 10 Philipp von Weitershausen [:philikon] 2011-12-05 03:26:48 PST
Created attachment 579022 [details] [diff] [review]
Part 2 (v2): RIL IPC implementation

Kyle changed RIL Write to use tasks instead of fd polling, fixes write thread CPU exhaustion issue.
Comment 11 Matt Brubeck (:mbrubeck) 2011-12-05 10:37:15 PST
Part 1: https://hg.mozilla.org/mozilla-central/rev/8f00dd36208d
Comment 12 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-12-06 00:55:11 PST
Comment on attachment 579022 [details] [diff] [review]
Part 2 (v2): RIL IPC implementation

>diff --git a/ipc/ril/Makefile.in b/ipc/ril/Makefile.in

>+# Contributor(s):
>+#   Chris Jones <jones.chris.g@gmail.com>
>+#	Kyle Machulis <kmachulis@mozilla.com>
>+#

Nit: tab character crept in.

>diff --git a/ipc/ril/Ril.cpp b/ipc/ril/Ril.cpp

>+#if defined(MOZ_WIDGET_GONK)

Is ril-enabled-but-no-gonk a configuration we want to continue to
support?  Why?

>+class RILWriteTask : public Task {

Nit: "RilWriteTask".

>+bool
>+RilClient::OpenSocket()
>+{
>+    /*
>+     * XXX IMPLEMENT ME
>+     *

I think this is implemented ...

>+    if(mSocket.mFd < 0)
>+    {

Nit: "if (mSocket.mFd < 0) {"

>+
>+
>+

Extranenous whitespace.

>+    if(connect(mSocket.mFd, (struct sockaddr *) &addr, alen) < 0) {

Nit: "if (connect(..."

>+
>+
>+

Extraneous whitespace.

>+void
>+RilClient::OnFileCanReadWithoutBlocking(int fd)

>+            if(ret <= 0)
>+            {

Nit: "if (ret <= 0) {"

>+                LOG("Cannot read from network, error %d\n", ret);
>+                return;
>+            }
>+            mIncoming->mSize = ret;
>+            LOG("RIL Read from network %d\n", (int)mIncoming->mSize);
>+            sConsumer->MessageReceived(mIncoming.forget());
>+            if(ret < 1024)
>+            {

Nit: "if (ret < 1024) {"

This implementation is wrong because it assumes each message can be
read atomically.  It also stops reading when it sees a message <1024
bytes.  Why?  Couldn't there be several messages in succession of
length < 1024?

>+void
>+RilClient::OnFileCanWriteWithoutBlocking(int fd)
>+{
>+    MOZ_ASSERT(fd == mSocket.mFd);
>+
>+    /*
>+     * IMPLEMENT ME
>+     */

I think this is implemented too (partly).

>+            else {
>+                // XXX?
>+                mOutgoingQ.pop();
>+                perror("RIL can't write");
>+                return;

We end up pop()'ing on all paths out of this function, so maybe we
should do that up top.  (I thought this was a use-after-free bug
initially.)  But see note below.

This implementation is also wrong because it assumes messages can be
written atomically.  See
http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/chrome/common/ipc_channel_posix.cc#663
for an example of how this should work.  If we fail to write a
complete message, we need to remember it, register our write-ready
watcher, and then continue.

>+>diff --git a/ipc/ril/Ril.h b/ipc/ril/Ril.h

>+struct RilMessage
>+{
>+    static const size_t DATA_SIZE = 1024;
>+    char mData[DATA_SIZE];

What limits RIL messages to <= 1024 bytes?
Comment 13 Kyle Machulis [:qdot] 2011-12-06 18:39:31 PST
> >+#if defined(MOZ_WIDGET_GONK)
> 
> Is ril-enabled-but-no-gonk a configuration we want to continue to
> support?  Why?

This allows us to adb forward the file socket to the desktop for
development and testing using the RIL threads, like we would on
the phone. We've also got a solution that just uses
nsISocketTransport that I suppose we could use, but I liked being
able to have the IPC <-> worker interface in place even while
doing desktop dev.
  
> >+                LOG("Cannot read from network, error %d\n", ret);
> >+                return;
> >+            }
> >+            mIncoming->mSize = ret;
> >+            LOG("RIL Read from network %d\n", (int)mIncoming->mSize);
> >+            sConsumer->MessageReceived(mIncoming.forget());
> >+            if(ret < 1024)
> >+            {
> 
> Nit: "if (ret < 1024) {"
> 
> This implementation is wrong because it assumes each message can be
> read atomically.  It also stops reading when it sees a message <1024
> bytes.  Why?  Couldn't there be several messages in succession of
> length < 1024?

We're just trying to exhaust the buffer at this point. We have no
idea what a message even is here, we're just bringing in bytes
and shoving them up to the state machine. The JS parcel handler
deals with the storage of partially read messages. We could most
likely reallocate a buffer for the full size every time we read,
but why? This just saves us some management.

> >+            else {
> >+                // XXX?
> >+                mOutgoingQ.pop();
> >+                perror("RIL can't write");
> >+                return;
> 
> We end up pop()'ing on all paths out of this function, so maybe we
> should do that up top.  (I thought this was a use-after-free bug
> initially.)  But see note below.
> 
> This implementation is also wrong because it assumes messages can be
> written atomically.  See
> http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/chrome/common/
> ipc_channel_posix.cc#663
> for an example of how this should work.  If we fail to write a
> complete message, we need to remember it, register our write-ready
> watcher, and then continue.

Ok, yeah, since we're sending a PostTask in the first place with
no idea whether it'll block or not, this could probably be more
robust.

> >+>diff --git a/ipc/ril/Ril.h b/ipc/ril/Ril.h
> 
> >+struct RilMessage
> >+{
> >+    static const size_t DATA_SIZE = 1024;
> >+    char mData[DATA_SIZE];
> 
> What limits RIL messages to <= 1024 bytes?

Nothing. That was just a static size to read in/out easily so I
didn't have to reallocate at any point. There's also not going to
be RIL messages in the multiple megabyte (or even near a mb)
range, so we shouldn't cause lots of polling loops on this. Was
basically just an average to get the data thru and up. to the
state machine.
Comment 14 Kyle Machulis [:qdot] 2011-12-06 18:48:02 PST
Ok, looking at the review and the code again, we really need to rename RilMessage to RilRawData or something, so it's obvious we're dealing with bytes without context on this level. We started calling it RilMessage back before we realized we were doing everything up in the worker in JS. I'll add that to the patch.
Comment 15 Philipp von Weitershausen [:philikon] 2011-12-07 01:10:00 PST
Created attachment 579623 [details] [diff] [review]
Part 2 (v3): RIL IPC implementation

Patch with Kyle's latest changes from the GitHub fork.
Comment 16 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-12-07 01:59:01 PST
Comment on attachment 579623 [details] [diff] [review]
Part 2 (v3): RIL IPC implementation

>diff --git a/ipc/ril/Makefile.in b/ipc/ril/Makefile.in

>+# Contributor(s):
>+#   Chris Jones <jones.chris.g@gmail.com>
>+#	Kyle Machulis <kmachulis@mozilla.com>
>+#

Still have a stray tab here.

>diff --git a/ipc/ril/Ril.cpp b/ipc/ril/Ril.cpp

>+#if defined(MOZ_WIDGET_GONK)
>+#include <sys/socket.h>
>+#include <sys/un.h>
>+#include <sys/select.h>
>+#include <sys/types.h>
>+#endif
>+

Does this need to be ifdef WIDGET_GONK?  I don't think it does.
Please remove if not.

>+  RilRawData* mCurrentRilRawData;

Make this nsAutoPtr<RilRawData>.

>+void
>+RilClient::OnFileCanWriteWithoutBlocking(int fd)
>+{
>+    // Try to write the bytes of mCurrentRilRawData.  If all were written, continue.
>+    //
>+    // Otherwise, save the byte position of the next byte to write
>+    // within mCurrentRilRawData, and request
>+    //

And request what?!??  What?!?!?  A pony?  The suspense is killing me.

>+        const uint8_t *toWrite;
>+
>+        toWrite = (const uint8_t *)mCurrentRilRawData->mData;
>+

  const uint8_t* toWrite = mCurrentRilRawData->mData;

You shouldn't need to cast this.

>+        delete mCurrentRilRawData;
>+        mCurrentRilRawData = NULL;

With mCurrentRilRawData as an nsAutoPtr, you just need to assign it
null to free its pointed-to data.  So get rid of the |delete
mCurrentRilRawData| statement.

>diff --git a/ipc/ril/Ril.h b/ipc/ril/Ril.h

>+struct RilRawData
>+{
>+    static const size_t DATA_SIZE = 1024;

Call this MAX_DATA_SIZE.

r=me with those fixes.
Comment 17 Philipp von Weitershausen [:philikon] 2011-12-07 03:00:19 PST
Pushed on Kyle's behalf: https://hg.mozilla.org/integration/mozilla-inbound/rev/c160a02d2d2a
Comment 18 Matt Brubeck (:mbrubeck) 2011-12-07 12:25:04 PST
Part 2: https://hg.mozilla.org/mozilla-central/rev/c160a02d2d2a

Note You need to log in before you can comment on or make changes to this bug.