Closed Bug 814579 Opened 12 years ago Closed 11 years ago

B2G Multi-SIM: multiplexing data streams of multiple RIL daemons

Categories

(Core :: DOM: Device Interfaces, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla21
blocking-b2g -

People

(Reporter: vicamo, Assigned: vicamo)

References

Details

Attachments

(3 files, 10 obsolete files)

48 bytes, patch
vicamo
: review+
Details | Diff | Splinter Review
49 bytes, text/x-github-pull-request
vicamo
: review+
Details | Review
15.19 KB, patch
Details | Diff | Splinter Review
With multiple ril daemons provided in bug 814577, we're going to create a platform independent layer in rilproxy & SystemWorkerManager. In rilproxy, it multiplexes all data from and to each ril daemon while SystemWorkerManager does similar thing but to multiple ril_worker instances.
Blocks: 814581
Assignee: nobody → allstars.chh
Attachment #697427 - Flags: review?(philipp)
Attachment #697427 - Flags: review?(kyle)
RefPtr was created for compatibility issues in bug 661973. Use nsRefPtr because we'll have to compare the client id of each managed RilClient pointer, and equality operators and other utility classes and functions are only available for nsRefPtr.
Attachment #697429 - Attachment is obsolete: true
Attachment #697826 - Flags: review?(jones.chris.g)
These three patches are the stepping stone for MultiSIM support. Full implementation will come in bug 814581.
Attachment #697830 - Flags: review?(kyle)
Attachment #697425 - Flags: review?(allstars.chh)
Assignee: allstars.chh → vyang
All the patches/pulls here keeps compatibility of current b2g arch. Part 1 & 2 have to be merged at the same time, and part 3 can be merged independent to other parts.
Comment on attachment 697826 [details] [diff] [review]
Plan B - Part 3-1: use nsRefPtr instead

>diff --git a/dom/system/gonk/SystemWorkerManager.cpp b/dom/system/gonk/SystemWorkerManager.cpp
>     // Now that we're set up, connect ourselves to the RIL thread.
>-    mozilla::RefPtr<RILReceiver> receiver = new RILReceiver(wctd);
>+    nsRefPtr<RILReceiver> receiver = new RILReceiver(wctd);

Why change this?  (It shouldn't matter, just curious.)

>diff --git a/ipc/ril/Ril.cpp b/ipc/ril/Ril.cpp

>+static nsRefPtr<RilClient> sClient;
>+static nsRefPtr<RilConsumer> sConsumer;

These should be |static StaticRefPtr<...>|.
Attachment #697826 - Flags: review?(jones.chris.g)
Sorry, forgot git-push before creating hg patches.
Attachment #697828 - Attachment is obsolete: true
Attachment #697828 - Flags: review?(kyle)
Attachment #697832 - Flags: review?(kyle)
Attachment #697830 - Attachment description: Plan B - Part 3-3: → Plan B - Part 3-3: connect client socket by id
(In reply to Chris Jones [:cjones] [:warhammer] from comment #11)
> Comment on attachment 697826 [details] [diff] [review]
> Plan B - Part 3-1: use nsRefPtr instead
> 
> >diff --git a/dom/system/gonk/SystemWorkerManager.cpp b/dom/system/gonk/SystemWorkerManager.cpp
> >     // Now that we're set up, connect ourselves to the RIL thread.
> >-    mozilla::RefPtr<RILReceiver> receiver = new RILReceiver(wctd);
> >+    nsRefPtr<RILReceiver> receiver = new RILReceiver(wctd);
> 
> Why change this?  (It shouldn't matter, just curious.)

Just make every ref counted, ril related pointer use nsRefPtr.

> >diff --git a/ipc/ril/Ril.cpp b/ipc/ril/Ril.cpp
> 
> >+static nsRefPtr<RilClient> sClient;
> >+static nsRefPtr<RilConsumer> sConsumer;
> 
> These should be |static StaticRefPtr<...>|.

Thanks, but the two lines are removed in following patches though.
Address review comment #11, use StaticRefPtr instead.
Attachment #697826 - Attachment is obsolete: true
Attachment #697834 - Flags: review?(jones.chris.g)
Rebase after the patch uploaded in comment #14.
Attachment #697832 - Attachment is obsolete: true
Attachment #697832 - Flags: review?(kyle)
Attachment #697836 - Flags: review?(kyle)
Try run for 725983e1650f is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=725983e1650f
Results (out of 19 total builds):
    exception: 4
    failure: 15
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/vyang@mozilla.com-725983e1650f
(In reply to Mozilla RelEng Bot from comment #16)
> Try run for 725983e1650f is complete.
> Detailed breakdown of the results available here:
>     https://tbpl.mozilla.org/?tree=Try&rev=725983e1650f
> Results (out of 19 total builds):
>     exception: 4
>     failure: 15

I cancelled it. Another new run is in https://tbpl.mozilla.org/?tree=Try&rev=802e645c0eea
Attachment #697834 - Flags: review?(jones.chris.g) → review+
What's our timeline for this needing to land? It honestly might be easier to move RIL over to the new UnixSocket class at this point, which is made for handling as many socket connections as we need. If this bug is not tracking v1.1, it might be worth it to spend the time to do that. I can take a look at it, or advise someone else on what needs to be done.
Comment on attachment 697425 [details] [diff] [review]
Plan B - Part 1: Github pull request for rilproxy

reviewed in person and merged.
Attachment #697425 - Flags: review?(allstars.chh) → review+
Comment on attachment 697427 [details] [review]
Plan B - Part 2: Github pull request for gonk-misc

Reviewed & merged.
Attachment #697427 - Flags: review?(philipp)
Attachment #697427 - Flags: review?(kyle)
Attachment #697427 - Flags: review+
Try run for 725983e1650f is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=725983e1650f
Results (out of 20 total builds):
    exception: 4
    warnings: 1
    failure: 15
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/vyang@mozilla.com-725983e1650f
Try run for 802e645c0eea is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=802e645c0eea
Results (out of 306 total builds):
    exception: 2
    success: 282
    warnings: 20
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/vyang@mozilla.com-802e645c0eea
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #18)
> What's our timeline for this needing to land? It honestly
> might be easier to move RIL over to the new UnixSocket class
> at this point, which is made for handling as many socket
> connections as we need. If this bug is not tracking
> v1.1, it might be worth it to spend the time to do that. I can
> take a look at it, or advise someone else on what needs to be
> done.

Two things in concern. One, actually we are now working with partners to refactor all RIL related DOM APIs. I mean, the due date for this, multiple rild support, has already passed. We're just stablizing, polishing the internal interface/arch changes, which have been shared with partners for a while, and moving them in m-c. Hopefully we can merge all internal changes at the end of Jan and start to review/merge public interfaces in Feb.

Two, I always want to remove rilproxy, which now exists for only one reason - permission, and that's one of the reasons that we have this Plan B. By removing rilproxy, Gecko will connect to all ril daemons directly. However, rilds create their sockets by invoking an Android specific call - socket_local_server() and I have no idea whether or not UnixSocket can cover all the differences now.

I've filed bug 826931 for this.
Blocks: 826931
No longer blocks: 826931
(In reply to Vicamo Yang [:vicamo][:vyang] from comment #23)
> Two things in concern. One, actually we are now working with partners to
> refactor all RIL related DOM APIs. I mean, the due date for this, multiple
> rild support, has already passed. We're just stablizing, polishing the
> internal interface/arch changes, which have been shared with partners for a
> while, and moving them in m-c. Hopefully we can merge all internal changes
> at the end of Jan and start to review/merge public interfaces in Feb.

Oh. Ok. I'll go ahead and review this then, and we can punt the move to unixsocket down the line a bit

> Two, I always want to remove rilproxy, which now exists for only one reason
> - permission, and that's one of the reasons that we have this Plan B. By
> removing rilproxy, Gecko will connect to all ril daemons directly. However,
> rilds create their sockets by invoking an Android specific call -
> socket_local_server() and I have no idea whether or not UnixSocket can cover
> all the differences now.

The only thing socket_local_server does is set where the socket is created on the file system, I think? Shouldn't matter.
Comment on attachment 697836 [details] [diff] [review]
Plan B - Part 3-2: allow multiple RilClient instances

Review of attachment 697836 [details] [diff] [review]:
-----------------------------------------------------------------

There's a few issues with XPCOM, but more importantly I just don't understand how we can do things like send only to the first rild we connect to? More comments might help.

::: ipc/ril/Ril.cpp
@@ +80,3 @@
>  };
>  
> +static nsTArray< nsRefPtr<RilClient> > sConnectedClients;

We can't have a static nsTArray, since nsTArray is an XPCOM object. Making it static will trigger allocators at the wrong time. Make this a StaticAutoPtr that is allocated on first usage in StartRIL. Then checkout out the ClearOnShutdown stuff at

http://dxr.allizom.org/mozilla-central/xpcom/base/ClearOnShutdown.h?from=clearonshutdown

To see how to have this autoclear on shutdown without having to setup observers.

@@ +363,5 @@
>  }
>  
>  
>  static void
>  DisconnectFromRil(Monitor* aMonitor)

You don't clear the client out of sConnectedClients here, so if we call disconnect, we won't actually clear the mClient pointer out of the static client array. You'll drop the reference in Ril on disconnect, but it'll still live on here.

@@ +408,5 @@
>  
>      RilRawData *msg = *aMessage;
>      *aMessage = nullptr;
>  
> +    nsRefPtr<RilClient> client = sConnectedClients[0];

So... How does this work?

@@ +419,5 @@
>      return true;
>  }
>  
>  void
>  StopRil()

Shouldn't this take the consumer to stop? IF we're planning on shutting down /all/ clients whenever this is called, you should make a comment about that in the header.

::: ipc/ril/Ril.h
@@ +7,5 @@
>  #ifndef mozilla_ipc_Ril_h
>  #define mozilla_ipc_Ril_h 1
>  
>  #include "mozilla/RefPtr.h"
> +#include "nsAutoPtr.h"

Nit: We may be able to remove the refptr include above this now?
Attachment #697836 - Flags: review?(kyle) → review-
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #25)
> Comment on attachment 697836 [details] [diff] [review]
> Plan B - Part 3-2: allow multiple RilClient instances
-----------------------------------------------------------------
> More comments might help.

Sure. Any change that results in questions during review process represents that it's either not clear enough or completely wrong. Either way, comments or rewrites are required to get it merged.

> ::: ipc/ril/Ril.cpp
> @@ +408,5 @@
> >  
> >      RilRawData *msg = *aMessage;
> >      *aMessage = nullptr;
> >  
> > +    nsRefPtr<RilClient> client = sConnectedClients[0];
> 
> So... How does this work?

It's replaced in part 3-3, which passes an additional aClientId to replace the hard-coded '0' here.
Comment on attachment 697830 [details] [diff] [review]
Plan B - Part 3-3: connect client socket by id

Review of attachment 697830 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not sure I have enough information to properly review this, and it feels more like this is looking for feedback than a full review anyways. How does the RIL Worker know how many clients it will need to start? Right now it's just one since it's hardcoded to zero. I suppose since we wrote it, we can trust it to not collide ids, but since I don't have that information here... 

Honestly, you could probably put parts 3-2 and 3-3 in the same patch. Might make it easier to review.
Attachment #697830 - Flags: review?(kyle) → review-
I think I'd better get bug 826931 fixed first to skip ref counting hell in this bug.
Depends on: 826931
Rewrite based on bug 826931.
Attachment #697830 - Attachment is obsolete: true
Attachment #697834 - Attachment is obsolete: true
Attachment #697836 - Attachment is obsolete: true
Attachment #704683 - Flags: feedback?(kyle)
Comment on attachment 704683 [details] [diff] [review]
Part 3: connect client socket by id

Review of attachment 704683 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/ril/Ril.cpp
@@ +168,3 @@
>      , mShutdown(false)
>  {
> +    if (!aClientId) {

This seems confusing. If aClientId is 0, just don't use it? So are ril sockets now called "rild", "rild1", "rild2", etc?
Attachment #704683 - Flags: feedback?(kyle)
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #30)
> Comment on attachment 704683 [details] [diff] [review]
> Part 3: connect client socket by id
-----------------------------------------------------------------
> This seems confusing. If aClientId is 0, just don't use it? So
> are ril sockets now called "rild", "rild1", "rild2", etc?

Yes, that's for compatibility of both single SIM and multi SIMs. The related parts can also be found in https://github.com/mozilla-b2g/platform_external_qemu/blob/master/vl-android.c#L3741 and https://github.com/mozilla-b2g/platform_hardware_ril/blob/master/rild/rild.c#L200
1) Re-gen patch from hg
2) Connect to different port in desktop debug.
Attachment #704683 - Attachment is obsolete: true
Attachment #705295 - Flags: review?(kyle)
Attachment #705295 - Flags: review?(kyle) → review+
blocking-b2g: --- → leo?
Rebase.
Attachment #705295 - Attachment is obsolete: true
Attachment #709337 - Flags: review+
Backed out, for Mn & X orange on B2G Arm Opt.

The backout cset was:
 https://hg.mozilla.org/integration/mozilla-inbound/rev/9a7369b9bc9d

Mn Failure logs:
 https://tbpl.mozilla.org/php/getParsedLog.php?id=19378468&tree=Mozilla-Inbound
 https://tbpl.mozilla.org/php/getParsedLog.php?id=19379237&tree=Mozilla-Inbound

X failure logs:
 https://tbpl.mozilla.org/php/getParsedLog.php?id=19378470&tree=Mozilla-Inbound
 https://tbpl.mozilla.org/php/getParsedLog.php?id=19379255&tree=Mozilla-Inbound

Not 100% sure (from a glance) that the Mn orange is related (though it probably is) -- but I'm pretty sure the xpcshell orange is are, because it's in gonk/tests/test_ril_worker_icc.js and the error is "ReferenceError: CLIENT_ID is not defined"  (and this bug is about RIL, and its commit message mentions "client...id")
(In reply to Daniel Holbert [:dholbert] from comment #35)
> Not 100% sure (from a glance) that the Mn orange is related

For the record, Mn went green on this push, which was before the backout:
  https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=e649c4c4cb6f

So, it's still unclear whether this bug's patch was responsible for the Mn failure.  Definitely need to sort out the xpcshell issue before this re-lands, though.
Version: unspecified → Trunk
Add CLIENT_ID in newWorker()
Attachment #709337 - Attachment is obsolete: true
(In reply to Vicamo Yang [:vicamo][:vyang] from comment #37)
> Created attachment 709629 [details] [diff] [review]

try https://tbpl.mozilla.org/?tree=Try&rev=d095db219a40
Blocking-, no longer targeting Leo release. Product will retarget for future release.
blocking-b2g: leo? → -
https://hg.mozilla.org/mozilla-central/rev/6a86feb9c86b
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Plan B - Part * patches were uplifted to B2g v1-train and the corresponding gecko patch (Part 3) has not been uplifted to gecko-18.
(In reply to José Antonio Olivera Ortega [:jaoo] from comment #42)
> Plan B - Part * patches were uplifted to B2g v1-train and the corresponding
> gecko patch (Part 3) has not been uplifted to gecko-18.

I don't see any blocking/tracking/approval status for this bug, so I don't know why you would have thought it would have been...
(In reply to Ryan VanderMeulen [:RyanVM] from comment #43)
> (In reply to José Antonio Olivera Ortega [:jaoo] from comment #42)
> > Plan B - Part * patches were uplifted to B2g v1-train and the corresponding
> > gecko patch (Part 3) has not been uplifted to gecko-18.
> 
> I don't see any blocking/tracking/approval status for this bug, so I don't
> know why you would have thought it would have been...

Uplifting the gecko part here is not needed as I've figured out. Forget these tow last comments from my side. I'm sorry for the noise.
Attachment mime type: text/plain → text/x-github-pull-request
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: