B2G MMS: Cannot send MMS after resetting Message settings (even without any changes)

RESOLVED FIXED in Firefox 28

Status

defect
P1
blocker
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: airpingu, Assigned: jessica)

Tracking

({regression})

unspecified
1.3 C3/1.4 S3(31jan)
ARM
Gonk (Firefox OS)
Dependency tree / graph

Firefox Tracking Flags

(blocking-b2g:1.3+, firefox27 wontfix, firefox28 fixed, firefox29 fixed, b2g-v1.3 fixed, b2g-v1.4 fixed)

Details

(Whiteboard: [FT:RIL])

Attachments

(6 attachments, 17 obsolete attachments)

2.58 KB, patch
jessica
: review+
Details | Diff | Splinter Review
5.28 KB, patch
jessica
: review+
Details | Diff | Splinter Review
3.31 KB, patch
hsinyi
: review+
Details | Diff | Splinter Review
2.61 KB, patch
Details | Diff | Splinter Review
5.30 KB, patch
Details | Diff | Splinter Review
3.33 KB, patch
Details | Diff | Splinter Review
This bug sounds really critical.

STR:

0. Use the latest master codes on 10/25 for both Gecko and Gaia.
1. Open Messaging App to send MMS (repeat this step; always successful).
2. Open Settings App -> Cellular & Data -> Message settings -> Tap on OK (don't do any change).
3. Back to Messaging App to send MMS again -> KO.

I assume this is a regression since I haven't been aware of this issue.
> 3. Back to Messaging App to send MMS again -> KO.

Note that the symptom is the sending circle will keep spinning for a long time.
Using build released on 10/20 seems ok. I will try it with latest build again.
Note that you have to do step #2 within 30 seconds after successfully sending an MMS at step #1, because Gecko will internally keep the MMS connection for 30 seconds for the next attempt. So, if you wait for the disconnection and do the resetting, it won't cause issues.
QA Contact: nkot
This issue started to occur on the Buri Master Build ID: 20131025100746

Gaia   afbf45f26a73b7cd5e0a831bea48087331975286
SourceStamp 2f2a45f04e7c
BuildID 20131025100746
Version 27.0a1

Last working Buri Master Build ID: 20131024154023

Gaia   2feebdb1a2583928f32407d76d798f8654621e2b
SourceStamp 19fd3388c372
BuildID 20131024154023
Version 27.0a1
Hi Ken, I'd appreciate if you could find someone look into this? I'm wondering there is something wrong with the APN settings. We haven't changed MMS logic during this time period.
Flags: needinfo?(kchang)
Jessica, we need your help...:)
Assignee: nobody → jjong
Flags: needinfo?(kchang)
Thanks Jessica very much! Please let me know if you need to reproduce this issue.
(In reply to Gene Lian [:gene] (needinfo? encouraged) from comment #7)
> Thanks Jessica very much! Please let me know if you need to reproduce this
> issue.

Sure! I was not able to reproduce it with my buri, either my phone is too laggy or my hands are not quick enough.
Will ask for your help when I look into this bug. Thank you.
From my observations today, when this issue happens we can find the following in the logs:

10-29 16:47:15.639 I/Gecko   (  144): -@- MmsService: xhr done, but status = 0, statusText = 
10-29 16:47:15.639 I/Gecko   (  144): -@- MmsService: Fail to send. Will retry after: 300000

However, I think the root cause is: when ril.data.apnSettings key is set (no matter values changed or not), RIL will clear all data connections and reconstruct |apnSettings|. The new |apnSettings| does not contain some cached old values and callbacks, so it will fail to notify of the "network-interface-state-changed" events. In this case, MmsService though mms connection was still there so it tried to send the message without setting up mms data connection.

We need to think of a way to handle apn settings change properly when there are active connections.
Interesting! Sounds like not a regression, though. Thanks for finding out the root cause! It's very valuable. We need to work out a better mechanism to handle this case.
A quick and easy way to fix this, since we are enhancing NetworkManager/RILNetworkInterface in the near future, is to send out explicitly a "network-interface-state-changed" event when disconnecting all active data connections due to ril.data.apnSettings change. This way, MmsService will call getDataCallStateByType("mms"), and it will know that mms connection is not in CONNECTED state.
Sounds good to me. However, I'm curious about why the original |deactivateDataCallByType(...)| in |updateApnSettings(...)| doesn't trigger the "network-interface-state-changed"?

In MmsService.js, we also call deactivateDataCallByType(...) to disconnect the MMS connection and can update the |this.connected| via "network-interface-state-changed".
In a normal case, the flow is:

1. (RadioInterface)      deactivateDataCallByType(...) -> 
2. (RILNetworkInterface) disconnect(...) -> 
3. (RadioInterface)      deactivateDataCall(...) ->
4. (ril_worker)          deactivateDataCall(...), here it will send back to RadioInterface a "datacallstatechange" 
                                                  message with state DISCONNECTING . ->
5. (RadioInterface)      handleDataCallState(...) ->
6. (RadioInterface)      _deliverDataCallCallback(...), here it will find all the datacall callback registrants and 
                                                        trigger the callback, RILNetworkInterface(s) should be one of 
                                                        the registrants. ->
7. (RILNetworkInterface) dataCallStateChanged(...), here is the place where it is supposed to notify the 
                                                   "network-interface-state-changed" event.


In the case of updateApnSettings(...), RILNetworkInterface's datacall callback is first unregistered and then RILNetworkInterface is deleted  [1], so in Step 6 above, no callback to RILNetworkInterface(s), which are recreated by then (RILNetworkInterface(s) will register to datacall callbacks on |connect(...)| [2]).


Another thing to note is that after data deactivation is done, state is set to UNKNOWN, and no "network-interface-state-changed" event is notified for this state [3]. Currently, it does not seems to impact anything, but I think is kind of odd.

Please let me know if there is anything unclear or wrong. Thanks.


[1] http://mxr.mozilla.org/mozilla-central/source/dom/system/gonk/RadioInterfaceLayer.js#1555
[2] http://mxr.mozilla.org/mozilla-central/source/dom/system/gonk/RadioInterfaceLayer.js#3483
[3] http://mxr.mozilla.org/mozilla-central/source/dom/system/gonk/RadioInterfaceLayer.js#3429
Oops! Sorry for the alignment...
This one might be a regression.
Keywords: regression
(In reply to Gene Lian [:gene] (needinfo? encouraged) from comment #15)
> This one might be a regression.

comment 4 already proves this is a regression.
Keywords: regression
(In reply to Gene Lian [:gene] (needinfo? encouraged) from comment #15)
> This one might be a regression.

Do you mean regression caused by Bug 931699?
I tried to reproduce this issue today now that Bug 931699 has been backed-out, however, I can't send mms at all with the latest build :(.
The following is found in the logs:

I/Gecko   (  441): [Child 441] WARNING: NS_ENSURE_TRUE(txToRemove) failed: file /home/jessica/hg/src/docshell/shistory/src/nsSHistory.cpp, line 1292
E/GeckoConsole(  441): [JavaScript Error: "ReferenceError: SMIL is not defined" {file: "app://sms.gaiamobile.org/js/message_manager.js" line: 478}]
I/Gecko   (  441): [Child 441] WARNING: There is no observer for "invalidformsubmit". One should be implemented!: file /home/jessica/hg/src/content/html/content/src/HTMLFormElement.cpp, line 1854
E/GeckoConsole(  141): [JavaScript Error: "Return value is not an object."]
E/GeckoConsole(  141): [JavaScript Error: "NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [inIDOMUtils.setContentState]" {file: "chrome://global/content/BrowserElementPanning.js" line: 465}]
(In reply to Jessica Jong [:jjong] [:jessica] from comment #17)

> E/GeckoConsole(  441): [JavaScript Error: "ReferenceError: SMIL is not
> defined" {file: "app://sms.gaiamobile.org/js/message_manager.js" line: 478}]

This one is a Gaia issue (regression).

Btw, I was wondering this bug might not be a regression because the APN resetting issue should been there for a long time since we didn't change much logic on that. Anyway, we have to fix it.
Posted patch proposed patch. (obsolete) — Splinter Review
We set state to DISCONNECTING and notify observers on behalf of RILNetworkInterface when data is deactivated due to ril.data.apnSettings change.

Try: https://tbpl.mozilla.org/?tree=Try&rev=9d5316151f82
Attachment #827322 - Flags: review?(gene.lian)
Comms team, can you help to triage this one?
Comment on attachment 827322 [details] [diff] [review]
proposed patch.

Review of attachment 827322 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/system/gonk/RadioInterfaceLayer.js
@@ +1593,5 @@
> +
> +          // If this iface is going down, notify observers on befalf of it.
> +          // These RILNetworkInterface(s) will not be able to send out the
> +          // notifications since they are going to be recreated.
> +          if (!apnSetting.iface.connectedTypes.length) {

Why not using .inConnectedTypes(type) to notify observers for a certain type only? Just double checking.

Also, I wonder the comment you left doesn't stand. Is it sill possible fire the notification in RILNetworkInterface.disconnect(...)? Also, why the notification firing in RILNetworkInterface.dataCallStateChanged(...) doesn't work?

Btw, I think we need to fix some MMS logic as well since the MMS connection can now early disconnected. For example, we need to call disconnectTimer.cancel() whenever the this.connected is disconnected, and don't need to set up disconnectTimer if this.connected is already disconnected. I can help you with this part or we can fire another bug fixing that.
Attachment #827322 - Flags: review?(gene.lian)
Thanks for the review. Please see my response below.

(In reply to Gene Lian [:gene] (needinfo? encouraged) from comment #21)
> Comment on attachment 827322 [details] [diff] [review]
> proposed patch.
> 
> Review of attachment 827322 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/system/gonk/RadioInterfaceLayer.js
> @@ +1593,5 @@
> > +
> > +          // If this iface is going down, notify observers on befalf of it.
> > +          // These RILNetworkInterface(s) will not be able to send out the
> > +          // notifications since they are going to be recreated.
> > +          if (!apnSetting.iface.connectedTypes.length) {
> 
> Why not using .inConnectedTypes(type) to notify observers for a certain type
> only? Just double checking.

A RILNetworkInterface can be shared by different types. If you disconnect a certain type and the RILNetworkInterface is still used by another type, a "network-interface-state-changed" will be sent in [1], so we need to send the notification only when RILNetworkInterface is not used by any type and is really going down, which is when apnSetting.iface.connectedTypes.length reaches zero.

> 
> Also, I wonder the comment you left doesn't stand. Is it sill possible fire
> the notification in RILNetworkInterface.disconnect(...)? 

Yes. I have just discussed with Ken about this. The original way was to set the GECKO_NETWORK_STATE_XXXX in ril_worker, and notification was fired in RILNetworkInterface.dataCallStateChanged(...), this is because in most of the cases ril_worker gets the correct state from modem. But for CONNECTING/DISCONNECTING state, we can set the state in RILNetworkInterface.connect(...)/disconnect(...) and fire the notifications first.
I will upload another patch for this.


> Also, why the
> notification firing in RILNetworkInterface.dataCallStateChanged(...) doesn't
> work?

RILNetworkInterfaces were recreated and does not register to data call callbacks until RILNetworkInterface.connect(...) is called. Even if it does register, it will not be able to handle the datacallstatechange due to info mismatch. Please refer to the flow in Comment 13.

> 
> Btw, I think we need to fix some MMS logic as well since the MMS connection
> can now early disconnected. For example, we need to call
> disconnectTimer.cancel() whenever the this.connected is disconnected, and
> don't need to set up disconnectTimer if this.connected is already
> disconnected. I can help you with this part or we can fire another bug
> fixing that.

Sure. I'd prefer to fire another bug for this.

Please let me know if there is anything unclear, we can discuss f2f. :)

[1] http://mxr.mozilla.org/mozilla-central/source/dom/system/gonk/RadioInterfaceLayer.js#3262
Blocks: 935418
(In reply to Jessica Jong [:jjong] [:jessica] from comment #22)
> > Also, why the
> > notification firing in RILNetworkInterface.dataCallStateChanged(...) doesn't
> > work?
> 
> RILNetworkInterfaces were recreated and does not register to data call
> callbacks until RILNetworkInterface.connect(...) is called. Even if it does
> register, it will not be able to handle the datacallstatechange due to info
> mismatch. Please refer to the flow in Comment 13.

Oh! I missed that bit. You mentioned that before. Sorry! :)
Joe, see comment 20, I think Kevin wants you to triage this.
Flags: needinfo?(jcheng)
Posted patch proposed patch, v2. (obsolete) — Splinter Review
The following changes are included in this patch:

1. when we receive a "ril.data.apnSettings" change, we compare the new apns with the existing apns and reset the apnSettings only if apn/user/password has changed. The compare result is not always correct due to the shared APN nature. But in the worst case, data connections are deactivated and activated again.

2. we moved the "network-interface-state-changed" notifications firing into NetworkManager. NetworkManager fires the notification after "network-interface-registered"/"network-interface-unregistered" and when all the routes (default and extra) and DNSes are all set.

3. additionally, we have added a updateNetworkInterface(...) function which just updates the routes (default and extra) and DNSes and does not re-register the network interface to avoid sending extra "network-interface-state-changed" notifications.


There seem to be lots of duplicated jobs in NetworkManager.registerNetworkInterface(...), "network-interface-state-changed" and "network-interface-registered" observer. Maybe we can merge these two events when doing NetworkManager enhancement?
Attachment #827322 - Attachment is obsolete: true
Attachment #828574 - Flags: feedback?(vyang)
Attachment #828574 - Flags: feedback?(echen)
(In reply to Jessica Jong [:jjong] [:jessica] from comment #25)
> Created attachment 828574 [details] [diff] [review]
> proposed patch, v2.

Per offline discussion, we're going to:

1) obsolete compareApnSettings function because it doesn't really belong to this bug.  We all know the imperfection of APN settings handler and it really deserves a complete rewrite.

2) Since we're transferring the responsibility of sending TOPIC_INTERFACE_STATE_CHANGED events to NetworkManager, I suppose there won't be TOPIC_INTERFACE_STATE_CHANGED event registration in NetworkManager any more because registering an event that is emitted by oneself makes really no sense.  We can also remove TOPIC_INTERFACE_REGISTERED/TOPIC_INTERFACE_UNREGISTERED event registration with the same reason.

3) The necessity of TOPIC_INTERFACE_REGISTERED/TOPIC_INTERFACE_UNREGISTERED can be another topic.  We can remove them if there is just no one interests in, but merging them into TOPIC_INTERFACE_STATE_CHANGED sounds like a mix of different meanings -- the existence of a certain interface and the variance of the attributes of it.  Really don't feel like it. :(
Attachment #828574 - Flags: feedback?(vyang)
Posted patch (WIP) Part 1: idl changes (obsolete) — Splinter Review
Attachment #828574 - Attachment is obsolete: true
Attachment #828574 - Flags: feedback?(echen)
Attachment #830106 - Attachment description: (WIP) Part 2: wifi changes → (WIP) Part 3: wifi changes
Posted patch (WIP) Part 4: implementation (obsolete) — Splinter Review
As stated in Comment 26, we are removing TOPIC_INTERFACE_STATE_CHANGED, TOPIC_INTERFACE_REGISTERED and TOPIC_INTERFACE_UNREGISTERED event registration from NetworkManager.
NetworkManager will provide a new method "updateNetworkInterface(...)" for updating routes and dnses, and will fire a TOPIC_INTERFACE_STATE_CHANGED notification when done.
However, the original issue still persists, since in updateApnSettings(...), only unregisterNetworkInterface(...) is called and not updateNetworkInterface(...).
Flags: needinfo?(jcheng)
Whiteboard: [FT:comms]
1.3+ for regression
blocking-b2g: 1.3? → 1.3+
Whiteboard: [FT:comms]
Actually, this is not a regression, the code flow has been the same for a long time, so resetting the keyword.
Keywords: regression
After rebasing the WIP patches I did find something really weird.
We have moved the notification firing into NetworkManager, so we can remove the TOPIC_INTERFACE_STATE_CHANGED observer events. However, once I removed the TOPIC_INTERFACE_STATE_CHANGED observer event ([1]~[2]), mms can not be sent on the first time, but it can always be sent out on the second time, either retrying automatically or manually. I have checked the routes and settings and they all seem to be right. Still investigating the cause of this...


[1] http://mxr.mozilla.org/mozilla-central/source/dom/system/gonk/NetworkManager.js#188
[2] http://mxr.mozilla.org/mozilla-central/source/dom/system/gonk/NetworkManager.js#257
(In reply to Jessica Jong [:jjong] [:jessica] from comment #32)
> Actually, this is not a regression, the code flow has been the same for a
> long time, so resetting the keyword.

Umm...why? There's a regression range associated with this bug in comment 4. So what is that regression range pointing at then?
Flags: needinfo?(jjong)
(In reply to Jason Smith [:jsmith] from comment #34)
> (In reply to Jessica Jong [:jjong] [:jessica] from comment #32)
> > Actually, this is not a regression, the code flow has been the same for a
> > long time, so resetting the keyword.
> 
> Umm...why? There's a regression range associated with this bug in comment 4.
> So what is that regression range pointing at then?

Mmm... I think maybe it's just the reproduction steps, note that step 2 must be done within 30 seconds.
I was able to reproduce it using 20131024154023.
Flags: needinfo?(jjong)
It is still called regression, not because of no code change.  It breaks the test case(maybe slightly different) and the user story committed before.
Alright, let's verify this then.

Can someone check if this reproduces on 1.2 or 1.1?
Keywords: qawanted
(In reply to Jessica Jong [:jjong] [:jessica] from comment #35)
> Mmm... I think maybe it's just the reproduction steps, note that step 2 must
> be done within 30 seconds.
> I was able to reproduce it using 20131024154023.

+1. I believe this is not a regression. This issue should have been existing from the very first beginning because we've never been thinking of fixing this corner-edge issue (users seldom change the MMS APN settings when the MMS is still downloading; needless to say only advanced users would want to change the MMS APN settings).
(In reply to Gene Lian [:gene] (needinfo? encouraged) from comment #38)
> (In reply to Jessica Jong [:jjong] [:jessica] from comment #35)
> > Mmm... I think maybe it's just the reproduction steps, note that step 2 must
> > be done within 30 seconds.
> > I was able to reproduce it using 20131024154023.
> 
> +1. I believe this is not a regression. This issue should have been existing
> from the very first beginning because we've never been thinking of fixing
> this corner-edge issue (users seldom change the MMS APN settings when the
> MMS is still downloading; needless to say only advanced users would want to
> change the MMS APN settings).

Then what exactly is the regression range pointing at?

Sarah - Can you clarify what the regression range's STR is actually looking at? The above comments seem to imply that this reproduces on the last working build cited in that regression range.

I'm pulling the blocking+ flag for now since this is leaning towards not being a regression, although I really need to understand what that regression range is pointing at.
blocking-b2g: 1.3+ → ---
Flags: needinfo?(sparsons)
Keywords: qawanted
I just tried again using the Buri Master Build ID 20131024154023 and was not able to reproduce the issue. I used the STR in comment 0. 

Gaia   2feebdb1a2583928f32407d76d798f8654621e2b
SourceStamp 19fd3388c372
BuildID 20131024154023
Version 27.0a1

However, I was able to repro on the Buri 1.3 Build ID: 20131025100746

Gaia   afbf45f26a73b7cd5e0a831bea48087331975286
SourceStamp 2f2a45f04e7c
BuildID 20131025100746
Version 27.0a1
Flags: needinfo?(sparsons)
Sarah - I need to know the exact STR you are using to assess that regression window. I don't need clarification of the regression window itself. The above discussion seems to imply that the STR being used for this regression window isn't right.
Flags: needinfo?(sparsons)
STR:

1. Launch Messaging App.
2. Create, and send an MMS.
3. Open the Settings App.
4. Select Cellular & Data, then Message Settings.
5. Tap OK without making any changes to the Message Settings.
6. Go back to the Message App and send another MMS.
Flags: needinfo?(sparsons)
Sounds like the above analysis actually proves that this is a regression. What might be happening here is that the patch being fixed here isn't fixing the bug in question here.
blocking-b2g: --- → 1.3+
Keywords: regression
Hmm, I think I might misunderstand something. Thanks QA very much for all the effort on this!
Thanks to :mcmanus help we have found the cause of the issue mentioned in comment 33.
Depends on: 946302
Posted patch Sol 1 - Part 1: idl changes (obsolete) — Splinter Review
Add updateNetworkInterface() api.
Attachment #830104 - Attachment is obsolete: true
Posted patch Sol 1 - Part 2: ril changes (obsolete) — Splinter Review
Use updateNetworkInterface() in ril.
Attachment #830105 - Attachment is obsolete: true
Posted patch Sol 1 - Part 3: wifi changes (obsolete) — Splinter Review
Use updateNetworkInterface() in wifi.
Attachment #830106 - Attachment is obsolete: true
Implement updateNetworkInterface() in NetworkManager.js and remove self-fired event observers.
Attachment #830112 - Attachment is obsolete: true
Attachment #8348639 - Attachment description: Part 4: implementation → Part 4.1: implementation
fire network-interface-state-changed events in updateRILNetworkInterface() if needed.
Posted patch Alternative solution (obsolete) — Splinter Review
Defer setting up APN settings until all data calls are cleared, this way the 'network-interface-state-changed' events' would be fired properly.
Attachment #8348631 - Attachment description: Part 1: idl changes → Sol 1 - Part 1: idl changes
Attachment #8348632 - Attachment description: Part 2: ril changes → Sol 1 - Part 2: ril changes
Attachment #8348634 - Attachment description: Part 3: wifi changes → Sol 1 - Part 3: wifi changes
Attachment #8348639 - Attachment description: Part 4.1: implementation → Sol 1 - Part 4.1: implementation
Attachment #8348642 - Attachment description: Part 4.2: additional fix → Sol 1 - Part 4.2: additional fix
Hi Vicamo,

There are two proposed solutions.

The first one is the one that we have discussed before, moving all the 'network-interface-state-changed' events firing to NetworkManager.js, it's just that we still need the Sol 2 - Part 4.1, calling updateNetworkInterface() when registering/unregistering network interface, to be able to fix the original issue.

The seconds one is simpler, that is, to defer the apn settings setup when there is any active data call, and wait until all data calls are cleared to setup the new apn settings.

I would prefer the second one since it will let data call deactivation finish properly. We can still move the 'network-interface-state-changed' events firing to NetworkManager.js, maybe in another bug?
Please let me know what is your opinion on this or any other suggestions are welcome! Thank you.
Flags: needinfo?(vyang)
PM triaged this bug and believes that it should be a blocker.
Whiteboard: [FT:RIL]
Target Milestone: --- → 1.3 C3/1.4 S3(31jan)
Interface changed.
Jessica, can you please discuss Vicamo directly?
Flags: needinfo?(jjong)
Joe, do we really care about this bug for 1.3? This issue is not blocking for us and neither for any customers on 1.3. Since we are getting so close to commercializing 1.3 any such interface changes pose risk to 1.3 quality.
Flags: needinfo?(jcheng)
Hey Joe, I think I misread this bug. This seems like is for a single SIM and not DSDS, in which case this might need to be uplifted to 1.3.
Sorry, will have a review asap today :X
(In reply to Anshul from comment #57)
> Joe, do we really care about this bug for 1.3? This issue is not blocking
> for us and neither for any customers on 1.3. Since we are getting so close
> to commercializing 1.3 any such interface changes pose risk to 1.3 quality.

Absolutely disagree with this analysis. This is a bad regression that places the user in a situation where a MMS can be fail to be sent. It's also a confirmed regression, which is treated as a blocker by default unless it's minor.
Flags: needinfo?(jcheng)
Vicamo proposed Solution#3 -- we should de/activate data connection only if APN settings are actually updated. We shouldn't go deactivate connection only because we received 'ril.data.apnSettings.'

This solution benefits bug 962456 but we need to rewrite the whole updateApnSettings() and have one more cache.
See Also: → 962456
Thanks, Hsinyi and Vicamo.

I'd say that solution#3 does help bug 962456, but it does not solve the problem met in this bug entirely. It will be fine if user does not change the apn setting; but if user does change the apn setting, we still need to deactivate the corresponding data call, and this is where the problem we are trying to solve here comes.

Another thing is, solution#3 is really expensive for the current design, we have to compare each of the apns, then look for it in 'this.apnSettings.byType' to remove the reference, and look for it again in 'this.apnSettings.byApn' to remove the reference. That's why I put solution#3 in (or after) our 'data call enhancement', see bug 939046 item 4.

Please let me know if I am missing anything. Thanks.
Flags: needinfo?(jjong)
(In reply to Vicamo Yang [:vicamo][:vyang] from comment #59)
> Sorry, will have a review asap today :X

10 minutes.  Well, I define it's still "today" if I don't leave the office.
Flags: needinfo?(vyang)
(In reply to Jessica Jong [:jjong] [:jessica] from comment #62)
> Thanks, Hsinyi and Vicamo.
> 
> I'd say that solution#3 does help bug 962456, but it does not solve the
> problem met in this bug entirely. It will be fine if user does not change
> the apn setting; but if user does change the apn setting, we still need to
> deactivate the corresponding data call, and this is where the problem we are
> trying to solve here comes.
> 
> Another thing is, solution#3 is really expensive for the current design, we
> have to compare each of the apns, then look for it in
> 'this.apnSettings.byType' to remove the reference, and look for it again in
> 'this.apnSettings.byApn' to remove the reference. That's why I put
> solution#3 in (or after) our 'data call enhancement', see bug 939046 item 4.

Hi there,

I have an idea about solution#3, how about we keep the logic of updateApnSettings() unchanged and only execute it when the new apns is NOT "exactly" the same as old one. For making the logic smarter, we can keep following up it in bug 939046.
I think above proposal still can help bug 962456, and also fix this bug for the case that user does not change the apn settings.

> 
> Please let me know if I am missing anything. Thanks.
Vicamo proposed another solution#4: to use a command queue for commands that need to wait for another commands to be finished. I think it is a good solution and will benefit things like "deactivate data calls before turning radio off". However, it's not a small change, I doubt that we will able to land it for tomorrow, since there other 1.3+ bugs going on... :(
(In reply to Jessica Jong [:jjong] [:jessica] from comment #65)
> Vicamo proposed another solution#4: to use a command queue for commands that
> need to wait for another commands to be finished. I think it is a good
> solution and will benefit things like "deactivate data calls before turning
> radio off". However, it's not a small change, I doubt that we will able to
> land it for tomorrow, since there other 1.3+ bugs going on... :(

Yes, so let's take solution #2 here because it doesn't seem to affect QCRIL in anyway.
Attachment #8348631 - Attachment is obsolete: true
Attachment #8348632 - Attachment is obsolete: true
Attachment #8348634 - Attachment is obsolete: true
Attachment #8348639 - Attachment is obsolete: true
Attachment #8348642 - Attachment is obsolete: true
Attachment #8348673 - Attachment is obsolete: true
Attachment #8348674 - Attachment is obsolete: true
Attachment #8366505 - Flags: review?(vyang)
Attachment #8366509 - Flags: review?(vyang)
Comment on attachment 8366505 [details] [diff] [review]
Part 1: split updateApnSettings() into two parts.

Review of attachment 8366505 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/system/gonk/RadioInterfaceLayer.js
@@ +2061,5 @@
>     *      corresponding APN setting.
>     *   4. Create RilNetworkInterface for each APN setting created at step 2.
>     */
> +  setupApnSettings: function(apnSettings) {
> +    if (!apnSettings) {

nit: let's keep the name 'simApnSettings' here so that we can reduce changed lines a bit.
Attachment #8366505 - Flags: review?(vyang) → review+
Comment on attachment 8366509 [details] [diff] [review]
Part 2: defer setup of apn settings after data calls are cleared.

Review of attachment 8366509 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you!
Attachment #8366509 - Flags: review?(vyang) → review+
Address review comment 70:
- keep the parameter name 'simApnSettings'.
Attachment #8366505 - Attachment is obsolete: true
Attachment #8367105 - Flags: review+
rebased.
Attachment #8366509 - Attachment is obsolete: true
Attachment #8367106 - Flags: review+
Sorry for finding this issue so late. To check that all data calls have been disconnected, we can not use 'anyDataConnected()', cause this function check for 'connected' data calls, but data calls may be already be in 'disconnecting' state. And since 'getDataCallByType()' is not really accurate (it won't return 'disconnecting' state'), we check for network interfaces directly in 'allDataDisconnected()'
Attachment #8367111 - Flags: review?(htsai)
I have tested the following scenarios with part 3:
- On Buri, enable airplane mode with default and mms connected.
- On Buri, change apn settings with default and mms connected (this bug).
- On Fugu, switch default sim for data call with default and mms connected.
Not many times, but they seem to work fine.
Comment on attachment 8367111 [details] [diff] [review]
Part 3: use allDataDisconnected() instead of anyDataConnected().

Review of attachment 8367111 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks.
Attachment #8367111 - Flags: review?(htsai) → review+
(In reply to Hsin-Yi Tsai (OOO Jan. 29 ~ Feb. 5th) [:hsinyi] from comment #76)
> Comment on attachment 8367111 [details] [diff] [review]
> Part 3: use allDataDisconnected() instead of anyDataConnected().
> 
> Review of attachment 8367111 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Thanks.

I'd say I am not a big fan of having allDataDisconnected() and anyDataConnected() both at the same time. However, we are facing a problem with the current ril network structure, i.e. RILNetworkInterface() is shared by types. The design leads to trouble in interface state changes.

And considering the timing pressure, it looks okay with this patch. And I do hope we could re-design the whole structure as soon as possible.

Anyway, big thank for Jessica's hard work :)
(In reply to Jessica Jong (OOO Jan. 30 ~ Feb. 4) [:jjong] [:jessica] from comment #78)
> Thanks, Vicamo and Hsinyi! :)
> 
> Try: https://tbpl.mozilla.org/?tree=Try&rev=ce6a530e61c3

All green! \o/
Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.