Closed Bug 1126222 Opened 9 years ago Closed 9 years ago

B2G NetworkManager: fire 'network-connection-state-changed' after internal work is done

Categories

(Firefox OS Graveyard :: RIL, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(tracking-b2g:+)

RESOLVED FIXED
2.2 S12 (15may)
tracking-b2g +

People

(Reporter: jessica, Assigned: jessica)

References

Details

(Whiteboard: [caf priority: p2][CR 819585][grooming])

Attachments

(3 files, 5 obsolete files)

We should fire 'network-connection-state-changed' after routes/dnses/proxy are set/cleared. Maybe use promise in this case.
Blocks: 904514
blocking-b2g: --- → backlog
Whiteboard: [grooming]
Depends on: 1130962
blocking-b2g: backlog → ---
Location components rely a lot on this topic change network-connection-state-changed, to trigger pending time, xtra data downloads or trigger pending crowdsourcing uploads. Getting this notification must mean that the routes/dns/proxies are all setup and ready for use immediately.
Edgar, This bug seems a refactor. Does it create any critical issue that user will face?
Flags: needinfo?(echen)
Is it too late to include this into the CAF bug list? Thanks.
Flags: needinfo?(mlee)
(In reply to Ken Chang[:ken] from comment #2)
> Edgar, This bug seems a refactor. Does it create any critical issue that
> user will face?

In current design, there is a very short period that user possibly unable to access network after receiving 'network-connection-state-changed' due to the internal setup (routing, proxy ... etc) was not ready yet. This bug is trying to improve the design, though we haven't received any bug that is caused by it.
Flags: needinfo?(echen)
Whiteboard: [grooming] → [CR 819585][grooming]
Whiteboard: [CR 819585][grooming] → [caf priority: p2][CR 819585][grooming]
Anshul,

Please work with Bhavna Sharma to clarify whether this needs to be a CAF FC blocker.

Thanks,
Mike
Flags: needinfo?(mlee) → needinfo?(anshulj)
This is causing an issue with MMS.

>2078:04-08 15:09:52.772   261   261 I Gecko   : -@- MmsService: Failed to ensureRouting: Error: Failed to resolve 'proxy.mobile.att.net', with status: 2152398878
>2079:04-08 15:09:52.782   261   261 I Gecko   : -@- MmsService: The sending status of sendTransaction.run(): -8
>2080:04-08 15:09:52.782   261   261 I Gecko   : -@- MmsService: The returned status of sending transaction: aErrorCode: 4 aEnvelopeId: null
>2089:04-08 15:09:52.802   261  2405 D NetworkUtils: Preparing to send 'resolver setnetdns 100 mozilla.rmnet_data0.doman 172.26.38.1 172.26.38.2' command...
>2090:04-08 15:09:52.802   261  2405 D NetworkUtils: Sending '0 resolver setnetdns 100 mozilla.rmnet_data0.doman 172.26.38.1 172.26.38.2' command to netd.
>2091:04-08 15:09:52.802   261  2405 D NetworkUtils: Receiving "0 resolver setnetdns 100 mozilla.rmnet_data0.doman 172.26.38.1 172.26.38.2" command response from netd.
>2092:04-08 15:09:52.802   261  2405 D NetworkUtils:           ==> Code: 200  Reason: Resolver command succeeded


This happens somewhat consistently when trying to send an MMS with no data call is currently up.
When the case mentioned in comment 6 happens, won't MMS service try to establish data call?
Flags: needinfo?(btseng)
(In reply to Ken Chang[:ken] from comment #7)
> When the case mentioned in comment 6 happens, won't MMS service try to
> establish data call?

No, The data call is established before this error was thrown and the connection is reference-counted, so it will be kept alive unless disconnected by network/modem.
Flags: needinfo?(btseng)
Flags: needinfo?(anshulj) → needinfo?(sbhavna)
Move needinfo to Phil, as he can provide more information on MMS service.
Flags: needinfo?(sbhavna) → needinfo?(pgravel)
NI myself to track this.
Flags: needinfo?(whuang)
This seems to be covered by the change in NetworkManager of the patch in bug 992772.
https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=992772&attachment=8592666
setDNS() was wrapped as Promise and 
'network-connection-state-changed' will be fired after setDNS() is done.
(In reply to Bevis Tseng[:bevistseng][:btseng] from comment #11)
> This seems to be covered by the change in NetworkManager of the patch in bug
> 992772.
> https://bugzilla.mozilla.org/page.cgi?id=splinter.
> html&bug=992772&attachment=8592666
> setDNS() was wrapped as Promise and 
> 'network-connection-state-changed' will be fired after setDNS() is done.

Actually, it is not completely covered: 'network-connection-state-changed' will be fired after setDNS() is done, but it is not guaranteed it will be fired after dns routes are set. So, if there is no existing default route, there is a chance that dns routes are not ready when 'network-connection-state-changed' is fired.

However, this has always been this way and we haven't had issues before. I'd like to see if the mms issue still exists after bug 1152568.
Hi Phil,
Comment 6 is not obviously pointing to the root cause being this bug. It smells more like bug 1152568 instead. We need more logs from you if you still think the 'network-connection-state-changed' firing timing brings you some trouble in mms sending.
Michael, can we remove this bug from 2.2 CAF list and focus on bug 1152568?
Flags: needinfo?(mvines)
Apologies for the delay, got sick and have been scrambling to catch up on various issues.

I still believe this is a different issue than 1152568.

I'm attaching logs of my test scenario with full debug enabled. I added a couple of logs to help track network-connection-state-changed notifications and you'll probably also notice the two lines with "PHIL" in the logs which denotes very small tweaks[1][2] I've made in MmsService which I've made in order to fully isolate this issue from 1152568.


Steps followed in the attached log (mmfailusingdefaultdata.log)
1. reboot
2. Have data down
3. Try to send a MMS
4. MMS will fail to send
5. Immediately re-send the same MMS by pressing the (!) mark and pressing OK
6. MMS will send successfully


The 2nd attempt at sending the MMS is done within the time window that the MMS connection remains active, so the exact same data call is used and no new NetworkInterface is registered or modified within the NetworkManager. Since the 2nd attempt succeeds while there is no apparent changes to the connection, i still feels like we are running into the aforementioned timing problem with dns.


[1] - SetupDataCallByType('mms') results in both NETWORK_TYPE_MOBILE and NETWORK_TYPE_MOBILE_MMS network interfaces to be brought up (shared APN, same data call). This is done to avoid running into proxy resolve issues discussed in 1152568 when only NETWORK_TYPE_MOBILE_MMS is registered.
[2] - Made MmsService wait for network-connection-state-changed on NETWORK_TYPE_MOBILE instead of NETWORK_TYPE_MOBILE_MMS because I noticed network-connection-state-changed for MMS comes in before MOBILE since there is less routing to do. This was done to attempt to give MORE time for all netd commands to be sent and resolved, but in the end the behavior is the same and the MMS still fails to resolve the proxy. Logs are pretty much identical either way.
Flags: needinfo?(pgravel)
Flags: needinfo?(mvines)
Phil, thanks for the experiments and logs. I think this problem has become more apparent in Lollipop, because all the routes are set asynchronously through netd, whereas before Lollipop, most of them were set using native functions (ifc_*).

However, I have tested myself using Nexus-5-L, and mms can be sent successfully on the first time without default data. I think the difference is because, without bug 1152568, you need to tweak MmsService to wait for NETWORK_TYPE_MOBILE event, which has more routing to do after sending network-connection-state-changed, so it is more likely for NETWORK_TYPE_MOBILE not to be ready when MmsService receives the event. Have you tested this with the patch in bug 1152568 (actually, you need only bug 992772)? so you don't need to tweak MmsService.

I know this is still a problem, so, we'll try to fix this. In the meantime, could you test this with the patch in bug 992772, if it works, maybe this should not be a blocker? Thanks!
Assignee: nobody → jjong
NI for comment 16, thanks.
Flags: needinfo?(pgravel)
Hi Jessica,

In my last set of logs, I already made the change to wait for NETWORK_TYPE_MOBILE (as mentioned in the [2] note) and it didn't seem to help.

Also, I don't see a patch available for bug 1152568?

I'll make some time tomorrow to try out the patch from bug 992772 and let you know.

Thanks.
Hi Phil, please find my comments below.

(In reply to pgravel from comment #18)
> Hi Jessica,
> 
> In my last set of logs, I already made the change to wait for
> NETWORK_TYPE_MOBILE (as mentioned in the [2] note) and it didn't seem to
> help.

What I meant in comment 16, is that, I _don't_ want MmsService to wait for NETWORK_TYPE_MOBILE, cause it is more vulnerable to race conditions (it has more tasks to do). So, with the patch in bug 992772, you don't need the tweaks.

> 
> Also, I don't see a patch available for bug 1152568?
> 
> I'll make some time tomorrow to try out the patch from bug 992772 and let
> you know.

Yes, please use the patch in bug 992772. Thanks.

> 
> Thanks.
Hi Jessica,

With the patch from bug 992772 I am able to send MMS much more consistently. I did run into a few failed attempts, but haven't been able to reproduce with any consistency.

So in terms of MMS, I think we're OK.
Flags: needinfo?(pgravel)
(In reply to pgravel from comment #20)
> Hi Jessica,
> 
> With the patch from bug 992772 I am able to send MMS much more consistently.
> I did run into a few failed attempts, but haven't been able to reproduce
> with any consistency.
> 
> So in terms of MMS, I think we're OK.

Thanks Phil, that's good news.
Just letting you know, we are also working on this bug.
(In reply to pgravel from comment #20)
> Hi Jessica,
> 
> With the patch from bug 992772 I am able to send MMS much more consistently.
> I did run into a few failed attempts, but haven't been able to reproduce
> with any consistency.
> 
> So in terms of MMS, I think we're OK.

Thanks Phil for the testing.

According to CAF test result, it looks this could be removed from a CAF-v2.2-meta, right?
Flags: needinfo?(anshulj)
No longer blocks: CAF-v2.2-metabug
Clear NI as Phil already adjusted the bug dependency.
Flags: needinfo?(whuang)
Flags: needinfo?(anshulj)
I found that sometimes network info are cleared before NetworkManager finishes its actions, so I let NetworkManager keep a copy of network when updateNetworkInterface() is called.
Attachment #8599077 - Attachment is obsolete: true
Comment on attachment 8599076 [details] [diff] [review]
Part 1: provide callback in some functions in NetworkService.

Edgar, may I have your review on this? Thanks.
Attachment #8599076 - Flags: review?(echen)
Attachment #8599783 - Flags: review?(echen)
Hey, why is this blocking CAF-2.2 again?
ni? to Greg for comment 29.
Flags: needinfo?(ggrisco)
Attachment #8599076 - Flags: review?(echen) → review+
Comment on attachment 8599783 [details] [diff] [review]
Part 2: use promise to ensure execution order in NetworkManager, v2.

Review of attachment 8599783 [details] [diff] [review]:
-----------------------------------------------------------------

Mostly looks good, but I would like to see a revised version again, thank you.

::: dom/system/gonk/NetworkManager.js
@@ +100,5 @@
> +  this.httpProxyHost = aNetwork.httpProxyHost;
> +  this.httpProxyPort = aNetwork.httpProxyPort;
> +}
> +NetworkInterface.prototype = {
> +  QueryInterface: XPCOMUtils.generateQI([Ci.nsINetworkInterface]),

|NetworkInterface| is only used for caching status, don't need this, unless it will be exposed to other module.

@@ +343,3 @@
>              // Add host route for data calls
> +            if (!this.isNetworkTypeMobile(networkInterface.type)) {
> +              return Promise.resolve();

Just `return;` is enough. The new promise is fulfilled if the callback returns a value not a promise.

@@ +360,5 @@
> +          })
> +          .then(() => {
> +            if (networkInterface.type !=
> +                Ci.nsINetworkInterface.NETWORK_TYPE_MOBILE_DUN) {
> +              return Promise.resolve();

Ditto, return;

@@ +388,5 @@
>              Services.obs.notifyObservers(network, TOPIC_CONNECTION_STATE_CHANGED,
>                                           this.convertConnectionType(network));
> +          })
> +          .catch(aError => {
> +            throw new Error(aError);

Catch a rejected promise and create a new rejected promise by throwing an exception?
I guess you could just print some message here instead?

@@ +395,5 @@
>        case Ci.nsINetworkInterface.NETWORK_STATE_DISCONNECTED:
> +        Promise.resolve()
> +          .then(() => {
> +            if (!this.isNetworkTypeMobile(networkInterface.type)) {
> +              return Promise.resolve();

Ditto, return;

@@ +403,5 @@
> +          })
> +          .then(() => {
> +            if (networkInterface.type !=
> +                Ci.nsINetworkInterface.NETWORK_TYPE_MOBILE_DUN) {
> +              return Promise.resolve();

Ditto, return;

@@ +446,5 @@
> +            Services.obs.notifyObservers(network, TOPIC_CONNECTION_STATE_CHANGED,
> +                                         this.convertConnectionType(network));
> +          })
> +          .catch(aError => {
> +            throw new Error(aError);

Ditto, print message instead?

@@ +676,5 @@
>      return null;
>    },
>  
> +  _setSecondaryRoute: function(doAdd, interfaceName, route) {
> +    return new Promise((aResolve, aReject) => {

Have a consistent style, put `a` prefix for other arguments.

@@ +678,5 @@
>  
> +  _setSecondaryRoute: function(doAdd, interfaceName, route) {
> +    return new Promise((aResolve, aReject) => {
> +      if (doAdd) {
> +        gNetworkService.addSecondaryRoute(interfaceName, route, (success) => {

s/success/aSuccess/

@@ +815,5 @@
> +    return Promise.resolve()
> +      .then(() => {
> +        // Don't set default route on secondary APN
> +        if (!this.active ||
> +            (this.active && this.isNetworkTypeSecondaryMobile(this.active.type))) {

if(!(this.active && this.isNetworkTypeSecondaryMobile(this.active.type))) {

::: dom/system/gonk/RadioInterfaceLayer.js
@@ +2837,2 @@
>          }
> +      }.bind(this, RIL.GECKO_NETWORK_STATE_DISCONNECTED), Ci.nsIEventTarget.DISPATCH_NORMAL);

Nit: Use arrow function instead.
Attachment #8599783 - Flags: review?(echen)
Thanks Edgar for the review.

(In reply to Edgar Chen [:edgar][:echen] from comment #31)
> Comment on attachment 8599783 [details] [diff] [review]
> Part 2: use promise to ensure execution order in NetworkManager, v2.
> 
> Review of attachment 8599783 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Mostly looks good, but I would like to see a revised version again, thank
> you.
> 
> ::: dom/system/gonk/NetworkManager.js
> @@ +100,5 @@
> > +  this.httpProxyHost = aNetwork.httpProxyHost;
> > +  this.httpProxyPort = aNetwork.httpProxyPort;
> > +}
> > +NetworkInterface.prototype = {
> > +  QueryInterface: XPCOMUtils.generateQI([Ci.nsINetworkInterface]),
> 
> |NetworkInterface| is only used for caching status, don't need this, unless
> it will be exposed to other module.

Oh, I was thinking whether to use 'networkInterface' or 'network' when calling notifyObservers in updateNetworkInterface(), then I decided to use the original 'network' so I think we don't need this. Thanks.

> 
> @@ +343,3 @@
> >              // Add host route for data calls
> > +            if (!this.isNetworkTypeMobile(networkInterface.type)) {
> > +              return Promise.resolve();
> 
> Just `return;` is enough. The new promise is fulfilled if the callback
> returns a value not a promise.

Got it. Thanks.

> 
> @@ +360,5 @@
> > +          })
> > +          .then(() => {
> > +            if (networkInterface.type !=
> > +                Ci.nsINetworkInterface.NETWORK_TYPE_MOBILE_DUN) {
> > +              return Promise.resolve();
> 
> Ditto, return;

Will do.

> 
> @@ +388,5 @@
> >              Services.obs.notifyObservers(network, TOPIC_CONNECTION_STATE_CHANGED,
> >                                           this.convertConnectionType(network));
> > +          })
> > +          .catch(aError => {
> > +            throw new Error(aError);
> 
> Catch a rejected promise and create a new rejected promise by throwing an
> exception?
> I guess you could just print some message here instead?

This is because updateNetworkInterface() does not return a promise, so I think caller does not have the chance to handle the rejected promise, that's why I throw an error instead. Does that make sense?
However, before this patch, we don't throw anything on error, so maybe we should keep the same behaviour.

> 
> @@ +395,5 @@
> >        case Ci.nsINetworkInterface.NETWORK_STATE_DISCONNECTED:
> > +        Promise.resolve()
> > +          .then(() => {
> > +            if (!this.isNetworkTypeMobile(networkInterface.type)) {
> > +              return Promise.resolve();
> 
> Ditto, return;

Will do.

> 
> @@ +403,5 @@
> > +          })
> > +          .then(() => {
> > +            if (networkInterface.type !=
> > +                Ci.nsINetworkInterface.NETWORK_TYPE_MOBILE_DUN) {
> > +              return Promise.resolve();
> 
> Ditto, return;

Will do.

> 
> @@ +446,5 @@
> > +            Services.obs.notifyObservers(network, TOPIC_CONNECTION_STATE_CHANGED,
> > +                                         this.convertConnectionType(network));
> > +          })
> > +          .catch(aError => {
> > +            throw new Error(aError);
> 
> Ditto, print message instead?
> 
> @@ +676,5 @@
> >      return null;
> >    },
> >  
> > +  _setSecondaryRoute: function(doAdd, interfaceName, route) {
> > +    return new Promise((aResolve, aReject) => {
> 
> Have a consistent style, put `a` prefix for other arguments.

Will do.

> 
> @@ +678,5 @@
> >  
> > +  _setSecondaryRoute: function(doAdd, interfaceName, route) {
> > +    return new Promise((aResolve, aReject) => {
> > +      if (doAdd) {
> > +        gNetworkService.addSecondaryRoute(interfaceName, route, (success) => {
> 
> s/success/aSuccess/

Will do.

> 
> @@ +815,5 @@
> > +    return Promise.resolve()
> > +      .then(() => {
> > +        // Don't set default route on secondary APN
> > +        if (!this.active ||
> > +            (this.active && this.isNetworkTypeSecondaryMobile(this.active.type))) {
> 
> if(!(this.active && this.isNetworkTypeSecondaryMobile(this.active.type))) {

Hmm, I think this will cover the case where this.active is defined but it is not type secondary mobile, but we don't it to be covered because we need to set default route in this case.

> 
> ::: dom/system/gonk/RadioInterfaceLayer.js
> @@ +2837,2 @@
> >          }
> > +      }.bind(this, RIL.GECKO_NETWORK_STATE_DISCONNECTED), Ci.nsIEventTarget.DISPATCH_NORMAL);
> 
> Nit: Use arrow function instead.

Will do.
(In reply to Jessica Jong [:jjong] [:jessica] from comment #32)
> (In reply to Edgar Chen [:edgar][:echen] from comment #31)
> > ::: dom/system/gonk/NetworkManager.js
> > @@ +388,5 @@
> > >              Services.obs.notifyObservers(network, TOPIC_CONNECTION_STATE_CHANGED,
> > >                                           this.convertConnectionType(network));
> > > +          })
> > > +          .catch(aError => {
> > > +            throw new Error(aError);
> > 
> > Catch a rejected promise and create a new rejected promise by throwing an
> > exception?
> > I guess you could just print some message here instead?
> 
> This is because updateNetworkInterface() does not return a promise, so I
> think caller does not have the chance to handle the rejected promise, that's
> why I throw an error instead. Does that make sense?

Throwing an exception in the promise callback doesn't really "throw an exception", but create a new rejected promise with the exception as the rejection reason [1].

[1] https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/Promise.jsm/Promise#then%28%29 

> However, before this patch, we don't throw anything on error, so maybe we
> should keep the same behaviour.
(In reply to Edgar Chen [:edgar][:echen] from comment #33)
> (In reply to Jessica Jong [:jjong] [:jessica] from comment #32)
> > (In reply to Edgar Chen [:edgar][:echen] from comment #31)
> > > ::: dom/system/gonk/NetworkManager.js
> > > @@ +388,5 @@
> > > >              Services.obs.notifyObservers(network, TOPIC_CONNECTION_STATE_CHANGED,
> > > >                                           this.convertConnectionType(network));
> > > > +          })
> > > > +          .catch(aError => {
> > > > +            throw new Error(aError);
> > > 
> > > Catch a rejected promise and create a new rejected promise by throwing an
> > > exception?
> > > I guess you could just print some message here instead?
> > 
> > This is because updateNetworkInterface() does not return a promise, so I
> > think caller does not have the chance to handle the rejected promise, that's
> > why I throw an error instead. Does that make sense?
> 
> Throwing an exception in the promise callback doesn't really "throw an
> exception", but create a new rejected promise with the exception as the
> rejection reason [1].
> 
> [1]
> https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/
> Promise.jsm/Promise#then%28%29 

I get what you mean now, thanks for the information, I'll just print some error message here.

> 
> > However, before this patch, we don't throw anything on error, so maybe we
> > should keep the same behaviour.
Removing this from CAF-2.2 blockers again.  It shouldn't come back unintentionally this time.
No longer blocks: CAF-v2.2-metabug
Flags: needinfo?(ggrisco)
Hi Edgar, I have addressed the review comments in comment 31. Could you take a look again? Thanks.

Note that I kept the QueryInterface part in NetworkInterface cause it is exposed to NetworkService.
Attachment #8599783 - Attachment is obsolete: true
Attachment #8601316 - Flags: review?(echen)
Comment on attachment 8601316 [details] [diff] [review]
Part 2: use promise to ensure execution order in NetworkManager, v3.

Review of attachment 8601316 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thank you.

::: dom/system/gonk/NetworkManager.js
@@ +100,5 @@
> +  this.httpProxyHost = aNetwork.httpProxyHost;
> +  this.httpProxyPort = aNetwork.httpProxyPort;
> +}
> +NetworkInterface.prototype = {
> +  QueryInterface: XPCOMUtils.generateQI([Ci.nsINetworkInterface]),

(In reply to Jessica Jong [:jjong] [:jessica] from comment #37)
> Note that I kept the QueryInterface part in NetworkInterface cause it is
> exposed to NetworkService.

I see. I missed NetworkService.
We are also considering don't take the nsINetworkInterface directly as argument in NetworkService, but spread the arguments up.

@@ +109,5 @@
> +
> +    return this.ips.length;
> +  },
> +
> +  getGateways: function (aCount) {

Nit: Remove space between 'function' and '('.

@@ +116,5 @@
> +    }
> +    return this.gateways.slice();
> +  },
> +
> +  getDnses: function (aCount) {

Nit: Ditto.
Attachment #8601316 - Flags: review?(echen) → review+
try looks good but b2g-inbound is closed at the moment.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0cc33c7fa13f
Patches were merged into m-c last week.

https://hg.mozilla.org/mozilla-central/rev/d210ab7b17ef
https://hg.mozilla.org/mozilla-central/rev/772d976eb3e8
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2.2 S12 (15may)
Depends on: 1162865
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: