Closed Bug 1070858 Opened 5 years ago Closed 5 years ago

[Telemetry] App usage metrics doesn't transmit if boot with Wi-Fi on only.

Categories

(Firefox OS Graveyard :: Gaia::System, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

(blocking-b2g:2.1+, b2g-v2.1 fixed, b2g-v2.2 fixed)

VERIFIED FIXED
2.1 S6 (10oct)
blocking-b2g 2.1+
Tracking Status
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: shinglyu, Assigned: marshall)

References

Details

(Whiteboard: ft:media)

Attachments

(3 files)

Summary: 

  App usage data doesn't transmit if the devices is booted up with Wi-Fi enabled only.

Steps:
  - (Optional) Set the retry interval to 30 sec in gaia/apps/system/js/app_usage_metrics.js
  - Disable 3G; Enable Wi-Fi and connect to an AP.
  - Reboot the phone.
  - Phone booted with Wi-Fi connected automatically

Expected Result:

  App usage data should start to be transmitted to the telemetry server at predefined interval.

Actual Result:

  App usage data was not transmitted. Disable and re-enable the Wi-Fi interface can trigger the transmission.

Additional Information:

  This won't happen for 3G connection. App usage data will be transimitted immediately when the phone boots and the 3G connection is on.

Environment:
  Gaia      b630b8bcaf9653885539d4449bc65c3b592bd752
  Gecko     https://hg.mozilla.org/releases/mozilla-aurora/rev/7cb113c3ce0c
  BuildID   20140917160201
  Version   34.0a2
  ro.build.version.incremental=eng.cltbld.20140915.191939
  ro.build.date=Mon Sep 15 19:19:49 EDT 2014
Flags: needinfo?(marshall)
[Blocking Requested - why for this release]:
blocking-b2g: --- → 2.1?
Blocks: 1071443
I've reproduced this locally on my Flame, looking into it now..
Flags: needinfo?(marshall)
Attached file track navigator.onLine
Track navigator.onLine to avoid race conditions between AUM service startup and registering for the online/offline events.
Attachment #8496112 - Flags: review?(dflanagan)
Comment on attachment 8496112 [details] [review]
track navigator.onLine

The reason I didn't use navigator.onLine in the first place was because of the documentation here: https://developer.mozilla.org/en-US/docs/Web/API/NavigatorOnLine.onLine

It seems ambiguous to me whether this property actually tracks the online state of the device. If I open up the console in Aurora on my Mac, I see navigator.onLine is true. Then I disable wifi, and navigator.onLine is still true.

I haven't actually reviewed the code in any detail yet. It seems simple and an easy r+ but only if you can convince me that navigator.onLine actually works in FirefoxOS. (and if it does, we should set dev-doc-needed or something on this bug to get MDN updated.)

If we aren't sure about navigator.onLine, then another way to address this boot time race condition would be to assume at startup that we do have a connection and update the online flag if the xhr request fails.
Attachment #8496112 - Flags: review?(dflanagan) → review-
(In reply to David Flanagan [:djf] from comment #4)
> The reason I didn't use navigator.onLine in the first place was because of
> the documentation here:
> https://developer.mozilla.org/en-US/docs/Web/API/NavigatorOnLine.onLine

According to this, the 'online' and 'offline' events that we register for are just reflections of changes to the value of |navigator.onLine|, so if the implementation of the onLine flag is problematic, then the online/offline events for tracking would be also..

> 
> It seems ambiguous to me whether this property actually tracks the online
> state of the device. If I open up the console in Aurora on my Mac, I see
> navigator.onLine is true. Then I disable wifi, and navigator.onLine is still
> true.

In my testing with mobile data disabled, WiFi enabled, and navigator.onLine this is what I've found:

* Initially, and for a period during device bootup when WiFi is disconnected, navigator.onLine is false
* Once WiFi is successfully connected, navigator.onLine is true
* After Turning off WiFi from Settings, navigator.onLine is false
* After turning WiFi back on From Settings, and a connection is established, navigator.onLine is true

> 
> I haven't actually reviewed the code in any detail yet. It seems simple and
> an easy r+ but only if you can convince me that navigator.onLine actually
> works in FirefoxOS. (and if it does, we should set dev-doc-needed or
> something on this bug to get MDN updated.)

Agreed! So have I convinced you? :)
Flags: needinfo?(dflanagan)
Moving to "System" component since that's where the fixes are going in.

Thanks
Hema
Component: Gaia → Gaia::System
Blocking Reason: App Usage metrics collection and transmission should work on when only wifi is enabled.
Assignee: nobody → marshall
blocking-b2g: 2.1? → 2.1+
Target Milestone: --- → 2.1 S6 (10oct)
Whiteboard: ft:media
Comment on attachment 8496112 [details] [review]
track navigator.onLine

I was hoping to be convinced by pointers to code, or by input from someone who knows the networking stack. But if your testing has convinced you, I don't want to hold this up, so resetting to r+.

I left one code style nit on github.

And I'd suggest you test navigator.onLine when you turn a data connection on and off to see if it responds to that as well as to wifi.

I'm concerned about false positives, where we think we're online, but our XHR can't actually succeed. Each time we try to send we're going to be writing data to asyncStorage and then rewriting it when we fail. I wonder if the default retry interval of 1 hour is the right choice.  Probably some kind of exponential backoff could have been better there... retry in 30 min, then 1 hour, then 2 hours, 4 hours, 8 hours, etc. until we get to a number larger than the reporting interval.  But probably not something to fix here.
Attachment #8496112 - Flags: review- → review+
Flags: needinfo?(dflanagan)
Marshall,

I just thought of something.  One of the nice features of the current implemntation is that when we transition from offline to online, we use that event to check whether it is time to send a batch of metrics and do that right away.

We don't actually run a timer for the reporting interval, so we only send metrics when something happens. online events can be one of those somethings.

With your patch we no longer listen for online events, so we will no longer send an overdue batch of metrics as soon as we come online. If the phone comes online but the user doesn't do anything, there will be nothing to trigger the metrics transmission. By the time the user actually switches apps (which would trigger a transmission) we may have lost connectivity.  

(This assumes that we actually get online and offline events as we move in and out of cell tower range, and I don't know if that actually happens.  Maybe with data enabled we're actually online all the time even if we have no signal.)

In any case, if this send-on-online behavior is something we should preserve, then an easier fix would be to take the existing code and initialize this.online to navigator.onLine instead of initializing it to false. Then, just continue to monitor online and offline events as we do currently, and that will presumably keep this.online in sync wiht navigator.onLine.  (And if we are not confident of that we could ensure that they are in sync in the event handler so that any time any app transition happens, we check that this.online === navigator.onLine, and if it isn't we issue a warning message and set it)
(In reply to David Flanagan [:djf] from comment #9)
> Marshall,
> 
> I just thought of something.  One of the nice features of the current
> implemntation is that when we transition from offline to online, we use that
> event to check whether it is time to send a batch of metrics and do that
> right away.
> 

We are registering AUM as an event listener for the online/offline events, we just don't have custom logic for them anymore. We always check navigator.onLine, which handily reflects the latest online/offline status :)
Blocks: 1078120
Can we land this?
Flags: needinfo?(marshall)
(In reply to Gregor Wagner [:gwagner] from comment #11)
> Can we land this?

Just waiting on a new try run, I had to fix a merge conflict and rebase
Flags: needinfo?(marshall)
Attached file v2.1 PR
[Approval Request Comment]
[Bug caused by] (feature/regressing bug #): Bug 1033549
[User impact] if declined: app usage metrics won't be properly submitted when the phone is online
[Testing completed]: Unit tests/Try, local testing.
[Risk to taking this patch] (and alternatives if risky):
[String changes made]: none
Attachment #8502059 - Flags: approval-gaia-v2.1?(fabrice)
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
slyu, can you please verify this once it lands on 2.1? Thanks !
Flags: needinfo?(slyu)
Attachment #8502059 - Flags: approval-gaia-v2.1?(fabrice) → approval-gaia-v2.1+
Comment on attachment 8502059 [details] [review]
v2.1 PR

Needs rebasing.
Attachment #8502059 - Attachment is obsolete: true
Flags: needinfo?(marshall)
PR rebased
Flags: needinfo?(marshall)
(In reply to bhavana bajaj [:bajaj] from comment #15)
> slyu, can you please verify this once it lands on 2.1? Thanks !

VERIFIED

Steps:
  * Change the upload interval to 30 sec.
  * make reset-gaia, setup Wi-Fi connection in FTU.
  * Reboot > Wi-Fi connects automatically.
  * Unlock the screen, wait for 2 sec, and lock the screen again.
  * Watch for logcat message and server log

Result:
  App usage metrics uploaded automatically (see attachments)
Status: RESOLVED → VERIFIED
Flags: needinfo?(slyu)
Attached image Verified result
Add the screenshot to prove the verified result.
You need to log in before you can comment on or make changes to this bug.