Closed Bug 1205835 Opened 9 years ago Closed 8 years ago

(telemetry v1) Support sending core metrics in telemetry pings from Firefox for Android

Categories

(Firefox for Android Graveyard :: General, defect)

All
Android
defect
Not set
normal

Tracking

(firefox45 fixed, firefox46 fixed, firefox47 fixed, fennec45+)

RESOLVED FIXED
Firefox 47
Tracking Status
firefox45 --- fixed
firefox46 --- fixed
firefox47 --- fixed
fennec 45+ ---

People

(Reporter: mfinkle, Assigned: mcomella)

References

Details

Attachments

(2 files)

Telemetry is moving to "main" pings and server-side analysis and summary datasets are already starting to depend on it. Firefox for Android only sends "saved-session" and "crash" (although that comes from CrashManager.jsm). 

I don't know the details on the difference between "saved-session" and "main", but the core place where Firefox for Android sends Telemetry is here:
https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetrySession.jsm#1751

That is hardcoded to send "saved-session". Should we send a duplicate payload with the type = "main"?
tracking-fennec: --- → ?
Proposal before diving into details:
* leave "saved-session" generation as is to keep t.m.o etc. working
* also generate a "main" ping in that location, like [0]:
* have that ping be of type "main" and with reason "application-background"
* always upload these, never replace or delete them

That way we start receiving the new data approach but keep existing setups and the dashboard working.
The main ping will basically have the same data as "saved-session", but it will follow the subsession model - i.e. histogram state and some other core measurements are reset between subsessions.

0: https://dxr.mozilla.org/mozilla-central/rev/9ed17db42e3e46f1c712e4dffd62d54e915e0fac/toolkit/components/telemetry/TelemetrySession.jsm#1570
The reason we have main pings on desktop is separation via environment changes, and 24-hour separation. Before making changes for Android we need to understand how we actually want to split pings up. App sessions are also shorter than gecko sessions.

Let's spend time designing this.
The strawman would be to have "app sessions" be represented as individual pings.
That assumes that we want to split them up at those times though.

What is the application cycle here that we want to model?

Something like this?
[app foreground] -> [app background/paused] -> ... repeat until ... -> [app terminated (clean or aborted)]

Can Gecko still be active in the background?
If yes, do we need to cleanly separate measurements that happened while in the background from those that occured after the app went into the foreground again?
Flags: needinfo?(bbermes)
Benjamin, can we please setup a time where we can discuss when this should be done and who from the fennec team needs to assist?
Flags: needinfo?(bbermes) → needinfo?(benjamin)
You will need to coordinate this with thuelbert (EPM) and kparlante.
Flags: needinfo?(benjamin)
tracking-fennec: ? → 45+
Blocks: ut-android
Assignee: nobody → ally
Ally, are you still working on this?
Flags: needinfo?(ally)
nope
Assignee: a.m.naaktgeboren → nobody
Flags: needinfo?(a.m.naaktgeboren)
tracking-fennec: 45+ → ?
Let's pull Sebastian's core work from bug 1226322 into this bug.
Assignee: nobody → s.kaspari
tracking-fennec: ? → 46+
Morphing the summary. We don't necessarily care about implementing the "main" ping, but rather sending the core metrics we need to power executive dashboards (and A/B testing).
Summary: Support "main" pings from Firefox for Android → Support sending core metrics in telemetry pings from Firefox for Android
Hot potato!
Assignee: s.kaspari → michael.l.comella
https://reviewboard.mozilla.org/r/29743/#review27063

::: mobile/android/services/src/main/java/org/mozilla/gecko/background/telemetry/TelemetryPingGenerator.java:45
(Diff revision 1)
> +            ping.put(CREATION_DATE, 0); // TODO: This.

times.json, held in the profile

::: mobile/android/services/src/main/java/org/mozilla/gecko/background/telemetry/TelemetryPingGenerator.java:68
(Diff revision 1)
> +            final String clientId = UUID.randomUUID().toString();

this will need to be synced to the gecko clientid

::: mobile/android/services/src/main/java/org/mozilla/gecko/background/telemetry/TelemetryPingGenerator.java:78
(Diff revision 1)
> +        app.put(APP_NAME, AppConstants.MOZ_APP_DISPLAYNAME); // TODO Correct?

I think this is MOZ_APP_BASENAME

::: mobile/android/services/src/main/java/org/mozilla/gecko/background/telemetry/TelemetryPingGenerator.java:81
(Diff revision 1)
> +        app.put(APP_PLATFORM_VERSION, 0); // TODO: where to get?

platform is locked to app, so you can use the APP_VERSION, but technically held in GRE_MILESTONE

::: mobile/android/services/src/main/java/org/mozilla/gecko/background/telemetry/TelemetryPingGenerator.java:83
(Diff revision 1)
> +        app.put(APP_CHANNEL, AppConstants.MOZ_UPDATE_CHANNEL); // TODO: Correct?

correct
> > +            ping.put(CREATION_DATE, 0); // TODO: This.
> 
> times.json, held in the profile

You should expose and use (or move to GeckoProfile!) this:

BrowserHealthRecorder.getAndPersistProfileInitTime(context, profilePath)

because it handles legacy profiles correctly, amongst other things. Don't directly read times.json.
Moving to GeckoProfile works. We already write times.json in GeckoProfile, so adding a nice accessor would be good:
http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/GeckoProfile.java#859
Comment on attachment 8704779 [details]
MozReview Request: Bug 1205835 - Add TelemetryPingGenerator for core pings. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/29743/diff/1-2/
Attachment #8704779 - Attachment description: MozReview Request: Bug 1205835 - (WIP) Add TelemetryPingGenerator. r=na → MozReview Request: Bug 1205835 - (WIP) Add TelemetryPingGenerator for core pings. r=na
Comment on attachment 8705866 [details]
MozReview Request: Bug 1205835 - Create telemetry upload service and upload in onStart. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/30215/diff/1-2/
I implemented the tentative "core" ping format from the Android telemetry doc and left in fewer TODOs. As it stands, it does not appear that we need the ProfileInformationCache (from FHR), which will reduce the complexity here. However, there are still a few unanswered questions which may revive it.

While those questions (generally from the format) are being decided, I'll dig into getting the server url from about:config (...maybe this will need the PIC, but alternatively, we can hard-code it for now or set these values in java) and finishing up the uploader.
Comment on attachment 8704779 [details]
MozReview Request: Bug 1205835 - Add TelemetryPingGenerator for core pings. r=rnewman

To get additional mfinkle thoughts on the new format.
Attachment #8704779 - Flags: feedback?(mark.finkle)
https://reviewboard.mozilla.org/r/29743/#review27413

::: mobile/android/services/src/main/java/org/mozilla/gecko/background/telemetry/TelemetryPingGenerator.java:47
(Diff revision 2)
> +        // TODO: Get server url from about:config.

Looks like the UpdateService uses PrefsHelper to fetch the update URL from a Gecko pref. Telemetry uses "toolkit.telemetry.server"

::: mobile/android/services/src/main/java/org/mozilla/gecko/background/telemetry/TelemetryPingGenerator.java:71
(Diff revision 2)
> +        ping.put(CorePing.ARCHITECTURE, AppConstants.MOZ_APP_ABI); // TODO: Get from gecko? https://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryEnvironment.jsm?rev=d9a65ae92a91#996

Let's not require Gecko for this

::: mobile/android/services/src/main/java/org/mozilla/gecko/background/telemetry/TelemetryPingGenerator.java:74
(Diff revision 2)
> +        ping.put(CorePing.LOCALE, Locale.getDefault().toString()); // TODO: Need app locale? Need to revive PIC.

Why would we need PIC?
https://reviewboard.mozilla.org/r/30215/#review27415

::: mobile/android/base/java/org/mozilla/gecko/BrowserApp.java:966
(Diff revision 2)
> +        // TODO: Remove me.

Just an FYI, maybe for the futrue or maybe for now. SwitchBoard will fire a local broadcast, letting any code know when the configuration has been fetched from the network. At the point, we know we have the list of active experiments:

fired:
http://mxr.mozilla.org/mozilla-aurora/source/mobile/android/thirdparty/com/keepsafe/switchboard/SwitchBoard.java#239

example catching:
http://mxr.mozilla.org/mozilla-aurora/source/mobile/android/base/java/org/mozilla/gecko/health/BrowserHealthRecorder.java#538
http://mxr.mozilla.org/mozilla-aurora/source/mobile/android/base/java/org/mozilla/gecko/health/BrowserHealthRecorder.java#734
(In reply to Mark Finkle (:mfinkle) from comment #21)
> Looks like the UpdateService uses PrefsHelper to fetch the update URL from a
> Gecko pref. Telemetry uses "toolkit.telemetry.server"

Seems feasible, but I don't think PrefsHelper starts Gecko meaning we have to wait for it to be running from other means (which is probably when the browser runs). This is fine for the purposes of our retention pings in onResume (caveat being we can miss someone entering the app and leaving before Gecko starts), but once we move this to a proper scheduled background Service, we'll have to move towards a PIC-style solution.

A short-term alternative that will catch the case of the user entering and leaving the app before Gecko starts is to have a Java preference that is taken out in non-developer builds. However, I doubt we care about these cases very much (at least immediately) so I'm going to do the PrefsHelper solution unless anyone says otherwise.

Note to self: since PrefsHelper uses a callback, for us to keep the operation in a service, we'll need to create a new Intent to the service in the PrefsHelper callback (which is what the UpdateService does).

> > +        ping.put(CorePing.ARCHITECTURE, AppConstants.MOZ_APP_ABI); // TODO: Get from gecko? https://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryEnvironment.jsm?rev=d9a65ae92a91#996
> 
> Let's not require Gecko for this

Is AppConstants.MOZ_APP_ABI the value we want here then?

> > +        ping.put(CorePing.LOCALE, Locale.getDefault().toString()); // TODO: Need app locale? Need to revive PIC.
> 
> Why would we need PIC?

From the PIC comments:

53   // There are really four kinds of locale in play:
54   //
55   // * The OS
56   // * The Android environment of the app (setDefault)
57   // * The Gecko locale
58   // * The requested content locale (Accept-Language).
59   //
60   // We track only the first two, assuming that the Gecko locale will typically
61   // be the same as the app locale.
62   //
63   // The app locale is fetched from the PIC because it can be modified at
64   // runtime -- it won't necessarily be what Locale.getDefaultLocale() returns
65   // in a fresh non-browser profile.
66   //
67   // We also track the OS locale here for the same reason -- we need to store
68   // the default (OS) value before the locale-switching code takes effect!

I suppose I'm actually unclear on how to retrieve each locale – which one do we want to record? The Android environment of the app?
Flags: needinfo?(mark.finkle)
Comment on attachment 8704779 [details]
MozReview Request: Bug 1205835 - Add TelemetryPingGenerator for core pings. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/29743/diff/2-3/
Attachment #8704779 - Flags: feedback?(mark.finkle)
Comment on attachment 8705866 [details]
MozReview Request: Bug 1205835 - Create telemetry upload service and upload in onStart. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/30215/diff/2-3/
Attachment #8705866 - Attachment description: MozReview Request: Bug 1205835 - (WIP) Create non-working telemetry upload service (IOException). r=na → MozReview Request: Bug 1205835 - (WIP) Create telemetry upload service. r=na
Latest patch series implements the new ping document format with a near final UploadService, albeit with no tests (yet). I tested locally that the ping counter successfully updates.

Another TODO: make sure we're not uploading data for dev builds.
https://reviewboard.mozilla.org/r/29743/#review27975

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java:42
(Diff revision 3)
> +    private static String getTelemetryServerUrl(final String serverUrlSchemeHostPort, final String docType) {

Can we change this code to a long set of StringBuilder appends?

StringBuilder url(128);
url.append(serverUrlSchemeHostPort);
url.append("/");
...

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java:77
(Diff revision 3)
> +        ping.put(CorePing.DEVICE, Build.MANUFACTURER + "-" + Build.MODEL); // TODO: finalize format

I wonder if we should limit/truncate these Build fields. These could be long strings.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java:102
(Diff revision 3)
> +    private static String getClientId(final String profilePath) throws IOException {

Sounds like Richard would like to see thise type of method added to GeckoProfile, which will handle writting the file, if it doesn't exist.
https://reviewboard.mozilla.org/r/30215/#review27977

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:52
(Diff revision 3)
> +        if (!intent.hasExtra(TelemetryConstants.EXTRA_SERVER_URL)) {

It looks like BrowserApp will not send the serverURL, so we will wait for Gecko. I don't like that. I would rather not even ask Gecko for the serverURL and hard code it as a default:

sharedPrefs.getString(TelemetryConstants.PREF_SERVER_URL, DEFAULT_SERVER_URL); //in uploadCorePing

Where DEFAULT_SERVER_URL is in TelemetryConstants as "https://incoming.telemetry.mozilla.org"

We can add code somewhere in a followup that asks Gecko for the pref and saves it in the shared pref. That will allow us to change the pref if we need to.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:88
(Diff revision 3)
> +            corePing = TelemetryPingGenerator.createCorePing(profilePath, serverUrlSchemeHostPort, pingCount);

If we only need the profile path for the clientId, we could add the getter to GeckoProfile and just pass the clientId into the ping generator directly.
https://reviewboard.mozilla.org/r/29743/#review27975

> Can we change this code to a long set of StringBuilder appends?
> 
> StringBuilder url(128);
> url.append(serverUrlSchemeHostPort);
> url.append("/");
> ...

Fixed but it's not strictly necessary because the compiler optimizes the way I wrote it into SB calls (see http://stackoverflow.com/a/7586780). Read my comment added in code for why I decided to fix it anyway.

> I wonder if we should limit/truncate these Build fields. These could be long strings.

Seems reasonable. I limited it to 32 characters (16 seemed too small but maximum cap is somewhat arbitrary) and broke it up unequally between the manufacteurer & the device name – see the code for more details.

> Sounds like Richard would like to see thise type of method added to GeckoProfile, which will handle writting the file, if it doesn't exist.

I like this solution – done.
Comment on attachment 8704779 [details]
MozReview Request: Bug 1205835 - Add TelemetryPingGenerator for core pings. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/29743/diff/3-4/
Attachment #8704779 - Flags: feedback?(mark.finkle)
Comment on attachment 8705866 [details]
MozReview Request: Bug 1205835 - Create telemetry upload service and upload in onStart. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/30215/diff/3-4/
Attachment #8705866 - Flags: feedback?(mark.finkle)
(In reply to Michael Comella (:mcomella) from comment #23)

> > > +        ping.put(CorePing.ARCHITECTURE, AppConstants.MOZ_APP_ABI); // TODO: Get from gecko? https://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryEnvironment.jsm?rev=d9a65ae92a91#996
> > 
> > Let's not require Gecko for this
> 
> Is AppConstants.MOZ_APP_ABI the value we want here then?

Probably not the best replacement. Currently, Services.sysinfo.get("arch") = "arm"
AppConstants.MOZ_APP_ABI = "arm-eabi-gcc3"
AppConstants.ANDROID_CPU_ARCH = "armeabi-v7a" or "x86"

I am tempted to go with AppConstants.ANDROID_CPU_ARCH
Flags: needinfo?(mark.finkle)
https://reviewboard.mozilla.org/r/29743/#review28439

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryConstants.java:12
(Diff revision 4)
> +        public static final String NAME = "fennec-coredata";

This was updated to just "core"

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryConstants.java:17
(Diff revision 4)
> +        public static final String PING_COUNT = "ping";

"ping" is now "seq"

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryConstants.java:19
(Diff revision 4)
> +        public static final String OS_VERSION = "os";

We updated the docs. "os" will be the OS name, "Android" in our case, but "iOS" for that platform. We added "osversion" would be the SDK_INT

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java:44
(Diff revision 4)
> +        return new StringBuilder(128) // Initial capacity chosen because most telemetry urls won't fill it.

I would be remiss if I didn't update with "current thinking" here. Using literals and constants in String concatenation does the work at compile time. Your original code is probably fine and easier to read.

Feel free to revert, or at least use constants and literals in a way that joins them at compile time.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java:82
(Diff revision 4)
> +        ping.put(CorePing.ARCHITECTURE, AppConstants.MOZ_APP_ABI); // TODO: Value correct?

This is actually supposed to be "arm" or "x86" using the Gecko impl as a base. I don't think we need to strictly comply with that. AppConstants.ANDROID_CPU_ARCH gets us closer ("armeabi-v7a" or "x86")
https://reviewboard.mozilla.org/r/30215/#review28441

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:54
(Diff revision 4)
> +            requestServerUrlAndRedeliverIntent(intent);

If Gecko is crashing or hung, we'll never send a ping, right?
A request regarding documentation:
Can we land updates to the in-tree documentation along with the actual changes?

The Telemetry documentation lives in toolkit/components/telemetry/docs/, the documentation for this ping could live in core-ping.rst, similar to e.g. crash-ping.rst.

I can update that documentation with more details later, but it would be great to have the in-tree doc on the basic ping contents match that revisions code.
https://reviewboard.mozilla.org/r/29743/#review28465

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java:86
(Diff revision 4)
> +        ping.put(CorePing.OS_VERSION, Build.VERSION.SDK_INT);

Does this mean it will show up as a number in the JSON serialization?
We defined this as a string (for cross-platform reasons etc.).
https://reviewboard.mozilla.org/r/30215/#review28441

> If Gecko is crashing or hung, we'll never send a ping, right?

This is fixed in the new implementation that grabs the value from shared preferences.

That being said, we currently only send pings from `BrowserApp.onStart` so we won't send pings if the application crashes before then.
(In reply to Michael Comella (:mcomella) from comment #37)
> https://reviewboard.mozilla.org/r/30215/#review28441
> 
> > If Gecko is crashing or hung, we'll never send a ping, right?
> 
> This is fixed in the new implementation that grabs the value from shared
> preferences.
> 
> That being said, we currently only send pings from `BrowserApp.onStart` so
> we won't send pings if the application crashes before then.

True, but Gecko != Application.
(In reply to Georg Fritzsche [:gfritzsche] from comment #36)
> https://reviewboard.mozilla.org/r/29743/#review28465
> 
> :::
> mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.
> java:86
> (Diff revision 4)
> > +        ping.put(CorePing.OS_VERSION, Build.VERSION.SDK_INT);
> 
> Does this mean it will show up as a number in the JSON serialization?
> We defined this as a string (for cross-platform reasons etc.).

Technically, yes this is an integer. We could convert it to a String object first.
https://reviewboard.mozilla.org/r/29743/#review28439

> I would be remiss if I didn't update with "current thinking" here. Using literals and constants in String concatenation does the work at compile time. Your original code is probably fine and easier to read.
> 
> Feel free to revert, or at least use constants and literals in a way that joins them at compile time.

Reverted and added a comment.

> This is actually supposed to be "arm" or "x86" using the Gecko impl as a base. I don't think we need to strictly comply with that. AppConstants.ANDROID_CPU_ARCH gets us closer ("armeabi-v7a" or "x86")

Used that constant and removed TODO.
https://reviewboard.mozilla.org/r/29743/#review28465

> Does this mean it will show up as a number in the JSON serialization?
> We defined this as a string (for cross-platform reasons etc.).

Converted to a String.
Comment on attachment 8704779 [details]
MozReview Request: Bug 1205835 - Add TelemetryPingGenerator for core pings. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/29743/diff/4-5/
Comment on attachment 8705866 [details]
MozReview Request: Bug 1205835 - Create telemetry upload service and upload in onStart. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/30215/diff/4-5/
(In reply to Georg Fritzsche [:gfritzsche] from comment #35)
> A request regarding documentation:
> Can we land updates to the in-tree documentation along with the actual
> changes?

I'm going to land the changes first and file a follow-up – I just want to land something asap.

Filed bug 1241599.
Attachment #8704779 - Attachment description: MozReview Request: Bug 1205835 - (WIP) Add TelemetryPingGenerator for core pings. r=na → MozReview Request: Bug 1205835 - Add TelemetryPingGenerator for core pings. r=rnewman
Attachment #8704779 - Flags: review?(rnewman)
Comment on attachment 8704779 [details]
MozReview Request: Bug 1205835 - Add TelemetryPingGenerator for core pings. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/29743/diff/5-6/
Comment on attachment 8705866 [details]
MozReview Request: Bug 1205835 - Create telemetry upload service and upload in onStart. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/30215/diff/5-6/
Attachment #8705866 - Attachment description: MozReview Request: Bug 1205835 - (WIP) Create telemetry upload service. r=na → MozReview Request: Bug 1205835 - Create telemetry upload service and upload in onStart. r=rnewman
Attachment #8705866 - Flags: review?(rnewman)
Plan: we should just land this to confirm we can get data from Nightly (and get other people cracking on the analysis scripts). Additionally, by landing some code, we can have more people working on it. Next up:
 * Add in-tree docs for the new format
 * Write tests

Finkle, the only TODO left in this code is to figure out which locale to use so... which locale do we use? :)
Flags: needinfo?(mark.finkle)
Attachment #8704779 - Flags: review?(rnewman) → review+
Comment on attachment 8704779 [details]
MozReview Request: Bug 1205835 - Add TelemetryPingGenerator for core pings. r=rnewman

https://reviewboard.mozilla.org/r/29743/#review28593

::: mobile/android/base/java/org/mozilla/gecko/GeckoProfile.java:617
(Diff revision 6)
> +            clientIdFileContents = readFile(CLIENT_ID_FILE_PATH);

Sidenote: I wonder if we should replace `readFile` with my usual one-liner:

```
    private String getData(InputStream stream) {
        return new java.util.Scanner(stream).useDelimiter("\\A").next();
    }
```

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPing.java:22
(Diff revision 6)
> +    public String getUrl() { return url; }

`getURL`

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java:34
(Diff revision 6)
> +    private static String getTelemetryServerUrl(final String serverUrlSchemeHostPort, final String docType) {

s/Url/URL/g throughout

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java:79
(Diff revision 6)
> +        ping.put(CorePing.LOCALE, Locale.getDefault().toString()); // TODO: Which locale here?

At the very least, this should be

```
    Locales.getLanguageTag()
```

which will return 'es-ES' instead of 'es_ES', and will correctly translate outdated Java codes.
Comment on attachment 8705866 [details]
MozReview Request: Bug 1205835 - Create telemetry upload service and upload in onStart. r=rnewman

https://reviewboard.mozilla.org/r/30215/#review28597

::: mobile/android/base/java/org/mozilla/gecko/BrowserApp.java:974
(Diff revision 6)
> +        uploadTelemetry();

Why here and not in onResume (which comes next) or inside the background runnable? `uploadTelemetry` grabs the profile, so it makes a mockery of the comment on line 958…

::: mobile/android/base/java/org/mozilla/gecko/BrowserApp.java:3877
(Diff revision 6)
> +        Log.d("GeckoTelemetry", "Upload service started");

Log this inside the service…

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:27
(Diff revision 6)
> +    private static final String LOGTAG = "Gecko" + TelemetryUploadService.class.getSimpleName();

Log tag is too long. Limit is 23, this is 28.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:32
(Diff revision 6)
> +        // We may upload to the server more than once but we have no metrics

Yikes. I'm not convinced this is right: why not false?

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:39
(Diff revision 6)
> +        Log.d(LOGTAG, "Service started");

Oh hey! You already do!

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:42
(Diff revision 6)
> +            Log.d(LOGTAG, "Health report upload feature is compile-time disabled; not handling upload intent.");

s/Health report/Telemetry

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:77
(Diff revision 6)
> +            clientId = GeckoProfile.get(this).getClientId();

Include this in the intent. The sender of the intent has the right profile -- indeed, has just checked its state.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:84
(Diff revision 6)
> +        final SharedPreferences sharedPrefs = GeckoSharedPrefs.forProfile(this);

So don't you need to pass in the profile name, too?

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:86
(Diff revision 6)
> +        final String serverUrlSchemeHostPort =

s/Url/URL throughout

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:107
(Diff revision 6)
> +        Log.d(LOGTAG, "Ping upload initiated.");

`post` is secretly synchronous, so you might find this gets called after the post runs. Perhaps just kill this line.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:122
(Diff revision 6)
> +            return null;

We should specify something here.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:132
(Diff revision 6)
> +                    sharedPrefs.edit().putInt(TelemetryConstants.PREF_PING_COUNT, pingCount + 1).apply();

If this really were asynchronous, it would be possible for the service's worker thread to kick off two uploads with the same ping ID.

You should define `BaseResource.postBlocking` (just like `BaseResource.getBlocking`) and use that instead of `post` to ensure that this ping runs before another request gets scheduled on the worker thread.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:154
(Diff revision 6)
> +            Log.w(LOGTAG, "HttpTransportException when trying to upload telemetry");

s/HttpTransportException/Transport exception
Attachment #8705866 - Flags: review?(rnewman)
(In reply to Richard Newman [:rnewman] from comment #48)

> mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.
> java:79
> (Diff revision 6)
> > +        ping.put(CorePing.LOCALE, Locale.getDefault().toString()); // TODO: Which locale here?
> 
> At the very least, this should be
> 
> ```
>     Locales.getLanguageTag()
> ```
> 
> which will return 'es-ES' instead of 'es_ES', and will correctly translate
> outdated Java codes.

Agreed. and let's make sure the ping docs state clearly that this field is the OS locale, which might not be the current browser locale.
Flags: needinfo?(mark.finkle)
https://reviewboard.mozilla.org/r/30215/#review28597

> Why here and not in onResume (which comes next) or inside the background runnable? `uploadTelemetry` grabs the profile, so it makes a mockery of the comment on line 958…

`onResume`/`onPause` will be called when our own Activities/dialogs (e.g. tab queue enable prompt, share overlay) obscure the main Activity and I saw no reason to ping the servers every time we send the dialog so I chose to upload in `onStart`/`onStop`, which will only be called after the Activity is fully obscured (or starting for the first time). I'll add a comment to clarify.

Good call on the background runnable – I totally missed that...

> Log tag is too long. Limit is 23, this is 28.

When we got around to fixing the associated lint error, I was hoping to discuss this as a broader issue – I filed bug 1242091 to get this discussion rolling now. For now, I'll leave this as is.

> Yikes. I'm not convinced this is right: why not false?

In theory, the `Service` could get killed before we can increment the ping count but after we've already uploaded.

> `post` is secretly synchronous, so you might find this gets called after the post runs. Perhaps just kill this line.

That's funny – I just assumed the request happened so quickly, it finished before Android could log. :P

> We should specify something here.

I followed the format of the FxAccount UA:
 https://mxr.mozilla.org/mozilla-central/source/mobile/android/services/src/main/java/org/mozilla/gecko/fxa/FxAccountConstants.java#40
https://reviewboard.mozilla.org/r/30215/#review28597

> In theory, the `Service` could get killed before we can increment the ping count but after we've already uploaded.

(sent before I finished) The service will redeliver the Intent and we could send a ping count, but not a duplicate UUID.

This is fine for now but having duplicate sequence #'s is not ideal so perhaps we should send the seq with the Intent.

I'll think this over for a bit and try to come up with a simple solution that works for now.
(In reply to Mark Finkle (:mfinkle) from comment #50)

> Agreed. and let's make sure the ping docs state clearly that this field is
> the OS locale, which might not be the current browser locale.

This will be the browser locale, which might not be the OS locale. And if we ever generate a telemetry ping prior to starting one of our activities or other services, it might be the OS locale.
Status: NEW → ASSIGNED
Component: Telemetry → General
OS: Unspecified → Android
Product: Toolkit → Firefox for Android
Hardware: Unspecified → All
Version: unspecified → Trunk
(In reply to Richard Newman [:rnewman] from comment #53)
> (In reply to Mark Finkle (:mfinkle) from comment #50)
> 
> > Agreed. and let's make sure the ping docs state clearly that this field is
> > the OS locale, which might not be the current browser locale.
> 
> This will be the browser locale, which might not be the OS locale. And if we
> ever generate a telemetry ping prior to starting one of our activities or
> other services, it might be the OS locale.

Really? Wouldn't this always get the OS locale? Locales.getLanguageTag(Locale.getDefault())

Side note: I see that Locale.toLanguageTag() is in SDK 21. Might be worth updating Locales.getLanguageTag(Locale)
(In reply to Mark Finkle (:mfinkle) from comment #54)

> Really? Wouldn't this always get the OS locale?
> Locales.getLanguageTag(Locale.getDefault())

The Java `Locale` 'default' is mutable. The way we do locale switching is to mutate it very early in the app lifecycle.

http://docs.oracle.com/javase/7/docs/api/java/util/Locale.html#setDefault%28java.util.Locale.Category,%20java.util.Locale%29

That's why we had some careful code to fetch the OS locale and the app locale separately for FHR -- we have to fetch it before we do anything else.
Frustration growing. Ability to care decreasing.

If BrowserLocaleManager can return the system locale, let's use it. If not, 0fg, and use Locale.getDefault().

It shouldn't be this hard.
The only reason to care about and additionally include the OS locale is if we want to know if it differs from the browser locale -- i.e., if users are switching locales within Firefox, and if that ability affects retention.

If we don't want to know that, then this ping format and the code Michael wrote are fine.
https://reviewboard.mozilla.org/r/30215/#review28597

> If this really were asynchronous, it would be possible for the service's worker thread to kick off two uploads with the same ping ID.
> 
> You should define `BaseResource.postBlocking` (just like `BaseResource.getBlocking`) and use that instead of `post` to ensure that this ping runs before another request gets scheduled on the worker thread.

> If this really were asynchronous, it would be possible for the service's worker thread to kick off two uploads with the same ping ID.

Why is that?

In any case, I realized it was necessary to block so that `onStartCommand` doesn't return and stop the Service before our upload request returns.
https://reviewboard.mozilla.org/r/30215/#review28597

> (sent before I finished) The service will redeliver the Intent and we could send a ping count, but not a duplicate UUID.
> 
> This is fine for now but having duplicate sequence #'s is not ideal so perhaps we should send the seq with the Intent.
> 
> I'll think this over for a bit and try to come up with a simple solution that works for now.

I've made the realization that since services can be killed at any time, we can be killed after we upload but before we can store that we uploaded. In that case, we'd unknowingly upload a duplicate document if we wanted to be sure we'd uploaded the document at all.

For the current system, my solution is to allow intent redelivery but pass in a sequence and docId to the Intent to ensure that these (the two metrics that allow de-duplication) are always in sync for a duplicate upload. The one caveat to this is if the Intent is not redelivered properly (e.g. do Intents get redelivered after shutdown & power on?) we may never reupload the document.

However, I don't think it is worth solving this because the redelivery scheme will have to change when we start resetting counts after we upload.

iirc, seq & docId are redundant so it'd be simpler if we just removed docId, but I'd rather land this than go back and forth on it anymore.
(In reply to Michael Comella (:mcomella) from comment #58)

> > If this really were asynchronous, it would be possible for the service's worker thread to kick off two uploads with the same ping ID.
> 
> Why is that?

  1st onHandleIntent > fetches ping ID 1 > begins upload > yields worker thread
  2nd onHandleIntent > fetches ping ID 1 > begins upload
  first upload succeeds, writes ID 2 to prefs
  second upload succeeds…

But how would we get two intents? 

Imagine the first upload takes 50 seconds. It's possible for you to do quite a bit of stuff, including ending and re-starting the activity (Don't Keep Activities), in 50 seconds. The running upload will stick around until the process dies or it completes, giving you plenty of time to start another.

And if you have a bug that causes two intents to be issued (e.g., running the same code in onStart in BrowserApp and GeckoApp), they'll be run sequentially, which in this case would routinely cause parallel uploads.
Comment on attachment 8704779 [details]
MozReview Request: Bug 1205835 - Add TelemetryPingGenerator for core pings. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/29743/diff/6-7/
Comment on attachment 8705866 [details]
MozReview Request: Bug 1205835 - Create telemetry upload service and upload in onStart. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/30215/diff/6-7/
Attachment #8705866 - Flags: review?(rnewman)
I changed my code to increment the seq no. each time upload is requested so the scenario in comment 60 shouldn't occur in the latest iteration.

Pending review, this should be good to land.
I see in the telemetry docs that "appName" (in the path) should be "Fennec" but we use MOZ_APP_BASENAME, which appears to be defined as Fennec all of the time [1] (but presumably is subject to change).

Finkle, should we leave it as MOZ_APP_BASENAME or hardcode it as "Fennec" to be sure it does not change?

[1]: https://mxr.mozilla.org/mozilla-central/source/mobile/android/confvars.sh#5
Flags: needinfo?(mark.finkle)
(In reply to Michael Comella (:mcomella) from comment #64)

> Finkle, should we leave it as MOZ_APP_BASENAME or hardcode it as "Fennec" to
> be sure it does not change?
> 
> [1]:
> https://mxr.mozilla.org/mozilla-central/source/mobile/android/confvars.sh#5

I'd leave it. We have been using MOZ_APP_BASENAME in Gecko telemetry forever. If MOZ_APP_BASENAME changes, it affects everything.
Flags: needinfo?(mark.finkle)
Attachment #8705866 - Flags: review?(rnewman) → review+
Comment on attachment 8705866 [details]
MozReview Request: Bug 1205835 - Create telemetry upload service and upload in onStart. r=rnewman

https://reviewboard.mozilla.org/r/30215/#review29501

LGTM with these changes.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:28
(Diff revision 7)
> +    private static final String LOGTAG = "Gecko" + TelemetryUploadService.class.getSimpleName();

Shorten me, plz.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:42
(Diff revision 7)
> +        setIntentRedelivery(true);

Let's not do this.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:139
(Diff revision 7)
> +        final String serverUrlSchemeHostPort =

:%s/Url/URL throughout.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryUploadService.java:153
(Diff revision 7)
> +            Log.w(LOGTAG, "URISyntaxException for server url when creating BaseResource: returning.");

URL
The one short-coming of this patch is that we won't upload or reschedule our upload when there's no network connection. This will be fixed with 1243585.
Summary: Support sending core metrics in telemetry pings from Firefox for Android → (telemetry v1) Support sending core metrics in telemetry pings from Firefox for Android
https://hg.mozilla.org/mozilla-central/rev/4df23c2aa5c2
https://hg.mozilla.org/mozilla-central/rev/7fbfcaed9a16
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 47
NI self to request uplift once we verify this works.
Flags: needinfo?(michael.l.comella)
We want this for beta.
tracking-fennec: 46+ → 45+
Comment on attachment 8704779 [details]
MozReview Request: Bug 1205835 - Add TelemetryPingGenerator for core pings. r=rnewman

This patch must be uplifted with bug 1244293.

It'd be great to additionally uplift bug 1241697, which will give us additional data for our core pings. To make this uplift (and potential later uplifts) easier, it'd be great to uplift bug 1241599 first, which landed some documentation.

Some bugs which are not strictly necessary but are file size optimizations worth uplifting are bug 1244859 and bug 1244861 (which is currently unfinished).

Approval Request Comment
[Feature/regressing bug #]: FHR servers were shut off and we're not collecting certain types of data.
[User impact if declined]: We currently don't have data for and blocking the uplift would prevent us from collecting this data in those versions.

[Describe test coverage new/current, TreeHerder]: Tested locally & nightly, verified uploads via telemetry servers for both local & nightly builds.
[Risks and why]: Medium – in theory we can crash (e.g. NullPointerException). In theory, since we run every time the app is shown, we could crash every time the app is shown. However, since all of this code runs in a background service, we could easily disable it with a change to one constant: TelemetryConstants.UPLOAD_ENABLED. With bug 1244293, users could disable this upload themselves although I'm not sure if there's a way for them to get to the Settings without opening the browser).

[String/UUID change made/needed]: None

Once I add more to telemetry, I ideally would uplift bug 1244295 (which prevents us from uploading asap in new profiles) and bug 1246209 (which would give us even more data).
Flags: needinfo?(michael.l.comella)
Attachment #8704779 - Flags: approval-mozilla-beta?
Attachment #8704779 - Flags: approval-mozilla-aurora?
Attachment #8705866 - Flags: approval-mozilla-beta?
Attachment #8705866 - Flags: approval-mozilla-aurora?
bug 1246314 should get uplifted with this too.
This (and probably a bunch of the related uplift requests) needs to be rebased to work around the lack of bug 1107811 on the release branches.
Flags: needinfo?(michael.l.comella)
Flags: needinfo?(michael.l.comella)
(In reply to Wes Kocher (:KWierso) from comment #75)
> This (and probably a bunch of the related uplift requests) needs to be
> rebased to work around the lack of bug 1107811 on the release branches.

This was resolved. bug 1107811 was not the issue, but rather that bug 1244859 landed first when it should have landed on top.
Comment on attachment 8704779 [details]
MozReview Request: Bug 1205835 - Add TelemetryPingGenerator for core pings. r=rnewman

michael had my approval. Updating the uplift flags for posterity.
Should be in 45 beta 4
Attachment #8704779 - Flags: approval-mozilla-beta?
Attachment #8704779 - Flags: approval-mozilla-beta+
Attachment #8704779 - Flags: approval-mozilla-aurora?
Attachment #8704779 - Flags: approval-mozilla-aurora+
Attachment #8705866 - Flags: approval-mozilla-beta?
Attachment #8705866 - Flags: approval-mozilla-beta+
Attachment #8705866 - Flags: approval-mozilla-aurora?
Attachment #8705866 - Flags: approval-mozilla-aurora+
Depends on: 1326212
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.