Closed Bug 1363924 Opened 7 years ago Closed 6 years ago

Event telemetry as part of sync ping

Categories

(Firefox for Android Graveyard :: Android Sync, enhancement, P1)

enhancement

Tracking

(firefox60 fixed)

RESOLVED FIXED
Firefox 60
Tracking Status
firefox60 --- fixed

People

(Reporter: Grisha, Assigned: eoger)

References

Details

Attachments

(3 files)

Bug 1308337 introduces the sync ping, and sets up stage for event telemetry. This bug is to track implementation of event telemetry collection & bundling it into payloads.

It seems that as a first stab at this, we want to:
- emit events out of the client sync stage when encountering or sending client commands
- emit events when receiving or sending tabs
Blocks: 1180321
Priority: -- → P2
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
Product: Android Background Services → Firefox for Android
Things to do:
- How do we get hashedFxAUID? It's returned when we get a TokenServerToken, but not stored anywhere, thoughts Grisha?
- Write/modify existing tests
Assignee: nobody → eoger
Status: RESOLVED → REOPENED
Priority: P2 → P1
Resolution: DUPLICATE → ---
Also I should prob split the last commit in two: add the events telemetry, then record events for commands.
Comment on attachment 8948019 [details]
Bug 1363924 p3 - Record commands telemetry events.

https://reviewboard.mozilla.org/r/217664/#review225784

R+ based on what it seems like the shape of this data is.

You have a bunch of TODOs about getting the deviceID and UID, you should make sure that that these values are the latest values as of the most recent sync, and not as of the following sync. I guess this doesn't matter tons, though, since changes are rare. The hashed uid is available https://searchfox.org/mozilla-central/source/mobile/android/services/src/main/java/org/mozilla/gecko/tokenserver/TokenServerToken.java#11, but I'm not sure if you can get that easily, it might require saving it across syncs if android fetches it every time (IIRC it might? I could be wrong).
Attachment #8948019 - Flags: review?(tchiovoloni) → review+
Comment on attachment 8948018 [details]
Bug 1363924 p1 - Add flowID to Client commands.

https://reviewboard.mozilla.org/r/217662/#review227250

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:61
(Diff revision 1)
>      public final String commandType;
>      public final JSONArray args;
>      private List<String> argsList;
> +    @Nullable public String flowID;
>  
>      public Command(String commandType, JSONArray args) {

Why do you need the old constructor?

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:65
(Diff revision 1)
>  
>      public Command(String commandType, JSONArray args) {
> +      this(commandType, args, null);
> +    }
> +
> +    public Command(String commandType, JSONArray args, String flowID) {

mark flowId as @nullable

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:164
(Diff revision 1)
>        if (unparsedArgs == null) {
>          return null;
>        }
> +      final String flowID = unparsedCommand.getString("flowID");
>  
> -      return new Command(type, unparsedArgs);
> +      return new Command(type, unparsedArgs, flowID);

This is a bit misleading - you're using the "i have a flowId" constructor, while in this context flowId might actually be null and there's the "i don't have a flowId" constructor available.

Just remove the two-argument constructor, perhaps?

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/android/ClientsDatabase.java:22
(Diff revision 1)
>  
>    public static final String LOG_TAG = "ClientsDatabase";
>  
>    // Database Specifications.
>    protected static final String DB_NAME = "clients_database";
> -  protected static final int SCHEMA_VERSION = 4;
> +  protected static final int SCHEMA_VERSION = 5;

While you're at this, it would be _great_ if you converted TestClientsDatabase and friends to be regular unit tests... a la BrowserProvider tests. Shouldn't be too much work, IIRC.

Otherwise, I don't think those older on-device tests run anymore in automation, and we won't have any test coverage for these changes.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/android/ClientsDatabaseAccessor.java:200
(Diff revision 1)
>    }
>  
>    protected static Command commandFromCursor(Cursor cur) {
> -    String commandType = RepoUtils.getStringFromCursor(cur, ClientsDatabase.COL_COMMAND);
> -    JSONArray commandArgs = RepoUtils.getJSONArrayFromCursor(cur, ClientsDatabase.COL_ARGS);
> -    return new Command(commandType, commandArgs);
> +    final String commandType = RepoUtils.getStringFromCursor(cur, ClientsDatabase.COL_COMMAND);
> +    final JSONArray commandArgs = RepoUtils.getJSONArrayFromCursor(cur, ClientsDatabase.COL_ARGS);
> +    final String flowID = RepoUtils.getStringFromCursor(cur, ClientsDatabase.COL_FLOW_ID);

Method's comment says: "The behavior of this method when the value in the database is <code>NULL</code> is determined by the implementation of the {@link Cursor}".

So, what's the implementation behaviour? Any chance it will throw? IIUC, this may be null in the db.
Attachment #8948018 - Flags: review?(gkruglov)
Comment on attachment 8948018 [details]
Bug 1363924 p1 - Add flowID to Client commands.

https://reviewboard.mozilla.org/r/217662/#review227254

This pretty much looks fine, but I'd like some clarification on one or two things. Bonus points for moving the tests over ;)
Comment on attachment 8948019 [details]
Bug 1363924 p3 - Record commands telemetry events.

https://reviewboard.mozilla.org/r/217664/#review227258

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryBackgroundReceiver.java
(Diff revision 1)
>                          final boolean didRestart = telemetryBundle.getBoolean(TelemetryContract.KEY_RESTARTED);
>  
>                          telemetryStore = syncTelemetryStore;
>                          TelemetrySyncPingBuilder localPingBuilder = new TelemetrySyncPingBuilder();
>  
> -                        if (uid != null) {

This would make a nice Pre commit with an explanation and test updates.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/pingbuilders/TelemetrySyncEventPingBuilder.java:18
(Diff revision 1)
> - */
>  public class TelemetrySyncEventPingBuilder extends TelemetryLocalPingBuilder {
> +
>      public TelemetrySyncEventPingBuilder fromEventTelemetry(Bundle data) {
> +        final JSONArray event = new JSONArray();
> +        event.add(data.getLong(TelemetryContract.KEY_EVENT_TIMESTAMP));

Which of these do you expect in a well-formed event of each type? All of them, or are some optional? I would assert that we're getting from the event bundle exactly what we should be getting, and throw otherwise.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/pingbuilders/TelemetrySyncEventPingBuilder.java:18
(Diff revision 1)
> - */
>  public class TelemetrySyncEventPingBuilder extends TelemetryLocalPingBuilder {
> +
>      public TelemetrySyncEventPingBuilder fromEventTelemetry(Bundle data) {
> +        final JSONArray event = new JSONArray();
> +        event.add(data.getLong(TelemetryContract.KEY_EVENT_TIMESTAMP));

Please add a comment explaining that individual events are serialized as arrays over the wire, and that order of these statements matters a whole lot.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/pingbuilders/TelemetrySyncEventPingBuilder.java:26
(Diff revision 1)
> +        event.add(data.getString(TelemetryContract.KEY_EVENT_OBJECT));
> +        event.add(data.getString(TelemetryContract.KEY_EVENT_VALUE));
> +        final Bundle extra = data.getBundle(TelemetryContract.KEY_EVENT_EXTRA);
> +        if (extra != null) {
> +            final ExtendedJSONObject extraJSON = new ExtendedJSONObject();
> +            for (final String k : extra.keySet()) {

This looks brittle to me.

The 'extras' is a well-defined dict from the server's point of view, for both event "methods" types we currently have. Code which bundles the payload is supposed to ensure it'll create a well-formed payload. However, this approach splits that responsibility between the builder and whoever is constructing the event bundle - the collection code. The latter shouldn't care or be aware of what the server expects keys to be named. However, you've created an implicit coupling here, with "action at a distance" sorts of bugs waiting to happen.

I think you would have noticed this mistake once you'd start unit-testing this code - you'd find that you're essentially testing your tests.

Instead, I would perhaps switch on the `method` (if extras might differ), and be very explicit about what "schema" different event types will have.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/pingbuilders/TelemetrySyncEventPingBuilder.java:33
(Diff revision 1)
> +            }
> +            event.add(extraJSON);
> +        } else {
> +            event.add(null);
> +        }
> +        // The build() API is a bit awkward and we have to return a JSONObject :/.

nit: this comment isn't clarifying anything (and is incorrect); either change the API to suite your case, make this a TODO with a follow-up, change the comment to be clearer, or just drop it.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:71
(Diff revision 1)
>  
>      public Command(String commandType, JSONArray args) {
>        this(commandType, args, null);
>      }
>  
> -    public Command(String commandType, JSONArray args, String flowID) {
> +    public Command(String commandType, JSONArray args, @Nullable String flowID) {

Ah, there's that nullable!

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:71
(Diff revision 1)
>  
>      public Command(String commandType, JSONArray args) {
>        this(commandType, args, null);
>      }
>  
> -    public Command(String commandType, JSONArray args, String flowID) {
> +    public Command(String commandType, JSONArray args, @Nullable String flowID) {

Ah, there's that nullable!

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:149
(Diff revision 1)
>        return;
>      }
>  
> +    HashMap<String, String> extra = new HashMap<>();
> +    if (command.flowID != null) {
> +      extra.put("flowID", command.flowID);

You're missing deviceID here, right?

Now, this should be our local fxaDeviceID, right? Except it should be hashed?

It will be part of the Token we get from the token server. We fetch it every time we sync. You might want to try exposing it via the GlobalSession - I think that'll be the easiest first pass, even if a bit hacky.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:254
(Diff revision 1)
>    protected void sendCommandToClient(String clientID, Command command, Context context) {
>      Logger.info(LOG_TAG, "Sending " + command.commandType + " to " + clientID);
>  
> +    HashMap<String, String> extra = new HashMap<>();
> +    try {
> +       // TODO: same problem, not sure where to get that.

You're missing `flowID` here, I think.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:255
(Diff revision 1)
>      Logger.info(LOG_TAG, "Sending " + command.commandType + " to " + clientID);
>  
> +    HashMap<String, String> extra = new HashMap<>();
> +    try {
> +       // TODO: same problem, not sure where to get that.
> +       extra.put("deviceID", Utils.byte2Hex(Utils.sha256(clientID.concat(hashedFxAUID).getBytes("UTF-8")));

The docs say that 'deviceID' is "A GUID which identifies the device the command is being sent to."

Don't we store fxaDeviceID in the clients table? Is that not the value you'd want here?

If the docs are correct, I'm not sure why you'd want the hashedFxAUID of the local device for an outgoing command?

But, if you do want to record a hashed deviceID of the "other" device here, ISTM that you'd also need to know how the token server performs the hashing, right? IIRC it returns a hash function...

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:1
(Diff revision 1)
> +package org.mozilla.gecko.sync.telemetry;

license block

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:24
(Diff revision 1)
> +import java.security.GeneralSecurityException;
> +import java.security.NoSuchAlgorithmException;
> +import java.util.HashMap;
> +import java.util.Map;
> +
> +public class TelemetryEventCollector {

On one hand, you're writing this collector to be generic. You're not enforcing the sync event schema, etc, and support the unused 'value' field, check for generic length/size limits, etc.

On the other hand, you hard-code a concept of deviceIDs here, which is inherently tied to the send tab commands!

This seems wrong to me. This is either a general purpose event collector, or it's a one-off collector tailored to two types of events, but it shouldn't be both at once.

If you want to go the generic route, I'd add a sub-class which knows about send-tab events, and knows about deviceIDs, and can perform tighter validation. At the same time, keep this class generic and only aware of whatever is in the general Events specification.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:35
(Diff revision 1)
> +    public static void recordEvent(final Context context, final String object, final String method,
> +                                   @Nullable final String value,
> +                                   @Nullable final HashMap<String, String> extra) {
> +        if (!validateTelemetryEvent(object, method, value, extra)) {
> +            // we've already logged what the problem is...
> +            return;

I would actually just throw here! Is there any circumstances in which we want to silently swallow these errors? This seems like a programming mistake to me.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:38
(Diff revision 1)
> +        if (!validateTelemetryEvent(object, method, value, extra)) {
> +            // we've already logged what the problem is...
> +            return;
> +        }
> +
> +        Log.d(LOG_TAG, "Recording event {" + object + ", " + method + ", " + value + ", " + extra);

Any chance you're leaking any PII here? If so, there's a special log method for those.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:54
(Diff revision 1)
> +            for (Map.Entry<String, String> e : extra.entrySet()) {
> +                extraBundle.putString(e.getKey(), e.getValue());
> +            }
> +            event.putBundle(TelemetryContract.KEY_EVENT_EXTRA, extraBundle);
> +        }
> +        if (!setIDs(context, event)) {

This is a smell - see class comment.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:69
(Diff revision 1)
> +    }
> +
> +    private static boolean validateTelemetryEvent(final String object, final String method,
> +                                           final String value, final HashMap<String, String> extra) {
> +        // length checks.
> +        if (method.length() > 20 || object.length() > 20 || (value != null && value.length() > 80)) {

Please add a comment explaning that Firefox Telemetry pipeline imposes the following length/size limits, with a link to docs if possible.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:70
(Diff revision 1)
> +
> +    private static boolean validateTelemetryEvent(final String object, final String method,
> +                                           final String value, final HashMap<String, String> extra) {
> +        // length checks.
> +        if (method.length() > 20 || object.length() > 20 || (value != null && value.length() > 80)) {
> +            Log.w(LOG_TAG, "Invalid event parameters - wrong lengths: " + method + " " +

I would actually just throw here! Is there any circumstances in which we want to silently swallow these errors? This seems like a programming mistake to me.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:81
(Diff revision 1)
> +            if (extra.size() > 10) {
> +                Log.w(LOG_TAG, "Invalid event parameters - too many extra keys: " + extra);
> +                return false;
> +            }
> +            for (Map.Entry<String, String> e : extra.entrySet()) {
> +                if (e.getKey().length() > 15 || e.getValue().length() > 85) {

Docs say that the value should be at most 80 bytes. Your check here looks suspect. Why 85? How do we encode the strings over the wire? Would each character become 1 byte? Don't we encode these as UTF-8 on the wire? Are we guaranteed to have values in the ASCII range (they'll be encoded as a single byte)?

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:108
(Diff revision 1)
> +            return false;
> +        }
> +
> +        final ClientsDataDelegate clientsDataDelegate = new SharedPreferencesClientsDataDelegate(sharedPrefs, context);
> +        try {
> +            final String hashedDeviceID = Utils.byte2Hex(Utils.sha256(

Either I'm misunderstanding the docs, or the event docs are incorrect.

This is supposed to be the hashed deviceId of the target device, right? So it might be a local device, or it might be the device we're sending the command to, right?

This split implies that your telemetry collector shouldn't be aware of such details - hashed ID setting should be the responsiblity of whoever will call the 'recordEvent' method, as they'll have the necessary context.

Otherwise, you're starting to smush your event logic all over the place.
Attachment #8948019 - Flags: review?(gkruglov) → review-
Comment on attachment 8948019 [details]
Bug 1363924 p3 - Record commands telemetry events.

https://reviewboard.mozilla.org/r/217664/#review227258

> You're missing deviceID here, right?
> 
> Now, this should be our local fxaDeviceID, right? Except it should be hashed?
> 
> It will be part of the Token we get from the token server. We fetch it every time we sync. You might want to try exposing it via the GlobalSession - I think that'll be the easiest first pass, even if a bit hacky.

In the desktop implementation, we do not record the deviceID when receiving a command:
https://searchfox.org/mozilla-central/rev/0c0ddaa7e859a2b76a56a0e2e9c0de88af166812/services/sync/modules/engines/clients.js#720-721

> On one hand, you're writing this collector to be generic. You're not enforcing the sync event schema, etc, and support the unused 'value' field, check for generic length/size limits, etc.
> 
> On the other hand, you hard-code a concept of deviceIDs here, which is inherently tied to the send tab commands!
> 
> This seems wrong to me. This is either a general purpose event collector, or it's a one-off collector tailored to two types of events, but it shouldn't be both at once.
> 
> If you want to go the generic route, I'd add a sub-class which knows about send-tab events, and knows about deviceIDs, and can perform tighter validation. At the same time, keep this class generic and only aware of whatever is in the general Events specification.

deviceID and UID are not unique to send tab: see the syncs TelemetryCollector [0] where we also set these fields.
We then later retrieve them in the bundler [1].

[0] https://searchfox.org/mozilla-central/rev/0c0ddaa7e859a2b76a56a0e2e9c0de88af166812/mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryCollector.java#75-86

[1] https://searchfox.org/mozilla-central/rev/0c0ddaa7e859a2b76a56a0e2e9c0de88af166812/mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryBackgroundReceiver.java#137-138

> Either I'm misunderstanding the docs, or the event docs are incorrect.
> 
> This is supposed to be the hashed deviceId of the target device, right? So it might be a local device, or it might be the device we're sending the command to, right?
> 
> This split implies that your telemetry collector shouldn't be aware of such details - hashed ID setting should be the responsiblity of whoever will call the 'recordEvent' method, as they'll have the necessary context.
> 
> Otherwise, you're starting to smush your event logic all over the place.

There's 2 concepts at hand here:
- The Sync Ping includes the hashed UID and deviceID of the ping sending device.
- The "send tab" event includes the hashed device ID of the receiver.
Comment on attachment 8948019 [details]
Bug 1363924 p3 - Record commands telemetry events.

https://reviewboard.mozilla.org/r/217664/#review227258

> This looks brittle to me.
> 
> The 'extras' is a well-defined dict from the server's point of view, for both event "methods" types we currently have. Code which bundles the payload is supposed to ensure it'll create a well-formed payload. However, this approach splits that responsibility between the builder and whoever is constructing the event bundle - the collection code. The latter shouldn't care or be aware of what the server expects keys to be named. However, you've created an implicit coupling here, with "action at a distance" sorts of bugs waiting to happen.
> 
> I think you would have noticed this mistake once you'd start unit-testing this code - you'd find that you're essentially testing your tests.
> 
> Instead, I would perhaps switch on the `method` (if extras might differ), and be very explicit about what "schema" different event types will have.

I don't agree, if our objective is to keep TelemetryEventCollector generic, we shouldn't enforce events |extra| prop schema validation and let the callers do the right thing.
However, I don't care that much about this.
(In reply to Edouard Oger [:eoger] from comment #12)
> I don't agree, if our objective is to keep TelemetryEventCollector generic,
> we shouldn't enforce events |extra| prop schema validation and let the
> callers do the right thing.
> However, I don't care that much about this.

I meant to update this comment after reading the rest of your patch, but I didn't. Yes, if the goal is to keep things generic - let's do just that. You could impose a layer of domain specific validation on top of the generic flow, similar to my suggestion for you collector. Given the small & annoying problems we've had almost every time we wrote/touched telemetry code on different platforms, I think the tiny bit of extra effort is worth it.
(In reply to Edouard Oger [:eoger] from comment #11)
> In the desktop implementation, we do not record the deviceID when receiving
> a command:
> https://searchfox.org/mozilla-central/rev/
> 0c0ddaa7e859a2b76a56a0e2e9c0de88af166812/services/sync/modules/engines/
> clients.js#720-721

So either the desktop implementation is incorrect, or the telemetry docs are. See https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/telemetry/data/sync-ping.html#events-in-the-sync-ping

I don't care either way, as long as things line up and we have the information we might need to answer all of the questions we might have about send tab flow. It seems like a useful thing to record, and should be possible to record. If the desktop telemetry's format is enough, however, then great, omit it. I'd ping Leif to double check. If you go that route, please update sync ping documentation.

> There's 2 concepts at hand here:
> - The Sync Ping includes the hashed UID and deviceID of the ping sending
> device.
> - The "send tab" event includes the hashed device ID of the receiver.

Right, yes, of course. So while both things are aware of the UIDs and deviceIDs, they're playing different roles and are used in different contexts. So the coincidence is almost accidental, in a sense. It's worth making that distinction very explicit!
Thanks for the first round of review Grisha.
- Comments addressed.
- We now store the hashed FxA UID along with the FxA account data.
- Added tests.
Comment on attachment 8948018 [details]
Bug 1363924 p1 - Add flowID to Client commands.

https://reviewboard.mozilla.org/r/217662/#review227974

Thank you for porting the tests!

::: mobile/android/services/src/androidTest/java/org/mozilla/gecko/background/testhelpers/CommandHelpers.java:9
(Diff revision 2)
>  package org.mozilla.gecko.background.testhelpers;
>  
>  import org.json.simple.JSONArray;
>  import org.mozilla.gecko.sync.CommandProcessor.Command;
>  
>  public class CommandHelpers {

Do you still need this?

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/android/ClientsDatabase.java:242
(Diff revision 2)
>      SQLiteDatabase db = this.getCachedReadableDatabase();
>  
>      return queryHelper.safeQuery(db, ".fetchClientsCursor", TBL_CLIENTS, TBL_CLIENTS_COLUMNS, TBL_CLIENTS_KEY, args);
>    }
>  
> +  // This method does not check flowID on purpose because we do not want to take it in account

nit: "into account"
Attachment #8948018 - Flags: review?(gkruglov) → review+
Comment on attachment 8952626 [details]
Bug 1363924 p2 - Move deviceID and uid to payload level in sync ping.

https://reviewboard.mozilla.org/r/221866/#review227978
Attachment #8952626 - Flags: review?(gkruglov) → review+
Comment on attachment 8948019 [details]
Bug 1363924 p3 - Record commands telemetry events.

https://reviewboard.mozilla.org/r/217664/#review228016

Getting close! I'd still like clarification on what you're doing with deviceID and "sendcommand". Also, make sure this builds both with gradle and without (locally with "ac_add_options --without-gradle" and/or on try with Bg and Bng).

::: mobile/android/app/src/test/java/org/mozilla/gecko/telemetry/pingbuilders/TelemetrySyncPingBundleBuilderTest.java:199
(Diff revision 2)
> +        event.putString(TelemetryContract.KEY_EVENT_CATEGORY, "sync");
> +        event.putString(TelemetryContract.KEY_EVENT_OBJECT, "object");
> +        event.putString(TelemetryContract.KEY_EVENT_METHOD, "method");
> +        event.putString(TelemetryContract.KEY_EVENT_VALUE, "value");
> +        Bundle extra = new Bundle();
> +        extra.putString("extra", "extra");

super nit: maybe call the value something other than what the key is called? Imagine a bizzare bug where key is being used as value, and your tests won't catch it.

::: mobile/android/services/src/main/java/org/mozilla/gecko/fxa/authenticator/AndroidFxAccount.java:111
(Diff revision 2)
>    private static final String ACCOUNT_KEY_DEVICE_REGISTRATION_VERSION = "deviceRegistrationVersion";
>    private static final String ACCOUNT_KEY_DEVICE_REGISTRATION_TIMESTAMP = "deviceRegistrationTimestamp";
>    private static final String ACCOUNT_KEY_DEVICE_PUSH_REGISTRATION_ERROR = "devicePushRegistrationError";
>    private static final String ACCOUNT_KEY_DEVICE_PUSH_REGISTRATION_ERROR_TIME = "devicePushRegistrationErrorTime";
>  
> +  // Cache of the last hashed FxA UID we retrieved.

It would be nice to have a brief description of why this is persisted.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:152
(Diff revision 2)
> +      Log.e(LOG_TAG, "Could not record telemetry event.");
> +    }
>      executableCommand.executeCommand(session, command.getArgsList());
>    }
>  
> +  private void recordProcessCommandTelemetryEvent(Context context, Command command) {

This is a pure function, mark it as static.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:155
(Diff revision 2)
>    }
>  
> +  private void recordProcessCommandTelemetryEvent(Context context, Command command) {
> +    HashMap<String, String> extra = new HashMap<>();
> +    if (command.flowID != null) {
> +      extra.put("flowID", command.flowID);

Just to double check for my benefit: you said that desktop doesn't record local deviceID for "processcommand". Although you could do it here easily, you're explicitly choosing not to, correct? If so, just make sure you'll modify the docs to match client behaviour.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:253
(Diff revision 2)
>      } finally {
>        db.close();
>      }
>    }
>  
> +  private void recordSendCommandTelemetryEvent(Context context, Command command, String clientID) {

Same, static

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:265
(Diff revision 2)
> +    if (TextUtils.isEmpty(hashedFxAUID)) {
> +      Log.e(LOG_TAG, "Can't record telemetry event: The hashed FxA UID is empty");
> +      return;
> +    }
> +
> +    HashMap<String, String> extra = new HashMap<>();

nit: final

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:270
(Diff revision 2)
> +    HashMap<String, String> extra = new HashMap<>();
> +    if (!TextUtils.isEmpty(command.flowID)) {
> +      extra.put("flowID", command.flowID);
> +    }
> +    try {
> +      extra.put("deviceID", Utils.byte2Hex(Utils.sha256(clientID.concat(hashedFxAUID).getBytes("UTF-8"))));

I don't think you've addressed my question regarding deviceID. Docs say this should be ID of the target device for "sendcommand". Are docs wrong?

Here you're concatenating target clientID with a local deviceID, which doesn't make sense to me.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/CommandProcessor.java:273
(Diff revision 2)
> +    }
> +    try {
> +      extra.put("deviceID", Utils.byte2Hex(Utils.sha256(clientID.concat(hashedFxAUID).getBytes("UTF-8"))));
> +    } catch (UnsupportedEncodingException | NoSuchAlgorithmException e) {
> +      // Should not happen.
> +      Log.e(LOG_TAG, "Either UTF-8 or SHA-256 are not supported", e);

Ok to recordEvent without deviceID?

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:17
(Diff revision 2)
> +import android.support.annotation.Nullable;
> +import android.support.v4.content.LocalBroadcastManager;
> +import android.text.TextUtils;
> +import android.util.Log;
> +
> +import org.mozilla.gecko.Telemetry;

Pretty sure this won't build without gradle. (try putting "ac_add_options --without-gradle" into your mozconfig).

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:46
(Diff revision 2)
> +            throw new IllegalArgumentException("Could not validate telemetry event.");
> +        }
> +
> +        Log.d(LOG_TAG, "Recording event {" + object + ", " + method + ", " + value + ", " + extra);
> +        final Bundle event = new Bundle();
> +        event.putLong(TelemetryContract.KEY_EVENT_TIMESTAMP, Telemetry.realtime());

(you can't use Telemetry object - see import comment).

That uses elapsedRealtime, which is "milliseconds since boot, including time spent in sleep". I don't think that's what you want here.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:73
(Diff revision 2)
> +        LocalBroadcastManager.getInstance(context).sendBroadcast(telemetryIntent);
> +    }
> +
> +    // The Firefox Telemetry pipeline imposes size limits.
> +    // See toolkit/components/telemetry/docs/collection/events.rst
> +    static boolean validateTelemetryEvent(final String object, final String method,

@VisibleForTesting

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:76
(Diff revision 2)
> +    // The Firefox Telemetry pipeline imposes size limits.
> +    // See toolkit/components/telemetry/docs/collection/events.rst
> +    static boolean validateTelemetryEvent(final String object, final String method,
> +                                          @Nullable final String value, @Nullable final HashMap<String, String> extra) {
> +        // The Telemetry Sender uses BaseResource under the hood.
> +        final Charset charset = Charset.forName(BaseResource.charset);

If we can assume all the things will be ASCII, we can skip the charset dance and just count the letters. I think that's a fair assumption here, but it's your call. Always good to err on the side of caution when feasible. But just be consistent then (e.g. you're currently assuming extra keys being ascii).

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:92
(Diff revision 2)
> +            if (extra.size() > 10) {
> +                Log.w(LOG_TAG, "Invalid event parameters - too many extra keys: " + extra);
> +                return false;
> +            }
> +            for (Map.Entry<String, String> e : extra.entrySet()) {
> +                if (e.getKey().length() > 15 || e.getValue().getBytes(charset).length > 80) {

You're assuming that keys are always ASCII, right?
Attachment #8948019 - Flags: review?(gkruglov) → review-
Comment on attachment 8948019 [details]
Bug 1363924 p3 - Record commands telemetry events.

https://reviewboard.mozilla.org/r/217664/#review228016

> Just to double check for my benefit: you said that desktop doesn't record local deviceID for "processcommand". Although you could do it here easily, you're explicitly choosing not to, correct? If so, just make sure you'll modify the docs to match client behaviour.

I'm pretty sure that was a copypasta mistake. I have made the change.

> I don't think you've addressed my question regarding deviceID. Docs say this should be ID of the target device for "sendcommand". Are docs wrong?
> 
> Here you're concatenating target clientID with a local deviceID, which doesn't make sense to me.

> with a local deviceID
I think you mean hashed FxA UID (which is not a device ID).

> Ok to recordEvent without deviceID?

Yup that's fine.

> (you can't use Telemetry object - see import comment).
> 
> That uses elapsedRealtime, which is "milliseconds since boot, including time spent in sleep". I don't think that's what you want here.

Thanks, I've used elapsedRealtime directly instead of importing Telemetry.
This clock is fine, the only characteristic we're looking for is monotony for ordering purposes.
See https://searchfox.org/mozilla-central/rev/9a8d3f8191b15cdca89a7ce044c7bea2dd0462dc/services/sync/modules/telemetry.js#72-79

> If we can assume all the things will be ASCII, we can skip the charset dance and just count the letters. I think that's a fair assumption here, but it's your call. Always good to err on the side of caution when feasible. But just be consistent then (e.g. you're currently assuming extra keys being ascii).

Let's try to respect the spec and keep the byte counting.

> You're assuming that keys are always ASCII, right?

Yup, the doc says: "Each extra key name: Max. string length is 15."
(In reply to Edouard Oger [:eoger] from comment #22)
> Thanks, I've used elapsedRealtime directly instead of importing Telemetry.
> This clock is fine, the only characteristic we're looking for is monotony
> for ordering purposes.
> See
> https://searchfox.org/mozilla-central/rev/
> 9a8d3f8191b15cdca89a7ce044c7bea2dd0462dc/services/sync/modules/telemetry.
> js#72-79

I see. Indeed, event telemetry docs have this to say about the timestamp: "This is the time in ms when the event was recorded, relative to the main process start time".

I think in our case, since we have a flowID present in order to tie the events together, this is fine - and the exact ordering doesn't matter. But in general, I do wonder how this is supposed to work. Different events recorded during different process lifetimes may bundled together and submitted together, making their timestamp-based ordering meaningless (we'll be comparing multiple different monotonic clocks, which is largely meaningless). I think the telemetry pipeline might be making certain assumptions about event submissions (e.g. before shutdown - not guaranteed on desktop, but very likely), which don't hold true most of the time in the mobile context.
Comment on attachment 8948019 [details]
Bug 1363924 p3 - Record commands telemetry events.

https://reviewboard.mozilla.org/r/217664/#review228706

LGTM, great work Ed! Thanks for fielding my questions, some admittedly dumb (I forgot for a second that hashed fxa uid will be the same for both receiver and the sender - that is, we're just sending tabs to ourselves, not to other users! duh).

::: mobile/android/services/src/main/java/org/mozilla/gecko/fxa/authenticator/AndroidFxAccount.java:116
(Diff revision 3)
>    private static final String ACCOUNT_KEY_DEVICE_PUSH_REGISTRATION_ERROR_TIME = "devicePushRegistrationErrorTime";
>  
> +  // We only see the hashed FxA UID once every sync.
> +  // We might need it later for telemetry purposes outside of the context of a sync, which
> +  // is why it is persisted.
> +  private static final String ACCOUNT_KEY_HASHED_FXA_UID = "hashedFxAUID";

nit: might be worth mentioning that it's not expected to actually change for the lifetime of an account, but we just set it every time.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/telemetry/TelemetryEventCollector.java:94
(Diff revision 3)
> +            if (extra.size() > 10) {
> +                Log.w(LOG_TAG, "Invalid event parameters - too many extra keys: " + extra);
> +                return false;
> +            }
> +            for (Map.Entry<String, String> e : extra.entrySet()) {
> +                if (e.getKey().length() > 15 || e.getValue().getBytes(charset).length > 80) {

I'm still unsure why you're assuming ASCII for extra key, and not for anything else.
Attachment #8948019 - Flags: review?(gkruglov) → review+
Thank you Grisha :)
Pushed by eoger@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d30f27293117
p1 - Add flowID to Client commands. r=Grisha
https://hg.mozilla.org/integration/autoland/rev/fbba0f584237
p2 - Move deviceID and uid to payload level in sync ping. r=Grisha
https://hg.mozilla.org/integration/autoland/rev/08747734c0fd
p3 - Record commands telemetry events. r=Grisha,tcsc
Backed out for android-test failures

backout: https://hg.mozilla.org/integration/autoland/rev/d35c034bf194e0cf88330157aa167a880c1e2b29

push with failures:

failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=164387122&repo=autoland&lineNumber=4953

[task 2018-02-26T17:50:49.461Z] 17:50:49     INFO -  SUITE-END | android-test | testOfficialWithoutGeckoBinariesNoMinApiPhotonDebugUnitTest org.mozilla.gecko.telemetry.measurements.TestSessionMeasurements
[task 2018-02-26T17:50:49.461Z] 17:50:49     INFO -  TinderboxPrint: report<br/><a href="https://queue.taskcluster.net/v1/task/H0NTOcl0S0umhkKTy8hq0w/runs/0/artifacts/public/app/unittest/testOfficialWithoutGeckoBinariesNoMinApiPhotonDebugUnitTest/index.html">HTML testOfficialWithoutGeckoBinariesNoMinApiPhotonDebugUnitTest report</a>, visit "Inspect Task" link for details
[task 2018-02-26T17:50:49.461Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | android-test | There were failing tests. See the reports at: https://queue.taskcluster.net/v1/task/H0NTOcl0S0umhkKTy8hq0w/runs/0/artifacts/public/app/unittest/testOfficialWithoutGeckoBinariesNoMinApiPhotonDebugUnitTest/index.html
[task 2018-02-26T17:50:49.461Z] 17:50:49     INFO -  SUITE-START | android-test | testOfficialWithoutGeckoBinariesNoMinApiPhotonDebugUnitTest org.mozilla.gecko.telemetry.pingbuilders.TelemetrySyncEventPingBuilderTest
[task 2018-02-26T17:50:49.461Z] 17:50:49     INFO -  TEST-START | testGeneralShape
[task 2018-02-26T17:50:49.462Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | <failure message="org.junit.ComparisonFailure: expected:&lt;...object&quot;,null,{&quot;extra[&quot;:&quot;extra]&quot;}]&gt; but was:&lt;...object&quot;,null,{&quot;extra[-key&quot;:&quot;extra-value]&quot;}]&gt;" type="org.junit.ComparisonFailure">org.junit.ComparisonFailure: expected:&lt;...object",null,{"extra[":"extra]"}]&gt; but was:&lt;...object",null,{"extra[-key":"extra-value]"}]&gt;
[task 2018-02-26T17:50:49.462Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | 	at org.junit.Assert.assertEquals(Assert.java:115)
[task 2018-02-26T17:50:49.462Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | 	at org.junit.Assert.assertEquals(Assert.java:144)
[task 2018-02-26T17:50:49.462Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | 	at org.mozilla.gecko.telemetry.pingbuilders.TelemetrySyncEventPingBuilderTest.testGeneralShape(TelemetrySyncEventPingBuilderTest.java:30)
[task 2018-02-26T17:50:49.463Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[task 2018-02-26T17:50:49.463Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[task 2018-02-26T17:50:49.463Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[task 2018-02-26T17:50:49.463Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | 	at java.lang.reflect.Method.invoke(Method.java:498)
[task 2018-02-26T17:50:49.463Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
[task 2018-02-26T17:50:49.464Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
[task 2018-02-26T17:50:49.464Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
[task 2018-02-26T17:50:49.464Z] 17:50:49     INFO -  TEST-UNEXPECTED-FAIL | testGeneralShape | 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
[...]
Flags: needinfo?(eoger)
Flags: needinfo?(eoger)
Pushed by eoger@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1187a3b7d4e4
p1 - Add flowID to Client commands. r=Grisha
https://hg.mozilla.org/integration/autoland/rev/26f6e01a6c2c
p2 - Move deviceID and uid to payload level in sync ping. r=Grisha
https://hg.mozilla.org/integration/autoland/rev/a558bb4b9be5
p3 - Record commands telemetry events. r=Grisha,tcsc
https://hg.mozilla.org/mozilla-central/rev/1187a3b7d4e4
https://hg.mozilla.org/mozilla-central/rev/26f6e01a6c2c
https://hg.mozilla.org/mozilla-central/rev/a558bb4b9be5
Status: REOPENED → RESOLVED
Closed: 7 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 60
Depends on: 1442326
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: