Closed Bug 870169 Opened 7 years ago Closed 7 years ago

Extract reusable telemetry code from ANRReporter.java

Categories

(Firefox for Android :: General, defect)

ARM
Android
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 25

People

(Reporter: rnewman, Assigned: liuche)

References

(Blocks 3 open bugs)

Details

Attachments

(2 files, 2 obsolete files)

I'd like to submit telemetry readings for FHR's performance on Android.

There seems to be a lot of code in ANRReporter.java that does exactly that, but it's all bundled together in private methods and string concatenation.

Can we extract a TelemetryReporter class, ideally one that lives in the Android Services module so that FHR can use it without contortions?

(Telemetry reporting should eventually be a non-Gecko service on Android, so this actually makes a ton of sense, and will also dramatically improve the testability of this code, which I think would be great! I'm happy to do the moving once such a class exists in the tree.)

Jim, if you're supportive but don't have time to do the work (or don't have time in the very short term), I'm happy to do the surgery if you'll review, so I can get it into 24.
Blocks: 870170
I think that'd be great! I don't know how much of the ANRReporter code is reusable though.

Originally I wanted the telemetry code in ANRReporter to be lean and fast, so I went with strings instead of JSONObject (since I didn't know how JSONObject performs), and also wrote the code to be pretty much specific to ANR reports. I think for a generic TelemetryReporter you would want to use JSONObject and have an API that applies to more than just ANR reports. ANRReporter can certainly be used as a reference.

So in my head we can first come up with a TelemetryReporter, and then refactor ANRReporter to use the new class. What do you think? I'd be happy to work on either/both. In the very short term I'm taking PTO from May 15 to 31, but after that I'm free to help out.
(In reply to Jim Chen [:jchen :nchen] from comment #1)

> So in my head we can first come up with a TelemetryReporter, and then
> refactor ANRReporter to use the new class. What do you think?

That sounds really good to me.

We're pushing to get some or all of FHR landed prior to the merge on May 13th, so anything that doesn't get done by then has a few weeks. Let's see where we are when you get back from PTO!
Lifted out the writing of a telemetry ping from ANRReporter, then subclassed an ANRTelemetryReporter.

Feedback on structure of the abstraction, naming, nits. ANRTelemetryReporter is pretty much copypasta from ANRReporter, with some style cleanup.
Assignee: nobody → liuche
Status: NEW → ASSIGNED
Attachment #755126 - Flags: feedback?(rnewman)
Attachment #755126 - Flags: feedback?(nalexander)
Attachment #755126 - Attachment is obsolete: true
Attachment #755126 - Flags: feedback?(rnewman)
Attachment #755126 - Flags: feedback?(nalexander)
Attachment #755128 - Flags: feedback?(rnewman)
Attachment #755128 - Flags: feedback?(nalexander)
Comment on attachment 755128 [details] [diff] [review]
Patch: WIP of ripping out Telemetry Reporter v1

Review of attachment 755128 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think class based inheritance is the way to go here.  I'd prefer to see two (well, N) classes: one TelemetryRecorder and N - 1 TelemetryReporters.  TelemetryReporterTemplate is basically the Recorder -- it handles writing files, calculating checksums, and listening for messages on a background thread.  It should be transport-agnostic, meaning it should accept messages via regular Java calls.  (Any transport layer you care to use can be built on top of that.)  All the broadcast receiver/intent filter business is ANRTelemetryReporter-specific and should live there.  The ANRTelemetryReporter-specific payload building business is in the correct place.

Think about what API you'd want to consume if you were instrumenting how long a Sync took.  Or how many records were uploaded and downloaded.

rnewman: there's some complexity around Looper's and Handlers that I think might be avoided entirely by using an ExecutorService.  Am I missing something?

::: mobile/android/base/ANRTelemetryReporter.java
@@ +97,5 @@
> +        }
> +    }
> +
> +    @Override
> +    protected File getOutputPingFile() {

Nobody will want to override this: I don't think ping file names themselves don't matter, and the "saved-telemetry-pings" directory is hard-coded into Gecko.  What people will want to provide is the name of the current profile -- this is very important to Background Services API consumers, such as FHR.

::: mobile/android/base/TelemetryReporterTemplate.java
@@ +2,5 @@
> + * This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +package org.mozilla.gecko.background.datareporting;

nit: this doesn't line up with where the file lives.  I'd like this telemetry module to be checked into android-sync github and live in background/*, just so that we can use it easily in background services.

@@ +27,5 @@
> +
> +/**
> + * Responds to registered broadcasts by processing reporter-specific data traces
> + * and writing them to a Telemtry ping payload file.
> + * @author liuche

nit: @author isn't Fennec convention.

@@ +33,5 @@
> + */
> +public abstract class TelemetryReporterTemplate extends BroadcastReceiver {
> +    protected static final String LOGTAG = "TelemetryReporterTemplate";
> +
> +    private static int  sRegisteredCount;

nit: space.

@@ +38,5 @@
> +    private static Handler mHandler;
> +
> +    protected static final String PING_CHARSET = "us-ascii";
> +
> +    // Fields for implementing classes to instantiate. 

nit: trailing whitespace.  Throughout, in fact.

@@ +40,5 @@
> +    protected static final String PING_CHARSET = "us-ascii";
> +
> +    // Fields for implementing classes to instantiate. 
> +    protected static final TelemetryReporterTemplate sInstance;
> +    protected static String INTENT_FILTER;

I believe that this defines a single INTENT_FILTER that is not overridden by subclasses.  I think you want a member function.

@@ +110,5 @@
> +    }
> +
> +    @Override
> +    public void onReceive(Context context, Intent intent) {
> +        if (!checkAndContinue(context, intent)) {

This seems pretty heavy, but I assume it was how ANRReporter was implemented.  Can we provide a two-part interface: an API that accepts an Intent, like this, and a simpler API that can just be called from Java?

@@ +201,5 @@
> +     *
> +     * @param pingStream OutputStream of the ping file
> +     * @param checksum MessageDigest for checksum associated with payload
> +     */
> +    protected abstract void fetchAndFillPingPayload(OutputStream pingStream,

This seems a poor API to expose to implementers -- checksum is an implementation detail that most consumers shouldn't care about.  Why can't this just be |String fetchPingPayload()| or |void writePingPayload(OutputStream)|?

@@ +237,5 @@
> +        byte[] data = payloadContent.getBytes(PING_CHARSET);
> +        checksum.update(data);
> +
> +        data = JSONObject.quote(payloadContent).getBytes(PING_CHARSET);
> +        // First and last bytes are quotes inserted by JSONObject.quote; discard

Can you verify: "Telemetry pings are JSON strings transferred without quotes."
Attachment #755128 - Flags: feedback?(nalexander) → feedback+
Comment on attachment 755128 [details] [diff] [review]
Patch: WIP of ripping out Telemetry Reporter v1

Review of attachment 755128 [details] [diff] [review]:
-----------------------------------------------------------------

Concur with Nick's architectural comments. Subclassing is not the right approach here.

I want to see a thing that knows how to write out telemetry payloads (TelemetryRecorder), and a bunch of things that ask it to record stuff (ANRTelemetryReporter). Eventually we can split out a TelemetryUploader.

Also, :%s,\s\+$,,

::: mobile/android/base/ANRTelemetryReporter.java
@@ +1,1 @@
> +package org.mozilla.gecko.background.datareporting;

Nit: license.

::: mobile/android/base/TelemetryReporterTemplate.java
@@ +34,5 @@
> +public abstract class TelemetryReporterTemplate extends BroadcastReceiver {
> +    protected static final String LOGTAG = "TelemetryReporterTemplate";
> +
> +    private static int  sRegisteredCount;
> +    private static Handler mHandler;

I prefer simply:

  private ExecutorService executor = Executors.newSingleThreadExecutor();

rather than all this nonsense with starting a thread, having a Handler, and pissing about with Looper. Just send Runnables to the Executor.

@@ +39,5 @@
> +
> +    protected static final String PING_CHARSET = "us-ascii";
> +
> +    // Fields for implementing classes to instantiate. 
> +    protected static final TelemetryReporterTemplate sInstance;

So... you have a bunch of static fields, like mHandler (should be sHandler!), but you still have an instance?

Either do a singleton (but don't, because singletons are evil), or have a singleton instance of an object. Don't mix the two.

@@ +47,5 @@
> +        if (sRegisteredCount++ != 0) {
> +            // Already registered
> +            return;
> +        }
> +        sInstance.start(context);

Careful with context here. If you're having a singleton, how can you support multiple contexts?

@@ +62,5 @@
> +        }
> +        sInstance.stop();
> +    }
> +
> +    public void start(final Context context) {

No need for this to be public if you're forcing me to register, right?

@@ +89,5 @@
> +    public void stop() {
> +        synchronized (this) {
> +            while (mHandler == null) {
> +                try {
> +                    wait(1000);

Meh, this is painful and pointless.

  public void stop() {
    executor.shutdown();
  }
Attachment #755128 - Flags: feedback?(rnewman)
Note that the ANR reporter is a bit picky about which thread it runs on -- it needs its own Handler on its own thread to register and receive the ANR broadcast intent, hence the Looper and Handler bits.

On the other hand, AFAICT the telemetry reporter should be able to run on any thread, so I think it would be better to have the threading code stay with the ANR reporter, and make the extracted telemetry reporter code thread-independent.
Re-work based on high-level comments. Extracted out a TelemetryRecorder class for handling writing ping files.
Attachment #755128 - Attachment is obsolete: true
Attachment #761881 - Flags: feedback?(rnewman)
Use TelemetryRecorder with an ANRReporter replacement. (Left ANRReporter undeleted.)

This just uses TelemetryRecorder synchronously, but I wanted to get feedback before attempting any background threading.
Attachment #761882 - Flags: feedback?(rnewman)
Attachment #761882 - Flags: feedback?(nalexander)
Comment on attachment 761881 [details] [diff] [review]
Part 1: Extract reusable TelemetryRecorder v1

Feedback optional from you, Nick. I'm sure Richard will have thoughts aplenty!
Attachment #761881 - Flags: feedback?(nalexander)
(Also, formatting is wonky, kind of a cross between android-sync and fennec. Will fix, I promise.)
Comment on attachment 761881 [details] [diff] [review]
Part 1: Extract reusable TelemetryRecorder v1

Review of attachment 761881 [details] [diff] [review]:
-----------------------------------------------------------------

Looking much better!

::: mobile/android/base/TelemetryRecorder.java
@@ +30,5 @@
> + *           {
> + *              "slug": "<uuid-string>",
> + *              "payload": "<escaped-json-data-string>",
> + *              "checksum": "<base64-sha-256-string>"
> + *          }

What's the API for this?  You call |start|, then |append|, then |finish|, correct?

@@ +32,5 @@
> + *              "payload": "<escaped-json-data-string>",
> + *              "checksum": "<base64-sha-256-string>"
> + *          }
> + */
> +public abstract class TelemetryRecorder {

I see no reason to make this abstract.  Make it concrete, and pass in whatever file you want to write to.

@@ +70,5 @@
> +     *
> +     * @return File to write telemetry ping to, or null if the File can not be
> +     *              opened/created.
> +     */
> +    protected abstract File getDestinationFile(Context context, String profileName);

Just make this a constructor parameter.  And why not make this a little more versatile by taking a stream (Writer)?  Easier to test that way, too.

@@ +81,5 @@
> +     * }
> +     *
> +     * @throws Exception rethrown exception to caller
> +     */
> +    public void startPingFile() throws Exception {

|open|?

@@ +175,5 @@
> +    /**
> +     * Add the checksum of the payload to the ping file and close the stream.
> +     * @throws Exception rethrown exception to caller
> +     */
> +    public int finishPingFile() throws Exception {

|close|? Why are you returning footer length?
Attachment #761881 - Flags: feedback?(nalexander) → feedback+
Comment on attachment 761882 [details] [diff] [review]
Part 2: ANRTelemetryReporter v1

Review of attachment 761882 [details] [diff] [review]:
-----------------------------------------------------------------

This seems like two things: a library for parsing traces.txt, and a BroadcastReceiver.  Can we split the library part for testing?  Otherwise, looks better.  I still think the Handler is unnecessary.
Attachment #761882 - Flags: feedback?(nalexander) → feedback+
(In reply to Nick Alexander :nalexander from comment #12)
> Comment on attachment 761881 [details] [diff] [review]
> Part 1: Extract reusable TelemetryRecorder v1
> 
> Review of attachment 761881 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looking much better!
> 
> ::: mobile/android/base/TelemetryRecorder.java
> @@ +30,5 @@
> > + *           {
> > + *              "slug": "<uuid-string>",
> > + *              "payload": "<escaped-json-data-string>",
> > + *              "checksum": "<base64-sha-256-string>"
> > + *          }
> 
> What's the API for this?  You call |start|, then |append|, then |finish|,
> correct?

Yes; start and finish also include writing the pre-payload bytes (slug) and post-payload bytes (checksum), respectively, including the extra JSON characters like curly braces and commas and such. It fits with writing telemetry pings (under the assumption that telemetry pings are all of the "slug, payload, checksum" format).

Do you think it should be abstracted out further?

> @@ +70,5 @@
> > +     *
> > +     * @return File to write telemetry ping to, or null if the File can not be
> > +     *              opened/created.
> > +     */
> > +    protected abstract File getDestinationFile(Context context, String profileName);
> 
> Just make this a constructor parameter.  And why not make this a little more
> versatile by taking a stream (Writer)?  Easier to test that way, too.

That's true, Writers are easier to test. On the other hand, this currently extracts the "slug" field from the filename and writes it into the ping file. I guess I could pass that in explicitly, but I'm assuming there's not a case where the slug doesn't match the File name.

I'll un-abstract and change the constructor to take a File then, and have additional constructors that take an encoding and stream block size.

> @@ +81,5 @@
> > +     * }
> > +     *
> > +     * @throws Exception rethrown exception to caller
> > +     */
> > +    public void startPingFile() throws Exception {
> 
> |open|?
> 
> @@ +175,5 @@
> > +    /**
> > +     * Add the checksum of the payload to the ping file and close the stream.
> > +     * @throws Exception rethrown exception to caller
> > +     */
> > +    public int finishPingFile() throws Exception {
> 
> |close|? Why are you returning footer length?

ANRReporter consumes and prints out the number of footer bytes written for debugging. It seems useful for debugging, but perhaps that doesn't need to be returned to the caller.

I'll make these changes to the android-sync github branch, and let's continue additional review comments there: https://github.com/mozilla-services/android-sync/commits/liuche/telemetry-recorder
> This seems like two things: a library for parsing traces.txt, and a
> BroadcastReceiver.  Can we split the library part for testing?  Otherwise,
> looks better.  I still think the Handler is unnecessary.

I'll file a follow-up - this patch tries to change ANRReporter as little as possible. I'm not sure about the Handler/Threading, especially based on Jim's comment 7.
Blocks: 882939
Attachment #761882 - Flags: feedback?(rnewman)
Comment on attachment 761881 [details] [diff] [review]
Part 1: Extract reusable TelemetryRecorder v1

Feedback received on pull request.
Attachment #761881 - Flags: feedback?(rnewman)
Updated pull request addressing comments: https://github.com/mozilla-services/android-sync/pull/324
Blocks: 889942
Blocks: 891644
Thanks for getting this out the door, Chenxia!
Target Milestone: --- → Firefox 25
https://hg.mozilla.org/mozilla-central/rev/508573c88f42
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.