Open Bug 891644 Opened 7 years ago Updated 6 years ago

Follow-up: Adapt ANRReporter.java to use TelemetryRecorder

Categories

(Firefox for Android :: General, defect)

ARM
Android
defect
Not set

Tracking

()

ASSIGNED

People

(Reporter: liuche, Assigned: liuche)

References

Details

Attachments

(1 file, 1 obsolete file)

Once bug 870169 lands in central, ANRReporter should be adapted to use TelemetryRecorder.
Ported ANRReporter to use TelemetryRecorder.

Would like some feedback on error handling - right now the functions that call TelemetryRecorder now have "throws Exception," which seems awkward.

Minor change to TelemetryRecorder to expose the cleanUp function as public.

Try build: https://tbpl.mozilla.org/?tree=Try&rev=71d4cd53f292
Assignee: nobody → liuche
Status: NEW → ASSIGNED
Attachment #779897 - Flags: feedback?(rnewman)
OS: Mac OS X → Android
Hardware: x86 → ARM
Comment on attachment 779897 [details] [diff] [review]
Patch: Port ANRReporter to use TelemetryRecorder v1

Review of attachment 779897 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for taking so long to get to this!

Overall this looks good. I'd think hard about whether all of the 'total +=' bits are adding any value, and if they are whether to move them into TelemetryRecorder. And make double extra sure that we're cleaning up in all the right places -- assume that the original code is wrong (with no offense to the author!). If you want to be extra thorough, add a finalizer that sternly warns and then cleans up.

Comments inline.

::: mobile/android/base/ANRReporter.java
@@ +272,2 @@
>  
>          // payload start

No longer adds useful info, so kill this.

@@ +275,4 @@
>              "\"ver\":1," +
>              "\"simpleMeasurements\":{" +
>                  "\"uptime\":" + String.valueOf(getUptimeMins()) +
>              "}," +

It's not clear to me whether it's better to concat this string and call appendPayload once, or to call appendPayload more than once. Thoughts?

@@ +307,5 @@
> +     * Finds a string pattern within a larger block and returns the index.
> +     *
> +     * This is straightforward if the entire pattern is within one block;
> +     * however, if the pattern spans across two blocks, we have to match both the start of
> +     * the pattern in the first block and the end of the pattern in the second block.

Or more than two blocks, right?

@@ +490,3 @@
>              Log.w(LOGTAG, e);
> +        } catch (Exception e) {
> +            // Exception in TelemetryRecorder, log and return.

Don't you need to clean up here, too?

@@ +574,5 @@
>              return;
>          }
> +
> +        // This is a Gecko ANR, so set up TelemetryRecorder to record ping.
> +        File pingFileDir = getPingDir();

getPingDir seems eminently reusable...

@@ +576,5 @@
> +
> +        // This is a Gecko ANR, so set up TelemetryRecorder to record ping.
> +        File pingFileDir = getPingDir();
> +        if (pingFileDir == null) {
> +          Log.w(LOGTAG, "Could not get directory for ping files.");

"Aborting ANR telemetry report."

@@ +581,5 @@
> +          return;
> +        }
> +
> +        // Get cache directory for temp files.
> +        File cacheDir = context.getCacheDir();

Make sure (by reading code; the docs don't say) that this never returns null?

@@ +589,1 @@
>          Log.i(LOGTAG, "processing Gecko ANR");

Can we get caps and punctuation? :D

::: mobile/android/base/background/datareporting/TelemetryRecorder.java
@@ +272,5 @@
>  
>    /**
>     * Clean up checksum and delete the temporary file.
>     */
> +  public void cleanUp() {

If you do this, you need to make sure that cleanUp is safe to call from outside the finally block of finishPingFile.

Right now you're giving callers the ability to do crazy things like

  r.cleanUp();
  r.appendPayload(...);

which probably wouldn't be a good idea. Either add an explicit state transition (OPEN, CLOSED, ERROR?), or kill the builder in cleanUp and check for it elsewhere, or some combination of those things.
Attachment #779897 - Flags: feedback?(rnewman) → feedback+
> 
> @@ +275,4 @@
> >              "\"ver\":1," +
> >              "\"simpleMeasurements\":{" +
> >                  "\"uptime\":" + String.valueOf(getUptimeMins()) +
> >              "}," +
> 
> It's not clear to me whether it's better to concat this string and call
> appendPayload once, or to call appendPayload more than once. Thoughts?
> 

Replaced this with a StringBuilder, while attempting to preserve JSON object clarity.

> @@ +307,5 @@
> > +     * Finds a string pattern within a larger block and returns the index.
> > +     *
> > +     * This is straightforward if the entire pattern is within one block;
> > +     * however, if the pattern spans across two blocks, we have to match both the start of
> > +     * the pattern in the first block and the end of the pattern in the second block.
> 
> Or more than two blocks, right?
> 

Apparently not! Fixed that and updated comment.

> @@ +490,3 @@
> >              Log.w(LOGTAG, e);
> > +        } catch (Exception e) {
> > +            // Exception in TelemetryRecorder, log and return.
> 
> Don't you need to clean up here, too?
> 

If TelemetryRecorder threw the exception, it should have already cleaned up. But probably safe to do it again, I guess. (This was part of the hesitation I had about the error handling in this code - passing Exceptions from TelemetryRecorder means you can't really differentiate between error sources very well.) Added a cleanUp() call.

> @@ +574,5 @@
> >              return;
> >          }
> > +
> > +        // This is a Gecko ANR, so set up TelemetryRecorder to record ping.
> > +        File pingFileDir = getPingDir();
> 
> getPingDir seems eminently reusable...

I think when Nick reviewed this, we decided that the right level of abstraction was to pass in a parent directory and a file to write to, so that non-telemetry-ping files could be written. Perhaps this should be named "getTelemetryPingDir".

> 
> @@ +581,5 @@
> > +          return;
> > +        }
> > +
> > +        // Get cache directory for temp files.
> > +        File cacheDir = context.getCacheDir();
> 
> Make sure (by reading code; the docs don't say) that this never returns null?

Good question! The code in the current SDK (frameworks.base.core.java.android.app.ContextImpl.java) creates the CacheDir if it does not exist, but apparently Android 2.2 may return null [1]. However, createTempFile does accept null as an argument and will default to using the device temp directory - and if this directory is null, then presumably an IOException is thrown, and TelemetryRecorder tries to create the file in the parentDir. So everything is okay.

However, this does make me wonder if we should handle ANR reports differently. Presumably on an ANR, Fennec may be killed very soon and so we might not want to bother writing an intermediate temp file; even a partial file is better than none at all. This might be moot if telemetry server barfs on partial files not having a checksum, being malformed JSON, etc.

[1] http://code.google.com/p/android/issues/detail?id=10789#c23

> 
> ::: mobile/android/base/background/datareporting/TelemetryRecorder.java
> @@ +272,5 @@
> >  
> >    /**
> >     * Clean up checksum and delete the temporary file.
> >     */
> > +  public void cleanUp() {
> 
> If you do this, you need to make sure that cleanUp is safe to call from
> outside the finally block of finishPingFile.
> 

Check for state now - let me know if you think I should fail in a less noisy way than throwing an error.

> Right now you're giving callers the ability to do crazy things like
> 
>   r.cleanUp();
>   r.appendPayload(...);
> 
> which probably wouldn't be a good idea. Either add an explicit state
> transition (OPEN, CLOSED, ERROR?), or kill the builder in cleanUp and check
> for it elsewhere, or some combination of those things.

Added state (OPEN, INPROGRESS, CLOSED).
Attachment #779897 - Attachment is obsolete: true
Attachment #785311 - Flags: review?(rnewman)
Comment on attachment 785311 [details] [diff] [review]
Patch: Port ANRReporter to use TelemetryRecorder v2

Review of attachment 785311 [details] [diff] [review]:
-----------------------------------------------------------------

Some concerns about state, a question about testing, and threading issues. Pretty close, though!

::: mobile/android/base/ANRReporter.java
@@ +272,2 @@
>  
> +        StringBuilder sb = new StringBuilder();

I don't think StringBuilder offers any value here; it was a simple string concatenation. What I was asking about was this: you concatenate one big string, only to then pass it to appendPayload, which quotes it, turns it to bytes, and writes it to disk, updating the hash as it goes.

Would it be better or worse to call appendPayload multiple times for smaller strings, rather than building one big string?

If it weren't for the hash it's kind of a no-brainer: stream append. With the hash... dunno, I don't know the incremental cost. (Of course, the hasher has to process each byte, so it might make no difference, in which case it's a net win: no big intermediate string.)

::: mobile/android/base/background/datareporting/TelemetryRecorder.java
@@ +44,5 @@
>   */
>  public class TelemetryRecorder {
>    private final String LOG_TAG = "TelemetryRecorder";
>  
> +  private State mState;

It's worth considering threading issues here. This is the tip of the iceberg -- not only is this not `volatile`, but you also check-then-set it. Either make this, and the rest of the class, thread-safe -- that is, make sure you can create, prep, append to, and finish this class from multiple threads -- or mark this class as single-thread-only so that suckers don't try to use it that way.

@@ +113,5 @@
>    }
>  
> +  private enum State {
> +    OPEN,
> +    INPROGRESS,

I have a gentle preference for IN_PROGRESS.

@@ +165,5 @@
>        byte[] header = makePingHeader(filename);
>        outputStream.write(header);
>        Logger.debug(LOG_TAG, "Wrote " + header.length + " header bytes.");
>  
> +      mState = State.OPEN;

I don't follow this. You require OPEN to startPingFile. Then you set INPROGRESS, then back to OPEN after the header is written. 

appendPayload requires INPROGRESS.

How are you testing this?
Attachment #785311 - Flags: review?(rnewman)
(In reply to Richard Newman [:rnewman] from comment #4)
> I don't think StringBuilder offers any value here; it was a simple string
> concatenation. What I was asking about was this: you concatenate one big
> string, only to then pass it to appendPayload, which quotes it, turns it to
> bytes, and writes it to disk, updating the hash as it goes.
> 
> Would it be better or worse to call appendPayload multiple times for smaller
> strings, rather than building one big string?
> 
> If it weren't for the hash it's kind of a no-brainer: stream append. With
> the hash... dunno, I don't know the incremental cost. (Of course, the hasher
> has to process each byte, so it might make no difference, in which case it's
> a net win: no big intermediate string.)
> 

I kind of feel like method switching/stack creation plus lots of SHA-256 hashes is pretty expensive; and either way, I'd be more comfortable calling appendPayload multiple times once bug 889942 gets fixed.

I don't see why we wouldn't use StringBuilder in this case - I guess it's a constant number of concatenations, but String concatenation is still *really* expensive in Java.

> ::: mobile/android/base/background/datareporting/TelemetryRecorder.java
> @@ +44,5 @@
> >   */
> >  public class TelemetryRecorder {
> >    private final String LOG_TAG = "TelemetryRecorder";
> >  
> > +  private State mState;
> 
> It's worth considering threading issues here. This is the tip of the iceberg
> -- not only is this not `volatile`, but you also check-then-set it. Either
> make this, and the rest of the class, thread-safe -- that is, make sure you
> can create, prep, append to, and finish this class from multiple threads --
> or mark this class as single-thread-only so that suckers don't try to use it
> that way.
> 

Synchronized the methods. I was at first hesitant that calling a synchronized method from another synchronized method wouldn't work (cleanUp), but since the lock is still held, that's actually not a problem. I'm not sure how you feel about synchronized methods vs synchronization on a private object.

> 
> @@ +165,5 @@
> >        byte[] header = makePingHeader(filename);
> >        outputStream.write(header);
> >        Logger.debug(LOG_TAG, "Wrote " + header.length + " header bytes.");
> >  
> > +      mState = State.OPEN;
> 
> I don't follow this. You require OPEN to startPingFile. Then you set
> INPROGRESS, then back to OPEN after the header is written. 
> 
> appendPayload requires INPROGRESS.
> 
> How are you testing this?

Sorry, that was stray leftover code!

Anyways, I've moved these changes (which were initially trivial, but now not-so) to a pull request in github [1]. The review request for the rest of the patch (ANRReporter using TelemetryRecorder) will still remain.

[1] https://github.com/mozilla-services/android-sync/pull/343
You need to log in before you can comment on or make changes to this bug.