Closed Bug 1150134 Opened 9 years ago Closed 9 years ago

Allow accessing archived pings via the TelemetryPing public API

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect)

defect
Not set
normal

Tracking

(firefox40 fixed)

RESOLVED FIXED
Tracking Status
firefox40 --- fixed

People

(Reporter: gfritzsche, Assigned: gfritzsche)

References

Details

Attachments

(7 files, 3 obsolete files)

16.88 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
1.72 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
1.19 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
21.43 KB, patch
gfritzsche
: review+
Details | Diff | Splinter Review
35.47 KB, patch
vladan
: review+
Details | Diff | Splinter Review
3.27 KB, patch
vladan
: review+
Details | Diff | Splinter Review
50.93 KB, patch
vladan
: review+
Details | Diff | Splinter Review
To e.g. enable about:healthreport and about:telemetry we need to enable accessing the archived pings.

The current minimal working approach would be to add 2 methods:
* getArchivedPingList() - returning an array of ping info objects of the form:
  {
    id: <string>, // UUID that identifies the ping.
    type: <string>, // Type of the ping, e.g. "main"
    submissionDate: <Date>, // Date/time this ping was submitted to the Telemetry system.
  }
* getArchivedPingsById(ids) - returns an array of objects with the full ping data for the ping ids passed.
Blocks: 1149754
Depends on: 1137252
Assignee: nobody → gfritzsche
Status: NEW → ASSIGNED
This adds:
* scanning the ping archive on delayed startup
* caching the archived ping info so we only need to scan once
* making the ping archive info accessible
* allow loading an archived ping by id from disk
Attachment #8590224 - Flags: review?(dteller)
Hm, i'm wondering whether its maybe better to not always scan on delayed startup and instead defer scanning the archived pings until the first request.
Attachment #8590220 - Flags: review?(dteller) → review+
Comment on attachment 8590221 [details] [diff] [review]
Part 2 - Improve the fakeNow() test helper

Review of attachment 8590221 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/tests/unit/head.js
@@ +108,5 @@
>    session.Policy.now = () => date;
>    let environment = Cu.import("resource://gre/modules/TelemetryEnvironment.jsm");
>    environment.Policy.now = () => date;
> +
> +  return new Date(date);

Why not `date` itself?
Attachment #8590221 - Flags: review?(dteller) → review+
Comment on attachment 8590222 [details] [diff] [review]
Part 3 - Avoid hitting the network from Telemetry unit tests

Review of attachment 8590222 [details] [diff] [review]:
-----------------------------------------------------------------

Isn't that already done in testing/profiles/prefs_general.js#234 ?

Other than that, r+.
Attachment #8590222 - Flags: review?(dteller) → review+
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #7)
> Comment on attachment 8590221 [details] [diff] [review]
> Part 2 - Improve the fakeNow() test helper
> 
> Review of attachment 8590221 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/telemetry/tests/unit/head.js
> @@ +108,5 @@
> >    session.Policy.now = () => date;
> >    let environment = Cu.import("resource://gre/modules/TelemetryEnvironment.jsm");
> >    environment.Policy.now = () => date;
> > +
> > +  return new Date(date);
> 
> Why not `date` itself?

I was worried that someone might change the state of `date` and would run into hard-to-trace-down test issues.
It seemed easier to just eliminate that by returning a clone.
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #8)
> Comment on attachment 8590222 [details] [diff] [review]
> Part 3 - Avoid hitting the network from Telemetry unit tests
> 
> Review of attachment 8590222 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Isn't that already done in testing/profiles/prefs_general.js#234 ?
> 
> Other than that, r+.

That addresses mochitests only - NS_ERROR_TOO_MANY_CONFIG_LOCATIONS:
https://developer.mozilla.org/en-US/docs/Mozilla/QA/Automated_testing#Need_to_set_preferences_for_test-suites.3F
Comment on attachment 8590224 [details] [diff] [review]
Part 4 - Implement API to access the archive pings

Review of attachment 8590224 [details] [diff] [review]:
-----------------------------------------------------------------

Note: in the future, could you use reviewboard? It makes it easier to find out if code has just moved around or has been changed.

::: toolkit/components/telemetry/TelemetryFile.jsm
@@ +294,5 @@
> +    if (typeof(ping.payload) == "string") {
> +      ping.payload = JSON.parse(ping.payload);
> +    }
> +    return ping;
> +  }),

I'm assuming that this code has just moved around, right?

::: toolkit/components/telemetry/TelemetryPing.jsm
@@ +307,5 @@
> +  },
> +
> +  /**
> +   * Allows waiting for TelemetryPings delayed initialization to complete.
> +   * This will complete before TelemetryPing is shutting down.

That sentence is not quite clear. Does this mean that the Promise is guaranteed to resolve before we start shutting down?

Also, the convention I have seen in code would be to call this `get promiseInitialized()`.

@@ +325,5 @@
> +   *                    { id: <string>,
> +   *                      timestampCreated: <number>,
> +   *                      type: <string> }
> +   */
> +  getArchivedPingList: function() {

If it returns a promise, I'd call this `promiseArchivedPingList` to avoid confusions. Same below.

@@ +330,5 @@
> +    return Impl.getArchivedPingList();
> +  },
> +
> +  /**
> +   * This loads an archived ping async from disk by id.

"Load an archived ping from disk by id, asynchronously."

Generally, it is a good idea to reserve word "this" for `this` object.

@@ +368,5 @@
>    _pendingPingRequests: new Map(),
>  
> +  // This tracks the archived pings in a Map of (id -> {timestampCreated, type}).
> +  // We use this to cache info on archived pings to avoid scanning the disk more than once.
> +  _archivedPings: new Map(),

Isn't this already initialized by `setupTelemetry`?

@@ +928,5 @@
>            yield OS.File.makeDir(DATAREPORTING_PATH, {ignoreExisting: true}).catch(reportError);
>            yield OS.File.makeDir(gPingsArchivePath, {ignoreExisting: true}).catch(reportError);
> +
> +          yield this._scanArchivedPingDirectory()
> +                    .catch((e) => this._log.error("setupTelemetry - failure scanning archived ping directory"));

Could you also log the error itself?

@@ +1127,5 @@
> +   * @param fileName {String} The filename.
> +   * @return {Object} An object with the extracted data in the form:
> +   *                  { timestamp: <number>,
> +   *                    id: <string>,
> +   *                    type: <string> }

Please document in which case this can be `null`.

@@ +1179,5 @@
> +    this._log.trace("_scanArchivedPingDirectory");
> +
> +    let iterator = new OS.File.DirectoryIterator(gPingsArchivePath);
> +    let subdirs = yield iterator.nextBatch();
> +    subdirs = subdirs.filter(e => e.isDir);

I'd combine both in a single statement:

let subdirs = (yield iterator.nextBatch()).
  filter(e => e.isDir);

This avoids having an intermediate state in which `subdirs` is not the list of subdirectories. Alternatively, you could just iterate

for (let entry of (yield iterator.nextBatch()) {
  if (!entry.isDir) {
    continue;
  }
}

@@ +1190,5 @@
> +        continue;
> +      }
> +
> +      this._log.trace("_scanArchivedPingDirectory - checking in subdir: " + dir.path);
> +      iterator = new OS.File.DirectoryIterator(dir.path);

Please don't reuse variables like this.

@@ +1192,5 @@
> +
> +      this._log.trace("_scanArchivedPingDirectory - checking in subdir: " + dir.path);
> +      iterator = new OS.File.DirectoryIterator(dir.path);
> +      let pings = yield iterator.nextBatch();
> +      pings = pings.filter(e => !e.isDir);

Here, too, please combine this in a single statement.

@@ +1197,5 @@
> +
> +      // Now process any ping files of the form "<timestamp>.<uuid>.<type>.json"
> +      for (let p of pings) {
> +        let data = this._getArchivedPingDataFromFileName(p.name);
> +        if (!data) {

A line of comment explaining why `data` can be `null`?

@@ +1230,5 @@
> +   */
> +  getArchivedPingList: function() {
> +    this._log.trace("getArchivedPingList");
> +
> +    let list = [for (p of this._archivedPings) {

`for ([id, ping] of this._archivedPings)`

::: toolkit/components/telemetry/tests/unit/test_PingAPI.js
@@ +50,5 @@
> +      id: data.id,
> +      type: data.type,
> +      timestampCreated: data.dateCreated.getTime(),
> +    });
> +    Assert.deepEqual(list, expectedPingList);

Nit: It's a good habit to add a third argument to Assert calls explaining what we're testing. Here and below.
Attachment #8590224 - Flags: review?(dteller) → review+
Blocks: 1120380
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #11)
> Note: in the future, could you use reviewboard? It makes it easier to find
> out if code has just moved around or has been changed.

I liked RB from testing, but my impression was that the integration is not bug-free yet?

> ::: toolkit/components/telemetry/TelemetryFile.jsm
> @@ +294,5 @@
> > +    if (typeof(ping.payload) == "string") {
> > +      ping.payload = JSON.parse(ping.payload);
> > +    }
> > +    return ping;
> > +  }),
> 
> I'm assuming that this code has just moved around, right?

Indeed.

> ::: toolkit/components/telemetry/TelemetryPing.jsm
> @@ +307,5 @@
> > +  },
> > +
> > +  /**
> > +   * Allows waiting for TelemetryPings delayed initialization to complete.
> > +   * This will complete before TelemetryPing is shutting down.
> 
> That sentence is not quite clear. Does this mean that the Promise is
> guaranteed to resolve before we start shutting down?

Yes, will make it clearer.

> @@ +325,5 @@
> > +   *                    { id: <string>,
> > +   *                      timestampCreated: <number>,
> > +   *                      type: <string> }
> > +   */
> > +  getArchivedPingList: function() {
> 
> If it returns a promise, I'd call this `promiseArchivedPingList` to avoid
> confusions. Same below.

I've seen different naming before, is it actually nice to leak that into the method name?

> @@ +368,5 @@
> >    _pendingPingRequests: new Map(),
> >  
> > +  // This tracks the archived pings in a Map of (id -> {timestampCreated, type}).
> > +  // We use this to cache info on archived pings to avoid scanning the disk more than once.
> > +  _archivedPings: new Map(),
> 
> Isn't this already initialized by `setupTelemetry`?

Right, we need to init it there so we can re-init it for tests.

> @@ +1230,5 @@
> > +   */
> > +  getArchivedPingList: function() {
> > +    this._log.trace("getArchivedPingList");
> > +
> > +    let list = [for (p of this._archivedPings) {
> 
> `for ([id, ping] of this._archivedPings)`

Sadly that doesn't work within array comprehensions.
Depends on: 1154717
Depends on: 1154856
Addressed review comments.
Attachment #8590224 - Attachment is obsolete: true
Attachment #8593300 - Flags: review+
Note: this can be folded with part 4 for landing to make history less confusing.
Attachment #8593582 - Flags: review?(vdjeric)
Note: this could be folded with part 4 for landing to make history less confusing.

This refactors the archiving parts to the new design we decided on (with one or two liberties taken):
https://docs.google.com/document/d/1tWwoqqr9aSewoye0TjLDRpolVOeEtzpVnw88ykbo9lk/edit?pli=1#heading=h.tj7p7jz0e834

It introduces TelemetryArchive.jsm and distributes the archiving implementation between that and TelemetryStorage.jsm
Attachment #8593585 - Flags: review?(vdjeric)
I noticed along the way that we enabled archiving everywhere.
I don't think that's great, e.g. we don't use the archived data on Fennec and surely don't want to take the storage hit there?
Attachment #8593589 - Flags: review?(vdjeric)
Attachment #8593589 - Attachment is obsolete: true
Attachment #8593589 - Flags: review?(vdjeric)
Attachment #8594009 - Flags: review?(vdjeric)
Attachment #8593582 - Flags: review?(vdjeric) → review+
Attachment #8594009 - Flags: review?(vdjeric) → review+
Comment on attachment 8593585 [details] [diff] [review]
Part 6 - Refactor the archiving implementation

Review of attachment 8593585 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryArchive.jsm
@@ +113,5 @@
> +      }
> +    }
> +
> +    const creationDate = new Date(ping.creationDate);
> +    this._archivedPings.set(ping.id, {

warn on overwrite?

@@ +133,5 @@
> +
> +    return list;
> +  },
> +
> +  promiseArchivedPingList: function() {

how does this module behave on shutdown?

::: toolkit/components/telemetry/TelemetryStorage.jsm
@@ +107,5 @@
> +   * pings out of their filename.
> +   *
> +   * @return {promise<sequence<object>>}
> +   */
> +  loadArchivedPingList: function() {

you dropped the "promise.." prefix?

@@ +121,5 @@
> +   * if |false| the file will not be overwritten and no error will be reported if
> +   * the file exists.
> +   * @returns {promise}
> +   */
> +  savePingToFile: function(ping, file, overwrite) {

couldn't you join savePingToFile and savePing into a single method with an optional filePath argument?

@@ +177,5 @@
> +   * through |popPendingPings|.
> +   *
> +   * @returns {promise}
> +   */
> +  loadSavedPings: function() {

should TelemetryStorage really load all saved pings? how about if TelemetryStorage built a list of unexpired saved pings on startup (at TelemetryController's behest) and then TelemetryController can ask TelemetryStorage to load the ping contents on demand. That way we avoid the memory overhead of loading all pending pings on startup

@@ +190,5 @@
> +   *
> +   * @param {string} file The file to load.
> +   * @returns {promise}
> +   */
> +  loadHistograms: function loadHistograms(file) {

can we rename this method? the pings have way more data than just histograms these days

@@ +197,5 @@
> +
> +  /**
> +   * The number of pings loaded since the beginning of time.
> +   */
> +  get pingsLoaded() {

rename this method if it's really since the beginning of time

@@ +314,5 @@
> +      this._log.trace("loadArchivedPingList - checking in subdir: " + dir.path);
> +      let pingIterator = new OS.File.DirectoryIterator(dir.path);
> +      let pings = (yield pingIterator.nextBatch()).filter(e => !e.isDir);
> +
> +      // Now process any ping files of the form "<timestamp>.<uuid>.<type>.json"

i'm wondering if we should have put a version number in the filename too ..<type>.v1.json in case we change the filename format or ping-content format in the future

i'm on the fence about this

@@ +323,5 @@
> +          continue;
> +        }
> +
> +        // In case of conflicts, overwrite only with newer pings.
> +        if (archivedPings.has(data.id)) {

should we clean up the on-disk state as well?
Attachment #8593585 - Flags: review?(vdjeric)
Blocks: 1156253
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #21)
> @@ +133,5 @@
> > +
> > +    return list;
> > +  },
> > +
> > +  promiseArchivedPingList: function() {
> 
> how does this module behave on shutdown?
> 
> ::: toolkit/components/telemetry/TelemetryStorage.jsm
> @@ +107,5 @@
> > +   * pings out of their filename.
> > +   *
> > +   * @return {promise<sequence<object>>}
> > +   */
> > +  loadArchivedPingList: function() {
> 
> you dropped the "promise.." prefix?

I have the promise prefix in the public API of TelemetryArchive, shall we do that here too?
It's currently not consistent with all the other internal methods in the Telemetry modules.

> @@ +121,5 @@
> > +   * if |false| the file will not be overwritten and no error will be reported if
> > +   * the file exists.
> > +   * @returns {promise}
> > +   */
> > +  savePingToFile: function(ping, file, overwrite) {
> 
> couldn't you join savePingToFile and savePing into a single method with an
> optional filePath argument?

I think it's easier to track of this for now with an explicit different function name.
We will remove this soon when we make TelemetryStorage know about aborted & pending pings too.

> @@ +177,5 @@
> > +   * through |popPendingPings|.
> > +   *
> > +   * @returns {promise}
> > +   */
> > +  loadSavedPings: function() {
> 
> should TelemetryStorage really load all saved pings? how about if
> TelemetryStorage built a list of unexpired saved pings on startup (at
> TelemetryController's behest) and then TelemetryController can ask
> TelemetryStorage to load the ping contents on demand. That way we avoid the
> memory overhead of loading all pending pings on startup

Yes, that is what i suggest, but i don't think it should be part of this particular patch to change the existing behavior.
Bug 1041616 would probably fit well?

> @@ +190,5 @@
> > +   *
> > +   * @param {string} file The file to load.
> > +   * @returns {promise}
> > +   */
> > +  loadHistograms: function loadHistograms(file) {
> 
> can we rename this method? the pings have way more data than just histograms
> these days

Can we keep other cleanups out of this patch?
I'm just moving historic code here and would like to keep cleanup that is not directly related to other patches or bugs.

> @@ +197,5 @@
> > +
> > +  /**
> > +   * The number of pings loaded since the beginning of time.
> > +   */
> > +  get pingsLoaded() {
> 
> rename this method if it's really since the beginning of time

Later patch/bug?
 
> @@ +314,5 @@
> > +      this._log.trace("loadArchivedPingList - checking in subdir: " + dir.path);
> > +      let pingIterator = new OS.File.DirectoryIterator(dir.path);
> > +      let pings = (yield pingIterator.nextBatch()).filter(e => !e.isDir);
> > +
> > +      // Now process any ping files of the form "<timestamp>.<uuid>.<type>.json"
> 
> i'm wondering if we should have put a version number in the filename too
> ..<type>.v1.json in case we change the filename format or ping-content
> format in the future
> 
> i'm on the fence about this

Hm, i think if we really needed to change it in the future, we could just start adding versioning then (and fallback to the old behavior if we don't find a vN part where we expect it).

> @@ +323,5 @@
> > +          continue;
> > +        }
> > +
> > +        // In case of conflicts, overwrite only with newer pings.
> > +        if (archivedPings.has(data.id)) {
> 
> should we clean up the on-disk state as well?

Yes, that sounds better then accumulating rejected pings.
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #21)
> @@ +133,5 @@
> > +
> > +    return list;
> > +  },
> > +
> > +  promiseArchivedPingList: function() {
> 
> how does this module behave on shutdown?

Both this module and TelemetryStorage don't have much state and this just returns a cached list of ping info.
This should not have behavior that we have to worry about around shutdown.
Do you have any specific concerns here?
Attachment #8593585 - Attachment is obsolete: true
Attachment #8594788 - Flags: review?(vdjeric)
Blocks: 1156355
Blocks: 1156358
I moved some of the points from comment 21 over to bug 1156355.
> I have the promise prefix in the public API of TelemetryArchive, shall we do
> that here too?
> It's currently not consistent with all the other internal methods in the
> Telemetry modules.

Your call I guess, I'm ok with using the promise prefix for external interfaces only

> Yes, that is what i suggest, but i don't think it should be part of this
> particular patch to change the existing behavior.
> Bug 1041616 would probably fit well?
-- 
> Can we keep other cleanups out of this patch?
> I'm just moving historic code here and would like to keep cleanup that is
> not directly related to other patches or bugs.
--
> Later patch/bug?

Ok
Comment on attachment 8594788 [details] [diff] [review]
Part 6: Refactor archiving to the new Telemetry module design

Review of attachment 8594788 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Georg Fritzsche [:gfritzsche] from comment #23)
> Both this module and TelemetryStorage don't have much state and this just
> returns a cached list of ping info.
> This should not have behavior that we have to worry about around shutdown.
> Do you have any specific concerns here?

Mainly wondering what happens to the write operations at shutdown. OS.File.makeDir/writeAtomic/remove
Attachment #8594788 - Flags: review?(vdjeric) → review+
Or would that be caught by OS.File's asyncShutdown blocker?
(In reply to Vladan Djeric (:vladan), please needinfo -- PTO April 25-May 3 from comment #28)
> Or would that be caught by OS.File's asyncShutdown blocker?

Yoric confirms that the OS.File implementation takes care of that (and immediately rejects calls that are too late).
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Keywords: leave-open
Resolution: --- → FIXED
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: