Allow accessing archived pings via the TelemetryPing public API

RESOLVED FIXED

Status

Firefox Health Report
Client: Desktop
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: gfritzsche, Assigned: gfritzsche)

Tracking

(Blocks: 1 bug)

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox40 fixed)

Details

Attachments

(7 attachments, 3 obsolete attachments)

16.88 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
1.72 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
1.19 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
21.43 KB, patch
gfritzsche
: review+
Details | Diff | Splinter Review
35.47 KB, patch
vladan
: review+
Details | Diff | Splinter Review
3.27 KB, patch
vladan
: review+
Details | Diff | Splinter Review
50.93 KB, patch
vladan
: review+
Details | Diff | Splinter Review
(Assignee)

Description

3 years ago
To e.g. enable about:healthreport and about:telemetry we need to enable accessing the archived pings.

The current minimal working approach would be to add 2 methods:
* getArchivedPingList() - returning an array of ping info objects of the form:
  {
    id: <string>, // UUID that identifies the ping.
    type: <string>, // Type of the ping, e.g. "main"
    submissionDate: <Date>, // Date/time this ping was submitted to the Telemetry system.
  }
* getArchivedPingsById(ids) - returns an array of objects with the full ping data for the ping ids passed.
(Assignee)

Updated

3 years ago
Blocks: 1149754
(Assignee)

Updated

3 years ago
Duplicate of this bug: 1150030
(Assignee)

Updated

3 years ago
Depends on: 1137252
(Assignee)

Updated

3 years ago
Assignee: nobody → gfritzsche
Status: NEW → ASSIGNED
(Assignee)

Comment 2

3 years ago
Created attachment 8590220 [details] [diff] [review]
Part 1 - Make Components.* shorthand shared between Telemetry tests
Attachment #8590220 - Flags: review?(dteller)
(Assignee)

Comment 3

3 years ago
Created attachment 8590221 [details] [diff] [review]
Part 2 - Improve the fakeNow() test helper
Attachment #8590221 - Flags: review?(dteller)
(Assignee)

Comment 4

3 years ago
Created attachment 8590222 [details] [diff] [review]
Part 3 - Avoid hitting the network from Telemetry unit tests
Attachment #8590222 - Flags: review?(dteller)
(Assignee)

Comment 5

3 years ago
Created attachment 8590224 [details] [diff] [review]
Part 4 - Implement API to access the archive pings

This adds:
* scanning the ping archive on delayed startup
* caching the archived ping info so we only need to scan once
* making the ping archive info accessible
* allow loading an archived ping by id from disk
Attachment #8590224 - Flags: review?(dteller)
(Assignee)

Comment 6

3 years ago
Hm, i'm wondering whether its maybe better to not always scan on delayed startup and instead defer scanning the archived pings until the first request.
Attachment #8590220 - Flags: review?(dteller) → review+
Comment on attachment 8590221 [details] [diff] [review]
Part 2 - Improve the fakeNow() test helper

Review of attachment 8590221 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/tests/unit/head.js
@@ +108,5 @@
>    session.Policy.now = () => date;
>    let environment = Cu.import("resource://gre/modules/TelemetryEnvironment.jsm");
>    environment.Policy.now = () => date;
> +
> +  return new Date(date);

Why not `date` itself?
Attachment #8590221 - Flags: review?(dteller) → review+
Comment on attachment 8590222 [details] [diff] [review]
Part 3 - Avoid hitting the network from Telemetry unit tests

Review of attachment 8590222 [details] [diff] [review]:
-----------------------------------------------------------------

Isn't that already done in testing/profiles/prefs_general.js#234 ?

Other than that, r+.
Attachment #8590222 - Flags: review?(dteller) → review+
(Assignee)

Comment 9

3 years ago
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #7)
> Comment on attachment 8590221 [details] [diff] [review]
> Part 2 - Improve the fakeNow() test helper
> 
> Review of attachment 8590221 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/telemetry/tests/unit/head.js
> @@ +108,5 @@
> >    session.Policy.now = () => date;
> >    let environment = Cu.import("resource://gre/modules/TelemetryEnvironment.jsm");
> >    environment.Policy.now = () => date;
> > +
> > +  return new Date(date);
> 
> Why not `date` itself?

I was worried that someone might change the state of `date` and would run into hard-to-trace-down test issues.
It seemed easier to just eliminate that by returning a clone.
(Assignee)

Comment 10

3 years ago
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #8)
> Comment on attachment 8590222 [details] [diff] [review]
> Part 3 - Avoid hitting the network from Telemetry unit tests
> 
> Review of attachment 8590222 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Isn't that already done in testing/profiles/prefs_general.js#234 ?
> 
> Other than that, r+.

That addresses mochitests only - NS_ERROR_TOO_MANY_CONFIG_LOCATIONS:
https://developer.mozilla.org/en-US/docs/Mozilla/QA/Automated_testing#Need_to_set_preferences_for_test-suites.3F
Comment on attachment 8590224 [details] [diff] [review]
Part 4 - Implement API to access the archive pings

Review of attachment 8590224 [details] [diff] [review]:
-----------------------------------------------------------------

Note: in the future, could you use reviewboard? It makes it easier to find out if code has just moved around or has been changed.

::: toolkit/components/telemetry/TelemetryFile.jsm
@@ +294,5 @@
> +    if (typeof(ping.payload) == "string") {
> +      ping.payload = JSON.parse(ping.payload);
> +    }
> +    return ping;
> +  }),

I'm assuming that this code has just moved around, right?

::: toolkit/components/telemetry/TelemetryPing.jsm
@@ +307,5 @@
> +  },
> +
> +  /**
> +   * Allows waiting for TelemetryPings delayed initialization to complete.
> +   * This will complete before TelemetryPing is shutting down.

That sentence is not quite clear. Does this mean that the Promise is guaranteed to resolve before we start shutting down?

Also, the convention I have seen in code would be to call this `get promiseInitialized()`.

@@ +325,5 @@
> +   *                    { id: <string>,
> +   *                      timestampCreated: <number>,
> +   *                      type: <string> }
> +   */
> +  getArchivedPingList: function() {

If it returns a promise, I'd call this `promiseArchivedPingList` to avoid confusions. Same below.

@@ +330,5 @@
> +    return Impl.getArchivedPingList();
> +  },
> +
> +  /**
> +   * This loads an archived ping async from disk by id.

"Load an archived ping from disk by id, asynchronously."

Generally, it is a good idea to reserve word "this" for `this` object.

@@ +368,5 @@
>    _pendingPingRequests: new Map(),
>  
> +  // This tracks the archived pings in a Map of (id -> {timestampCreated, type}).
> +  // We use this to cache info on archived pings to avoid scanning the disk more than once.
> +  _archivedPings: new Map(),

Isn't this already initialized by `setupTelemetry`?

@@ +928,5 @@
>            yield OS.File.makeDir(DATAREPORTING_PATH, {ignoreExisting: true}).catch(reportError);
>            yield OS.File.makeDir(gPingsArchivePath, {ignoreExisting: true}).catch(reportError);
> +
> +          yield this._scanArchivedPingDirectory()
> +                    .catch((e) => this._log.error("setupTelemetry - failure scanning archived ping directory"));

Could you also log the error itself?

@@ +1127,5 @@
> +   * @param fileName {String} The filename.
> +   * @return {Object} An object with the extracted data in the form:
> +   *                  { timestamp: <number>,
> +   *                    id: <string>,
> +   *                    type: <string> }

Please document in which case this can be `null`.

@@ +1179,5 @@
> +    this._log.trace("_scanArchivedPingDirectory");
> +
> +    let iterator = new OS.File.DirectoryIterator(gPingsArchivePath);
> +    let subdirs = yield iterator.nextBatch();
> +    subdirs = subdirs.filter(e => e.isDir);

I'd combine both in a single statement:

let subdirs = (yield iterator.nextBatch()).
  filter(e => e.isDir);

This avoids having an intermediate state in which `subdirs` is not the list of subdirectories. Alternatively, you could just iterate

for (let entry of (yield iterator.nextBatch()) {
  if (!entry.isDir) {
    continue;
  }
}

@@ +1190,5 @@
> +        continue;
> +      }
> +
> +      this._log.trace("_scanArchivedPingDirectory - checking in subdir: " + dir.path);
> +      iterator = new OS.File.DirectoryIterator(dir.path);

Please don't reuse variables like this.

@@ +1192,5 @@
> +
> +      this._log.trace("_scanArchivedPingDirectory - checking in subdir: " + dir.path);
> +      iterator = new OS.File.DirectoryIterator(dir.path);
> +      let pings = yield iterator.nextBatch();
> +      pings = pings.filter(e => !e.isDir);

Here, too, please combine this in a single statement.

@@ +1197,5 @@
> +
> +      // Now process any ping files of the form "<timestamp>.<uuid>.<type>.json"
> +      for (let p of pings) {
> +        let data = this._getArchivedPingDataFromFileName(p.name);
> +        if (!data) {

A line of comment explaining why `data` can be `null`?

@@ +1230,5 @@
> +   */
> +  getArchivedPingList: function() {
> +    this._log.trace("getArchivedPingList");
> +
> +    let list = [for (p of this._archivedPings) {

`for ([id, ping] of this._archivedPings)`

::: toolkit/components/telemetry/tests/unit/test_PingAPI.js
@@ +50,5 @@
> +      id: data.id,
> +      type: data.type,
> +      timestampCreated: data.dateCreated.getTime(),
> +    });
> +    Assert.deepEqual(list, expectedPingList);

Nit: It's a good habit to add a third argument to Assert calls explaining what we're testing. Here and below.
Attachment #8590224 - Flags: review?(dteller) → review+
(Assignee)

Updated

3 years ago
Blocks: 1120380
(Assignee)

Comment 12

3 years ago
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #11)
> Note: in the future, could you use reviewboard? It makes it easier to find
> out if code has just moved around or has been changed.

I liked RB from testing, but my impression was that the integration is not bug-free yet?

> ::: toolkit/components/telemetry/TelemetryFile.jsm
> @@ +294,5 @@
> > +    if (typeof(ping.payload) == "string") {
> > +      ping.payload = JSON.parse(ping.payload);
> > +    }
> > +    return ping;
> > +  }),
> 
> I'm assuming that this code has just moved around, right?

Indeed.

> ::: toolkit/components/telemetry/TelemetryPing.jsm
> @@ +307,5 @@
> > +  },
> > +
> > +  /**
> > +   * Allows waiting for TelemetryPings delayed initialization to complete.
> > +   * This will complete before TelemetryPing is shutting down.
> 
> That sentence is not quite clear. Does this mean that the Promise is
> guaranteed to resolve before we start shutting down?

Yes, will make it clearer.

> @@ +325,5 @@
> > +   *                    { id: <string>,
> > +   *                      timestampCreated: <number>,
> > +   *                      type: <string> }
> > +   */
> > +  getArchivedPingList: function() {
> 
> If it returns a promise, I'd call this `promiseArchivedPingList` to avoid
> confusions. Same below.

I've seen different naming before, is it actually nice to leak that into the method name?

> @@ +368,5 @@
> >    _pendingPingRequests: new Map(),
> >  
> > +  // This tracks the archived pings in a Map of (id -> {timestampCreated, type}).
> > +  // We use this to cache info on archived pings to avoid scanning the disk more than once.
> > +  _archivedPings: new Map(),
> 
> Isn't this already initialized by `setupTelemetry`?

Right, we need to init it there so we can re-init it for tests.

> @@ +1230,5 @@
> > +   */
> > +  getArchivedPingList: function() {
> > +    this._log.trace("getArchivedPingList");
> > +
> > +    let list = [for (p of this._archivedPings) {
> 
> `for ([id, ping] of this._archivedPings)`

Sadly that doesn't work within array comprehensions.
(Assignee)

Updated

3 years ago
Depends on: 1154717
(Assignee)

Updated

3 years ago
Depends on: 1154856
(Assignee)

Comment 15

3 years ago
Created attachment 8593300 [details] [diff] [review]
Part 4 - Implement API to access the archive pings

Addressed review comments.
Attachment #8590224 - Attachment is obsolete: true
Attachment #8593300 - Flags: review+
(Assignee)

Comment 16

3 years ago
Created attachment 8593582 [details] [diff] [review]
Part 5 - Rename TelemetryFile to TelemetryStorage

Note: this can be folded with part 4 for landing to make history less confusing.
Attachment #8593582 - Flags: review?(vdjeric)
(Assignee)

Comment 17

3 years ago
Created attachment 8593585 [details] [diff] [review]
Part 6 - Refactor the archiving implementation

Note: this could be folded with part 4 for landing to make history less confusing.

This refactors the archiving parts to the new design we decided on (with one or two liberties taken):
https://docs.google.com/document/d/1tWwoqqr9aSewoye0TjLDRpolVOeEtzpVnw88ykbo9lk/edit?pli=1#heading=h.tj7p7jz0e834

It introduces TelemetryArchive.jsm and distributes the archiving implementation between that and TelemetryStorage.jsm
Attachment #8593585 - Flags: review?(vdjeric)
(Assignee)

Comment 18

3 years ago
Created attachment 8593589 [details] [diff] [review]
Part 7 - Only enable archiving on Firefox Desktop

I noticed along the way that we enabled archiving everywhere.
I don't think that's great, e.g. we don't use the archived data on Fennec and surely don't want to take the storage hit there?
Attachment #8593589 - Flags: review?(vdjeric)
(Assignee)

Comment 19

3 years ago
Created attachment 8594009 [details] [diff] [review]
Part 7 - Only enable archiving on Firefox Desktop
Attachment #8593589 - Attachment is obsolete: true
Attachment #8593589 - Flags: review?(vdjeric)
Attachment #8594009 - Flags: review?(vdjeric)
Attachment #8593582 - Flags: review?(vdjeric) → review+
Attachment #8594009 - Flags: review?(vdjeric) → review+
Comment on attachment 8593585 [details] [diff] [review]
Part 6 - Refactor the archiving implementation

Review of attachment 8593585 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryArchive.jsm
@@ +113,5 @@
> +      }
> +    }
> +
> +    const creationDate = new Date(ping.creationDate);
> +    this._archivedPings.set(ping.id, {

warn on overwrite?

@@ +133,5 @@
> +
> +    return list;
> +  },
> +
> +  promiseArchivedPingList: function() {

how does this module behave on shutdown?

::: toolkit/components/telemetry/TelemetryStorage.jsm
@@ +107,5 @@
> +   * pings out of their filename.
> +   *
> +   * @return {promise<sequence<object>>}
> +   */
> +  loadArchivedPingList: function() {

you dropped the "promise.." prefix?

@@ +121,5 @@
> +   * if |false| the file will not be overwritten and no error will be reported if
> +   * the file exists.
> +   * @returns {promise}
> +   */
> +  savePingToFile: function(ping, file, overwrite) {

couldn't you join savePingToFile and savePing into a single method with an optional filePath argument?

@@ +177,5 @@
> +   * through |popPendingPings|.
> +   *
> +   * @returns {promise}
> +   */
> +  loadSavedPings: function() {

should TelemetryStorage really load all saved pings? how about if TelemetryStorage built a list of unexpired saved pings on startup (at TelemetryController's behest) and then TelemetryController can ask TelemetryStorage to load the ping contents on demand. That way we avoid the memory overhead of loading all pending pings on startup

@@ +190,5 @@
> +   *
> +   * @param {string} file The file to load.
> +   * @returns {promise}
> +   */
> +  loadHistograms: function loadHistograms(file) {

can we rename this method? the pings have way more data than just histograms these days

@@ +197,5 @@
> +
> +  /**
> +   * The number of pings loaded since the beginning of time.
> +   */
> +  get pingsLoaded() {

rename this method if it's really since the beginning of time

@@ +314,5 @@
> +      this._log.trace("loadArchivedPingList - checking in subdir: " + dir.path);
> +      let pingIterator = new OS.File.DirectoryIterator(dir.path);
> +      let pings = (yield pingIterator.nextBatch()).filter(e => !e.isDir);
> +
> +      // Now process any ping files of the form "<timestamp>.<uuid>.<type>.json"

i'm wondering if we should have put a version number in the filename too ..<type>.v1.json in case we change the filename format or ping-content format in the future

i'm on the fence about this

@@ +323,5 @@
> +          continue;
> +        }
> +
> +        // In case of conflicts, overwrite only with newer pings.
> +        if (archivedPings.has(data.id)) {

should we clean up the on-disk state as well?
Attachment #8593585 - Flags: review?(vdjeric)
(Assignee)

Updated

3 years ago
Blocks: 1156253
(Assignee)

Comment 22

3 years ago
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #21)
> @@ +133,5 @@
> > +
> > +    return list;
> > +  },
> > +
> > +  promiseArchivedPingList: function() {
> 
> how does this module behave on shutdown?
> 
> ::: toolkit/components/telemetry/TelemetryStorage.jsm
> @@ +107,5 @@
> > +   * pings out of their filename.
> > +   *
> > +   * @return {promise<sequence<object>>}
> > +   */
> > +  loadArchivedPingList: function() {
> 
> you dropped the "promise.." prefix?

I have the promise prefix in the public API of TelemetryArchive, shall we do that here too?
It's currently not consistent with all the other internal methods in the Telemetry modules.

> @@ +121,5 @@
> > +   * if |false| the file will not be overwritten and no error will be reported if
> > +   * the file exists.
> > +   * @returns {promise}
> > +   */
> > +  savePingToFile: function(ping, file, overwrite) {
> 
> couldn't you join savePingToFile and savePing into a single method with an
> optional filePath argument?

I think it's easier to track of this for now with an explicit different function name.
We will remove this soon when we make TelemetryStorage know about aborted & pending pings too.

> @@ +177,5 @@
> > +   * through |popPendingPings|.
> > +   *
> > +   * @returns {promise}
> > +   */
> > +  loadSavedPings: function() {
> 
> should TelemetryStorage really load all saved pings? how about if
> TelemetryStorage built a list of unexpired saved pings on startup (at
> TelemetryController's behest) and then TelemetryController can ask
> TelemetryStorage to load the ping contents on demand. That way we avoid the
> memory overhead of loading all pending pings on startup

Yes, that is what i suggest, but i don't think it should be part of this particular patch to change the existing behavior.
Bug 1041616 would probably fit well?

> @@ +190,5 @@
> > +   *
> > +   * @param {string} file The file to load.
> > +   * @returns {promise}
> > +   */
> > +  loadHistograms: function loadHistograms(file) {
> 
> can we rename this method? the pings have way more data than just histograms
> these days

Can we keep other cleanups out of this patch?
I'm just moving historic code here and would like to keep cleanup that is not directly related to other patches or bugs.

> @@ +197,5 @@
> > +
> > +  /**
> > +   * The number of pings loaded since the beginning of time.
> > +   */
> > +  get pingsLoaded() {
> 
> rename this method if it's really since the beginning of time

Later patch/bug?
 
> @@ +314,5 @@
> > +      this._log.trace("loadArchivedPingList - checking in subdir: " + dir.path);
> > +      let pingIterator = new OS.File.DirectoryIterator(dir.path);
> > +      let pings = (yield pingIterator.nextBatch()).filter(e => !e.isDir);
> > +
> > +      // Now process any ping files of the form "<timestamp>.<uuid>.<type>.json"
> 
> i'm wondering if we should have put a version number in the filename too
> ..<type>.v1.json in case we change the filename format or ping-content
> format in the future
> 
> i'm on the fence about this

Hm, i think if we really needed to change it in the future, we could just start adding versioning then (and fallback to the old behavior if we don't find a vN part where we expect it).

> @@ +323,5 @@
> > +          continue;
> > +        }
> > +
> > +        // In case of conflicts, overwrite only with newer pings.
> > +        if (archivedPings.has(data.id)) {
> 
> should we clean up the on-disk state as well?

Yes, that sounds better then accumulating rejected pings.
(Assignee)

Comment 23

3 years ago
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #21)
> @@ +133,5 @@
> > +
> > +    return list;
> > +  },
> > +
> > +  promiseArchivedPingList: function() {
> 
> how does this module behave on shutdown?

Both this module and TelemetryStorage don't have much state and this just returns a cached list of ping info.
This should not have behavior that we have to worry about around shutdown.
Do you have any specific concerns here?
(Assignee)

Comment 24

3 years ago
Created attachment 8594788 [details] [diff] [review]
Part 6: Refactor archiving to the new Telemetry module design
(Assignee)

Updated

3 years ago
Attachment #8593585 - Attachment is obsolete: true
(Assignee)

Updated

3 years ago
Attachment #8594788 - Flags: review?(vdjeric)
(Assignee)

Updated

3 years ago
Blocks: 1156355
(Assignee)

Updated

3 years ago
Blocks: 1156358
(Assignee)

Comment 25

3 years ago
I moved some of the points from comment 21 over to bug 1156355.
> I have the promise prefix in the public API of TelemetryArchive, shall we do
> that here too?
> It's currently not consistent with all the other internal methods in the
> Telemetry modules.

Your call I guess, I'm ok with using the promise prefix for external interfaces only

> Yes, that is what i suggest, but i don't think it should be part of this
> particular patch to change the existing behavior.
> Bug 1041616 would probably fit well?
-- 
> Can we keep other cleanups out of this patch?
> I'm just moving historic code here and would like to keep cleanup that is
> not directly related to other patches or bugs.
--
> Later patch/bug?

Ok
Comment on attachment 8594788 [details] [diff] [review]
Part 6: Refactor archiving to the new Telemetry module design

Review of attachment 8594788 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Georg Fritzsche [:gfritzsche] from comment #23)
> Both this module and TelemetryStorage don't have much state and this just
> returns a cached list of ping info.
> This should not have behavior that we have to worry about around shutdown.
> Do you have any specific concerns here?

Mainly wondering what happens to the write operations at shutdown. OS.File.makeDir/writeAtomic/remove
Attachment #8594788 - Flags: review?(vdjeric) → review+
Or would that be caught by OS.File's asyncShutdown blocker?
(Assignee)

Comment 29

3 years ago
(In reply to Vladan Djeric (:vladan), please needinfo -- PTO April 25-May 3 from comment #28)
> Or would that be caught by OS.File's asyncShutdown blocker?

Yoric confirms that the OS.File implementation takes care of that (and immediately rejects calls that are too late).
(Assignee)

Updated

3 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Keywords: leave-open
Resolution: --- → FIXED
(Assignee)

Updated

3 years ago
status-firefox40: affected → fixed
You need to log in before you can comment on or make changes to this bug.