Closed Bug 827157 Opened 11 years ago Closed 11 years ago

Refactor sessions provider to count ticks

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect)

defect
Not set
normal

Tracking

(firefox20 fixed, firefox21 fixed)

RESOLVED FIXED
Firefox 21
Tracking Status
firefox20 --- fixed
firefox21 --- fixed

People

(Reporter: gps, Assigned: gps)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 1 obsolete file)

+++ This bug was initially created as a clone of Bug #826893 +++

We currently count time intervals. I think we should count active notification ticks instead. Much simpler implementations.
Whiteboard: [fixed in services]
And we should only insert every minute or something instead of every 5 seconds.
Depends on: 828152
Blocks: 828152
No longer depends on: 828152
Blocks: 829887
Depends on: 830495
Attached patch Refactored session provider, v1 (obsolete) — Splinter Review
While we don't yet have requirements from Product and Metrics, I'm hedging on certain outcomes and have implemented changes with performance in mind.

There are 2 major changes to this patch:

1) Current session state is recorded in preferences rather than the database.
2) Completed sessions are recorded to the database on provider initialization, not shutdown.

Everything that was being recorded under the "current session" measurement is now being recorded in prefs. This includes active time. Yes, we still update a pref every 5s if the user is active in the browser. But, it's a pref. I don't believe we're concerned about updating prefs every 5s. Please correct me if I am wrong about performance implications.

On shutdown, we store a boolean flag in prefs indicating that shutdown was clean. No database writes occur on shutdown!

On startup, we read state from the last session's prefs and perform the database writes necessary to persist it. We then clear out the old prefs and write new values for the current session.

Things to consider:

1) We still don't count short-lived sessions that did not last until FHR was initialized. This problem will get worse as we delay FHR initialization (bug 830489).

2) We should probably delete the old measurements from the database. We don't yet have APIs for that. In the mean time, having them linger doesn't hurt anything.

3) I don't like hardcoding the pref branch in the constructor. We should probably expose a per-provider Preferences object in the base class that's automatically attached to the provider's name.

Try at https://tbpl.mozilla.org/?tree=Try&rev=bab3de49675b. I expect it to recover all of the lost shutdown time.
Attachment #703072 - Flags: feedback?(vdjeric)
Attachment #703072 - Flags: feedback?(rnewman)
(In reply to Gregory Szorc [:gps] from comment #2)

> 1) We still don't count short-lived sessions that did not last until FHR was
> initialized. This problem will get worse as we delay FHR initialization (bug
> 830489).

Incrementally store rudimentary session information in prefs, using a component that *is* initialized fairly early. The next time FHR *does* get initialized, at least it knows about all the previous short sessions.

I tend to think we should have some kind of service that stores session information, rather than hacking it together for FHR, but minimal-fg.


> 2) We should probably delete the old measurements from the database. We
> don't yet have APIs for that. In the mean time, having them linger doesn't
> hurt anything.

Do you mean six-month expiry, or something more complex like rollup?
Comment on attachment 703072 [details] [diff] [review]
Refactored session provider, v1

Review of attachment 703072 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/healthreport/providers.jsm
@@ +430,5 @@
>      return Task.spawn(this._onInit.bind(this));
>    },
>  
>    _onInit: function () {
> +    // TODO remove old current measurement from the database.

Oh, I see what you meant.

This is why I wanted to have migration -- migration can be run only when appropriate, whereas cleanup code runs every time :/

@@ +466,5 @@
>      }
>  
> +    this._log.debug("Recording clean shutdown.");
> +    this._prefs.set("current.clean", true);
> +    this._prefs.set("current.totalTime", new Date() - this._startDate);

Write 'clean' last.

Use `Date.now()` rather than `new Date()` when you're performing arithmetic.

@@ +494,4 @@
>      let updateActive = active && !this._lastActivityWasInactive;
>      this._lastActivityWasInactive = !active;
>  
> +    this._prefs.set("current.totalTime", new Date() - this._startDate);

Same.

@@ +526,5 @@
> +      }
> +    }.bind(this);
> +
> +    if (!startDay) {
> +      this._log.info("No previous session recoded. Is this the first run?");

s/recoded/recorded
Attachment #703072 - Flags: feedback?(rnewman) → feedback+
I built a standalone component that captures session history independent of FHR. It records data for all sessions, even very short ones. It records active ticks for short sessions. No data is lost. Yes, no data is lost. Bug 830495 is irrelevant with this patch. It uses preferences to store things. Although, this is completely abstracted away. If someone wants to reimplement it in C++ or something, fine by me. I also think that long term the code could live as part of session restore. Follow-up bugs. The import consideration is whether prefs are acceptable for this storage. I think they are.

There are now no session-related database queries at startup or shutdown. Instead, data is moved from preferences to SQLite at payload generation time.

Data stored in preferences is pruned when it is moved to SQLite. We only record data in preferences when FHR is enabled. To protect against unbounded addition of new data in preferences, we prune data older than 7 days on "idle-daily" notification. This covers the case where FHR is disabled. Yes, this means that if FHR is enabled we will only have 7 days worth of session history. This is a limitation of how FHR's data collection is currently upload-driven. If we instituted a periodic timer to drive data aggregation, this limitation would go away. Follow-up bug. Not require blocker, IMO.

There are also a number of smaller changes in this patch. They could have been split out into separate patch parts. It's a Friday. 0fg.

mconnor is planning uplift to Aurora Saturday AFAIK. I'd love to see this make the uplift since I'm very confident this will revert most of the Talos regressions.
Assignee: nobody → gps
Attachment #703072 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #703072 - Flags: feedback?(vdjeric)
Attachment #704140 - Flags: review?(rnewman)
I will review this tonight.
Comment on attachment 704140 [details] [diff] [review]
A session provider that doesn't suck, v1

Review of attachment 704140 [details] [diff] [review]:
-----------------------------------------------------------------

I don't see anything obviously wrong with this.

One addition I'd like to see: reporting how long a tick is. This seems like important context in case it changes or varies by OS! :D

::: services/datareporting/sessions.jsm
@@ +110,5 @@
> +    return this._prefs.get("current.totalTime", 0);
> +  },
> +
> +  updateTotalTime: function () {
> +    this._prefs.set("current.totalTime", Date.now() - this.startDate.getTime());

This works, btw:

  Date.now() - this.startDate;

@@ +122,5 @@
> +    if (!Number.isInteger(value)) {
> +      throw new Error("main time must be an integer.");
> +    }
> +
> +    this._prefs.set("current.main", value);

Let's have a comment to explain why it's safe to store these ints in prefs, but it's not to store the dates.

@@ +179,5 @@
> +      if (!s) {
> +        continue;
> +      }
> +
> +      if (s.startDate.getTime() >= date.getTime()) {

Dates are directly comparable.

@@ +240,5 @@
> +    this.updateTotalTime();
> +
> +    this.recordMain();
> +    this.recordFirstPaint();
> +    this.recordSessionRestored();

Just inline these three functions.

@@ +256,5 @@
> +    if (!updateActive) {
> +      return;
> +    }
> +
> +    this.incrementActiveTicks();

Much as I love early return:

  ...
  if (updateActive) {
    this.incrementActiveTicks();
  }
}

@@ +272,5 @@
> +  },
> +
> +  // This is meant to be called only during onStartup().
> +  _moveCurrentToPrevious: function () {
> +    let clearCurrent = function () {

Add a comment that this marks us as 'dirty', because one of the prefs we clear is 'clean'.

@@ +300,5 @@
> +    };
> +
> +    this._log.debug("Recording last sessions as #" + count + ".");
> +    this._prefs.set("previous." + count, JSON.stringify(obj));
> +    clearCurrent();

OK, so what happens if that pref write fails (e.g., disk full)? We never clear current, and this whole thing ends poorly.

Can we add some fault tolerance here -- a try..finally from line 284 to 303, with the finally block being the body of clearCurrent? (It doesn't need to be a function in that case.)
Attachment #704140 - Flags: review?(rnewman) → review+
(In reply to Richard Newman [:rnewman] from comment #7)

Thanks for the quick review! On a Friday no less! beerDebt[rnewman]++.

> One addition I'd like to see: reporting how long a tick is. This seems like
> important context in case it changes or varies by OS! :D

I agree conceptually. However, we don't have access to it:

Notification origin: https://mxr.mozilla.org/mozilla-central/source/content/events/src/nsEventStateManager.cpp#241
Timer creation: https://mxr.mozilla.org/mozilla-central/source/content/events/src/nsEventStateManager.cpp#732
Interval declaration: https://mxr.mozilla.org/mozilla-central/source/content/events/src/nsEventStateManager.cpp#116

We'll need the C++ changed to get access to the interval. Sure, I could estimate it. But, that feels hacky. I'd rather do it right. I think Metrics can get by on mapping app version to interval until then.

...

I have implemented your other comments and will commit shortly.
I love it when a plan comes together.

https://hg.mozilla.org/services/services-central/rev/3e7e06804ae4
Whiteboard: [fixed in services]
Comment on attachment 704140 [details] [diff] [review]
A session provider that doesn't suck, v1

[Approval Request Comment]

General FHR Aurora uplift request. This is purely backend code. This patch is critical to FHR for performance reasons.

Should make it to m-c within a few hours or by Saturday afternoon assuming tests all pass.
Attachment #704140 - Flags: approval-mozilla-aurora?
The previous patch regressed this due to HealthReporter calling the wrong API. You are correct that the test only tests a subset of the fix. A thorough test would be much more involved. For now I've verified manually.

The thorough test will come with mochitest or similar testing of the XPCOM service and therefore falls under the umbrella of another bug.
Attachment #704140 - Attachment is obsolete: true
Attachment #704140 - Flags: approval-mozilla-aurora?
Attachment #704181 - Flags: review?(rnewman)
Comment on attachment 704140 [details] [diff] [review]
A session provider that doesn't suck, v1

The beer checked the obsolete checkbox.
Attachment #704140 - Attachment is obsolete: false
Attachment #704140 - Flags: approval-mozilla-aurora?
Comment on attachment 704181 [details] [diff] [review]
Send current session info with request, v1

Review of attachment 704181 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/healthreport/providers.jsm
@@ +396,5 @@
> +  _serializeJSONSingular: function (data) {
> +    let result = {};
> +
> +    for (let [field, value] of data) {
> +      result[field] = value[1];

I think what you're doing here is taking the second part of the value from

   fields.set("sessionRestored", [now, sessions.sessionRestored]);

If so, it'd be nice to get an example in a function comment, or a one-line comment explaining what this '1' is for.

If not, a comment seems even more necessary :)
Attachment #704181 - Flags: review?(rnewman) → review+
Follow-up:

https://hg.mozilla.org/services/services-central/rev/d15625b7c9c7

You can land a follow-up follow-up with a comment at your leisure :D
Comment on attachment 704181 [details] [diff] [review]
Send current session info with request, v1

Also flagging follow-up for Aurora.
Attachment #704181 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/3e7e06804ae4
https://hg.mozilla.org/mozilla-central/rev/d15625b7c9c7
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [fixed in services]
Target Milestone: --- → mozilla21
David: You've been doing a lot of session restore foo lately and may want to look at https://hg.mozilla.org/mozilla-central/file/default/services/datareporting/sessions.jsm. It's not session restore specific. But, it's something I could easily see living in the session management code one day. We landed it in /services/datareporting due to brevity. If you think it belongs elsewhere, I have few objections to moving it or transferring ownership to a different module.
Comment on attachment 704140 [details] [diff] [review]
A session provider that doesn't suck, v1

Approving these patches per comment# 10 and considering FHR is a new feature landing in FF20.
Attachment #704140 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #704181 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Blocks: 833612
Component: Metrics and Firefox Health Report → Client: Desktop
Product: Mozilla Services → Firefox Health Report
Target Milestone: mozilla21 → Firefox 21
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: