Closed Bug 1139751 Opened 9 years ago Closed 9 years ago

Try to collect data for Telemetry pings when the user is idle

Categories

(Toolkit :: Telemetry, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla40
Tracking Status
firefox38 --- unaffected
firefox39 --- fixed
firefox40 --- fixed

People

(Reporter: vladan, Assigned: gfritzsche)

References

Details

Attachments

(2 files, 3 obsolete files)

Telemetry can cause jank when collecting certain types of data (e.g. gatherMemory() afair), so we should try to do collect data when the user is idle.

So for example, instead of collecting the "daily" ping at midnight local time, we could schedule a timer for 11pm local time and then collect data when we get the next "user-idle" notification. If the user isn't idle for the entire hour, then we can collect it at midnight as originally intended.
gatherMemory is only for telemetry not FHR, correct? I don't think this blocks release.
Flags: needinfo?(vdjeric)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #1)
> gatherMemory is only for telemetry not FHR, correct? I don't think this
> blocks release.

Hmm, are you suggesting we don't call gatherMemory() for the "main" ping type? Right now the classic saved-session payload has all the same measurements (including gatherMemory) as any "main" payload (daily, shutdown, environment-changed)
Flags: needinfo?(vdjeric)
No I'm suggesting that we shouldn't be calling gatherMemory for release users, only those with telemetry enabled.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #3)
> No I'm suggesting that we shouldn't be calling gatherMemory for release
> users, only those with telemetry enabled.

Yeah we should classify all measurements (not just Histograms.json histograms) as optInOnRelease/optOutOnRelease.
Depends on bug 1133536 for the scheduler implementation.
Depends on: 1133536
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #4)
> (In reply to Benjamin Smedberg  [:bsmedberg] from comment #3)
> > No I'm suggesting that we shouldn't be calling gatherMemory for release
> > users, only those with telemetry enabled.
> 
> Yeah we should classify all measurements (not just Histograms.json
> histograms) as optInOnRelease/optOutOnRelease.

That is bug 1122062.
Blocks: 1143796
Assignee: nobody → alessio.placitelli
Assignee: alessio.placitelli → nobody
Assignee: nobody → gfritzsche
Attached patch Collect daily ping on user idle (obsolete) — Splinter Review
Attachment #8581923 - Flags: review?(vdjeric)
Can you do a diff vs m-c?
Flags: needinfo?(gfritzsche)
Oh I see, this stacks on top of bug 1140037. Nevermind
Flags: needinfo?(gfritzsche)
No longer blocks: 1143796
Status: NEW → ASSIGNED
Depends on: 1143796
No longer depends on: 1133536
Version: unspecified → Trunk
Attached patch Collect daily ping on user idle (obsolete) — Splinter Review
Note that i moved the "daily due" logic refactoring over here.

This stacks on bug 1143796 (and therefore bug 1133536).
Attachment #8581923 - Attachment is obsolete: true
Attachment #8581923 - Flags: review?(vdjeric)
Attachment #8582443 - Flags: review?(vdjeric)
Attached patch Collect daily ping on user idle (obsolete) — Splinter Review
Rebased, now part of this stack: https://pastebin.mozilla.org/8827100
Attachment #8582443 - Attachment is obsolete: true
Attachment #8582443 - Flags: review?(vdjeric)
Attachment #8583137 - Flags: review?(vdjeric)
Comment on attachment 8583137 [details] [diff] [review]
Collect daily ping on user idle

Review of attachment 8583137 [details] [diff] [review]:
-----------------------------------------------------------------

What's the idle tick interval? If it's one hour (or gets changed to one hour at some point), and the user goes idle at 23:40, the next tick won't occur until 0:40. rescheduleTimeout should take into account next midnight when scheduling an idle tick

::: toolkit/components/telemetry/TelemetrySession.jsm
@@ +451,5 @@
>      if (this._schedulerTimer) {
>        Policy.clearSchedulerTickTimeout(this._schedulerTimer);
>      }
>  
> +    const interval = this._isUserIdle ? SCHEDULER_TICK_IDLE_INTERVAL_MS :

SCHEDULER_TICK_IDLE_INTERVAL_MS is defined in another bug?

@@ +468,5 @@
> +    // This is today's date and also the previous midnight (0:00).
> +    const todayDate = truncateToDays(nowDate);
> +    const nearestMidnight = getNearestMidnight(nowDate);
> +    // If we are close to midnight, we check against that, otherwise against the last midnight.
> +    const checkDate = !nearestMidnight ? todayDate : nearestMidnight;

don't negate the condition, makes it harder to read

@@ +478,5 @@
> +      this._log.trace("_isDailyPingDue - already sent one today");
> +      return false;
> +    }
> +
> +    if (!nearestMidnight) {

can you make this into: 

if (sentPingToday) {
} else if (!sentPingToday && !nearestMidnight) { <-- redundancy for readability
} else {
}

right now, it's hard to reason about what the conditions are when the idleness check is done at line 502

@@ +485,4 @@
>        return true;
>      }
>  
> +    if (nowDate.getTime() < (nearestMidnight.getTime() - SCHEDULER_MIDNIGHT_TOLERANCE_MS)) {

getNearestMidnight already does this check, it would return null if nowDate weren't within tolerance of midnight

@@ +548,5 @@
>     */
>    _onSchedulerTick: function() {
>      if (this._shuttingDown) {
>        this._log.warn("_onSchedulerTick - already shutdown.");
> +      return Promise.reject(new Error("Already shutdown."));

good catch

::: toolkit/components/telemetry/tests/unit/test_TelemetrySession.js
@@ +975,5 @@
>    fakeSchedulerTimer(callback => schedulerTickCallback = callback, () => {});
>    yield TelemetrySession.setup();
>  
> +  // Make sure the daily ping gets triggered at midnight.
> +  let firstDailyDue = new Date(2035, 1, 2, 0, 0, 0);

is this because of idleness detection? document it

@@ +996,5 @@
>      Assert.ok(false, "No more daily pings should be sent/received in this test.");
>    });
>  
>    // Set the current time to a bit after midnight.
> +  let secondDailyDue = futureDate(firstDailyDue, 2 * MS_IN_ONE_HOUR);

this is a different test than the old one.. the old one was checking that we don't send 2 daily pings for the same midnight
Attachment #8583137 - Flags: review?(vdjeric)
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #12)
> Comment on attachment 8583137 [details] [diff] [review]
> Collect daily ping on user idle
> 
> Review of attachment 8583137 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> What's the idle tick interval? If it's one hour (or gets changed to one hour
> at some point), and the user goes idle at 23:40, the next tick won't occur
> until 0:40. rescheduleTimeout should take into account next midnight when
> scheduling an idle tick

Good point, fixed and test coverage added.

> ::: toolkit/components/telemetry/TelemetrySession.jsm
> @@ +451,5 @@
> >      if (this._schedulerTimer) {
> >        Policy.clearSchedulerTickTimeout(this._schedulerTimer);
> >      }
> >  
> > +    const interval = this._isUserIdle ? SCHEDULER_TICK_IDLE_INTERVAL_MS :
> 
> SCHEDULER_TICK_IDLE_INTERVAL_MS is defined in another bug?
> 
> @@ +468,5 @@
> > +    // This is today's date and also the previous midnight (0:00).
> > +    const todayDate = truncateToDays(nowDate);
> > +    const nearestMidnight = getNearestMidnight(nowDate);
> > +    // If we are close to midnight, we check against that, otherwise against the last midnight.
> > +    const checkDate = !nearestMidnight ? todayDate : nearestMidnight;
> 
> don't negate the condition, makes it harder to read

Fixed.

> @@ +478,5 @@
> > +      this._log.trace("_isDailyPingDue - already sent one today");
> > +      return false;
> > +    }
> > +
> > +    if (!nearestMidnight) {
> 
> can you make this into: 
> 
> if (sentPingToday) {
> } else if (!sentPingToday && !nearestMidnight) { <-- redundancy for
> readability
> } else {
> }
> 
> right now, it's hard to reason about what the conditions are when the
> idleness check is done at line 502

Fixed per IRC.

> @@ +485,4 @@
> >        return true;
> >      }
> >  
> > +    if (nowDate.getTime() < (nearestMidnight.getTime() - SCHEDULER_MIDNIGHT_TOLERANCE_MS)) {
> 
> getNearestMidnight already does this check, it would return null if nowDate
> weren't within tolerance of midnight

Fixed, just stripped that part.

> ::: toolkit/components/telemetry/tests/unit/test_TelemetrySession.js
> @@ +975,5 @@
> >    fakeSchedulerTimer(callback => schedulerTickCallback = callback, () => {});
> >    yield TelemetrySession.setup();
> >  
> > +  // Make sure the daily ping gets triggered at midnight.
> > +  let firstDailyDue = new Date(2035, 1, 2, 0, 0, 0);
> 
> is this because of idleness detection? document it

Yes, documented.

> @@ +996,5 @@
> >      Assert.ok(false, "No more daily pings should be sent/received in this test.");
> >    });
> >  
> >    // Set the current time to a bit after midnight.
> > +  let secondDailyDue = futureDate(firstDailyDue, 2 * MS_IN_ONE_HOUR);
> 
> this is a different test than the old one.. the old one was checking that we
> don't send 2 daily pings for the same midnight

Fixed.
Attachment #8583137 - Attachment is obsolete: true
Attachment #8584453 - Flags: review?(vdjeric)
This fixes an aborted-session race that i ended up chasing for the tests here.

The way _sendDailyPing() was written we could end up triggering _saveAbortedSessionPing() from the .then() *after* we cleared the aborted save tasks and removed the saved aborted-session from disk in shutdownChromeProcess().
Attachment #8584456 - Flags: review?(vdjeric)
Comment on attachment 8584453 [details] [diff] [review]
Collect daily ping on user idle

Review of attachment 8584453 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetrySession.jsm
@@ +459,5 @@
> +      timeout = SCHEDULER_TICK_IDLE_INTERVAL_MS;
> +      // We need to check though that we don't miss sending pings around midnight
> +      // when we use the longer idle intervals.
> +      const now = Policy.now();
> +      if (!this._sentPingToday(now)) {

this is incorrect, we might have sent a ping today because a ping was overdue, we would still need to send a ping at midnight. i say just drop this check, it's better to err on the side of caution and schedule a tick for midnight even if we did send today, _schedulerTickLogic will take care of it. let's keep all the logic for dealing with the edge cases in _schedulerTickLogic

@@ +461,5 @@
> +      // when we use the longer idle intervals.
> +      const now = Policy.now();
> +      if (!this._sentPingToday(now)) {
> +        let nextMidnight = new Date(truncateToDays(now));
> +        nextMidnight.setDate(nextMidnight.getDate() + 1);

let's add a getNextMidnight function.. and then both this code and getNearestMidnight can call it

@@ +471,5 @@
>      this._schedulerTimer =
> +      Policy.setSchedulerTickTimeout(() => this._onSchedulerTick(), timeout);
> +  },
> +
> +  _sentPingToday: function(nowDate) {

how about _sentDailyPingToday

@@ +514,5 @@
> +    // within the tolerance interval.
> +    if (!this._isUserIdle && (nowDate.getTime() < nearestMidnight.getTime())) {
> +      this._log.trace("_isDailyPingDue - waiting for user idle period");
> +      return false;
> +    }

i'm thinking we could even drop this functionality.. the amount jank caused by telemetry is relatively small, and if it only happens at midnight it's going to be rare, and postponing it 15 minutes might make much of a difference. Your call.
Attachment #8584453 - Flags: review?(vdjeric) → review+
Attachment #8584456 - Flags: review?(vdjeric) → review+
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #16)
> @@ +514,5 @@
> > +    // within the tolerance interval.
> > +    if (!this._isUserIdle && (nowDate.getTime() < nearestMidnight.getTime())) {
> > +      this._log.trace("_isDailyPingDue - waiting for user idle period");
> > +      return false;
> > +    }
> 
> i'm thinking we could even drop this functionality.. the amount jank caused
> by telemetry is relatively small, and if it only happens at midnight it's
> going to be rare, and postponing it 15 minutes might make much of a
> difference. Your call.

I think we should leave as it addresses the original point of the bug.
We want to collect at/around midnight, so we can optimistically just try to collect a bit before it if we happen to find an idle time.
I just want to check something here. It's important that no matter whether the ping is started before or after midnight, that the subsessionStartDate is for the correct day, so e.g.:

Monday       Midnight      Tuesday
A          *                   B
A                     *        B

no matter exactly when we split the subsession, "A" should have subsessionStartDate=Monday and "B" should have subsessionStartDate=Tuesday.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #18)
> I just want to check something here. It's important that no matter whether
> the ping is started before or after midnight, that the subsessionStartDate
> is for the correct day, so e.g.:
> 
> Monday       Midnight      Tuesday
> A          *                   B
> A                     *        B
> 
> no matter exactly when we split the subsession, "A" should have
> subsessionStartDate=Monday and "B" should have subsessionStartDate=Tuesday.

This is not guaranteed right now, for this bug and the TelemetryScheduler coalescing.
If this is a hard requirement i'll file a phase 3 bug to deal with it.
Flags: needinfo?(benjamin)
The analysis requirement is that we associate most of the data that is collected on Monday with Monday and the data that is collected on Tuesday with Tuesday. So if we have a session that starts right before midnight but gets coalesced to last most of Tuesday, we still need to associate that with Tuesday's stats.
Flags: needinfo?(benjamin)
Being handled in bug 1139460.
Flags: needinfo?(gfritzsche)
Bug 1139460.
Flags: needinfo?(gfritzsche)
Comment on attachment 8584453 [details] [diff] [review]
Collect daily ping on user idle

Approved for Aurora. For approval request see bug 1139460 comment 42. For approval comments see bug 1139460 comment 43.
Attachment #8584453 - Flags: approval-mozilla-aurora+
Attachment #8584456 - Flags: approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.