Closed Bug 1369734 Opened 7 years ago Closed 7 years ago

Improve aborted-session ping scheduling for reduced performance impact

Categories

(Toolkit :: Telemetry, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla56
Performance Impact high
Tracking Status
firefox56 --- fixed

People

(Reporter: chutten, Assigned: Dexter)

References

(Blocks 1 open bug)

Details

(Whiteboard: [measurement:client])

Attachments

(1 file)

bug 1367026 has a profile[1] showing some egregiously bad Telemetry performance on sleep wake. Turns out on sleep wake we're updating the aborted-session ping immediately. This might not be necessary. Also, we lose a lot of context in the scheduler tick that we have to rebuild in _schedulerTickLogic. Most notably whether or not we ticked because of a sleep wake. We shouldn't have to do timer math to figure that one out, we should pass the reason for the tick through to the logic. From there maybe we can more intelligently decide under what circumstances we should update aborted-session and pack the daily ping. (Though I expect packing a daily ping post sleep-wake is something we want to do immediately to have a sharp subsession edge) [1]: https://perf-html.io/public/4392b42cfc3b6693624603bad4f31eb6045d60be/calltree/?implementation=js&range=1957.0782_1998.9269~1993.2211_1996.1495&thread=0
Alessio, what is your take?
Flags: needinfo?(alessio.placitelli)
(In reply to Chris H-C :chutten from comment #0) > bug 1367026 has a profile[1] showing some egregiously bad Telemetry > performance on sleep wake. > > Turns out on sleep wake we're updating the aborted-session ping immediately. > This might not be necessary. Woha, that's some bad perf. We might not need to update the aborted-session as soon as wake up from asleep, especially if it hit perfs like that. If we crash right after waking up, we'd still receive a "crash" ping that has a session id *and* an older "aborted-session" ping from before the sleep. > Also, we lose a lot of context in the scheduler tick that we have to rebuild > in _schedulerTickLogic. Most notably whether or not we ticked because of a > sleep wake. We shouldn't have to do timer math to figure that one out, we > should pass the reason for the tick through to the logic. > From there maybe we can more intelligently decide under what circumstances > we should update aborted-session and pack the daily ping. (Though I expect > packing a daily ping post sleep-wake is something we want to do immediately > to have a sharp subsession edge) Yup, I agree with you, I think we should still pack a daily ping right after a sleep-wake. I agree that passing on the 'context' to |_schedulerTickLogic| might be helpful, but we need to make sure that it simplifies the logic at [1]: I think we'd still need to keep the timer logic to make sure that we don't gather the measurements again for an aborted-session ping if one was saved using the "environment-change" or "daily" ping data. [1] - http://searchfox.org/mozilla-central/rev/507743376d1ba753d14ab6b9305b7c6358570be8/toolkit/components/telemetry/TelemetrySession.jsm#487
Flags: needinfo?(alessio.placitelli)
Per Florian on bug 1367026 we may not have to change much here after waking up from sleep: > If you don't want to wait an additional 5 minutes, just starting it off requestIdleCallback (or Services.tm.mainThread.idleDispatch) would move it out of the user's way (unless it takes a very long time).
Whiteboard: [qf]
Priority: -- → P2
Whiteboard: [qf] → [qf-]
Whiteboard: [qf-] → [qf:p1]
Priority: P2 → P1
Whiteboard: [qf:p1] → [measurement:client][qf:p1]
Assignee: nobody → alessio.placitelli
Comment on attachment 8877158 [details] Bug 1369734 - Spin the scheduler tick on idle after sleep-wake or idle-active cycles. https://reviewboard.mozilla.org/r/148508/#review152938 ::: toolkit/components/telemetry/TelemetrySession.jsm:426 (Diff revision 1) > + // idle (let's not slow down things, see bug 1369734) to avoid sessions > // spanning more than a day. > // This is needed because sleep time does not count towards timeouts > // on Mac & Linux - see bug 1262386, bug 1204823 et al. > - return this._onSchedulerTick(); > + return new Promise(r => { > + Services.tm.mainThread.idleDispatch(() => this._onSchedulerTick().then(r, r)); This'll put a fuzzier edge on the daily ping in addition to the saved-to-disk aborted-session ping. I thought we wanted to avoid fuzzy subsession edges?
Attachment #8877158 - Flags: review?(chutten) → review-
Comment on attachment 8877158 [details] Bug 1369734 - Spin the scheduler tick on idle after sleep-wake or idle-active cycles. https://reviewboard.mozilla.org/r/148508/#review152938 > This'll put a fuzzier edge on the daily ping in addition to the saved-to-disk aborted-session ping. I thought we wanted to avoid fuzzy subsession edges? I'm not sure this makes the daily ping much fuzzier than it is already, after sleeping for a random amount of time: the 'idle' time shouldn't not be too far ahead in the future anyway.
Comment on attachment 8877158 [details] Bug 1369734 - Spin the scheduler tick on idle after sleep-wake or idle-active cycles. https://reviewboard.mozilla.org/r/148508/#review152952 This will change things, but probably only by including more of the beginning of a post-wake session than it used to. Also reason=daily is infrequent in the grand scheme of things so it should all come out in the wash
Attachment #8877158 - Flags: review- → review+
Comment on attachment 8877158 [details] Bug 1369734 - Spin the scheduler tick on idle after sleep-wake or idle-active cycles. Florian, would you mind given this patch a look too? I'm afraid idleDispatch is not widely used in the code yet :(
Attachment #8877158 - Flags: review?(florian)
Comment on attachment 8877158 [details] Bug 1369734 - Spin the scheduler tick on idle after sleep-wake or idle-active cycles. https://reviewboard.mozilla.org/r/148508/#review153950 Looks good to me, but note that bug 1368072 is currently changing Services.tm.mainThread.idleDispatch to Services.tm.idleDispatchToMainThread, so these two bugs will bitrot each other. ::: toolkit/components/telemetry/TelemetrySession.jsm:418 (Diff revision 1) > this._isUserIdle = true; > return this._onSchedulerTick(); > case "active": > // User is back to work, restore the original tick interval. > this._isUserIdle = false; > return this._onSchedulerTick(); If _onSchedulerTick is (potentially) expensive, calling it right when the user is back to work may not be a good idea. Should we maybe move the idleDispatch call to something inside the _onSchedulerTick method so that it handles both "active" and "wake_notification"?
Attachment #8877158 - Flags: review?(florian) → review+
Comment on attachment 8877158 [details] Bug 1369734 - Spin the scheduler tick on idle after sleep-wake or idle-active cycles. https://reviewboard.mozilla.org/r/148508/#review153950 > If _onSchedulerTick is (potentially) expensive, calling it right when the user is back to work may not be a good idea. Should we maybe move the idleDispatch call to something inside the _onSchedulerTick method so that it handles both "active" and "wake_notification"? Goot point, I just changed the patch to do that. Thanks!
Comment on attachment 8877158 [details] Bug 1369734 - Spin the scheduler tick on idle after sleep-wake or idle-active cycles. Chris, I've changed the patch so that we dispatch to idle when the user transitions from an idle to active state as well. It's a bit tricky, so I'd love you to take another look if possible. Unfortunately, this wasn't as straightforward as I'd imagined: test_TelemetrySession.js started failing in many fun ways. Turns out that the test was a bit racy and it could have been failing before as well: TelemetryController.testReset() doesn't imply that TelemetryScheduler (from TelemetrySession.jsm) is reset. So, if we don't call TelemetryController.testShutdown(), we end up receiving unexpected daily pings generated from zombie schedulers. This is a test only problem and ties into bug 1312699, sigh.
Attachment #8877158 - Flags: review+ → review?(chutten)
Comment on attachment 8877158 [details] Bug 1369734 - Spin the scheduler tick on idle after sleep-wake or idle-active cycles. https://reviewboard.mozilla.org/r/148508/#review154596 ::: toolkit/components/telemetry/TelemetrySession.jsm:424 (Diff revision 2) > case "wake_notification": > // The machine woke up from sleep, trigger a tick to avoid sessions > // spanning more than a day. > // This is needed because sleep time does not count towards timeouts > // on Mac & Linux - see bug 1262386, bug 1204823 et al. > - return this._onSchedulerTick(); > + this._onSchedulerTick(true); Why did you take out the return? According to the comment, the returned promise is necessary for good testing.
Attachment #8877158 - Flags: review?(chutten) → review-
Comment on attachment 8877158 [details] Bug 1369734 - Spin the scheduler tick on idle after sleep-wake or idle-active cycles. https://reviewboard.mozilla.org/r/148508/#review154596 > Why did you take out the return? According to the comment, the returned promise is necessary for good testing. Heh, yeah, sorry!
Comment on attachment 8877158 [details] Bug 1369734 - Spin the scheduler tick on idle after sleep-wake or idle-active cycles. https://reviewboard.mozilla.org/r/148508/#review155284
Attachment #8877158 - Flags: review?(chutten) → review+
Pushed by alessio.placitelli@gmail.com: https://hg.mozilla.org/integration/autoland/rev/f495235291e7 Spin the scheduler tick on idle after sleep-wake or idle-active cycles. r=chutten,florian
Summary: Can we schedule aborted-session ping updates a little more performantly? → Improve aborted-session ping scheduling for reduced performance impact
To recap: - when the user becomes active again after some inactivity period, we dispatch the next tick on idle on the main thread, as per comment 9; - when the user wakes up the machine from sleep, we dispatch the next tick on idle on the main thread, as per comment 3;
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Depends on: 1374958
Backout by cbook@mozilla.com: https://hg.mozilla.org/mozilla-central/rev/e49151136658 Backed out changeset f495235291e7 backed out for test failures in test_TelemetrySession.js | xpcshell return code: 0
Whoops, this wasn't re-opened :)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 1368072
(In reply to Pulsebot from comment #20) > Backout by cbook@mozilla.com: > https://hg.mozilla.org/mozilla-central/rev/e49151136658 > Backed out changeset f495235291e7 backed out for test failures in > test_TelemetrySession.js | xpcshell return code: 0 Looks like the updated patch (that solves the starvation problem with the idle dispatch) works fine on try [1]. I'm landing this again! [1] - https://treeherder.mozilla.org/#/jobs?repo=try&revision=5aeceff810f9
Pushed by alessio.placitelli@gmail.com: https://hg.mozilla.org/integration/autoland/rev/1bc50cd63349 Spin the scheduler tick on idle after sleep-wake or idle-active cycles. r=chutten,florian
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Depends on: 1374828
Performance Impact: --- → P1
Whiteboard: [measurement:client][qf:p1] → [measurement:client]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: