Closed Bug 850709 Opened 7 years ago Closed 5 years ago

FHR should not poll every minute

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: gps, Unassigned, Mentored)

References

Details

(Whiteboard: [diamond] [good next bug])

Attachments

(1 file, 9 obsolete files)

I'm mass filing large ticket items from FHR's performance review: https://wiki.mozilla.org/Performance/FHR. This bug is for the first item:

4. FHR event polling Currently, FHR wakes up every minute to check if it needs to submit a report, expire any data, etc. This can be made neater by simply scheduling wake-ups for a specific time in the future.

This may get resolved as part of making FHR work for Android since we're planning massive refactorings to how policy works.
This'll be a win, but it's not as important as just delivering something on Android. And we'll be addressing scheduling in the course of that refactoring, so marking this as P4.
Priority: -- → P4
I think we need to be more intelligent than polling once per minute, especially given the battery implications of activity on idle.

While I agree that we need to make scheduling significantly better for Android, I suspect there's a straightforward win here where we set a timer on init to fire after the threshold we're checking against.  We can minimize the impact and risk in the short term by not changing the logic itself, just the time until we check again.  Longer-term, we can be even better.
Priority: P4 → P2
Component: Metrics and Firefox Health Report → Client: Desktop
Product: Mozilla Services → Firefox Health Report
Blocks: 888025
We're going to remove the implicit acceptance logic before we do this in order to reduce the net work. (Doing this first would require modifying lots of implicit acceptance code which we're just going to throw away anyway.)
Depends on: 862563
Assignee: nobody → smirea
Attached patch Don't poll every minute; (obsolete) — Splinter Review
Now instead of one 1 minute interval timer at startup that polls for info and kills your battery there is one offset timer that is supposed to run when data submission should occur followed by a second periodic timer every 24h for your every day data submission. Test impact was minimal

https://tbpl.mozilla.org/?tree=Try&rev=5f6f1b3304fc
Attachment #774396 - Flags: review?(gps)
Comment on attachment 774396 [details] [diff] [review]
Don't poll every minute;

Review of attachment 774396 [details] [diff] [review]:
-----------------------------------------------------------------

This is on the right track. We can discuss the feedback IRL if you want.

::: services/datareporting/policy.jsm
@@ +466,5 @@
> +
> +    this._timers = {
> +      offset: Cc["@mozilla.org/timer;1"].createInstance(Ci.nsITimer),
> +      interval: Cc["@mozilla.org/timer;1"].createInstance(Ci.nsITimer),
> +    };

How do you feel about using a single timer that is reset as needed instead of using multiple timers?

@@ +719,5 @@
> +
> +    // Because of the way this is implemented now all we need to do is re-start
> +    // the timers and an offset retry timeout will be scheduled automatically
> +    this.stop();
> +    this.start();

You should do this by cancelling and restarting the underlying timer rather than restarting the instance. You probably want to factor the timer management code into a reusable function.
Attachment #774396 - Flags: review?(gps) → feedback+
Attached patch Don't poll every minute; (obsolete) — Splinter Review
I have no idea why I did not realize that the one timer can just be reused. Fixed that and instantiated the timer instance in the constructor so now the .start() function is effectively a re-start function, but I guess it makes more sense to keep the naming
Attachment #775993 - Flags: review?(gps)
Attachment #774396 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Comment on attachment 775993 [details] [diff] [review]
Don't poll every minute;

Review of attachment 775993 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/datareporting/policy.jsm
@@ +217,5 @@
>     *
>     * The random bit is to ensure that other systems scheduling around the same
>     * interval don't all get scheduled together.
>     */
> +  POLL_INTERVAL_MSEC: MILLISECONDS_PER_DAY + Math.floor(2.5 * 1000 * Math.random()),

We probably want a larger fuzz window now that we poll every day rather than minute. Say +- 15 minutes?

@@ +455,3 @@
>     *
> +   * This will calculate an offset remaining to the next submission time and set
> +   * one time timeout for that time. After which, reset to a perioical 24h interval

Nit: Add period after sentences.

@@ +463,4 @@
>  
> +    // Time remaining until the first data submission.
> +    let offsetT = this.nextDataSubmissionDate - this.now() + 1;
> +    if (!offsetT || offsetT < 0) {

When is !offsetT true?

@@ +478,5 @@
> +    deferred.promise.then(() => {
> +      this._timer.initWithCallback({
> +        notify: this.checkStateAndTrigger.bind(this)
> +      }, this.POLL_INTERVAL_MSEC, this._timer.TYPE_REPEATING_SLACK);
> +    });

Is there any reason you can't inline this into the notify callback above?

@@ +710,5 @@
>      this.currentDaySubmissionFailureCount++;
> +
> +    // Because of the way this is implemented now all we need to do is re-start
> +    // the timers and an offset retry timeout will be scheduled automatically.
> +    this.start();

Please factor out the timer setting logic into its own function and call that, not start(). If start() is a one-line function that sets the timer, so be it. Calling start() on an already-started instance just seems wrong.
Attachment #775993 - Flags: review?(gps) → feedback+
Attached patch Don't poll every minute; (obsolete) — Splinter Review
Implemented changes
Attachment #794903 - Flags: review?(gps)
Attachment #775993 - Attachment is obsolete: true
Comment on attachment 794903 [details] [diff] [review]
Don't poll every minute;

Review of attachment 794903 [details] [diff] [review]:
-----------------------------------------------------------------

I'm granting this r+. However, I'd like to see the test coverage improved. Preferably as part of this bug, but I'll accept a followup.

I think it would help if you set a variable to the wall time the timer is scheduled for so you can more easily test that the timer is properly adjusted. Currently, I believe we're lacking test coverage around ensuring the timer is rescheduled after certain key events. These are important tests to have!
Attachment #794903 - Flags: review?(gps) → review+
Attached patch Don't poll every minute; (obsolete) — Splinter Review
After some off-line talks I ended up adding a side-effect to policy.nextDataSubmissionDate which is to reset the timer as this is always the desired behavior as well as adding tests to support it.
Attachment #797425 - Flags: review?(gps)
Attachment #794903 - Attachment is obsolete: true
Attachment #797425 - Flags: review?(rnewman)
Comment on attachment 797425 [details] [diff] [review]
Don't poll every minute;

Review of attachment 797425 [details] [diff] [review]:
-----------------------------------------------------------------

Nit: "r=gps,rnewman".

Pretty cursory review from me; gps will catch anything I missed!

::: services/datareporting/policy.jsm
@@ +433,5 @@
>  
> +  /**
> +   * Sets the nextDataSubmissionTime preference.
> +   *
> +   * Side Effects:

Nit: s/E/e.

@@ +559,4 @@
>     * You typically call this function for each constructed instance.
>     */
> +  start: function () {
> +    return this.restartTimer();

Why not just rename `restartTimer` to `start`, and always call `start`?

@@ +573,5 @@
> +    // scheduled in the past, fire the timer "immediately".
> +    let offsetT = Math.max(1, this.nextDataSubmissionDate - this.now() + 1);
> +    this._timerOffset = offsetT;
> +
> +    // Add random slack to the timer so we minimize the change of multiple systems

s/change/chance.

@@ +585,5 @@
> +    }
> +
> +    this.stop();
> +    this._timer.initWithCallback({
> +      notify: this.checkStateAndTrigger.bind(this)

Might be nice to blow away this._timer in the callback, too. That way the following call to `start` doesn't result in `stop`, and `cancel` on a completed timer instance.

@@ +602,2 @@
>      }
> +    this._timer.cancel();

Also need to drop the timer reference, no? Otherwise you can cancel it multiple times.
Attachment #797425 - Flags: review?(rnewman) → review+
Attached patch Don't poll every minute; (obsolete) — Splinter Review
(In reply to Richard Newman [:rnewman] from comment #11)

> Why not just rename `restartTimer` to `start`, and always call `start`?

Excerpt from Greg's Comment 7
> Please factor out the timer setting logic into its own function and call that, not start(). If start() is a one-line function that sets the timer, so be it. Calling start() on an already-started instance just seems wrong.

> Also need to drop the timer reference, no? Otherwise you can cancel it multiple times.

I made the timer as a single instance which is instantiated in the constructor for effiency purposes so if I clear the reference then subsequent calls to .stop() will fail with an undefined/null exception
Attachment #797592 - Flags: review?(rnewman)
Attachment #797592 - Flags: review?(gps)
Attachment #797425 - Attachment is obsolete: true
Attachment #797425 - Flags: review?(gps)
(In reply to Stefan Mirea [:smirea] from comment #12)

> Excerpt from Greg's Comment 7

Greg and I have different styles in some situations! In this, I defer to him.
 

> I made the timer as a single instance which is instantiated in the
> constructor for effiency purposes so if I clear the reference then
> subsequent calls to .stop() will fail with an undefined/null exception

In that case, you don't need to ever check that it's truthy. Be consistent!
Attached patch Don't poll every minute; (obsolete) — Splinter Review
Removed check for timer in policy and added test for it in test_policy.
Attachment #798979 - Flags: review?(rnewman)
Attachment #798979 - Flags: review?(gps)
Attachment #797592 - Attachment is obsolete: true
Attachment #797592 - Flags: review?(rnewman)
Attachment #797592 - Flags: review?(gps)
Attachment #798979 - Flags: review?(rnewman) → review+
Comment on attachment 798979 [details] [diff] [review]
Don't poll every minute;

Review of attachment 798979 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/datareporting/policy.jsm
@@ +583,5 @@
> +    if (offsetT >= 5 * 60 * 1000) {
> +      offsetT += Math.floor(15 * 60 * 1000 * Math.random());
> +    }
> +
> +    this.stop();

Just call this._timer.cancel().

@@ +586,5 @@
> +
> +    this.stop();
> +    this._timer.initWithCallback({
> +      notify: function () {
> +        this.stop();

Ditto.

@@ +638,5 @@
>  
>      // If the system clock were ever set to a time in the distant future,
>      // it's possible our next schedule date is far out as well. We know
>      // we shouldn't schedule for more than a day out, so we reset the next
> +    // scheduled date appropriately

Nit: add period after sentence.

@@ +639,5 @@
>      // If the system clock were ever set to a time in the distant future,
>      // it's possible our next schedule date is far out as well. We know
>      // we shouldn't schedule for more than a day out, so we reset the next
> +    // scheduled date appropriately
> +    if (nextSubmissionDate.getTime() >= nowT + MILLISECONDS_PER_DAY + 1) {

Or you could just use > instead of +1.
Attachment #798979 - Flags: review?(gps) → review+
Attached patch Don't poll every minute; (obsolete) — Splinter Review
Implemented changes
Attachment #800408 - Flags: review?(gps)
Attachment #798979 - Attachment is obsolete: true
Comment on attachment 800408 [details] [diff] [review]
Don't poll every minute;

Review of attachment 800408 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. Giving to rnewman for additional review because this code is important!
Attachment #800408 - Flags: review?(rnewman)
Attachment #800408 - Flags: review?(gps)
Attachment #800408 - Flags: review+
Attachment #800408 - Flags: review?(rnewman) → review+
Attached patch Don't poll every minute; (obsolete) — Splinter Review
Since Bug 862563 got the green light both from reviewrs and tbpl (finally!!!), I just made this patch compatible with those from that bug. Just one or two minor conflict fixes, nothing else

Here's the green try run as well: https://tbpl.mozilla.org/?tree=Try&rev=07acdd192c80
Attachment #807268 - Flags: review?(gps)
Attachment #800408 - Attachment is obsolete: true
Attachment #807268 - Flags: review?(gps) → review+
Blocks: 888052
Blocks: 925587
Duplicate of this bug: 958552
Blocks: 948528
Is there a reason why this patch is still pending?
Attached patch Don't poll every minute, v2 (obsolete) — Splinter Review
Attachment #807268 - Attachment is obsolete: true
Attachment #8361177 - Flags: review?(gps)
Comment on attachment 8361177 [details] [diff] [review]
Don't poll every minute, v2

Review of attachment 8361177 [details] [diff] [review]:
-----------------------------------------------------------------

This patch requires bug 862563 to land first. Sadly, the existing tests are flawed and they don't fail with just this patch applied. Bug 862563 shouldn't be too much effort to get landed.

Can you please also revert the Author information back to Stefan? I don't want to take credit away from you, but Stefan toiled on these patches and IMO he deserves the credit.

::: services/datareporting/tests/xpcshell/test_policy.js
@@ +614,5 @@
>    do_check_eq(listener.requestDataUploadCount, 1);
>    do_check_eq(listener.requestRemoteDeleteCount, 1);
>  });
>  
> +add_task(function test_polling() {

Nit: We should use function* for generators now.

add_task(function* test_polling() {

@@ +633,5 @@
>      value: function fakeCheckStateAndTrigger() {
>        let now = Date.now();
>        let after = now - then;
>        count++;
> +      let intended_elapsed_time = arrSum(intended.slice(0, count));

Nit: camelCase variable names.
Attachment #8361177 - Flags: review?(gps) → feedback+
Assignee: steven.mirea → nobody
Flags: firefox-backlog+
Setting to diamond & good-next-big, going looking for someone to carry this over the line. GPS, will you mentor?
Status: ASSIGNED → NEW
Flags: needinfo?(gps)
Whiteboard: [diamond] [good-next-bug]
Greg is taking a leave of absence, so I doubt it. I can mentor.
Flags: needinfo?(gps)
Whiteboard: [diamond] [good-next-bug] → [diamond] [good-next-bug] [mentor=rnewman]
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] → [diamond] [good-next-bug] [mentor=rnewman] p=0 [qa?]
Assignee: nobody → georg.fritzsche
I'll check into this, but there might be update hotfix work coming up that has to come first.
Added to Iteration 32.3.  Georg, can you recommend if this bug should be marked as [qa-] or [qa+] for QA verification.
Status: NEW → ASSIGNED
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] p=0 [qa?] → [diamond] [good-next-bug] [mentor=rnewman] p=0 s=it-32c-31a-30b.3 [qa?]
Also, can you provide a point estimate.
Flags: needinfo?(georg.fritzsche)
Assignee: georg.fritzsche → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(georg.fritzsche)
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] p=0 s=it-32c-31a-30b.3 [qa?] → [diamond] [good-next-bug] [mentor=rnewman] p=0 [qa?]
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] p=0 [qa?] → [diamond] [good-next-bug] [mentor=rnewman] p=5 [qa?]
Assignee: nobody → georg.fritzsche
Added to Iteration 33.1
Status: NEW → ASSIGNED
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] p=5 [qa?] → [diamond] [good-next-bug] [mentor=rnewman] p=5 s=33.1 [qa?]
Mentor: rnewman
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] p=5 s=33.1 [qa?] → [diamond] [good-next-bug] p=5 s=33.1 [qa?]
Due to other interruptions & test failures on bug 862563 i'm not sure if i'll actually manage it in this iteration and was recommended to drop it.
Assignee: georg.fritzsche → nobody
Removed from Iteration 33.1
Status: ASSIGNED → NEW
Whiteboard: [diamond] [good-next-bug] p=5 s=33.1 [qa?] → [diamond] [good-next-bug] p=5 [qa?]
Summary: Don't poll every minute → FHR should not poll every minute
This bug is now unblocked!
Points: --- → 5
Whiteboard: [diamond] [good-next-bug] p=5 [qa?] → [diamond] [good-next-bug]
Whiteboard: [diamond] [good-next-bug] → [diamond] [good next bug]
It looks like there've been a number of changes over the last year; this patch
un-bitrots the patch.  Assuming I haven't botched anything, this try run should
come back green:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2aa872ac130b

I think review on this should almost be a rubber-stamp, but I'll hold off until
I see green on try.
Attachment #8361177 - Attachment is obsolete: true
froydnj, that try job is long expired. Is it worth trying again?
Flags: needinfo?(nfroyd)
We're about to retire FHR with Unified Telemetry (http://mzl.la/1Ky4sj5), targetting Fx41.
I don't think it will be useful to re-activate this now, unless there is an impact on mobile.
(In reply to Nicholas Nethercote [:njn] from comment #33)
> froydnj, that try job is long expired. Is it worth trying again?

The try job came back with some oranges in FHR tests that I didn't understand.  Assuming that we're going to remove FHR and replace it with something that isn't so poll-happy, I don't think it's worth it for me to try and improve those patches.
Flags: needinfo?(nfroyd)
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
Resolution: WORKSFORME → WONTFIX
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.