FHR should not poll every minute

RESOLVED WONTFIX

Status

Firefox Health Report
Client: Desktop
P2
normal
RESOLVED WONTFIX
5 years ago
3 years ago

People

(Reporter: gps, Unassigned, Mentored)

Tracking

unspecified
Points:
5
Dependency tree / graph
Bug Flags:
firefox-backlog +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [diamond] [good next bug])

Attachments

(1 attachment, 9 obsolete attachments)

(Reporter)

Description

5 years ago
I'm mass filing large ticket items from FHR's performance review: https://wiki.mozilla.org/Performance/FHR. This bug is for the first item:

4. FHR event polling Currently, FHR wakes up every minute to check if it needs to submit a report, expire any data, etc. This can be made neater by simply scheduling wake-ups for a specific time in the future.

This may get resolved as part of making FHR work for Android since we're planning massive refactorings to how policy works.
This'll be a win, but it's not as important as just delivering something on Android. And we'll be addressing scheduling in the course of that refactoring, so marking this as P4.
Priority: -- → P4
I think we need to be more intelligent than polling once per minute, especially given the battery implications of activity on idle.

While I agree that we need to make scheduling significantly better for Android, I suspect there's a straightforward win here where we set a timer on init to fire after the threshold we're checking against.  We can minimize the impact and risk in the short term by not changing the logic itself, just the time until we check again.  Longer-term, we can be even better.
Priority: P4 → P2
(Reporter)

Updated

5 years ago
Component: Metrics and Firefox Health Report → Client: Desktop
Product: Mozilla Services → Firefox Health Report
(Reporter)

Updated

5 years ago
Blocks: 888025
(Reporter)

Comment 3

5 years ago
We're going to remove the implicit acceptance logic before we do this in order to reduce the net work. (Doing this first would require modifying lots of implicit acceptance code which we're just going to throw away anyway.)
Depends on: 862563

Updated

5 years ago
Assignee: nobody → smirea
Created attachment 774396 [details] [diff] [review]
Don't poll every minute;

Now instead of one 1 minute interval timer at startup that polls for info and kills your battery there is one offset timer that is supposed to run when data submission should occur followed by a second periodic timer every 24h for your every day data submission. Test impact was minimal

https://tbpl.mozilla.org/?tree=Try&rev=5f6f1b3304fc
Attachment #774396 - Flags: review?(gps)
(Reporter)

Comment 5

5 years ago
Comment on attachment 774396 [details] [diff] [review]
Don't poll every minute;

Review of attachment 774396 [details] [diff] [review]:
-----------------------------------------------------------------

This is on the right track. We can discuss the feedback IRL if you want.

::: services/datareporting/policy.jsm
@@ +466,5 @@
> +
> +    this._timers = {
> +      offset: Cc["@mozilla.org/timer;1"].createInstance(Ci.nsITimer),
> +      interval: Cc["@mozilla.org/timer;1"].createInstance(Ci.nsITimer),
> +    };

How do you feel about using a single timer that is reset as needed instead of using multiple timers?

@@ +719,5 @@
> +
> +    // Because of the way this is implemented now all we need to do is re-start
> +    // the timers and an offset retry timeout will be scheduled automatically
> +    this.stop();
> +    this.start();

You should do this by cancelling and restarting the underlying timer rather than restarting the instance. You probably want to factor the timer management code into a reusable function.
Attachment #774396 - Flags: review?(gps) → feedback+
Created attachment 775993 [details] [diff] [review]
Don't poll every minute;

I have no idea why I did not realize that the one timer can just be reused. Fixed that and instantiated the timer instance in the constructor so now the .start() function is effectively a re-start function, but I guess it makes more sense to keep the naming
Attachment #775993 - Flags: review?(gps)

Updated

5 years ago
Attachment #774396 - Attachment is obsolete: true

Updated

5 years ago
Status: NEW → ASSIGNED
(Reporter)

Comment 7

5 years ago
Comment on attachment 775993 [details] [diff] [review]
Don't poll every minute;

Review of attachment 775993 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/datareporting/policy.jsm
@@ +217,5 @@
>     *
>     * The random bit is to ensure that other systems scheduling around the same
>     * interval don't all get scheduled together.
>     */
> +  POLL_INTERVAL_MSEC: MILLISECONDS_PER_DAY + Math.floor(2.5 * 1000 * Math.random()),

We probably want a larger fuzz window now that we poll every day rather than minute. Say +- 15 minutes?

@@ +455,3 @@
>     *
> +   * This will calculate an offset remaining to the next submission time and set
> +   * one time timeout for that time. After which, reset to a perioical 24h interval

Nit: Add period after sentences.

@@ +463,4 @@
>  
> +    // Time remaining until the first data submission.
> +    let offsetT = this.nextDataSubmissionDate - this.now() + 1;
> +    if (!offsetT || offsetT < 0) {

When is !offsetT true?

@@ +478,5 @@
> +    deferred.promise.then(() => {
> +      this._timer.initWithCallback({
> +        notify: this.checkStateAndTrigger.bind(this)
> +      }, this.POLL_INTERVAL_MSEC, this._timer.TYPE_REPEATING_SLACK);
> +    });

Is there any reason you can't inline this into the notify callback above?

@@ +710,5 @@
>      this.currentDaySubmissionFailureCount++;
> +
> +    // Because of the way this is implemented now all we need to do is re-start
> +    // the timers and an offset retry timeout will be scheduled automatically.
> +    this.start();

Please factor out the timer setting logic into its own function and call that, not start(). If start() is a one-line function that sets the timer, so be it. Calling start() on an already-started instance just seems wrong.
Attachment #775993 - Flags: review?(gps) → feedback+
Created attachment 794903 [details] [diff] [review]
Don't poll every minute;

Implemented changes
Attachment #794903 - Flags: review?(gps)

Updated

5 years ago
Attachment #775993 - Attachment is obsolete: true
(Reporter)

Comment 9

5 years ago
Comment on attachment 794903 [details] [diff] [review]
Don't poll every minute;

Review of attachment 794903 [details] [diff] [review]:
-----------------------------------------------------------------

I'm granting this r+. However, I'd like to see the test coverage improved. Preferably as part of this bug, but I'll accept a followup.

I think it would help if you set a variable to the wall time the timer is scheduled for so you can more easily test that the timer is properly adjusted. Currently, I believe we're lacking test coverage around ensuring the timer is rescheduled after certain key events. These are important tests to have!
Attachment #794903 - Flags: review?(gps) → review+
Created attachment 797425 [details] [diff] [review]
Don't poll every minute;

After some off-line talks I ended up adding a side-effect to policy.nextDataSubmissionDate which is to reset the timer as this is always the desired behavior as well as adding tests to support it.
Attachment #797425 - Flags: review?(gps)

Updated

5 years ago
Attachment #794903 - Attachment is obsolete: true

Updated

5 years ago
Attachment #797425 - Flags: review?(rnewman)
Comment on attachment 797425 [details] [diff] [review]
Don't poll every minute;

Review of attachment 797425 [details] [diff] [review]:
-----------------------------------------------------------------

Nit: "r=gps,rnewman".

Pretty cursory review from me; gps will catch anything I missed!

::: services/datareporting/policy.jsm
@@ +433,5 @@
>  
> +  /**
> +   * Sets the nextDataSubmissionTime preference.
> +   *
> +   * Side Effects:

Nit: s/E/e.

@@ +559,4 @@
>     * You typically call this function for each constructed instance.
>     */
> +  start: function () {
> +    return this.restartTimer();

Why not just rename `restartTimer` to `start`, and always call `start`?

@@ +573,5 @@
> +    // scheduled in the past, fire the timer "immediately".
> +    let offsetT = Math.max(1, this.nextDataSubmissionDate - this.now() + 1);
> +    this._timerOffset = offsetT;
> +
> +    // Add random slack to the timer so we minimize the change of multiple systems

s/change/chance.

@@ +585,5 @@
> +    }
> +
> +    this.stop();
> +    this._timer.initWithCallback({
> +      notify: this.checkStateAndTrigger.bind(this)

Might be nice to blow away this._timer in the callback, too. That way the following call to `start` doesn't result in `stop`, and `cancel` on a completed timer instance.

@@ +602,2 @@
>      }
> +    this._timer.cancel();

Also need to drop the timer reference, no? Otherwise you can cancel it multiple times.
Attachment #797425 - Flags: review?(rnewman) → review+
Created attachment 797592 [details] [diff] [review]
Don't poll every minute;

(In reply to Richard Newman [:rnewman] from comment #11)

> Why not just rename `restartTimer` to `start`, and always call `start`?

Excerpt from Greg's Comment 7
> Please factor out the timer setting logic into its own function and call that, not start(). If start() is a one-line function that sets the timer, so be it. Calling start() on an already-started instance just seems wrong.

> Also need to drop the timer reference, no? Otherwise you can cancel it multiple times.

I made the timer as a single instance which is instantiated in the constructor for effiency purposes so if I clear the reference then subsequent calls to .stop() will fail with an undefined/null exception
Attachment #797592 - Flags: review?(rnewman)
Attachment #797592 - Flags: review?(gps)

Updated

5 years ago
Attachment #797425 - Attachment is obsolete: true
Attachment #797425 - Flags: review?(gps)
(In reply to Stefan Mirea [:smirea] from comment #12)

> Excerpt from Greg's Comment 7

Greg and I have different styles in some situations! In this, I defer to him.
 

> I made the timer as a single instance which is instantiated in the
> constructor for effiency purposes so if I clear the reference then
> subsequent calls to .stop() will fail with an undefined/null exception

In that case, you don't need to ever check that it's truthy. Be consistent!
Created attachment 798979 [details] [diff] [review]
Don't poll every minute;

Removed check for timer in policy and added test for it in test_policy.
Attachment #798979 - Flags: review?(rnewman)
Attachment #798979 - Flags: review?(gps)

Updated

5 years ago
Attachment #797592 - Attachment is obsolete: true
Attachment #797592 - Flags: review?(rnewman)
Attachment #797592 - Flags: review?(gps)
Attachment #798979 - Flags: review?(rnewman) → review+
(Reporter)

Comment 15

5 years ago
Comment on attachment 798979 [details] [diff] [review]
Don't poll every minute;

Review of attachment 798979 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/datareporting/policy.jsm
@@ +583,5 @@
> +    if (offsetT >= 5 * 60 * 1000) {
> +      offsetT += Math.floor(15 * 60 * 1000 * Math.random());
> +    }
> +
> +    this.stop();

Just call this._timer.cancel().

@@ +586,5 @@
> +
> +    this.stop();
> +    this._timer.initWithCallback({
> +      notify: function () {
> +        this.stop();

Ditto.

@@ +638,5 @@
>  
>      // If the system clock were ever set to a time in the distant future,
>      // it's possible our next schedule date is far out as well. We know
>      // we shouldn't schedule for more than a day out, so we reset the next
> +    // scheduled date appropriately

Nit: add period after sentence.

@@ +639,5 @@
>      // If the system clock were ever set to a time in the distant future,
>      // it's possible our next schedule date is far out as well. We know
>      // we shouldn't schedule for more than a day out, so we reset the next
> +    // scheduled date appropriately
> +    if (nextSubmissionDate.getTime() >= nowT + MILLISECONDS_PER_DAY + 1) {

Or you could just use > instead of +1.
Attachment #798979 - Flags: review?(gps) → review+
Created attachment 800408 [details] [diff] [review]
Don't poll every minute;

Implemented changes
Attachment #800408 - Flags: review?(gps)

Updated

5 years ago
Attachment #798979 - Attachment is obsolete: true
(Reporter)

Comment 17

5 years ago
Comment on attachment 800408 [details] [diff] [review]
Don't poll every minute;

Review of attachment 800408 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. Giving to rnewman for additional review because this code is important!
Attachment #800408 - Flags: review?(rnewman)
Attachment #800408 - Flags: review?(gps)
Attachment #800408 - Flags: review+
Attachment #800408 - Flags: review?(rnewman) → review+
Created attachment 807268 [details] [diff] [review]
Don't poll every minute;

Since Bug 862563 got the green light both from reviewrs and tbpl (finally!!!), I just made this patch compatible with those from that bug. Just one or two minor conflict fixes, nothing else

Here's the green try run as well: https://tbpl.mozilla.org/?tree=Try&rev=07acdd192c80
Attachment #807268 - Flags: review?(gps)

Updated

5 years ago
Attachment #800408 - Attachment is obsolete: true
(Reporter)

Updated

5 years ago
Attachment #807268 - Flags: review?(gps) → review+

Updated

5 years ago
Blocks: 888052

Updated

5 years ago
Blocks: 925587
(Reporter)

Updated

5 years ago
Duplicate of this bug: 958552
(Reporter)

Updated

5 years ago
Blocks: 948528
Is there a reason why this patch is still pending?
Created attachment 8361177 [details] [diff] [review]
Don't poll every minute, v2
Attachment #807268 - Attachment is obsolete: true
Attachment #8361177 - Flags: review?(gps)
(Reporter)

Comment 22

5 years ago
Comment on attachment 8361177 [details] [diff] [review]
Don't poll every minute, v2

Review of attachment 8361177 [details] [diff] [review]:
-----------------------------------------------------------------

This patch requires bug 862563 to land first. Sadly, the existing tests are flawed and they don't fail with just this patch applied. Bug 862563 shouldn't be too much effort to get landed.

Can you please also revert the Author information back to Stefan? I don't want to take credit away from you, but Stefan toiled on these patches and IMO he deserves the credit.

::: services/datareporting/tests/xpcshell/test_policy.js
@@ +614,5 @@
>    do_check_eq(listener.requestDataUploadCount, 1);
>    do_check_eq(listener.requestRemoteDeleteCount, 1);
>  });
>  
> +add_task(function test_polling() {

Nit: We should use function* for generators now.

add_task(function* test_polling() {

@@ +633,5 @@
>      value: function fakeCheckStateAndTrigger() {
>        let now = Date.now();
>        let after = now - then;
>        count++;
> +      let intended_elapsed_time = arrSum(intended.slice(0, count));

Nit: camelCase variable names.
Attachment #8361177 - Flags: review?(gps) → feedback+

Updated

4 years ago
Assignee: steven.mirea → nobody
Flags: firefox-backlog+

Comment 23

4 years ago
Setting to diamond & good-next-big, going looking for someone to carry this over the line. GPS, will you mentor?
Status: ASSIGNED → NEW
Flags: needinfo?(gps)
Whiteboard: [diamond] [good-next-bug]
Greg is taking a leave of absence, so I doubt it. I can mentor.
Flags: needinfo?(gps)

Updated

4 years ago
Whiteboard: [diamond] [good-next-bug] → [diamond] [good-next-bug] [mentor=rnewman]

Updated

4 years ago
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] → [diamond] [good-next-bug] [mentor=rnewman] p=0 [qa?]
Assignee: nobody → georg.fritzsche
I'll check into this, but there might be update hotfix work coming up that has to come first.
Added to Iteration 32.3.  Georg, can you recommend if this bug should be marked as [qa-] or [qa+] for QA verification.
Status: NEW → ASSIGNED
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] p=0 [qa?] → [diamond] [good-next-bug] [mentor=rnewman] p=0 s=it-32c-31a-30b.3 [qa?]
Also, can you provide a point estimate.
Flags: needinfo?(georg.fritzsche)

Updated

4 years ago
Assignee: georg.fritzsche → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(georg.fritzsche)
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] p=0 s=it-32c-31a-30b.3 [qa?] → [diamond] [good-next-bug] [mentor=rnewman] p=0 [qa?]

Updated

4 years ago
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] p=0 [qa?] → [diamond] [good-next-bug] [mentor=rnewman] p=5 [qa?]
Assignee: nobody → georg.fritzsche
Added to Iteration 33.1
Status: NEW → ASSIGNED
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] p=5 [qa?] → [diamond] [good-next-bug] [mentor=rnewman] p=5 s=33.1 [qa?]
(Assignee)

Updated

4 years ago
Mentor: rnewman
Whiteboard: [diamond] [good-next-bug] [mentor=rnewman] p=5 s=33.1 [qa?] → [diamond] [good-next-bug] p=5 s=33.1 [qa?]
Due to other interruptions & test failures on bug 862563 i'm not sure if i'll actually manage it in this iteration and was recommended to drop it.
Assignee: georg.fritzsche → nobody
Removed from Iteration 33.1
Status: ASSIGNED → NEW
Whiteboard: [diamond] [good-next-bug] p=5 s=33.1 [qa?] → [diamond] [good-next-bug] p=5 [qa?]
Summary: Don't poll every minute → FHR should not poll every minute
(Reporter)

Comment 31

4 years ago
This bug is now unblocked!

Updated

4 years ago
Points: --- → 5
Whiteboard: [diamond] [good-next-bug] p=5 [qa?] → [diamond] [good-next-bug]

Updated

4 years ago
Whiteboard: [diamond] [good-next-bug] → [diamond] [good next bug]
Created attachment 8545248 [details] [diff] [review]
Don't poll every minute in FHR

It looks like there've been a number of changes over the last year; this patch
un-bitrots the patch.  Assuming I haven't botched anything, this try run should
come back green:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2aa872ac130b

I think review on this should almost be a rubber-stamp, but I'll hold off until
I see green on try.
Attachment #8361177 - Attachment is obsolete: true
froydnj, that try job is long expired. Is it worth trying again?
Flags: needinfo?(nfroyd)
We're about to retire FHR with Unified Telemetry (http://mzl.la/1Ky4sj5), targetting Fx41.
I don't think it will be useful to re-activate this now, unless there is an impact on mobile.
(In reply to Nicholas Nethercote [:njn] from comment #33)
> froydnj, that try job is long expired. Is it worth trying again?

The try job came back with some oranges in FHR tests that I didn't understand.  Assuming that we're going to remove FHR and replace it with something that isn't so poll-happy, I don't think it's worth it for me to try and improve those patches.
Flags: needinfo?(nfroyd)
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → WORKSFORME
Resolution: WORKSFORME → WONTFIX
You need to log in before you can comment on or make changes to this bug.