Closed Bug 1365970 Opened 7 years ago Closed 7 years ago

Move sessionrestore data collector timer in the content process to idle dispatch

Categories

(Firefox :: Session Restore, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
Firefox 57
Iteration:
57.3 - Sep 19
Performance Impact high
Tracking Status
firefox57 --- fixed

People

(Reporter: ehsan.akhgari, Assigned: wiwang)

References

(Blocks 3 open bugs)

Details

(Whiteboard: [reserve-photon-performance])

Attachments

(2 files, 5 obsolete files)

See this profile on the reference Acer machine, 33ms (read 66ms) of jank optimizing XPath expressions when searching on youtube.com.  :-(

https://perfht.ml/2pXwwo8
Whiteboard: [qf]
Yeah, that XPath stuff is stupid... Olli already noticed that over in bug 1362330.
ttaubert and I talked about how inefficient this was when we had our conversation about sessionstore a few weeks ago - it looks like it is actually showing up on profiles.

Ehsan is suggesting here that we should requestIdleCallback to collect this information (if we do that, we should do it for all of SessionStore's data collection. Probably here: http://searchfox.org/mozilla-central/source/browser/components/sessionstore/content/content-sessionStore.js#777 - replacing the setTimeout with an idle-aware callback.

After that we may still want to look into making the form lookups be more granular (based on what the target of the event we are listening to is), in order to reduce this number even during requestIdleCallback periods.
Blocks: ss-perf
See Also: → 1353586
Whiteboard: [qf] → [qf] [photon-performance]
Blocks: 1354723
Whiteboard: [qf] [photon-performance] → [qf] [photon-performance] [triage]
Hmm now that I think of it, maybe we want to use an nsIIdleService idle observer here in fact?  Not really sure what the requirements of the sessionrestore code is but it's worth thinking about the kind of idleness we care about here.
Flags: qe-verify?
Priority: -- → P3
Whiteboard: [qf] [photon-performance] [triage] → [qf] [reserve-photon-performance]
Whiteboard: [qf] [reserve-photon-performance] → [qf:p1] [reserve-photon-performance]
Tim, do you have any suggestion on which kind of idle API we want to use here?
Flags: needinfo?(ttaubert)
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #3)
> Hmm now that I think of it, maybe we want to use an nsIIdleService idle
> observer here in fact?  Not really sure what the requirements of the
> sessionrestore code is but it's worth thinking about the kind of idleness we
> care about here.

FYR, I believe that nsIIdleService idle observer might stretch the time of collect interval from 15 seconds(default in pref) to several dozens of minutes[1], which is very likely to cause data loss; so unless we expect a fundamental change to session restore, it's not appropriate to use nsIIdleService API in this circumstance.


[1]
nsIIdleService - Mozilla | MDN
https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Reference/Interface/nsIIdleService

"The idle service is for computer-wide idle detection, not just application idle detection. In other words, even if the user is working in other applications, the idle service will still consider the user to be active."
Flags: needinfo?(ttaubert) → needinfo?(ehsan)
OK that makes sense, so idle dispatch is a more suitable candidate.
Flags: needinfo?(ehsan)
Depends on: 1368072
No longer depends on: 1353206
Assignee: nobody → wiwang
Status: NEW → ASSIGNED
Priority: P3 → P1
Flags: qe-verify? → qe-verify-
This bug is a priority for Speedometer V2.  It would be really nice if we can address it as soon as possible.  Thanks!
Here's a profile where it shows up during one of the "async" tests of Speedometer v2: http://bit.ly/2saz9YU
This bug is my 1st priority now.

I'd like to confirm one important thing: 
Currently, we don't support the `IdleDeadline`[1] when using Services.tm.idleDispatchToMainThread(), is this correct?
(We have timeout now, though. Bug 1368072)
- Implementation bug (Bug 1353206 comment 16) said "not expose handling of either deadlines or timeouts" (2 months ago)
- I didn't see any usage of `IdleDeadline` yet in the tree.

If we have `IdleDeadline`, which means we can't utilize the IdleDeadline.timeRemaining();
And I believe this will make idle dispatch less effective in cases where callback takes not just 1 or 2 ms.

Mike, we don't support the `IdleDeadline` now, right?


[1]
IdleDeadline - Web APIs | MDN 
https://developer.mozilla.org/en-US/docs/Web/API/IdleDeadline
Flags: needinfo?(mikedeboer.mozbugs)
Besides, I am going to give a patch without the IdleDeadline handling.
(In reply to Will Wang [:WillWang] from comment #9)
> If we have `IdleDeadline`, which means we can't utilize the
> IdleDeadline.timeRemaining();

s/have/don't have
(In reply to Markus Stange [:mstange] from comment #8)
> Here's a profile where it shows up during one of the "async" tests of
> Speedometer v2: http://bit.ly/2saz9YU

Thank you for this profile, mstange!

I believe that idle dispatch could easily shift the 1.8ms session history stuff during the jank :D
(In reply to Will Wang [:WillWang] from comment #9)
> And I believe this will make idle dispatch less effective in cases where
> callback takes not just 1 or 2 ms.

This is related with the re-profiling as bug 1373672 comment 33 indicated, since the bug 1362330 was landed.
Will, did you mean to needinfo me? Or another mike?
Flags: needinfo?(wiwang)
(In reply to Mike de Boer [:mikedeboer] from comment #14)
> Will, did you mean to needinfo me? Or another mike?

Oh! Sorry, Mike, I think I should consult the original API implementer :)

Hi Andreas, would it be possible for you to know whether we support the `IdleDeadline` currently? (Please also refer comment 9)
Since you helped to implement the bug 1353206 and bug 1368072, I think you might be the best person to consult;
if I'm wrong, could you recommend one? (Andrew Overholt?)
Many thanks for your help! :)
Flags: needinfo?(wiwang)
Flags: needinfo?(mikedeboer.mozbugs)
Flags: needinfo?(afarre)
(In reply to Will Wang [:WillWang] from comment #15)
> (In reply to Mike de Boer [:mikedeboer] from comment #14)
> > Will, did you mean to needinfo me? Or another mike?
> 
> Oh! Sorry, Mike, I think I should consult the original API implementer :)
> 
> Hi Andreas, would it be possible for you to know whether we support the
> `IdleDeadline` currently? (Please also refer comment 9)
> Since you helped to implement the bug 1353206 and bug 1368072, I think you
> might be the best person to consult;
> if I'm wrong, could you recommend one? (Andrew Overholt?)
> Many thanks for your help! :)

From ThreadManager we only support deadlines if the callback is an nsIIdleRunnable, which isn't accessible for script implementation. If you have a window available you can call requestIdleCallback, the you'll have the idle deadline object.
Flags: needinfo?(afarre)
Thank you for your speedy reply, Andreas! :)

Unfortunately, we can't get a window under session restore... (and some other components.)
As you know, do we have any plan to support deadline for this kind of circumstances?
Thanks!
Flags: needinfo?(afarre)
This bug is about the content process timer and there is the `content` global in a content script. That will happily provide you with `requestIdleCallback`.
(In reply to Tim Taubert [:ttaubert] from comment #18)
> This bug is about the content process timer and there is the `content`
> global in a content script. That will happily provide you with
> `requestIdleCallback`.

Right, should probably have said global. This should work perfectly.
Flags: needinfo?(afarre)
Thanks for your remind, Tim, Andreas! It's really delightful :D

I have wrote a patch and the consideration for `requestIdleCallback` parameters is as below:
1. options(only timeout for now): set the timeout as same as `browser.sessionstore.interval`.
2. IdleDeadline.timeRemaining(): To avoid jank, at this moment I set that we trigger the `send`[1] only if timeRemaining() > 10ms, since the p75 in the telemetry[2] is about 10ms and is reducing.(Telemetry sample bias should be considered.)

It's easy for `send` to find a 10ms idle slot across the `browser.sessionstore.interval`(15 sec) through my some experiments.

We surely can do more optimization here, but I believe that most of the jank should be gone if we apply above idle dispatch.

I'm more than happy to take more actions if we find the jank is still showing up on the profile, for example:
* Now we batch the data collection for 1 sec [3], we could split it.
* We could wait for a bigger idle slot. (IdleDeadline.timeRemaining())
* We could `send` out the bigger data right away once that one is `push`-ed [4]
* Use a more delicate queueing method.
* ......etc.

I'm going to upload the patch for reference.

Note: Some articles are very valuable during considering the trade-off, like [5].

[1]
http://searchfox.org/mozilla-central/rev/01d27fdd3946f7210da91b18fcccca01d7324fe2/browser/components/sessionstore/content/content-sessionStore.js#796

[2]
https://telemetry.mozilla.org/new-pipeline/evo.html#!aggregates=median!mean!5th-percentile!25th-percentile!75th-percentile!95th-percentile&cumulative=0&end_date=null&keys=formdata&max_channel_version=nightly%252F56&measure=FX_SESSION_RESTORE_CONTENT_COLLECT_DATA_MS&min_channel_version=nightly%252F52&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=null&trim=1&use_submission_date=0

[3]
http://searchfox.org/mozilla-central/rev/01d27fdd3946f7210da91b18fcccca01d7324fe2/browser/components/sessionstore/content/content-sessionStore.js#713

[4]
http://searchfox.org/mozilla-central/rev/88180977d79654af025158d8ebeb8c2aa11940eb/browser/components/sessionstore/content/content-sessionStore.js#780

[5]
Cooperative Scheduling of Background Tasks API - Web APIs | MDN 
 https://developer.mozilla.org/en-US/docs/Web/API/Background_Tasks_API
Attached patch bug-1365970-WIP.patch (obsolete) — Splinter Review
For reference: Here is the patch with dev logs which can be utilized if needed.

I found that a subtle issue when some parameters are improperly set.
Comment on attachment 8887875 [details] [diff] [review]
bug-1365970-WIP.patch

Review of attachment 8887875 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/sessionstore/content/content-sessionStore.js
@@ +783,5 @@
>  
>      if (!this._timeout && !this._timeoutDisabled) {
>        // Wait a little before sending the message to batch multiple changes.
> +      //this._timeout = setTimeout(() => this.send(), this.BATCH_DELAY_MS);
> +      this._timeout = setTimeout(() => { content.window.requestIdleCallback(this.waitIdleToSend);  }, this.BATCH_DELAY_MS);

If below we had:

`if (!deadline || deadline.timeRemaining() < 10) {`

then this could be:

`this._timeout = setTimeout(() => this.waitIdleToSend(), this.BATCH_DELAY_MS);`

@@ +788,5 @@
>      }
> +    console.log("==== leaving push .....");
> +  },
> +  waitIdleToSend(deadline) {
> +    if (deadline.timeRemaining() < 10) {

Not a bad idea but I wonder how often we get a chunk of 10ms? If this never happens then we might on very busy systems never write to disk and lose crash protection. Maybe we should just check that timeRemaining() > 0?

If even that situation never occurs we should guard against that by specifying a timeout. So the condition might become:

`if (!deadline || (!deadline.didTimeout && deadline.timeRemaining() == 0)) {`

requestIdleCallback() could probably take {timeout: 15000} then?

@@ +790,5 @@
> +  },
> +  waitIdleToSend(deadline) {
> +    if (deadline.timeRemaining() < 10) {
> +      console.log("==== deadline.timeRemaining() = " + deadline.timeRemaining() + ", request again ");
> +      setTimeout(() => { content.window.requestIdleCallback(MessageQueue.waitIdleToSend);  }, 500);

Can't we just immediately request another idle callback? I don't think we need the setTimeout().

Also, content.requestIdleCallback() should do, no? (Without the .window)

@@ +806,5 @@
>     *        {flushID: 123} to specify that this is a flush
>     *        {isFinal: true} to signal this is the final message sent on unload
>     */
>    send(options = {}) {
> +    console.log("==== `send` got called");

Did you run the tests? Do they succeed?
Many thanks to for your rich feedback, Tim!

(In reply to Tim Taubert [:ttaubert] from comment #22)
> If below we had:
> 
> `if (!deadline || deadline.timeRemaining() < 10) {`
> 
> then this could be:
> 
> `this._timeout = setTimeout(() => this.waitIdleToSend(), this.BATCH_DELAY_MS);`
Your way looks better!
I'm not sure that do you mean using the `IdleDeadline`(solely) before the calling of `requestIdleCallback`(as a parameter passed into callback)?

> > +    if (deadline.timeRemaining() < 10) {
> 
> Not a bad idea but I wonder how often we get a chunk of 10ms? If this never
> happens then we might on very busy systems never write to disk and lose
> crash protection. Maybe we should just check that timeRemaining() > 0?
I was also wondering about this part!
Currently, I know:
> It's easy for `send` to find a 10ms idle slot across the `browser.sessionstore.interval`(15 sec) through my some experiments.
Sometimes, my experiments even show that there are some 49ms idle slots. (almost reach 50ms upper bound)
But, it's on _my_ machine.
To me, it's much convenient that just using `timeRemaining() > 0`, but I'm very curious about how effective it can be.
(I also have one question: 
Why `timeRemaining()` can be <= 0 when the `requestIdleCallback` was called? ...Did we really get an idle period?
The answer might be in section: "4.3 The timeRemaining method" of the W3C spec[1], but I didn't read it over yet.)
In example 2 of W3C spec [1], it also uses "if (deadline.timeRemaining() <= 5)" to wait a big enough idle period, so I think we can use this way but have to wait a proper idle period(not too big, not too small....).
BTW, the spec said:
"Future versions of this specification could allow other scheduling strategies. For example, schedule idle callback within the same idle period, a period that has at least X milliseconds of idle time, and so on."
That's good for our case here.


> If even that situation never occurs we should guard against that by
> specifying a timeout. So the condition might become:
> 
> `if (!deadline || (!deadline.didTimeout && deadline.timeRemaining() == 0)) {`
> 
> requestIdleCallback() could probably take {timeout: 15000} then?
Yes, that's right :D 
I didn't put the timeout into my WIP patch yet and I'd like to use the {timeout: 15000} as you specified(same as my comment 20).


> > +      setTimeout(() => { content.window.requestIdleCallback(MessageQueue.waitIdleToSend);  }, 500);
> 
> Can't we just immediately request another idle callback? I don't think we
> need the setTimeout().
At first, I immediately request another idle callback but I saw a _lot_ of request log;
I'm afraid that's bad for performance so I tried to use setTimeout to throttle requests,
but I'm still considering whether it's good or not.
What do you think?

> Also, content.requestIdleCallback() should do, no? (Without the .window)
Yes, it can, you are right!


> Did you run the tests? Do they succeed?
Yes, I ran several times in local and didn't see obvious problem,
but ever observed that browser_464199.js failed intermittently; not sure the relationship yet.
I will do more tests along with my revised patch.

Thanks a lot, Tim!


[1] 
spec: https://www.w3.org/TR/requestidlecallback/
(In reply to Will Wang [:WillWang] from comment #23)
> I'm not sure that do you mean using the `IdleDeadline`(solely) before the
> calling of `requestIdleCallback`(as a parameter passed into callback)?

Sorry, I don't think I understand this question. Can you reframe it maybe?

> > It's easy for `send` to find a 10ms idle slot across the `browser.sessionstore.interval`(15 sec) through my some experiments.
> Sometimes, my experiments even show that there are some 49ms idle slots.
> (almost reach 50ms upper bound)
> But, it's on _my_ machine.

Exactly, that's what worries me.

> > Can't we just immediately request another idle callback? I don't think we
> > need the setTimeout().
> At first, I immediately request another idle callback but I saw a _lot_ of
> request log;
> I'm afraid that's bad for performance so I tried to use setTimeout to
> throttle requests,

But we'll be called back as soon as there's a new idle slot available. We want this to happen as fast as possible, just when the machine is somewhat idle. If you see lots of logs then the timeRemaining() check might be too restrictive?

> > Did you run the tests? Do they succeed?
> Yes, I ran several times in local and didn't see obvious problem,
> but ever observed that browser_464199.js failed intermittently; not sure the
> relationship yet.

Maybe this isn't actually your fault but bug 1375757. That would be great to have fixed too! Especially if you can reproduce it.
Gentle ping?  :-)  Any updates on this, Will?  Thanks!
Hi Mike,

After doing some experiments, here is my revised patch:
- improve logic for several suggestions from Tim (Thanks, Tim!)
- rebase for the change introduced by Bug 1384041 (Label the use of setTimeout of Timer.jsm in content-sessionStore.js)
- add better handling for possible pref change

Could you help to review this patch? :)

(Ehsan, thanks for your gentle ping :) There are some ongoing local experiments here and I better to update them more frequently!)
Attachment #8887875 - Attachment is obsolete: true
Attachment #8899795 - Flags: review?(mdeboer)
Comment on attachment 8899795 [details] [diff] [review]
Bug 1365970 - Move data collector timer in the content process to idle dispatch

Review of attachment 8899795 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the patch, Will!
It's almost there, I'd say! Just a few more changes and a green try run and it'll be good to go methinks.

::: browser/components/sessionstore/content/content-sessionStore.js
@@ +737,5 @@
>  
>    /**
> +   * The minimum idle period (in ms) we need for sending data to chrome process.
> +   */
> +  NEEDED_IDLE_PERIOD: 5,

Please add the unit of the constant in the name; 'NEEDED_IDLE_PERIOD_MS'

@@ +743,5 @@
> +  /**
> +   * Timeout for waiting an idle period to send data. We will set this from
> +   * the pref "browser.sessionstore.interval".
> +   */
> +  _timeoutIdleCallback: null,

This name is quite confusing to me... can you rename it to be more like `BATCH_DELAY_MS` above? It doesn't need to look like a constant above, because that might be confusing, but the naming should be more like `_idleWaitPeriodMs` or something.

@@ +830,5 @@
>  
>      if (!this._timeout && !this._timeoutDisabled) {
>        // Wait a little before sending the message to batch multiple changes.
>        this._timeout = setTimeoutWithTarget(
> +        () => content.requestIdleCallback(this.waitIdleToSend, {timeout: this._timeoutIdleCallback})

Please change this to simply
```js
() => this.sendWhenIdle(), this.BATCH_DELAY_MS, tabEventTarget);
```
... and add `if (deadline && deadline.didTimeout ...)` to the function body below.

@@ +836,4 @@
>      }
>    },
>  
> +  waitIdleToSend(deadline) {

Nit: please add a JSDoc comment here.

You might want to consider calling this `sendWhenIdle`

@@ +836,5 @@
>      }
>    },
>  
> +  waitIdleToSend(deadline) {
> +    if (deadline.didTimeout || (deadline.timeRemaining() > MessageQueue.NEEDED_IDLE_PERIOD)) {

isn't `deadline.didTimeout` the same as `deadline.timeRemaining() === 0`? Or is it an optimization because the `deadline.timeRemaining()` call is expensive?
Attachment #8899795 - Flags: review?(mdeboer) → review-
Thank you, Mike!
I already revised my patch and would like to confirm the following:

(In reply to Mike de Boer [:mikedeboer] from comment #27)
> @@ +830,5 @@
> >  
> >      if (!this._timeout && !this._timeoutDisabled) {
> >        // Wait a little before sending the message to batch multiple changes.
> >        this._timeout = setTimeoutWithTarget(
> > +        () => content.requestIdleCallback(this.waitIdleToSend, {timeout: this._timeoutIdleCallback})
> 
> Please change this to simply
> ```js
> () => this.sendWhenIdle(), this.BATCH_DELAY_MS, tabEventTarget);
> ```
Do you mean not using the `requestIdleCallback` the setTimeoutWithTarget?
If so, how can I get the `deadline` without invoking it?
(I probably misunderstand your meaning.)

> ... and add `if (deadline && deadline.didTimeout ...)` to the function body
> below.
Do you think that there is a probability that `deadline` can not be got?


> @@ +836,5 @@
> >      }
> >    },
> >  
> > +  waitIdleToSend(deadline) {
> > +    if (deadline.didTimeout || (deadline.timeRemaining() > MessageQueue.NEEDED_IDLE_PERIOD)) {
> 
> isn't `deadline.didTimeout` the same as `deadline.timeRemaining() === 0`? Or
> is it an optimization because the `deadline.timeRemaining()` call is
> expensive?
`deadline.didTimeout` is directly provided by API, so I can use it to check the timeout status;
I'm not sure whether `deadline.timeRemaining()` call is expensive or not, but indeed, I put the `deadline.didTimeout` as the former in the evaluation for avoiding the possible performance issue.

Thanks!!
Flags: needinfo?(mdeboer)
Typo:
not using the `requestIdleCallback` *within* the setTimeoutWithTarget
Flags: needinfo?(mdeboer)
Hi, Mike!

Thanks for your advice in comment and IRC! Here is the revised patch which is changed to a polymorphic and more clear way.

And for better readability (if I understand correctly), I didn't combine the `if (deadline)` and `if (deadline.didTimeout || ...)` into one if, what do you think?

Could you help to review again? thanks! :)
Attachment #8899795 - Attachment is obsolete: true
Attachment #8905465 - Flags: review?(mdeboer)
Comment on attachment 8905465 [details] [diff] [review]
(v2) Bug 1365970 - Move data collector timer in the content process to idle dispatch

Review of attachment 8905465 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great, Will. Ship it!

::: browser/components/sessionstore/content/content-sessionStore.js
@@ +843,5 @@
> +   *        An IdleDeadline object passed by requestIdleCallback().
> +   */
> +  sendWhenIdle(deadline) {
> +    if (deadline) {
> +      if (deadline.didTimeout || (deadline.timeRemaining() > MessageQueue.NEEDED_IDLE_PERIOD_MS)) {

nit: superfluous braces at `(deadline.timeRemaining() > `.
Attachment #8905465 - Flags: review?(mdeboer) → review+
Mike, thanks for your assistance!

Fixed nit and bring r+ from patch v2.
Attachment #8905465 - Attachment is obsolete: true
Attachment #8905860 - Flags: review+
Please help to land, thanks! :)
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6ee4de08ac37
Move data collector timer in the content process to idle dispatch. r=mikedeboer
Keywords: checkin-needed
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #36)
> > TEST-UNEXPECTED-FAIL | browser/base/content/test/webextensions/browser_extension_update_background.js | leaked 8 window(s) until shutdown [url = about:addons]

Thank you for catching and handling, aryx!

Mike, Andreas, I'm investigating the leak, I was thinking whether the leak is related with the idle dispatch or not,
do you have any idea, if possible? Thanks!!

I'll keep posting my findings here.
Flags: needinfo?(wiwang)
Flags: needinfo?(mdeboer)
Flags: needinfo?(afarre)
I'm going to add cancelIdleCallback() once we get `unload` event [1] to see whether the leak can be avoided.


[1] http://searchfox.org/mozilla-central/rev/00fa5dacedb925022f53d025121f1a919508e7ce/browser/components/sessionstore/content/content-sessionStore.js#903
(In reply to Will Wang [:WillWang] from comment #37)
> Mike, Andreas, I'm investigating the leak, I was thinking whether the leak
> is related with the idle dispatch or not,
> do you have any idea, if possible? Thanks!!
After examination, the idle dispatch did cause the leak.


(In reply to Will Wang [:WillWang] from comment #38)
> I'm going to add cancelIdleCallback() once we get `unload` event [1] to see
> whether the leak can be avoided.
I cancel the _last_ idle callback there, but the memory still leaks.

So two questions now:
1. I should also cancel _all_ previous idle callback, this needs a neat way to do so.
2. I'm not sure whether canceling the idle callback on `unload` event is too late or not.

Besides, another question:
Do we need to treat the idle callback as a leak?
If not, could we change that and let it run in a proper way?
(In reply to Will Wang [:WillWang] from comment #39)
> (In reply to Will Wang [:WillWang] from comment #37)
> > Mike, Andreas, I'm investigating the leak, I was thinking whether the leak
> > is related with the idle dispatch or not,
> > do you have any idea, if possible? Thanks!!
> After examination, the idle dispatch did cause the leak.
> 
> 
> (In reply to Will Wang [:WillWang] from comment #38)
> > I'm going to add cancelIdleCallback() once we get `unload` event [1] to see
> > whether the leak can be avoided.
> I cancel the _last_ idle callback there, but the memory still leaks.
> 
> So two questions now:
> 1. I should also cancel _all_ previous idle callback, this needs a neat way
> to do so.

Please always cancel an idle callback when `sendWhenIdle` is called. I think it might help you to think about this as `setTimeout` and `clearTimeout` being similar to `requestIdleCallback` and `cancelIdleCallback`. You can use the same logic to ensure that only one request is running at the same time.

> 2. I'm not sure whether canceling the idle callback on `unload` event is too
> late or not.

I which case the 'unload' event should be good enough, once you've ensured that there's only one `requestIdleCallback` running at the same time.
Also, IF there's a `requestIdleCallback` active, you'll want to cancel it _and_ call `MessageQueue.send()` right away to make sure no data is lost.

> Besides, another question:
> Do we need to treat the idle callback as a leak?
> If not, could we change that and let it run in a proper way?

Yes, because it's a DOM API, which keeps the window (content) object alive. I think it helps us write more correct code that properly cleans up after itself. We apparently just had more work to do here ;)
Flags: needinfo?(mdeboer)
Mike, many thanks for your reply!!

(In reply to Mike de Boer [:mikedeboer] from comment #40)
> Please always cancel an idle callback when `sendWhenIdle` is called.
...
> Also, IF there's a `requestIdleCallback` active, you'll want to cancel it
> _and_ call `MessageQueue.send()` right away to make sure no data is lost.

Does this mean that we will lose the ability to check for deadline.didTimeout and deadline.timeRemaining()?
Flags: needinfo?(mdeboer)
(In reply to Will Wang [:WillWang] from comment #41)
> Does this mean that we will lose the ability to check for
> deadline.didTimeout and deadline.timeRemaining()?

What I meant was that in the case of an unload event, we don't want wait for an idle callback anymore before we can send the data; we want to send everything we have in one go when that happens. Does that make sense?
Flags: needinfo?(mdeboer)
> What I meant was that in the case of an unload event, we don't want wait for
> an idle callback anymore before we can send the data; we want to send
> everything we have in one go when that happens. Does that make sense?
Oh! Thanks for your clarification, I thought the logic in the wrong aspect.

In short:
- During normal running: keep only one `requestIdleCallback`
- When `unload` event: cancel the `requestIdleCallback` and send everything

Is that correct?
Thank you, Mike!
Flags: needinfo?(mdeboer)
Before the backout, these regressions were noticed:

== Change summary for alert #9359 (as of September 08 2017 12:37 UTC) ==

Regressions:

 19%  JS summary windows10-64 opt      125,799,573.86 -> 149,878,077.58
  9%  Explicit Memory summary windows10-64 opt 316,330,514.50 -> 344,672,390.22
  6%  Resident Memory summary windows10-64 opt 464,394,511.59 -> 489,918,908.47

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=9359
They were canceled by the backout and this is the confirmation:

== Change summary for alert #9360 (as of September 08 2017 14:05 UTC) ==

Improvements:

 21%  JS summary windows10-64 opt      155,091,810.97 -> 123,281,661.86
 11%  Explicit Memory summary windows10-64 opt 352,081,081.88 -> 314,509,088.48
  7%  Resident Memory summary windows10-64 opt 498,171,088.85 -> 462,974,543.16

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=9360
(In reply to Will Wang [:WillWang] from comment #43)
> In short:
> - During normal running: keep only one `requestIdleCallback`
> - When `unload` event: cancel the `requestIdleCallback` and send everything
> 
> Is that correct?
> Thank you, Mike!

I think so, yes!
Flags: needinfo?(mdeboer)
(In reply to Will Wang [:WillWang] from comment #39)
> After examination, the idle dispatch did cause the leak.

I performed more cross tests and found that:
1. this idle dispatch _patch_ did cause the leak.
2. the `Services.prefs.addObserver` part of patch causes the leak, not the idle dispatch part. (Good for the usage of `requestIdleCallback` API!)
3. Removing the observer during unload event can fix the leak in several local tests.

I'm testing this leak on try server.

Besides,
Since we already had the `MessageQueue.send()`[1] during unload event, I think I can save another `send` :D



[1]
http://searchfox.org/mozilla-central/rev/00fa5dacedb925022f53d025121f1a919508e7ce/browser/components/sessionstore/content/content-sessionStore.js#906
Hi Mike,

Here is the revised patch:
1. Fixed memory leak. (Remove the observer during `unload` event)
2. Keep only one `requestIdleCallback`.
3. Cancel the `requestIdleCallback` during `unload` event.

Could you please review again?
Thanks a lot!!
Attachment #8905860 - Attachment is obsolete: true
Attachment #8906991 - Flags: review?(mdeboer)
Flags: needinfo?(afarre)
Comment on attachment 8906991 [details] [diff] [review]
(v4) Bug 1365970 - Move data collector timer in the content process to idle dispatch

Review of attachment 8906991 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, Will. We'll be there with a few more changes:

::: browser/components/sessionstore/content/content-sessionStore.js
@@ +805,5 @@
> +
> +  /**
> +   * Cancels pending idle callback.
> +   */
> +  cancel() {

This name is confusing to me, because the first thing I thought about was 'oh, this cancels the current send() in progress?', which is not the case.
Perhaps `cleanupTimers` and include the `clearTimeout` from `send` in here as well?

@@ +864,5 @@
> +  sendWhenIdle(deadline) {
> +    if (deadline) {
> +      if (deadline.didTimeout || deadline.timeRemaining() > MessageQueue.NEEDED_IDLE_PERIOD_MS) {
> +        MessageQueue.send();
> +        MessageQueue.cancel();

OK, this makes sense from a log perspective, but why not contain _all_ this logic in `MessageQueue.send()`? Please make the change to `clearTimers` and call that from `send`.

@@ +867,5 @@
> +        MessageQueue.send();
> +        MessageQueue.cancel();
> +        return;
> +      }
> +    } else if (MessageQueue._idleCallbackID) {

Well, this would have been correct _if_ you had cleared `MessageQueue._idleCallbackID` in the previous if-statement when `deadline` is passed in.
Attachment #8906991 - Flags: review?(mdeboer) → review-
Thanks, Mike!

(In reply to Mike de Boer [:mikedeboer] from comment #50)
> This name is confusing to me, because the first thing I thought about was
> 'oh, this cancels the current send() in progress?', which is not the case.
> Perhaps `cleanupTimers` and include the `clearTimeout` from `send` in here
> as well?
Great improvement!


> logic in `MessageQueue.send()`? Please make the change to `clearTimers` and
Do you mean `cleanupTimers`?


> Well, this would have been correct _if_ you had cleared
> `MessageQueue._idleCallbackID` in the previous if-statement when `deadline`
> is passed in.
If I didn't clear that when `deadline` is passed in, this means another `requestIdleCallback` will be scheduled;
I'm not sure what error will happen, could you advise me?

Thanks!
Flags: needinfo?(mdeboer)
(In reply to Will Wang [:WillWang] from comment #51)
> > logic in `MessageQueue.send()`? Please make the change to `clearTimers` and
> Do you mean `cleanupTimers`?

Yup.

> If I didn't clear that when `deadline` is passed in, this means another
> `requestIdleCallback` will be scheduled;
> I'm not sure what error will happen, could you advise me?

Ok, well, I think with the `cleanupTimers` method implemented, you can leave this as-is indeed.
Flags: needinfo?(mdeboer)
Hi Mike,

Here is the revised patch(v5),
could you please review again? :)

Thanks!
Attachment #8906991 - Attachment is obsolete: true
Attachment #8907575 - Flags: review?(mdeboer)
Comment on attachment 8907575 [details] [diff] [review]
(v5) Bug 1365970 - Move data collector timer in the content process to idle dispatch

Review of attachment 8907575 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, thanks!
Attachment #8907575 - Flags: review?(mdeboer) → review+
Mike, thank you!!
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/197a9e821ce8
Move data collector timer in the content process to idle dispatch. r=mikedeboer
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/197a9e821ce8
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 57
Iteration: --- → 57.3 - Sep 19
Depends on: 1401422
Comment on attachment 8910081 [details]
Bug 1365970: Bail out from idle callback if frameloader is being destroyed.

https://reviewboard.mozilla.org/r/181568/#review187052

Cool, thanks Kris! But I think you accidentally used the wrong bug number? I think this works too, though.
Attachment #8910081 - Flags: review?(mdeboer) → review+
Comment on attachment 8910081 [details]
Bug 1365970: Bail out from idle callback if frameloader is being destroyed.

https://reviewboard.mozilla.org/r/181568/#review187052

Oops, yeah, should have been bug 1401422
Performance Impact: --- → P1
Whiteboard: [qf:p1] [reserve-photon-performance] → [reserve-photon-performance]
You need to log in before you can comment on or make changes to this bug.