Try to run GC/CC slices, including forgetSkippable, during idle time

RESOLVED FIXED in Firefox 55

Status

()

Core
XPCOM
RESOLVED FIXED
3 months ago
5 days ago

People

(Reporter: smaug, Assigned: smaug)

Tracking

(Depends on: 2 bugs, Blocks: 2 bugs)

unspecified
mozilla55
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox55 fixed)

Details

(Whiteboard: [qf:p1])

Attachments

(2 attachments, 9 obsolete attachments)

(Assignee)

Description

3 months ago
Created attachment 8871473 [details] [diff] [review]
patch

Something like this. Since idle periods can be shorter and longer, better to try
to vary more in what kinds of budges are used.

This really needs bug 1311425 to land first.
(Assignee)

Comment 1

3 months ago
Comment on attachment 8871473 [details] [diff] [review]
patch

I haven't pushed this to try yet, but feel free to give feedback.
We can't use all the "automatic" timer handling from idle dispatch, since the setup here is a bit more complicated. But at least SetTimer and SetDeadline is called when expected, and of course also Run.

nsJSContext::NotifyDidPaint() removal is there because we should try to use idle time and not right-after-tick time. The latter shows up in Speedometer profiles.
(Idle time not, since that is between test runs.)
Attachment #8871473 - Flags: feedback?(continuation)
(Assignee)

Comment 2

3 months ago
The code may change a bit once GC timers are moved to use idle too, but better to start from something.
Especially GC timers don't use repeating timers atm.
(Assignee)

Updated

3 months ago
Whiteboard: [qf]
(Assignee)

Updated

3 months ago
Blocks: 1367944
(Assignee)

Comment 3

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a32dd9dec1da0504039a366dbd53defbd49a24c6
(Assignee)

Updated

3 months ago
Attachment #8871473 - Flags: feedback?(continuation) → review?(continuation)
Comment on attachment 8871473 [details] [diff] [review]
patch

Review of attachment 8871473 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsJSEnvironment.cpp
@@ +124,5 @@
>  
> +// ForgetSkippable is usually fast, so we can use small budgets.
> +// This isn't a real budget but a hint to CycleCollectorRunner whether there
> +// is enough time to call ForgetSkippable.
> +static const int64_t kForgetSkippableSliceBudget = 2;

If this isn't really a budget, maybe call it something else? SliceDuration instead of SliceBudget?

@@ +130,2 @@
>  // Maximum amount of time that should elapse between incremental CC slices
> +static const int64_t kICCIntersliceDelay = 64; // ms

Why did you adjust these times?

@@ +150,5 @@
>  // Large value used to specify that a script should run essentially forever
>  #define NS_UNLIMITED_SCRIPT_RUNTIME (0x40000000LL << 32)
>  
>  // if you add statics here, add them to the list in StartupJSEnvironment
> +class CycleCollectorRunner;

nit: Put this on the line before the comment, because it isn't a static.

@@ +254,5 @@
> +    }
> +
> +    // Deadline is null when called from timer.
> +    if (mDeadline.IsNull() ||
> +        (TimeStamp::Now() + TimeDuration::FromMilliseconds(mBudget)) <

Maybe store mBudget as a duration? That would make this line shorter at least. ;)

@@ +1574,5 @@
>    // use an unlimited budget.
>    js::SliceBudget budget = js::SliceBudget::unlimited();
>  
>    if (sIncrementalCC) {
> +    int64_t sliceBudget = kICCSliceBudget;

It is confusing to have variables called both "budget" and "sliceBudget". Maybe this could be "baseBudget"?

@@ +1581,5 @@
> +      sliceBudget =
> +        int64_t((aDeadline - TimeStamp::Now()).ToMilliseconds()) / 2;
> +    }
> +
> +    // ...but at least kICCSliceBudget.

Is there any minimum to how big the deadline will be? If we get called back with a deadline of like 0.005ms it would be better to not do anything.

@@ +1582,5 @@
> +        int64_t((aDeadline - TimeStamp::Now()).ToMilliseconds()) / 2;
> +    }
> +
> +    // ...but at least kICCSliceBudget.
> +    sliceBudget = std::max(kICCSliceBudget, sliceBudget);

This should go inside the !IsNull() branch.

@@ +1603,5 @@
>          const float maxLaterSlice = 40.0f;
>          float laterSliceBudget = maxLaterSlice * percentToHalfDone;
>  
>          budget = js::SliceBudget(js::TimeBudget(std::max({delaySliceBudget,
> +                  laterSliceBudget, (float)sliceBudget})));

This code takes the largest of the three, which means we will totally ignore the deadline sometimes. Is that what you want?

@@ +1685,5 @@
>    MOZ_ASSERT(!sICCTimer, "Tried to create a new ICC timer when one already existed.");
>  
>    // Create an ICC timer even if ICC is globally disabled, because we could be manually triggering
>    // an incremental collection, and we want to be sure to finish it.
> +  sICCTimer = CycleCollectorRunner::Create(ICCTimerFired, kICCIntersliceDelay,

Maybe rename this to sICCRunner instead? It isn't a timer any more.

@@ +2140,4 @@
>      // We can kill some objects before running forgetSkippable.
>      nsCycleCollector_dispatchDeferredDeletion();
>  
> +    sCCTimer =

Maybe rename this to sCCRunner instead? It isn't a timer any more.

@@ +2378,5 @@
>  void
>  mozilla::dom::StartupJSEnvironment()
>  {
>    // initialize all our statics, so that we can restart XPCOM
> +  sGCTimer = sShrinkingGCTimer = sFullGCTimer = nullptr;

Can you really drop sCCTimer here? At least at some point this would get called more than once.

::: layout/base/nsRefreshDriver.cpp
@@ -2019,5 @@
>  
>    if (notifyGC && nsContentUtils::XPConnect()) {
>      GeckoProfilerTracingRAII tracer("Paint", "NotifyDidPaint");
>      nsContentUtils::XPConnect()->NotifyDidPaint();
> -    nsJSContext::NotifyDidPaint();

Why do you remove the refresh driver mechanism? Does the idle callback method already schedule work during the spare refresh time? Please add a comment about that to the commit message.
Attachment #8871473 - Flags: review?(continuation) → review+
(Assignee)

Updated

3 months ago
Whiteboard: [qf] → [qf:p1]
(Assignee)

Comment 5

3 months ago
I'll modify the patch a bit more while hacking bug 1367944 too.
(Assignee)

Updated

3 months ago
Attachment #8871473 - Attachment is obsolete: true
(Assignee)

Updated

3 months ago
Summary: Try to run CC slices, including forgetSkippable, during idle time → Try to run GC/CC slices, including forgetSkippable, during idle time
(Assignee)

Comment 6

3 months ago
Created attachment 8873652 [details] [diff] [review]
v2, including GC stuff

https://treeherder.mozilla.org/#/jobs?repo=try&revision=01a03aeb3aa079976e13eb295c35eace045333f7
(Assignee)

Comment 7

3 months ago
> This code takes the largest of the three, which means we will totally ignore the deadline sometimes. Is that what you want?

Yes, if we're late with CC, we do want longer slices, as we did before.

> Can you really drop sCCTimer here? At least at some point this would get called more than once.
Really?
That would mean we leak.

I'm adding back a bit different mechanism for RefreshDriver stuff
(Assignee)

Comment 8

3 months ago
> Why did you adjust these times?
The longer time we have, the likelier we end up using idle time.
And when there is enough idle time we end up running runners quite often.
(Assignee)

Comment 9

3 months ago
Created attachment 8873879 [details] [diff] [review]
v3

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0b17510c8d4702e8f6388d6dfdb1045dc5f30ab6
Attachment #8873652 - Attachment is obsolete: true
(Assignee)

Comment 10

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b27a72aa1f53f88ff3d8c81c64a5018419dc243d
(Assignee)

Comment 11

3 months ago
Created attachment 8873952 [details] [diff] [review]
v4

Maybe v3 was a bit too aggressive. 
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cd9e7a03e82efca4b3b8b9dad37612666cb64e5e
(Assignee)

Comment 12

3 months ago
ah, another patch in my tree causes gtest failures at least
(Assignee)

Comment 13

3 months ago
Comment on attachment 8873952 [details] [diff] [review]
v4

let's try feedback while I'm sorting out the failures (especially the one in my tree related to idle period handling and not to this bug).

I increased some timeouts a bit and to some factor of 16
Attachment #8873952 - Flags: feedback?(continuation)
Comment on attachment 8873952 [details] [diff] [review]
v4

Review of attachment 8873952 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsJSEnvironment.cpp
@@ +225,5 @@
>  // us from triggering expensive full collections too frequently.
>  static int32_t sExpensiveCollectorPokes = 0;
>  static const int32_t kPokesBetweenExpensiveCollectorTriggers = 5;
>  
> +// Return true if some meaning full work was done.

nit: "meaningful".

@@ +257,5 @@
> +      didRun = mCallback(mDeadline, mData);
> +    }
> +
> +    if (mCallback && (mRepeating || !didRun)) {
> +      // If we didn't do meaningfull work, don't schedule using immediate

nit: meaningful.

@@ +1891,4 @@
>  {
> +  nsJSContext::KillInterSliceGCRunner();
> +  MOZ_ASSERT(sActiveIntersliceGCBudget > 0);
> +  int64_t budget = sActiveIntersliceGCBudget;

When does sActiveIntersliceGCBudget get set?

::: layout/base/nsRefreshDriver.cpp
@@ +1709,5 @@
> +nsRefreshDriver::DispatchIdleRunnableAfterTick(nsIRunnable* aRunnable,
> +                                               uint32_t aDelay)
> +{
> +  if (!sPendingIdleRunnables) {
> +    sPendingIdleRunnables = new AutoTArray<RunnableWithDelay, 8>();

Why does it need to be an auto array if you are just heap allocating it all anyways?

@@ +1713,5 @@
> +    sPendingIdleRunnables = new AutoTArray<RunnableWithDelay, 8>();
> +  }
> +
> +  RunnableWithDelay rwd = {aRunnable, aDelay};
> +  sPendingIdleRunnables->AppendElement(rwd, mozilla::fallible);

Why don't you check the return value here? It seems like a bad idea to just lose a runnable.

@@ +1719,5 @@
> +
> +void
> +nsRefreshDriver::CancelIdleRunnable(nsIRunnable* aRunnable)
> +{
> +  if (sPendingIdleRunnables) {

Reverse this test and do an early return?

@@ +1722,5 @@
> +{
> +  if (sPendingIdleRunnables) {
> +    for (uint32_t i = 0; i < sPendingIdleRunnables->Length(); ++i) {
> +      if ((*sPendingIdleRunnables)[i].mRunnable == aRunnable) {
> +        sPendingIdleRunnables->RemoveElementAt(i);

Hmm there should be some existing nsTArray method to do this.

@@ +1996,5 @@
>        imagesToRefresh[i]->RequestRefresh(aNowTime);
>      }
>    }
>  
> +  bool dispatchAfterTickRunnables = false;

This variable name seems slightly odd to me. Maybe "dispatchRunnablesAfterTick"?

@@ +2034,5 @@
>        MOZ_ASSERT(timelines);
>        MOZ_ASSERT(timelines->HasConsumer(docShell));
>        timelines->AddMarkerForDocShell(docShell, "Paint",  MarkerTracingType::END);
>      }
> + 

nit: trailing whitespace

@@ +2055,5 @@
>      ScheduleViewManagerFlush();
>    }
>  
> +  if (dispatchAfterTickRunnables && sPendingIdleRunnables) {
> +    AutoTArray<RunnableWithDelay, 8>* runnables = sPendingIdleRunnables;

Can these both be UniquePtr?
(Assignee)

Comment 15

3 months ago
(In reply to Andrew McCreight [:mccr8] from comment #14)
> nit: "meaningful".
> 
oops

> When does sActiveIntersliceGCBudget get set?
The same place it is set now
SetMemoryGCSliceTimePrefChangedCallback

> > +nsRefreshDriver::DispatchIdleRunnableAfterTick(nsIRunnable* aRunnable,
> > +                                               uint32_t aDelay)
> > +{
> > +  if (!sPendingIdleRunnables) {
> > +    sPendingIdleRunnables = new AutoTArray<RunnableWithDelay, 8>();
> 
> Why does it need to be an auto array if you are just heap allocating it all
> anyways?
To reduce malloc/free if there are many runnables (well <= 8)

> > +  RunnableWithDelay rwd = {aRunnable, aDelay};
> > +  sPendingIdleRunnables->AppendElement(rwd, mozilla::fallible);
> 
> Why don't you check the return value here? It seems like a bad idea to just
> lose a runnable.
oops, should be infallible append



> > +  if (sPendingIdleRunnables) {
> > +    for (uint32_t i = 0; i < sPendingIdleRunnables->Length(); ++i) {
> > +      if ((*sPendingIdleRunnables)[i].mRunnable == aRunnable) {
> > +        sPendingIdleRunnables->RemoveElementAt(i);
> 
> Hmm there should be some existing nsTArray method to do this.
To do what? To iterate?


> This variable name seems slightly odd to me. Maybe
> "dispatchRunnablesAfterTick"?
yup

> > +  if (dispatchAfterTickRunnables && sPendingIdleRunnables) {
> > +    AutoTArray<RunnableWithDelay, 8>* runnables = sPendingIdleRunnables;
> 
> Can these both be UniquePtr?
I guess so.
(Assignee)

Comment 16

3 months ago
UniquePtr would add static initializer.
> To do what? To iterate?

To remove a specified element from an array. bool RemoveElement(const Item& aItem).
(Assignee)

Comment 18

3 months ago
But I remove only based on mRunnable, not the whole struct.
(Assignee)

Comment 19

3 months ago
Created attachment 8874494 [details] [diff] [review]
v5

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4744af5d4fc1e76464651d5593f67baba763072c
Attachment #8873879 - Attachment is obsolete: true
Attachment #8873952 - Attachment is obsolete: true
Attachment #8873952 - Flags: feedback?(continuation)
(Assignee)

Comment 20

3 months ago
Created attachment 8874604 [details] [diff] [review]
v6

https://treeherder.mozilla.org/#/jobs?repo=try&revision=997dbbd9374b3d247752ba5297326444759e6199
Attachment #8874494 - Attachment is obsolete: true
Attachment #8874604 - Flags: review?(continuation)
Comment on attachment 8874604 [details] [diff] [review]
v6

Review of attachment 8874604 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsJSEnvironment.cpp
@@ +231,5 @@
>  
> +// Return true if some meaningful work was done.
> +typedef bool (*CollectorRunnerCallback) (TimeStamp aDeadline, void* aData);
> +
> +// Repeating callback runner for CC and GC..

nit: There's an extra "." at the end.

@@ +1653,5 @@
>        }
>      }
>    }
>  
> +  nsCycleCollector_collectSlice(budget);

Please file a followup to get rid of the aPreferShorterSlices argument, if we don't need it any more.

@@ -1907,5 @@
>    }
>  
>    if (sGCTimer) {
>      if (ReadyToTriggerExpensiveCollectorTimer()) {
> -      GCTimerFired(nullptr, reinterpret_cast<void *>(JS::gcreason::DOM_WINDOW_UTILS));

Why did you drop the GC reason here?
Attachment #8874604 - Flags: review?(continuation) → review+
(Assignee)

Comment 22

3 months ago
> Why did you drop the GC reason here?
bah, left over from a patch when GCTimer was using the runner.
(Assignee)

Comment 23

3 months ago
Created attachment 8874918 [details] [diff] [review]
+ more eager CC/GC when running reftests

We've changed the CC/GC counter couple of times, and need that again. Otherwise Win7 VM debug OOMs.
Blocks: 1366803

Comment 24

3 months ago
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ff33e6c6f6a5
Try to run GC/CC slices, including forgetSkippable, during idle time, r=mccr8
(Assignee)

Updated

3 months ago
Depends on: 1370844
sorry had to back this out for memory leaks like https://treeherder.mozilla.org/logviewer.html#?job_id=105151884&repo=mozilla-inbound
Flags: needinfo?(bugs)

Comment 26

3 months ago
Backout by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1cca42e690ae
Backed out changeset ff33e6c6f6a5 for memory leaks
(Assignee)

Comment 27

3 months ago
I guess I should not schedule new Run call after sDidShutdown has turned to true.
Flags: needinfo?(bugs)
Blocks: 1370844
No longer depends on: 1370844
(Assignee)

Comment 28

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7f3f67e4fdb46493ab99da8d98368b6364d418e6
(Assignee)

Comment 29

3 months ago
ok, it is not that. Trying then delayed dispatch. That is very very rarely used feature.
(Assignee)

Comment 30

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0e27122a28ab512c4b03de195c4fa0d7e8075ae
(Assignee)

Comment 31

3 months ago
And trying also something else
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2ed7d9c6512e28692048dec9c8c9b386cca537fa
(Assignee)

Comment 32

3 months ago
Created attachment 8875446 [details] [diff] [review]
idle_slices_additional.diff

There seems to be something wrong with DelayedDispatch
Attachment #8875446 - Flags: review?(continuation)
Attachment #8875446 - Flags: review?(continuation) → review+
(Assignee)

Comment 33

3 months ago
 https://treeherder.mozilla.org/#/jobs?repo=try&revision=bc722ba745ea6ac376f70a848738df6ea626c2bf
(Assignee)

Comment 34

3 months ago
Created attachment 8875448 [details] [diff] [review]
idle_slices_8.diff

full patch

Comment 35

3 months ago
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1c66da2b1e88
Try to run GC/CC slices, including forgetSkippable, during idle time, r=mccr8

Comment 36

3 months ago
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c5aaa3f7b79e
trigger GC/CC even more often in reftests, r=bustage
Depends on: 1371214
This is causing crashes on windows debug reftests due to CC running out of memory, it needs to be backed out.  Bug 1361461 was blamed for it.

Comment 38

3 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/1c66da2b1e88
https://hg.mozilla.org/mozilla-central/rev/c5aaa3f7b79e
Status: NEW → RESOLVED
Last Resolved: 3 months ago
status-firefox55: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
(Assignee)

Comment 39

3 months ago
Testing https://treeherder.mozilla.org/#/jobs?repo=try&revision=3a03dcaba739b055ade074c57f6549d3ed73851e
(In reply to Olli Pettay [:smaug] from comment #39)
> Testing
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=3a03dcaba739b055ade074c57f6549d3ed73851e

Is this to fix up the windows reftests that have been frequently failing since around when the original patch landed?
Flags: needinfo?(bugs)
(Assignee)

Comment 41

2 months ago
It is a possible fix yes. Ugly fix.
Flags: needinfo?(bugs)
(Assignee)

Comment 42

2 months ago
Trying another approach
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5e6385c37e86b53cfe4db6e8d2b98a04d9b94ee0
I need to back out these patches for the spike in Windows reftest failures, it's getting to the point that filing a new bug for each of them is taking up most of my afternoon. 

https://hg.mozilla.org/mozilla-central/rev/5c61d5d5fc653454cb816b2bcf8c6a9f5a71c705
Status: RESOLVED → REOPENED
Flags: needinfo?(bugs)
Resolution: FIXED → ---
Target Milestone: mozilla55 → ---
(Assignee)

Comment 44

2 months ago
Created attachment 8875944 [details] [diff] [review]
idle_leak_tweaks.diff

https://treeherder.mozilla.org/#/jobs?repo=try&revision=07923d49a0d98300de9c1133367a564dbe7518a0
Attachment #8874604 - Attachment is obsolete: true
Attachment #8874918 - Attachment is obsolete: true
Attachment #8875446 - Attachment is obsolete: true
Flags: needinfo?(bugs)
(Assignee)

Comment 45

2 months ago
Comment on attachment 8875944 [details] [diff] [review]
idle_leak_tweaks.diff

This looks promising.
Attachment #8875944 - Flags: review?(continuation)
Attachment #8875944 - Flags: review?(continuation) → review+
(Assignee)

Comment 46

2 months ago
Created attachment 8876336 [details] [diff] [review]
up-to-date tweak
Attachment #8875944 - Attachment is obsolete: true

Comment 47

2 months ago
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/22c8d6540061
Try to run GC/CC slices, including forgetSkippable, during idle time, r=mccr8
https://hg.mozilla.org/integration/mozilla-inbound/rev/d3eb944e22e1
Try to run GC/CC slices, including forgetSkippable, during idle time, tweaks to keep reftest memory usage lower, r=mccr8
https://hg.mozilla.org/mozilla-central/rev/22c8d6540061
https://hg.mozilla.org/mozilla-central/rev/d3eb944e22e1
Status: REOPENED → RESOLVED
Last Resolved: 3 months ago2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
(Assignee)

Updated

2 months ago
Blocks: 1372042
Duplicate of this bug: 1368069
Depends on: 1372641
this looks to have given us a startup improvement on windows 7:
== Change summary for alert #7248 (as of June 13 2017 19:30 UTC) ==

Improvements:

  2%  tpaint windows7-32 opt e10s     280.61 -> 273.97

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=7248
and a damp improvement as well
apologies, tpaint is window open, NOT startup!

also some AWSY memory changes (many improvements on desktop, small regressions on android- some questions about running memory tests on android on emulators though, maybe it is ok to accept these):

== Change summary for alert #7193 (as of June 10 2017 00:34 UTC) ==

Regressions:

  5%  Resident Memory summary android-4-2-x86 opt      182,034,354.17 -> 190,508,672.67
  4%  Resident Memory summary android-api-15-gradle opt 179,586,539.42 -> 186,667,111.08

Improvements:

  3%  JS summary linux64 opt      185,029,119.52 -> 179,178,262.35
  3%  Explicit Memory summary windows10-64-vm opt 160,883,278.69 -> 156,174,871.24
  3%  Explicit Memory summary linux64 opt 161,336,184.87 -> 157,199,571.80
  3%  Resident Memory summary windows10-64-vm opt 658,963,904.81 -> 642,512,333.40
  2%  JS summary windows10-64-vm opt 190,036,559.59 -> 185,358,544.89

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=7193
(Assignee)

Updated

2 months ago
Depends on: 1373067
Blocks: 1373292
(Assignee)

Updated

5 days ago
Duplicate of this bug: 1168991
You need to log in before you can comment on or make changes to this bug.