Closed Bug 1301301 Opened 8 years ago Closed 8 years ago

Large regression in page close cycle collection pauses

Categories

(Core :: XPCOM, defect)

51 Branch
All
Windows 7
defect
Not set
major

Tracking


VERIFIED FIXED
mozilla52
Tracking Status
firefox48 --- unaffected
firefox49 --- unaffected
firefox-esr45 --- unaffected
firefox50 --- unaffected
firefox51 + verified
firefox52 + verified

People

(Reporter: Virtual, Assigned: mccr8)

References

Details

(5 keywords, Whiteboard: gfx-noted)

Attachments

(4 files, 8 obsolete files)

1.36 KB, text/plain
Details
3.28 KB, text/plain
Details
58 bytes, text/x-review-board-request
smaug: review+
Details
58 bytes, text/x-review-board-request
smaug: review+
Details
Attached file 53EhBLj6 (LZMA2).7z (obsolete)
After some time of browsing, Firefox Nightly hangs for a while, making itself unresponsive.

Peja Stija from MozillaZine has reliable STR (it happens more frequently with the "Self-Destructing Cookies" extension installed) and captured it in the Gecko Profiler here - https://cleopatra.io/#report=7ade24c52a5aa5dd1e15e3e668bdff5e599d62b5&jankOnly=true&selection=0,1,2,3,4,5,6,7,8,9,265,266,267,411,412,413,414,432,23,24,25,3127,26,36,23,24,427,428,23,24,49,50,24,25,3128,26,36,23,24,427,428,23,24,49,50,24,25,3129,26,36,23,24,427,428,23,24,49,50,24,25,1130,26,36,1131,50,24,25,1132,26,36,23,24,427,428,23,24,49,50,24,25,3133,26,36,23,24,427,428,23,24,49,50,24,25,1130,26,36,1131,50,24,25,1132,26,36,23,24,427,428,23,24,49,50,24

In my case I only get micro-stutter scrolling in the Gecko Profiler, because when I try to catch a real hang, the browser hangs completely. The profile is in the attachments; be careful, after extracting it's nearly an 80 MB file.
I'm pretty sure that Firefox 48 and 49 are unaffected, because this has only been happening for about a month.
The profiler says that you're printing the display list out for some reason. Is your preference "layout.display-list.dump" or "layout.display-list.dump-content" enabled? Can you try this with a clean profile please?
Flags: needinfo?(bernesb)
Whiteboard: gfx-noted
Yes, I had these 2 preferences enabled, as I thought they would help in gathering more useful data. I'll try to catch it one more time, but it's hard, as the Gecko Profiler hangs the whole browser.
In the meantime, please look at Peja Stija's profile, which shows these hangs, as mine only shows micro-stutter scrolling.
(In reply to Mason Chang [:mchang] from comment #4)
> The profiler says that you're printing the display list out for some reason.
> Is your preference "layout.display-list.dump" or
> "layout.display-list.dump-content" enabled? Can you try this with a clean
> profile please?

In my case, I have neither of those settings enabled.
From Peja's profile, it looks like the JS engine is doing lots of work and the GC is spending lots of time doing things. Nothing from graphics stands out here. Can someone from the JS team look at this please?
Component: Graphics → JavaScript Engine
Flags: needinfo?(naveed)
Please disregard my problem (and profile); I haven't encountered it since using a blank profile, so it must be a setting/add-on related issue and most probably not related to Virtual_ManPL's problem.
Please reopen if this comes up again. Thanks
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(naveed)
Resolution: --- → WORKSFORME
Flags: needinfo?(bernesb)
I'd just like to add that the OP's problem is probably a completely different issue, and it may be worth focusing on the attachment posted, since WORKSFORME probably doesn't fit Virtual_ManPL's initial problem.
The attachment posted, at least judging from the profile, is because he had a very expensive debug preference enabled.
I would like someone to diagnose at least Peja Stija's profile, because like I said, in my case the Gecko Profiler hangs the whole browser.
Moreover, it's clearly a regression, because going back to the stable version does not produce any hangs with the same profile and extensions.
Status: RESOLVED → REOPENED
Flags: needinfo?(naveed)
Resolution: WORKSFORME → ---
(In reply to Virtual_ManPL [:Virtual] - (ni? me) from comment #12)
> I would like for someone to diagnose at least Peja Stija profile, because
> like I said in my case Gecko Profiler hangs the whole browser.
> Moreover it's clearly the regression, because going back to stable version
> not produce any hangs with same profile and extensions.

Can you bisect to find a bug which is causing this please?
Flags: needinfo?(bernesb)
I'm not sure what to do about Peja's profile since a clean profile fixes it for him, which means a bad add-on is probably doing something.

Virtual, is this still happening for you with a clean profile?
(In reply to Mason Chang [:mchang] from comment #14)
> I'm not sure what to do about Peja's profile since a clean profile fixes it
> for him, which means a bad add on is probably doing something.

It's hard for me to believe that my profile is broken or some extensions are causing this, because when I go back to the stable Firefox version I don't encounter any more hangs with the same configuration.
Maybe diagnosing Peja Stija's profile will give some hints as to what's broken.



(In reply to Mason Chang [:mchang] from comment #13)
> Can you bisect to find a bug which is causing this please?
(In reply to Mason Chang [:mchang] from comment #14)
> Virtual, is this still happening for you with a clean profile?

I will try to diagnose and find it, but not sooner than in 1-2 weeks,
as this issue isn't reproducible straight away (at least for me) and I don't have much time right now for this time-consuming task.



@ Peja Stija - Maybe you could help with finding the regression range, since you can reproduce it easily (compared to me)? Even bisecting daily builds instead of inbound builds would be a big help.
Flags: needinfo?(zombulla14)
(In reply to Virtual_ManPL [:Virtual] - (ni? me) from comment #15)

> @ Peja Stija - Maybe you could help with finding regression range, as you
> could easily reproduce it (comparing to me)? Even getting daily instead of
> inbound builds will be a big help.

I'll do it; I can't promise immediate results, because the hanging I experienced goes back weeks (if not months) and I was under the impression that it was just part of using a nightly build.
I'll try to recreate and chime in when certain.
Well, using the non-blank profile, I even get the hangs every so many minutes with not a single add-on enabled, so this must be due either to corruption, a wrong setting, or a bug. Personally, I'm out of options.
I cannot pin this on a specific add-on; it seems to happen regardless of which add-ons are installed, if any.
I can reproduce it 60% of the time using these steps:

- Have 5 pinned tabs (of which 2 are YouTube video pages)
- Reload all 5 fairly fast, one after another
- A couple of seconds to a minute after all 5 have finished (re-)loading, the hang circle appears, sometimes for up to 10 seconds, and the browser hangs.
Reproduced with a clean profile, having a couple of YouTube video pages open, 1 playing, the rest paused. Putting the browser in the background, doing something else for a couple of minutes, then bringing the browser back to the front, result = seconds-long hang(s).

Profiler : https://cleopatra.io/#report=52b38903b411d4e1f4250c2f1675576cfaddd61b&jankOnly=true&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A27589865,%22end%22%3A27590994%7D,%7B%22type%22%3A%22FocusedCallstackPrefixSampleFilter%22,%22name%22%3A%22(root)%22,%22focusedCallstack%22%3A%5B0%5D,%22appliesToJS%22%3Afalse%7D%5D&selection=0,1,2,3,4,5,6,7,8
I can confirm this, something fishy is going on.
Status: REOPENED → NEW
Flags: needinfo?(bernesb)
(In reply to Peja Stija from comment #19)
> Reproduced with a clean profile, and having a couple of Youtube video pages
> open, 1 playing, the rest paused. Getting the browser in the background and
> do something else for a couple of minutes, then bringing the browser back in
> the front, result = seconds long hang(s).
> 
> Profiler :
> https://cleopatra.io/
> #report=52b38903b411d4e1f4250c2f1675576cfaddd61b&jankOnly=true&filter=%5B%7B%
> 22type%22%3A%22RangeSampleFilter%22,%22start%22%3A27589865,
> %22end%22%3A27590994%7D,
> %7B%22type%22%3A%22FocusedCallstackPrefixSampleFilter%22,
> %22name%22%3A%22(root)%22,%22focusedCallstack%22%3A%5B0%5D,
> %22appliesToJS%22%3Afalse%7D%5D&selection=0,1,2,3,4,5,6,7,8

Is this happening on developer edition or release?

From this profile, it looks like we have large restyling problems and JS JIT problems. Changing components.
(In reply to Mason Chang [:mchang] from comment #21)
> (In reply to Peja Stija from comment #19)
> 
> Is this happening on developer edition or release?
> 
> From this profile, it looks like we have large restyling problems and JS JIT
> problems. Changing components.

This is happening on the latest Nightly (2016-09-13, 32-bit) with a clean profile.
Reproducible 50-60% of the time with the procedure explained in comment 19.
(In reply to Peja Stija from comment #22)
> (In reply to Mason Chang [:mchang] from comment #21)
> > (In reply to Peja Stija from comment #19)
> > 
> > Is this happening on developer edition or release?
> > 
> > From this profile, it looks like we have large restyling problems and JS JIT
> > problems. Changing components.
> 
> This is happening on latest nightly (2016-09-13 32bit) with a clean profile.
> Reproducable 50-60% of the time with the procedure explained in comment 19.

Yes, but is it reproducible on developer edition or release?
(In reply to Mason Chang [:mchang] from comment #23)
...
> Yes, but is it reproducible on developer edition or release?

I don't have developer edition, and I cannot reproduce this behavior on the release version.

Something I somehow overlooked (might be a clue?): I see there are now a bunch of failure logs added to the Graphics section in about:support. The last time I checked this page, some months ago, I remember not seeing them there.

Screenshot: https://postimg.org/image/tgi9nqzxh/
(In reply to Peja Stija from comment #24)
> (In reply to Mason Chang [:mchang] from comment #23)
> ...
> > Yes, but is it reproducible on developer edition or release?
> 
> I don't have developer edition, and I cannot reproduce this behavior on the
> release version.
> 
> Something I somehow overlooked, (might be a clue?) I see there are now a
> bunch of failure logs added to the Graphics part in about:support, last time
> I checked this page some months ago,  I remember not seeing these there at
> that time.
> 
> Screenshot : https://postimg.org/image/tgi9nqzxh/

Those alerts are related to bug 1302713. Can you please download the developer edition and see if it still happens? Thanks!
Summary: Constant hangs and poor scrolling performance in Firefox Nightly → Constant hangs in Firefox Nightly
(In reply to Mason Chang [:mchang] from comment #25)

> Those alerts are related to bug 1302713. Can you please download the
> developer edition and see if it still happens? Thanks!

I installed the developer edition with its own profile, and there everything stays perfectly fluid; no hangs detected after trying to recreate it for a couple of hours.
Have to say, Dev Edition definitely feels a lot more fluid than current nightlies.
Just to add (I stupidly forgot to mention this every time): the 3 or 4 YouTube video tabs used to reproduce the hangs are actually pinned tabs. I don't know if this is of any more value in getting to the core of it?
In my case I don't even need to use pinned tabs or have YouTube open to get Nightly to hang after a long (>30 min) browsing session.
Another profile makes it clear: horrible hangs when putting FF Nightly in the background, doing something else, bringing FF Nightly to the front again, and reloading a page (or opening something in a new tab)

https://cleopatra.io/#report=b8abe7c86711961c06b933e75f0e34337ce25b2e&selection=0,1,2,3,4,5,6,7,8,9,10
Looking at naveed's profile, he hasn't been active (at least here, on Bugzilla) for 4 years. Do you have in mind any other person to ping who could analyze these profiles with the recorded hangs?
Flags: needinfo?(zombulla14)
Flags: needinfo?(naveed)
Flags: needinfo?(mchang)
The profile in comment 30 shows a long cycle collection. I have also been experiencing very long cycle collections in Nightly, starting about one or two weeks ago. Here's a profile from my machine: https://cleopatra.io/#report=44cf509ee14050a9675de21ddabf714f50d42626

The CYCLE_COLLECTOR_MAX_PAUSE telemetry measurement shows a big upwards jump on August 26: https://mzl.la/2cV5tIl
Flags: needinfo?(terrence)
Pushlog for the range 2016-08-25 to 2016-08-26: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=01748a2b1a463f24efd9cd8abad9ccfd76b037b8&tochange=a551f534773cf2d6933f78ce7d82a7a33a99643e

Maybe bug 1296484? It looks like that one was backed out within the range, though, not sure if it ended up making it into the 20160826 nightly build.
Clearing the regressionwindow-wanted keyword as a pushlog was provided in comment 33.
Forwarding ni? to CC owners.
Flags: needinfo?(terrence)
Flags: needinfo?(continuation)
Flags: needinfo?(bugs)
I think we should back out bug 1296484 for now. It is very plausible that in a page closing situation we somehow end up destroying things more aggressively now. I think I'm seeing this locally, with an easier STR. I'll try bisecting.
Blocks: 1296484
Flags: needinfo?(continuation)
Flags: needinfo?(bugs)
Component: JavaScript Engine → XPCOM
Well, I'll bisect to confirm, as it may require also backing out bug 1237058.
Testing locally, it doesn't seem like bug 1296484 is the cause.

Here are my steps to reproduce:

1. Set javascript.options.mem.log to true in about:config. Open the browser console. Type "cc(" into the "filter output" box so you only see CC activity.
2. Open https://news.ycombinator.com/ then wait a few seconds (I just use this as an example of a very simple site).
3. Open http://www.wowhead.com/ in two tabs, leaving the browser open to one of them, then wait ten seconds. (This page is doing something awful like continuously reloading an iframe so it generates a ton of potential garbage.)
4. Right-click on the news.ycombinator tab and close all tabs to the right.
5. Look for a large CC pause a few seconds later.

In Nightly, using either of RyanVM's builds from comment 37, the CC pause looks like this:

CC(T+21.8)[content] max pause: 216ms, total time: 530ms, slices: 49, suspected: 251, visited: 13126 RCed and 276827 GCed, collected: 13063 RCed and 276768 GCed (276768|0|0 waiting for GC)
ForgetSkippable 8 times before CC, min: 0 ms, max: 1 ms, avg: 0 ms, total: 6 ms, max sync: 0 ms, removed: 221

In Aurora it looks like this:

CC(T+38.1)[content] max pause: 35ms, total time: 306ms, slices: 51, suspected: 1096, visited: 15689 RCed and 282392 GCed, collected: 15608 RCed and 282332 GCed (282332|0|0 waiting for GC)
ForgetSkippable 9 times before CC, min: 0 ms, max: 3 ms, avg: 1 ms, total: 11 ms, max sync: 0 ms, removed: 6223

The max pause is much larger, even though we're looking at about the same amount of stuff in both cases. My new theory is that we messed up the CC budget somehow, so we're not doing very much work, so our final "catch up" slice of work has to do more. I'll try a local build to investigate that. Maybe bug 1297367 did something.
Flags: needinfo?(mchang)
Summary: Constant hangs in Firefox Nightly → Large regression in page close cycle collection pauses
Local testing indicates this regression is from bug 1263355. I'll do some more local testing to figure out where exactly the time is going.
Blocks: 1263355
No longer blocks: 1296484
The profile in comment 32 is spending a ton of time tracing the children of JS objects. The frontend rewrite patch added a new GC tracekind for scopes. This is not getting added to the CC graph. It sounds like a bunch of objects can hold onto a scope, so we'll end up tracing the same scope over and over again. I think the fix is to then start adding Scopes to the CC graph. This needs to be done without exposing the actual definition of a scope to Gecko. Presumably a callback will be involved, like what is done for shapes.
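For illustration, the shape of that fix is roughly the following. This is a minimal sketch only: it assumes the helper is a simple switch over JS::TraceKind, and the exact set of kinds and the file it lives in may differ from the actual patch; only the AddToCCKind() name and the new Scope kind come from this bug.

#include "js/TraceKind.h"

// Sketch: GC things whose TraceKind is listed here get their own node in the
// CC graph; anything else is traced "through" again every time one of its
// owners is traversed, which is what makes the new Scope kind so expensive.
static bool AddToCCKind(JS::TraceKind aKind)
{
  switch (aKind) {
    case JS::TraceKind::Object:
    case JS::TraceKind::Script:
    case JS::TraceKind::Scope:   // the new kind from the frontend rewrite
      return true;
    default:
      return false;
  }
}

The design point is simply that once a kind is in this list, each Scope becomes a single graph node that is traversed once, instead of being re-traced from every object that holds it.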
Assignee: nobody → continuation
(In reply to Andrew McCreight [:mccr8] from comment #44)
> The profile in comment 32 is spending a ton of time tracing the children of
> JS objects. The frontend rewrite patch added a new GC tracekind for scopes.
> This is not getting added to the CC graph. It sounds like a bunch of objects
> can hold onto a scope, so we'll end up tracing the same scope over and over
> again. I think the fix is to then start adding Scopes to the CC graph. This
> needs to be done without exposing the actual definition of a scope to Gecko.
> Presumably a callback will be involved, like what is done for shapes.

For some extra clarity, the things that now have their own GC TraceKinds in the rewrite *used* to be JSObjects, so they "just worked" when traced by the CC.
I should add DEBUG counting to the graph building phase that compares the number of CCKind to non-CCKind children visited, and asserts if the ratio is too low. That should prevent quadratic blowups in the future and hopefully won't be too expensive.
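Sketched out, the bookkeeping could look something like this. The wrapper struct is hypothetical; the counter names mirror the ones that show up in the review comments further down, and the real code would live in the CC graph builder rather than a standalone type.

#include <cstdint>

// Sketch of the debug-only bookkeeping: bump one counter for every GC thing
// that gets its own CC graph node, and one for every JS::TraceChildren call
// made while walking non-CCKind children. At the end of a collection the two
// are compared; a high trace/traverse ratio suggests a TraceKind is missing
// from AddToCCKind and is being re-traced over and over.
struct CCGraphBuildStats
{
  uint64_t mNumTraversedGCThings = 0;  // things added to the CC graph
  uint64_t mNumTraceChildren = 0;      // JS::TraceChildren calls made

  void NoteTraversedGCThing() { ++mNumTraversedGCThings; }
  void NoteTraceChildrenCall() { ++mNumTraceChildren; }
};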
Tracking 51/52 for this regression. Thanks to both Virtual_ManPL and Peja for doing all the testing and attaching their logs.
See Also: 1295272
I have work-in-progress patches that, basically, add Scope to AddToCCKind().

In a debug build, on the STR I mentioned before, I was seeing about 450,000 objects traversed, with about 770,000 calls to JS::TraceChildren. The CC time stats were like this:
CC(T+52.5)[content] max pause: 1624ms, total time: 1991ms, slices: 48, suspected: 4457, visited: 21673 RCed and 450139 GCed, collected: 16231 RCed and 349956 GCed (349956|0|0 waiting for GC)

With my patches, about 500,000 objects are being traversed (more!) but there are only about 580,000 calls to JS::TraceChildren. The CC max pause is about 40% of the previous one:
CC(T+52.6)[content] max pause: 648ms, total time: 944ms, slices: 49, suspected: 433, visited: 19126 RCed and 496476 GCed, collected: 18970 RCed and 493321 GCed (493321|0|0 waiting for GC)

I'll try to finish up the patches and put them up for review soon.
I also filed bug 1304205 about making slice budgets larger for longer running slice times, to reduce max pauses for long CCs.
In Linux64 runs, I saw trace-to-traverse ratios of up to 1.58 in a patched version. In an unpatched version, I see many places with a ratio of 2.3. I'll try adding an assert for a ratio of 2.
I still need to wait for various try runs to finish, but I'm reasonably confident this is okay now, so I'm putting my patches up for review.

Parts 1 and 2 fix the regression, while parts 3 through 5 add infrastructure to catch regressions like this in the future. Parts 1, 3, and 4 are refactoring patches that should not impact behavior. Part 5 adds debug-only stats that are used in an assertion. I'm inclined to also backport parts 3-5 because we're early in Aurora, and otherwise the refactoring in them will make it harder to backport future patches. They should be low risk for opt builds.
try run showing how orange it is with a ratio limit of 2.0, and without part 2:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d6cdadc86ac7

opt looks okay:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=adda84b9be2b

all platform debug try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1c6e526473c2
Android build failure is due to another patch in my queue. Hopefully Android will be okay...
I see an assertion failure with 2.0, so I'll have to bump it up a bit. The ratio was 2.017049. Maybe 2.2 will be high enough to provide some safety. The bad runs all have a ratio of 2.3 or higher.
Attachment #8793433 - Attachment is obsolete: true
Attachment #8793433 - Flags: review?(bugs)
Attachment #8793434 - Attachment is obsolete: true
Attachment #8793434 - Flags: review?(bugs)
Attachment #8793435 - Attachment is obsolete: true
Attachment #8793435 - Flags: review?(bugs)
Attachment #8793436 - Attachment is obsolete: true
Attachment #8793436 - Flags: review?(bugs)
Well, that did not do what I wanted.
Comment on attachment 8793923 [details]
Bug 1301301, part 1 - Unify NoteJSObject and NoteJSScript into NoteJSChild.

https://reviewboard.mozilla.org/r/80512/#review79262

I assume later patches will explain why these changes are needed, but I guess that is quite obvious.
Attachment #8793923 - Flags: review?(bugs) → review+
Comment on attachment 8793924 [details]
Bug 1301301, part 2 - Add Scope as an AddToCCKind.

https://reviewboard.mozilla.org/r/80514/#review79264

(not about this bug but I'm trying to remember what ObjectGroup type is and why it isn't listed here.)
Attachment #8793924 - Flags: review?(bugs) → review+
Comment on attachment 8793925 [details]
Bug 1301301, part 3 - Implement Begin and EndCycleCollectionCallback in CCJSContext.

https://reviewboard.mozilla.org/r/80516/#review79266
Attachment #8793925 - Flags: review?(bugs) → review+
Comment on attachment 8793926 [details]
Bug 1301301, part 4 - Make calls to JS::TraceChildren go through a single method on CCJSContext.

https://reviewboard.mozilla.org/r/80518/#review79272

::: xpcom/base/CycleCollectedJSContext.cpp:133
(Diff revision 1)
>  } // namespace mozilla
>  
> -struct NoteWeakMapChildrenTracer : public JS::CallbackTracer
> +struct CCJSTracer : public JS::CallbackTracer
>  {
> -  NoteWeakMapChildrenTracer(JSContext* aCx,
> +  CCJSTracer(CycleCollectedJSContext* aCx,
> +             WeakMapTraceKind aWeakTraceKind = TraceWeakMapValues)

I don't yet understand this weakmap stuff here.
Comment on attachment 8793437 [details]
Bug 1301301, part 5 - In debug builds, check if there are too many calls to JS::TraceChildren.

https://reviewboard.mozilla.org/r/80174/#review79274

::: xpcom/base/CycleCollectedJSContext.h:482
(Diff revision 2)
>      void invoke(JS::HandleObject scope, Closure& closure) override;
>    };
>    EnvironmentPreparer mEnvironmentPreparer;
> +
> +#ifdef DEBUG
> +  uint32_t mNumTraversed;

mNumTraversed is only about JS things, so its name should say so. I first thought this was about all things traversed.
mNumOfTraversedJSThings?
Comment on attachment 8793926 [details]
Bug 1301301, part 4 - Make calls to JS::TraceChildren go through a single method on CCJSContext.

https://reviewboard.mozilla.org/r/80518/#review79280

It might make sense to make CCJSTracer's ctor protected, so that only those subclasses could use it.

::: xpcom/base/CycleCollectedJSContext.cpp:150
(Diff revision 1)
> +
> +struct NoteWeakMapChildrenTracer : public CCJSTracer
> +{
> +  NoteWeakMapChildrenTracer(CycleCollectedJSContext* aCx,
>                              nsCycleCollectionNoteRootCallback& aCb)
> -    : JS::CallbackTracer(aCx), mCb(aCb), mTracedAny(false), mMap(nullptr),
> +    : CCJSTracer(aCx), mCb(aCb), mTracedAny(false), mMap(nullptr),

Note to myself...This relies on TraceWeakMapValues.
And looks like WeakMapTraceKind weakTraceKind = TraceWeakMapValues is also in JS::CallbackTracer, so behavior shouldn't change.

::: xpcom/base/CycleCollectedJSContext.cpp:347
(Diff revision 1)
> -{
> -  TraversalTracer(JSContext* aCx, nsCycleCollectionTraversalCallback& aCb)
> -    : JS::CallbackTracer(aCx, DoNotTraceWeakMaps), mCb(aCb)
> -  {
> +{
> -  }
> +  TraversalTracer(CycleCollectedJSContext* aCx,
> +                  nsCycleCollectionTraversalCallback& aCb)

note to myself...ok, here we keep using DoNotTraceWeakMaps
Attachment #8793926 - Flags: review?(bugs) → review+
(In reply to Olli Pettay [:smaug] (TPAC) from comment #67)
> (not about this bug but I'm trying to remember what ObjectGroup type is and
> why it isn't listed here.)

I'm not entirely sure. I think it is some kind of shape-like thing. It would probably be useful to further profile page close CCs with a bunch of JS.

Yeah, the weak map stuff is a little messy. I don't remember off the top of my head exactly how it all works.

> mNumTraversed is only about JS things, so its name should say so.

That's a good point. I'll rename it to mNumTraversedGCThings, because it is incremented in TraverseGCThing.

> It might make sense to make CCJSTracer's ctor protected, so that only those subclasses could use it.

Good point. I have done that.
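In sketch form, that constructor change amounts to something like the following. The names follow the diff above; the Context() accessor used to reach the underlying JSContext is an assumption here, not confirmed by the patch.

// Sketch: with a protected constructor, CCJSTracer can only be instantiated
// through its subclasses (NoteWeakMapChildrenTracer, TraversalTracer, ...).
// In context this lives in xpcom/base/CycleCollectedJSContext.cpp.
struct CCJSTracer : public JS::CallbackTracer
{
protected:
  explicit CCJSTracer(CycleCollectedJSContext* aCx,
                      WeakMapTraceKind aWeakTraceKind = TraceWeakMapValues)
    : JS::CallbackTracer(aCx->Context(), aWeakTraceKind)  // Context() assumed
  {
  }
};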
Keywords: leave-open
Pushed by amccreight@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/af8d3e696e93
part 1 - Unify NoteJSObject and NoteJSScript into NoteJSChild. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/8dd081bff020
part 2 - Add Scope as an AddToCCKind. r=smaug
Comment on attachment 8793437 [details]
Bug 1301301, part 5 - In debug builds, check if there are too many calls to JS::TraceChildren.

https://reviewboard.mozilla.org/r/80174/#review79576

::: xpcom/base/CycleCollectedJSContext.cpp:1384
(Diff revision 3)
> +  // can spend a lot of time in JS::TraceChildren(). If you add a new
> +  // TraceKind and are hitting this assertion, adding it to
> +  // AddToCCKind might help. (The check is not done for small CC
> +  // graphs to avoid noise.)
> +  if (mNumTraceChildren > 1000) {
> +    double traceRatio = ((double)mNumTraceChildren) / ((double)mNumTraversedGCThings);

You want to ensure mNumTraversedGCThings != 0
Attachment #8793437 - Flags: review?(bugs) → review+
(In reply to Olli Pettay [:smaug] (TPAC) from comment #76)
> You want to ensure mNumTraversedGCThings != 0

Oh, good catch. I even saw that in a local run, but I forgot to fix it.
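With that folded in, the debug check from the diff above would read roughly like this (a sketch only; the 2.2 limit is the value floated in an earlier comment, and the landed patch may differ):

// Only evaluate the ratio for reasonably large CC graphs, and skip it
// entirely when nothing was traversed so we never divide by zero.
if (mNumTraceChildren > 1000 && mNumTraversedGCThings != 0) {
  double traceRatio =
    double(mNumTraceChildren) / double(mNumTraversedGCThings);
  // If you add a new TraceKind and hit this, consider adding it to AddToCCKind.
  MOZ_ASSERT(traceRatio < 2.2,
             "Too many calls to JS::TraceChildren relative to the number "
             "of GC things traversed");
}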
Keywords: leave-open
Pushed by amccreight@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/45fb9c1ce63a
part 3 - Implement Begin and EndCycleCollectionCallback in CCJSContext. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/6186eae0c2b7
part 4 - Make calls to JS::TraceChildren go through a single method on CCJSContext. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/da70c5abd73a
part 5 - In debug builds, check if there are too many calls to JS::TraceChildren. r=smaug
Hmm, well, this is likely not the only backout those patches will have, so I'll spin off the assertions into a separate bug to facilitate backporting as fast as possible.
Blocks: 1306086
Attachment #8793924 - Attachment is obsolete: true
Attachment #8793437 - Attachment is obsolete: true
Attachment #8793924 - Attachment is obsolete: false
Attachment #8793925 - Attachment is obsolete: true
Attachment #8793926 - Attachment is obsolete: true
This only hit m-c on the 9-27 build, so I'm going to wait a few more days to confirm that this fixes the regression on telemetry, then request uplift to Aurora.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago8 years ago
Flags: needinfo?(continuation)
Resolution: --- → FIXED
Flags: needinfo?(continuation)
Flags: needinfo?(continuation)
Target Milestone: --- → mozilla52
I want to add that I haven't encountered these annoying hangs for more than 2 days now,
so I'm marking this bug as VERIFIED.

Thank you very much! \o/
Status: RESOLVED → VERIFIED
Looking at telemetry, I think this did fix the regression in mean and 95th percentile cycle collector max pauses.

https://mzl.la/2cGVplp
Flags: needinfo?(continuation)
Comment on attachment 8793923 [details]
Bug 1301301, part 1 - Unify NoteJSObject and NoteJSScript into NoteJSChild.

Approval Request Comment
[Feature/regressing bug #]: bug 1263355
[User impact if declined]: very long hangs
[Describe test coverage new/current, TreeHerder]: This code runs all the time. I have patches in progress to check this particular regression, but we only regress this every few years so I think it isn't too critical that this gets landed immediately.
[Risks and why]: Fairly low. Part 1 has a lot of changes, but is mostly a code refactoring, so it is low risk. Part 2 is small, and potentially risky, but it should really just change our behavior back to before bug 1263355 landed.
[String/UUID change made/needed]:
Attachment #8793923 - Flags: approval-mozilla-aurora?
Attachment #8793924 - Flags: approval-mozilla-aurora?
Hi :mccr8,
Are patch 1 & 2 the only patches we want to uplift to 51 aurora?
Flags: needinfo?(continuation)
(In reply to Gerry Chang [:gchang] from comment #86)
> Hi :mccr8,
> Are patch 1 & 2 the only patches we want to uplift to 51 aurora?

Yes. The rest have not landed yet, and are only needed for some regression testing, which I think is not super important for this code. (I spun them off into a separate bug to simplify tracking.)
Flags: needinfo?(continuation)
Comment on attachment 8793923 [details]
Bug 1301301, part 1 - Unify NoteJSObject and NoteJSScript into NoteJSChild.

Fix a regression. Take it in 51 aurora.
Attachment #8793923 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8793924 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+