Closed
Bug 1301301
Opened 8 years ago
Closed 8 years ago
Large regression in page close cycle collection pauses
Categories
(Core :: XPCOM, defect)
Tracking
()
VERIFIED
FIXED
mozilla52
Tracking | Status | |
---|---|---|
firefox48 | --- | unaffected |
firefox49 | --- | unaffected |
firefox-esr45 | --- | unaffected |
firefox50 | --- | unaffected |
firefox51 | + | verified |
firefox52 | + | verified |
People
(Reporter: Virtual, Assigned: mccr8)
References
Details
(5 keywords, Whiteboard: gfx-noted)
Attachments
(4 files, 8 obsolete files)
1.36 KB,
text/plain
|
Details | |
3.28 KB,
text/plain
|
Details | |
58 bytes,
text/x-review-board-request
|
smaug
:
review+
gchang
:
approval-mozilla-aurora+
|
Details |
58 bytes,
text/x-review-board-request
|
smaug
:
review+
gchang
:
approval-mozilla-aurora+
|
Details |
After some time of browsing Firefox Nightly will hangs for some time making itself unresponsive for this time.
Peja Stija from MozillaZine got reliable STR (that it happens more frequently with "Self-Destructing Cookies" extension installed) and got in on Gecko Profiler here - https://cleopatra.io/#report=7ade24c52a5aa5dd1e15e3e668bdff5e599d62b5&jankOnly=true&selection=0,1,2,3,4,5,6,7,8,9,265,266,267,411,412,413,414,432,23,24,25,3127,26,36,23,24,427,428,23,24,49,50,24,25,3128,26,36,23,24,427,428,23,24,49,50,24,25,3129,26,36,23,24,427,428,23,24,49,50,24,25,1130,26,36,1131,50,24,25,1132,26,36,23,24,427,428,23,24,49,50,24,25,3133,26,36,23,24,427,428,23,24,49,50,24,25,1130,26,36,1131,50,24,25,1132,26,36,23,24,427,428,23,24,49,50,24
In my case I get only microshutter scrolling on Gecko Profiler, because when I want to catch a real hang the browser hangs completely. Profile is in attachments, care, after extracting it's nearly 80MB file.
Reporter | ||
Comment 1•8 years ago
|
||
e10s is disabled
Reporter | ||
Comment 2•8 years ago
|
||
I'm pretty sure that Firefox 48 and 49 versions are unaffected, because it happens for not more than about a month.
Reporter | ||
Comment 3•8 years ago
|
||
[Tracking Requested - why for this release]: Regression
tracking-firefox51:
--- → ?
Comment 4•8 years ago
|
||
The profiler says that you're printing the display list out for some reason. Is your preference "layout.display-list.dump" or "layout.display-list.dump-content" enabled? Can you try this with a clean profile please?
Flags: needinfo?(bernesb)
Updated•8 years ago
|
Whiteboard: gfx-noted
Reporter | ||
Comment 5•8 years ago
|
||
Yes. I had these 2 preferences enabled, as I was thinking that they will help in gathering more useful data. I try to catch it one more time, but it's hard as Gecko Profiler hangs the whole browser.
In meantime please look on Peja Stija profile, which shows these hangs, as mine only shows microshutter scrolling.
Comment 6•8 years ago
|
||
(In reply to Mason Chang [:mchang] from comment #4)
> The profiler says that you're printing the display list out for some reason.
> Is your preference "layout.display-list.dump" or
> "layout.display-list.dump-content" enabled? Can you try this with a clean
> profile please?
In my case, I have neither of those settings enabled.
Comment 7•8 years ago
|
||
From Peja's profile, it looks like the JS engine is doing lots of work and the GC is spending lots of time doing things. Nothing from graphics stands out here. Can someone from the JS team look at this please?
Component: Graphics → JavaScript Engine
Flags: needinfo?(naveed)
Comment 8•8 years ago
|
||
Please disregard my problem (and profile), i didn't encounter it since using a blank profile, so it must be a setting/add-on related issue and most possible not related to Virtual_ManPL problem.
Comment 9•8 years ago
|
||
Please reopen if this comes up again. Thanks
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(naveed)
Resolution: --- → WORKSFORME
Updated•8 years ago
|
Flags: needinfo?(bernesb)
Comment 10•8 years ago
|
||
Just would like to add that the OP's problem probably is a complete different issue, and my be worth to focus on the attachment posted, since WORKSFORME probably doesn't fit for Virtual_ManPL initial problem.
Comment 11•8 years ago
|
||
The attachment posted, at least from the profile, is because he had a very expensive debug preference on.
Reporter | ||
Comment 12•8 years ago
|
||
I would like for someone to diagnose at least Peja Stija profile, because like I said in my case Gecko Profiler hangs the whole browser.
Moreover it's clearly the regression, because going back to stable version not produce any hangs with same profile and extensions.
Status: RESOLVED → REOPENED
Flags: needinfo?(naveed)
Resolution: WORKSFORME → ---
Reporter | ||
Updated•8 years ago
|
Attachment #8789225 -
Attachment is obsolete: true
Comment 13•8 years ago
|
||
(In reply to Virtual_ManPL [:Virtual] - (ni? me) from comment #12)
> I would like for someone to diagnose at least Peja Stija profile, because
> like I said in my case Gecko Profiler hangs the whole browser.
> Moreover it's clearly the regression, because going back to stable version
> not produce any hangs with same profile and extensions.
Can you bisect to find a bug which is causing this please?
Flags: needinfo?(bernesb)
Comment 14•8 years ago
|
||
I'm not sure what to do about Peja's profile since a clean profile fixes it for him, which means a bad add on is probably doing something.
Virtual, is this still happening for you with a clean profile?
Reporter | ||
Comment 15•8 years ago
|
||
(In reply to Mason Chang [:mchang] from comment #14)
> I'm not sure what to do about Peja's profile since a clean profile fixes it
> for him, which means a bad add on is probably doing something.
It's hard for me to believe that my profile is broken or some extensions are causing this, as when I go back to stable Firefox version I didn't encounter any more hangs with same configuration.
Maybe diagnosing Peja Stija profile will give some hints what's broken.
(In reply to Mason Chang [:mchang] from comment #13)
> Can you bisect to find a bug which is causing this please?
(In reply to Mason Chang [:mchang] from comment #14)
> Virtual, is this still happening for you with a clean profile?
I will try to diagnose and find it, but not faster than 1-2 weeks,
as this issue isn't reproducible straight a way (at least for me) and I don't have that much time right now for this time consuming thing.
@ Peja Stija - Maybe you could help with finding regression range, as you could easily reproduce it (comparing to me)? Even getting daily instead of inbound builds will be a big help.
Flags: needinfo?(zombulla14)
Comment 16•8 years ago
|
||
(In reply to Virtual_ManPL [:Virtual] - (ni? me) from comment #15)
> @ Peja Stija - Maybe you could help with finding regression range, as you
> could easily reproduce it (comparing to me)? Even getting daily instead of
> inbound builds will be a big help.
I'll do it, can't promise direct results, cause the hanging I experienced goes back weeks (if not even months) and I was of the impression that that was just part of using a nightly build.
I'll try to recreate and chime in when certain.
Comment 17•8 years ago
|
||
Well, using the non-blank profile, I even get the hangs ever so many minutes with no single add-on enabled, so this must be either due to a corruption, a wrong setting or a bug. Personally I'm out of options.
Comment 18•8 years ago
|
||
I cannot pinpoint this to some add-on, it seems to be exclusive, no matter what add-ons installed or not.
I can reproduce it 60% of the time using these steps :
- Have 5 pinned tabs (of which 2 are youtube video pages)
- Reload all 5 pretty fast one after another
- A couple of seconds to have a minute after all 5 have finished (re-)loading, the hang circle appears, sometimes up to 10 seconds long and the browser hangs.
Comment 19•8 years ago
|
||
Reproduced with a clean profile, and having a couple of Youtube video pages open, 1 playing, the rest paused. Getting the browser in the background and do something else for a couple of minutes, then bringing the browser back in the front, result = seconds long hang(s).
Profiler : https://cleopatra.io/#report=52b38903b411d4e1f4250c2f1675576cfaddd61b&jankOnly=true&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A27589865,%22end%22%3A27590994%7D,%7B%22type%22%3A%22FocusedCallstackPrefixSampleFilter%22,%22name%22%3A%22(root)%22,%22focusedCallstack%22%3A%5B0%5D,%22appliesToJS%22%3Afalse%7D%5D&selection=0,1,2,3,4,5,6,7,8
Reporter | ||
Comment 20•8 years ago
|
||
I can confirm this, something fishy is going on.
Status: REOPENED → NEW
Flags: needinfo?(bernesb)
Comment 21•8 years ago
|
||
(In reply to Peja Stija from comment #19)
> Reproduced with a clean profile, and having a couple of Youtube video pages
> open, 1 playing, the rest paused. Getting the browser in the background and
> do something else for a couple of minutes, then bringing the browser back in
> the front, result = seconds long hang(s).
>
> Profiler :
> https://cleopatra.io/
> #report=52b38903b411d4e1f4250c2f1675576cfaddd61b&jankOnly=true&filter=%5B%7B%
> 22type%22%3A%22RangeSampleFilter%22,%22start%22%3A27589865,
> %22end%22%3A27590994%7D,
> %7B%22type%22%3A%22FocusedCallstackPrefixSampleFilter%22,
> %22name%22%3A%22(root)%22,%22focusedCallstack%22%3A%5B0%5D,
> %22appliesToJS%22%3Afalse%7D%5D&selection=0,1,2,3,4,5,6,7,8
Is this happening on developer edition or release?
From this profile, it looks like we have large restyling problems and JS JIT problems. Changing components.
Comment 22•8 years ago
|
||
(In reply to Mason Chang [:mchang] from comment #21)
> (In reply to Peja Stija from comment #19)
>
> Is this happening on developer edition or release?
>
> From this profile, it looks like we have large restyling problems and JS JIT
> problems. Changing components.
This is happening on latest nightly (2016-09-13 32bit) with a clean profile.
Reproducable 50-60% of the time with the procedure explained in comment 19.
Comment 23•8 years ago
|
||
(In reply to Peja Stija from comment #22)
> (In reply to Mason Chang [:mchang] from comment #21)
> > (In reply to Peja Stija from comment #19)
> >
> > Is this happening on developer edition or release?
> >
> > From this profile, it looks like we have large restyling problems and JS JIT
> > problems. Changing components.
>
> This is happening on latest nightly (2016-09-13 32bit) with a clean profile.
> Reproducable 50-60% of the time with the procedure explained in comment 19.
Yes, but is it reproducible on developer edition or release?
Comment 24•8 years ago
|
||
(In reply to Mason Chang [:mchang] from comment #23)
...
> Yes, but is it reproducible on developer edition or release?
I don't have developer edition, and I cannot reproduce this behavior on the release version.
Something I somehow overlooked, (might be a clue?) I see there are now a bunch of failure logs added to the Graphics part in about:support, last time I checked this page some months ago, I remember not seeing these there at that time.
Screenshot : https://postimg.org/image/tgi9nqzxh/
Comment 25•8 years ago
|
||
(In reply to Peja Stija from comment #24)
> (In reply to Mason Chang [:mchang] from comment #23)
> ...
> > Yes, but is it reproducible on developer edition or release?
>
> I don't have developer edition, and I cannot reproduce this behavior on the
> release version.
>
> Something I somehow overlooked, (might be a clue?) I see there are now a
> bunch of failure logs added to the Graphics part in about:support, last time
> I checked this page some months ago, I remember not seeing these there at
> that time.
>
> Screenshot : https://postimg.org/image/tgi9nqzxh/
Those alerts are related to bug 1302713. Can you please download the developer edition and see if it still happens? Thanks!
Reporter | ||
Updated•8 years ago
|
Summary: Constant hangs and poor scrolling performance in Firefox Nightly → Constant hangs in Firefox Nightly
Comment 26•8 years ago
|
||
(In reply to Mason Chang [:mchang] from comment #25)
> Those alerts are related to bug 1302713. Can you please download the
> developer edition and see if it still happens? Thanks!
I installed the developer edition with it's own profile, and there everything stays perfectly fluid, no hangs detected after trying to recreate it for a couple of hours.
Have to say, Dev Edition definitely feels a lot more fluent than current nightlies.
Comment 27•8 years ago
|
||
Just to add, I stupidly forgot to add this every time, the 3 or 4 Youtube video tabs to reproduce the hangs are actually pinned tabs, don't know if this is of any more value to get to the core ?
Reporter | ||
Comment 28•8 years ago
|
||
In my case I don't even need to use pinned tabs or have opened YouTube to get Nightly hang after long (>30min) browsing session.
Reporter | ||
Comment 29•8 years ago
|
||
Firefox 50 set to unaffected, per Comment #26
Reporter | ||
Updated•8 years ago
|
status-firefox-esr45:
--- → unaffected
Comment 30•8 years ago
|
||
Another profile make it clear, horrible hangs, when getting FF nightly in background, doing something else, bring FF nightly in the front again, and reload a page (or open something in a new tab)
https://cleopatra.io/#report=b8abe7c86711961c06b933e75f0e34337ce25b2e&selection=0,1,2,3,4,5,6,7,8,9,10
Reporter | ||
Comment 31•8 years ago
|
||
Looking on naveed profile he isn't active (at least here, on Bugzilla) for 4 years. Do you have in mind any other person to ping for, so he or she could analyzed these profiles with recorded hangs?
Flags: needinfo?(zombulla14)
Flags: needinfo?(naveed)
Flags: needinfo?(mchang)
Comment 32•8 years ago
|
||
The profile in comment 30 shows a long cycle collection. I have also been experiencing very long cycle collection in Nightly, since about one or two weeks ago. Here's a profile from my machine: https://cleopatra.io/#report=44cf509ee14050a9675de21ddabf714f50d42626
The CYCLE_COLLECTOR_MAX_PAUSE telemetry measurement shows a big upwards jump on August 26: https://mzl.la/2cV5tIl
Flags: needinfo?(terrence)
Comment 33•8 years ago
|
||
Pushlog for the range 2016-08-25 to 2016-08-26: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=01748a2b1a463f24efd9cd8abad9ccfd76b037b8&tochange=a551f534773cf2d6933f78ce7d82a7a33a99643e
Maybe bug 1296484? It looks like that one was backed out within the range, though, not sure if it ended up making it into the 20160826 nightly build.
Comment hidden (offtopic) |
Comment 35•8 years ago
|
||
Comment 36•8 years ago
|
||
Clearing the regressionwindow-wanted keyword as a pushlog was provided in comment 33.
Keywords: regressionwindow-wanted
Comment 37•8 years ago
|
||
This should block 51 IMO. Bug 1296484 looks plausible.
Try pushes to hopefully confirm whether this is bug 1296484 or not.
rev a8cc2852ae3f (parent rev): https://treeherder.mozilla.org/#/jobs?repo=try&revision=5d24b3f64393
rev c68fdbfcec39 (bug 1296484): https://treeherder.mozilla.org/#/jobs?repo=try&revision=21189aa1d112
Though interestingly, smaug pointed out that 25th percentile graph shows a big regression around 8-Sep, which better fits when this bug was filed. So maybe there's more to this story?
https://telemetry.mozilla.org/new-pipeline/evo.html#!aggregates=median!5th-percentile!25th-percentile&cumulative=0&end_date=2016-09-08&keys=!__none__!__none__&max_channel_version=nightly%252F51&measure=CYCLE_COLLECTOR_MAX_PAUSE&min_channel_version=nightly%252F48&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2016-09-08&trim=1&use_submission_date=0
Comment 38•8 years ago
|
||
Forwarding ni? to CC owners.
Flags: needinfo?(terrence)
Flags: needinfo?(continuation)
Flags: needinfo?(bugs)
Assignee | ||
Comment 39•8 years ago
|
||
I think we should back out bug 1296484 for now. It is very plausible that in a page closing situation we somehow end up destroying things more aggressively now. I think I'm seeing this locally, with an easier STR. I'll try bisecting.
Assignee | ||
Updated•8 years ago
|
Component: JavaScript Engine → XPCOM
Assignee | ||
Comment 40•8 years ago
|
||
Well, I'll bisect to confirm, as it may require also backing out bug 1237058.
Assignee | ||
Comment 41•8 years ago
|
||
str |
Testing locally, it doesn't seem like bug 1296484 is the cause.
Here are my steps to reproduce:
1. Set javascript.options.mem.log to true in about config. Open the browser console. Type "cc(" into the "filter output" box so you only see CC activity.
2. Open https://news.ycombinator.com/ then wait a few seconds (I just use this as an example of a very simple site).
3. Open http://www.wowhead.com/ in two tabs, leaving the browser open to one of them, then wait ten seconds. (This page is doing something awful like continuously reloading an iframe so it generates a ton of potential garbage.)
4. Right click on the news.ycombinator tab, and do close all tabs to the right.
5. Look for a large CC pause a few seconds later.
In Nightly, using either of RyanVM's builds from comment 37, the CC pause looks like this:
CC(T+21.8)[content] max pause: 216ms, total time: 530ms, slices: 49, suspected: 251, visited: 13126 RCed and 276827 GCed, collected: 13063 RCed and 276768 GCed (276768|0|0 waiting for GC)
ForgetSkippable 8 times before CC, min: 0 ms, max: 1 ms, avg: 0 ms, total: 6 ms, max sync: 0 ms, removed: 221
In Aurora it looks like this:
CC(T+38.1)[content] max pause: 35ms, total time: 306ms, slices: 51, suspected: 1096, visited: 15689 RCed and 282392 GCed, collected: 15608 RCed and 282332 GCed (282332|0|0 waiting for GC)
ForgetSkippable 9 times before CC, min: 0 ms, max: 3 ms, avg: 1 ms, total: 11 ms, max sync: 0 ms, removed: 6223
The max pause is much larger, even though we're looking at about the same amount of stuff in both cases. My new theory is that we messed up the CC budget somehow, so we're not doing very much work, so our final "catch up" slice of work has to do more. I'll try a local build to investigate that. Maybe bug 1297367 did something.
Flags: needinfo?(mchang)
Assignee | ||
Updated•8 years ago
|
Summary: Constant hangs in Firefox Nightly → Large regression in page close cycle collection pauses
Assignee | ||
Comment 42•8 years ago
|
||
Local testing indicates this regression is from bug 1263355. I'll do some local testing to figure out where exactly the time is going.
Assignee | ||
Comment 43•8 years ago
|
||
Reporter | ||
Updated•8 years ago
|
Has Regression Range: --- → yes
Has STR: --- → yes
Assignee | ||
Comment 44•8 years ago
|
||
The profile in comment 32 is spending a ton of time tracing the children of JS objects. The frontend rewrite patch added a new GC tracekind for scopes. This is not getting added to the CC graph. It sounds like a bunch of objects can hold onto a scope, so we'll end up tracing the same scope over and over again. I think the fix is to then start adding Scopes to the CC graph. This needs to be done without exposing the actual definition of a scope to Gecko. Presumably a callback will be involved, like what is done for shapes.
Assignee: nobody → continuation
status-firefox52:
--- → affected
Comment 45•8 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #44)
> The profile in comment 32 is spending a ton of time tracing the children of
> JS objects. The frontend rewrite patch added a new GC tracekind for scopes.
> This is not getting added to the CC graph. It sounds like a bunch of objects
> can hold onto a scope, so we'll end up tracing the same scope over and over
> again. I think the fix is to then start adding Scopes to the CC graph. This
> needs to be done without exposing the actual definition of a scope to Gecko.
> Presumably a callback will be involved, like what is done for shapes.
For some extra clarity, the things that now have their own GC TraceKinds in the rewrite *used* to be JSObjects, so they "just worked" when traced by the CC.
Assignee | ||
Comment 46•8 years ago
|
||
I should add DEBUG counting to the graph building phase that compares the number of CCKind to non-CCKind children visited, and asserts if the ratio is too low. That should prevent quadratic blowups in the future and hopefully won't be too expensive.
Reporter | ||
Comment 47•8 years ago
|
||
[Tracking Requested - why for this release]: Regression
tracking-firefox52:
--- → ?
Reporter | ||
Updated•8 years ago
|
Status: NEW → ASSIGNED
Comment 48•8 years ago
|
||
Tracking 51/52 for this regression. Thanks to both Virtual_ManPL and Peja for doing all the testing and attaching their logs.
Assignee | ||
Comment 49•8 years ago
|
||
I have work in progress patches that, basically, adds Scope to AddToCCKind().
In a debug build, on the STR I mentioned before, I was seeing about 450,000 objects traversed, with about 770,000 calls to JS::TraceChildren. The CC time stats were like this:
CC(T+52.5)[content] max pause: 1624ms, total time: 1991ms, slices: 48, suspected: 4457, visited: 21673 RCed and 450139 GCed, collected: 16231 RCed and 349956 GCed (349956|0|0 waiting for GC)
With my patches, about 500,000 objects are being traversed (more!) but there are only about 580,000 calls to JS::TraceChildren. The CC max pause is about 40% of the previous one:
CC(T+52.6)[content] max pause: 648ms, total time: 944ms, slices: 49, suspected: 433, visited: 19126 RCed and 496476 GCed, collected: 18970 RCed and 493321 GCed (493321|0|0 waiting for GC)
I'll try to finish up the patches and put them up for review soon.
Assignee | ||
Comment 50•8 years ago
|
||
I also filed bug 1304205 about making slice budgets larger for longer running slice times, to reduce max pauses for long CCs.
Assignee | ||
Comment 51•8 years ago
|
||
In Linux64 runs, I saw trace to traverse ratios of up to 1.58 in a patched version. In an unpatched version, I see many places with a ratio of 2.3. I'll try adding an assert for a ratio of 2.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 57•8 years ago
|
||
I still need to wait for various try runs to finish, but I'm reasonably confident this is okay now, so I'm putting my patches up for review.
Part 1 and 2 are to fix the regression, while parts 3 through 5 add infrastructure to catch regressions like this in the future. Parts 1, 3 and 4 are refactoring patches that should not impact behavior. Part 5 is adds debug-only stats that are used in an assertion. I'm inclined to also backport 3-5 because we're early in Aurora, and the refactoring in there will make it harder to backport future patches otherwise. They should be low risk for opt builds.
Assignee | ||
Comment 58•8 years ago
|
||
try run showing how orange it is with a ratio limit of 2.0, and without part 2:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d6cdadc86ac7
opt looks okay:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=adda84b9be2b
all platform debug try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1c6e526473c2
Android build failure is due to another patch in my queue. Hopefully Android will be okay...
Assignee | ||
Comment 59•8 years ago
|
||
I see an assertion failure with 2.0, so I'll have to bump it up a bit. The ratio was 2.017049. Maybe 2.2 will be high enough to provide some safety. The bad runs all have a ratio of 2.3 or higher.
Comment hidden (mozreview-request) |
Assignee | ||
Updated•8 years ago
|
Attachment #8793433 -
Attachment is obsolete: true
Attachment #8793433 -
Flags: review?(bugs)
Assignee | ||
Updated•8 years ago
|
Attachment #8793434 -
Attachment is obsolete: true
Attachment #8793434 -
Flags: review?(bugs)
Assignee | ||
Updated•8 years ago
|
Attachment #8793435 -
Attachment is obsolete: true
Attachment #8793435 -
Flags: review?(bugs)
Assignee | ||
Updated•8 years ago
|
Attachment #8793436 -
Attachment is obsolete: true
Attachment #8793436 -
Flags: review?(bugs)
Assignee | ||
Comment 61•8 years ago
|
||
Well, that did not do what I wanted.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 66•8 years ago
|
||
mozreview-review |
Comment on attachment 8793923 [details]
Bug 1301301, part 1 - Unify NoteJSObject and NoteJSScript into NoteJSChild.
https://reviewboard.mozilla.org/r/80512/#review79262
I assume latter patches will explain why these changes are needed, but I guess that is quite obvious.
Attachment #8793923 -
Flags: review?(bugs) → review+
Comment 67•8 years ago
|
||
mozreview-review |
Comment on attachment 8793924 [details]
Bug 1301301, part 2 - Add Scope as an AddToCCKind.
https://reviewboard.mozilla.org/r/80514/#review79264
(not about this bug but I'm trying to remember what ObjectGroup type is and why it isn't listed here.)
Attachment #8793924 -
Flags: review?(bugs) → review+
Comment 68•8 years ago
|
||
mozreview-review |
Comment on attachment 8793925 [details]
Bug 1301301, part 3 - Implement Begin and EndCycleCollectionCallback in CCJSContext.
https://reviewboard.mozilla.org/r/80516/#review79266
Attachment #8793925 -
Flags: review?(bugs) → review+
Comment 69•8 years ago
|
||
mozreview-review |
Comment on attachment 8793926 [details]
Bug 1301301, part 4 - Make calls to JS::TraceChildren go through a single method on CCJSContext.
https://reviewboard.mozilla.org/r/80518/#review79272
::: xpcom/base/CycleCollectedJSContext.cpp:133
(Diff revision 1)
> } // namespace mozilla
>
> -struct NoteWeakMapChildrenTracer : public JS::CallbackTracer
> +struct CCJSTracer : public JS::CallbackTracer
> {
> - NoteWeakMapChildrenTracer(JSContext* aCx,
> + CCJSTracer(CycleCollectedJSContext* aCx,
> + WeakMapTraceKind aWeakTraceKind = TraceWeakMapValues)
I don't yet understand this weakmap stuff here.
Comment 70•8 years ago
|
||
mozreview-review |
Comment on attachment 8793437 [details]
Bug 1301301, part 5 - In debug builds, check if there are too many calls to JS::TraceChildren.
https://reviewboard.mozilla.org/r/80174/#review79274
::: xpcom/base/CycleCollectedJSContext.h:482
(Diff revision 2)
> void invoke(JS::HandleObject scope, Closure& closure) override;
> };
> EnvironmentPreparer mEnvironmentPreparer;
> +
> +#ifdef DEBUG
> + uint32_t mNumTraversed;
mNumTraversed is only about JS things, so its name should say so. I first through this was about all things traversed.
mNumOfTraversedJSThings?
Comment 71•8 years ago
|
||
mozreview-review |
Comment on attachment 8793926 [details]
Bug 1301301, part 4 - Make calls to JS::TraceChildren go through a single method on CCJSContext.
https://reviewboard.mozilla.org/r/80518/#review79280
It might make sense to make CCJSTracer's ctor protected, so that only those subclasses could use it.
::: xpcom/base/CycleCollectedJSContext.cpp:150
(Diff revision 1)
> +
> +struct NoteWeakMapChildrenTracer : public CCJSTracer
> +{
> + NoteWeakMapChildrenTracer(CycleCollectedJSContext* aCx,
> nsCycleCollectionNoteRootCallback& aCb)
> - : JS::CallbackTracer(aCx), mCb(aCb), mTracedAny(false), mMap(nullptr),
> + : CCJSTracer(aCx), mCb(aCb), mTracedAny(false), mMap(nullptr),
Note to myself...This relies on TraceWeakMapValues.
And looks like WeakMapTraceKind weakTraceKind = TraceWeakMapValues is also in JS::CallbackTracer, so behavior shouldn't change.
::: xpcom/base/CycleCollectedJSContext.cpp:347
(Diff revision 1)
> -{
> - TraversalTracer(JSContext* aCx, nsCycleCollectionTraversalCallback& aCb)
> - : JS::CallbackTracer(aCx, DoNotTraceWeakMaps), mCb(aCb)
> - {
> +{
> - }
> + TraversalTracer(CycleCollectedJSContext* aCx,
> + nsCycleCollectionTraversalCallback& aCb)
note to myself...ok, here we keep using DoNotTraceWeakMaps
Attachment #8793926 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 72•8 years ago
|
||
(In reply to Olli Pettay [:smaug] (TPAC) from comment #67)
> (not about this bug but I'm trying to remember what ObjectGroup type is and
> why it isn't listed here.)
I'm not entirely sure. I think it is some kind of shape-like thing. It would probably be useful to further profile page close CCs with a bunch of JS.
Yeah, the weak map stuff is a little messy. I don't remember off the top of my head exactly how it all works.
> mNumTraversed is only about JS things, so its name should say so.
That's a good point. I'll rename it to mNumTraversedGCThings, because it is incremented in TraverseGCThing.
> It might make sense to make CCJSTracer's ctor protected, so that only those subclasses could use it.
Good point. I have done that.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Updated•8 years ago
|
Keywords: leave-open
Comment 75•8 years ago
|
||
Pushed by amccreight@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/af8d3e696e93
part 1 - Unify NoteJSObject and NoteJSScript into NoteJSChild. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/8dd081bff020
part 2 - Add Scope as an AddToCCKind. r=smaug
Comment 76•8 years ago
|
||
mozreview-review |
Comment on attachment 8793437 [details]
Bug 1301301, part 5 - In debug builds, check if there are too many calls to JS::TraceChildren.
https://reviewboard.mozilla.org/r/80174/#review79576
::: xpcom/base/CycleCollectedJSContext.cpp:1384
(Diff revision 3)
> + // can spend a lot of time in JS::TraceChildren(). If you add a new
> + // TraceKind and are hitting this assertion, adding it to
> + // AddToCCKind might help. (The check is not done for small CC
> + // graphs to avoid noise.)
> + if (mNumTraceChildren > 1000) {
> + double traceRatio = ((double)mNumTraceChildren) / ((double)mNumTraversedGCThings);
You want to ensure mNumTraversedGCThings != 0
Attachment #8793437 -
Flags: review?(bugs) → review+
Comment 77•8 years ago
|
||
bugherder |
Assignee | ||
Comment 78•8 years ago
|
||
(In reply to Olli Pettay [:smaug] (TPAC) from comment #76)
> You want to ensure mNumTraversedGCThings != 0
Oh, good catch. I even saw that in a local run, but I forgot to fix it.
Assignee | ||
Updated•8 years ago
|
Keywords: leave-open
Comment 79•8 years ago
|
||
Pushed by amccreight@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/45fb9c1ce63a
part 3 - Implement Begin and EndCycleCollectionCallback in CCJSContext. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/6186eae0c2b7
part 4 - Make calls to JS::TraceChildren go through a single method on CCJSContext. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/da70c5abd73a
part 5 - In debug builds, check if there are too many calls to JS::TraceChildren. r=smaug
Comment 80•8 years ago
|
||
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/d217f8dd78d65ae5c90d98fb7191879d12d9ff14
The assertion got hit in https://treeherder.mozilla.org/logviewer.html#?job_id=36582894&repo=mozilla-inbound and https://treeherder.mozilla.org/logviewer.html#?job_id=36579865&repo=mozilla-inbound and it might have been getting hit and swallowed in VideoPuppeteer, https://treeherder.mozilla.org/logviewer.html#?job_id=36564623&repo=mozilla-inbound and a dozen others, or that might be something else, dunno.
Assignee | ||
Comment 81•8 years ago
|
||
Hmm, well, this is likely not the only backout those patches will have, so I'll spin off the assertions into a separate bug to facilitate backporting as fast as possible.
Assignee | ||
Updated•8 years ago
|
Attachment #8793924 -
Attachment is obsolete: true
Assignee | ||
Updated•8 years ago
|
Attachment #8793437 -
Attachment is obsolete: true
Assignee | ||
Updated•8 years ago
|
Attachment #8793924 -
Attachment is obsolete: false
Assignee | ||
Updated•8 years ago
|
Attachment #8793925 -
Attachment is obsolete: true
Assignee | ||
Updated•8 years ago
|
Attachment #8793926 -
Attachment is obsolete: true
Assignee | ||
Comment 82•8 years ago
|
||
This only hit m-c on the 9-27 build, so I'm going to wait a few more days to confirm that this fixes the regression on telemetry, then request uplift to Aurora.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago → 8 years ago
Flags: needinfo?(continuation)
Resolution: --- → FIXED
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(continuation)
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(continuation)
Updated•8 years ago
|
Target Milestone: --- → mozilla52
Reporter | ||
Comment 83•8 years ago
|
||
I want to add that, I'm no longer encountering these annoying hangs for more than 2 days,
so I making this bug as VERIFIED.
Thank you very much! \o/
Status: RESOLVED → VERIFIED
Assignee | ||
Comment 84•8 years ago
|
||
Looking at telemetry, I think this did fix the regression in mean and 95th percentile cycle collector max pauses.
https://mzl.la/2cGVplp
Flags: needinfo?(continuation)
Assignee | ||
Comment 85•8 years ago
|
||
Comment on attachment 8793923 [details]
Bug 1301301, part 1 - Unify NoteJSObject and NoteJSScript into NoteJSChild.
Approval Request Comment
[Feature/regressing bug #]: bug 1263355
[User impact if declined]: very long hangs
[Describe test coverage new/current, TreeHerder]: This code runs all the time. I have patches in progress to check this particular regression, but we only regress this every few years so I think it isn't too critical that this gets landed immediately.
[Risks and why]:
[String/UUID change made/needed]: Fairly low. Part 1 has a lot of changes, but is mostly a code refactoring, so it is low risk. Part 2 is small, and potentially risky, but it should really just change our behavior back to before bug 1263355 landed.
Attachment #8793923 -
Flags: approval-mozilla-aurora?
Assignee | ||
Updated•8 years ago
|
Attachment #8793924 -
Flags: approval-mozilla-aurora?
Comment 86•8 years ago
|
||
Hi :mccr8,
Are patch 1 & 2 the only patches we want to uplift to 51 aurora?
Flags: needinfo?(continuation)
Assignee | ||
Comment 87•8 years ago
|
||
(In reply to Gerry Chang [:gchang] from comment #86)
> Hi :mccr8,
> Are patch 1 & 2 the only patches we want to uplift to 51 aurora?
Yes. The rest have not landed yet, and are only needed for some regression testing, which I think is not super important for this code. (I spun them off into a separate bug to simplify tracking.)
Flags: needinfo?(continuation)
Comment 88•8 years ago
|
||
Comment on attachment 8793923 [details]
Bug 1301301, part 1 - Unify NoteJSObject and NoteJSScript into NoteJSChild.
Fix a regression. Take it in 51 aurora.
Attachment #8793923 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Updated•8 years ago
|
Attachment #8793924 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 89•8 years ago
|
||
bugherder uplift |
Reporter | ||
Updated•8 years ago
|
Reporter | ||
Updated•7 years ago
|
Keywords: nightly-community
Reporter | ||
Updated•7 years ago
|
QA Contact: Virtual
You need to log in
before you can comment on or make changes to this bug.
Description
•