Closed
Bug 1229654
Opened 9 years ago
Closed 5 years ago
GC performance (or something) goes off a cliff with about 300 tabs open
Categories
(Core :: JavaScript: GC, defect)
RESOLVED
WORKSFORME
| | Tracking | Status |
|---|---|---|
| firefox45 | --- | affected |
People
(Reporter: dbaron, Unassigned)
Details
(Keywords: perf)
Attachments
(1 file)
258.03 KB,
image/png
So I tend to be a tab hoarder. Most of my tabs are bugs on bugzilla.mozilla.org. I currently have 334 tabs in a single browser window, all loaded. (I have the browser.sessionstore.restore_on_demand pref set to false.)
With 334 tabs, if I load a bug on bugzilla.mozilla.org (while logged in), I frequently see the bug take 20-30 seconds to load, sometimes as much as 60 seconds. Based on some simple profiling with gdb (since everything else is broken right now), most of the time appears to be spent in JS GC, although also some in FireForgetSkippable CC stuff. (And, interestingly, a single CPU core isn't even pegged to 100% the whole time.) Yet if I load a bug from a different browser profile, it takes 2-3 seconds.
The particularly odd aspect here is the performance cliff. I currently have 334 tabs. Based on recent experience, I think if I were to close 30-50 tabs, the problem would effectively be completely gone. This cliff here makes me suspect something is O(N^2) or worse.
If I look at the browser console during a case where a bug loads slowly (in this case taking about 50 seconds), searching for "max pause", the worst GC I see is:
GC(T+114425.9) Summary - Max Pause: 7770.213ms; MMU 20ms: 0.0%; MMU 50ms: 0.0%; Total: 17572.173ms; Zones: 342 of 343; Compartments: 1059 of 1060; HeapSize: 716.734 MiB; HeapChange (abs): +0 (0);
and none of the CCs has a Max Pause of over 80ms, but this is during a load in which GC and CC appear to be stretching a page load that normally takes 2s to 50s instead.
So I'm a bit puzzled.
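As a rough aid for reading lines like the one quoted above, here is a minimal Python sketch that splits a GC summary line into its fields. It assumes the format is exactly "key: value;" pairs after "Summary - ", which is inferred only from the single line in this report; the helper names `parse_gc_summary` and `milliseconds` are made up for illustration and are not part of any Firefox tooling.

```python
import re

# Summary line copied verbatim from the report above.
LINE = ("GC(T+114425.9) Summary - Max Pause: 7770.213ms; MMU 20ms: 0.0%; "
        "MMU 50ms: 0.0%; Total: 17572.173ms; Zones: 342 of 343; "
        "Compartments: 1059 of 1060; HeapSize: 716.734 MiB; "
        "HeapChange (abs): +0 (0);")


def parse_gc_summary(line):
    """Split a 'GC(...) Summary - k: v; k: v;' line into a dict of strings.

    Assumption: fields are separated by ';' and each key ends at the first ':'.
    """
    _, _, body = line.partition("Summary - ")
    fields = {}
    for part in body.split(";"):
        key, sep, value = part.partition(":")
        if sep:  # skip the empty piece after the final ';'
            fields[key.strip()] = value.strip()
    return fields


def milliseconds(value):
    """Convert a value such as '7770.213ms' to a float (milliseconds)."""
    match = re.fullmatch(r"([0-9.]+)ms", value)
    if match is None:
        raise ValueError(f"not a millisecond value: {value!r}")
    return float(match.group(1))


fields = parse_gc_summary(LINE)
max_pause_ms = milliseconds(fields["Max Pause"])
total_ms = milliseconds(fields["Total"])
print(f"max pause {max_pause_ms / 1000:.1f}s, total {total_ms / 1000:.1f}s")
print("zones collected:", fields["Zones"])
```

For the quoted line this reports a longest pause of roughly 7.8 seconds and roughly 17.6 seconds in total for that one GC, with 342 of 343 zones collected.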
Reporter
Comment 1•9 years ago
Reporter
Comment 2•9 years ago
A profile taken with perf (the Linux command) showed the following. The profiled interval probably covered only about half of one of these pauses, so the percentages for anything that runs inside a GC pause should be roughly doubled:
> 6.34% Web Content libxul.so [.] _ZNK2js2gc11TenuredCell11arenaHeaderEv
> 6.33% Web Content libpthread-2.21.so [.] pthread_getspecific
> 4.66% Web Content libxul.so [.] _ZNK2js2gc11TenuredCell14markIfUnmarkedEj
> 4.17% Web Content libxul.so [.] _ZN2js2gc15IsInsideNurseryEPKNS0_4CellE
> 3.36% Web Content libxul.so [.] _ZN2js8GCMarker19processMarkStackTopERNS_11SliceBudgetE
> 3.34% Web Content libxul.so [.] _ZN2js29CurrentThreadCanAccessRuntimeEP9JSRuntime
> 2.92% Web Content libxul.so [.] _ZNK2js2gc11TenuredCell4zoneEv
> 2.79% Web Content libnspr4.so [.] PR_GetCurrentThread
> 2.38% Web Content libxul.so [.] _ZN2js2gc11TenuredCell11fromPointerEPv
> 1.92% Web Content libxul.so [.] _ZN2js26CurrentThreadCanAccessZoneEPN2JS4ZoneE
> 1.69% Web Content libxul.so [.] _ZNK2js2gc11TenuredCell9isAlignedEv
> 1.62% Web Content libxul.so [.] _ZN2JS4Zone11isGCMarkingEv
> 1.40% Web Content libxul.so [.] _ZNK2js2gc11ArenaHeader12getAllocKindEv
> 1.39% Web Content libxul.so [.] _ZN2js8GCMarker19eagerlyMarkChildrenEPNS_5ShapeE
> 1.35% Web Content libxul.so [.] _ZN2js16CheckTracedThingI8JSStringEEvP8JSTracerPT_
> 1.29% Web Content libxul.so [.] _ZNK8JSObject4zoneEv
> 1.26% Web Content libxul.so [.] _ZNK2JS4Zone11isAtomsZoneEv
> 1.04% Web Content libxul.so [.] _ZN2js2gc6detailL25GetGCThingMarkWordAndMaskEmjPPmS2_
(I am profiling a debug build here, but one with the patch:
https://hg.mozilla.org/users/dbaron_mozilla.com/patches/raw-file/32f3c2002e7f/disable-slow-js-asserts )
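To make the "roughly double the percentages" rule of thumb concrete, the following Python sketch sums the rows listed above and groups them by shared object. It is only an illustration: the row layout (percent, command, shared object, "[.]", symbol) is assumed from the quoted output, the names `PERF_ROWS`, `ROW` and `parse_rows` are hypothetical, and not every listed symbol is necessarily GC work.

```python
import re
from collections import defaultdict

# Rows copied from the perf output quoted above (leading "> " removed).
PERF_ROWS = """\
6.34% Web Content libxul.so [.] _ZNK2js2gc11TenuredCell11arenaHeaderEv
6.33% Web Content libpthread-2.21.so [.] pthread_getspecific
4.66% Web Content libxul.so [.] _ZNK2js2gc11TenuredCell14markIfUnmarkedEj
4.17% Web Content libxul.so [.] _ZN2js2gc15IsInsideNurseryEPKNS0_4CellE
3.36% Web Content libxul.so [.] _ZN2js8GCMarker19processMarkStackTopERNS_11SliceBudgetE
3.34% Web Content libxul.so [.] _ZN2js29CurrentThreadCanAccessRuntimeEP9JSRuntime
2.92% Web Content libxul.so [.] _ZNK2js2gc11TenuredCell4zoneEv
2.79% Web Content libnspr4.so [.] PR_GetCurrentThread
2.38% Web Content libxul.so [.] _ZN2js2gc11TenuredCell11fromPointerEPv
1.92% Web Content libxul.so [.] _ZN2js26CurrentThreadCanAccessZoneEPN2JS4ZoneE
1.69% Web Content libxul.so [.] _ZNK2js2gc11TenuredCell9isAlignedEv
1.62% Web Content libxul.so [.] _ZN2JS4Zone11isGCMarkingEv
1.40% Web Content libxul.so [.] _ZNK2js2gc11ArenaHeader12getAllocKindEv
1.39% Web Content libxul.so [.] _ZN2js8GCMarker19eagerlyMarkChildrenEPNS_5ShapeE
1.35% Web Content libxul.so [.] _ZN2js16CheckTracedThingI8JSStringEEvP8JSTracerPT_
1.29% Web Content libxul.so [.] _ZNK8JSObject4zoneEv
1.26% Web Content libxul.so [.] _ZNK2JS4Zone11isAtomsZoneEv
1.04% Web Content libxul.so [.] _ZN2js2gc6detailL25GetGCThingMarkWordAndMaskEmjPPmS2_
"""

# percent, command (may contain spaces), shared object, "[.]", symbol
ROW = re.compile(r"([0-9.]+)%\s+(.+?)\s+(\S+)\s+\[\.\]\s+(\S+)")


def parse_rows(text):
    """Return (percent, command, dso, symbol) tuples for each matching line."""
    rows = []
    for line in text.splitlines():
        match = ROW.fullmatch(line.strip())
        if match:
            percent, command, dso, symbol = match.groups()
            rows.append((float(percent), command, dso, symbol))
    return rows


rows = parse_rows(PERF_ROWS)

# Sum the sample share per shared object.
by_dso = defaultdict(float)
for percent, _command, dso, _symbol in rows:
    by_dso[dso] += percent

listed_total = sum(percent for percent, *_ in rows)
print(f"{len(rows)} rows, {listed_total:.2f}% of samples listed")
print(f"doubled estimate: {2 * listed_total:.2f}%")
for dso, percent in sorted(by_dso.items(), key=lambda item: -item[1]):
    print(f"{percent:6.2f}%  {dso}")
```

The listed rows alone add up to a bit under half of all samples, so after doubling they would account for nearly the whole pause. That is only an estimate based on the rule of thumb above, not a measurement.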
Reporter
Comment 3•9 years ago
Some of this may actually be DEBUG-only stuff related to AssertZoneIsMarking. I'll try disabling that locally.
Reporter
Comment 4•9 years ago
With that DEBUG stuff commented out, the profile has some different things in it, but it's still very slow, and still dominated by JS GC.
Comment 5•9 years ago
I've often had 400+ tabs over the years, so I'm well acquainted with GC and other performance issues (not at a technical level, mind you).
Things seem to have changed somewhat (starting last fall?): I've observed the "off the cliff" performance at similarly high bugzilla tab counts as you describe, while things are often OK in other browsers. But this is typically accompanied by every tab indicating that it's waiting on some network-related activity. This occurs in multiple locations, so I don't think it's bandwidth related. (Unfortunately I can't find a screenshot of the exact wording.) In prior years, when the issue was GC and high-memory related, I could often eventually get things to clear up by closing a few hundred tabs. Not so lately.
In general I've found significant relief by using dom.ipc.processCount >1, which significantly reduces the memory usage per process. I also clear old session data using about:sessionstore's clear closed tabs and windows. But neither seems to make me totally immune. (Note: I'm on Windows, not Linux.)
Comment 6•9 years ago
Wayne, did you check your issue is actually GC related and not some other issue, say something in e10s?
You may set javascript.options.mem.log to true and look at browser console to see GC and CC times.
Flags: needinfo?(vseerror)
Comment 7•9 years ago
(In reply to Olli Pettay [:smaug] from comment #6)
> Wayne, did you check your issue is actually GC related and not some other
> issue, say something in e10s?
> You may set javascript.options.mem.log to true and look at browser console
> to see GC and CC times.
I have it enabled, but I didn't check it at the time, so I don't know for a fact that my issue wasn't GC related and I can't rule anything out. But:
* I've been running e10s a long time and have a small sense of the issues
* I've seen GC issues for several years, and the symptoms I was seeing were new to me: pages were waiting on data from the server. (And I have to assume my GC situation has been greatly helped by e10s, as a result of the much reduced memory allocation per process from splitting the memory across multiple processes.)
To clarify my comment 5, I suspect my recent issues are not GC, and I'm suggesting dbaron might be seeing the same thing.
Flags: needinfo?(vseerror)
Reporter
Comment 8•9 years ago
Comment 9•8 years ago
(In reply to David Baron :dbaron: ⌚️UTC-8 from comment #8)
> comment 0 and comment 2 pretty clearly report stuff that is related to GC
I understand that. In which case, does exploiting dom.ipc.processCount >1 help mitigate the problem?
Flags: needinfo?(dbaron)
Reporter
Comment 10•8 years ago
I've basically changed my usage patterns (including no longer using debug builds, which may make a substantial difference) so that I avoid hitting this bug, so it's become hard to evaluate.
Flags: needinfo?(dbaron)
Comment 11•5 years ago
Jon, without clear STR is this worth keeping?
(For me, over the last couple of years with e10s and routinely 300-500 tabs, I still don't have visible performance issues, so I haven't dug into GC.)
Flags: needinfo?(jcoppeard)
Comment 12•5 years ago
Closing this for now. Please reopen if it becomes a problem again.
Status: NEW → RESOLVED
Closed: 5 years ago
Flags: needinfo?(jcoppeard)
Resolution: --- → WORKSFORME