Multi-second pause in compacting GC

RESOLVED WORKSFORME

Status

()

Core
JavaScript: GC
RESOLVED WORKSFORME
2 years ago
10 months ago

People

(Reporter: The 8472, Unassigned)

Tracking

(Blocks: 1 bug)

Trunk
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(6 attachments)

(Reporter)

Description

2 years ago
Created attachment 8599975 [details]
compacting.png

I'm occasionally seeing quite long application pauses, one of the causes seems to be the compacting phase of the GC.
The length of those pauses seems to worsen with browser uptime, but that might simply be due to more of the lazy-loaded tabs being, well, loaded over the course of a browsing session.

My profile has ~1.5k tabs. using Firefox 40.0a1 Build ID 20150425030208
(Reporter)

Updated

2 years ago
Blocks: 1122579
(Reporter)

Comment 1

2 years ago
Created attachment 8601453 [details]
previous.js

While the bug itself concerns pause times during regular browsing after process startup I found that compacting GC also has an impact on startup time, which is easier to compare than intermittent GCs

Reduced Testcase:

1. create a blank profile
2. disable e10s [multiple processes make analysis of observations more difficult]
3. set dom.max_chrome_script_run_time=0 [profile startup takes so long that script timeouts would otherwise occur]
4. load attached sessionstore into profile (should have 1 foreground tab + 1000 background tabs in a separate panorama group)
5. start profile either with incremental+compacting GC enabled or disabled
6. measure startup time


used measurement method: manual tracking of CPU time (not wall time) until the initial tab from the session restore becomes scrollable after startup


dom.cycle_collector.incremental=true
javascript.options.mem.gc_incremental=true
javascript.options.compact_on_user_inactive=truemeasure
javascript.options.mem.gc_compacting=true

measured time (3 samples): 2m58s 2m57s 2m59s


dom.cycle_collector.incremental=false
javascript.options.mem.gc_incremental=false
javascript.options.compact_on_user_inactive=false
javascript.options.mem.gc_compacting=false

measured time (3 samples): 47s 50s 50smeasure


Note: On my real profile the discrepancy in startup time is even larger and more erratic, possibly due larger session store or addons interacting with the session restore
(Reporter)

Comment 2

2 years ago
(In reply to The 8472 from comment #1)
forgot to add firefox/system info used to measure:

Nightly x64
40.0a1
20150504030210
Windows 7 x64

I'm using 64bit builds since 32bit ones just tend to exhaust the address space
(In reply to The 8472 from comment #1)
> dom.cycle_collector.incremental

The cycle collector interacts with the GC, but it's not part of the GC.

> javascript.options.mem.gc_incremental

Incremental GC splits GC up into small chunks to avoid janking the browser too much. Disabling incremental GC changes things a whole lot more than just disabling compacting GC.

Can you reproduce the same difference with
dom.cycle_collector.incremental=true
javascript.options.mem.gc_incremental=true
?
(Reporter)

Comment 4

2 years ago
mhh, good point, should have measured all combinations.

Looks about like this:

-inc -compact = 47s
-inc +compact = 1m10s
+inc +compact = 2m58s
+inc -compact = 3m12s


So the startup time difference seems to be dominated by incremental collection. Then the question is whether that's a separate issue or if the initially reported pauses during regular browsing are also caused by incremental instead of compacting GC.

Should try running my regular profile with -inc +compacting then and see if I can reproduce it?
It's possible that the incremental GCs end up doing a lot of bookkeeping without making a lot of actual progress, and so we spend a large amount of time doing repeated work compared to just doing full GCs.

One thing that might be worth trying, if you're not tired of testing by now, would be to raise javascript.options.mem.gc_incremental_slice_ms (e.g. double it), and see what that does to the times.

CCing Terrence since he's come across situations like that before (and recently), though there's probably no quick general fix for it.
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #5)
Also, as Emanuel previously referred to, there's a difference between the cycle collector and the garbage collector (they are two separate components).  Is it possible to test with making one incremental and the other not?
(Reporter)

Comment 7

2 years ago
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #5)
> One thing that might be worth trying, if you're not tired of testing by now,
> would be to raise javascript.options.mem.gc_incremental_slice_ms (e.g.
> double it), and see what that does to the times.

Definitely makes a difference in startup performance for the testcase

compact + incremental 20ms = 1m54s
compact + incremental 10ms = 3m05s


The thing is, on my main profile I already have this on 20ms. I guess I'll bump it to 100ms there to see if the long pauses go away.
(Reporter)

Comment 8

2 years ago
(In reply to Jon Coppeard (:jonco) from comment #6)
> (In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #5)
> Also, as Emanuel previously referred to, there's a difference between the
> cycle collector and the garbage collector (they are two separate
> components).  Is it possible to test with making one incremental and the
> other not?

10ms slice +compact -cc incremental +gc incremental = 3m5s
10ms slice +compact +cc incremental -gc incremental = 59s



Btw, do those settings require a restart to be applied or can I change them on the fly to see what difference they make on my main profile? The issue is that things get progressively worse over time, so it's easier to observe pauses on a session that has some uptime.
Created attachment 8601739 [details]
gclogs.txt

Here are MOZ_GCTIMER logs from a startup with that profile with IGC. The majority of the GC's reported are very short DOM_WORKER GC's which we can ignore. There is a non-incremental GC because we've gone too long since GCing that takes only 0.5 sec or so. Then, right near the end we do an extremely long GC -- ~4s of GC work done over a 15s period in ~100 slices.

At a first glance we appear to be suffering from the same symptom as I've seen elswhere recently where we're spending a ton of time in Finalize Start Callback and not getting much actual work done on each slice. The code segment that's getting flagged is:

    {
        gcstats::AutoPhase ap(stats, gcstats::PHASE_FINALIZE_START);
        callFinalizeCallbacks(&fop, JSFINALIZE_GROUP_START);
        callWeakPointerCallbacks();
    }
 
The callFinalizeCallbacks appears to be going to https://dxr.mozilla.org/mozilla-central/source/js/xpconnect/src/XPCJSRuntime.cpp#796 -- it sets two bools, so is probably not implicated. The other call goes to here: https://dxr.mozilla.org/mozilla-central/source/js/ipc/JavaScriptChild.cpp#55

We seem to call the entire list at the start of each slice, so I'd guess that this work is largely duplicated.
Also, I missed:
https://dxr.mozilla.org/mozilla-central/source/js/xpconnect/src/XPCJSRuntime.cpp#970
To see what's actually implicated we can dump out a CC log from about:memory. Grepping it, we can see:

╰> grep WeakMapEntry /tmp/cc-edges.18387.1430862612.log | wc -l
7100

Whereas there are 10x more nsXPCWrappedJS:

╰> grep nsXPCWrappedJS /tmp/cc-edges.18387.1430862612.log | wc -l
71828

So definitely an XPConnect issue. Let's histogram the wrapped-js edges to see if anything sticks out:

╰> grep nsXPCWrappedJS /tmp/cc-edges.18387.1430862612.log | awk '{print $4}' | awk ' NF > 0{ counts[$0] = counts[$0] + 1; } END { for (word in counts) print counts[word], word; }' | sort -nr
31141 (nsIDOMEventListener)
30206 (nsIMessageListener)
6152 (nsIObserver)
3123 (nsISupports)
1003 (nsIWebProgressListener)
... long tail ...

Maybe? I don't really know. We're solidly into the browser now and I'm out of my depth there. The thing that strikes me as odd is that there are only ~1500 unloaded tabs here, so we must have 10's of listeners per unloaded tab? Fishy.
(Reporter)

Comment 12

2 years ago
(In reply to Terrence Cole [:terrence] from comment #11)
> 30206 (nsIMessageListener)

could this be e10s framescripts hooked into unloaded tabs?
(Reporter)

Comment 13

2 years ago
Created attachment 8641703 [details]
50%-compacting.png

I think there might be two separate issues. One is the GC pauses during steady-state operation and the other is startup, which is more about GC throughput than pause times.

Maybe it would be good to make the incremental collector *less* incremental until session restore is complete? E.g. temporarily bump the slice time to ~200ms until the first restored tab has been loaded? Or maybe some heuristic that gives up on incremental collection for a short time if several slices happen back to back due to allocation pressure.

As for pauses during steady-state: Here's another profile captured after ~1 day of uptime with incremental off. Of a 2 second pause it spends about 50% in compacting.

That's on Nightly 42.0a1 20150729030208
(Reporter)

Comment 14

2 years ago
Since reporting this bug excessive pauses (not just GC ones) have gone down a lot.


measures that lead to that point:

* most recently the fix for bug 1214961 helped
* force-disabled a11y, bug 1187587
* getting some extensions that use Components.util.Sandbox to use the sameZoneAs argument which may have had an impact on the pauses caused by js::NukeCrossCompartmentWrappers; bug 816784, need to take new measurements to confirm

the remaining jank I occasionally see (~500ms) seems to come from

a) something related to cookie-/session-store. still trying to investigate with the profiler.

b) some GC slices taking significantly longer than others.

here is one for example:

[...]
10:22:10.407 GC Slice 27 - Pause: 40,641ms of 40ms budget (@ 4050,017ms); Reason: INTER_SLICE_GC; Reset: no; Times: Mark: 38,538ms
10:22:10.537 GC Slice 28 - Pause: 29,873ms of 40ms budget (@ 4192,186ms); Reason: INTER_SLICE_GC; Reset: no; Times: Mark: 28,044ms, Other: 28,000ms
10:22:11.085 GC Slice 29 - Pause: 442,845ms of 40ms budget (@ 4322,175ms); Reason: INTER_SLICE_GC; Reset: no; Times: Sweep: 433,195ms, Other: 32,730ms, Mark During Sweeping: 303,074ms, Mark Incoming Black Pointers: 0,187ms, Mark Weak: 26,893ms, Mark Incoming Gray Pointers: 0,136ms, Mark Gray: 271,176ms, Mark Gray and Weak: 4,591ms, Finalize Start Callbacks: 16,316ms, Per-Slice Weak Callback: 4,162ms, Per-Compartment Weak Callback: 12,152ms, Sweep Compartments: 54,453ms, Sweep Discard Code: 20,014ms, Sweep Inner Views: 0,300ms, Sweep Cross Compartment Wrappers: 49,080ms, Sweep Base Shapes: 5,087ms, Sweep Initial Shapes: 12,153ms, Sweep Type Objects: 27,533ms, Sweep Breakpoints: 0,546ms, Sweep Regexps: 0,716ms, Sweep Miscellaneous: 11,309ms, Sweep type information: 0,200ms, Sweep type tables and compilations: 0,179ms, Sweep Object: 26,198ms, Sweep String: 0,181ms, Sweep JIT code: 0,120ms
[...]
GC(T+110257.6) Summary - Max Pause: 442,845ms; MMU 20ms: 0,0%; MMU 50ms: 0,0%; Total: 2179,753ms; Zones: 1870 of 1870; Compartments: 4112 of 4112; HeapSize: 1052,613 MiB; HeapChange (abs): +0 (0); 


Question/Speculation related to that: Does the size of a zone as listed in about:memory have an impact on the minimum duration a slice covering that zone can have? Because one zone (the system zone) seems to grow in size with the number of tabs. I.e. this may not scale properly.
(In reply to The 8472 from comment #14)
> Since reporting this bug excessive pauses (not just GC ones) have gone down
> a lot.
> 
> 
> measures that lead to that point:
> 
> * most recently the fix for bug 1214961 helped

I'm extremely glad to hear that our changes are improving your experience!

> * force-disabled a11y, bug 1187587
> * getting some extensions that use Components.util.Sandbox to use the
> sameZoneAs argument which may have had an impact on the pauses caused by
> js::NukeCrossCompartmentWrappers; bug 816784, need to take new measurements
> to confirm
> 
> the remaining jank I occasionally see (~500ms) seems to come from
> 
> a) something related to cookie-/session-store. still trying to investigate
> with the profiler.
> 
> b) some GC slices taking significantly longer than others.
> 
> here is one for example:
> 
> [...]
> 10:22:10.407 GC Slice 27 - Pause: 40,641ms of 40ms budget (@ 4050,017ms);
> Reason: INTER_SLICE_GC; Reset: no; Times: Mark: 38,538ms
> 10:22:10.537 GC Slice 28 - Pause: 29,873ms of 40ms budget (@ 4192,186ms);
> Reason: INTER_SLICE_GC; Reset: no; Times: Mark: 28,044ms, Other: 28,000ms
> 10:22:11.085 GC Slice 29 - Pause: 442,845ms of 40ms budget (@ 4322,175ms);
> Reason: INTER_SLICE_GC; Reset: no; Times: Sweep: 433,195ms, Other: 32,730ms,
> Mark During Sweeping: 303,074ms, Mark Incoming Black Pointers: 0,187ms, Mark
> Weak: 26,893ms, Mark Incoming Gray Pointers: 0,136ms, Mark Gray: 271,176ms,
> Mark Gray and Weak: 4,591ms, Finalize Start Callbacks: 16,316ms, Per-Slice
> Weak Callback: 4,162ms, Per-Compartment Weak Callback: 12,152ms, Sweep
> Compartments: 54,453ms, Sweep Discard Code: 20,014ms, Sweep Inner Views:
> 0,300ms, Sweep Cross Compartment Wrappers: 49,080ms, Sweep Base Shapes:
> 5,087ms, Sweep Initial Shapes: 12,153ms, Sweep Type Objects: 27,533ms, Sweep
> Breakpoints: 0,546ms, Sweep Regexps: 0,716ms, Sweep Miscellaneous: 11,309ms,
> Sweep type information: 0,200ms, Sweep type tables and compilations:
> 0,179ms, Sweep Object: 26,198ms, Sweep String: 0,181ms, Sweep JIT code:
> 0,120ms
> [...]
> GC(T+110257.6) Summary - Max Pause: 442,845ms; MMU 20ms: 0,0%; MMU 50ms:
> 0,0%; Total: 2179,753ms; Zones: 1870 of 1870; Compartments: 4112 of 4112;
> HeapSize: 1052,613 MiB; HeapChange (abs): +0 (0); 
> 

Incrementalizing something is much like this: http://despair.com/products/underachievement. As we incrementalize more things, we keep seeing new long pauses. It's just a matter of time though: we're actively working on fixing the currently non-incremental parts of sweeping, so even the 500ms pauses should fall eventually.

In the specific bad case you posted, it looks like the main component is "Mark During Sweep" at 300ms, the majority of which is under "Mark Gray" at 270ms. The "Mark Gray" phase is where we mark all objects held by held live by cycle collectable, e.g. DOM, objects. This is currently non-incremental because of our lack of barriers in gecko and is something else we've got on the schedule.

The second tallest blade of grass is "Sweep Compartments" at 54ms. Gratifyingly, the tasks under this actually add to ~200+ms, so our parallelization efforts from last year are paying huge dividends in the right workloads. We've also got plans to incrementalize this eventually.

Worryingly, we also have 32ms in "Sweeping:Other". This category shows all the time that we didn't track in a specific sub-category. I /thought/ we had all the possible time-consuming tasks properly tracked at this point: guess not.

> Question/Speculation related to that: Does the size of a zone as listed in
> about:memory have an impact on the minimum duration a slice covering that
> zone can have? Because one zone (the system zone) seems to grow in size with
> the number of tabs. I.e. this may not scale properly.

Yes and no. It's complicated. Most marking and sweeping can be fluidly broken into tiny chunks, so growing a zone has little impact. The things we can't easily break into chunks are hash tables and other C data structures (although we're also working on this). For these currently non-incremental structures, visiting them takes time proportional to their size. However, the size of these tables is not necessarily proportional to the size of the zone: they are generally proportional to the number of instances of usage of some specific feature. This is, of course, going to be dependent on the specifics of the running JS code, but I suppose for large enough zones, it will probably average out to being more or less proportional. So yes, maybe?
(Reporter)

Comment 16

2 years ago
> but I suppose for large enough zones, it will probably average out to being more or less proportional. So yes, maybe?

filed bug 1223458 for that, closing this one since the title is misleading anyway.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WORKSFORME
(Reporter)

Comment 17

2 years ago
Created attachment 8685535 [details]
compacting2.png

Well, I'll be damned. A few minutes after closing the bug I'm seeing a compacting pause in the content process spanning several seconds.

Is there a way to get the javascript.options.mem.log = true output written to a file? It spews so much that the browser console output falls off the scrollback buffer.
(Reporter)

Updated

2 years ago
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
(Reporter)

Comment 18

2 years ago
Ok, I got a veeery long pause again, this time I found a corresponding mem.log entry:

GC(T+230.0) Summary - Non-Incremental: 70177,489ms (allocation trigger); Zones: 1732 of 1886; Compartments: 3808 of 3962; HeapSize: 882,449 MiB; HeapChange (abs): +339 (339); Kind: Shrinking; Relocated: 29,852 MiB;
See Also: → bug 1161706
(Reporter)

Comment 19

a year ago
Created attachment 8715506 [details]
compacting during startup.png

This one is during startup.
Thanks for posting the profiler output.  In both of the cases, most of the time is being spent in JSCompartment::fixupCrossCompartmentWrappersAfterMovingGC.

The problem here is that the wrapper map only has information about outgoing edges from a zone.  Here we need to update any zones which have pointers into the zone we're operating on, and so we have to update them all.  Since there are so many zones in this case performance is poor.

A possible solution is to track which zones have incoming cross-compartment pointers into a zone.  This would allow us to only update the necessary zones here.  It would also allow us to improve JSCompartment::traceIncomingCrossCompartmentEdgesForZoneGC which is used in normal zone GC, which is also conservative at the moment.
(Reporter)

Comment 21

a year ago
Is my understanding correct that window objects separate zones? 

the js-non-window stuff in about:memory looks like this for me:

├──1,060.35 MB (50.42%) -- js-non-window
│  ├────986.75 MB (46.92%) -- zones
│  │    ├──977.50 MB (46.48%) -- zone(0xb6d000)
│  │    │  ├──622.66 MB (29.61%) ++ compartment([System Principal], inProcessTabChildGlobal?ownedBy=chrome://browser/content/browser.xul)
│  │    │  ├──149.03 MB (07.09%) ++ (473 tiny)
│  │    │  ├───69.73 MB (03.32%) ++ strings
│  │    │  ├───52.24 MB (02.48%) ── unused-gc-things
│  │    │  ├───43.30 MB (02.06%) -- object-groups
│  │    │  │   ├──42.96 MB (02.04%) ── gc-heap
│  │    │  │   └───0.34 MB (00.02%) ── malloc-heap
│  │    │  └───40.55 MB (01.93%) ── type-pool
│  │    └────9.26 MB (00.44%) -- (17 tiny)
│  │         ├──4.14 MB (00.20%) ++ zone(0xb60800)
│  │         ├──2.08 MB (00.10%) ++ zone(0x54ec6000)
│  │         ├──1.36 MB (00.06%) ++ zone(0x1c004000)/compartment([System Principal])
│  │         ├──0.27 MB (00.01%) ++ zone(0xb63000)
│  │         ├──0.19 MB (00.01%) ++ zone(0x3eb11000)
│  │         ├──0.15 MB (00.01%) ++ zone(0x2d0ab800)
│  │         ├──0.14 MB (00.01%) ++ zone(0x41d1b000)
│  │         ├──0.11 MB (00.01%) ++ zone(0x2d096800)
│  │         ├──0.11 MB (00.01%) ++ zone(0x2de10800)
│  │         ├──0.11 MB (00.01%) ++ zone(0x2ee0f800)
│  │         ├──0.11 MB (00.01%) ++ zone(0x40067800)
│  │         ├──0.11 MB (00.01%) ++ zone(0x4d5c5800)
│  │         ├──0.11 MB (00.01%) ++ zone(0x51922800)
│  │         ├──0.07 MB (00.00%) ++ zone(0x50b05800)/compartment([Expanded Principal [about:blank]])
│  │         ├──0.06 MB (00.00%) ++ zone(0x1b772000)/compartment([Expanded Principal [resource://gre-resources/hiddenWindow.html]])
│  │         ├──0.06 MB (00.00%) ++ zone(0x41d8f800)/compartment([Expanded Principal [about:blank]])
│  │         └──0.06 MB (00.00%) ++ zone(0x24c38000)/compartment([Expanded Principal [about:blank]])

additionally there are

├────336.21 MB (15.99%) -- window-objects
│    ├──255.47 MB (12.15%) -- (1425 tiny)


So there is that one huge system zone and everything else is tiny. Would tracking incoming edges still help given this distribution?
(In reply to The 8472 from comment #21)
> Is my understanding correct that window objects separate zones? 

It's a zone per tab plus one (or more?) for the system.

> Would tracking incoming edges still help given this distribution?

Well the current implementation is N^2 in the number of zones and seeing that you have rather a lot of them, my guess is that this is the root of the issue.  Tracking incoming edges should help this.
(In reply to Jon Coppeard (:jonco) from comment #22)
So far I've prototyped the approach outlined above but haven't been able to demonstrate any performance improvement.
(Reporter)

Comment 24

a year ago
were you able to reproduce the long pauses in the first place? They don't occur that frequently, otherwise the browser would be unusable. So a dramatic improvement may only show up equally rarely.
(In reply to The 8472 from comment #24)
No I wasn't able to reproduce the pauses either.  So my theory was probably not correct.

In case you didn't try this already, does the problem still happen with add-ons disabled?
(Reporter)

Comment 26

a year ago
> In case you didn't try this already, does the problem still happen with add-ons disabled?

Eh, as I have said before, it's difficult to observe reliably "in the wild". Gecko profiler is crashy and after a restart its recording output often does not contain the first 90 seconds, sometimes even more.

And testing something anything other than startup with addons off is not really acceptable on my regular profile.

> No I wasn't able to reproduce the pauses either.

Would be an artificial test-case that involves hitting "minimize memory usage" in about:memory good enough to reproduce long trace/fixup-CCW calls?
(In reply to The 8472 from comment #26)
> And testing something anything other than startup with addons off is not
> really acceptable on my regular profile.

That's fair enough.  Out of interested, which addons are you running?  Just in case anything jumps out.

> Would be an artificial test-case that involves hitting "minimize memory
> usage" in about:memory good enough to reproduce long trace/fixup-CCW calls?

I expect so but I guess it depends on what the problem is.
(Reporter)

Comment 28

a year ago
(In reply to Jon Coppeard (:jonco) from comment #27)
> That's fair enough.  Out of interested, which addons are you running?  Just
> in case anything jumps out.

Abduction!	3.5.0.1-signed	true	{b0e1b4a6-2c6f-4e99-94f2-8e625d7ae255}
Cookie Monster	1.3.0.0	true	{45d8ff86-d909-11db-9705-005056c00008}
FaviconizeTab	1.0.6.1-signed	true	faviconizetab@espion.just-size.jp
Firefox Hello Beta	1.1.12	true	loop@mozilla.org
Greasemonkey	2015.11.13.beta	true	{e4a8a97b-f2ed-450b-b12d-ee082ba24781}
IPFS Gateway Redirect	1.5.5	true	ipfs-firefox-addon@lidel.org
Multi-process staged rollout	1.0	true	e10srollout@mozilla.org
New Tab Override (browser.newtab.url replacement)	2.3.1	true	newtaboverride@agenedia.com
Pocket	1.0	true	firefox@getpocket.com
Rikaichan	3.00a3	true	{0AA9101C-D3C1-4129-A9B7-D778C6A17F82}
Rikaichan Japanese Names Dictionary File	2.01.160101	true	rikaichan-jpnames@polarcloud.com
Rikaichan Japanese-English Dictionary File	2.01.160101	true	rikaichan-jpen@polarcloud.com
Session Manager	0.8.1.10	true	{1280606b-2510-4fe0-97ef-9b5a22eafe30}
Tab Counter	1.9.9.3	true	tabcounter@morac
Tab Groups	1.0.2	true	tabgroups@quicksaver
uMatrix	0.9.3.3	true	uMatrix@raymondhill.net

> I expect so but I guess it depends on what the problem is.

testcase derived from bug 816784 comment 3 which was meant to demonstrate nuke-CCW time, but with minor modification it can also show trace/fixup time:

- create a clean profile
- install greasemonkey 3.3, µmatrix 0.9.1.2, gecko profiler
- enable chrome debugging
- try to analyze profiler recordings. this will fail due to µMatrix rules blocking some cross-origin javascript loads from cleopatra.io. Allow the blocked requests and save the rules so that it will when we actually want to profile
- restart
- open browser console
- open about:memory
- run this from the console: for(let i=0;i<800;i++) window.openNewTabWith("about:blank");
- wait for CPU usage to settle down
- hit minimize memory usage
- wait for CPU usage to settle down
- analyze profiler records
- bonus round: switch to one of the about:blank tabs, rightclick -> close other tabs, observe browser hang

profiler results that I get: https://cleopatra.io/#report=25bde8c6395b65f30f58af8a9cc2a8984ce01494&jankOnly=true&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A134438,%22end%22%3A157408%7D%5D&selection=0,30,31,4081,4082,4083,4084,42,43,4085,4086,40,41,42,43,4085,44,45,3547,19,20,3548,1183,1184,896,206,207,208,209,3551,3552,3564

With some variations / trial and error I can also get similar results in the parent process even without hitting minimize memory usage. So I assume this is what eventually happens at some point during startup. It's just difficult to figure out reliable steps to reproduce it naturally, so I hope this procedure will be good enough for testing.
(Reporter)

Comment 29

a year ago
Status update:

Very long pauses now occur less frequently. GCs during startup are still slowing things down. 

Things what changed:

* some contributions/bugs filed for several addons that
 * moved one addon's process script to a separate zone
 * semi-leaks and inefficient allocation behavior that touched lots of zones. Now there are fewer zones overall and fewer zones touched during most GCs.
 * reduction of per-tab information stored in sessionstore. sessionstore, especially in combination with the sessionmanager addon, does lots of data serialization, which creates large bursts of garbage
* new pref javascript.options.mem.gc_refresh_frame_slices_enabled = false from bug 1260475 has improved GC throughput. which may have had some knock-on effects?
* identified leak in SDK (bug 1268898). I currently use disable-enable cycling of SDK addons after startup or after long session uptime as a workaround


Since that only applies to the addons i'm using (other ones might show similar behavior) and cross-compartment-references seem to incur such a high cost, maybe it would be useful for addon development if about:memory provided additional information on those? E.g. how how many compartments are referenced and how many CCWs are held overall or something like that.

Additional question: Do weak refs make CCWs cheaper, i.e. could costs be reduced by sprinkling weak refs instead of unload-listener-and-cleanup code in the right places?
(Reporter)

Comment 30

10 months ago
I'm not sure what changed, but things are in a tolerable state now. There still is some fairly long pause some minutes after startup, but then it doesn't happen again for hours.

Good enough for me.
Status: REOPENED → RESOLVED
Last Resolved: 2 years ago10 months ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.