GC takes a long time on the test case in bug 1269695

Status: NEW (Unassigned)
Component: Core :: JavaScript: GC
Reporter: Ehsan ("Away for a while")
Reported: 11 months ago; last modified: 2 months ago
Tracking: Blocks 2 bugs; Keywords: perf
Firefox Tracking Flags: not tracked
Whiteboard: [qf:p3]
Attachments: 4

Description (Reporter), 11 months ago
Created attachment 8825299 [details]
GC breakdown

While profiling bug 1269695, I'm noticing several seconds of GC. For example, see the attached screenshot, which shows the Instruments breakdown of a given run.

I also captured a GC log by setting MOZ_GCTIMER, which I will attach shortly.
Comment 1 (Reporter), 11 months ago
Created attachment 8825300 [details]
GC log
Updated (Reporter), 11 months ago
Attachment #8825300 - Attachment mime type: text/x-log → text/plain
That log has 1975ms of GC in it.

I ran the test case on my machine. With the default settings, I had runs with 2.0s and 1.9s of GC. Disabling incremental GC dropped it to 1.7s. Setting a bunch of GC prefs to ridiculously high values dropped the total GC time to 0.6s. Those settings were:

javascript.options.mem.gc_allocation_threshold_mb	1000
javascript.options.mem.gc_high_frequency_heap_growth_max	2000
javascript.options.mem.gc_high_frequency_high_limit_mb	1000
javascript.options.mem.gc_high_frequency_low_limit_mb	1500
javascript.options.mem.gc_low_frequency_heap_growth	600
javascript.options.mem.high_water_mark	512

The total memory usage of my browser after this run was about 660MB (this is non-e10s).

There are 24 GCs in Ehsan's log, and only 3 of them are zone GCs. (There was only a single tab open, so I'm not sure how much that matters.) It looks like the last 4 GCs are from shutdown.

Updated, 11 months ago
Depends on: 1329918
Jon, can you take a look at what we might be able to do here? It's a big chunk of a very, very slow Google Docs document load, one where it appears that Chrome takes much less time (I saw around 0.7s in their GC function for this). There are instructions for reproducing in bug 1269695, but essentially you just have to load https://docs.google.com/document/d/1EPSmGqm2r4Qq42B4t1VOYacjTlL0JVuC8JSlUvoIhss/edit and press Ctrl+End to go to the end of the document.
It would be good to understand why there are so few zone GCs. Though during page load I do expect that we run a bunch of frame script code too.
If this is the test case I'm thinking of, one of the problems is that we do a full GC 5 seconds after we start a page load (via a PokeGC call, maybe in SetNewDocument). The page load takes more than 5 seconds, so we're GCing while we're still loading, which is not good.

As for why it is a full GC, bug 1110928 is about making this into a zone GC, but I haven't been able to land it, because it causes OOMs in testing.

It might be interesting to comment out a few of those PokeGC calls that might be triggered during loading and see how that affects things.
To explain it a little more thoroughly... in SetNewDocument (and a few other places) we call PokeGC. What this does is it starts a 5 second timer. When that timer is finished, we run a full GC. I expect the full GC part of it doesn't matter much for this test case, because people are probably running the test case without any other tabs open (in bug 1110928, we run the zone GC for the chrome zone plus the document zone).
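As a rough, self-contained sketch of that pattern (none of this is the actual Gecko/nsJSContext code; apart from PokeGC and SetNewDocument, the names and structure are invented for illustration):

  // Hypothetical sketch: PokeGC arms a one-shot ~5 second deadline, and when
  // it expires we run a full GC even if the page is still loading.
  #include <chrono>
  #include <cstdio>

  namespace sketch {

  using Clock = std::chrono::steady_clock;

  bool gcPending = false;
  Clock::time_point gcDeadline;

  // Stand-in for the real collector entry point (a full, all-zones GC; bug
  // 1110928 is about turning this into a zone GC instead).
  void RunFullGC() { std::puts("full GC"); }

  // Called from places like SetNewDocument in the real code.
  void PokeGC() {
    if (gcPending) {
      return;  // a GC is already scheduled
    }
    gcPending = true;
    gcDeadline = Clock::now() + std::chrono::seconds(5);
  }

  // Called from the event loop; once the deadline passes we collect, which
  // for a page load longer than 5 seconds means GCing mid-load.
  void MaybeRunScheduledGC() {
    if (gcPending && Clock::now() >= gcDeadline) {
      gcPending = false;
      RunFullGC();
    }
  }

  }  // namespace sketch

The key point is that the deadline is armed at document load time, so any load that takes longer than the timer interval gets a collection in the middle of it.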

Comment 7, 11 months ago
I'm seeing a couple of things in local testing:
 - lots of slow (>10ms) minor GCs
 - non-incremental major GCs due to allocation pressure

For the former, we seem to be consistently tenuring a lot of data even though we have a 16MiB nursery.  It could be that we need to tweak our pre-tenuring heuristics.  I'll attach a patch that might help.  Steve, are you able to test this where you are?

For the latter, adjusting our allocation thresholds might help. But it does look like this test case is just allocating a ton of data.
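For the pre-tenuring side, here is a toy sketch of a site-based pre-tenuring heuristic, assuming per-allocation-site survival counting; this is neither the attached patch nor SpiderMonkey's actual heuristic, just an illustration of the kind of tweak being discussed:

  // Toy pre-tenuring heuristic: track, per allocation site, what fraction of
  // nursery allocations survives a minor GC, and switch the site to tenured
  // allocation once that fraction is high.
  #include <cstdint>
  #include <unordered_map>

  struct SiteStats {
    uint64_t allocated = 0;   // nursery allocations from this site
    uint64_t survived = 0;    // of those, how many were tenured by minor GC
    bool pretenure = false;   // allocate directly in the tenured heap?
  };

  class PretenuringHeuristic {
   public:
    // Called when a site allocates in the nursery.
    void NoteAllocation(uintptr_t site) { stats_[site].allocated++; }

    // Called during a minor GC when an object from `site` gets tenured.
    void NoteSurvivor(uintptr_t site) { stats_[site].survived++; }

    // Called at the end of a minor GC: sites whose objects mostly survive
    // are marked for pre-tenuring, so future allocations skip the nursery
    // and we stop paying to copy them out of it.
    void UpdateDecisions() {
      for (auto& [site, s] : stats_) {
        if (s.allocated >= kMinSamples &&
            s.survived * 100 >= s.allocated * kSurvivalPercent) {
          s.pretenure = true;
        }
      }
    }

    bool ShouldPretenure(uintptr_t site) const {
      auto it = stats_.find(site);
      return it != stats_.end() && it->second.pretenure;
    }

   private:
    static constexpr uint64_t kMinSamples = 1000;       // arbitrary
    static constexpr uint64_t kSurvivalPercent = 80;    // arbitrary
    std::unordered_map<uintptr_t, SiteStats> stats_;
  };

The trade-off is that an over-eager threshold moves short-lived objects straight into the tenured heap and simply shifts the cost onto major GCs.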

Comment 8, 11 months ago
Created attachment 8825473 [details] [diff] [review]
pretenure-more

Patch to pre-tenure more when collecting the nursery.

Comment 9, 11 months ago
Oh, I have a feeling that the way we tenure nursery objects is not terribly efficient either.  There are a ton of conditions on the type of the object (e.g. see JSObject::allocKindForTenure and TenuringTracer::moveObjectToTenured).  It might be worth investigating whether we can improve this.
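For a sense of what that looks like, here is a deliberately simplified, hypothetical stand-in for the per-object dispatch a tenuring path performs; it is not the real JSObject::allocKindForTenure or TenuringTracer::moveObjectToTenured, just an illustration of why a cascade of type checks on every surviving object adds up:

  // Every object copied out of the nursery goes through a chain of kind/type
  // checks to decide how to move it; with millions of survivors this
  // per-object branching becomes a measurable cost.
  #include <cstddef>

  enum class Kind { Plain, Array, TypedArray, Function, Proxy };

  struct NurseryObject {
    Kind kind;
    size_t slots;
    bool hasElements;
  };

  size_t TenuredSizeFor(const NurseryObject& obj) {
    switch (obj.kind) {              // stand-in for the real condition cascade
      case Kind::Array:
        return obj.hasElements ? 64 + obj.slots * 8 : 64;
      case Kind::TypedArray:
        return 96;
      case Kind::Function:
        return 48 + obj.slots * 8;
      case Kind::Proxy:
        return 80;
      case Kind::Plain:
      default:
        return 32 + obj.slots * 8;
    }
  }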
FWIW, I think the discussion about zone GCs could happen in bug 1110928.
Comment 11 (Reporter), 11 months ago
(In reply to Andrew McCreight [:mccr8] from comment #6)
> To explain it a little more thoroughly... in SetNewDocument (and a few other
> places) we call PokeGC. What this does is it starts a 5 second timer. When
> that timer is finished, we run a full GC.

Can we do this on idle or something?  Many of the GSuite test cases I have profiled have load times well over 5s, and GCing in the middle of the load surely won't help those load times.
GCing only on idle can be problematic, leading to OOMs.
But sure, we might be able to tweak the scheduling some more.
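One possible shape for such a tweak, sketched here with invented names and thresholds (not an actual Gecko scheduler), is to prefer idle-time GC but keep a hard allocation trigger so a page that never goes idle still gets collected before it OOMs:

  // Hypothetical scheduler: soft trigger requests a GC at the next idle
  // period, hard trigger forces one regardless of idleness.
  #include <cstddef>

  class GCScheduler {
   public:
    void NoteBytesAllocated(size_t bytes) {
      bytesSinceGC_ += bytes;
      if (bytesSinceGC_ >= kHardTriggerBytes) {
        RunGC();                       // can't wait for idle any longer
      } else if (bytesSinceGC_ >= kSoftTriggerBytes) {
        wantIdleGC_ = true;            // ask for a GC at the next idle period
      }
    }

    // Called by the event loop when it detects an idle period.
    void OnIdle() {
      if (wantIdleGC_) {
        RunGC();
      }
    }

   private:
    void RunGC() {
      // ...collect...
      bytesSinceGC_ = 0;
      wantIdleGC_ = false;
    }

    static constexpr size_t kSoftTriggerBytes = 32 * 1024 * 1024;   // invented
    static constexpr size_t kHardTriggerBytes = 256 * 1024 * 1024;  // invented
    size_t bytesSinceGC_ = 0;
    bool wantIdleGC_ = false;
  };

The soft threshold keeps GC out of the critical path of a busy load like the one in this bug, while the hard threshold bounds how long a collection can be deferred.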
Depends on: 1110928
ehsan, do you still have a link to the original Cleopatra profile?

After bug 1110928, and following comment 3 from this bug, I'm not seeing lots of major GCs in the profile, but the minor GCs are still there. I'll test the patch in this bug.
Hmm, the Gecko profiler is still broken on local Linux builds :/
Created attachment 8833753 [details] [diff] [review]
merged

jonco's patch, merged so that it applies to current m-i.

If the profiler's summary is something to rely on (in my case symbols aren't available), this may drop GC time in the child process a tiny bit.
Comment 16 (Reporter), 10 months ago
(In reply to Olli Pettay [:smaug] from comment #13)
> ehsan, do you still have a link to the original Cleopatra profile?

No. FWIW I wasn't using Cleopatra for this bug; I was using Instruments (a native profiler on OS X). See the screenshot in comment 0, for example. You should be able to replicate this with the Zoom profiler (or any other native profiler).
Whiteboard: [qf:p1]
Whiteboard: [qf:p1] → [qf:p3]
Blocks: 1299643
Keywords: perf