Open Bug 1329900 Opened 7 years ago Updated 2 years ago

GC takes a long time on the test case in bug 1269695

Categories: Core :: JavaScript: GC, defect, P2

Performance Impact: low

People: Reporter: ehsan.akhgari; Unassigned; NeedInfo

References: Blocks 1 open bug

Keywords: perf

Attachments: 4 files

Attached image GC breakdown
While profiling bug 1269695, I'm noticing several seconds of GC.  For example, see the screenshot which shows the Instruments breakdown of a given run.

I also captured a GC log by setting MOZ_GCTIMER, which I will attach shortly.
Attached file GC log
Attachment #8825300 - Attachment mime type: text/x-log → text/plain
That log has 1975ms of GC in it.

I ran it on my machine. With the default settings, I had runs with 2.0sec and 1.9sec of GC. Disabling incremental GC dropped it to 1.7sec. Changing a bunch of GC settings to ridiculously high values dropped the total GC time to 0.6sec. Those settings were:

javascript.options.mem.gc_allocation_threshold_mb	1000
javascript.options.mem.gc_high_frequency_heap_growth_max	2000
javascript.options.mem.gc_high_frequency_high_limit_mb	1000
javascript.options.mem.gc_high_frequency_low_limit_mb	1500
javascript.options.mem.gc_low_frequency_heap_growth	600
javascript.options.mem.high_water_mark	512

The total memory usage of my browser after this run was about 660MB (this is non-e10s).

There are 24 GCs in Ehsan's log, only 3 of them are zone GCs. (There was only a single tab open, so I'm not sure how much it matters.) It looks like the last 4 GCs are from shutdown.
Depends on: 1329918
Jon, can you take a look at what we might be able to do here? It's a big chunk of a very, very slow Google Docs document load, one where it appears that Chrome takes much less time (I saw around 0.7sec in their GC function for this). There are instructions for reproducing in bug 1269695, but you just have to load https://docs.google.com/document/d/1EPSmGqm2r4Qq42B4t1VOYacjTlL0JVuC8JSlUvoIhss/edit and press ctrl-end to go to the end of the document.
It would be good to understand why there are so few zone GCs. Though during page load I do expect that we run a bunch of frame script code too.
If this is the test case I'm thinking of, one of the problems is that we do a full GC 5 seconds after we start a page load (via a PokeGC call, maybe in SetNewDocument). The page load takes more than 5 seconds, so we're GCing while we're still loading, which is not good.

As for why it is a full GC, bug 1110928 is about making this into a zone GC, but I haven't been able to land it, because it causes OOMs in testing.

It might be interesting to comment out a few of those PokeGC calls that might be triggered during loading and see how that affects things.
To explain it a little more thoroughly... in SetNewDocument (and a few other places) we call PokeGC. What this does is start a 5-second timer. When that timer fires, we run a full GC. I expect the full-GC part of it doesn't matter much for this test case, because people are probably running the test case without any other tabs open (with bug 1110928, we run the zone GC for the chrome zone plus the document zone).
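
To make that scheduling pattern concrete, here is a minimal standalone sketch of it, with hypothetical names (PokeGC/RunFullGC) and a plain thread standing in for the timer; it is not the actual Gecko code, which I believe uses an nsITimer in nsJSEnvironment.cpp:

#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

static std::atomic<bool> gGCTimerArmed{false};

// Placeholder for the collection itself; in Gecko this is currently a full
// (all-zones) GC, which bug 1110928 would narrow to a zone GC.
static void RunFullGC() {
  std::puts("full GC");
}

// Called from places like SetNewDocument: arm a one-shot ~5 second timer.
static void PokeGC() {
  bool expected = false;
  if (!gGCTimerArmed.compare_exchange_strong(expected, true)) {
    return;  // a timer is already pending; don't arm another
  }
  std::thread([] {
    std::this_thread::sleep_for(std::chrono::seconds(5));
    gGCTimerArmed = false;
    RunFullGC();  // fires even if the page is still loading
  }).detach();
}

int main() {
  PokeGC();  // e.g. at the start of a page load
  std::this_thread::sleep_for(std::chrono::seconds(6));
  return 0;
}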
I'm seeing a couple of things in local testing:
 - lots of slow (>10ms) minor GCs
 - non-incremental major GCs due to allocation pressure

For the former, we seem to be consistently tenuring a lot of data even though we have a 16MiB nursery.  It could be that we need to tweak our pre-tenuring heuristics.  I'll attach a patch that might help.  Steve, are you able to test this where you are?

For the latter, adjusting our allocation thresholds might help.  But it does look like this is just allocating a ton of data.
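
To illustrate the allocation-pressure point, here is a rough standalone sketch of that kind of trigger, with made-up names and numbers rather than SpiderMonkey's actual heuristics: once a zone's allocation passes its threshold we start an incremental GC, and once it is far enough past the threshold the collector gives up on slicing and collects non-incrementally.

#include <cstdint>
#include <cstdio>

enum class GCMode { None, Incremental, NonIncremental };

struct ZoneCounters {
  uint64_t bytesAllocated;
  uint64_t thresholdBytes;  // made-up; loosely analogous to the allocation-threshold pref from comment 2
};

GCMode MaybeTriggerGC(const ZoneCounters& zone) {
  if (zone.bytesAllocated < zone.thresholdBytes) {
    return GCMode::None;
  }
  // Allocation has outrun the threshold by a wide margin: an incremental GC
  // cannot keep up, so fall back to a non-incremental collection.
  if (zone.bytesAllocated > zone.thresholdBytes + zone.thresholdBytes / 8) {
    return GCMode::NonIncremental;
  }
  return GCMode::Incremental;
}

int main() {
  ZoneCounters zone{40u * 1024 * 1024, 30u * 1024 * 1024};
  std::printf("mode = %d\n", static_cast<int>(MaybeTriggerGC(zone)));
  return 0;
}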
Attached patch pretenure-more (Splinter Review)
Patch to pre-tenure more when collecting the nursery.
Oh, I have a feeling that the way we tenure nursery objects is not terribly efficient either.  There are a ton of conditions on the type of the object (e.g. see JSObject::allocKindForTenure and TenuringTracer::moveObjectToTenured).  It might be worth investigating whether we can improve this.
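
As a toy illustration of that per-object branching (entirely made-up kinds and alloc-kind names, not the real dispatch in those functions):

#include <cstdio>

// Hypothetical object kinds standing in for the many cases the real
// tenuring path distinguishes (JSObject::allocKindForTenure and
// TenuringTracer::moveObjectToTenured in SpiderMonkey).
enum class ObjKind { Plain, Array, TypedArray, Function, Proxy };

// Every promoted nursery object runs through a chain of checks like this
// just to pick its tenured allocation kind, and then more to copy its
// slots/elements; with millions of promotions that adds up.
const char* AllocKindForTenure(ObjKind kind, bool hasInlineData) {
  switch (kind) {
    case ObjKind::Array:      return "OBJECT2";
    case ObjKind::TypedArray: return hasInlineData ? "OBJECT8" : "OBJECT0";
    case ObjKind::Function:   return "FUNCTION";
    case ObjKind::Proxy:      return "OBJECT4";
    default:                  return "OBJECT0";
  }
}

int main() {
  std::puts(AllocKindForTenure(ObjKind::Array, false));
  return 0;
}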
FWIW, I think the discussion about zone GCs could happen in bug 1110928.
(In reply to Andrew McCreight [:mccr8] from comment #6)
> To explain it a little more thoroughly... in SetNewDocument (and a few other
> places) we call PokeGC. What this does is it starts a 5 second timer. When
> that timer is finished, we run a full GC.

Can we do this on idle or something?  Many of the GSuite test cases I have profiled have load times well over 5s, and GCing in the middle of the load surely won't help those load times.
GCing only on idle can be problematic, leading to OOMs.
But sure, we might be able to tweak the scheduling some more
ehsan, do you still have a link to the original cleopatra profile?

After bug 1110928 and following comment 3 from this bug, I'm not seeing lots of major GCs in the profile, but minor GCs are there. I'll test the patch in this bug.
Hmm, the Gecko profiler is still broken on local Linux builds :/
Attached patch merged (Splinter Review)
jonco's patch which applies to current m-i.

If the profiler's summary is something to rely on (in my case, where symbols aren't available), this may drop GC time in the child process a tiny bit.
(In reply to Olli Pettay [:smaug] from comment #13)
> ehsan, do you still have a link to the original cleopatra profile?

No.  FWIW I wasn't using Cleopatra for this bug; I was using Instruments (a native profiler on OSX).  See the screenshot in comment 0 for example.  You should be able to replicate this with the Zoom profiler (or any other native profiler).
Whiteboard: [qf:p1] → [qf:p3]
Keywords: perf
(ni?sfink since jonco is on PTO)

Steve, this was [qf:p1] for a while and ... well, it looks like it might be low-hanging fruit, guessing from the size of the patch and the fact that there was a reproducible test case. Hence P2. Please reprioritize as appropriate!
Flags: needinfo?(sphink)
Priority: -- → P2
Note to self: consider this bug when looking at making string unpretenuring based on retention rates.

Note to others: the patch is very aggressive, making it *far* more eager to start pretenuring things and skip the nursery entirely: we originally considered pretenuring when 80% of all objects are surviving a minor GC. We've since changed that to 60%. This patch drops it to 20%. Additionally, we only pretenure an individual object class if we tenure at least 3000 of them; this drops it to 1000.
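
As a rough sketch of the decision those thresholds feed into (hypothetical names and structure; the real logic runs as part of the nursery's minor-GC bookkeeping), with the current values and the patch's values noted:

#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

constexpr double kPretenureSurvivalFraction = 0.60;  // patch drops this to 0.20
constexpr uint64_t kPretenureGroupMinCount = 3000;   // patch drops this to 1000

struct MinorGCStats {
  uint64_t nurseryObjects = 0;  // objects allocated in the nursery this cycle
  uint64_t tenuredObjects = 0;  // objects promoted by this minor GC
  std::unordered_map<std::string, uint64_t> tenuredByGroup;
};

// Returns the object groups that should be pretenured, i.e. allocated
// directly in the tenured heap from now on, skipping the nursery.
std::vector<std::string> ChoosePretenuredGroups(const MinorGCStats& s) {
  std::vector<std::string> result;
  if (s.nurseryObjects == 0) {
    return result;
  }
  double survival = double(s.tenuredObjects) / double(s.nurseryObjects);
  if (survival < kPretenureSurvivalFraction) {
    return result;  // most objects died young; the nursery is doing its job
  }
  for (const auto& [group, count] : s.tenuredByGroup) {
    if (count >= kPretenureGroupMinCount) {
      result.push_back(group);
    }
  }
  return result;
}

int main() {
  MinorGCStats s{10000, 7000, {{"NodeData", 4000}, {"TempWrapper", 500}}};
  for (const auto& g : ChoosePretenuredGroups(s)) {
    std::puts(g.c_str());
  }
  return 0;
}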

We could try out these new thresholds, but I'd probably want to do it as an A/B test. Pretenuring too much defeats the nursery and causes more and longer major GCs.

But as a smoke test, I'm pushing a talos run to try: https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=015d8f5eca32d2540923797c821a5be1b98f099d
For the record, that shows that on talos the aggressive pretenuring makes almost no difference, but it speeds up one test.

...I'm not sure what to make of that.
Performance Impact: --- → P3
Whiteboard: [qf:p3]
Severity: normal → S3