Closed Bug 1341474 Opened 4 years ago Closed 4 years ago

FinishAnyIncrementalGC can be really expensive

Component: Core :: XPCOM
Type: defect
Priority: Not set
Severity: normal
Status: RESOLVED INCOMPLETE
Reporter: ehsan
Assignee: Unassigned
Attachments: 1 file (448.12 KB, application/x-gzip)

Olli, Bill, I think we once discussed this but I don't remember what the conclusion was.  But I keep seeing this coming up in profiles, so filing a bug about it in the hope that we can do something better.

This profile shows this function taking 1.2 seconds in the parent process's main thread: <https://perfht.ml/2l5CkJT>
Flags: needinfo?(bugs)
This means GC is too slow - which also means we have too much JS.
(IMO we should try to convert more code to C++, or to Rust, if the memory management problem can be solved there.)

One issue I've seen in parent process is that since almost all the JS uses system zone, we end up collecting the whole world almost all the time.
I wonder if we could split the system zone into several zones: one for rarely used .jsms, and then perhaps a separate zone for each top-level window.
Flags: needinfo?(bugs)
FWIW, per telemetry (which is showing data only up to Feb 5 atm), this happens very rarely.
0.01% of the CCs end up finishing iGC.
Basically, the CC wants to run every X seconds. If an IGC takes a really long time, then the CC synchronously finishes off the GC so it can run, which can take a long time.
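
The interaction described above can be sketched as a toy model. This is only an illustration in Python, not Gecko code: `IncrementalGC`, `pause_when_cc_fires`, and every number are hypothetical, and the real scheduling logic is more involved. It only shows why the synchronous finish gets expensive when an incremental GC has made little progress by the time the CC wants to run.

```python
from dataclasses import dataclass


@dataclass
class IncrementalGC:
    """Toy incremental GC: just a counter of work left, in milliseconds."""
    remaining_ms: float

    def run_slice(self, budget_ms: float) -> float:
        """Do at most budget_ms of work; return the work actually done."""
        done = min(budget_ms, self.remaining_ms)
        self.remaining_ms -= done
        return done

    def finish(self) -> float:
        """Run all remaining work synchronously; return the pause in ms."""
        pause = self.remaining_ms
        self.remaining_ms = 0.0
        return pause


def pause_when_cc_fires(total_gc_ms: float, slice_ms: float,
                        slices_before_cc: int) -> float:
    """Pause caused by the CC finishing off an in-progress incremental GC.

    slices_before_cc is the (assumed) number of GC slices that got to run
    before the CC timer fired.
    """
    gc = IncrementalGC(total_gc_ms)
    for _ in range(slices_before_cc):
        gc.run_slice(slice_ms)
    return gc.finish()
```

In this model, 1000 ms of total GC work with 10 ms slices and 20 slices completed before the CC fires leaves an 800 ms synchronous pause, while a GC that already finished within its slices costs nothing extra.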

We could increase X a little, which might help, though it can also cause memory to increase. Also note that sometimes the GC takes a long time due to problems in the GC. For instance, Terrence had an issue in his own browsing session where some XPConnect gunk had to get traced in every GC slice, and there was a lot of it, so almost all of the GC slice was spent on that overhead, and practically no forward progress was being made. Obviously, if we can find and fix issues like that it would be better than poking around the margin with heuristic tweaks.

One heuristic that I think would be handy for things like this is to increase the slice time the longer a GC/CC has been running. It would be better to have, say, four 250ms slices than a single one-second slice.
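
A minimal sketch of that slice-growth heuristic, assuming a linear ramp. Only the 250 ms figure comes from the comment above; the function name `slice_budget_ms`, the 10 ms base budget, and the 2 s / 8 s ramp parameters are made-up values for illustration, not anything the GC or CC actually uses.

```python
def slice_budget_ms(elapsed_ms: float,
                    base_ms: float = 10.0,
                    max_ms: float = 250.0,
                    ramp_start_ms: float = 2000.0,
                    ramp_ms: float = 8000.0) -> float:
    """Return the time budget for the next slice.

    The budget stays at base_ms until the collection has been running for
    ramp_start_ms, then grows linearly and is capped at max_ms once the
    collection has run for ramp_start_ms + ramp_ms. All defaults are
    hypothetical.
    """
    if elapsed_ms <= ramp_start_ms:
        return base_ms
    progress = min(1.0, (elapsed_ms - ramp_start_ms) / ramp_ms)
    return base_ms + progress * (max_ms - base_ms)
```

With these assumed defaults, a collection that has been running for 6 seconds gets 130 ms slices, and anything past 10 seconds gets the 250 ms cap. The idea is that a long-running collection finishes in a few medium pauses instead of one forced long pause.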
(In reply to Olli Pettay [:smaug] (pto-ish for couple of days) from comment #1)
> One issue I've seen in parent process is that since almost all the JS uses
> system zone, we end up collecting the whole world almost all the time.

FWIW anecdotally, I do see more GC related issues in the parent process than content processes, and I always wondered why that is...
(In reply to :Ehsan Akhgari from comment #4)
> (In reply to Olli Pettay [:smaug] (pto-ish for couple of days) from comment
> #1)
> > One issue I've seen in parent process is that since almost all the JS uses
> > system zone, we end up collecting the whole world almost all the time.
> 
> FWIW anecdotally, I do see more GC related issues in the parent process than
> content processes, and I always wondered why that is...

This is still really surprising. Typically there's not *that* much chrome JS. It would be good to dig into why your GC times are so horrible. Can you post about:memory for the chrome process? The next step would be to collect GC statistics.
I see
434,651,456 B (39.44%) -- js-non-window 
in parent process.

Bug 1287330 should help quite a bit in cases where user has many un-restored tabs.

But even after that we have tons of system zone compartments. I see 455 such compartments.
Attached file about:memory
(In reply to Bill McCloskey (:billm) from comment #5)
> (In reply to :Ehsan Akhgari from comment #4)
> > (In reply to Olli Pettay [:smaug] (pto-ish for couple of days) from comment
> > #1)
> > > One issue I've seen in parent process is that since almost all the JS uses
> > > system zone, we end up collecting the whole world almost all the time.
> > 
> > FWIW anecdotally, I do see more GC related issues in the parent process than
> > content processes, and I always wondered why that is...
> 
> This is still really surprising. Typically there's not *that* much chrome
> JS. It would be good to dig into why your GC times are so horrible. Can you
> post about:memory for the chrome process? The next step would be to collect
> GC statistics.

Not sure why.  Here's an about:memory.  What type of GC statistics did you have in mind?
(In reply to Olli Pettay [:smaug] (pto-ish for couple of days) from comment #6)
> I see
> 434,651,456 B (39.44%) -- js-non-window 
> in parent process.
> 
> Bug 1287330 should help quite a bit in cases where user has many un-restored
> tabs.

Which I do.

Another thing that worries me is the high amount of heap-unclassified.
Depends on: 1287330
(In reply to Olli Pettay [:smaug] (pto-ish for couple of days) from comment #6)
> I see
> 434,651,456 B (39.44%) -- js-non-window 
> in parent process.
> 
> Bug 1287330 should help quite a bit in cases where user has many un-restored
> tabs.

More work needs to be done, see bug 906076 comment 215.
Depends on: lazytabs
No longer depends on: 1287330
Ugh, okay. I guess the unrestored tabs will do it. No need for a GC log. You have about 700MB of JS in the parent process, and marking takes roughly 1ms/MB. So if you include sweep time, that's not unreasonable.
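
The back-of-the-envelope estimate above works out as follows. This is a rough sketch only: the helper `estimated_mark_ms` is hypothetical, the 1 ms/MB rate is the approximate figure quoted in this comment, and comparing against the 1.2 s from the original report assumes that number reflects most of a collection.

```python
def estimated_mark_ms(heap_mb: float, ms_per_mb: float = 1.0) -> float:
    """Rough marking time for a JS heap, using the ~1 ms/MB rule of thumb."""
    return heap_mb * ms_per_mb


mark_ms = estimated_mark_ms(700)       # 700.0 ms for ~700 MB of JS
observed_ms = 1200                     # the 1.2 s seen in the reported profile
remainder_ms = observed_ms - mark_ms   # 500.0 ms left over for sweeping etc.
```

So marking alone would account for more than half of the observed pause, which is why the reported time is in the expected range for a heap of this size.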

I think the unrestored tabs should each be in their own zone. But the TabChildGlobal will live in the system zone. And you might have hit a full (non-zone) GC anyway.
Is this bug still valid / useful / actionable?
Flags: needinfo?(ehsan)
I haven't seen this in a while, and I don't think the bug in its current form is actionable any more.  I'd be happy to reopen if I saw it again.
Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(ehsan)
Resolution: --- → INCOMPLETE