To reduce long pauses, I wonder if we could first traverse and collect objects to unlink, and then post a timer to do the actual unlink in a few seconds. Unlinking and deleting for example dom nodes is not very fast.
Need to evaluate this.
Does it need to be a timer? Posting to the event loop might be enough to respond to a user interaction.
posting to event loop happens probably too soon.
er, I mean handling of the runnable in the event loop happens probably too soon.
Created attachment 572831 [details] [diff] [review] WIP This is anyway using runnables. The patch is hackish, but something I wanted to try out. It splits xpcom object unlink to batches of 512 releases.
This sounds like it is worth investigating. When looking at CC times on my own, unlinking is the only thing that ever takes much time aside from graph building. Peter mentioned some kind of caching table of weak pointers that can decide that a weak pointer it holds should become strong. At least, that was my understanding. It seems like that would doom any effort to allow the browser to run in between computing dead objects and freeing them. Any time an object is revived this way, the entire set of dead objects has to be invalidated because any of them could have become live again. Furthermore, the set of dead objects can't just be thrown away because it is possible that the garbage includes a real dead cycle that we'll leak. That can be solved either by pausing the browser again and running a special mini-CC just on the garbage we found or by pushing all of the garbage back into the purple buffer.
So, in some cases, like when closing a large, non-js-heavy web page, unrooting nsISupports objects can take almost 30% of the CC time on an opt linux build. Unfortunately tryserver shows that the patch has major problems. I'll investigate what and why.
(In reply to Olli Pettay [:smaug] from comment #8) > So, in some cases, like when closing a large, non-js-heavy web page, > unrooting nsISupports objects can > take almost 30% of the CC time on an opt linux build. Apparently not always only non-js-heavy. When I closed a tab which had Facebook loaded, releasing/unrooting nsISupports objects took 16ms.
Created attachment 573177 [details] [diff] [review] WIP2 Adds a null checks. https://tbpl.mozilla.org/?tree=Try&rev=31d55cb3ec1c
Interesting. So I guess CollectWhite time is mostly taken up by Unroot?
Created attachment 573193 [details] [diff] [review] WIP3 The previous one doesn't pass on debug builds. Changing the assertion check a bit, hopefully the crash is about that. I didn't yet look at what could have caused that ::PreCreate problem. Something to do with XBL, I think.
(In reply to Andrew McCreight [:mccr8] from comment #11) > Interesting. So I guess CollectWhite time is mostly taken up by Unroot? I think it depends on the case. Releasing and possibly deleting for example DOM nodes is quite expensive, but unrooting some other objects can be a lot faster.
The latest patch looks ok on try https://hg.mozilla.org/try/rev/6dbdb8fb69db, well, at least so far on linux. But I don't yet quite understand the ::PreCreate assertion problem. How the old code can guarantee that there is always outerwindow...
Some random timing info (without the patch) (Closed the following tabs: FB (not logged in), GMail, CNN) root time 1ms unlink time 15ms unroot time 4ms (Closed the following tabs: D3E, DOM4) root time 5ms unlink time 30ms unroot time 18ms (Closed the following tab: HTML spec) root time 13ms unlink time 100ms unroot time 144ms
Looks like there are some leaks, but nothing terribly bad. I'll investigate some more.
Created attachment 578666 [details] [diff] [review] without undef
Andrew, can you snappy-prioritize this?
Marking P2. Right now, the CollectWhite phase mostly takes no time, except when you are freeing a bunch of stuff and it can take 20% or more of the CC time, and this optimization will let us split out the frees from that part of it. I'm not sure how much of the time it is, but probably a fair amount.
Created attachment 596397 [details] [diff] [review] up-to-date Uploaded to try to see how this behaves now. https://tbpl.mozilla.org/?tree=Try&rev=e3c563c3bee0
Linux M1 leaked 1.5 megs, which sounds like it could be due to this patch.
Looks like there is only 1 document leaked. Shouldn't be too hard to find the leak.
CC has changed significantly since filing this. WONTFIX.