Investigate if cycle collector could unroot asynchronously

Status: RESOLVED WONTFIX
Product: Core :: General
Opened: 7 years ago
Closed: 4 years ago
Reporter: smaug
Assignee: Unassigned
Blocks: 1 bug
Version: Trunk
Firefox tracking flags: (Not tracked)
Whiteboard: [Snappy:P2]
Attachments: 2 attachments, 3 obsolete attachments

To reduce long pauses, I wonder if we could first traverse the graph and collect the
objects to unlink, and then post a timer to do the actual unlinking a few seconds later.
Unlinking and deleting DOM nodes, for example, is not very fast.
We need to evaluate this.
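The idea above, computing the garbage synchronously but deferring the expensive unlink to a later event-loop turn, can be sketched roughly like this. This is plain C++, not Gecko code; `Node`, `CollectGarbage`, and the toy event loop are all hypothetical stand-ins for the real cycle collector machinery:

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical stand-in for a cycle-collectable object.
struct Node {
    bool unlinked = false;
};

// Toy event loop: runnables posted here run on a later turn, standing
// in for posting a timer/runnable back to the real event loop.
static std::queue<std::function<void()>> gEventLoop;

// Phase 1 (synchronous): traverse and decide which objects are
// garbage, but do NOT unlink them yet -- just remember them.
std::vector<Node*> CollectGarbage(std::vector<Node*>& suspects) {
    // A real CC would walk the graph; here every suspect is assumed
    // to be garbage, purely for illustration.
    return suspects;
}

// Phase 2 (deferred): post a runnable that performs the expensive
// unlink later, so the pause from phase 1 stays short.
void ScheduleDeferredUnlink(std::vector<Node*> garbage) {
    gEventLoop.push([garbage]() {
        for (Node* n : garbage) {
            n->unlinked = true;  // stands in for Unlink() + Release()
        }
    });
}

// Drain the toy event loop (the browser would interleave this with
// handling user events between CC pauses).
void RunEventLoop() {
    while (!gEventLoop.empty()) {
        auto task = std::move(gEventLoop.front());
        gEventLoop.pop();
        task();
    }
}
```

The point of the split is that the user-visible pause covers only phase 1; the freeing happens on a turn of its own.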
Blocks: 698919

Comment 2

6 years ago
Does it need to be a timer? Posting to the event loop might be enough to respond to a user interaction.
Posting to the event loop probably happens too soon.
Er, I mean the runnable would probably be handled in the event loop too soon.
Created attachment 572831 [details] [diff] [review]
WIP

This uses runnables anyway. The patch is hackish, but it's something I wanted
to try out: it splits XPCOM object unlinking into batches of 512 releases.
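A rough illustration of that batching scheme: release a fixed-size batch, then re-post a runnable to continue, so the event loop can run in between. Plain C++; `kBatchSize`, `ReleaseInBatches`, and the toy event loop are hypothetical stand-ins, not the actual patch:

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Toy event loop standing in for the real one.
static std::queue<std::function<void()>> gLoop;

// Release pending objects in batches of kBatchSize; after each full
// batch, re-post ourselves so other runnables can run in between.
constexpr std::size_t kBatchSize = 512;

void ReleaseInBatches(std::vector<int*>* pending, std::size_t* released) {
    std::size_t n = 0;
    while (!pending->empty() && n < kBatchSize) {
        delete pending->back();  // stands in for nsISupports::Release()
        pending->pop_back();
        ++n;
        ++*released;
    }
    if (!pending->empty()) {
        // More work left: yield to the event loop and continue later.
        gLoop.push([=]() { ReleaseInBatches(pending, released); });
    }
}
```

With, say, 1300 pending objects, the first call releases 512 and queues a continuation; draining the loop finishes the rest across further turns.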
This sounds like it is worth investigating.  When looking at CC times on my own, unlinking is the only thing that ever takes much time aside from graph building.

Peter mentioned some kind of caching table of weak pointers that can decide that a weak pointer it holds should become strong.  At least, that was my understanding.  It seems like that would doom any effort to allow the browser to run in between computing dead objects and freeing them.

Any time an object is revived this way, the entire set of dead objects has to be invalidated because any of them could have become live again.  Furthermore, the set of dead objects can't just be thrown away because it is possible that the garbage includes a real dead cycle that we'll leak.  That can be solved either by pausing the browser again and running a special mini-CC just on the garbage we found or by pushing all of the garbage back into the purple buffer.
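The revival hazard can be sketched like this. Plain C++; `UpgradeWeak`, `DeferredGarbage`, and `FlushToPurpleBuffer` are hypothetical stand-ins for the weak-pointer table and the purple buffer, not real Gecko APIs:

```cpp
#include <unordered_set>
#include <vector>

struct Obj { int refcnt = 0; };

// Dead set computed by the CC whose freeing has been deferred.
struct DeferredGarbage {
    std::unordered_set<Obj*> dead;
    bool invalidated = false;
};

// Hypothetical weak-pointer table operation that upgrades a weak
// reference to a strong one. If the target is in the deferred dead
// set, the whole set must be invalidated: any object in it could
// have been revived transitively through the upgraded one.
Obj* UpgradeWeak(Obj* target, DeferredGarbage& garbage) {
    if (garbage.dead.count(target)) {
        garbage.invalidated = true;  // must re-examine before freeing
    }
    ++target->refcnt;  // now a strong reference
    return target;
}

// The invalidated objects can't simply be dropped (a real dead cycle
// among them would leak); push them back into the purple buffer
// (here just a vector) so the next CC reconsiders them.
void FlushToPurpleBuffer(DeferredGarbage& garbage,
                         std::vector<Obj*>& purple) {
    for (Obj* o : garbage.dead) purple.push_back(o);
    garbage.dead.clear();
    garbage.invalidated = false;
}
```

This mirrors the second option above: instead of a pause-and-mini-CC, the suspect garbage goes back into the purple buffer for the next collection.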
https://tbpl.mozilla.org/?tree=Try&rev=50f9c505aa80
Summary: Investigate if cycle collector could unlink asynchronously → Investigate if cycle collector could unroot asynchronously
So, in some cases, like when closing a large, non-JS-heavy web page, unrooting nsISupports objects can take almost 30% of the CC time on an opt Linux build.

Unfortunately tryserver shows that the patch has major problems. I'll investigate what and why.
(In reply to Olli Pettay [:smaug] from comment #8)
> So, in some cases, like when closing a large, non-js-heavy web page,
> unrooting nsISupports objects can
> take almost 30% of the CC time on an opt linux build.
Apparently it's not always only non-JS-heavy pages. When I closed a tab that had Facebook loaded,
releasing/unrooting nsISupports objects took 16 ms.
Created attachment 573177 [details] [diff] [review]
WIP2

Adds null checks.

https://tbpl.mozilla.org/?tree=Try&rev=31d55cb3ec1c
Interesting.  So I guess CollectWhite time is mostly taken up by Unroot?
Created attachment 573193 [details] [diff] [review]
WIP3

The previous one doesn't pass on debug builds.
I changed the assertion check a bit; hopefully the crash was about that.

I didn't yet look at what could have caused that ::PreCreate problem.
Something to do with XBL, I think.
Attachment #572831 - Attachment is obsolete: true
Attachment #573177 - Attachment is obsolete: true
(In reply to Andrew McCreight [:mccr8] from comment #11)
> Interesting.  So I guess CollectWhite time is mostly taken up by Unroot?
I think it depends on the case. Releasing, and possibly deleting, DOM nodes, for example, is quite
expensive, but unrooting some other objects can be a lot faster.
The latest patch looks OK on try (https://hg.mozilla.org/try/rev/6dbdb8fb69db), well, at least so far on Linux. But I don't yet quite understand the ::PreCreate assertion problem.
I don't see how the old code can guarantee that there is always an outer window...
Some random timing info (without the patch):

  Tabs closed                        root    unlink   unroot
  FB (not logged in), GMail, CNN     1 ms     15 ms     4 ms
  D3E, DOM4                          5 ms     30 ms    18 ms
  HTML spec                         13 ms    100 ms   144 ms
Looks like there are some leaks, but nothing terribly bad.
I'll investigate some more.
https://tbpl.mozilla.org/?tree=Try&rev=afef7a4f1222
Depends on: 701342
Blocks: 705771
Created attachment 578666 [details] [diff] [review]
without undef
Attachment #573193 - Attachment is obsolete: true
Hardware: x86_64 → All
Version: unspecified → Trunk
Whiteboard: [Snappy]

Comment 19

6 years ago
Andrew, can you snappy-prioritize this?
Marking P2. Right now the CollectWhite phase mostly takes no time, except when you are freeing a bunch of stuff, when it can take 20% or more of the CC time; this optimization would let us split the frees out of that phase. I'm not sure exactly how much of the time that is, but probably a fair amount.
Whiteboard: [Snappy] → [Snappy:P2]
Created attachment 596397 [details] [diff] [review]
up-to-date

Uploaded to try to see how this behaves now.
https://tbpl.mozilla.org/?tree=Try&rev=e3c563c3bee0
Linux M1 leaked 1.5 megs, which sounds like it could be due to this patch.
Very possible
Looks like there is only 1 document leaked. Shouldn't be too hard to find the leak.
No longer blocks: 698919
Blocks: 698919

Updated

4 years ago
Blocks: 917264
No longer blocks: 917264
CC has changed significantly since filing this. WONTFIX.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → WONTFIX