Bug 523083 (Closed) - Opened 15 years ago, Closed 14 years ago

Traverse/unlink cached content list

Categories: Core :: DOM: Core & HTML, defect
Hardware: x86
OS: All
Priority: Not set
Severity: normal

Tracking
Status: RESOLVED FIXED
blocking2.0: final+

People

(Reporter: smaug, Assigned: roc)

References

Details

Attachments

(2 files, 2 obsolete files)

Attached patch patch (obsolete) — Splinter Review
Split from bug 335998.

Peter, you wanted to do this in some other way (IIRC, traverse/unlink from
element, if element is list root), but I never got any good reasons why.

I don't think keeping the subtree rooted by the root node alive a bit longer is a big problem.
Attachment #407000 - Flags: review?(peterv)
Blocks: 542437
This patch should also remove nsBaseContentList::Shutdown.
Attached patch my version (obsolete) — Splinter Review
I wrote this to fix bug 542437. It's a lot like Olli's patch, but it gives each document its own cached content list.
There are issues with the CC graph not reflecting reality. Of course it makes it harder to understand the lifetimes when debugging, but I am worried that it can lead to uncollectable situations (we have run into this before with some of the caches that are stored in ownerDocuments).

Here's one example. It relies on nodes not always holding their ownerDocument alive, but I wouldn't be surprised if we can construct others that don't rely on that. Imagine an element that is in a cycle with its cached content list, and the content list holds on to the ownerDocument somehow. The ownerDocument is in a collectable cycle of its own, not involving the element. The element is held by one unknown edge. When that edge goes away, we'll add the element to the purple buffer, and from that point on we can't collect the element, the content list or the document. The element has no edge going to the content list, so we'll start traversing from the element and get nowhere. That's an example of a shutdown leak.

It's also very easy to get temporary leaks, where large content lists stay alive for the lifetime of the document even though their root goes away. I do realize that the cache might be blown away at some point, making things collectable. But if we want to have the cache in the document (I think we do), then the cache being blown away becomes less likely.

I think we should traverse the cached content list from its root. If that becomes a performance issue, we should use a flag on the nodes to signal that they have a cached content list.
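As a rough illustration (not the actual patch), traversing from the root could look something like the sketch below; GetCachedContentListFor is a hypothetical helper standing in for however the root node would find its cached list:

  // Sketch only: make the root node report its edge to the cached content
  // list during cycle-collection traversal, so the list's lifetime follows
  // its root rather than the document or a global cache.
  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN(nsGenericElement)
    // ... existing traversal of attributes, child nodes, properties ...
    nsContentList* cachedList = GetCachedContentListFor(tmp);  // hypothetical lookup
    if (cachedList) {
      // Tell the cycle collector about the root -> cached-list edge.
      cb.NoteXPCOMChild(static_cast<nsIDOMNodeList*>(cachedList));
    }
  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END

The unlink side would correspondingly drop the cache entry for this root, so nothing outlives the node that logically owns it.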
> The element is held by one unknown edge. When that
> edge goes away, we'll add the element to the purple buffer and from that point
> on we can't collect element, content list or document.

I don't understand why we can't collect the element, content list and document at that point. Doesn't the element have an edge to the document, and the document to the content list?
It does look like the element always has an edge to its document, so yes, we're covered in this case. Now add an indirect edge from element to document that isn't recorded completely in cycle collection. With an edge from element to cache we can collect element and cache (and might even collect document), with an edge from document to cache we can't.
Like I said, there are multiple scenarios and we can probably solve them all, but I think that if the graph doesn't reflect reality it becomes harder and we're more likely to leak, temporarily or permanently. That makes me nervous, but I'm fine with this if people are confident that it won't cause any issues.
I'm not sure what you mean by the "graph not reflecting reality". In my patch, at least, there's no trickery going on: the document references a cached content-list and traverses that reference during cycle collection. What the cycle collector sees is exactly what the reference graph really is.

I think what you're actually saying is that we should make any cycles known to the cycle collector as small as possible to reduce the risk that an edge not known to the cycle collector will create an uncollectable cycle. E.g., given a document D, element E and content-list C, and a cycle from E to D to C to E known to the cycle collector, if there's an unknown edge from E to D then all of E, D and C leak. However if there is only a cycle from E to C to E, an unknown edge from E to D will not cause a leak --- although an unknown edge from E to C still would.

Is that right?
No, what I'm saying is that the cache lifetime really should be tied to the element; it doesn't really matter *where* you store the cache. We cache a bunch of things for elements in hashes in the document, yet we traverse almost all of them from the element for exactly these reasons. It's my experience that that causes fewer problems.
Comment on attachment 407000 [details] [diff] [review]
patch

Minussing since I should be more responsive. Sorry, should have just done that ages ago, when I first explained why I thought this was wrong.
Attachment #407000 - Flags: review?(peterv) → review-
Content lists are often not associated with an element. For many content lists, including the "title" content list I was looking at in bug 542437, the root node is the document itself.
So s/element/node/.
For 5 of the 7 NS_GetContentList call sites, the root node is the document or the document's root element.
http://mxr.mozilla.org/mozilla-central/search?string=NS_GetContentList

The other two are in element.getElementsByTagName and element.getElementsByTagNameNS, which aren't used all that much AFAIK (less than the document versions).

If we stored the cached list in the document but traversed it through the document's root element, would you be happy?
That seems worse actually (why would you traverse from the documentElement if the root is the document?), what benefit does that give us? Like I said, I think we should traverse from the root node (regardless of whether it's a document, a documentElement or any element).
I still don't understand why we need to consider the cached content list to be a property of its root node, when we could just as easily consider it a property of the document, and it just happens to have a root node field. The difference seems completely subjective to me, in which case cycle collection should work either way.

Storing the cached content list in the document and traversing it via the root node seems ugly. If we really have to go in this direction, should we store the content list as a node property?
Ok, so explain to me why it's logical to bind the lifetime of the content list to the document. Why should the cached content list stay alive even though nothing references it and its root node can be collected, just because the document can't be collected?
It's a very simple issue and I've pointed out an example where your solution fails to resolve a real leak, so I'm not sure how it's subjective. Or are you saying that my solution fails to resolve leaks that yours does? And whether it's common or not is fairly irrelevant, most leaks are not that common. I tried to warn about what I think is an issue with the patch, one that looks a lot like issues we've run into in the past.
You mean the example in comment #5? In that example, isn't an unknown edge from an element to document just a bug? If the element is in the document, then we have a cycle with at least one edge unknown to the cycle collector, so the cycle would just leak, right?

> Why should the cached content list stay alive even though
> nothing references it and its root node can be collected, just because the
> document can't be collected?

That situation does illustrate why traversing through the root node would be better --- thanks.

So then the question is, while we traverse the cached content list from its root node, should we store one globally, or per-document, or per-node?
(In reply to comment #15)
> You mean the example in comment #5? In that example, isn't an unknown edge from
> an element to document just a bug? If the element is in the document, then we
> have a cycle with at least one edge unknown to the cycle collector, so the
> cycle would just leak, right?

If the element is in the document it will keep everything alive, yes. Unknown edges are bugs; unfortunately we still have plenty of those around (and they're not always under our control), so I'd rather make them leak as little as possible.

> So then the question is, while we traverse the cached content list from its
> root node, should we store one globally, or per-document, or per-node?

Personally I think per-document would be OK, though I think WebKit caches per-node. Maybe Boris has an opinion. Strictly speaking we can make that change separately from this bug, though; I'd be fine with just a change to traverse the cache.
Attached patch fix v2 — Splinter Review
OK, this is Olli's patch, revised to traverse through the root node (document or element).
Assignee: Olli.Pettay → roc
Attachment #407000 - Attachment is obsolete: true
Attachment #425764 - Attachment is obsolete: true
Attachment #426104 - Flags: review?
Attachment #426104 - Flags: review? → review?(peterv)
blocking2.0: --- → ?
Whiteboard: [needs review]
The point of the cached content list, I thought, was to deal with badly written code that gets the list over and over in a loop.  At least so my dim memory goes.  Someone want to check the blame?

If so, might it make sense to just nuke that cache altogether and tell C++ code to not be badly written?
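For illustration, this is the kind of pattern the cache was meant to paper over (a minimal sketch; assume doc is an nsIDOMDocument* obtained elsewhere):

  // Badly written: re-fetches the same live list on every iteration.
  for (PRUint32 i = 0; i < 1000; ++i) {
    nsCOMPtr<nsIDOMNodeList> divs;
    doc->GetElementsByTagName(NS_LITERAL_STRING("div"), getter_AddRefs(divs));
    // ... use divs ...
  }

  // Well written: fetch the live list once and reuse it.
  nsCOMPtr<nsIDOMNodeList> divs;
  doc->GetElementsByTagName(NS_LITERAL_STRING("div"), getter_AddRefs(divs));
  for (PRUint32 i = 0; i < 1000; ++i) {
    // ... use divs ...
  }

Since the list is live, hoisting it out of the loop is always safe; the cache only saves the repeated lookup cost for the first pattern.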
Wouldn't it also help with badly written JS code that calls getElementsByTagName(NS) over and over in a loop?
Ah, of course the JS heap would keep that alive.

The single-element cache was added in bug 140758. Re-reading that bug, it doesn't sound like it was added for a very good reason.
So yeah, I'd be happy to nuke that cache until we come up with a good reason for it to exist.
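For context, the single-element cache is roughly a one-entry memo checked before the normal lookup. Below is a generic standalone sketch of that shape; the names are illustrative only (this is not the actual NS_GetContentList code), and the real key is root node + tag + namespace rather than a string:

  #include <map>
  #include <memory>
  #include <string>

  struct ContentList { /* live list state elided */ };

  // Weakly-held table of all lists, plus a single strongly-held last answer.
  static std::map<std::string, std::weak_ptr<ContentList>> gLists;
  static std::string gLastKey;
  static std::shared_ptr<ContentList> gLastList;  // the single-entry cache

  std::shared_ptr<ContentList> GetList(const std::string& aKey) {
    // Fast path: same query as last time.
    if (gLastList && aKey == gLastKey) {
      return gLastList;
    }
    // Slow path: table lookup (create on miss), then remember the answer.
    std::shared_ptr<ContentList> list = gLists[aKey].lock();
    if (!list) {
      list = std::make_shared<ContentList>();
      gLists[aKey] = list;
    }
    gLastKey = aKey;
    gLastList = list;  // this strong reference keeps the last list (and
                       // whatever it holds) alive until it's replaced,
                       // which is what makes traverse/unlink necessary
    return list;
  }

The strongly-held last entry is exactly the part that needs to be visible to (or removable by) the cycle collector; dropping the cache removes the problem entirely.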
The only call to GetElementsByTagName(NS) that I can see that I think might be a problem is in PresShell::GoToAnchor...
Problem only if GoToAnchor is called multiple times, right?

I would be fine with having a cache of some sort on the document too, for only lists which have the document as the root, and owned by the document...
Don't we keep calling GoToAnchor frequently as the document loads, until we finally find the anchor element?
Actually, it looks like for HTML documents we create a new content list every time (via nsHTMLDocument::GetElementsByName), and we only call GetElementsByTagNameNS for non-HTML documents, so who cares. Let's kill it.
Take your pick, Peter, but I prefer this one!
Attachment #426154 - Flags: review?(peterv)
Attachment #426154 - Flags: review?(peterv) → review+
Attachment #426104 - Flags: review?(peterv)
Whiteboard: [needs review] → [needs landing]
Given the risk/reward ratio, I'd say this is a blocker.
blocking2.0: ? → final
http://hg.mozilla.org/mozilla-central/rev/cc3751898ccc
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [needs landing]