Closed Bug 535638 Opened 12 years ago Closed 12 years ago

DNS prefetch service can keep many DOMWindows alive, e.g. during mochitest

Categories

(Core :: DOM: Core & HTML, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: sayrer, Assigned: peterv)

References

Details

(Keywords: memory-footprint)

Attachments

(3 files)

Right now, the TM debug mochitest boxes are hanging on the TM tree, so I decided to see if I could detect a leak.

It looks like something near the tests that create the 500-1000th DOMWindow objects is leaking pretty badly. 

I'll attach graphs.
Attached image macintosh
y-axis: active DOMWindows
x-axis: serial number (autoincremented with each DOMWindow creation).
To be clear, when I say "leak", I mean things that are kept alive longer than necessary. I think they would disappear on shutdown. It's just that TM isn't making it through the whole run right now.
Flags: blocking1.9.2?
Do we think that this was caused by the recent changes to the cycle collector? Not sure why this is nominated as a blocker, although I agree that we should investigate / look into it.
I'll assign this to me for now, I'm looking at it but going to be on a plane soon and probably a bit jetlagged after.
Assignee: nobody → peterv
So far it looks like we're missing some edges to some elements.
DNS prefetch service is keeping anchor elements alive. The elements have event listeners, which keep windows alive.
It is related to our testsuites loading pages in quick succession, we defer the DNS prefetching while there are pages loading and I think there's a cache of 512 elements, so we could potentially hold 512 windows alive for a while. The cache gets flushed off of a timer, but only when there are no more loads.

Should probably not block, since I don't think a user will see this during normal browsing, but we should try to take a patch.
Flags: wanted1.9.2+
Flags: blocking1.9.2?
Flags: blocking1.9.2-
Do we still unbind or DestroyContent the anchors on page unload?  If so, can we remove them from the prefetch queue at that point?  Or would finding them in the list be too expensive?
By the way, I'd love to see how you found that!
I'd prefer not to rely on DestroyContent since it's a hack that ideally will go away, but if all else fails it's of course an option.

Wouldn't count on getting an UnbindFromTree since the element keeps the window alive, which keeps the document alive, which means that we won't tear down any nodes.

Maybe DestroyContent is the solution after all :(

Alternatively if we could use the LoadGroup somehow, but I'm not sure how that would work.
We could change the heuristics of the DNS prefetch queue, by timing out items. Could for example keep track of the head of the queue when deferring at the last two timeouts, and if we defer again just flush the items before the first head, set the first head to the second head and set the second head to the current head. Can't do much now, need to catch my flight, so feel free to submit patches :-).
Flags: blocking1.9.2- → blocking1.9.2?
Comment 11 sounds like probably the best approach...
Alternatively, could the DNS prefetch code simply hold weak references to the links?
In the nsWeakReference sense?
Yes.

Though looking at the code, we don't even really need to hold on to the element at all. All we're using the element for is grab the href uri, and check that the owner document is non-null.

I don't really think we care enough about dynamic changes to the domain of the href attribute that we couldn't just grab the href up front.

And the owner document is soon going to be a owning reference, so no null checking is useless. I guess the code is trying to check if the document has gone away which could be worth something? But if we want to do that, using a nsWeakRef is a better (and soon only) solution.
How about this idea: Grab the hostname up front, but then hold a weak reference to the owner document to make sure we don't prefetch for documents that have been navigated away from?
> All we're using the element for is grab the href uri

As it happens, this is expensive.  The reason for the current setup is to defer that until we're sure we need it.  See the checkin history for that code.

> And the owner document is soon going to be a owning reference

Then we can remove that null-check.  In the meantime, we were hitting asserts and crashes without it.

> How about this idea: Grab the hostname up front

See above about expensive.
Attached patch v1Splinter Review
I'm fine with holding weak references, I have a patch that does that and it seems to help. I have a vague memory that we couldn't do this in the past, but I'm not really sure why. Boris, do you? Maybe I'm just confused. I do remember that fetching the href before putting things in the cache regressed performance in some cases.
Attachment #418458 - Flags: superreview?(jonas)
Attachment #418458 - Flags: review?(bzbarsky)
Attachment #418458 - Flags: review?(bzbarsky) → review+
That looks ok to me, as long as it doesn't hurt perf due to the extra QIs and the extra object...
Comment on attachment 418458 [details] [diff] [review]
v1

I think it's silly that we can't get the href up front, but I guess this'll do.
Attachment #418458 - Flags: superreview?(jonas) → superreview+
Extra two objects actually.. :(  (the weakref object, and the slots)
Flags: blocking1.9.2? → blocking1.9.2-
http://hg.mozilla.org/mozilla-central/rev/876a1db26a28
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Keywords: footprint
Summary: big resource leak apparent during mochitest → DNS prefetch service can keep many DOMWindows alive, e.g. during mochitest
Looks like the weak ref approach is actually quite slow, given the performance requirements of this year.
Depends on: 1356556
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.