Closed Bug 610956 Opened 9 years ago Closed 9 years ago

<video> causes nsGlobalWindow to leak until shutdown

Categories

(Core :: Audio/Video, defect, P1)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla2.0b8
Tracking Status
blocking2.0 --- final+

People

(Reporter: jruderman, Assigned: bzbarsky)

References

(Blocks 1 open bug)

Details

(Keywords: memory-leak, testcase)

Attachments

(3 files, 2 obsolete files)

Attached file v.html
Steps:
1. Install 'DOM Fuzz Lite' from
    https://www.squarefree.com/extensions/domFuzzLite.xpi
2. Run a debug build of Firefox like this:
    XPCOM_MEM_LEAK_LOG=2 firefox v.html
3. Click the "Quit with leak check" button.

Result: during shutdown, it says "Windows: 6"

Expected: during shutdown, it should say "Windows: 4"

This seems serious: if I reload a bunch of times before clicking the quit button, the number of windows held until shutdown grows without bound.
blocking2.0: --- → ?
Attached file baseline.html
I can't reproduce this...  I get "Windows: 4", both with a local file and the bugzilla attachment.
I'm testing with a local file, Ubuntu 10.10 (64-bit), Firefox debug build from Tinderbox (64-bit).

I can reproduce even after updating to a http://hg.mozilla.org/mozilla-central/rev/d1649deec025 Tinderbox build.

Last night I even tried a new profile and was still able to reproduce.
OK.  On Linux, I do see this.

It looks like the nsHTMLVideoElement is being kept alive by the nsHTMLMediaElement::MediaLoadListener.  This last doesn't drop the ref to the element until it gets the xpcom-shutdown notification.
I have no idea why I couldn't reproduce on Mac; this is a clear leak.

In nsHTMLMediaElement::LoadResource we have:

965   // loadListener will be unregistered either on shutdown or when
966   // OnStartRequest fires.
967   nsContentUtils::RegisterShutdownObserver(loadListener);

This adds loadListener, which is holding a strong ref to the element, as a shutdown observer, with the observer service holding a strong ref.

Now if we take any early return before a successful return from AsyncOpen, then we'll never get an OnStartRequest, so will leak through shutdown.
Attached patch Proposed fix (obsolete) — Splinter Review
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
Attachment #489753 - Flags: review?(roc)
Attachment #489753 - Flags: approval2.0?
Priority: -- → P1
Whiteboard: [need review]
Attachment #489753 - Flags: review?(roc)
Attachment #489753 - Flags: review+
Attachment #489753 - Flags: approval2.0?
Attachment #489753 - Flags: approval2.0+
blocking2.0: ? → final+
Whiteboard: [need review]
Whiteboard: [need landing]
http://hg.mozilla.org/mozilla-central/rev/c3a14297dff8
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Whiteboard: [need landing]
Backed out due to leak: http://hg.mozilla.org/mozilla-central/rev/d35ddd8a05a2

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1289778830.1289781214.25041.gz
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 349199 bytes during test execution
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 144 instances of AtomImpl with size 20 bytes each (2880 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of BackstagePass with size 24 bytes
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 2 instances of CSSImportRuleImpl with size 44 bytes each (88 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 36 instances of CSSImportantRule with size 16 bytes each (576 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 11 instances of CSSNameSpaceRuleImpl with size 40 bytes each (440 bytes total)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Can I ask for a favor?  Please don't mess with my status whiteboard?
Whiteboard: [need landing]
I removed it because I landed the patch.  I didn't put it back after I backed the patch out because the patch appeared to cause leaks and needs further investigation.
Yes, but that also dropped it off my radar completely; if I didn't happen to see the bugmail in my "bugmail to ignore" folder, I wouldn't have realized I needed to do anything with this bug.  Please just don't touch status whiteboard annotations you didn't add...
Alright, there's good news and bad news.

The good news is that the test failure is an honest-to-God leak caused by this patch.  It's also easy to fix.

The bad news is that before this patch the same circumstance led to a leak through shutdown due to a cycle that was held alive up until the xpcom-shutdown notification we were registered for broke it.

Specifically, we leak on test_mozfiledataurl.html, because in LoadResource aURI is a moz-filedata URI.  Its principal has http://mochi.test:8888/tests/content/base/test/test_mozfiledataurl.html as the codebase, but our document codebase is http://example.com/tests/content/base/test/file_mozfiledataurl_inner.html

So the CheckLoadURI checks in nsCrossSiteListenerProxy::UpdateChannel fail, and the constructor returns an error rv.  Which means we take this early return in LoadResource:

    nsCrossSiteListenerProxy* crossSiteListener =
      new nsCrossSiteListenerProxy(loadListener,
                                   NodePrincipal(),
                                   mChannel,
                                   PR_FALSE,
                                   &rv);
    listener = crossSiteListener;
    NS_ENSURE_TRUE(crossSiteListener, NS_ERROR_OUT_OF_MEMORY);
    NS_ENSURE_SUCCESS(rv, rv);

Now at this point, |this| has a strong ref to mChannel, mChannel has a strong ref to loadListener, loadListener has a strong ref to |this|.  Before the patch for this bug, the xpcom-shutdown notification sent to loadListener made it drop the ref to |this|, breaking the cycle.  After the patch, that doesn't happen anymore.

Anyway, updated patch with a fix for this problem coming up.
I will now claim that having shutdown observers that break cycles between objects not really expected to be long-lived is an antipattern, btw.  I'm just not sure how to deal with it better in this situation.  :(
Attached patch Now with better leak fixage (obsolete) — Splinter Review
Attachment #490515 - Flags: review?(roc)
Whiteboard: [need landing] → [need review]
The other, possibly safer, option is to stick the channel on the stack until we succeed in calling AsyncOpen, _then_ assign it to mChannel.  Let me know if you'd prefer that?
Attached patch Like this, saySplinter Review
Attachment #489753 - Attachment is obsolete: true
Attachment #490515 - Attachment is obsolete: true
Attachment #490528 - Flags: review?(roc)
Attachment #490515 - Flags: review?(roc)
Whiteboard: [need review] → [need landing]
Pushed http://hg.mozilla.org/mozilla-central/rev/f9e32506519f
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Whiteboard: [need landing]
Target Milestone: --- → mozilla2.0b8
I installed the 'DOM Fuzz Lite' extension, but I don't see any  "Quit with leak check" button.
How do I get the 'DOM Fuzz Lite' extension to work?
The "Quit with leak check" button is part of the testcase attached to this bug.
You need to log in before you can comment on or make changes to this bug.