Closed Bug 757807 Opened 12 years ago Closed 12 years ago

Ghost windows on nytimes.com

Categories

(Core :: DOM: Core & HTML, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla15
Tracking Status
firefox12 --- affected
firefox13 - affected
firefox14 + fixed

People

(Reporter: roc, Assigned: roc)

References

()

Details

(Keywords: memory-leak, Whiteboard: [MemShrink])

Attachments

(1 file)

Several times with long-lived browser sessions I've noticed "Ghost Windows" in about:compartments at nytimes.com domains. Currently I've got:

http://movies.nytimes.com/2012/05/16/movies/the-dictator-with-sacha-baron-cohen.html [4]
http://www.nytimes.com/2007/02/14/world/asia/14iht-green.4590765.html [4]
http://www.nytimes.com/2012/05/18/world/asia/china-princelings-using-family-ties-to-g [clipped, maybe 4 too?]

I do not have many addons installed. Currently the only enabled addons are about:telemetry, about:ccdump, DOM Inspector, Font Information, and PDF Viewer.

I installed about:ccdump and had it identify "zombie documents". It identified two documents for each of the above URLs (6 total). Then I did "Show Roots" on each of those.
For both http://www.nytimes.com/2012/05/18/world/asia/china-princelings-using-family-ties-to-gain-riches.html?_r=1 documents, it returned the same root: a single <p> element in one of those documents with refcount 5, known edges 3.
For both http://www.nytimes.com/2007/02/14/world/asia/14iht-green.4590765.html documents, it return the same root: the same <p> element.
For both http://movies.nytimes.com/2012/05/16/movies/the-dictator-with-sacha-baron-cohen.html documents, it returned the same root: a <p> element in one of those documents.
That <p> element also has refcount 5, known edges 3. It is a child of a div with class "articleBody", i.e., one of the regular paragraphs of the article. (It happens to be the one start "There is nothing especially outrageous here.")

The other <p> element is the one starting "Chinese officials and their relatives rarely discuss such a delicate issue publicly."
In my current browser session, every time I double-click on a word in the article, the Web page pops up its little "?" icon you can click on for a definition. If I click on that icon, close the resulting popup window, and then close the article tab, I have a new ghost window for that article.

In a debug build with a test profile, this does not happen :-(.
I would identify one of the nodes being held alive by unknown edges, set a breakpoint on nsFoo::Release with this == that node, and then close the tab and identify the stacks that don't match up with what the CC sees.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #3)
> I would identify one of the nodes being held alive by unknown edges, set a
> breakpoint on nsFoo::Release with this == that node, and then close the tab
> and identify the stacks that don't match up with what the CC sees.

Tried that. In fact, tried exiting Firefox entirely. The breakpoint was not hit. Maybe I did something wrong. Or maybe it was because I was using an opt build --- although it did have symbols and it wasn't PGO.

Fortunately I seem to be able to reproduce the problem quickly when I start up again in my default profile (with session store restored).
This may actually involve DOM Inspector. I've been using DOM Inspector to bypass the NYTimes article limit by manually removing the overlay element.
Here's what I think are steps to reproduce.

1) Install DOM Inspector (2.0.11 here).
2) Visit articles at nytimes.com until you can no longer view articles for free.
3) Visit one more article to get the paywall overlay.
4) Open DOM inspector, click the left-most toolbar button, click on the paywall overlay image to select that element in DOM Inspector.
5) Follow the DOM hierarchy up to the outermost DIV element containing the overlay image, and remove that DIV from the DOM (e.g. pressing Del or using "Cut" from the DOMI context menu).
6) Double-click on a word in the article text of the NYTimes page.
7) Click on the ? icon to get a definition of the word in a popup window.
8) Close the popup window.
9) Close the DOM Inspector window.
10) Close the NYTimes page tab.
11) Open about:compartments.

You should have a ghost window for the NYTimes page.

I'll try to minimize these STR a bit.
I think you can skip the ? icon and the popup window bit. I've reproduced the bug in a new profile just by opening tons on NYTimes articles until I get the paywall overlay, then removing it using the DOM inspector as described above, then closing the NYTimes tabs.

Interestingly, it's not just the window where I removed the paywall overlay that is shown as a ghost window.
I tried doing comment #3 again and again I never hit the breakpoint. Maybe the addresses exposed by about:ccdump don't match the 'this' pointer in nsGenericElement::Release, or maybe Visual Studio is screwing things up in some other way.
Whiteboard: [MemShrink]
> Here's what I think are steps to reproduce.

On a clean profile (no add-ons), I saw a ghost window just after visiting a few sites on nytimes then opening about:compartments.  No double-clicking on words, no paywall interspersed.  (This was on a debug build, and I haven't been able to reproduce it.)

It will probably help to set memory.ghost_window_timeout_seconds to 1.  At its default value of 60, you won't get any ghost windows until they've been detached for 60s; that may be affecting your STR.

(Alternatively, you can leave the timeout the same and simply look for a nytimes compartment in the list.)
> It will probably help to set memory.ghost_window_timeout_seconds to 1.  At its default value of 60, 
> you won't get any ghost windows until they've been detached for 60s; that may be affecting your STR.

Actually, you don't need to do this.  When you open about:compartments, it gc/cc's a bunch of times, then marks anything which survives as a ghost.
I opened nytimes.com, then loaded a bunch of pages from there into different tabs, then closed the tabs. I had to go through several rounds of opening pages into other tabs and closing the tabs, because I had trouble getting ghost windows. Then, with just the home page open in a tab, I noticed that under User Compartments there was a page from a tab that I'd already closed, but it wasn't showing up in Ghost Windows. So I closed the main nytimes.com tab and presto, the offending compartment started showing up in Ghost Windows. Maybe that's expected, but it seemed weird.
> So I closed the main nytimes.com tab and presto,
> the offending compartment started showing up in Ghost Windows. Maybe that's
> expected, but it seemed weird.

Yes, that's by design.  I'm writing a blog post about it now, but until then, see

http://hg.mozilla.org/mozilla-central/file/f43e8d300f21/dom/base/nsWindowMemoryReporter.h#l58
Instead of comment #3 (breakpoint in nsGenericElement::Release), I tried setting a data breakpoint on the mRefCnt member of the offending element. This is a LOT faster, but this breakpoint was never hit during shutdown either :-(. I conclude that the 2 unknown references to the element are in fact permanently leaked. As supporting evidence, my debug build complains that zillions of nytimes-related URLs are leaked, and the --DOMWINDOW count ends at 5.
I think I've found it!

nsFrameSelection contains mDelayedMouseEvent. mDelayedMouseEvent contains "target", "currentTarget" and "originalTarget". All three are nsCOMPtrs, none are traced by cycle collection. In my memory dump, "target" and "originalTarget" are the unknown references to the leaking <p> element.
(this revelation brought to you by WinDbg and its "search memory" command)
The callers of GetDelayedCaretData() only use me->IsShift() and me->clickCount. So we don't even need to store an actual event there.
Attached patch fixSplinter Review
Assignee: nobody → roc
Attachment #626706 - Flags: review?(bugs)
Comment on attachment 626706 [details] [diff] [review]
fix


> nsFrameSelection::nsFrameSelection()
>-  : mDelayedMouseEvent(false, 0, nsnull, nsMouseEvent::eReal)
> {
>   PRInt32 i;
>   for (i = 0;i<nsISelectionController::NUM_SELECTIONTYPES;i++){
>     mDomSelections[i] = new nsTypedSelection(this);
>     mDomSelections[i]->SetType(GetSelectionTypeFromIndex(i));
>   }
>   mBatching = 0;
>   mChangesDuringBatching = false;

You should initialize the new member variables somewhere in the ctor.

I think we should take the patch to Aurora too.
Leaks are bad.
Attachment #626706 - Flags: review?(bugs) → review+
Keywords: mlk
This should be safe for Aurora, maybe even for Beta.
Should we file a separate bug on hooking nsMouseEvent up to the CC?
No. ns*Event structs shouldn't be CCable. They aren't refcounted either.
This was just a bizarre and wrong way to use an ns*Event struct.
Comment on attachment 626706 [details] [diff] [review]
fix

Review of attachment 626706 [details] [diff] [review]:
-----------------------------------------------------------------

This is a bad bug, and a very simple fix.

It would be great to get this fixed for FF13, but since it's an old bug it would be OK to miss FF13.
Attachment #626706 - Flags: approval-mozilla-beta?
Attachment #626706 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/a339c3a648af
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla15
Comment on attachment 626706 [details] [diff] [review]
fix

[Triage Comment]
Thanks Robert - I'd rather we didn't take this on FF13 for our final beta, since it doesn't feel like the issue chemspill-worthy (although it does look low risk). Approving for Aurora 14, however.
Attachment #626706 - Flags: approval-mozilla-beta?
Attachment #626706 - Flags: approval-mozilla-beta-
Attachment #626706 - Flags: approval-mozilla-aurora?
Attachment #626706 - Flags: approval-mozilla-aurora+
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.