nsGlobalWindow leak in devtools/debugger/test/browser_dbg_conditional-breakpoints-02.js

RESOLVED FIXED in Firefox 26, Firefox OS v1.2

Status

RESOLVED FIXED
5 years ago
4 months ago

People

(Reporter: shu, Assigned: shu)

Tracking

unspecified
Firefox 28
x86
Mac OS X
Dependency tree / graph

Firefox Tracking Flags

(firefox26 fixed, firefox27 fixed, firefox28 fixed, firefox-esr24 unaffected, b2g-v1.2 fixed)

Details

(Whiteboard: [qa-])

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

5 years ago
CC logs of the roots:

0x12197d480 [FragmentOrElement (xhtml) html http://example.com/browser/browser/devtools/debugger/test/doc_conditional-breakpoints.html]
    --[mAttrsAndChildren[i]]--> 0x12070f880 [FragmentOrElement (xhtml) body http://example.com/browser/browser/devtools/debugger/test/doc_conditional-breakpoints.html]
    --[mAttrsAndChildren[i]]--> 0x1219124c0 [FragmentOrElement (xhtml) button http://example.com/browser/browser/devtools/debugger/test/doc_conditional-breakpoints.html]
    --[[via hash] mListenerManager]--> 0x12070fba0 [nsEventListenerManager]
    --[mListeners event=onclick listenerType=1 [i]]--> 0x1219cf180 [nsJSEventListener handlerName=onclick]
    --[mContext]--> 0x12043fbe0 [nsJSContext]
    --[mWindowProxy]--> 0x11d1191c0 [JS Object (Proxy)]
    --[type_proto]--> 0x11d108220 [JS Object (XPC_WN_ModsAllowed_NoCall_Proto_JSClass - Window)]
    --[setter]--> 0x11d152500 [JS Object (Function - ononline)]
    --[parent]--> 0x11d107100 [JS Object (Window)]
    --[js::GetObjectPrivate(obj)]--> 0x1207d26a0 [XPCWrappedNative (Window)]
    --[mIdentity]--> 0x121962c00 [nsGlobalWindow #17]

    Root 0x12197d480 is a ref counted object with 1 unknown edge(s).
    known edges:
       0x1219bcdc0 [nsGenericDOMDataNode] --[GetParent()]--> 0x12197d480
       0x12070f880 [FragmentOrElement (xhtml) body http://example.com/browser/browser/devtools/debugger/test/doc_conditional-breakpoints.html] --[GetParent()]--> 0x12197d480
       0x1207ca800 [nsDocument normal (xhtml) http://example.com/browser/browser/devtools/debugger/test/doc_conditional-breakpoints.html] --[mChildren[i]]--> 0x12197d480
       0x12197d530 [FragmentOrElement (xhtml) head http://example.com/browser/browser/devtools/debugger/test/doc_conditional-breakpoints.html] --[GetParent()]--> 0x12197d480

0x1207ca800 [nsDocument normal (xhtml) http://example.com/browser/browser/devtools/debugger/test/doc_conditional-breakpoints.html]
    --[mFormControls]--> 0x11d2700d0 [nsBaseContentList]
    --[mElements[i]]--> 0x1219124c0 [FragmentOrElement (xhtml) button http://example.com/browser/browser/devtools/debugger/test/doc_conditional-breakpoints.html]
    --[[via hash] mListenerManager]--> 0x12070fba0 [nsEventListenerManager]
    --[mListeners event=onclick listenerType=1 [i]]--> 0x1219cf180 [nsJSEventListener handlerName=onclick]
    --[mContext]--> 0x12043fbe0 [nsJSContext]
    --[mWindowProxy]--> 0x11d1191c0 [JS Object (Proxy)]
    --[type_proto]--> 0x11d108220 [JS Object (XPC_WN_ModsAllowed_NoCall_Proto_JSClass - Window)]
    --[setter]--> 0x11d152500 [JS Object (Function - ononline)]
    --[parent]--> 0x11d107100 [JS Object (Window)]
    --[js::GetObjectPrivate(obj)]--> 0x1207d26a0 [XPCWrappedNative (Window)]
    --[mIdentity]--> 0x121962c00 [nsGlobalWindow #17]

    Root 0x1207ca800 is a ref counted object with 6 unknown edge(s).
    known edges:
       0x121962c00 [nsGlobalWindow #17] --[mDoc]--> 0x1207ca800
       0x120777f40 [nsNodeInfoManager] --[mDocument]--> 0x1207ca800
       0x11d119300 [JS Object (Proxy)] --[UnwrapDOMObject(obj)]--> 0x1207ca800

0x120291400 [nsGlobalWindow #15]

    Root 0x120291400 is a ref counted object with 1 unknown edge(s).
    known edges:
       0x12043fbe0 [nsJSContext] --[mGlobalObjectRef]--> 0x120291400
       0x121962c00 [nsGlobalWindow #17] --[mOuterWindow]--> 0x120291400
(Assignee)

Comment 1

5 years ago
Panos, do you know anything about this? It's happening fairly frequently.
Flags: needinfo?(past)

Comment 2

5 years ago
I wonder if this is related to bug 931052. That's also OSX-only, and related to the debugger's global object management.
(Assignee)

Comment 3

5 years ago
The only thing I can see entraining the leaking nsGlobalWindows other than what's already accounted for in the CC log is nsFocusManager. I'm not sure if that means something is installing focus event handlers on elements in that window or there are focus event handlers which has those windows as free variables. Does this ring any bells?
Sounds a little like bug 605294.  Do you have any ideas, Olli?
(Assignee)

Comment 5

5 years ago
Created attachment 8336597 [details] [diff] [review]
load-bearing-wallpaper.patch

This patch gets rid of the leak for me locally. Don't ask me why, I don't know.

Try push is here: https://tbpl.mozilla.org/?tree=Try&rev=fbdd2319bd5f

At time of this writing I have to to bed, so I don't know if the patch actually helps. If it does, feel free to push it with r=orange or something.
(Assignee)

Comment 6

5 years ago
We should still figure out why it's leaking, even if that patch does help.

Comment 7

5 years ago
Can anyone produce this locally? Is this a runtime leak only, or is the leak there also during shutdown?


(Debugging runtime-only leaks is easier. Create the leak, take the address of the
root of the leaked graph by creating cclog or using the addon attached to bug 726346,
add a break point to the Release of that object and close the browser and see where the leak is
broken down.)
Blocks: 942102
I filed bug 942102 for this cluster of problems. As you can see, there are a bunch of them.

(In reply to Olli Pettay [:smaug] from comment #7)
> Can anyone produce this locally?

Yes. shu is able to reproduce this locally.  It is OSX 10.8 opt only.

> Is this a runtime leak only, or is the leak there also during shutdown?

It is a runtime leak.

> (Debugging runtime-only leaks is easier. Create the leak, take the address
> of the root of the leaked graph by creating cclog or using the addon attached to
> bug 726346, add a break point to the Release of that object and close the browser and
> see where the leak is broken down.)

Yeah, shu did that.  In comment 3, he found that it was nsFocusManager that had the outstanding references.  Do you have any idea what might be going wrong with that?
(In reply to Shu-yu Guo [:shu] from comment #5)
> At time of this writing I have to to bed, so I don't know if the patch
> actually helps. If it does, feel free to push it with r=orange or something.

I talked to shu in IRC and since we are getting also windows xp etc reports he pushed a new try for all platforms - https://tbpl.mozilla.org/?tree=Try&rev=35c7285d48ff and did the retriggeres so far
Oh, sorry, totally missed comment 3.
(In reply to Shu-yu Guo [:shu] from comment #3)
> The only thing I can see entraining the leaking nsGlobalWindows other than
> what's already accounted for in the CC log is nsFocusManager. I'm not sure
> if that means something is installing focus event handlers on elements in
> that window
> or there are focus event handlers which has those windows as
> free variables. Does this ring any bells?
No, probably nothing to do with event handlers or listeners.

Focus just for some reason stays in the focus manager. Sounds very much like intermittent 
"leak". We could "fix" it by adding optimization to mark windows kept alive by focus manager black,
or just focus some other window when the test is about to end.
And this is very different to Bug 605294, since that was a shutdown leak.
A patch coming.
(Assignee)

Comment 14

5 years ago
FWIW I have not had luck reproducing the other leaks in the cluster by repeatedly running the test in question until failure. Only this one produced a leak using that method. I'll try running the entire suite later to see if I can repro.
(Assignee)

Comment 15

5 years ago
It seems like I always have to eat my words. Just reproduced the leak in browser_dbg_global-search04.js with /mach mochitest-browser --run-until-failure --repeat 100 browser/devtools/debugger/test/browser_dbg_search-global-04.js
(Assignee)

Comment 16

5 years ago
(In reply to Shu-yu Guo [:shu] from comment #15)
> It seems like I always have to eat my words. Just reproduced the leak in
> browser_dbg_global-search04.js with /mach mochitest-browser
> --run-until-failure --repeat 100
> browser/devtools/debugger/test/browser_dbg_search-global-04.js

Very, very intermittent, however.

Updated

5 years ago
Depends on: 942240
Olli, is it really not a bug that some windows hang around in the nsFocusManager for over an hour to almost shutdown?
Is that really the case? Are we not moving focus to anywhere else? Or are we using a separate
process here?

nsFocusManager has 3 strong refs to nsPIDOMWindow. Which one is keeping the cycle alive?
(I don't have OSX to test.)
(Assignee)

Comment 19

5 years ago
(In reply to Olli Pettay [:smaug] from comment #18)
> Is that really the case? Are we not moving focus to anywhere else? Or are we
> using a separate
> process here?
> 
> nsFocusManager has 3 strong refs to nsPIDOMWindow. Which one is keeping the
> cycle alive?
> (I don't have OSX to test.)

Can't tell yet, lldb doesn't want to cooperate.
(Assignee)

Comment 20

5 years ago
Some additional info for posterity: this test leaks pretty quickly for me locally on 10.9 even with bug 933882 backed out.

I am still having great difficulty reproducing the other leaks locally on 10.9.
Even with the patch for Bug 942240 there are some devtools/debugger leaks.

Is something else than nsFocusManager keeping stuff alive?
(Assignee)

Comment 22

5 years ago
(In reply to Olli Pettay [:smaug] from comment #21)
> Even with the patch for Bug 942240 there are some devtools/debugger leaks.
> 
> Is something else than nsFocusManager keeping stuff alive?

Testing your patch now with this leak that I can reproduce locally. I wish I could tell you what's up with the other leaks, but I can't reproduce those locally.
(Assignee)

Comment 23

5 years ago
Bad news. With the patch from 942240 browser_dbg_conditional-breakpoints-02.js still leaks locally on 10.9 :(
I don't really have any idea why this is happening, but the workaround seems fine to me. If the leak is in the focus manager, would focusing on another element right before finishing the test work too?
Flags: needinfo?(past)

Comment 25

5 years ago
Shu, I've gotten my hands on an OSX 10.8 machine. Could you attach the .mozconfig file you're using (that includes debugger symbols)?
Flags: needinfo?(shu)
(Assignee)

Comment 26

5 years ago
(In reply to Jim Blandy :jimb from comment #25)
> Shu, I've gotten my hands on an OSX 10.8 machine. Could you attach the
> .mozconfig file you're using (that includes debugger symbols)?

Since the try builds have been only failing on opt, I've just used this .mozconfig:

> ac_add_options --with-ccache
> mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-opt

My usual debug .mozconfig is:

> ac_add_options --enable-debug
> ac_add_options --enable-debug-symbols
> ac_add_options --with-ccache

I hope you're able to reproduce the other leaks (search-global-04, etc) on 10.8! If we don't make much progress I'll request a 10.8 slave next week from releng.
Flags: needinfo?(shu)
(Assignee)

Comment 27

5 years ago
Created attachment 8337463 [details] [diff] [review]
bug941787-nested-event-loop-race.patch

What was entraining the nsGlobalWindows was indeed nsFocusManager, but the bug is in the devtools browser actor. Events are suppressed on the window when inside a nested event loop. This suppression holds a stronger reference to the window in the delayed focus events array inside nsFocusManager. There is a race between BTA_exit and BTA_postNest where BTA_exit can be called before BTA_postNest. When this happens, we never unsuppress events on the now nulled out window, causing nsFocusManager to never fire its delayed events. Those delayed events live inside nsFocusManager until shutdown, holding the nsGlobalWindow alive.

Push try is here to see if it helps the other leaks: https://tbpl.mozilla.org/?tree=Try&rev=ced0b335eab9
Attachment #8336597 - Attachment is obsolete: true
Attachment #8337463 - Flags: review?(past)
(Assignee)

Comment 28

5 years ago
lol, s/stronger reference/strong reference
Comment on attachment 8337463 [details] [diff] [review]
bug941787-nested-event-loop-race.patch

Review of attachment 8337463 [details] [diff] [review]:
-----------------------------------------------------------------

The event loop handling logic is mostly inside the thread actor (script.js), so this fix seems slightly out of place, but I don't think it's that big of a deal. Perhaps the promises used there contribute to the race somehow.
Attachment #8337463 - Flags: review?(past) → review+
(Assignee)

Comment 30

5 years ago
Try looks very green. Thanks for whoever retriggered!

Panos, I don't think the promises contribute to the race, other than that the "TabClose" event which triggers the exit isn't on the main "spine of promises" that the rest of the actors follow.

I'll push this to stop the leaks, but please push a more elegant solution in line with how promises are used when you have one.
Forgot to add: huge props for figuring this one out!
https://hg.mozilla.org/mozilla-central/rev/d504746c4456
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 28
Assignee: nobody → shu
Is this something we should consider uplifting to Aurora/Beta?
Panos, shu is on PTO. Can you please nominate this for uplift? :)
Flags: needinfo?(past)

Comment 36

5 years ago
(In reply to Shu-yu Guo [:shu] from comment #27)
> What was entraining the nsGlobalWindows was indeed nsFocusManager, but the
> bug is in the devtools browser actor. Events are suppressed on the window
> when inside a nested event loop. This suppression holds a stronger reference
> to the window in the delayed focus events array inside nsFocusManager. There
> is a race between BTA_exit and BTA_postNest where BTA_exit can be called
> before BTA_postNest. When this happens, we never unsuppress events on the
> now nulled out window, causing nsFocusManager to never fire its delayed
> events. Those delayed events live inside nsFocusManager until shutdown,
> holding the nsGlobalWindow alive.

AMAZING. Very impressive.
(Assignee)

Comment 37

5 years ago
Comment on attachment 8337463 [details] [diff] [review]
bug941787-nested-event-loop-race.patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Bug 697762
User impact if declined: Leaks sometimes when opening the debugger and closing the debuggee tab.
Testing completed (on m-c, etc.): on m-c
Risk to taking this patch (and alternatives if risky): None
String or IDL/UUID changes made by this patch:
Attachment #8337463 - Flags: approval-mozilla-beta?
Attachment #8337463 - Flags: approval-mozilla-aurora?
I just want to clarify that no string or IDL/UUID changes are made by this patch.
Flags: needinfo?(past)
Duplicate of this bug: 937913
Duplicate of this bug: 938241
Duplicate of this bug: 941500
Duplicate of this bug: 941549
Duplicate of this bug: 942038
Attachment #8337463 - Flags: approval-mozilla-beta?
Attachment #8337463 - Flags: approval-mozilla-beta+
Attachment #8337463 - Flags: approval-mozilla-aurora?
Attachment #8337463 - Flags: approval-mozilla-aurora+
Blocks: 937912
https://hg.mozilla.org/releases/mozilla-aurora/rev/fc8df6e6a540
https://hg.mozilla.org/releases/mozilla-beta/rev/b189801a4cc4
status-firefox26: --- → fixed
status-firefox27: --- → fixed
status-firefox28: --- → fixed
status-firefox-esr24: --- → unaffected
status-b2g-v1.2: --- → fixed
Duplicate of this bug: 916693
Duplicate of this bug: 941586
Duplicate of this bug: 941954
Duplicate of this bug: 942048
Duplicate of this bug: 942526
Whiteboard: [qa-]

Updated

4 months ago
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.