Compile-time let block captures runtime references (was: Download Manager leaks nsGlobalWindow)

RESOLVED FIXED

Status

()

Core
JavaScript Engine
P2
normal
RESOLVED FIXED
10 years ago
10 years ago

People

(Reporter: Ben Turner (not reading bugmail, use the needinfo flag!), Assigned: Igor Bukanov)

Tracking

({mlk})

Trunk
x86
Mac OS X
Points:
---
Bug Flags:
blocking1.9 +
in-testsuite +
in-litmus -

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [needs bug 427798])

Attachments

(2 attachments, 2 obsolete attachments)

Here's the CC_DEBUG output:

nsCycleCollector: JS Object (Script - chrome://mozapps/content/downloads/downloads.js) (global=1bb976e0) 0x1bb94ee0 was not collected due to 
  external references
  An object expected to be garbage could be reached from it by the path:
    JS Object (Script - chrome://mozapps/content/downloads/downloads.js) (global=1bb976e0) 0x1bb94ee0
    JS Object (Function - Startup) (global=1bb942a0) 0x1bb942a0
    JS Function 0x1c0aa4c8
    JS Object (Block) (global=1bb94320) 0x1bb942c0
    JS Object (XULElement) (global=1bb93940) 0x1cec3b40
    JS Object (ChromeWindow) (global=1bb93940) 0x1bb93940
    nsGlobalWindow 0x1c3eabcc
  The known references to it were from:
    nsXULPrototypeNode 0x1c2c1da0
0x1bb94ee0 Script 1357000         via locked object
0x1bb94ee0 Script 1357000         via JSObjectHolder
nsCycleCollector: nsXULPrototypeDocument 0x1c9961f0 was not collected due to 1
  external references (3 total - 2 known)
  An object expected to be garbage could be reached from it by the path:
    nsXULPrototypeDocument 0x1c9961f0
    nsXULPrototypeNode 0x1c3db2c0
    nsXULPrototypeNode 0x1c2c1da0
    JS Object (Script - chrome://mozapps/content/downloads/downloads.js) (global=1bb976e0) 0x1bb94ee0
    JS Object (Function - Startup) (global=1bb942a0) 0x1bb942a0
    JS Function 0x1c0aa4c8
    JS Object (Block) (global=1bb94320) 0x1bb942c0
    JS Object (XULElement) (global=1bb93940) 0x1cec3b40
    JS Object (ChromeWindow) (global=1bb93940) 0x1bb93940
    nsGlobalWindow 0x1c3eabcc
  The 2 known references to it were from:
    nsDocument 0x1334c00
    nsDocument 0x1334c00
Flags: blocking1.9?
(In reply to comment #0)
> Here's the CC_DEBUG output:

Er, DEBUG_CC.

And I realized that I forgot STR. All you have to do is open the download manager (I used the menu) and then close it by clicking the X on the window frame.
If I recall, ispiked had seen this before, but said he also saw it opening the addons window, and just about any other sub-window.

Updated

10 years ago
Assignee: nobody → bent.mozilla
Flags: blocking1.9? → blocking1.9+
Priority: -- → P2
Keywords: mlk
Keeping this one a blocker, sounds like this can cause multiple chrome windows to get leaked.

However if the cache is working properly we might only leak one download manager window, one addons window, etc, so not horribly bad still.
Note:  I don't know the details, nor do I know if that statement is still correct.
So this actually appears to be a JS engine bug as I can work around this leak by making a small change to downloads.js. It looks like the scoped let is somehow holding on to the element at compile time rather than getting it at runtime.
Assignee: bent.mozilla → general
Component: DOM → JavaScript Engine
QA Contact: general → general
Created attachment 309272 [details] [diff] [review]
Workaround patch (-w)

This patch fixes the leak.
(In reply to comment #5)
> It looks like the scoped let is somehow holding on to the element
> at compile time rather than getting it at runtime.

I think I said that poorly. What's happening here is that the XUL prototype document is sitting in the XUL prototype cache. That document has this script compiled and hangs on to it. The problem is that the let causes the first window that uses the script to stay alive by holding on to one of its elements.
Can you use JS_DumpHeap to find the paths from roots that reach the leaked element? The compile-time Block object should not entrain any runtime values -- that would be a bad bug if it is what is happening.

/be
Created attachment 309357 [details]
JS_DumpHeap session

(In reply to comment #8)
> Can you use JS_DumpHeap to find the paths from roots that reach the leaked
> element? The compile-time Block object should not entrain any runtime values --
> that would be a bad bug if it is what is happening.

Brendan, unless I'm misinterpreting this I believe that is exactly what is happening. I've attached a more detailed DEBUG_CC dump along with the JS_DumpHeap output for the links in the chain... The only reference to the element is from the Block, it looks like.
Blocks are cloned at runtime if eval or a function captures them in its environment, but the compile-time Blocks should not entrain anything in their slots. Yikes. Blake, you around?

/be
Summary: Download Manager leaks nsGlobalWindow → Compile-time let block captures runtime references (was: Download Manager leaks nsGlobalWindow)
Crowder: Do you think you'd be able to have a look at this one? This causes some known leaks, and possibly causing a lot of unknown ones since any XBL that uses 'let' will likely be affected.
Assignee: general → crowder
(Assignee)

Comment 12

10 years ago
It is not only the let blocks that leaks the compile-time scope. regexp literals do the same as the code uses js_NewObject(cx, &js_RegExpClass, NULL, NULL) to create compile-time regular expressions.

Comment 13

10 years ago
It looks like regexp literals are okay, though:
        if (!(tc->flags & TCF_COMPILE_N_GO)) {
            STOBJ_SET_PARENT(obj, NULL);
            STOBJ_SET_PROTO(obj, NULL);
        }

Unless I am misunderstanding the problem?

Comment 14

10 years ago
I'm not a good owner for this bug; it looks like Igor has some thoughts, though?
Assignee: crowder → general
(Assignee)

Updated

10 years ago
Assignee: general → igor
(Assignee)

Comment 15

10 years ago
Created attachment 312251 [details] [diff] [review]
v1

The patch changes js_NewBlockObject to use js_NewObjectWithGivenProto to make sure that both proto and parent slots are null for the newly created let object. For consistency the patch also uses js_NewObjectWithGivenProto for regexp objects when they are created in non-compile-and-go-scripts.
Attachment #312251 - Flags: review?(brendan)

Comment 16

10 years ago
My intent isn't to review, just to understand:  does js_CloseRegExpObject need similar modification?  What about js_CloneBlockObject?
(Assignee)

Comment 17

10 years ago
(In reply to comment #16)
> My intent isn't to review, just to understand:  does js_CloseRegExpObject need
> similar modification?

Do you mean js_CloneRegExpObject ?

>  What about js_CloneBlockObject?

Regexp, block and function clones need the proper parent and proto slots as they are part of the execution scope and are exposed to scripts.

Also note that in the case of regexp and functions the only purpose of objects created during compilation is to hold the corresponding private data (JSRegExp and JSScriptedFiunction).  

Comment on attachment 312251 [details] [diff] [review]
v1

> JSObject *
> js_NewBlockObject(JSContext *cx)
> {
>-    JSObject *obj;
>-    JSBool ok;
>-
>     /*
>      * Null obj's proto slot so that Object.prototype.* does not pollute block
>-     * scopes.  Make sure obj has its own scope too, since clearing proto does
>-     * not affect OBJ_SCOPE(obj).
>+     * scopes.

Suggestion: could the comment just say "Avoid Object.prototype.* name pollution." or something as brief?

> /*
>- * Create, serialize/deserialize, or clone a RegExp object.
>+ * Create a RegExp object. When nullParentAndProto is true, this function sets
>+ * that parent and proto slots of the regexp object to null.

s/that parent/the parent/

/be
Attachment #312251 - Flags: review?(brendan) → review+
Igor: What's the latest progress here?
(Assignee)

Comment 20

10 years ago
The real cause of the leak is the following code from js_NativeGet:

JSBool
js_NativeGet(JSContext *cx, JSObject *obj, JSObject *pobj,
             JSScopeProperty *sprop, jsval *vp)
{
...
    JS_LOCK_SCOPE(cx, scope);
    JS_ASSERT(scope->object == pobj);
    if (SLOT_IN_SCOPE(slot, scope) &&
        (JS_LIKELY(cx->runtime->propertyRemovals == sample) ||
         SCOPE_GET_PROPERTY(scope, sprop->id) == sprop)) {
        LOCKED_OBJ_SET_SLOT(pobj, slot, *vp);
    }

    return JS_TRUE;
}

For code like

function f()
{
    let m = Math;
    return (function() { m.sin(1); })();
}

f();

the interpreter eventually calls js_NativeGet with obj pointing to the cloned let object and pobj pointing to the block object created during the compilation (it is a proto of the clone).

Now, since the compiler creates the properties of the compile-time let block in BindLet from jsparse.c via:

    /* Use JSPROP_ENUMERATE to aid the disassembler. */
    return js_DefineNativeProperty(cx, blockObj, ATOM_TO_JSID(atom),
                                   JSVAL_VOID, NULL, NULL,
                                   JSPROP_ENUMERATE | JSPROP_PERMANENT,
                                   SPROP_HAS_SHORTID,
                                   (intN)data->u.let.index++,
                                   NULL);

then such properties are not shared and are backed by a slot. Thus the code in js_NativeGet would store the passed value in this slot of the compiler-time object leading to a leak.

To Brendan: could you comment on the purpose of that code block in js_NativeGet when obj != pobj?
(Assignee)

Comment 21

10 years ago
One way to fix this bug would be to address bug 427798. The proposed change there would use shared properties for block objects so SLOT_IN_SCOPE(slot, scope) in js_NativeGet would return false.

But this would be sweeping things under the carpet since the real issue is the js_NativeGet leak.
(Assignee)

Comment 22

10 years ago
Here is a test case demonstrating the issue with js shell using countHeap function:

~/m/ff/mozilla/js/src $ cat ~/s/x.js
function f()
{
    let m = {sin: Math.sin};
    (function() { m.sin(1); })();
    return m;
}

var x = f();
gc();
var n = countHeap();
x = null;
gc();

var n2 = countHeap();
if (n2 >= n)
    throw "leak is detected, something roots the result of f";
~/m/ff/mozilla/js/src $ ~/m/ff/mozilla/js/src/Linux_All_DBG.OBJ/js ~/s/x.js
before 20480, after 20480, break 0a009000
before 20480, after 20480, break 0a009000
uncaught exception: leak is detected, something roots the result of f
(Assignee)

Comment 23

10 years ago
Created attachment 314430 [details] [diff] [review]
leak fix v1

This fixes the leak but I am not sure about the consequences.
Attachment #312251 - Attachment is obsolete: true
(Assignee)

Updated

10 years ago
Attachment #314430 - Flags: review?
JSPROP_SHARED was added to avoid such leaks or entrainment bugs. We should use it. It's not sweeping things under the carpet, since the ancient API always would update the proto-slot for a non-shared property.

The latest patch here breaks that and I'm concerned it will break some other user. js_NativeGet is a common path for every native property access except those optimized into js_Interpret. I therefore would much prefer to fix bug 427798 and reject the latest patch here.

/be
(Assignee)

Updated

10 years ago
Attachment #314430 - Attachment is obsolete: true
Attachment #314430 - Flags: review?
(Assignee)

Updated

10 years ago
Depends on: 427798
How we doing here?  We're down to the wire.
Whiteboard: [needs bug 427798]
Poke.  How's this coming?

Comment 27

10 years ago
Seems like this one should be WONTFIXd or dup'd to 427798, which should contain a fix for this issue, as well.
I think a leak bug is a very unlikely candidate for WONTFIX, but at a glance your dup suggestion looks good. :)
I'd be hesitant to mark it a DUP given how vastly different the titles are, but rather simply leave the dependency and mark this one FIXED one the patch for the other bug is checked in...
OK.  Let's just mark this as fixed once the the other is fixed.  Fair enough?
Should be fixed by patch in bug 427798
Status: NEW → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED

Comment 32

10 years ago
/cvsroot/mozilla/js/tests/js1_8/extensions/regress-422269.js,v  <--  regress-422269.js
initial revision: 1.1

mozilla-central:  16422:37bb30d2bf64
Flags: in-testsuite+
Flags: in-litmus-
You need to log in before you can comment on or make changes to this bug.