Closed
Bug 611653
Opened 14 years ago
Closed 14 years ago
Assertion failure: obj->containsSlot(slot)
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla2.0b12
Tracking | Status | |
---|---|---|
blocking2.0 | --- | betaN+ |
People
(Reporter: bc, Assigned: dmandelin)
References
()
Details
(Keywords: assertion, Whiteboard: [hardblocker][has patch][fixed-in-tracemonkey])
Attachments
(5 files, 7 obsolete files)
1. install Spider
2. firefox -spider -url 'http://gamehay.net/' -start -quit
3. Assertion failure: obj->containsSlot(slot), at /work/mozilla/builds/2.0.0/mozilla/js/src/jsinterp.cpp:5316
Doesn't appear to happen on mac at least just loading the page in the browser.
See also: bug 596145
2.0.0 mac/win/linux
Assertion failure: obj->containsSlot(slot), at /work/mozilla/builds/2.0.0/mozilla/js/src/jsinterp.cpp:5316
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
0x0649e465 in JS_Assert (s=0x696d7cf "obj->containsSlot(slot)", file=0x696bb38 "/work/mozilla/builds/2.0.0/mozilla/js/src/jsinterp.cpp", ln=5316) at /work/mozilla/builds/2.0.0/mozilla/js/src/jsutil.cpp:80
80 *((int *) NULL) = 0; /* To continue from here in GDB: "return" then "continue". */
(gdb) bt
#0 0x0649e465 in JS_Assert (s=0x696d7cf "obj->containsSlot(slot)", file=0x696bb38 "/work/mozilla/builds/2.0.0/mozilla/js/src/jsinterp.cpp", ln=5316) at /work/mozilla/builds/2.0.0/mozilla/js/src/jsutil.cpp:80
#1 0x063cc633 in js::Interpret () at /work/mozilla/builds/2.0.0/mozilla/js/src/jsinterp.cpp:5316
#2 0x063daea7 in js::RunScript (cx=0x22d18530, script=0x1ee56aa0, fp=0x1000048) at jsinterp.cpp:665
#3 0x063dbc63 in js::Invoke (cx=0x22d18530, argsRef=@0xbfffce54, flags=8192) at jsinterp.cpp:768
#4 0x063dc35b in js::ExternalInvoke (cx=0x22d18530, thisv=@0xbfffceb8, fval=@0xbfffcef8, argc=1, argv=0xbfffd0f8, rval=0xbfffd170) at jsinterp.cpp:881
#5 0x06311511 in js::ExternalInvoke (cx=0x22d18530, obj=0x22c5f888, fval=@0xbfffcef8, argc=1, argv=0xbfffd0f8, rval=0xbfffd170) at jsinterp.h:954
#6 0x06311669 in JS_CallFunctionValue (cx=0x22d18530, obj=0x22c5f888, fval={asBits = 18446462629381047232, s = {payload = {i32 = 583435200, u32 = 583435200, boo = 583435200, str = 0x22c683c0, obj = 0x22c683c0, ptr = 0x22c683c0, why = 583435200}, tag = JSVAL_TAG_OBJECT}, asDouble = -nan(0xf000722c683c0), asPtr = 0x22c683c0}, argc=1, argv=0xbfffd0f8, rval=0xbfffd170) at /work/mozilla/builds/2.0.0/mozilla/js/src/jsapi.cpp:4908
#7 0x058ed612 in nsXPCWrappedJSClass::CallMethod (this=0x22b83c30, wrapper=0x23351b00, methodIndex=3, info=0x889e28, nativeParams=0xbfffd40c) at /work/mozilla/builds/2.0.0/mozilla/js/src/xpconnect/src/xpcwrappedjsclass.cpp:1694
#8 0x058e4589 in nsXPCWrappedJS::CallMethod (this=0x23351b00, methodIndex=3, info=0x889e28, params=0xbfffd40c) at /work/mozilla/builds/2.0.0/mozilla/js/src/xpconnect/src/xpcwrappedjs.cpp:577
#9 0x06150bf8 in PrepareAndDispatch (self=0x24966d20, methodIndex=3, args=0xbfffd534) at /work/mozilla/builds/2.0.0/mozilla/xpcom/reflect/xptcall/src/md/unix/xptcstubs_unixish_x86.cpp:93
#10 0x0614b73b in nsXPTCStubBase::Stub3 (this=0x24966d20) at xptcstubsdef.inc:1
#
Reporter | ||
Comment 2•14 years ago
|
||
I was able to reproduce 1 out of 8 tries with Mac OS X 10.5 intel and 1 out of 3 tries with WinXP and 0 out of 3 tries on Linux.
Comment 3•14 years ago
|
||
I can repro this under a debugger on Linux. The assertion hits in JSOP_GETGLOBAL which wants to access slot 152 where the (global) object's slotSpan() is 124.
Bob, can you think of any way that the spider addon is doing anything with script globals that could break compile-n-go assumptions (e.g., adding or removing properties)? The assert happens at the very beginning of 'handleEvent' on http://gamehay.net/game/js/media.js line 57, which is called to handle the event of an XHR OnStopRequest.
Reporter | ||
Comment 4•14 years ago
|
||
When Spider starts and before the content page is loaded in the xul:browser element, Spider sets up a number of global variables in its own window object, a window watcher for dialogs, a console listener and a load handler for the xul:browser element where the content page is loaded. In this case Spider doesn't need to load the external hook script to crash, so I think that covers most of what would violate the assumption. The source (a little out of date) can be found at <http://hg.mozilla.org/automation/sisyphus/file/687aa4ec9cf1/spider/chrome/content/spider>
Comment 6•14 years ago
|
||
Is this the same bug as bug 614714 or bug 613619? Asking because bug 614714 has a shell test case which triggers this assert.
Comment 7•14 years ago
|
||
Bug 620991 covers comment 5.
No longer depends on: 605015
Whiteboard: [orange]
Updated•14 years ago
|
Assignee | ||
Comment 8•14 years ago
|
||
I repro this in Windows about 1 out of 3. The global object that it crashes on is weird:
object 090040A0
class 04A4F44C Window
flags: delegate branded has_equality inDictionaryMode
properties:
enumerate readonly permanent "window": slot 123
proto <XPC_WN_ModsAllowed_NoCall_Proto_JSClass object at 09006150>
parent null
private 0B5B38C0
slots:
0 (reserved) = undefined
1 (reserved) = undefined
[ Many more undefined reserved slots ]
120 (reserved) = undefined
121 (reserved) = <unnamed function at 090072D0 (JSFunction at 090072D0)>
122 (reserved) = <RegExpStatics object at 090040C8>
123 = <Proxy object at 09008048>
That's printed out after the crash in the debugger. I went and made it print out the global when it is compiling the script, and it's an object with the same address, and the same first 124 properties, and then many more properties, with property 'http' in slot 152, which is exactly what the code is trying to do.
Interestingly, this repros iff I first get an assertion dialog "XPConnect is being called on a scope without a 'Components' property!". (Actually, I get it twice before the crash.) That assertion is generated from this code in xpcwrappednativescope.cpp:
// This is pretty much always bad. It usually means that native code is
// making a callback to an interface implemented in JavaScript, but the
// document where the JS object was created has already been cleared and the
// global properties of that document's window are *gone*. Generally this
// indicates a problem that should be addressed in the design and use of the
// callback code.
NS_ERROR("XPConnect is being called on a scope without a 'Components' property!");
That would seem to explain why we see an empty Window object.
Assignee | ||
Comment 9•14 years ago
|
||
Bob, do you think this might be a bug in Spider? Comment 8 says we are tripping an XPConnect assertion about when we make a callback. It seems to me that's probably a bug in the extension's use of JSAPI, or maybe something about event generation and delegation in the browser (DOM/xpconnect)?
Reporter | ||
Comment 10•14 years ago
|
||
David, it could very well be a bug in Spider. I'll see what I can find out.
Assignee | ||
Comment 11•14 years ago
|
||
(In reply to comment #10)
> David, it could very well be a bug in Spider. I'll see what I can find out.
OK, unblocking for now. Let us know if you find out that it is a browser bug.
blocking2.0: betaN+ → -
Comment 12•14 years ago
|
||
I hit this all the time with a vanilla debug build and google voice. We should block on this.
blocking2.0: - → ?
Comment 13•14 years ago
|
||
Assertion failure: obj->containsSlot(slot), at ../../../js/src/jsinterp.cpp:5334
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000
0x0000000101a707d6 in JS_Assert (s=0x101f56554 "obj->containsSlot(slot)", file=0x101f56a88 "../../../js/src/jsinterp.cpp", ln=5334) at ../../../js/src/jsutil.cpp:80
80 *((int *) NULL) = 0; /* To continue from here in GDB: "return" then "continue". */
(gdb) bt 10
#0 0x0000000101a707d6 in JS_Assert (s=0x101f56554 "obj->containsSlot(slot)", file=0x101f56a88 "../../../js/src/jsinterp.cpp", ln=5334) at ../../../js/src/jsutil.cpp:80
#1 0x000000010199697a in js::Interpret () at ../../../js/src/jsinterp.cpp:5334
#2 0x00000001019a47f8 in js::RunScript (cx=0x126db0a90, script=0x11bf087c0, fp=0x1189060b0) at jsinterp.cpp:657
#3 0x00000001019a5612 in js::Invoke (cx=0x126db0a90, argsRef=@0x7fff5fbfb310, flags=8192) at jsinterp.cpp:737
#4 0x00000001019556f5 in CallOrConstructBoundFunction (cx=0x126db0a90, argc=1, vp=0x118906038) at ../../../js/src/jsfun.cpp:2290
#5 0x00000001019a75bf in js::CallJSNative (cx=0x126db0a90, native=0x10195550b <CallOrConstructBoundFunction(JSContext*, unsigned int, js::Value*)>, argc=1, vp=0x118906038) at jscntxtinlines.h:692
#6 0x00000001019a5406 in js::Invoke (cx=0x126db0a90, argsRef=@0x7fff5fbfb500, flags=0) at jsinterp.cpp:700
#7 0x00000001019a5e51 in js::ExternalInvoke (cx=0x126db0a90, thisv=@0x7fff5fbfb590, fval=@0x7fff5fbfb5c8, argc=1, argv=0x7fff5fbfbc48, rval=0x7fff5fbfb840) at jsinterp.cpp:858
#8 0x00000001018d5262 in js::ExternalInvoke (cx=0x126db0a90, obj=0x11b8e0058, fval=@0x7fff5fbfb5c8, argc=1, argv=0x7fff5fbfbc48, rval=0x7fff5fbfb840) at jsinterp.h:961
#9 0x00000001018d539d in JS_CallFunctionValue (cx=0x126db0a90, obj=0x11b8e0058, fval={asBits = 18445477441071579648, debugView = {payload47 = 4757225984, tag = JSVAL_TAG_OBJECT}, s = {payload = {i32 = 462258688, u32 = 462258688, why = 462258688, word = 18445477441071579648}}, asDouble = -nan(0xb80011b8d8200), asPtr = 0xfffb80011b8d8200}, argc=1, argv=0x7fff5fbfbc48, rval=0x7fff5fbfb840) at ../../../js/src/jsapi.cpp:5019
(More stack frames follow...)
(gdb) call js_DumpObject(obj)
No symbol "obj" in current context.
(gdb) up
#1 0x000000010199697a in js::Interpret () at ../../../js/src/jsinterp.cpp:5334
5334 JS_ASSERT(obj->containsSlot(slot));
(gdb) call js_DumpObject(obj)
object 0x11b100120
class 0x126e3a038 Window
flags: delegate branded has_equality inDictionaryMode
properties:
enumerate readonly permanent "window": slot 117
proto <XPC_WN_ModsAllowed_NoCall_Proto_JSClass object at 0x11b103210>
parent null
private 0x11d325f60
slots:
0 (reserved) = undefined
1 (reserved) = undefined
2 (reserved) = undefined
3 (reserved) = undefined
4 (reserved) = undefined
5 (reserved) = undefined
6 (reserved) = undefined
7 (reserved) = undefined
8 (reserved) = undefined
9 (reserved) = undefined
10 (reserved) = undefined
11 (reserved) = undefined
12 (reserved) = undefined
13 (reserved) = undefined
14 (reserved) = undefined
15 (reserved) = undefined
16 (reserved) = undefined
17 (reserved) = undefined
18 (reserved) = undefined
19 (reserved) = undefined
20 (reserved) = undefined
21 (reserved) = undefined
22 (reserved) = undefined
23 (reserved) = undefined
24 (reserved) = undefined
25 (reserved) = undefined
26 (reserved) = undefined
27 (reserved) = undefined
28 (reserved) = undefined
29 (reserved) = undefined
30 (reserved) = undefined
31 (reserved) = undefined
32 (reserved) = undefined
33 (reserved) = undefined
34 (reserved) = undefined
35 (reserved) = undefined
36 (reserved) = undefined
37 (reserved) = undefined
38 (reserved) = undefined
39 (reserved) = undefined
40 (reserved) = undefined
41 (reserved) = undefined
42 (reserved) = undefined
43 (reserved) = undefined
44 (reserved) = undefined
45 (reserved) = undefined
46 (reserved) = undefined
47 (reserved) = undefined
48 (reserved) = undefined
49 (reserved) = undefined
50 (reserved) = undefined
51 (reserved) = undefined
52 (reserved) = undefined
53 (reserved) = undefined
54 (reserved) = undefined
55 (reserved) = undefined
56 (reserved) = undefined
57 (reserved) = undefined
58 (reserved) = undefined
59 (reserved) = undefined
60 (reserved) = undefined
61 (reserved) = undefined
62 (reserved) = undefined
63 (reserved) = undefined
64 (reserved) = undefined
65 (reserved) = undefined
66 (reserved) = undefined
67 (reserved) = undefined
68 (reserved) = undefined
69 (reserved) = undefined
70 (reserved) = undefined
71 (reserved) = undefined
72 (reserved) = undefined
73 (reserved) = undefined
74 (reserved) = undefined
75 (reserved) = undefined
76 (reserved) = undefined
77 (reserved) = undefined
78 (reserved) = undefined
79 (reserved) = undefined
80 (reserved) = undefined
81 (reserved) = undefined
82 (reserved) = undefined
83 (reserved) = undefined
84 (reserved) = undefined
85 (reserved) = undefined
86 (reserved) = undefined
87 (reserved) = undefined
88 (reserved) = undefined
89 (reserved) = undefined
90 (reserved) = undefined
91 (reserved) = undefined
92 (reserved) = undefined
93 (reserved) = undefined
94 (reserved) = undefined
95 (reserved) = undefined
96 (reserved) = undefined
97 (reserved) = undefined
98 (reserved) = undefined
99 (reserved) = undefined
100 (reserved) = undefined
101 (reserved) = undefined
102 (reserved) = undefined
103 (reserved) = undefined
104 (reserved) = undefined
105 (reserved) = undefined
106 (reserved) = undefined
107 (reserved) = undefined
108 (reserved) = undefined
109 (reserved) = undefined
110 (reserved) = undefined
111 (reserved) = undefined
112 (reserved) = undefined
113 (reserved) = undefined
114 (reserved) = undefined
115 (reserved) = <unnamed function at 0x11b102e00 (JSFunction at 0x11b102e00)>
116 (reserved) = <RegExpStatics object at 0x11b100168>
117 = <Proxy object at 0x11b105068>
(gdb) p slot
$1 = 157
(gdb) call DumpJSStack()
0 anonymous() ["https://www.google.com/voice/resources/3747311955-gc_prod.js":434]
this = [object Object]
1 anonymous() ["https://www.google.com/voice/resources/3747311955-gc_prod.js":433]
this = [object Object]
2 anonymous([object Event @ 0x11ac6e9c0 (native @ 0x127083910)]) ["https://www.google.com/voice/resources/3747311955-gc_prod.js":433]
this = [object Object]
(gdb)
Updated•14 years ago
|
blocking2.0: ? → betaN+
Could be a dupe of bug 623863
Assignee | ||
Comment 15•14 years ago
|
||
Andreas, (a) are you still getting this, and (b) if so do you get the XPConnect "assertion" dialog before the containsSlot assertion? (If not it might be an OS thing.) It still seems to me this is a bug in either XPConnect or the add-on where a function is getting called on a window that is closed. Do we look for an xpconnect bug? File bugs on Google Voice for Fx? Change xpconnect so that it squelches the function call in that condition? I think we need some help from xpconnect experts here.
Comment 16•14 years ago
|
||
I can run with a debug build and see if this reproduces. Though, this is a JS engine bug. What slot is not found here?
Comment 17•14 years ago
|
||
(google voice uses flash by the way, and flash is know to make a lot of weird calls into JS, but still, we shouldn't assert here)
Assignee | ||
Comment 18•14 years ago
|
||
(In reply to comment #16)
> I can run with a debug build and see if this reproduces. Though, this is a JS
> engine bug. What slot is not found here?
I'm not convinced it is a JS engine bug. See comment 8: the reason the slot is not found is that we are calling a function that belongs (i.e., has as its global) a window that has been torn down, which clears out its slots. Asserting this way is new because of the global var optimizations: we just read the slot directly because it's compile-n-go. Maybe we should start making calls to closed windows fail before they even start.
Comment 19•14 years ago
|
||
The api doesn't disallow doing clear scope and then calling a function, does it?
(In reply to comment #19)
> The api doesn't disallow doing clear scope and then calling a function, does
> it?
It seems like it should. If a global object has permanent properties, the only reason those properties should be removed is if the object is no longer reachable by language code. Deleting them, and then invoking code which has baked in assumptions about the object layout, is bizarre.
We don't want to introduce shape checks on GETGLOBAL, it'd cost us perf (we already lost some on SETGLOBAL for spec conformance).
It's hard to imagine that the JS code would run properly, even with shape guards. Why exactly is this being done?
Comment 21•14 years ago
|
||
The clear should compensate for this and refuse to remove read-only properties. That might be a security issue though (for the outer window, not for the inner).
Even if it's only permanent properties (that would fix this bug), what does it mean to grab a random script in compile-n-go code, clear out the global, and try to run it again?
Comment 23•14 years ago
|
||
I am ok with throwing in the API if we can.
Comment 24•14 years ago
|
||
This seems to be causing the orange in bug 620991. Specifically, it seems to be breaking a clearInterval call in one of our tests, which is interfering with mochitest-chrome runs on Windows XP.
See bug 620991 comment 73 and later for an analysis of what's happening here.
Assignee | ||
Updated•14 years ago
|
Whiteboard: hardblocker
Assignee | ||
Comment 25•14 years ago
|
||
Smaug, can you help us out here? This is related to nsGlobalWindow::ClearScope. Recap of the problem so you don't have to read all the history above:
STR are in comment 0. The basic idea is that we are calling JS scripted functions that belong to a window that has been ClearScope'd. In Fx4.0, this is very bad, because there is a new optimization for global variables that requires that the global object not be invalidated. I guess we might someday not call ClearScope, but for now, it would probably be OK to prevent those calls from going through. (See Details below for full details.)
Does stopping those calls make sense? jst suggested that you would know an efficient way to detect those calls and the right places to detect them.
Details:
Following those STR, we assert in the JS engine in a bad way (may crash) like this:
1. nsGlobalWindow::ClearScope gets called (via either ReallyClearScope or SetDocShell, it looks like).
2. An event handler tries to call a JS function that belongs to that global. Specific steps here: The outer event loop generates an XHR stop-request event, calling XHR::OnStopRequest. This detects that the target isn't really there anymore, so it generates an XHR error. This generates a DOM change event (ready state change, I think), and that tries to call a JS event handler function.
3. DEBUG_CheckForComponentsInScope detects that we are calling a method on a window with cleared scope and shows a dialog (in debug builds, of course).
4. We continue trying to call the event handler, calling JS_CallFunctionValue. This calls a JS scripted function.
5. We assert trying to read the value of a global variable declared with 'var'.
The reason this now asserts is that for a read of a global variable declared with 'var' (or any other permanent property of the global object), we compile that into a direct slot read for any compile-and-go script. So, a script that is compiled with compile-and-go (a) must only ever be called with that same global object, and (b) may not be called after that global is JS_ClearScope'd. Making a call that violate (a) or (b) is now a JSAPI violation and can be expected to crash.
Comment 26•14 years ago
|
||
Event listeners don't get called if the inner window their (current)target is
bound to is not the current inner window of the outer window.
But is this a case where the event listener lives in a different context
than the event target (XHR)?
I'll try to reproduce this.
Comment 27•14 years ago
|
||
Can't reproduce on linux.
Comment 28•14 years ago
|
||
Will try on osx.
But anyway, if the event listener is from different context than than the
target of the event, there isn't much DOM can do.
XPConnect might be able to detect?
Comment 29•14 years ago
|
||
This triggers the same assertion, without any event handling.
Comment 30•14 years ago
|
||
Comment on attachment 505053 [details]
a minimal testcase
And DEBUG_CheckForComponentsInScope doesn't seem to detect this case.
Comment 31•14 years ago
|
||
Just in case this is better for debugging...
Comment 32•14 years ago
|
||
(In reply to comment #25)
> The reason this now asserts is that for a read of a global variable declared
> with 'var' (or any other permanent property of the global object), we compile
> that into a direct slot read for any compile-and-go script. So, a script that
> is compiled with compile-and-go (a) must only ever be called with that same
> global object, and (b) may not be called after that global is JS_ClearScope'd.
> Making a call that violate (a) or (b) is now a JSAPI violation and can be
> expected to crash.
I don't understand how anything in DOM could prevent that happening.
It is rather radical change to the API, IMO.
(Not that I understand JS API too well :) )
In the testcase it is just js in window A using a js function from window B,
and window B just happened to be destroyed already.
Comment 33•14 years ago
|
||
Why do we clear scope that object if the window is destroyed? That seems unnecessary.
Comment 34•14 years ago
|
||
Blake or jst should know. See also Bug 470510.
Comment 35•14 years ago
|
||
Not sure if this is relevant too: http://mxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp?mark=1905-1909#1885
(bug 303981 and bug 304459)
Comment 36•14 years ago
|
||
This doesn't destroy the window the same way as previous test, yet the assertion crash happens, though not 100% reliably.
Comment 37•14 years ago
|
||
FYI, on 3.6 I get
WARNING: Not same origin error!: file /home/smaug/mozilla/hg/192/dom/base/nsJSEnvironment.cpp, line 478
JavaScript error: javascript:var%20foo%20=%20'bar';%20function%20test()%20{%20return%20foo;%20};%20void(0);, line 1: foo is not defined
Comment 38•14 years ago
|
||
(In reply to comment #33)
> Why do we clear scope that object if the window is destroyed? That seems
> unnecessary.
It's basically leftovers from the pre-inner/outer window era. We've wanted to remove that for some time now, but doing so causes leaks in some cases that we have yet to track down.
Assignee | ||
Comment 39•14 years ago
|
||
smaug, jst, IIUC, as long as we call JS_ClearScope when 'nuking' the global window, DOM can't prevent a violation of our new API requirement, because it can be done with pure JS (windows A, B, fun on A has ref to fun on B, B gets closed, call fun on A), where DOM has no opportunity to even see the violation.
So, I think we should try not to call JS_ClearScope. Two ways to do this come to mind:
1. Stop calling JS_ClearScope and debug the memory leaks.
2. Instead of calling JS_ClearScope, set all the property values of the global object to |undefined| (possibly using a new JSAPI function for this purpose).
#2 seems hackier but I'm not sure we have the time for #1. Do you think these memory leaks could reasonably be debugged in time for Fx4?
Comment 40•14 years ago
|
||
#2 is not sound from a security perspective. The property names alone might leak information. Debugging leaks is a real pain though and we might not catch all of them (only those our tests currently finger).
Comment 41•14 years ago
|
||
smaug, do you have cycles to debug the leaks we've seen from removing the JS_ClearScope calls in the past? IIRC we were seeing leaks during mochitest runs, so try could help there. Peter, do you think this is too risky at this point, from a leak point of view or otherwise?
Andreas, I don't see a security problem with #2, leaving the properties on the old inner with undefined values is no worse than not clearing scope at all AFAICT.
Comment 42•14 years ago
|
||
As long we make a new JSAPI and call that only in the case that the window is going dead that might work, but we can't make JS_ClearScope behave like this.
Assignee | ||
Comment 43•14 years ago
|
||
(In reply to comment #42)
> As long we make a new JSAPI and call that only in the case that the window is
> going dead that might work, but we can't make JS_ClearScope behave like this.
Yes, it would be some new API function that no one else is allowed to use. I was kind of hoping that we might be able to get away with just calling JS_DeleteProperty or the like on just a few selected things, but I don't know how hard it would be to find those things.
Comment 44•14 years ago
|
||
(In reply to comment #41)
> smaug, do you have cycles to debug the leaks we've seen from removing the
> JS_ClearScope calls in the past? IIRC we were seeing leaks during mochitest
> runs, so try could help there. Peter, do you think this is too risky at this
> point, from a leak point of view or otherwise?
Changing things which might cause leaks at this point sounds risky.
Also, are the leaks really the only problem; is
http://mxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp?mark=1905-1909#1885 not a problem anymore?
Also, are we talking about all the ClearScope calls in nsGlobalWindow, or
only the one Bug 470510 removes? That bug removes the call from
nsGlobalWindow::FreeInnerObjects and seems to fix my testcases.
(well, the patch doesn't apply to trunk).
Assignee | ||
Comment 45•14 years ago
|
||
(In reply to comment #44)
> (In reply to comment #41)
> > smaug, do you have cycles to debug the leaks we've seen from removing the
> > JS_ClearScope calls in the past? IIRC we were seeing leaks during mochitest
> > runs, so try could help there. Peter, do you think this is too risky at this
> > point, from a leak point of view or otherwise?
>
> Changing things which might cause leaks at this point sounds risky.
I kind of figured that too.
> Also, are the leaks really the only problem; is
> http://mxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp?mark=1905-1909#1885
> not a problem anymore?
>
> Also, are we talking about all the ClearScope calls in nsGlobalWindow, or
> only the one Bug 470510 removes? That bug removes the call from
> nsGlobalWindow::FreeInnerObjects and seems to fix my testcases.
> (well, the patch doesn't apply to trunk).
We would be removing only JS_ClearScope calls after which a method can be called with that global object; I'm not sure what set that is. I'm also still unclear on which is the global object, the inner or outer window. AFAIK only the real global object matters.
A thought occurs to me for a simple hack that might solve the immediate problem here. We could provide a variant of JS_ClearScope that is exactly like the current one, except that (a) it doesn't shrink the slot vector (so that reads to it are memory safe) and (b) sets all the values in the slot vector to |undefined| (so that reads are type safe and don't leak information). But the properties would still be gone, so that unoptimized property reads or queries would be errors, exactly as for JS_ClearScope'd objects.
I think that might not quite work, because I think there is code in ClearScope that does stuff to the object after calling JS_ClearScope, which could shrink its slot vector at that point. But maybe we can solve that.
Taking a broader view of the problem might help, too. If we could figure out how this would ideally work, that might point to an approximation we can hack up quickly. The fundamental problem with these "bad" calls is that the system (a) creates permanent properties on the global object, then (b) deletes those properties, and (c) then runs scripts that were compiled for that global. Deleting permanent properties on a live object seems bad. I don't know exactly what requirements the JS_ClearScopes are providing on the DOM side, but it seems we should be able to find a way to satisfy them a better way.
Comment 46•14 years ago
|
||
I would imagine that a JS_ClearScope variant that simply set all properties to undefined and nothing else would do the trick here. I doubt any of the other stuff the JS_ClearScope() does is important in this case.
Comment 47•14 years ago
|
||
(In reply to comment #44)
> Changing things which might cause leaks at this point sounds risky.
Agreed.
> Also, are the leaks really the only problem; is
> http://mxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp?mark=1905-1909#1885
> not a problem anymore?
mrbkap should confirm this, but I doubt that's relevant any longer since we brain transplant the outer window nowadays, so nothing should leak even w/o clearing the outer's scope. It's the inner that's important here AFAICT.
> Also, are we talking about all the ClearScope calls in nsGlobalWindow, or
> only the one Bug 470510 removes? That bug removes the call from
> nsGlobalWindow::FreeInnerObjects and seems to fix my testcases.
> (well, the patch doesn't apply to trunk).
Long term they should all go, short term maybe replace all of them with a new variant that doesn't actually delete the properties?
Comment 48•14 years ago
|
||
Add an inline method to JSObject that clears alll slots. Call this instead of JS_ClearScope() right before the window dies.
Assignee | ||
Comment 49•14 years ago
|
||
(In reply to comment #48)
> Created attachment 505664 [details] [diff] [review]
> patch
>
> Add an inline method to JSObject that clears alll slots. Call this instead of
> JS_ClearScope() right before the window dies.
Doesn't branding make it unsafe to do that without changing shape? What if there is a PIC or property cache entry that points to a branded slot value, and then we clear it, it gets GC'd, and then we try to call it?
We also call JS_DefineProperty and JS_ClearRegExpStatics on the inner window after calling JS_ClearScope, so we have to make sure whatever we do doesn't trash the object so badly those fail, or else do something different there.
Assignee | ||
Comment 50•14 years ago
|
||
I'm starting to think maybe we shouldn't fix this. It doesn't seem like that serious of a problem--the worst that can happen is an invalid read, and it seems only to happen with extensions. (Andreas, are you still seeing this with Google Voice?)
Anyway, I tried removing the JS_ClearScope call from nsJSContext::ClearScope and that had leaks on mochitests 1, 2, and 4. I don't think we have time to debug that.
Otherwise, we could try clearing out the slot values, while keeping the properties, but:
- The 'window' property must not be cleared out. Currently, the code reads that property and then defines it again after JS_ClearScope as non-writable, non-configurable. So it seems we would need to either (a) skip clearing that slot, or (b) after blanking everything, make that property writable, set it, and then make it non-writable again.
- Branded objects and method-valued properties are also a problem. For them, I guess we could unbrand the object? Not sure if we need to do anything else.
It's hard to see a solution that isn't pretty hacky. So I worry we'd run a high risk of regressions trying to fix this.
I'd rather go with a simpler mitigating measure, but I'm not sure what that could be, either. We could try not shrinking the slots. But I think GC can shrink them anyway at any time.
Or maybe we could overwrite the slots only for the "builtin" properties of the global object (global reserved slots? not sure exactly what they would be)? Not sure if there's a way to do that.
Comment 51•14 years ago
|
||
What, not fixing this?
The assertion may happen with any web page. I could assume that several
even a bit more complex web apps will trigger it - it is just very easy
to write such code even accidentally.
Assignee | ||
Comment 52•14 years ago
|
||
(In reply to comment #51)
> What, not fixing this?
> The assertion may happen with any web page. I could assume that several
> even a bit more complex web apps will trigger it - it is just very easy
> to write such code even accidentally.
I'm just not convinced we have any quick fix that hurts us less than this bug. I'm trying a super-hacky variant of clearing the values, though. Maybe it will work.
Assignee | ||
Comment 53•14 years ago
|
||
This patch fixes the comment 0 test case (we get the called on thing with no components warning, but no assert in JS). Currently running on try.
I started with Andreas's patch. It turns out that:
- we better not clear the regexp statics, because nsJSContext::ClearSlots wants to clear that
- |obj| can be an outer window proxy, in which case we have to call JS_ClearSlots as before
I unbrand the object before clearing, to avoid any future problems there. This can fail (should be very rare), which is a small problem. In that case, I just call JS_ClearSlots as usual. It seems we could also consider not doing anything in that case,
Assignee: general → dmandelin
Attachment #505664 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Assignee | ||
Comment 54•14 years ago
|
||
(continuing sentence cut off in comment 53), in which case we would leak. Again, that should be very rare.
Comment 55•14 years ago
|
||
What does unbrand do? And when does it fail?
Comment 56•14 years ago
|
||
Catching up on bugmail, sorry for not getting here sooner.
Dave, unbrand does not fail, rather it returns false if shape overflow could not be cured by a GC. No error, rather a disable-caches/abort-or-bail-traces thing.
JS_ClearScope is a bad old sharp API from the dark ages. We really need to debug those leaks, but in the mean time, changing it to leave the object's slots vector intact and setting almost all slots to undefined seems best.
/be
Assignee | ||
Comment 57•14 years ago
|
||
Attachment #506851 -
Flags: review?(jwalden+bmo)
Assignee | ||
Updated•14 years ago
|
Attachment #506851 -
Flags: review?(Olli.Pettay)
Assignee | ||
Updated•14 years ago
|
Attachment #506048 -
Attachment is obsolete: true
Comment 58•14 years ago
|
||
Comment on attachment 506851 [details] [diff] [review]
v3
>- JS_ClearScope(mContext, obj);
>+ // Hack fix for bug 611653. This replaces a call to JS_ClearScope, which
>+ // was required to avoid leaks. The intent of this code is to try to avoid
>+ // the leaks without shrinking the slot array by clearing the values of
>+ // the object.
>+ if (obj->isNative())
Could you add a comment why this 'if ()'
Also, style nit,
if () {
} else {
}
even if there is just one statement.
I'll test this a bit still today, once my build is ready.
Updated•14 years ago
|
Attachment #506851 -
Flags: review?(mrbkap)
Comment 59•14 years ago
|
||
Comment on attachment 506851 [details] [diff] [review]
v3
>+JS_FRIEND_API(bool)
>+js_UnbrandAndClearSlots(JSContext *cx, JSObject *obj)
>+{
>+ JS_ASSERT(obj->isNative());
>+ JS_ASSERT(obj->isGlobal());
>+
>+ if (!obj->unbrand(cx))
>+ return false;
>+
>+ for (int key = JSProto_Null; key < JSProto_LIMIT * 3; key++)
Not that I'm reviewing js/ part of the patch at all, but where is the * 3 coming?
And nit, ++key is faster than key++ (unless the latter is optimized).
Seems to work in the cases I tried. Things just became undefined.
But mrbkap or jst should review this too.
Attachment #506851 -
Flags: review?(Olli.Pettay) → review+
Assignee | ||
Comment 60•14 years ago
|
||
This just picks up Smaug's stylistic comments, so I won't spam everyone with canceled and new review requests.
Comment 61•14 years ago
|
||
For an int or other POD type ++ prefix vs. postfix does not matter if the result is not used.
Now the if-else is overbraced, contrary to JS house style! Smaug, we do not follow the Gecko C++ style guide of always brace (not all Gecko hackers do, either, for short control-flow-transfer consequents: break/continue/return).
Bleah.
/be
Comment 62•14 years ago
|
||
Oh, but that if-else was in nsJSEnvironment.cpp -- then (brace yourself dmandelin) it was using the Gecko brace style, but it wrongly indents by 4 not 2.
/be
Comment 63•14 years ago
|
||
Comment on attachment 506897 [details] [diff] [review]
v4 (fixes style, comments)
>+ for (int key = JSProto_Null; key < JSProto_LIMIT * 3; key++)
>+ JS_SetReservedSlot(cx, obj, key, JSVAL_VOID);
Aha, now I can finally make a substantive comment:
JSProto_LIMIT * 3 should be a macro #defined in jsapi.h to the value from JSCLASS_GLOBAL_FLAGS:
#define JSCLASS_GLOBAL_FLAGS \
(JSCLASS_IS_GLOBAL | \
JSCLASS_HAS_RESERVED_SLOTS(JSProto_LIMIT * 3 + JSRESERVED_GLOBAL_SLOTS_COUNT))
/be
Assignee | ||
Comment 64•14 years ago
|
||
Attachment #506897 -
Attachment is obsolete: true
Assignee | ||
Updated•14 years ago
|
Attachment #506851 -
Flags: review?(jwalden+bmo) → review?(cam)
Assignee | ||
Comment 65•14 years ago
|
||
Attachment #506963 -
Attachment is obsolete: true
Assignee | ||
Comment 66•14 years ago
|
||
Comment on attachment 506851 [details] [diff] [review]
v3
Argh, sent review to cam on wrong bug. (Should have been for bug 628612.)
Attachment #506851 -
Flags: review?(cam)
Comment 67•14 years ago
|
||
Comment on attachment 506972 [details] [diff] [review]
v5 (use global reserve slots formula)
>+ // Hack fix for bug 611653. Originally, this always called JS_ClearScope,
>+ // which was required to avoid leaks. But for native objects, the JS
>+ // engine has an optimization that requires that permanent properties of
>+ // the global object are never deleted. So instead, we call a new special
>+ // API that clears the values of the global, thus avoiding leaks without
>+ // We just overwrote all slots to undefined, so the freelist has
>+ // been trashed. We need to clear the head pointer or else we will
>+ // crash later. This leaks slots but the object is all but dead
>+ // anyway.
>+ if (obj->hasPropertyTable())
>+ obj->lastProperty()->table->freelist = SHAPE_INVALID_SLOT;
. . .
>+/*
>+ * Hack fix for bug 611653: Do not use for any other purpose.
>+ *
>+ * Unbrand and set all slot values to undefined (except reserved slots that
>+ * are not used for cached prototypes).
>+ */
>+JS_FRIEND_API(bool) js_UnbrandAndClearSlots(JSContext *cx, JSObject *obj);
These don't follow prevailing comment and declaration style but that may be good, since they're hack fix parts and should not be used or emulated.
shaver used to like this when I did it, inspired by old BSD Unix code I'd seen (I think Bill Joy code):
#ifdef HACK
{ int ugly; /* two space indentation, { on same line as first stmt! */
. . .
}
#endif
Breaks all the rules!
/be
Comment 68•14 years ago
|
||
Comment on attachment 506972 [details] [diff] [review]
v5 (use global reserve slots formula)
So I hope you will accept my drive-by r+.
/be
Attachment #506972 -
Flags: review+
Assignee | ||
Comment 69•14 years ago
|
||
(In reply to comment #68)
> Comment on attachment 506972 [details] [diff] [review]
> v5 (use global reserve slots formula)
>
> So I hope you will accept my drive-by r+.
>
> /be
Of course! :-)
http://hg.mozilla.org/tracemonkey/rev/b5ca98debed0
Whiteboard: hardblocker → hardblocker, fixed-in-tracemonkey
Updated•14 years ago
|
Attachment #506851 -
Flags: review?(mrbkap) → review+
Comment 70•14 years ago
|
||
Since this landed, every single opt Mac mochitest-chrome has failed with the same set of 8 things starting with "test_autocomplete2.xul | [SimpleTest/SimpleTest.js, window.onerror] An error occurred - parseInt is not a function at chrome://global/content/bindings/autocomplete.xml:0"
Assignee | ||
Comment 71•14 years ago
|
||
Backed out due to the Mac (opt-only actually) orange:
http://hg.mozilla.org/tracemonkey/rev/36a4ecbcb4be
It looks like "test_autocomplete.xul", which runs only on Mac, is screwing things up.
Whiteboard: hardblocker, fixed-in-tracemonkey → hardblocker
Assignee | ||
Comment 72•14 years ago
|
||
The mochitest failures seem to be timing-related. They happen only on Mac opt builds. And even a few printfs seems enough to make them not reproduce. It seems that some global window object is probably getting closed while a test is running, and so the test sees undefined values it doesn't expect. I really don't know what global window objects are in play, and it may be hard to figure out without being able to print or debug. Anyway, I'll continue tomorrow.
But it might be better just to disable those tests on Mac: this bug is pretty serious and it could take a long time to figure the mochitest failure out, unless someone out there has good knowledge about how mochitest-chrome works, and what global window objects might be getting cleared in these tests.
Comment 73•14 years ago
|
||
Could we take one more run at figuring out the leaks that result from not doing this bogus clearing/voiding? I know that looks hard but maybe dbaron or peterv could see something the rest of us can't.
/be
Comment 74•14 years ago
|
||
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/b5ca98debed0
http://hg.mozilla.org/mozilla-central/rev/36a4ecbcb4be (backout)
Note: not marking as fixed because last changeset is a backout.
Assignee | ||
Comment 75•14 years ago
|
||
Attachment #506851 -
Attachment is obsolete: true
Attachment #506972 -
Attachment is obsolete: true
Attachment #508561 -
Flags: review?(enndeakin)
Updated•14 years ago
|
Whiteboard: hardblocker → [hardblocker][has patch]
Assignee | ||
Comment 76•14 years ago
|
||
Attachment #508561 -
Attachment is obsolete: true
Attachment #508580 -
Flags: review?(enndeakin)
Attachment #508561 -
Flags: review?(enndeakin)
Comment 77•14 years ago
|
||
Comment on attachment 506851 [details] [diff] [review]
v3
Assuming the errantly-mis-transferred r? should have been re-requested, r+ on it, at least as far as total awful hacks go.
Attachment #506851 -
Flags: review+
Comment 78•14 years ago
|
||
Comment on attachment 508580 [details] [diff] [review]
v7 (better fix autocomplete tests)
>+ if (obj->isNative()) {
>+ js_UnbrandAndClearSlots(mContext, obj);
Note you are ignoring the r.v. (not a bug, see below, though).
>+JS_FRIEND_API(bool)
>+js_UnbrandAndClearSlots(JSContext *cx, JSObject *obj)
>+{
>+ JS_ASSERT(obj->isNative());
>+ JS_ASSERT(obj->isGlobal());
>+
>+ if (!obj->unbrand(cx))
>+ return false;
This is a silent failure but you ignore it. As noted in comment 56, unbrand is not fallible. Its return status tells whether to proceed assuming property-caching/guarding/PIC'ing or not (shape overflow and exhaustion).
So just make this hack API void and don't check unbrand's r.v.
/be
Updated•14 years ago
|
Attachment #508580 -
Flags: review?(enndeakin) → review+
Assignee | ||
Comment 79•14 years ago
|
||
Landed with Brendan's latest changes.
http://hg.mozilla.org/tracemonkey/rev/6882211c4a80
Whiteboard: [hardblocker][has patch] → [hardblocker][has patch][fixed-in-tracemonkey]
Comment 80•14 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Comment 81•14 years ago
|
||
I think this caused the other part of bug 631494.
Comment 82•14 years ago
|
||
comment 81: Yes, I already have a fix.
Updated•13 years ago
|
Target Milestone: --- → mozilla2.0b12
You need to log in
before you can comment on or make changes to this bug.
Description
•