Closed
Bug 479198
Opened 15 years ago
Closed 15 years ago
"Deep" property cache entries not invalidated when shadowed (TIBCO General Interface regression)
Categories
(Core :: JavaScript Engine, defect, P1)
Core
JavaScript Engine
Tracking
()
VERIFIED
FIXED
mozilla1.9.2a1
People
(Reporter: jesse, Assigned: jorendorff)
References
Details
(Keywords: testcase, verified1.9.1)
Attachments
(6 files, 5 obsolete files)
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_6; en-us) AppleWebKit/525.27.1 (KHTML, like Gecko) Version/3.2.1 Safari/525.27.1 Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b2) Gecko/20081201 Firefox/3.1b3pre It appears that under some circumstances JavaScript memory may not be properly synchronized so that seemingly impossible conditions occur. This is causing TIBCO General Interface Builder 3.7 to fail to load. I've spent a lot of time debugging and the issue seems exceedingly complicated. Whether the bug shows up or not seems to be related to whether and how many logging statements are sent to a separate browser window and the number of sync and async XHR requests. The only consistent way to prevent the bug is to turn off jit.client in about:config. I'm attaching both the 3.7 GA release and a version of Model.js that I have used to debug this issue. In the modified file the "impossible" condition is shown on lines 455-458. var try1 = this.getServer(); jsx3.log("Hello world!"); var try2 = this.getServer(); jsx3.log(try1 + " " + try2); "try1" logs null while "try2" logs the server object. However, the implementation of getServer() clearly does nothing to affect the state of memory (line 563): Model_prototype.getServer = function() { var node = this; while (node) { if (node._jsxserver) return node._jsxserver; node = node._jsxparent; } return null; }; The call to jsx3.log() modifies some parts of memory and may call browser DOM methods but it certainly does not modify anything that might affect the getServer() method. Reproducible: Always Steps to Reproduce: 1. Unzip attachment 2 [review]. Launch GI_Builder.html in Firefox 3.1+ with jit.client enabled 3. Click through security warnings (suggest selecting "remember") Actual Results: Start-up fails. Error console shows "objServer is null" or "x.getServer() is null" usually in Model.js. Or, replace JSX/js/jsx3/app/Model.js with the attached file and then launch GI_Builder. In this case a long red stack track shows up in the Application Monitor external window. Just above it you can see the logging messages corresponding to Model.js lines 455-458: 23:17:15.526 global (INFO) - Hello world! 23:17:15.527 global (INFO) - null @jsx3.app.Server GI_Builder:jsx3.IDE Expected Results: Start-up completes with no such errors. This is the case in Firefox 3.0 or 3.1 with jit.client disabled. Testing on a dual-core mac book and able to reproduce in XP running in Fusion with two cores enabled.
Reporter | ||
Comment 1•15 years ago
|
||
Reporter | ||
Comment 2•15 years ago
|
||
Reporter | ||
Comment 3•15 years ago
|
||
And of course I meant jit.content, not jit.client.
Updated•15 years ago
|
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Firefox → Core
QA Contact: general → general
Summary: JavaScript memory synchronization bug with JIT enabled in 3.1+ → TM: JavaScript memory synchronization bug with JIT enabled in 3.1+
Version: unspecified → Trunk
Comment 4•15 years ago
|
||
Jesse, is this something your tools can reduce?
Reporter | ||
Comment 5•15 years ago
|
||
By "reduce" do you mean, can we work around the issue? or, can we produce a simpler test case?
Comment 6•15 years ago
|
||
He means "produce a simpler test case". And I think he missed the fact that you and I are both "Jesse" ;) I don't think the tools and tricks in http://www.squarefree.com/2009/01/11/reducing-real-world-scripts/ will help here, given what you said in comment 0: "Whether the bug shows up or not seems to be related to whether and how many logging statements are sent to a separate browser window and the number of sync and async XHR requests."
Reporter | ||
Comment 7•15 years ago
|
||
The Jesse Project comes to Mozilla... I spent some time trying to create a simple test case for this bug thinking it was correlated with XPathEvaluator.evaluate or XMLHttpRequest.open/send, both of which we use extensively, but unfortunately I was not successful. Our compiled build, which consists of many fewer files (and so fewer XMLHttpRequest's) seems to be more stable than the source build that I attached. Additionally, I was able to reproduce the bug over http:// as well as file:// so it does not seem to be dependent on XMLHttpRequest to the file system.
Reporter | ||
Comment 8•15 years ago
|
||
I just confirmed that this is still an issue in the latest b3 build. At this point there is no way for us to work around this issue short of instructing all users to disable content jit via about:config. We will have to drop support for Fx 3.5 if this issue is not fixed by 3.5 GM.
Updated•15 years ago
|
Assignee: general → gal
Flags: blocking1.9.1?
Priority: -- → P1
Target Milestone: --- → mozilla1.9.1b4
Comment 9•15 years ago
|
||
Any luck with creating a more reduced testcase? That would be much more useful than making hostile-sounding comments to the bug...
Comment 10•15 years ago
|
||
Does this happen with 3.2a1pre build?
(In reply to comment #9) > hostile-sounding comments Sorry, I was with Jesse when he wrote that and it wasn't meant to be hostile. If he/we can't figure this out then his product won't work with Firefox 3.5, plain and simple. Efforts to reduce the testcase have been documented in comment 0 and comment 7, and so far none of these efforts have resulted in anything that makes sense. (In reply to comment #10) > Does this happen with 3.2a1pre build? Yes, it does.
Status: UNCONFIRMED → NEW
Ever confirmed: true
One interesting data point - the app seems to work properly in a recent debug trunk build...
(In reply to comment #12) Oops, please ignore me. My debug profile had the JIT disabled. Problem persists in a debug build with the JIT enabled.
Updated•15 years ago
|
Flags: blocking1.9.1? → blocking1.9.1+
Updated•15 years ago
|
Summary: TM: JavaScript memory synchronization bug with JIT enabled in 3.1+ → TM: TIBCO General Interface regression with JIT
Updated•15 years ago
|
Assignee: gal → jorendorff
Assignee | ||
Comment 14•15 years ago
|
||
I looked into this briefly, but did not really get anywhere. Here's a slightly reduced test case that reproduces the bug (I just deleted a bunch of plugins and images; if you delete much more, it stops getting far enough to trigger the bug). I am out for two weeks. Resisting the urge to assign this to graydon. :)
Comment 15•15 years ago
|
||
Jesse, are there any other tests in your suite that we fail?
Reporter | ||
Comment 16•15 years ago
|
||
I certainly understand the value of a simple reproducible test case, which is why I have spent time trying to create one. Unfortunately, I only see the problem when loading GI_Builder.html, which happens to be a very complex JavaScript application. None of our pre-existing unit tests fail on 3.1+jit so we're obviously not capturing whatever condition is causing this issue. Since I spent some time trying to create a simple case and failed, I was hoping that you engineers would have some tools at your disposal that would help pinpoint the issue. After some further investigation I can say that the bug behavior does not require XSLT transforms or Sync XHR to show up. Turning off all transforms still led to the same error. And I can compile the first test case such that no files are loaded sync XHR. (I'll attach that test case too). I don't have a clear idea of what other parts of the JavaScript or browser APIs might cause these sorts of synchronization errors. In addition to sync XHR and transforms, we also use async XHR, window.setTimeout, DOM manipulation, and XPCOM extensively in the failing test case. Async XHR and setTimeout are impossible to factor out of the test case. I may be able to get rid of the XPCOM code to make sure that that is not the culprit.
Reporter | ||
Comment 17•15 years ago
|
||
Reporter | ||
Comment 18•15 years ago
|
||
Ok, I can reduce the XPCOM code out and the issue still exists so I guess that's not the issue either.
Reporter | ||
Comment 19•15 years ago
|
||
Attachment #367675 -
Attachment is obsolete: true
Bug 485775 has similar symptoms, though the regression range presented there is more recent than this bug...
Assignee | ||
Comment 21•15 years ago
|
||
The previous tarball doesn't reproduce the problem for me, so I produced this starting from the "slightly reduced test case" tarball. Now it produces a stack trace, but I don't think we're on trace at the point of the error. The bug must happen earlier. Will keep hacking away at this tomorrow.
Assignee | ||
Comment 22•15 years ago
|
||
down to 17000 lines in 9 files
Assignee | ||
Comment 23•15 years ago
|
||
Down to 5299 lines in 7 files.
Attachment #367519 -
Attachment is obsolete: true
Attachment #370309 -
Attachment is obsolete: true
Attachment #370695 -
Attachment is obsolete: true
Assignee | ||
Comment 24•15 years ago
|
||
down to 1315 lines in 4 files Still behaving too strangely to figure out.
Assignee | ||
Comment 25•15 years ago
|
||
function Model(x) {} Model.prototype._parent = null; Model.prototype._engine = null; Model.prototype.getEngine = function() { var node = this; while (node) { if (node._engine) return node._engine; node = node._parent; } return null; }; function add(parent, child) { parent.getEngine(); parent._children.push(child); child._parent = parent; } function Block(init) { this._children = []; } Block.prototype = new Model; var root = new Block; root._engine = {}; var layout = new Block; add(layout, new Block); add(layout, new Block); add(layout, new Block); add(root, layout); var engine = layout._children[0].getEngine(); print(engine ? "no bug" : "bug");
Assignee | ||
Comment 26•15 years ago
|
||
OK, this one records only one trace, not two, so it should be easier to decipher... function Model() {} Model.prototype.parent = null; Model.prototype.engine = null; function Block() {} Block.prototype = new Model; function getEngine(node) { while (node) { if (node.engine) return node.engine; node = node.parent; } return null; } var root = new Block; root.engine = {}; var layout = new Block; getEngine(layout); getEngine(layout); layout.parent = root; var engine = getEngine(layout); print(engine ? "no bug" : "bug");
Assignee | ||
Comment 27•15 years ago
|
||
OK, final simplification: function h(node) { var x = 0; while (node) { x++; node = node.parent; } return x; } var tree = {__proto__: {__proto__: {parent: null}}}; h(tree); h(tree); tree.parent = {}; assertEq(h(tree), 2);
Assignee | ||
Comment 28•15 years ago
|
||
The code emitted for `node.parent` guards on the exact pointer value of `node`, not its shape. The assignment `tree.parent = {}` changes the shape only.
Assignee | ||
Comment 29•15 years ago
|
||
`tree.parent = {}` does end up calling js_PurgeScopeChain(cx, tree), but `tree` is not a delegate, so it does nothing. (Otherwise we would end up in PurgeProtoChain, which would change the shape of the object we're actually guarding on, the one that has the .parent property, ensuring correctness.) The generated code for `node.parent`, after all guards, is walking the prototype chain, which I find strange. That is unrelated though.
Comment 30•15 years ago
|
||
Works in Firefox 3.0.x -- that DELEGATE optimization (bug 454035) bites again -- sorry about that. That change assumes shadowing in a non-delegate will change the shape of that directly referenced object, so invalidate any cached/compiled code for this lookup. What's emitting the code to walk up the prototype chain? /be
Assignee | ||
Comment 31•15 years ago
|
||
(In reply to comment #30) > Works in Firefox 3.0.x -- that DELEGATE optimization (bug 454035) bites again > -- sorry about that. That change assumes shadowing in a non-delegate will > change the shape of that directly referenced object, so invalidate any > cached/compiled code for this lookup. It does change the shape, but that's not enough to invalidate the compiled code (which does not shape-guard, comment 28). Still pondering how to fix. I'll catch up to you on IRC, I hope. > What's emitting the code to walk up the prototype chain? TraceRecorder::prop(JSObject* obj, LIns* obj_ins, uint32& slot, LIns*& v_ins) { ... /* * We're getting a proto-property. Walk up the prototype chain emitting * proto slot loads, updating obj as we go, leaving obj set to obj2 with * obj_ins the last proto-load. */ while (obj != obj2) { obj_ins = stobj_get_slot(obj_ins, JSSLOT_PROTO, dslots_ins); obj = STOBJ_GET_PROTO(obj); } ... } This is easier to fix. It is always safe to teleport in this case (just emit obj2 as a constant), right?
Comment 32•15 years ago
|
||
I do not know why Jesse C-G observed the bug not reproducing with jit.content false -- it should happen (and jorendorff's great reduction does happen) with or without the jit. Anyway, the bug seems not to need the "TM: " and "with JIT" bits of the summary. The property cache distinguishes direct, immediate-prototype, and deep hit cases. The DELEGATE optimization assumes the first two, forgetting that in the deep hit case there is no relation between directly-referenced shape and path through the scope and proto chains to the object holding the found property. Therefore a fix is to purge even when the DELEGATE bit is not set if (a) there is shadowing going on; (b) the shadowed property's object is more than unit distance away from the directly referenced object in the PCVCAP_TAG metric. I.e. if this is a "deep shadowing" hazard. /be
Summary: TM: TIBCO General Interface regression with JIT → TIBCO General Interface regression with
Comment 33•15 years ago
|
||
(In reply to comment #31) > (In reply to comment #30) > > Works in Firefox 3.0.x -- that DELEGATE optimization (bug 454035) bites again > > -- sorry about that. That change assumes shadowing in a non-delegate will > > change the shape of that directly referenced object, so invalidate any > > cached/compiled code for this lookup. > > It does change the shape, but that's not enough to invalidate the compiled code > (which does not shape-guard, comment 28). I know, that's what I was trying to say -- the bad assumption is the whole sentence, including "..., so invalidate any cached/compiled code ...." > > What's emitting the code to walk up the prototype chain? > > TraceRecorder::prop(JSObject* obj, LIns* obj_ins, uint32& slot, LIns*& v_ins) > { > ... > /* > * We're getting a proto-property. Walk up the prototype chain emitting > * proto slot loads, updating obj as we go, leaving obj set to obj2 > with > * obj_ins the last proto-load. > */ > while (obj != obj2) { > obj_ins = stobj_get_slot(obj_ins, JSSLOT_PROTO, dslots_ins); > obj = STOBJ_GET_PROTO(obj); > } > ... > } > > This is easier to fix. It is always safe to teleport in this case (just emit > obj2 as a constant), right? That's what test_property_cache does, indeed. Why is this code there? I forget. /be
Updated•15 years ago
|
Summary: TIBCO General Interface regression with → TIBCO General Interface regression
Comment 34•15 years ago
|
||
Argh, classic mistake #27(http://www.codinghorror.com/blog/archives/000889.html) from last August: $ hg log -r17502 jstracer.cpp changeset: 17502:86ef943700ac user: Brendan Eich <brendan@mozilla.org> date: Wed Aug 13 17:10:18 2008 -0700 summary: Fix prototype hit case in prop to advance obj and obj_ins up the proto chain. My apologies again. The loop in prop (which moved) is bogus. Removing it could well net a perf win. Jason proposed flagging an object being stored as a cache key (trace guard) as a delegate, which it obviously is (the delegate flag should be set when there are one or more, potentially many, saved references to an object, and the refs are used to lookup properties). This could de-optimize non-deep hits on the same object from other sites, but removing that loop could easily win more than is lost, on average for our benchmarks. /be
Assignee | ||
Comment 35•15 years ago
|
||
This fixes the JS test case in comment 27. Applying to my tm-browser tree now, to see if it fixes TIBCO...
Attachment #371271 -
Flags: review?(brendan)
Comment 36•15 years ago
|
||
(In reply to comment #34) > Jason proposed flagging an object being stored as a cache key (trace guard) as > a delegate, which it obviously is (the delegate flag should be set when there > are one or more, potentially many, saved references to an object, and the refs > are used to lookup properties). To be a bit more complete and precise (need to write that paper...) the saved refs are a problem not merely because many to one -- but because many different call sites (call, get, set, what's the diff?). /be
Assignee | ||
Comment 37•15 years ago
|
||
As brendan mentioned to me, v1 only fixes this for the particular "saved reference" that's baked into trace. But there is a saved reference in the property cache too, so the right place to set the flag is in js_FillPropertyCache.
Attachment #371271 -
Attachment is obsolete: true
Attachment #371279 -
Flags: review?(brendan)
Attachment #371271 -
Flags: review?(brendan)
Comment 38•15 years ago
|
||
we need a bug for #34
Comment 39•15 years ago
|
||
Comment on attachment 371279 [details] [diff] [review] v2 >diff --git a/js/src/jsinterp.cpp b/js/src/jsinterp.cpp >--- a/js/src/jsinterp.cpp >+++ b/js/src/jsinterp.cpp >@@ -288,16 +288,22 @@ js_FillPropertyCache(JSContext *cx, JSOb > JS_ASSERT_IF(scopeIndex == 0, > protoIndex != 1 || OBJ_GET_PROTO(cx, obj) == pobj); > if (scopeIndex != 0 || protoIndex != 1) { > khash = PROPERTY_CACHE_HASH_ATOM(atom, obj, pobj); > PCMETER(if (PCVCAP_TAG(cache->table[khash].vcap) <= 1) > cache->pcrecycles++); > pc = (jsbytecode *) atom; > kshape = (jsuword) obj; >+ >+ /* >+ * Make sure that a later shadowing assignment will enter >+ * PurgeProtoChain and invalidate this entry, bug 479198. >+ */ >+ OBJ_SET_DELEGATE(cx, obj); Comment also the thread-safety design intentions here (jsobj.h makes it reasonably clear that the caller of OBJ_SET_DELEGATE has to ensure no races to update obj->classword lose bits): only the DELEGATE bit may be set at runtime, never cleared; SYSTEM must be set at birth when obj is guaranteed inaccessible to other threads. r=me with that, thanks. /be
Attachment #371279 -
Flags: review?(brendan) → review+
Assignee | ||
Comment 40•15 years ago
|
||
Confirmed, this fixes the full test case (in the first attachment above). Thanks for the complete test case, Jesse. Truly something is better than nothing. http://hg.mozilla.org/tracemonkey/rev/606a2df1ed0b I forgot to add the comment Brendan asked for. Follow-up push coming.
Summary: TIBCO General Interface regression → "Deep" property cache entries not invalidated when shadowed (TIBCO General Interface regression)
Assignee | ||
Comment 42•15 years ago
|
||
That earlier link should have been: http://hg.mozilla.org/tracemonkey/rev/805dd817a3d4 And the MT comment is: http://hg.mozilla.org/tracemonkey/rev/c4cc47f6a93c
Comment 43•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/805dd817a3d4
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 44•15 years ago
|
||
I just tested with the 3.6a1 nightly and I see no problems with JIT enabled or disabled. Thanks Jason, Brendan, Ben and others for tracking this down and fixing it!
Comment 45•15 years ago
|
||
Don't thank me, I introduced the bug with that freaky DELEGATE optimization of mine (I'm an optimist, it's a problem... I'm working on it...). All praise to jorendorff's heroic reduction/deduction work. /be
Comment 46•15 years ago
|
||
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/12b4fb4695b1
Keywords: fixed1.9.1
Comment 47•15 years ago
|
||
Verified fixed with testcase given in comment 27 on trunk and 1.9.1 with the following debug builds: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090422 Minefield/3.6a1pre ID:20090422224452 Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b4pre) Gecko/20090422 Shiretoko/3.5b4pre ID:20090422122043
Status: RESOLVED → VERIFIED
Hardware: x86 → All
Target Milestone: mozilla1.9.1b4 → mozilla1.9.2a1
Comment 48•15 years ago
|
||
js1_8_1/trace/trace-test.js http://hg.mozilla.org/tracemonkey/rev/61892f57b46a
Flags: in-testsuite+
Comment 49•15 years ago
|
||
cvsroot/mozilla/js/tests/js1_8_1/trace/trace-test.js,v <-- trace-test.js new revision: 1.14; previous revision: 1.13 /cvsroot/mozilla/js/tests/shell.js,v <-- shell.js
You need to log in
before you can comment on or make changes to this bug.
Description
•