Closed Bug 473256 Opened 11 years ago Closed 11 years ago

TM: Wrong function called via `f()` when global f is reassigned on trace

Categories

(Core :: JavaScript Engine, defect, P1)

Other Branch
defect

Tracking

()

RESOLVED INCOMPLETE
mozilla1.9.1

People

(Reporter: jorendorff, Assigned: Waldo)

Details

(Whiteboard: [next beta])

Attachments

(1 file, 1 obsolete file)

function g() { print("g"); }
function h() { print("h"); }
a = [g, g, g, g, h];
for (i=0; i<5; i++) { f = a[i];  f(); }

Without JIT, the output is
g
g
g
g
h

With JIT,
g
g
g
g
g

Not sure what's going on here.  We deliberately do not guard on the identity of the function being called.  It is supposed to be guaranteed by the property cache.  I think this particular bit of code *does* go through CALLNAME, which should hit the propcache.  Perhaps the SETNAME at f=a[i] should cause us to bail off trace.
Flags: blocking1.9.1+
Priority: -- → P1
Assignee: general → brendan
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.9.1
...and a voice from on high spoke to me, and lo! a potential insight into what might or might not be the problem!  And spoken with such utter uncertainty, too!

The set of the property "f" uses JSOP_SETNAME.  In JSOP_SETNAME I'd bet what's happening now is that we hit the "fastest path" and enter the NATIVE_SET to set f to its new value.  Inside NATIVE_SET we have another fast path which uses LOCKED_OBJ_WRITE_BARRIER.  That uses GC_WRITE_BARRIER, and if we go to there we see logic to update the scope's shape when a property is changed to/from a function value.  We're not doing this shape update to the scope on trace, nor are we detecting that we should be and leaving the trace.  I bet that's the root of the problem, but I haven't stepped through with a debugger to be sure.

If I'm right about this, this seems like shades of bug 471214...
Brendan, you mind if I steal?
Assignee: brendan → jwalden+bmo
Attached patch Patch (obsolete) — Splinter Review
Comment in the patch sez all.
Comment on attachment 364193 [details] [diff] [review]
Patch

>+        /*
>+         * Shapes are determined by an object's properties, their types, and
>+         * the order in which those properties were created.  Types as used for
>+         * shapes consist primarily of one per trace-time type,

This isn't accurate: shapes do not depend on types, only on ordered list of property names, attributes, getter/setter, etc. (all the stuff matched in jsscope.cpp as part of the JSPropertyTreeNode hashing and identity ops) and (if the scope whose shape we're considering is branded) on function values of any function-valued properties.

> but every
>+         * function occupies a separate type class to enable function-call
>+         * optimizations.

"type class" suggests the wrong type-theoretic meaning. How about "every method (function-valued property) that is ever called inhabits a separate type"?

>+         * On trace, however, we can take advantage of type requirements to
>+         * optimize this shape-changing.  Normal type matching

s/type matching/shape guarding/

>+.......... handles the
>+         * to-function and from-function transitions, but the
>+         * function-to-function transition we must handle ourselves with the
>+         * following guard.  This is only a problem for properties of the
>+         * global object because we assume the global object's shape is
>+         * invariant during tracing.  See also bug 473256.

I would reorder thins paragraph so you put the "only a problem for ... the global object" first. The way it reads suggests there is a problem for other objects.

>+         */
>+        if (VALUE_IS_FUNCTION(cx, r))
>+            guard(true, lir->ins2(LIR_eq, r_ins, INS_CONST(r)), MISMATCH_EXIT);

Need INS_CONSTPTR(JSVAL_TO_OBJECT(r)) here, for 64-bit portability.

/be
Attached patch Take twoSplinter Review
Let's try these comments instead -- some of what you suggested, some unrequested clarifications.

The guard line goes smack up to just before column 100, so no single-lining.
Attachment #364193 - Attachment is obsolete: true
Attachment #364211 - Flags: review?(brendan)
Comment on attachment 364211 [details] [diff] [review]
Take two

>+        /*
>+         * Shapes are determined by an object's properties, various details of
>+         * each property (slot number, getter/setter, flags, attributes),

This should be tighter: "Scope shape is determined by the ordered list of property names and other attributes (flags, getter/setter, etc.) but not value or type -- except for function-valued properties that have been invoked at least once."

>...         the
>+         * type of each property value,

This is not true.

>...         and the order in which those properties
>+         * were created.

Combined up front in suggested rewording.

>...         For the purpose of generating shapes, every function
>+         * inhabits a separate type,

It isn't clear to someone new and poking around if the function is the object whose scope has a shape here, or the function is the value of one (potentially of many) properties of an object with shape. But the suggested rewording covers this. If this paragraph should stay it would do better to emphasize the branding of scopes whose objects are the receivers of method calls.

>...         and all non-function values inhabit a
>+         * final type.

This implies a separation that is not made in the code between methods and non-methods. But once you add to an object that has called methods, the shape is regenerated on every extension. This is sub-optimal, and it should be fixed. No need to over-specify it in this comment, though.

>...         This function-based distinction enables various
>+         * function-call optimizations.

This motivational part is right on, although brief and short on details. ;-)

>+         * This shape requirement is addressed in the interpreter by branding
>+         * the scope and updating any branded scope's shape every time the
>+         * value changes to or from a function or when one function value is
>+         * modified to a different function value; see GC_WRITE_BARRIER.

See also js_FillPropertyCache, the SCOPE_{IS,SET}_BRANDED usage. Again worth qualifying that branding happens on first method call, not assignment of method.

>+         * On trace the shape requirement is mostly handled by normal shape
>+         * guarding.  However, the global object's shape is assumed invariant
>+         * during tracing,

or "the global object's shape is required to be invariant at trace recording time"

>...         and since a function-to-function transition changes

"can change" (same-value does not, as in the test).

>+         * shape, we must handle this edge case separately with the following
>+         * guard.  See also bug 473256.

/be
Attachment #364211 - Flags: review?(brendan) → review+
http://hg.mozilla.org/tracemonkey/rev/d69277360499
Whiteboard: fixed-in-tracemonkey
Oh, thanks to Brendan/Andreas for discussing this with me and helping us eventually get to the right fix for the problem, much appreciated.
h-118:~/moz/js-tm jwalden$ cat /tmp/guard.js 
var global;
var o = {};
var a = [o, o, o, o, o, o, function(){}];
for (var i = 0, sz = a.length; i < sz; i++)
  void(global = a[i]);
print(shapeOf(this));
h-118:~/moz/js-tm jwalden$ obj-i386-apple-darwin8.11.1/dist/bin/js -f /tmp/guard.js 
134
h-118:~/moz/js-tm jwalden$ obj-i386-apple-darwin8.11.1/dist/bin/js -j -f /tmp/guard.js 
152
recorder: started(1), aborted(0), completed(1), different header(0), trees trashed(0), slot promoted(0), unstable loop variable(0), breaks(0), returns(0), unstableInnerCalls(0)
monitor: triggered(1), exits(1), type mismatch(0), global mismatch(0)


This suggests to me that VALUE_IS_FUNCTION(cx, r) really should be !JSVAL_IS_PRIMITIVE(r).  Thoughts?
http://hg.mozilla.org/mozilla-central/rev/d69277360499
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Indeed the general bug remains, as this test shows:

var x;
function f() { print("f"); }
function g() { print("g"); }
var a = [f, f, f, g, g];
for (var i = 0, n = a.length; i < n; i++) {
    x = a[i];
    x();
}
print(shapeOf(this));

Without -j, output is:

f
f
f
g
g
135

With -j, output is:

f
f
f
f
f
152

Need a new bug to track this, unless you want to back out and reopen.

Guarding on the identity of objects stored in global variables seems a bad idea. We need to avoid this "dog that didn't bark" problem of assuming the global shape not changing during recording means it wouldn't change later if we ran in the interpreter.

The obvious way to do it is to generate code for JSOP_CALLNAME when the base object is the global object. This would avoid penalizing global variable sets, which we must not slow down (cf. t/bitops-bitwise-and.js). When calling a function via a global name, we must guard on the identity of the function.

/be
Backing out entirely seems over-aggressive, but we should certainly reopen.  More to do here still...
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: fixed-in-tracemonkey
Backing out cuz the fix wasn't a fix and won't be part of a further patch seems like the right thing, esp. with a REOPEN. Or patch quickly. Sayrer will keep up, I'm sure (kudos to him for all the merging).

/be
oops, I took this on the branch 

http://hg.mozilla.org/releases/mozilla-1.9.1/rev/51cd40efb382

Should I untake it?
I can't reproduce comment 11, suspect it was a pre-change shell rather than post-change.  Will try to make comment 9 into a test that demonstrates behavioral change beyond shape evolution...
Let's not worry about untaking -- best to fix quickly as noted in comment 13. Bug hygiene wants back-out on REOPEN, but since the patch that landed did fix the testcase in this bug, seems to me the simpler course is to use a followup bug. Or patch really quickly and we can all get on with our lives -- but don't untake.

/be
Could we get this marked FIXED and fixed1.9.1 and a followup P1 blocker filed, please, along with an ETA for the follow-up fix if we're to hold b3 for it?
Whiteboard: [next beta]
I ended up backing out the original patch from TM partly because we decided it was incomplete here and partly because it might have been the cause of a perf regression.  However, I did open bug 481246 as requested to deal with followup work (which, at this point, may just mean rewriting entirely).  At this point perhaps it's best just to close this bug and let that one take over, I guess.  Incomplete, maybe?  Bleh, I hate bug resolutions sometimes.
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → INCOMPLETE
did we confirm that the fix was the cause of a large perf regression? If so, it has to come off the branch, too.
See bug 481103 comment 1 for what looks like denial rather than confirmation. But since Waldo backed out of TM, and because we know the patch was at best a partial fix, why not back it out of m-c and 1.9.1 too?

/be
Or see bug 481103 comment 2 -- perhaps this was one of two or more regressions that hit dromaeo, and backing it out helped but didn't unregress everything.

Sorry about the m-c gaffe, I should have written "why not back it out of 1.9.1 too?" in comment 20.

/be
How safe is embedding of object identities into the jited code given that the code is not flushed during the GC? I.e. what prevents an evil script from constructing a very different object after a GC with the same address as the one embedded in the trace?
You need to log in before you can comment on or make changes to this bug.