Closed Bug 456384 Opened 11 years ago Closed 11 years ago

TM: v8-richards.js benchmark opens a print dialog in browser with JIT enabled

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla1.9.1b2

People

(Reporter: smakabwe, Assigned: jorendorff)

References

()

Details

(Keywords: verified1.9.1)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b1pre) Gecko/20080922032349 Minefield/3.1b1pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b1pre) Gecko/20080922032349 Minefield/3.1b1pre

When you run the v8 benchmark suite from google with tracemonkey enabled it causes a print dialog to appear. This does not happen with JIT disabled. 

I noticed this about a week and a half ago(give or take) and it never happened prior to then when a ran the suite.

Couln't find a dupe. Sorry if it is one, I would have expected a bug to already be filed.

Reproducible: Always

Steps to Reproduce:
1.enable JIT in content
2.go to the link(the v8 benchmark)
3.wait, a print dialog should appear
Actual Results:  
A print dialog appears

Expected Results:  
No print dialog appears and the benchmark is run.
Component: General → JavaScript Engine
Product: Firefox → Core
Version: unspecified → Trunk
Confirming. I see this using  Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b1pre) Gecko/20080922032349 Minefield/3.1b1pre following the reporter's STR.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Tracemonkey ( JIT ) causes the V8 benchamrk suite to open a print dialog. → Tracemonkey ( JIT ) causes the V8 benchmark suite to open a print dialog.
Does anyone know if this affects Linux or Mac. My Linux partition got corrupted and I haven't reinstalled it yet, so I can't check.
Happens for sure on Mac using Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b1pre) Gecko/20080922020341 Minefield/3.1b1pre.
OS: Windows XP → All
Hardware: PC → All
Requesting blocking 1.9.1 as it is a serious regression of the v8 benchmark. People will be very disappointed if they learn that the JIT / TraceMonkey stuff not makes things faster, but crashes the browser...
Flags: blocking1.9.1?
Blocks: 451602
I don't see this crashing; it just doesn't give the correct result since part of the benchmark fails.
The print dialog appearing is a bug in the test, it calls print() if part of the test fails:
http://code.google.com/apis/v8/richards.js (see the end of function runRichards()).

The fact that the test fails is likely a TM bug. I thought there was an existing bug on this, though.
Summary: Tracemonkey ( JIT ) causes the V8 benchmark suite to open a print dialog. → TM: v8-richards.js benchmark opens a print dialog in browser with JIT enabled
This is apparently the same bug that causes error messages when you run this test in the JS shell:

Error during execution: queueCount = 4, holdCount = 1.
Error during execution: queueCount = 0, holdCount = 0.
Error during execution: queueCount = 0, holdCount = 0.
...
I can confirm this on Windows XP SP3, Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b1) Gecko/20081007 Firefox/3.1b1.
Flags: blocking1.9.1? → blocking1.9.1+
Priority: -- → P1
Target Milestone: --- → mozilla1.9.1b2
taking
Assignee: nobody → jorendorff
Jason beat me to it I guess :)
The main loop of the test looks like this:

Scheduler.prototype.schedule = function () {
  this.currentTcb = this.list;
  // (A)
  while (this.currentTcb != null) {
    // (B)
    if (this.currentTcb.isHeldOrSuspended()) {
      this.currentTcb = this.currentTcb.link;
    } else {
      this.currentId = this.currentTcb.id;
      this.currentTcb = this.currentTcb.run();
    }
  }
};

The object this.currentTcb.task is different each time through the loop; there are 4 task constructors: WorkerTask, HandlerTask, IdleTask, DeviceTask.

The problem is that WorkerTasks and HandlerTasks have the same shape.  So on trace, our shape guards pass when they should fail.

When filling the property cache we are supposed to brand objects with unique shapes as necessary.  For some reason this doesn't happen before or during recording here.

If I add a line that says
    this.currentTcb.task.run;  // not actually calling the method
or even just
    WorkerTask.prototype.run;
at point (B), it magically starts working.  Adding the same line at point (A) has no effect.
The brand-on-method-fill choice for Firefox 3 was conservative, to avoid branding on method assignment and risking Twhatever regressions. This was safe when the interpreter is the only user of the property cache. But for TM it's completely broken: it leaves the trace recorder with the wrong shape for too many cases where only the pre-opcode hook is used to generate code.

We need to brand on method assignment, in at least js_SetPropertyHelper. Also in js_DefineNativeProperty, to be safe. Need a common helper, called when the value being defined or set is a function. The rest of the branding code can move from js_FillPropertyCache to the helper.

/be
Smaller test case:

function X() { this.x = 1; }
X.prototype.getName = function () { return "X"; }

function Y() { this.x = 2; }
Y.prototype.getName = function () { return "Y"; }

function test() {
    var a = [new X, new X, new X, new X, new Y];
    var s = '';
    for (var i = 0; i < a.length; i++)
        s += a[i].getName();
    return s;
}

print(test() == 'XXXXY' ? 'PASS' : 'FAIL');
Have to brand on method assignment, not cache fill -- no other way, and it's the right thing.

/be
Forcing early branding doesn't fix this test case.  Here's the LIR emitted for the CALLPROP in question (annotated):

    // Guard that a[i] is native.
    ld7 = ld ld6[0]       // ld6 is a[i]; ld7 is its scope
    ops = ld ld7[4]
    ld8 = ld ops[16]
    guard(native-map) = eq ld8, OP(&js_ObjectOps)
    xf4: xf guard(native-map) -> 0x30315f sp+16 rp+0

    // Guard that a[i] is the expected shape.
    // Both X instances and Y instances have this shape.
    shape = ld ld7[16]    // scope->shape
    guard(kshape) = eq shape, 146
    xf5: xf guard(kshape) -> 0x30315f sp+16 rp+0

    // Guard that X.prototype is native.
    ld9 = ld obj2[0]  // obj2 is X.prototype; ld9 is its scope
    ops = ld ld9[4]
    ld10 = ld ops[0]
    guard(native-map) = eq ld10, OP(&js_ObjectOps)
    xf6: xf guard(native-map) -> 0x30315f sp+16 rp+0

    // Guard that X.prototype is still the expected shape.
    shape = ld ld9[16]
    guard(vshape) = eq shape, 147
    xf7: xf guard(vshape) -> 0x30315f sp+16 rp+0

What isn't here is a check that a[i] actually has X.prototype as its prototype.

I don't think branding on method assignment would fix this.
Should the specific prototype of an object be part of its shape?
(In reply to comment #15)
> Forcing early branding doesn't fix this test case.

You tried?

Branding on assignment means the prototype method assignments give X.prototype and Y.prototype different shapes.

> Here's the LIR emitted for
> the CALLPROP in question (annotated):
[...]
> What isn't here is a check that a[i] actually has X.prototype as its prototype.

We don't need to check that because we know at recording time that a[i] shares X.prototype's shape. Therefore if it later grew its own getName property that shadowed the proto-prop, the trace would side-exit (cache miss in interpreter corresponds to this event).

Since X.prototype and Y.prototype have the same shape, but not the same getName, when the trace is compiled, we cannot cover this case as the interpreter does by brand-on-fill. We must brand on method assignment.

> I don't think branding on method assignment would fix this.

It would, see above.

/be
(In reply to comment #16)
> Should the specific prototype of an object be part of its shape?

No. See comment 17.

/be
(In reply to comment #12)
> The brand-on-method-fill choice for Firefox 3 was conservative, to avoid
> branding on method assignment and risking Twhatever regressions. This was safe
> when the interpreter is the only user of the property cache. But for TM it's
> completely broken: it leaves the trace recorder with the wrong shape for too
> many cases where only the pre-opcode hook is used to generate code.

In this case I think the tracer gets the right shape. record_JSOP_CALLPROP calls test_property_cache, which does this:

http://hg.mozilla.org/tracemonkey/file/a14880ce7249/js/src/jstracer.cpp#l3785

(In reply to comment #17)
> > Forcing early branding doesn't fix this test case.
>
> You tried?

Yep.

> Branding on assignment means the prototype method assignments give X.prototype
> and Y.prototype different shapes.

Yes, but |new X| and |new Y| still have the same shape, after the constructor.

> > Here's the LIR emitted for
> > the CALLPROP in question (annotated):
> [...]
> > What isn't here is a check that a[i] actually has X.prototype as its prototype.
> 
> We don't need to check that because we know at recording time that a[i] shares
> X.prototype's shape.

In this case, it doesn't.  a[i] has a property, a[i].x, that X.prototype lacks.
(In reply to comment #19)
> (In reply to comment #12)
> > The brand-on-method-fill choice for Firefox 3 was conservative, to avoid
> > branding on method assignment and risking Twhatever regressions. This was safe
> > when the interpreter is the only user of the property cache. But for TM it's
> > completely broken: it leaves the trace recorder with the wrong shape for too
> > many cases where only the pre-opcode hook is used to generate code.
> 
> In this case I think the tracer gets the right shape. record_JSOP_CALLPROP
> calls test_property_cache, which does this:

I know, but that's too late. X.prototype and Y.prototype have the same shape, but should not because they have different getName method *values*.

> (In reply to comment #17)
> > > Forcing early branding doesn't fix this test case.
> >
> > You tried?
> 
> Yep.

Patch?

> > Branding on assignment means the prototype method assignments give X.prototype
> > and Y.prototype different shapes.
> 
> Yes, but |new X| and |new Y| still have the same shape, after the constructor.

Then you didn't brand on assignment of X.prototype.getName and Y.prototype.getName.

> > We don't need to check that because we know at recording time that a[i] shares
> > X.prototype's shape.
> 
> In this case, it doesn't.  a[i] has a property, a[i].x, that X.prototype lacks.

What property x? There is no a[i].x, and a[i] for all i in 0 to 5 has no direct properties.

/be
(In reply to comment #20)
> > In this case, it doesn't.  a[i] has a property, a[i].x, that X.prototype lacks.
> 
> What property x? There is no a[i].x, and a[i] for all i in 0 to 5 has no direct
> properties.

The constructors X and Y each assign to this.x.  (If you remove that, the bug goes away.)

Here's the small test case with early branding:


function X() { this.x = 1; }
X.prototype.getName = function () { return "X"; }

function Y() { this.x = 2; }
Y.prototype.getName = function () { return "Y"; }

// force X.prototype and Y.prototype to have different shapes
new X().getName();
new Y().getName();
print(shapeOf(X.prototype));
print(shapeOf(Y.prototype));

function test() {
    var a = [new X, new X, new X, new X, new Y];
    var s = '';
    for (var i = 0; i < a.length; i++)
        s += a[i].getName();
    return s;
}

print(test() == 'XXXXY' ? 'PASS' : 'FAIL');
Ok, sorry I missed those .x assignments in the constructors. The question remains why the interpreter logic leads to a PASS while the JIT generates code a FAIL.

There are three cases:

1. o.p is a direct property of o, or o is unmutated and shares o.__proto__'s scope and shape, and o.__proto__ has direct property p.

2. o.p is in o.__proto__ and o has its own scope/shape. This is a special case because nothing can "get inbetween" o and o.__proto__ without revising o.__proto__'s shape (see js_SetProtoOrParent/GC_SET_SLOT_REQUEST). It allows caching for all o with same shape.

3. o.p is not in o or o.__proto__ and those two objects have distinct scope/shape. This case requires the cache key to be o, the object pointer, not o's shape.

The case at hand is case 2. How did the interpreter get it right, but the JIT get it wrong? That is the question to answer.

/be
(In reply to comment #15)
>     // Guard that X.prototype is still the expected shape.
>     shape = ld ld9[16]
>     guard(vshape) = eq shape, 147

Is 147 the shape of the (force-branded) X.prototype, or Y.prototype?

> What isn't here is a check that a[i] actually has X.prototype as its prototype.

You don't need to check exact prototype identity with shapes, as the case analysis in comment 22 should make clear, combined with the rules for shape regeneration on scope revision and shadowing (which need to be written down in a paper due next month!). It's sufficient to check the shape of the immediate prototype object in case 2 (the cache stores this in the entry's vcap member).

If you have different shapes for X.prototype and Y.prototype, then the guard on X.prototype's shape should side-exit when the new Y at the end of the array is reached. Why isn't that working?

/be
(In reply to comment #22)
> 2. o.p is in o.__proto__ and o has its own scope/shape. This is a special case
> because nothing can "get inbetween" o and o.__proto__ without revising
> o.__proto__'s shape (see js_SetProtoOrParent/GC_SET_SLOT_REQUEST). It allows
> caching for all o with same shape.

Here, o instanceof X.  There are other objects, Y instances, with the same shape as o, but a different prototype.

> The case at hand is case 2. How did the interpreter get it right, but the JIT
> get it wrong? That is the question to answer.

I'm not sure why the interpreter gets it right. I'll look into it.

(In reply to comment #23)
> If you have different shapes for X.prototype and Y.prototype, then the guard on
> X.prototype's shape should side-exit when the new Y at the end of the array is
> reached. Why isn't that working?

The guard is checking the shape of X.prototype (the vshape of the property cache entry), not the shape of a[i].__proto__.
QA Contact: general → general
It works in the interpreter because PROPERTY_CACHE_TEST actually does check a[i].__proto__.  We need to mirror that in the generated LIR.
Attached patch v1Splinter Review
~/dev/moz/tracemonkey/js/src/v8$ ../Darwin_OPT.OBJ/js run-richards.js 
Richards: 181
----
Score: 181
~/dev/moz/tracemonkey/js/src/v8$ ../Darwin_OPT.OBJ/js -j run-richards.js 
Richards: 327
----
Score: 327

I don't know if that number is good or bad, but it's not printing error messages anymore.
Attachment #344992 - Flags: review?(brendan)
Bigger is better with the v8 benchmarks. W00t! Reviewing...

/be
Comment on attachment 344992 [details] [diff] [review]
v1

Great work, sorry I didn't see it earlier (plumbers plus kid-sitting plus lack of coffee, what can I say?).

/be
Attachment #344992 - Flags: review?(brendan) → review+
The brand-on-fill interpreter-biased cache operation is not a problem so long as we always exit a trace on shape guard mismatch (equivalent of cache miss), and fill from the interpreter before re-tracing.

/be
http://hg.mozilla.org/tracemonkey/rev/b7e5c7883e42
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
the trace-test part is in js1_8_1/trace/trace-test.js
Flags: in-testsuite?
Flags: in-litmus-
please set back to RESOLVED FIXED after landing in mozilla-central.  Thanks.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Already landed in mozilla-central.
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Flags: in-testsuite? → in-testsuite+
verified 1.9.1, 1.9.2 for testCallProtoMethod
Status: RESOLVED → VERIFIED
(In reply to comment #30)
> http://hg.mozilla.org/tracemonkey/rev/b7e5c7883e42

Without a deep investigation nor a proper background on this, it sounds like the "testCallProtoMethod" contents don't exactly match those of comment 13. In particular, this line doesn't sound to me as valid JavaScript:

+    Y.prototype.getName = function() "Y";

Shouldn't there be a return (or is it implicit), and maybe a pair of curly braces? Is this to be interpreted similarly as the "X" function? I'd see this (probably naively) as a script error or, at most, as:

  function(){};"Y";

If this passes as valid/expected, can' this be a source of scope errors? Please correct me if I'm wrong (and, if possible, point out a few references).
(In reply to comment #35)
> If this passes as valid/expected, can' this be a source of scope errors? Please
> correct me if I'm wrong (and, if possible, point out a few references).

This is an example of an expression closure, which is a SpiderMonkey extension to ECMAScript 3. You can read more about them at <https://developer.mozilla.org/en/New_in_JavaScript_1.8#Expression_closures>.
You need to log in before you can comment on or make changes to this bug.