Bug 606650 (Closed)
Opened 14 years ago, closed 14 years ago
Heuristics make us fall into JM on the attached testcases, where it's over 3x slower than tracing
Categories: Core :: JavaScript Engine, defect
Status: RESOLVED FIXED
Tracking: blocking2.0 betaN+
People: Reporter: bzbarsky; Assigned: billm
References: Blocks 1 open bug
Keywords: regression
Whiteboard: [jsperf] fixed-in-tracemonkey
Attachments: 3 files
The testcases linked to from bug 606648 end up in JM with profiling turned on. But they're much faster when traced.
Comment 1 • 14 years ago (Reporter)
This is from http://www.peacekeeper.therichins.net/test10b.html
Comment 2 • 14 years ago (Reporter)
This is from http://www.peacekeeper.therichins.net/test10c.html
Comment 3 • 14 years ago (Reporter)
These are apparently based on some Peacekeeper code.
Blocks: peacekeeper
Comment 4 • 14 years ago (Reporter)
Hmm. So we apparently blacklist the hot loops in processBuffers because we think the inner one has a very high branching factor. It technically does, because each get() call has an if/else in it, but in practice the loop conditions on that loop ensure that only one of the branches is ever taken, so this traces well.
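To make the shape of the problem concrete, here's a minimal sketch of that pattern (hypothetical code; only the processBuffers and get names come from the testcase). Statically the inner loop body contains a branch per get() call, but dynamically each call only ever takes one side of it:

```js
// Hypothetical reduction of the pattern, not the attached testcase itself.
function makeBuffer(n) {
  var data = [];
  for (var i = 0; i < n; i++)
    data.push(i & 0xff);
  return {
    length: n,
    get: function (i) {
      if (i < n)              // statically a 50/50-looking branch...
        return data[i];
      else
        return 0;             // ...but the callers below never take this arm
    }
  };
}

function processBuffers(bufs) {
  var sum = 0;
  for (var b = 0; b < bufs.length; b++) {
    var buf = bufs[b];
    // The loop condition guarantees i < buf.length, so get() always takes
    // the same side of its if/else.
    for (var i = 0; i < buf.length; i++)
      sum += buf.get(i);
  }
  return sum;
}

var total = processBuffers([makeBuffer(1000), makeBuffer(2000)]);
```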
Note that the tracer is 1.5-2x faster than V8 on these testcases, btw; JM is about 1.5-2x slower than V8...
blocking2.0: --- → ?
Keywords: regression
Updated • 14 years ago
Whiteboard: [jsperf]
Comment 5 • 14 years ago (Assignee)
This patch tries to treat the return value of typeof as an approximate constant, meaning that we expect it to be the same most of the time. Then, whenever there's a branch on an approximate constant, the profiler ignores that branch.
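As a hypothetical illustration of what "branch on an approximate constant" means (not code from the patch or the benchmark): in a loop like the one below, typeof yields the same string on every iteration, so the branch it guards always goes the same way and shouldn't count against the loop's branchiness.

```js
// Hypothetical example, not code from the patch or the benchmark.
function sumNumbers(arr) {
  var sum = 0;
  for (var i = 0; i < arr.length; i++) {
    var v = arr[i];
    if (typeof v === "number")   // same outcome on every iteration here...
      sum += v;
    else
      sum += Number(v) || 0;     // ...so this arm never runs for this input
  }
  return sum;
}

var total = sumNumbers([1, 2, 3, 4, 5]);  // typeof v is always "number"
```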
I also cleaned up the short loop tracking to use this new code.
And I fixed a small problem with handling of JSOP_AND/JSOP_OR, where the profiler was finding more branchiness than was actually there.
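For background (a hypothetical example based on general knowledge of SpiderMonkey bytecode, not an excerpt from the patch): && and || compile to the short-circuit opcodes JSOP_AND and JSOP_OR, which are conditional jumps, so code that reads as a single straight-line expression still contributes branches at the bytecode level.

```js
// Hypothetical example: one expression in the source, but each && / || below
// becomes a conditional jump (JSOP_AND / JSOP_OR) in the emitted bytecode.
function inRange(x, lo, hi) {
  return x >= lo && x <= hi;     // one short-circuit branch
}

function orDefault(a, b) {
  return a || b;                 // another short-circuit branch
}

var ok = inRange(5, 0, 10) && orDefault(null, "default");
```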
With the patch, the unmodified Peacekeeper benchmark is still a little slower than with -j alone. I'm not sure if this is profiling overhead or if some other loop ought to be traced. Regardless, we beat V8, and performance is pretty satisfactory.
I'm going to test these changes on SS/V8 to make sure there are no regressions.
Comment 6 • 14 years ago (Assignee)
No regressions.
Comment 7 • 14 years ago (Reporter)
Comment on attachment 485863 [details] [diff] [review]
patch
I'm not sure I'm competent to review this, at least with quick turnaround; dmandelin may be a better bet, since he reviewed bug 580468.
That said, passing "false" as the third arg of the StackValue ctor seems wrong.
Also, maybe there should actually be two ctors: one taking just isConst for the no-value case, and one taking isConst plus the value for cases that have one; setting hasValue can then be done based on which ctor is used.
Attachment #485863 - Flags: review?(bzbarsky) → review?(dmandelin)
Comment 8 • 14 years ago (Reporter)
We may also, of course, want a JM bug on it being slower than V8 here. Davids, let me know if you want me to file that.
Comment 9 • 14 years ago (Reporter)
Hmm. I just tried the attached patch, and I see 300ms and 545ms on my machine on the two testcases in this bug, respectively.
If I disable JM or profiling, I see 230ms and 190ms.
So there's pretty noticeable slowdown from jm+profiling on the first testcase, and a very large slowdown on the second. In fact, the 545ms time is only a little bit faster than the pure-JM time on the second testcase....
It's good to hear that the original benchmark is fast now, but do we need a separate bug on the second testcase here?
Comment 10 • 14 years ago
(In reply to comment #9)
> It's good to hear that the original benchmark is fast now, but do we need a
> separate bug on the second testcase here?
Bill, how do you want to do it? Do you want to check that out before I review this patch, or do you want me to go ahead and review this now?
Comment 11 • 14 years ago (Assignee)
(In reply to comment #10)
> (In reply to comment #9)
> > It's good to hear that the original benchmark is fast now, but do we need a
> > separate bug on the second testcase here?
>
> Bill, how do you want to do it? Do you want to check that out before I review
> this patch, or do you want me to go ahead and review this now?
Sorry, I should have been more clear that the patch is intended to fix only the first testcase. The second one (which is not part of any benchmark suite, IIUC) isn't addressed, and it would be harder to fix.
I think it's ready for review, Dave, although I'll probably add a separate constructor as Boris suggested.
Comment 12 • 14 years ago (Reporter)
OK, I filed bug 607201 on the second testcase (which I think we should try not to regress, if we can: it's closer to how people actually write code than that benchmark is!) and bug 607202 on JM being Really Slow here in general.
Comment 13 • 14 years ago
Comment on attachment 485863 [details] [diff] [review]
patch
The patch looks fine to me, modulo the separate constructor. The only thing I'm not sure about is that this kind of duplicates the bytecode analyzer. But we throw away the results from that after compilation, and recomputing it might be too expensive. Let's just keep that in mind going forward each time we consider beefing up this particular analysis.
r+ with constructors fixed.
Attachment #485863 - Flags: review?(dmandelin) → review+
Comment 14 • 14 years ago (Assignee)
Whiteboard: [jsperf] → [jsperf] fixed-in-tracemonkey
Comment 15 • 14 years ago
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated • 14 years ago
blocking2.0: ? → betaN+