Open
Bug 606897
Opened 13 years ago
Updated 6 months ago
Profiling makes us much slower on the Celtic Kane Conway benchmark
Categories
(Core :: JavaScript Engine, defect)
Tracking
NEW
Tracking | Status | |
---|---|---|
blocking2.0 | --- | - |
People
(Reporter: bzbarsky, Unassigned)
References
Details
(Keywords: perf)
Attachments
(1 file: 2.13 KB, text/plain)
The attached shell testcase is more or less a copy of the Conway benchmark at <http://jsbenchmark.celtickane.com/Run.aspx>. The number it prints is the score; higher is better. I see these numbers over here:

    -m:       26.6
    -j:       49.98
    -m -j:    53.19
    -m -j -p: 26.14
Reporter
Comment 1•13 years ago
For reference, v8 scores about 45 on this testcase; jsc about the same. So we may be able to get there with JM only...

I think the loops on lines 55 and 56 (well, and 49 and 60) are the core of the benchmark; if I make sure we trace those I see scores around 44.

The loop on line 56 gets blacklisted both because maybeShortLoop is true for it and because selfOpsMult is 16100 (presumably due to those error-checking if statements; in this case, unlike the cases with unreached error-check bodies, I think we do hit all 16 possible branches... but that's ok!).

Also, the array copy loops (talk about slow ways to copy arrays!) don't get traced because the loop bodies are short; I assume JM optimizes dense arrays pretty well, though. If I take out the loop on line 49 and everything inside it, JM ends up scoring 153 while TM scores 176... So the array copies are faster in TM, but not hugely.
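To make the loop shapes under discussion concrete, here is a minimal sketch of one Game of Life generation written in the same loop-heavy style. This is not the attached testcase: the grid size, the function names (`makeGrid`, `countNeighbors`, `step`) and the exact form of the bounds checks are all assumptions chosen for illustration. It shows the three ingredients the comment mentions: deeply nested loops, very short inner loops dominated by branches, and an element-by-element array copy.

```javascript
// Hypothetical sketch, NOT the attached testcase.
var SIZE = 8; // assumed grid size

// Build a SIZE x SIZE grid of zeros.
function makeGrid(size) {
  var grid = [];
  for (var y = 0; y < size; y++) {
    grid[y] = [];
    for (var x = 0; x < size; x++) {
      grid[y][x] = 0;
    }
  }
  return grid;
}

// Count live neighbors of (x, y). The inner loops run only three
// iterations each and are mostly bounds-check branches and array
// reads, with very little arithmetic.
function countNeighbors(grid, x, y) {
  var n = 0;
  for (var dy = -1; dy <= 1; dy++) {
    for (var dx = -1; dx <= 1; dx++) {
      if (dx === 0 && dy === 0) continue;
      var nx = x + dx, ny = y + dy;
      if (nx < 0) continue;
      if (nx >= SIZE) continue;
      if (ny < 0) continue;
      if (ny >= SIZE) continue;
      n += grid[ny][nx];
    }
  }
  return n;
}

// Advance the grid by one generation, in place.
function step(grid) {
  var next = makeGrid(SIZE);
  for (var y = 0; y < SIZE; y++) {
    for (var x = 0; x < SIZE; x++) {
      var n = countNeighbors(grid, x, y);
      next[y][x] = (n === 3 || (n === 2 && grid[y][x] === 1)) ? 1 : 0;
    }
  }
  // Element-by-element copy back: a short-bodied loop of the kind
  // that is described above as not getting traced.
  for (var cy = 0; cy < SIZE; cy++) {
    for (var cx = 0; cx < SIZE; cx++) {
      grid[cy][cx] = next[cy][cx];
    }
  }
  return grid;
}
```

Seeding three cells in a row (a "blinker") and calling `step` twice should return the grid to its starting state, which is a quick sanity check before timing anything.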
Reporter
Updated•13 years ago
I'm not having a lot of luck getting the profiler to trace this one. There are multiple issues. All the loops execute for only a few iterations. There's lots of loop nesting. And the instruction mix doesn't have a lot of math in it; it's mostly control-flow stuff. I tried adding array access and comparisons to the goodOps calculation, but even that wasn't enough (unless I used really big multipliers). Getting this to trace without regressing other stuff seems hard.
Reporter
Comment 3•13 years ago
Hmm. So I guess one question is why _is_ this faster with TM than with JM (or with other methodjits, though the difference there is 15%, not 2x)? Can we address this by just fixing something in JM?
Updated•13 years ago
blocking2.0: ? → -
Comment 4•12 years ago
    Interp: 1.89
    TM:     1.95
    JM:     25.19
    JM+TI:  38.67
    d8:     52.53

Looks like JM+TI got back some of the performance in the attached testcase, but v8 is about 1.4x faster. Obviously the profiling part of this bug is no longer relevant, but it appears that there are still improvements to be made.
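As a quick check on the ratios quoted in this comment, the following sketch recomputes them from the reported scores. The `scores` object and the `speedup` helper are hypothetical names introduced here; only the numbers come from the comment.

```javascript
// Scores copied from the comment above (higher is better).
var scores = {
  "Interp": 1.89,
  "TM": 1.95,
  "JM": 25.19,
  "JM+TI": 38.67,
  "d8": 52.53
};

// Hypothetical helper: how many times faster configuration `a`
// is than configuration `b`, based on the scores above.
function speedup(a, b) {
  return scores[a] / scores[b];
}

var d8OverJmTi = speedup("d8", "JM+TI");  // roughly 1.36
var jmTiOverJm = speedup("JM+TI", "JM");  // roughly 1.54
```

So d8 comes out roughly 1.36x ahead of JM+TI, which rounds to the quoted 1.4x, and type inference buys roughly 1.5x over plain JM on this testcase.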
Comment 5•10 years ago
    js: 135-140
    d8: 160-170

We still have some room for improvements.
Comment 6•10 years ago
NVM this still didn't have --enable-threadsafe, going to remeasure.
Comment 7•10 years ago
/s/still/shell/
Updated•9 years ago
Keywords: regression → perf
Assignee
Updated•9 years ago
Assignee: general → nobody
Updated•6 months ago
Severity: normal → S3