Closed
Bug 507746
Opened 15 years ago
Closed 8 years ago
Compare Microbenchmark Performance
Categories
(Core :: JavaScript Engine: JIT, defect)
RESOLVED
WONTFIX
People
(Reporter: wagnerg, Unassigned)
Attachments
(2 files)
730 bytes, text/plain
5.80 KB, patch
The results of my attached testfile are:
            no JIT       -j
Function:    39293    39475
Array:        4667     1553
Number:       4450     1986
String:       4447     2028
Boolean:      4347     1867
Date:         6561     4052
RegExp:       9729     7192
Overall:   73494ms  58154ms
And now with a different order (functions at the end):
Array:        4469     1309
Number:       4421     1963
String:       4377     1936
Boolean:      4302     1793
Date:         6529     4010
RegExp:      10147     7487
Function:    62508    62700
Overall:   96754ms  81198ms
There is a huge difference depending on whether I allocate functions at the beginning or at the end of the test, and there is no speedup for function allocation with JIT enabled.
Comment 1•15 years ago
This is really good. But now I have some more suggestions on how to turn this into a proper ubenchmark suite. :-)
- Separate tests into individual files
- Create a test harness that:
  - can run the files in any order
  - inserts the timing code itself (e.g., using -e or -f arguments to js) so that less boilerplate is required in the test cases
  - generates the loops itself or otherwise allows control over the number of iterations
  - can do comparisons with various VMs (incl SFX and V8)
  - can compare with empty loop perf, to get potentially more direct measurements of the time taken by the operation being tested.
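A minimal sketch of such a harness, written as a single script rather than separate files. Every name here (timeLoop, shuffled, runSuite, the test names) is made up for illustration and is not taken from the attached testfile. It shows three of the points above: tests run in a random order, the harness owns the loop and the timing, and an empty-loop baseline is subtracted from each measurement.

```javascript
// Hypothetical harness sketch; none of these names come from the bug's attachments.

// Run `body` `iterations` times and return the elapsed wall-clock time in ms.
function timeLoop(body, iterations) {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) body(i);
  return Date.now() - start;
}

// Fisher-Yates shuffle on a copy, so the tests can run in any order.
function shuffled(items) {
  const copy = items.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Time every test, plus an empty loop used as a baseline.
function runSuite(tests, iterations) {
  const baseline = timeLoop(function () {}, iterations);
  const results = {};
  for (const test of shuffled(tests)) {
    const elapsed = timeLoop(test.body, iterations);
    results[test.name] = { elapsed: elapsed, net: elapsed - baseline };
  }
  return { baseline: baseline, results: results };
}

// Illustrative test bodies; a real suite would load one file per test.
const tests = [
  { name: "new_object", body: () => new Object() },
  { name: "new_array", body: () => new Array() },
  { name: "new_function", body: () => new Function("return 1") },
];

const report = runSuite(tests, 10000);
for (const name of Object.keys(report.results)) {
  const r = report.results[name];
  console.log(name + ": " + r.elapsed + "ms (net " + r.net + "ms)");
}
```

The net figure can come out slightly negative when a body is as cheap as the empty loop, so it is best read as a rough indication rather than an exact cost.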
Comment 2•15 years ago (Reporter)
Yeah my benchmark file could be better... :)
As for the numbers, it seems GC and allocation are not to blame for the time difference.
207:src mozilla$ ./OPT.OBJ/js ./empty.js
Function: 39893
Number: 5900
String: 5739
Boolean: 5618
Date: 7789
RegExp: 11056
Array: 5909
Overall: 81905ms
Ticks in GC: 11981960340, 5.99 sec
Ticks in NewGCThing: 32510511444, 16.2 sec
207:src mozilla$ ./OPT.OBJ/js ./empty.js
Number: 5625
String: 5687
Boolean: 5570
Date: 7821
RegExp: 11362
Array: 5752
Function: 63180
Overall: 104997ms
Ticks in GC: 11914319808, 5.95 sec
Ticks in NewGCThing: 32676061224, 16.34 sec
Comment 3•15 years ago
(In reply to comment #2)
> Yeah my benchmark file could be better... :)
To be clear, I wasn't criticizing. I do think that we need the things I mentioned, and they're on my personal to-do list, but they'll probably have to wait until I'm all done with closures. If you would enjoy doing them, that would be excellent.
Comment 4•15 years ago (Reporter)
It seems js_PCToLineNumber is the "slow" function.
In the "fast" version I spend about 8% of the whole time in this function.
If I allocate functions at the end of the test, I spend about 33% of the total time in this function.
In seconds: 22 (functions at the end) versus 4.1 (functions at the beginning).
Comment 5•15 years ago
Here is output for some recentish builds:
                          sm      tm     sfx      v8
add.js                  80.3     4.3     3.3   138.3
bitand.js               72.3     4.4     2.8   137.1
bitor.js                72.5     4.4     3.3   141.5
bitxor.js               74.6     4.3     4.3   142.4
call.js                164.0     4.0     9.0   147.0
div.js                  99.2    14.7    31.8   185.0
double.js               60.8     4.0     3.0   141.5
function.js            305.0   284.0   105.0   322.0
getprop.js              80.8    14.4     5.3   136.0
int.js                  58.0     3.4     2.8   134.7
mul.js                  75.6     5.0     3.7   140.5
new_date.js            504.0   280.0   212.0   331.0
new_object.js          334.0   117.0    67.0   176.0
new_object_1prop.js    498.0   278.0    57.0   163.0
new_object_2prop.js    566.0   295.0    70.0   170.0
new_object_braces.js   288.0    90.0    50.0   170.0
new_regexp.js          732.0   490.0   914.0  1600.0
new_string.js          346.0   126.0    78.0   181.0
setprop.js              82.4     8.0     4.5    14.6
sub.js                  80.4     4.1     3.7   147.6
[*] Numbers are time per iteration in microseconds
Standout items:
call.js Inlining FTW! Still 16x slower than WebKit in interpreter.
major opportunities for improvement:
getprop.js 3x slower than WebKit with tracing
setprop.js 2x slower than WebKit
*object*.js 2x slower than WebKit (5x slower with properties in {} syntax)
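The ratios quoted above can be rechecked from the table. The snippet below is only an arithmetic aid: the timings are copied from the sm, tm and sfx columns, and the helper name ratio is made up for illustration.

```javascript
// Timings copied from the table in this comment (microseconds per iteration).
const timings = {
  "call.js":             { sm: 164.0, tm: 4.0,   sfx: 9.0 },
  "getprop.js":          { sm: 80.8,  tm: 14.4,  sfx: 5.3 },
  "setprop.js":          { sm: 82.4,  tm: 8.0,   sfx: 4.5 },
  "new_object.js":       { sm: 334.0, tm: 117.0, sfx: 67.0 },
  "new_object_1prop.js": { sm: 498.0, tm: 278.0, sfx: 57.0 },
};

// Slowdown factor of a relative to b, rounded to one decimal place.
function ratio(a, b) {
  return Math.round((a / b) * 10) / 10;
}

// Interpreter (sm) versus WebKit (sfx) for calls.
console.log("call.js sm/sfx:", ratio(timings["call.js"].sm, timings["call.js"].sfx));

// Tracing (tm) versus WebKit (sfx) for the property and object tests.
for (const name of ["getprop.js", "setprop.js", "new_object.js", "new_object_1prop.js"]) {
  console.log(name + " tm/sfx:", ratio(timings[name].tm, timings[name].sfx));
}
```

With these inputs the factors come out at about 18.2 for call.js in the interpreter, 2.7 for getprop.js, 1.8 for setprop.js, 1.7 for new_object.js and 4.9 for new_object_1prop.js, which is close to the rounded figures in the list above.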
Comment 6•15 years ago (Reporter)
There is an easy explanation for the slowdown depending on whether the functions are allocated at the beginning or at the end of the benchmark. For each function, js_PCToLineNumber is called, and it traverses the whole script until the function is found. If the line number is 36 instead of 5, it takes about 7 times longer until the line number is found.
Is this really necessary in the optimized build? Is it possible, or worthwhile, to cache previously found line numbers?
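To illustrate the effect, here is a toy model in JavaScript. It is not the engine's actual implementation: the table layout, the 10 bytes of bytecode per line, and all function names are assumptions made up for this sketch. It only shows why a linear walk from the start of the script costs more for later lines, and how remembering earlier answers would avoid repeating the walk.

```javascript
// Toy model only; the real js_PCToLineNumber works on the engine's own data.

// Build a hypothetical table with one entry per source line, holding the
// first bytecode offset (pc) that belongs to that line.
function makeLineTable(lineCount, bytesPerLine) {
  const table = [];
  for (let line = 1; line <= lineCount; line++) {
    table.push({ startPc: (line - 1) * bytesPerLine, line: line });
  }
  return table;
}

let steps = 0; // number of table entries visited, used as a cost measure

// Linear scan from the start of the script until the entry past `pc`.
function pcToLineLinear(table, pc) {
  let line = table[0].line;
  for (const entry of table) {
    steps++;
    if (entry.startPc > pc) break;
    line = entry.line;
  }
  return line;
}

// Wrap the linear scan with a cache keyed by pc.
function makeCachedLookup(table) {
  const cache = new Map();
  return function (pc) {
    if (!cache.has(pc)) cache.set(pc, pcToLineLinear(table, pc));
    return cache.get(pc);
  };
}

const table = makeLineTable(40, 10);

steps = 0;
const earlyLine = pcToLineLinear(table, 45);  // a pc inside line 5
const earlySteps = steps;

steps = 0;
const lateLine = pcToLineLinear(table, 355);  // a pc inside line 36
const lateSteps = steps;

console.log("line", earlyLine, "found after", earlySteps, "steps");
console.log("line", lateLine, "found after", lateSteps, "steps");

// With the cache, 1000 lookups of the same pc pay for the scan only once.
const cachedLookup = makeCachedLookup(table);
steps = 0;
for (let i = 0; i < 1000; i++) cachedLookup(355);
console.log("steps for 1000 cached lookups:", steps);
```

In this model the lookup for line 36 visits 37 entries against 6 for line 5, roughly the same order of slowdown as reported above, and the cached variant does the 37-entry walk only on the first call.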
Comment 7•15 years ago
Can you show some representative stacks for js_PCToLineNumber? Would be good to know where we're calling it here, since in the non-error case I can't recall why we would use that at all.
Comment 8•15 years ago
The calls are from the Function constructor. Old bug, I argue low priority -- but a bug to be sure.
/be
Comment 9•15 years ago
(In reply to comment #5)
Thanks, Dave.
> Standout items:
>
> call.js Inlining FTW! Still 16x slower than WebKit in interpreter.
Bug 471425 at least. Nitro has really optimized call overhead down, we should do likewise.
> major opportunities for improvement:
>
> getprop.js 3x slower than WebKit with tracing
> setprop.js 2x slower than WebKit
> *object*.js 2x slower than WebKit (5x slower with properties in {} syntax)
These are all with tracing, right?
We have bugs on shape guard issues, but beyond those we have too many guards. With brute-force invalidation of cached code on rare, hazardous events, we should aim to guard once per get, set, or init. Need a bug on this.
/be
Comment 10•15 years ago
(In reply to comment #9)
> (In reply to comment #5)
>
> Thanks, Dave.
>
> > Standout items:
> >
> > call.js Inlining FTW! Still 16x slower than WebKit in interpreter.
>
> Bug 471425 at least. Nitro has really optimized call overhead down, we should
> do likewise.
Yes. It will probably be many small steps. I also hope that the activation record formats will become more similar, which may make upvar/arguments-type tracing code simpler and more efficient.
> > major opportunities for improvement:
> >
> > getprop.js 3x slower than WebKit with tracing
> > setprop.js 2x slower than WebKit
> > *object*.js 2x slower than WebKit (5x slower with properties in {} syntax)
>
> These are all with tracing, right?
Yes.
> We have bugs on shape guard issues, but beyond those we have too many guards.
> With brute-force invalidation of cached code on rare, hazardous events, we
> should aim to guard once per get, set, or init. Need a bug on this.
Yes. Next step for me is to inspect the x86 and builtin code to see where the differences and inefficiencies are.
Comment 11•13 years ago
Results with current JS shell.
            INTERP       TM       JM    JM+TI
Array:        3033     3037      943      833
Number:       2414     2446      845      839
String:       2874     2866     1156     1140
Boolean:      2462     2495      900      929
Date:        22796    22864    20366    20347
RegExp:       4100     4247     2343     2312
Function:    43420    43371    41923    44138
Overall:   81101ms  81329ms  68481ms  70540ms
Looks like JM is a clear win across the board. JM+TI seems to be a mixed bag vs. plain JM. Sorry to say that I don't have a WebKit or d8 shell handy to test.
Blocks: 467263
Summary: TM: Microbenchmark Performance → Compare Microbenchmark Performance
Comment 12•13 years ago
d8 Results:
Array: 232
Number: 437
String: 207
Boolean: 497
Date: 1685
RegExp: 2335
Function: 9653
Overall: 15050ms
Updated•10 years ago
Assignee: general → nobody
Comment 13•10 years ago
I tried this now, with Firefox 31 (release), Nightly 34 and Chrome 36 (release).
I changed the Function bench to be 10x shorter. Nightly has some things turned on that might slow it down a bit (GC poisoning or something?), right? But I don't know whether that changed anything here.
Firefox 31
Array: 32
Number: 1109
String: 1051
Boolean: 953
Date: 6784
RegExp: 2184
Function: 6298
Nightly 34
Array: 31
Number: 1080
String: 1177
Boolean: 910
Date: 2697
RegExp: 3135
Function: 8580
Chrome 36
Array: 197
Number: 398
String: 134
Boolean: 401
Date: 3018
RegExp: 2085
Function: 1010
The improvement on Date is probably because I'm on Win7 and Jan de Mooij made a huge improvement to 'new Date' 1 or 2 months ago.
The RegExp regression might be due to Irregexp vs. YARR?
I have no idea about the Function regression.
The worst cases for Firefox are 'String' and 'Function', compared to Chrome.
Component: JavaScript Engine → JavaScript Engine: JIT
OS: Mac OS X → All
Hardware: x86 → All
Comment 14•10 years ago
(In reply to Guilherme Lima from comment #13)
> I changed the Function bench to be 10x shorter. Nightly has some things
> turned on that might slow down a bit (GC poisoning or something?), right?
You can turn this off by running with the environment variable JSGC_DISABLE_POISONING set.
For instance, from a batch file:
set JSGC_DISABLE_POISONING=1
"path/to/firefox.exe" -no-remote -profile "path/to/empty/profile"
Comment 15•8 years ago
New results:
                          sm      v8
sub.js                   1.9     2.3
int.js                   1.8     2.3
new_string.js           70.0    16.0
new_regexp.js          341.0   124.0
function.js             10.0    15.0
mul.js                   1.7     2.2
new_object_2prop.js      5.0    15.0
bitor.js                 2.1     2.3
double.js                1.8     2.2
new_object_braces.js     6.0    15.0
bitand.js                2.3     2.2
new_object.js          150.0    30.0
setprop.js               2.0     2.0
getprop.js               1.7     2.2
call.js                  2.0     3.0
new_object_1prop.js      6.0    13.0
new_date.js            106.0    98.0
add.js                   2.2     2.2
div.js                   2.0     9.2
bitxor.js                2.1     2.2
At this point, it seems that object creation in general is a bit slower, but we could open new bugs to track issues there:
- the ubench code is great, but it creates empty loops in which variables are never used, so they're very likely to be dead-code-eliminated (DCE'd). Results that are much larger than a few milliseconds just betray the fact that the engines don't DCE this code, so this is something we could investigate (or maybe it's just the time needed for the JITs to kick in).
- although it does have its cons, http://jsperf.com/ does a good job at micro-benchmarking tight loops as well, and has better ergonomics (no need to build a shell, anyone can use their browser instead).
- for micro-benchmarks, arewefastyet contains a suite dedicated to this purpose, so we can keep track of regressions and differences across different browsers/shells: https://arewefastyet.com/#machine=29&view=breakdown&suite=asmjs-ubench
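As a rough illustration of the dead-code-elimination point in the first bullet, here is a sketch of a loop whose result stays observable. The names sink and benchAdd are made up for this example and do not come from the ubench files, and an optimizing JIT may still simplify such a loop in ways this sketch does not rule out.

```javascript
// Hypothetical example; `sink` and `benchAdd` are illustrative names only.

// A global that receives every result, so the work cannot be treated as unused.
let sink = 0;

function benchAdd(iterations) {
  let acc = 0;
  for (let i = 0; i < iterations; i++) {
    // Each iteration depends on the previous one and on the loop counter.
    acc = (acc + i) | 0;
  }
  sink ^= acc; // publish the result outside the function
  return acc;
}

const start = Date.now();
const result = benchAdd(10000000);
const elapsed = Date.now() - start;

console.log("result " + result + " computed in " + elapsed + "ms");
```

Printing or otherwise consuming the result after timing gives the engine a reason to keep the computation, which is the property the empty loops described above lack.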
Also, we have other bugs covering the fact that object creation is slow. Closing.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX