Closed
Bug 540236
Opened 15 years ago
Closed 7 months ago
Performance issue with functions created within loops
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
INCOMPLETE
People
(Reporter: humph, Unassigned)
References
Details
This may be known, but in case not...we noticed a performance issue today while fixing some code in Processing.js. Functions declared within loops perform very differently under V8 and SM. After reducing the code we were working on, we see the following (times are what I saw on my machine): Test 1: http://bocoup.com/core/code/js-tests/f1lt3r-loop-test-1.html for( var i = 0; i < times; i ++ ){ function strokeCol(){ context.strokeColor = '#fff'; } strokeCol(); } V8 - 0.012 SM - 0.91 Test 2: http://bocoup.com/core/code/js-tests/f1lt3r-loop-test-2.html function strokeCol(){ context.strokeColor = '#fff'; } for( var i = 0; i < times; i ++ ){ strokeCol(); } V8 - 0.013 SM - 0.007 Test 3: http://bocoup.com/core/code/js-tests/f1lt3r-loop-test-3.html for( var i = 0; i < times; i ++ ){ var strokeCol = function(){ context.strokeColor = '#fff'; } strokeCol(); } V8 - 0.237 SM - 0.494 Whether the function is named or anonymous, whether it is attached to a var or called directly, etc. have very different effects on the two, and not always in good ways for Firefox.
Comment 1•15 years ago
|
||
For test1: Abort recording of tree /Users/bzbarsky/test.js:5@54 at /Users/bzbarsky/test.js:6@55: deffun. so that loop just doesn't trace. Over here it runs at basically interp-speed. For test3, we stay on trace the whole time according to TMFLAGS (as in, one trace entry and one trace exit)... but the Shark profile shows 38% in js_Interpret, 30% under js_CloneFunctionObject, 15% js_FindPropertyHelper. I have no idea what's going on there.
Comment 2•15 years ago
|
||
(In reply to comment #1) > For test3, we stay on trace the whole time according to TMFLAGS (as in, one > trace entry and one trace exit)... but the Shark profile shows 38% in > js_Interpret, 30% under js_CloneFunctionObject, 15% js_FindPropertyHelper. I > have no idea what's going on there. One question is whether the closure is a regular closure or a null closure. It looks to me like it should be a null closure (i.e., non-escaping, upvar access can be implemented by static link technique (with display optimization in SM's case)) but there may be some extra constraint I don't know about. Both of those call js_CloneFunctionObject. That function seems pretty much to just call js_NewObject. But it suggests that we can somehow make creating function objects faster. js_FindPropertyHelper must be for the .strokeColor access. Maybe with all the cloning we never hit the prop cache?
Comment 3•15 years ago
|
||
Something I forgot to say in comment 2: I guess the delayed cloning with read barrier should prevent all that cloning, so maybe something is tripping up the delayed cloning mechanism.
Comment 4•15 years ago
|
||
Ok, it looks like for test3 the issue is that the startShark calls do something weird with the shapes. If I look at TMFLAGS=abort,stats,tracer for a build with those calls around the loop, I get: No global slotlist for global shape 251, flushing cache. Flushing cache. No global slotlist for global shape 256, flushing cache. Flushing cache. etc and the stats are: recorder: started(1), aborted(0), completed(1), different header(0), trees trashed(0), slot promoted(0), unstable loop variable(0), breaks(0), returns(0), merged loop exits(0), unstableInnerCalls(0), blacklisted(1) monitor: exits(0), timeouts(0), type mismatch(0), triggered(0), global mismatch(20), flushed(20) So that profile is once again pure-interp. If I take out the shark calls, there are only two calls to js_CloneFunctionObject.
Comment 5•15 years ago
|
||
If I move the loop into a function (whether the connect/disconnect to shark happens in that function or around the call to it), then the weird shape stuff goes away. At that point, 53% of the total time is spent under js_NewNullClosure. 13% of total is in the function itself. 20% is in/under js_NewFinalizableGCThing (15% of total in, the rest in NewGCArena, lock, unlock). 18% is in atomic increment. Another 26% of the time is the vm_fault user trap. This is all on Mac, of course... The numbers might look different elsewhere. If I remove supervisor callstacks, 81% of the total time is under js_NewNullClosure. The rest seems to be executing jitted code.
Comment 6•15 years ago
|
||
Note about test 1: In ECMA-262, the function statement gets hoisted to the top of the script. V8 follows this, so for V8 test 1 is the same as test 2. SM has an extension where function statements are not hoisted in some contexts (like in the top-level script?), so in SM test 1 is more like test 3 (although 2x slower, for some reason).
Comment 7•15 years ago
|
||
Probably 2x slower because it doesn't trace while test 3 does, right?
Comment 8•15 years ago
|
||
(In reply to comment #6) > In ECMA-262, the function statement gets hoisted to the top of the script. V8 > follows this, So do IE and Opera. Not sure about WebKit/JSC. > so for V8 test 1 is the same as test 2. SM has an extension where > function statements are not hoisted in some contexts (like in the top-level > script?), Parenthetical is where we do hoist: functions at top level of another function or program, i.e. not function expressions and not function statements that are subordinate to a block or other statement. Function definitions must be hoisted per ES1-3 and ES5, of course. IE hoisting the extended syntax (function sub-statement) is annoying if you are trying to do "conditional compilation": if (DEBUG) { function f() ... } else { function f() ... } and even before IE extended ES3 this way, SpiderMonkey (in Netscape 4) had the function-in-substatement extension for just this use-case. By covenant in Ecma TC39, we will in a future edition standardize function in sub-statement to bind a block-local (let-style) function name, hoisted to top of block. The block must be explicit: no if (DEBUG) function f()... (without braces). So be warned, don't count on either what we do now, or what IE et al. do, for this function-in-substatement extensions. /be
Updated•14 years ago
|
Comment 9•11 years ago
|
||
Test 1 Nightly 27 - 0.247 Chrome 30 - 0.012 IE 10 - 0.018 Test 2 Nightly 27 - 0.003 Chrome 30 - 0.012 IE 10 - 0.015 Test 3 Nightly 27 - 0.029 Chrome 30 - 0.034 IE 10 - 0.117
Assignee | ||
Updated•10 years ago
|
Assignee: general → nobody
Updated•2 years ago
|
Severity: normal → S3
Updated•7 months ago
|
Status: NEW → RESOLVED
Closed: 7 months ago
Resolution: --- → INCOMPLETE
You need to log in
before you can comment on or make changes to this bug.
Description
•