540236 - Performance issue with functions created within loops

Reporter

Description

•

15 years ago

This may be known, but in case not...we noticed a performance issue today while fixing some code in Processing.js.  Functions declared within loops perform very differently under V8 and SM.  After reducing the code we were working on, we see the following (times are what I saw on my machine):

Test 1: http://bocoup.com/core/code/js-tests/f1lt3r-loop-test-1.html

    for( var i = 0; i < times; i ++ ){
      function strokeCol(){
        context.strokeColor = '#fff';
      }
      strokeCol();
    }    

V8 - 0.012
SM - 0.91

Test 2: http://bocoup.com/core/code/js-tests/f1lt3r-loop-test-2.html

    function strokeCol(){
      context.strokeColor = '#fff';
    }
    
    for( var i = 0; i < times; i ++ ){
      strokeCol();
    }    

V8 - 0.013
SM - 0.007

Test 3: http://bocoup.com/core/code/js-tests/f1lt3r-loop-test-3.html

    for( var i = 0; i < times; i ++ ){
      var strokeCol = function(){
        context.strokeColor = '#fff';
      }
      strokeCol();
    } 

V8 - 0.237
SM - 0.494

Whether the function is named or anonymous, whether it is attached to a var or called directly, etc. have very different effects on the two, and not always in good ways for Firefox.

Boris Zbarsky [:bzbarsky]

Comment 1

•

15 years ago

For test1:

Abort recording of tree /Users/bzbarsky/test.js:5@54 at /Users/bzbarsky/test.js:6@55: deffun.

so that loop just doesn't trace.  Over here it runs at basically interp-speed.

For test3, we stay on trace the whole time according to TMFLAGS (as in, one trace entry and one trace exit)... but the Shark profile shows 38% in js_Interpret, 30% under js_CloneFunctionObject, 15% js_FindPropertyHelper.  I have no idea what's going on there.

David Mandelin [:dmandelin]

Comment 2

•

15 years ago

(In reply to comment #1)
> For test3, we stay on trace the whole time according to TMFLAGS (as in, one
> trace entry and one trace exit)... but the Shark profile shows 38% in
> js_Interpret, 30% under js_CloneFunctionObject, 15% js_FindPropertyHelper.  I
> have no idea what's going on there.

One question is whether the closure is a regular closure or a null closure. It looks to me like it should be a null closure (i.e., non-escaping, upvar access can be implemented by static link technique (with display optimization in SM's case)) but there may be some extra constraint I don't know about.

Both of those call js_CloneFunctionObject. That function seems pretty much to just call js_NewObject. But it suggests that we can somehow make creating function objects faster.

js_FindPropertyHelper must be for the .strokeColor access. Maybe with all the cloning we never hit the prop cache?

David Mandelin [:dmandelin]

Comment 3

•

15 years ago

Something I forgot to say in comment 2:

I guess the delayed cloning with read barrier should prevent all that cloning, so maybe something is tripping up the delayed cloning mechanism.

Boris Zbarsky [:bzbarsky]

Comment 4

•

15 years ago

Ok, it looks like for test3 the issue is that the startShark calls do something weird with the shapes.  If I look at TMFLAGS=abort,stats,tracer for a build with those calls around the loop, I get:

No global slotlist for global shape 251, flushing cache.
Flushing cache.
No global slotlist for global shape 256, flushing cache.
Flushing cache.
etc

and the stats are:
recorder: started(1), aborted(0), completed(1), different header(0), trees trashed(0), slot promoted(0), unstable loop variable(0), breaks(0), returns(0), merged loop exits(0), unstableInnerCalls(0), blacklisted(1)
monitor: exits(0), timeouts(0), type mismatch(0), triggered(0), global mismatch(20), flushed(20)

So that profile is once again pure-interp.  If I take out the shark calls, there are only two calls to js_CloneFunctionObject.

Boris Zbarsky [:bzbarsky]

Comment 5

•

15 years ago

If I move the loop into a function (whether the connect/disconnect to shark happens in that function or around the call to it), then the weird shape stuff goes away.  At that point, 53% of the total time is spent under js_NewNullClosure.  13% of total is in the function itself.  20% is in/under js_NewFinalizableGCThing (15% of total in, the rest in NewGCArena, lock, unlock).  18% is in atomic increment.

Another 26% of the time is the vm_fault user trap.

This is all on Mac, of course...  The numbers might look different elsewhere.

If I remove supervisor callstacks, 81% of the total time is under js_NewNullClosure.  The rest seems to be executing jitted code.

David Mandelin [:dmandelin]

Updated

•

15 years ago

Depends on: 540992

David Mandelin [:dmandelin]

Comment 6

•

15 years ago

Note about test 1:

In ECMA-262, the function statement gets hoisted to the top of the script. V8 follows this, so for V8 test 1 is the same as test 2. SM has an extension where function statements are not hoisted in some contexts (like in the top-level script?), so in SM test 1 is more like test 3 (although 2x slower, for some reason).

Boris Zbarsky [:bzbarsky]

Comment 7

•

15 years ago

Probably 2x slower because it doesn't trace while test 3 does, right?

Brendan Eich [:brendan]

Comment 8

•

15 years ago

(In reply to comment #6)
> In ECMA-262, the function statement gets hoisted to the top of the script. V8
> follows this,

So do IE and Opera. Not sure about WebKit/JSC.

> so for V8 test 1 is the same as test 2. SM has an extension where
> function statements are not hoisted in some contexts (like in the top-level
> script?),

Parenthetical is where we do hoist: functions at top level of another function or program, i.e. not function expressions and not function statements that are subordinate to a block or other statement. Function definitions must be hoisted per ES1-3 and ES5, of course.

IE hoisting the extended syntax (function sub-statement) is annoying if you are trying to do "conditional compilation":

if (DEBUG) {
  function f() ...
} else {
  function f() ...
}

and even before IE extended ES3 this way, SpiderMonkey (in Netscape 4) had the function-in-substatement extension for just this use-case.

By covenant in Ecma TC39, we will in a future edition standardize function in sub-statement to bind a block-local (let-style) function name, hoisted to top of block. The block must be explicit: no if (DEBUG) function f()... (without braces). So be warned, don't count on either what we do now, or what IE et al. do, for this function-in-substatement extensions.

/be

Boris Zbarsky [:bzbarsky]

Updated

•

14 years ago

Depends on: 543097, 545529

Guilherme Lima

Comment 9

•

11 years ago

Test 1
Nightly 27 - 0.247
Chrome 30 - 0.012
IE 10 - 0.018

Test 2
Nightly 27 - 0.003
Chrome 30 - 0.012
IE 10 - 0.015

Test 3
Nightly 27 - 0.029
Chrome 30 - 0.034
IE 10 - 0.117

Nobody; OK to take it and work on it

Assignee

Updated

•

10 years ago

Assignee: general → nobody

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Matthew Gaudet (he/him) [:mgaudet]

Updated

•

7 months ago

Status: NEW → RESOLVED

Closed: 7 months ago

Resolution: --- → INCOMPLETE

Bugzilla

Quick Search

Performance issue with functions created within loops

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: humph, Unassigned)

References

Details

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Updated

Comment 6

Comment 7

Comment 8

Updated

Comment 9

Updated

Updated

Updated