Closed Bug 670493 Opened 13 years ago Closed 13 years ago

JM: for loop with "i--" as loop condition is 2x slower than same loop with "i-- > 0" as loop condition

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla9

People

(Reporter: bzbarsky, Assigned: jandem)

References

(
URL
)

Details

(Keywords: perf, Whiteboard: js-triage-done)

Attachments

(2 files, 1 obsolete file)

Testcase 13 years ago Boris Zbarsky [:bzbarsky] 331 bytes, text/plain		Details
WIP 13 years ago Jan de Mooij [:jandem] 5.12 KB, patch		Details \| Diff \| Splinter Review
Patch 13 years ago Jan de Mooij [:jandem] 5.19 KB, patch	bhackett1024 : review+	Details \| Diff \| Splinter Review

Boris Zbarsky [:bzbarsky]

Reporter

Description

•

13 years ago

Attached file Testcase — Details

See attached testcase.  I see f2() taking 2x longer than f1(), for both JM and JM+TI.

Running under -j doesn't seem to show this 2x slowdown.

Boris Zbarsky [:bzbarsky]

Reporter

Updated

•

13 years ago

URL: http://jsperf.com/forloopspeed

David Mandelin [:dmandelin]

Updated

•

13 years ago

Whiteboard: js-triage-needed

Marty Rosenberg [:mjrosenb]

Comment 1

•

13 years ago

f2 generates the opcodes:
00038:   5  zero
00039:   5  gt
00040:   5  ifne 32 (-8)

where f2 generates the opcodes:
00038:  13  ifne 32 (-6)

Then gt; ifne gets fused into a single operation that we generate code for, 
whereas the lone ifne code mostly only works if the argument is a boolean.  In this case, It is an Int32, so we fall back to the slow C++ code.
The relevant code is in booleanJumpScript:
>    if (type.isSet()) {
>        jmpNotBool.setJump(masm.testBoolean(Assembler::NotEqual, type.reg()));
>    } else {

David Mandelin [:dmandelin]

Comment 2

•

13 years ago

Looks like a job for a peephole optimizer

It'd be interesting to see what other engines do with this.

Blocks: WebJSPerf

Marty Rosenberg [:mjrosenb]

Comment 3

•

13 years ago

Sorry if I was not clear.  It looks like the root cause of the problem is that we don't have a fast path for doing integer checks.  I suspect it would be about +10 lines to fix this (and we'd add in about 2 instructions), but I have bigger fish to fry for the time being.  

sstangl had the idea to put asserts in our ool slow paths to make sure that we were not falling back to C++ for integer/double operations which could be trivially jitted.

Jan de Mooij [:jandem]

Assignee

Comment 4

•

13 years ago

booleanJumpScript should handle known int32 values (my first JM patch, hah), so on the TI branch we should be able to handle this inline. Looking at booleanJumpScript there seems to be more low-hanging fruit (regalloc for instance).

Boris Zbarsky [:bzbarsky]

Reporter

Comment 5

•

13 years ago

> so on the TI branch we should be able to handle this inline

As in, you don't yet but you could?

The timings I see on TI branch are:

-m:
  419
  905
-j:
  176
  178
-m -n:
  133
  291

Note that having an integer as test is the common case for loops, so I think we should seriously look into making it fast.

Jan de Mooij [:jandem]

Assignee

Comment 6

•

13 years ago

(In reply to Boris Zbarsky (:bz) from comment #5)
> > so on the TI branch we should be able to handle this inline
> 
> As in, you don't yet but you could?

We do, but somehow it's less efficient than the code generated for i-- > 0.

The fix for this will also help bug 679710. With Ion coming up I don't want to spend too much time fixing JM perf issues, but this one seems to be quite common.

Blocks: 679710

Status: NEW → ASSIGNED

Jan de Mooij [:jandem]

Assignee

Updated

•

13 years ago

Assignee: general → jandemooij

Jan de Mooij [:jandem]

Assignee

Comment 7

•

13 years ago

Attached patch WIP (obsolete) — Details — Splinter Review

This refactors booleanJumpScript and gets rid of some stack stores and jumps in the generated code. And it removes some code: 1 files changed, 43 insertions(+), 55 deletions(-)

For the attached testcase:

old: 132 310
new: 133 148

More than 2x faster, but still 15 ms slower than version 1. The reason is that we are able to hoist the overflow check for i-- > 0 but not for the i--. The generated code for the two versions is very similar. The only difference is test/jne vs. cmp/jg and the overflow check:
--
   0x8a1710:	mov    %edi,%esi
   0x8a1712:	add    $0xffffffff,%esi
>> 0x8a1715:	jo     0x8a1bc7
   0x8a171b:	mov    %esi,0x48(%ebp)
   0x8a171e:	mov    %edi,%ebx
   0x8a1720:	mov    %esi,%edi
   0x8a1722:	test   %ebx,%ebx
   0x8a1724:	jne    0x8a16ec
   0x8a172a:	jmp    0x8a173c
--
I will add the TI optimizations in bug 679710. This patch seems to be a nice win so I would like to get it in first

Jan de Mooij [:jandem]

Assignee

Updated

•

13 years ago

Whiteboard: js-triage-needed → js-triage-done

Emanuel Hoogeveen [:ehoogeveen]

Comment 8

•

13 years ago

(In reply to Jan de Mooij from comment #7)
> ... The reason is
> that we are able to hoist the overflow check for i-- > 0 but not for the
> i--.
As someone following along at home, why is that? i-- can overflow if i == 0 and i is unsigned, but both loops test the value before the decrement, right? Is this a peculiarity of JS, or am I missing something simple?

Boris Zbarsky [:bzbarsky]

Reporter

Comment 9

•

13 years ago

Seems like if we know that i > 0 then i-- can't overflow (we work only with signed ints for purposes of working with ints).

But if all we know is |i != 0|, then i-- could overflow on a signed int i, right?

The above is total guess; jandem is the one who knows what's going on here.  ;)

Emanuel Hoogeveen [:ehoogeveen]

Comment 10

•

13 years ago

Hmm, I suppose it would overflow from the maximum negative value to.. something else in that case (not sure what exactly, bit rusty on integer representations). I guess it makes sense that the overflow check has to be made in that case - though I don't know what happens on overflow :) Does TI have type barriers for unsigned values, or is signed assumed throughout the codebase? (not to start some lengthy discussion on the JS engine, I think I understand the behavior difference now)

Brian Hackett [Laid off!]

Comment 11

•

13 years ago

We can only remove the overflow check if i is always >= 0 within the loop body, else we could get the overflow on INT_MIN and need to start using the equivalent double (all numbers in JS are really doubles, we go through this integer song and dance for perf).  With some analysis we should be able to determine that if i >= 0 when we enter the loop, either when it initially starts or via an interpreter transition, that it will stay >= 0.  Then we just have to do that one test when entering the loop, and don't need to test each iteration.  LoopState does similar analysis to this for other loop forms, but needs to be more robust for this case.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 12

•

13 years ago

> not sure what exactly

In practice on most hardware using 2s-complement, INT_MIN - 1 = INT_MAX.  But JS engine behavior should be that INT_MIN-1 = double(INT_MIN) - 1.

Emanuel Hoogeveen [:ehoogeveen]

Comment 13

•

13 years ago

Thanks, that just about answers all my questions!

Jan de Mooij [:jandem]

Assignee

Comment 14

•

13 years ago

Attached patch Patch — Details — Splinter Review

Can you especially check whether this comment is still correct: "Note: this cannot overwrite slots holding loop invariants."?

Attachment #557919 - Attachment is obsolete: true

Attachment #558066 - Flags: review?(bhackett1024)

Brian Hackett [Laid off!]

Comment 15

•

13 years ago

Comment on attachment 558066 [details] [diff] [review]
Patch

Nice.

Attachment #558066 - Flags: review?(bhackett1024) → review+

Jan de Mooij [:jandem]

Assignee

Comment 16

•

13 years ago

Adding checkin-needed. The tracemonkey repo required level 2 commit access but for inbound I need level 3.

This was green on try (except some seemingly unrelated Android orange).

Keywords: checkin-needed, perf

Tom S [:evilpie]

Comment 17

•

13 years ago

http://hg.mozilla.org/integration/mozilla-inbound/rev/6e67cafd75c8

Keywords: checkin-needed

Whiteboard: js-triage-done → js-triage-done [inbound]

Ed Morley [:emorley]

Comment 18

•

13 years ago

http://hg.mozilla.org/mozilla-central/rev/6e67cafd75c8

Status: ASSIGNED → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Whiteboard: js-triage-done [inbound] → js-triage-done

Target Milestone: --- → mozilla9

Alice0775 White

Updated

•

13 years ago

Depends on: 684789

Jan de Mooij [:jandem]

Assignee

Updated

•

13 years ago

Depends on: 684797

Jan de Mooij [:jandem]

Assignee

Updated

•

13 years ago

Depends on: 686919

Alice0775 White

Updated

•

13 years ago

Depends on: 716900

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Updated

•

12 years ago

Depends on: 707141

You need to log in before you can comment on or make changes to this bug.