Closed Bug 803884 Opened 13 years ago Closed 13 years ago

IonMonkey: if (!x) slower than if (x == 0)

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla19

People

(Reporter: azakai, Assigned: sstangl)

References

(Blocks 1 open bug)

Details

(Whiteboard: [ion:t])

Attachments

(3 files)

one.js 13 years ago Alon Zakai (:azakai) 34.64 KB, application/javascript		Details
two.js 13 years ago Alon Zakai (:azakai) 34.63 KB, application/javascript		Details
Fold MControlInstructions. 13 years ago Sean Stangl [:sstangl] 2.25 KB, patch	mjrosenb : review+	Details \| Diff \| Splinter Review

Alon Zakai (:azakai)

Reporter

Description

•

13 years ago

Attached file one.js — Details

Attached are two versions of a primes benchmark compiled by emscripten. diff one.js two.js is 478c478 < if(c % g == 0) { --- > if(!(c % g)) { So the second version has !x instead of x == 0. The second version is preferable because in general it leads to shorter code (not here actually because we can't remove the parens, but generally speaking it is "==0" vs "!"). But the second version is 12% slower.

Alon Zakai (:azakai)

Reporter

Comment 1

•

13 years ago

Attached file two.js — Details

Ryan VanderMeulen [:RyanVM]

Comment 2

•

13 years ago

(In reply to Alon Zakai (:azakai) from comment #0) > So the second version has !x instead of x == 0. > But the second version is 12% slower. The title says that x == 0 is slower?

Alon Zakai (:azakai)

Reporter

Comment 3

•

13 years ago

Good catch, my mistake :) fixed the title.

Summary: if (x == 0) slower than if (!x) → if (!x) slower than if (x == 0)

Jan de Mooij [:jandem]

Comment 4

•

13 years ago

So at least with IonMonkey the problem is that for x == 0 we use CompareAndBranch and emit cmpl $0x0, %eax jne <target> However, for !x we use NotI and TestIAndBranch and emit cmpl $0x0, %eax sete %al movzbl %eax, %eax testl %eax, %eax je <target> There's some code in MTest::foldsTo to fold MTest(MNot(x), b1, b2) to MTest(x, b2, b1), but I don't think foldsTo is ever called on control instructions. We could probably also do this during Lowering.

Blocks: IonSpeed

Marty Rosenberg [:mjrosenb]

Comment 5

•

13 years ago

The Compare + Branch -> CompareAndBranch fusing is done during lowering. Brian and I were talking about this during dinner, and we thing that we need a better interface for fusing instructions together. A pre-existing algorithm like maximal munch would likely be sufficient for all cases that have come up since we started IonMonkey.

Sean Stangl [:sstangl]

Assignee

Comment 6

•

13 years ago

Attached patch Fold MControlInstructions. — Details — Splinter Review

one.js and two.js now run at exactly the same speed. !! also works as expected.

Attachment #675697 - Flags: review?(mrosenberg)

Sean Stangl [:sstangl]

Assignee

Updated

•

13 years ago

Summary: if (!x) slower than if (x == 0) → IonMonkey: if (!x) slower than if (x == 0)

Sean Stangl [:sstangl]

Assignee

Updated

•

13 years ago

Assignee: general → sstangl

OS: Linux → All

Hardware: x86 → All

Whiteboard: [ion:t]

Marty Rosenberg [:mjrosenb]

Comment 7

•

13 years ago

Comment on attachment 675697 [details] [diff] [review] Fold MControlInstructions. Review of attachment 675697 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/ion/ValueNumbering.cpp @@ +87,5 @@ > + repl->setValueNumberData(new ValueNumberData); > + > + MBasicBlock *block = def->block(); > + > + JS_ASSERT(def->useCount() == 0); A comment saying why we are verifying that this instruction *isn't* used would be useful.

Attachment #675697 - Flags: review?(mrosenberg) → review+

Sean Stangl [:sstangl]

Assignee

Comment 8

•

13 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/829fcbcf7084

Ryan VanderMeulen [:RyanVM]

Comment 9

•

13 years ago

https://hg.mozilla.org/mozilla-central/rev/829fcbcf7084

Status: NEW → RESOLVED

Closed: 13 years ago

Flags: in-testsuite-

Resolution: --- → FIXED

Target Milestone: --- → mozilla19

Alon Zakai (:azakai)

Reporter

Comment 10

•

13 years ago

two.js is still 5% slower than one.js for me, on latest m-c. Better than 12% from before but still significant.

Sean Stangl [:sstangl]

Assignee

Comment 11

•

13 years ago

(In reply to Alon Zakai (:azakai) from comment #10) > two.js is still 5% slower than one.js for me, on latest m-c. Better than 12% > from before but still significant. Testing locally: on x86, both one.js and two.js resolve in exactly 502ms with little variation. On x86_64, both resolve in exactly 670ms with little variation. After the patch in this bug landed, the code emitted for both cases should be identical. Confirmation in either direction or STR would be helpful.

Alon Zakai (:azakai)

Reporter

Comment 12

•

13 years ago

All I did was build latest m-c from a few hours ago, then run `time js one.js` and `time js two.js` where js is the new build.

Alon Zakai (:azakai)

Reporter

Comment 13

•

13 years ago

I am on latest 32-bit ubuntu on a fairly new core i7 (x230 thinkpad).

Sean Stangl [:sstangl]

Assignee

Comment 14

•

13 years ago

So, I never actually verified this before, but it turns out that IonMonkey never compiles any function in either of the attachments -- every function is found to be excessively large. Could you run a debug build with IONFLAGS=scripts,aborts and see whether there are any non-abort messages? It's suspicious that landing this patch would improve things.

Alon Zakai (:azakai)

Reporter

Comment 15

•

13 years ago

I can do a debug build if it's helpful, but note that I didn't check with and without this patch, I checked just now and a month ago. So it's possible something else landed that narrowed the 12% to 5% over the last month (some JM+TI improvement?), and this patch has no effect since as you say nothing is ioned. Do we have a bug on getting ion to compile bigger scripts?

Sean Stangl [:sstangl]

Assignee

Comment 16

•

13 years ago

(In reply to Alon Zakai (:azakai) from comment #15) > Do we have a bug on getting ion to compile bigger scripts? Not currently. I am investigating the simple solution of bumping the script size limit from 1500 bytes to (ideally) 5000. Chunked compilation was considered to handle arbitrarily-sized scripts, but discarded. The best approach we're pursuing is with parallel compilation, which lets us (almost) not care about the amount of time required to compile.

Alon Zakai (:azakai)

Reporter

Comment 17

•

13 years ago

Ok, cool. If you file a bug on that script size limit bump please cc me.

You need to log in before you can comment on or make changes to this bug.