Closed Bug 502788 Opened 15 years ago Closed 15 years ago

nanojit: avoid many redundant NULL comparisons in the LirReader pipeline

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file)

Attached patch patchSplinter Review
The termination condition in the LirReader pipeline is when the returned LInsp is NULL.  This means that the first thing every step in the pipeline does is a NULL check.  Lots of redundancy.

The attached patch changes the termination condition to opcode==LIR_start.  This means that the stages in the pipeline no longer need to test against NULL.  It also means the pipeline is one instruction shorter (because you stop on LIR_start, not after it).

It's a clear win in the number of instructions executed, but doesn't make any measurable time difference for SunSpider, probably because the NULL tests are highly predictable.  (It doesn't help that NJ compilation time is maybe 10--15% of overall SunSpider time.) 

I'm wondering what is the fate of such patches -- are they worth committing in the hope that they might make a difference on some other machine, or in the future after some other changes have been made?  Or do they just wither and die?
The patch reduces code size (compiled and source), right? That alone (no other wins, and assuming no negative effects) makes it a keeper in my book.

/be
agreed
Comment on attachment 387136 [details] [diff] [review]
patch

(In reply to comment #1)
> The patch reduces code size (compiled and source), right? That alone (no other
> wins, and assuming no negative effects) makes it a keeper in my book.

It reduces source size.  Binary size turns out the same -- the LIR_start comparisons are a little larger than the NULL comparisons, and the new version has some worse register allocation and so there are some extra movs put in there.  If the pipeline gets longer it should be a clearer win.

Ed, are you willing to officially bless with r+?
Attachment #387136 - Flags: review?(edwsmith)
How about a pre-computed table for dce yes/no? Or maybe a precomputed bit-field. Its currently a fairly long chain of short branches (isCse() is internally a bunch of branches too). I made a similar patch a while back after looking at the code in shark.
(separate bug probably, fly-by r=me for this change, shorter code is always better)
(In reply to comment #4)
> How about a pre-computed table for dce yes/no? Or maybe a precomputed
> bit-field. Its currently a fairly long chain of short branches (isCse() is
> internally a bunch of branches too). I made a similar patch a while back after
> looking at the code in shark.

I tried that, it didn't seem to speed up SS at all.  I might go back and look at it more carefully soon.
Attachment #387136 - Flags: review?(edwsmith) → review+
Comment on attachment 387136 [details] [diff] [review]
patch

We probably need a single table of instruction attributes instead of a bunch of separate tables, (analogous to CallInfo) but this change seems fine.
http://hg.mozilla.org/tracemonkey/rev/91cb8d578da7
Assignee: general → nnethercote
Status: NEW → ASSIGNED
Whiteboard: fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/91cb8d578da7
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: