TI: gzip benchmark much faster with -j on the TI branch

Status: RESOLVED FIXED
(Reporter: jandem, Unassigned)

(Reporter)

Description

7 years ago
Created attachment 538226 [details]
Shell testcase

For the attached shell testcase (TM branch / JM branch):

js -j       :  362 ms /  248 ms
js -m       :  302 ms /  319 ms
js -m -j    :  232 ms /  243 ms
js -m -j -p :  304 ms /  358 ms
js          : 3713 ms / 3912 ms
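The timings above would typically come from a simple wall-clock harness in the JS shell. A minimal sketch (the workload below is a stand-in dummy loop, not the actual gzip.js benchmark, and the name `time` is invented for illustration):

```javascript
// Minimal timing-harness sketch: measures wall-clock time of a benchmark
// body in milliseconds, the way figures like those above are usually taken.
function time(run) {
  var t0 = Date.now();
  run();
  return Date.now() - t0; // elapsed milliseconds
}

var elapsed = time(function () {
  var sum = 0;
  for (var i = 0; i < 1e6; i++) sum += i; // dummy hot loop, not gzip.js
  return sum;
});
```

Each shell configuration (`-j`, `-m`, `-m -j`, …) would run the same body, so only the JIT mode varies between the columns.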

1) Why is -j (much) faster on the JM branch? Maybe it's related to invoking TM from the interpreter since -m -j is much faster?

2) Why is interpreter // -m // -m -j -p slower on the JM branch?

gzip.js uses parseInt but porting the patch in bug 662766 to the TM branch makes no difference.
(Reporter)

Comment 1

7 years ago
Filed bug 663087 for the TI part; this bug is for the perf difference without -n.

On the testcase in bug 663138 comment 2, js -j is much faster on the JM branch than the TM branch too (323 ms vs. 396 ms). Pretty sure this is due to the object layout changes in place on the JM branch: slots of call objects and other objects can be accessed at fixed offsets rather than requiring an extra load of the slots pointer.

On that other testcase I checked the disassembly, and the only difference was the extra load of the call object's slots on trunk. Normally I wouldn't expect this to make that much of a difference, as the loop body still has 12 memory ops, but processors are weird and maybe something was really angry at the removed load.

It is quite possible this is also responsible for the -j difference on this testcase.  I don't know where we spend the time in gzip, but from the nature of the thing it's probably all in some hot loop and, as noted in bug 663087, the inflater in this testcase is a closure filled with NAME accesses, which will be active after the original activation is gone and thus can be optimized by TM.
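The "closure filled with NAME accesses" pattern can be sketched as follows (a hypothetical reduction for illustration, not the actual inflater from gzip.js; the names `makeInflater` and `inflateStep` are invented):

```javascript
// Hypothetical sketch of the pattern described above: an inner function
// that reads and writes variables of an enclosing scope (NAME accesses)
// and stays live after the outer call has returned.
function makeInflater() {
  var pos = 0;       // free variables of the closure: each access below
  var window = [];   // is a NAME op that goes through the call object's
  var bitBuf = 0;    // slots on every loop iteration

  return function inflateStep(byte) {
    bitBuf = (bitBuf << 8) | byte;  // NAME read + NAME write
    window[pos++] = byte;           // two more NAME accesses
    return pos;
  };
}

var step = makeInflater(); // the outer activation is gone,
step(1);                   // yet the closure keeps its scope alive,
step(2);                   // so every access pays for the slot lookup
```

With the JM-branch object layout, the call object's slots sit at fixed offsets, so each NAME access inside `inflateStep` saves one load of the slots pointer compared to trunk.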

For 2), the interpreter is probably slower because it repeatedly checks whether type inference is enabled; -m is affected to a lesser degree. The -m -j -p difference is odd; the trace-decision heuristics may be behaving differently (they shouldn't be).
Blocks: 467263
With/without TI: 88 ms / 357 ms
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED