Closed Bug 458016 Opened 16 years ago Closed 14 years ago

TM: JIT compilation of javascript does not speed up the attached example

Categories

(Core :: JavaScript Engine, defect, P3)

x86
Windows XP
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: duncan.loveday, Assigned: gal)

References

Details

(Keywords: perf, regression, testcase)

Attachments

(2 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b1pre) Gecko/20080930093007 Minefield/3.1b1pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b1pre) Gecko/20080930093007 Minefield/3.1b1p

The attached example runs in 3-4 seconds whether the javascript.options.jit.* prefs are enabled or not. As it's only javascript code - manipulating arrays with some string and date handling (doesn't touch the DOM), why wouldn't there be a big speedup ?

Reproducible: Always

Steps to Reproduce:
1. Load the attached
2. Go to about:config and enable the prefs javascript.options.jit.*
3. Press the Go button and note the timing
4. Go to about:config and disable the prefs javascript.options.jit.*
5. Repeat step 3.
Actual Results:  
The timings are similar, notwithstanding some variation from run to run in both cases. Typically on my windows XP laptop the test runs in 3-4 seconds.

Expected Results:  
With JIT enabled, it should be much faster.

I do see the big improvement in performance in the demo at http://people.mozilla.com/~schrep/image12.html so it is something about the nature of the attached code.
Attached file test case
Keywords: testcase
Interestingly, Fx3 either with or without JIT enabled is better than Google Chrome on this example.

Perhaps the performance has more to do with the javascript intrinsic objects like Date and String and how good the (possibly native) implementation is for those ?
Attached file shell test
We're aborting on Math.min and new Date().
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: wanted1.9.1+
Priority: -- → P3
Target Milestone: --- → mozilla1.9.1b2
Summary: JIT compilation of javascript does not speed up the attached example → TM: JIT compilation of javascript does not speed up the attached example
Bug 456165 is new Date().

/be
Depends on: 456165
> We're aborting on Math.min and new Date().

My full code uses lots of methods in Math.*, String.* etc. Will these be a barrier for TM or will they all be supported going forward ?
(In reply to comment #5)
> > We're aborting on Math.min and new Date().
> 
> My full code uses lots of methods in Math.*, String.* etc. Will these be a
> barrier for TM or will they all be supported going forward ?

It's a bug, will fix.
On the latest trunk, the attached runs noticeably SLOWER with JIT, although there's a good deal of variation from run to run in both cases.
Here are some results I just got from 10 runs with and without JIT content enabled.

JIT ON                    JIT OFF
4153                      4611
3913                      2828
10476                     3408
9893                      3550
4179                      3595
4381                      4553
11038                     3499
3250                      3364
7899                      5502
4645                      4648
=====                     =====
63827                     39558

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre) Gecko/20090211 Shiretoko/3.1b3pre
I might add: Earlier today I ran an automated test suite for a javascript-heavy (and DOM-heavy) web app, too complex to be a test case, results were

Fx 3.0.6               470 secs
Fx3.1b2 JIT            770 secs
Fx3.1b2 NO JIT         614 secs
Trunk JIT              786 secs
Trunk NO JIT           515 secs

Definitely slower with JIT enabled. There seem to be frequent pauses, as if there might be more GC activity but that's a guess.
Keywords: perf, regression
The code still hits a bunch of functions we don't trace (getTime() i.e. or new Date(...)). Thats one bug. The other bug is that we lose so much time from that. Thats not ok. I suggest investigating the latter first.
whale:src gal$ TRACEMONKEY=verbose ./Darwin_DBG.OBJ/js -j x.js | grep "Abort"
Abort recording (line 57, pc 6): call.
Abort recording (line 57, pc 6): call.
Abort recording (line 110, pc 60): Loop edge does not return to header.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 129, pc 79): new.
Abort recording (line 32, pc 105): new.
Abort recording (line 57, pc 6): call.
Abort recording (line 110, pc 60): No compatible inner tree.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 112, pc 82): Inner tree is trying to grow, abort outer recording.
Abort recording (line 57, pc 6): call.
Abort recording (line 57, pc 6): call.
Abort recording (line 57, pc 6): call.
Abort recording (line 57, pc 6): call.
Abort recording (line 57, pc 6): call.
Abort recording (line 129, pc 79): new.
Abort recording (line 32, pc 105): new.
Abort recording (line 57, pc 6): call.
Abort recording (line 129, pc 79): new.
Abort recording (line 32, pc 105): new.
Abort recording (line 57, pc 6): call.
Abort recording (line 129, pc 79): new.
Abort recording (line 32, pc 105): new.
Abort recording (line 57, pc 6): call.
Abort recording (line 129, pc 79): new.
Abort recording (line 32, pc 105): new.
Abort recording (line 57, pc 6): call.
Abort recording (line 129, pc 79): new.
Abort recording (line 32, pc 105): new.
Abort recording (line 57, pc 6): call.
Abort recording (line 129, pc 79): new.
Abort recording (line 32, pc 105): new.

The inner tree aborts might be due to the paths we can't compile.
Adding graydon. Blacklisting doesn't seem to work. We keep aborting the same traces over and over. Maybe GCs reset our state.
Assignee: general → gal
Depends on: 478525, 469347, 478512
Depends on: 478526
Attachment #362176 - Attachment is obsolete: true
With the dependent bugs/patches applied we are doing much better already. Some date constructors are still missing. I will look at those next. This is a good example how somewhat odd code behavior that we don't trace punishes us in the performance department. The code accesses undefined properties in a hot loop, which killed tracing of that loop which caused a lot of tracing without much benefit. More optimizations to follow.

whale:src gal$ ./Darwin_OPT.OBJ/js -j slow-date.js 
Time=2995
whale:src gal$ ./Darwin_OPT.OBJ/js slow-date.js 
Time=2955
I added new Date(double) and Date.setTime. We now gain some speedup. We still get around 4000 side exits due to new Date(6-args) not being traced. The latter we can't support atm due to some nanojit limitations. I will file a bug with wanted 1.9.2. Also, note that this test case now spends most of its time in jsdate.cpp, banging on various dates in various ways. Further speedup will be limited.

whale:src gal$ ./Darwin_OPT.OBJ/js -j slow-date.js 
Time=2282
whale:src gal$ ./Darwin_OPT.OBJ/js slow-date.js 
Time=2993
No longer depends on: 469347
I'm guessing -j is with JIT ? In which case, you seem to have cracked it already !

Help me out - where do I access undefined properties ?
Bug 459440 is about other date constructors - make this depend on that one ?
Depends on: 459440
Yes, -j is with jit enabled and 459440 is in deed blocking. Sorry didn't see that.
(In reply to comment #7)

This is VERY much improved in the current trunk, slowdown all but gone and times much more repeatable.

JIT ON                    JIT OFF
4206                      4123
4480                      3059
3265                      3478
3480                      3486
3505                      3218
4317                      3262
3788                      3428
3780                      3913
4362                      2553
3463                      3595
=====                     =====
38646                     34115

(cf 63827, 39558)

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b4pre) Gecko/20090307 Shiretoko/3.1b4pre
Please try the latest tracemonkey nightly if you can.

/be
Sorry to report the timings are all 10 seconds and higher both with and without JIT.

Using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090307 Minefield/3.2a1pre
So tracemonkey nightly is slower by ~6 seconds than the 1.9.1 branch builds? We need to find when the regression appeared (when the two diverged, since tm has changes not yet in 1.9.1, but not vice versa), then bisect in hg to find the regressing changeset.

/be
See bug 482076
The bugs this depends on are still open, especially: 478525. I will try to do that one today.
On my machine I currently see JIT perf about equal to non-jit perf on the shell test case here. There are 3 main things slowing it down, in order of presumed severity:

- 2000 mismatch exits at line 113. The instruction is the JSOP_ITER_NEXT for the outer loop, which is implemented with an imacro. We get one mismatch exit per run of the loop that starts on line 107 ('for (var ts in arrayToAdd)'). ts is always a string, and a mismatch exit guards for this. But I think js_CallIteratorNext sends back JSVAL_HOLE on the last iteration. 

- Abort on Date constructors.

- Abort on calls to getTime().
(In reply to comment #24)
> But I think
> js_CallIteratorNext sends back JSVAL_HOLE on the last iteration. 

That's correct.
The unboxing mismatch-exits bug is bug 479110.

/be
Depends on: 479110
JIT perf is still only about equal to shell, even with mismatch exits fixed. Right now it looks like Date constructors may be the only problem left. There are 2 calls to the 1-arg form, and 1 call to the 6-arg form. Because of these we fall off trace almost immediately, so the shell program runs mostly in the interpreter.
Constructors are not traced yet. I think I can fix that but there is a P1 ahead of it in my queue.
(In reply to comment #27)
> JIT perf is still only about equal to shell, even with mismatch exits fixed.
> Right now it looks like Date constructors may be the only problem left. There
> are 2 calls to the 1-arg form, and 1 call to the 6-arg form. Because of these
> we fall off trace almost immediately, so the shell program runs mostly in the
> interpreter.

In case it may be of any interest, bug 340992 dealt with some earlier (non-JIT) optimizations re: Date constructors.
(In reply to comment #28)
> Constructors are not traced yet. I think I can fix that but there is a P1 ahead
> of it in my queue.

Bug 478526 was meant to address new Date(1 arg) - dup'd to one that's resolved FIXED
Flags: blocking1.9.2?
Target Milestone: mozilla1.9.1b2 → ---
all the blocking bugs are fixed. what's the status here?
Flags: blocking1.9.2? → wanted1.9.2+
Current opt js shell results for shell test case:

  no jit:  Time=3436
  jit:     Time=2823

The speedup is not great, but we do spend almost all our time on trace:

Activity duration summary:

interpret       82.629710
monitor          5.884010
record           7.546585
compile          4.401155
execute         10.465550
native        2583.598305
-------------------------
Subtotal      2694.525315
I'm going to call this fixed, then.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Firefox: 800ms - Chrome: 900ms !

So, marking as verified!
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: