Closed Bug 900291 Opened 11 years ago Closed 9 years ago

IonMonkey: More aggressive compilation policies for off thread compilation

Categories

(Core :: JavaScript Engine, defect)

RESOLVED WORKSFORME

People

(Reporter: wuwei, Assigned: wuwei)

Details

Attachments

(2 files, 3 obsolete files)

Off-thread compilation is now enabled by default. I profiled the length of the IonWorklist, which is the task queue (or, more precisely, the stack) shared by all JIT helper threads. It turns out that the load on the helper threads is relatively low, so I think it might be good to do off-thread compilation more aggressively.

Here are the profiling statistics for the three main JavaScript benchmarks.
Results may vary across environments, but the length distribution should not change much.

By the way, I saw that jsworkers cancelled a lot of compilation tasks when running the Kraken benchmark, which might be a hint for further optimizations.

Kraken:
#Operations Operation Type
  14325 APPEND
   1393 CANCEL
  12932 POP

#counts length of IonWorklist
  12864 0
  14123 1
   1461 2
    202 3

Octane:
#Operations Operation Type
   2664 APPEND
     15 CANCEL
   2649 POP

#counts length of IonWorklist
   1859 0
   2128 1
    421 2
    262 3
    195 4
    150 5
    110 6
     85 7
     54 8
     22 9
     12 10
      7 11
      7 12
      7 13
      6 14
      3 15

Sunspider:
#Operations Operation Type
  11203 APPEND
    423 CANCEL
  10780 POP

#counts length of IonWorklist
  10094 0
  10701 1
    909 2
    502 3
    200 4
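
In each table the counts add up to the number of operations (for Kraken, 14325 + 1393 + 12932 = 28650), so each count appears to be one length sample per operation. A minimal sketch, not part of the original profiling code, for turning the histograms into "how often is the worklist nearly empty"; the function and constant names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// counts[i] is the number of samples in which the worklist held i tasks.
// Returns the fraction of samples whose length was <= maxLength.
double fractionAtMost(const std::vector<uint64_t>& counts, std::size_t maxLength) {
    uint64_t total = 0;
    uint64_t within = 0;
    for (std::size_t len = 0; len < counts.size(); ++len) {
        total += counts[len];
        if (len <= maxLength) {
            within += counts[len];
        }
    }
    assert(total > 0 && "histogram must contain at least one sample");
    return static_cast<double>(within) / static_cast<double>(total);
}

// Histograms copied from the three tables above (index = worklist length).
const std::vector<uint64_t> kKraken = {12864, 14123, 1461, 202};
const std::vector<uint64_t> kOctane = {1859, 2128, 421, 262, 195, 150, 110, 85,
                                       54,   22,   12,  7,   7,   7,   6,   3};
const std::vector<uint64_t> kSunSpider = {10094, 10701, 909, 502, 200};
```

With these inputs, `fractionAtMost(h, 1)` gives about 0.94 for Kraken, 0.93 for SunSpider, and 0.75 for Octane, so Octane is the only benchmark where the worklist regularly holds more than one task.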
Attached file: kraken profiles (obsolete)
Attached file: Octane profiles (obsolete)
Attached file: SunSpider profiles (obsolete)
(In reply to Wei Wu [:wuwei UTC+8] from comment #0)
> […] so I think it might be good to do off thread compilation
> more aggressively.

I agree, but we need to be careful: IonBuilder is not executed in parallel, so more aggressive compilation will show up as a slowdown on benchmarks.

On the other hand, it would be interesting to test Hannes' idea of compiling non-inlined versions of scripts first and queueing the large inlining compilations.

Another idea, inspired by Hannes' idea, would be to test assumptions on the non-inlined versions of the script, and see whether each assumption holds and how much it affects the optimization of the script. One example is adding an anti-alias guard which bails out if two objects are identical. If such a guard is not hit while running the small scripts, we can remember that for the larger compilations.
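
To make the anti-alias guard idea concrete, here is a rough sketch in plain C++. It is not IonMonkey code: the struct and function names are hypothetical, and a real guard would be emitted as a MIR/LIR instruction with a bailout path instead of a function call.

```cpp
#include <cstdint>

// Bookkeeping gathered while the small, non-inlined compilation is running.
struct AliasGuardStats {
    uint64_t checks = 0;    // how many times the guard was evaluated
    uint64_t failures = 0;  // how many times both operands were the same object
};

// Returns true when a and b are distinct objects. A false result stands in
// for the bailout that a real guard would trigger.
inline bool passesNoAliasGuard(const void* a, const void* b, AliasGuardStats& stats) {
    ++stats.checks;
    if (a == b) {
        ++stats.failures;
        return false;
    }
    return true;
}

// Hypothetical policy for the later, larger compilation: only rely on the
// no-alias assumption if the guard ran at least once and never failed.
inline bool canAssumeNoAlias(const AliasGuardStats& stats) {
    return stats.checks > 0 && stats.failures == 0;
}
```

The point of the sketch is only the bookkeeping: the cheap compilation records whether the assumption ever failed, and the expensive compilation consults that record.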
Although jsworker supports multiple helper threads, only one helper thread can currently be active. We may benefit more if all helper threads are used.
Can you also print which scripts are scheduled to compile? I think that might give additional interesting information. I'm quite interested in the Octane ones.

Maybe it would also make more sense to measure how long a script waits, instead of how many scripts are waiting. It might well be that the queue has length 4 but is emptied quite fast...

To get a real sense of how much difference this would make, we actually need to know whether scheduling a script earlier would have helped. That is the time from when it could have been compiled until the moment we can actually enter it through OSR or function entry. (This seems hard to get, though.) It should tell us more about whether prioritizing would be better than compiling everything at the same time...

Though we need to be careful about increasing the number of helper threads. They could potentially decrease our execution speed, since we would be executing more at the same time. We could also lose more execution power, since we often have to throw all background compilations away when a type changes. I remember that it used to be impossible to run more than one background compilation at a time. I think it should be possible now, since asm.js is doing this.
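
A minimal sketch of what measuring wait time instead of queue length could look like. This is not the actual IonWorklist; the class and field names are hypothetical, and timestamps are passed in explicitly (in microseconds) so the behaviour is easy to check.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

struct PendingTask {
    std::string script;
    int64_t enqueuedAtMicros;
};

struct StartedTask {
    std::string script;
    int64_t waitMicros;  // time spent on the worklist before being picked up
};

// LIFO worklist (a stack, like the one described in comment #0) that
// remembers when each task was appended.
class TimedWorklist {
  public:
    void append(std::string script, int64_t nowMicros) {
        tasks_.push_back(PendingTask{std::move(script), nowMicros});
    }

    // Pops the most recently appended task. Returns false if the list is empty.
    bool pop(int64_t nowMicros, StartedTask& out) {
        if (tasks_.empty()) {
            return false;
        }
        PendingTask task = std::move(tasks_.back());
        tasks_.pop_back();
        out.script = std::move(task.script);
        out.waitMicros = nowMicros - task.enqueuedAtMicros;
        return true;
    }

    std::size_t length() const { return tasks_.size(); }

  private:
    std::vector<PendingTask> tasks_;
};
```

Because the list is a stack, the oldest task waits the longest when new work keeps arriving, which is exactly the case a length histogram cannot show.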
(In reply to Hannes Verschore [:h4writer] from comment #6)
> Can you also print which scripts are scheduled to compile? I think that
> might give additional interesting information. I'm quite interested to the
> octane ones.
> 
> Maybe it would also make more sense to count the time a script is waiting,
> instead of how many scripts are waiting. Because it might well be possible
> that the queue is length 4, but that it is emptied quite fast...
> 
> To have real sense of how much this would make, we actually need to know if
> it would have made a difference if we had scheduled it earlier. So the time
> from when it could have been compiled till the moment we can actually enter
> it through OSR/function enter. (This seems hard to get though). This should
> give more info about if prioritizing would be better or compiling all at the
> same time ...
> 

No problem, I will upload the data you mentioned once I have finished.
How the waitTime and compileTime were calculated:

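int64_t begin = PRMJ_Now();  // timestamp in microseconds
...code...
int64_t end = PRMJ_Now();
// Print with PRId64 from <inttypes.h>; "%u" does not match an int64_t argument.
printf("%" PRId64 "\n", end - begin);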

Here is a snippet (first 50 lines):

[ionWorklist] 1375542279640314 1 APPEND earley-boyer.js : 536
[ionWorklist] 1375542279640338 0 POP earley-boyer.js : 536
[WorkerThread] 1375542279640341 0 BEGIN earley-boyer.js : 536
[WorkerThread] 1375542279640492 0 FINISH earley-boyer.js : 536 WaitTime: 24 CompileTime: 154
[ionWorklist] 1375542279856473 1 APPEND richards.js : 309
[ionWorklist] 1375542279856508 0 POP richards.js : 309
[WorkerThread] 1375542279856511 0 BEGIN richards.js : 309
[WorkerThread] 1375542279856647 0 FINISH richards.js : 309 WaitTime: 35 CompileTime: 139
[ionWorklist] 1375542279856820 1 APPEND richards.js : 188
[ionWorklist] 1375542279856853 0 POP richards.js : 188
[WorkerThread] 1375542279856856 0 BEGIN richards.js : 188
[ionWorklist] 1375542279857191 1 APPEND richards.js : 324
[ionWorklist] 1375542279857714 2 APPEND richards.js : 527
[ionWorklist] 1375542279857887 3 APPEND richards.js : 401
[WorkerThread] 1375542279858168 3 FINISH richards.js : 188 WaitTime: 32 CompileTime: 1316
[ionWorklist] 1375542279858170 2 POP richards.js : 401
[WorkerThread] 1375542279858171 2 BEGIN richards.js : 401
[ionWorklist] 1375542279858216 3 APPEND richards.js : 465
[ionWorklist] 1375542279858271 4 APPEND richards.js : 230
[ionWorklist] 1375542279858279 5 APPEND richards.js : 313
[ionWorklist] 1375542279858322 6 APPEND richards.js : 241
[ionWorklist] 1375542279858346 7 APPEND richards.js : 345
[WorkerThread] 1375542279858407 7 FINISH richards.js : 401 WaitTime: 283 CompileTime: 237
[ionWorklist] 1375542279858409 6 POP richards.js : 345
[WorkerThread] 1375542279858409 6 BEGIN richards.js : 345
[WorkerThread] 1375542279858510 6 FINISH richards.js : 345 WaitTime: 63 CompileTime: 101
[ionWorklist] 1375542279858512 5 POP richards.js : 241
[WorkerThread] 1375542279858512 5 BEGIN richards.js : 241
[ionWorklist] 1375542279858540 6 APPEND richards.js : 430
[WorkerThread] 1375542279858637 6 FINISH richards.js : 241 WaitTime: 190 CompileTime: 125
[ionWorklist] 1375542279858639 5 POP richards.js : 430
[WorkerThread] 1375542279858640 5 BEGIN richards.js : 430
[ionWorklist] 1375542279858748 6 APPEND richards.js : 204
[ionWorklist] 1375542279858755 7 APPEND richards.js : 301
[ionWorklist] 1375542279858770 7 CANCEL richards.js : 324
[WorkerThread] 1375542279858883 6 FINISH richards.js : 430 WaitTime: 99 CompileTime: 244
[ionWorklist] 1375542279858885 5 POP richards.js : 204
[WorkerThread] 1375542279858886 5 BEGIN richards.js : 204
[WorkerThread] 1375542279858951 5 FINISH richards.js : 204 WaitTime: 137 CompileTime: 66
[ionWorklist] 1375542279858953 4 POP richards.js : 313
[WorkerThread] 1375542279858953 4 BEGIN richards.js : 313
[WorkerThread] 1375542279858982 4 FINISH richards.js : 313 WaitTime: 674 CompileTime: 29
[ionWorklist] 1375542279859012 3 POP richards.js : 230
[WorkerThread] 1375542279859013 3 BEGIN richards.js : 230
[WorkerThread] 1375542279859052 3 FINISH richards.js : 230 WaitTime: 742 CompileTime: 40
[ionWorklist] 1375542279859053 2 POP richards.js : 465
[WorkerThread] 1375542279859054 2 BEGIN richards.js : 465
[ionWorklist] 1375542279859103 3 APPEND richards.js : 305
[ionWorklist] 1375542279859378 4 APPEND richards.js : 188
[WorkerThread] 1375542279859465 4 FINISH richards.js : 465 WaitTime: 837 CompileTime: 412
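
For anyone who wants to analyse the full profile, here is a small sketch that recomputes wait times from the `[ionWorklist]` lines by pairing each POP with the preceding APPEND of the same script and line. The field layout is inferred from the snippet above, and the function and struct names are hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct WaitSample {
    std::string task;    // "file:line", e.g. "richards.js:309"
    int64_t waitMicros;  // POP timestamp minus APPEND timestamp
};

// Assumed line layout, taken from the snippet above:
//   [ionWorklist] <timestamp> <length> <APPEND|POP|CANCEL> <file> : <line>
// [WorkerThread] lines and anything that does not parse are skipped.
std::vector<WaitSample> waitTimesFromLog(const std::string& log) {
    std::map<std::string, int64_t> appendedAt;  // pending task -> APPEND timestamp
    std::vector<WaitSample> samples;
    std::istringstream lines(log);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string tag, op, file, colon;
        int64_t timestamp = 0;
        int length = 0;
        int lineno = 0;
        if (!(fields >> tag >> timestamp >> length >> op >> file >> colon >> lineno)) {
            continue;
        }
        if (tag != "[ionWorklist]") {
            continue;
        }
        const std::string key = file + ":" + std::to_string(lineno);
        if (op == "APPEND") {
            // If the same key is appended again while pending, the later
            // timestamp wins; a sketch-level simplification.
            appendedAt[key] = timestamp;
        } else if (op == "CANCEL") {
            appendedAt.erase(key);
        } else if (op == "POP") {
            std::map<std::string, int64_t>::iterator it = appendedAt.find(key);
            if (it != appendedAt.end()) {
                samples.push_back(WaitSample{key, timestamp - it->second});
                appendedAt.erase(it);
            }
        }
    }
    return samples;
}
```

On the first two tasks in the snippet this yields 24 and 35 microseconds, matching the logged WaitTime values. Later entries can differ by a microsecond (richards.js : 188 gives 33 here against a logged 32), presumably because the log timestamp and the WaitTime value were sampled separately.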
Attachment #784114 - Attachment is obsolete: true
Attachment #784115 - Attachment is obsolete: true
Attachment #784116 - Attachment is obsolete: true
Profiles with corresponding source code, which might simplify profile analysis.
(In reply to Wei Wu [:wuwei UTC+8] from comment #8)
> Created attachment 785392 [details]
> Octane profiles with waitTime and compileTime
> 
> How the waitTime and compileTime were calculated:
> 
> int64_t begin = PRMJ_Now();
> ...code...
> int64_t end = PRMJ_Now();
> printf("%u\n", end - begin);
> 
> Here is a snippet (first 50 lines):
> 

The profile was obtained on a relatively new machine with eight logical cores (Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz), so the waitTime and compileTime might be smaller than in the average case.
Assignee: general → nobody
Assignee: nobody → lazyparser
Bug 1013172 and some other bugs have landed, which made this bug obsolete.
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME