Open Bug 508849 Opened 15 years ago Updated 5 months ago

js raytracer somewhat slower than webkit

Categories

(Core :: JavaScript Engine, defect)

x86
macOS

People

(Reporter: bzbarsky, Unassigned)

References

(Blocks 1 open bug)

Attachments

(5 files)

The URL in the URL field raytraces in about 20s in my m-c or t-m build.  It's 16s in a WebKit nightly.
Attached file First js file
Attached file Second js file
Attached file Third js file
Attached file HTML testcase
Blocks: 467263
jitstats:

recorder: started(526), aborted(239), completed(1608), different header(7), trees trashed(0), slot promoted(0), unstable loop variable(582), breaks(0), returns(34), unstableInnerCalls(41), blacklisted(47)
monitor: triggered(274505), exits(274473), type mismatch(0), global mismatch(4)

Abort reasons (the part after the last ':' in the abort reports), with their counts:

 159 Inner tree is trying to grow, abort outer recording.
  41 No compatible inner tree.
  17 Inner tree is trying to stabilize, abort outer recording.
   8 Inner tree took different side exit, abort current recording and grow
     nesting tree.
   7 Loop edge does not return to header.
   4 Inner tree not suitable for calling.
Attached file js shell testcase
Each Triangle can have a .render property, which is a function. We don't handle
this well. Each time we call triangle.render(), the Triangle's scope gets
branded.  That gives every Triangle that has a render method a different shape.

A small test case follows; we run it correctly but fall off trace every time we
try to execute the recorded loop.

function method() { return 22; }
function F() { this.m = method; }
var arr = [new F, new F, new F, new F, new F];
for (var i = 0; i < arr.length; i++)
    arr[i].m();

This might or might not be the problem. Most triangles in the demo do not have shaders.
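For comparison with the test case above, here is a hypothetical variant that puts the method on F.prototype instead of assigning it as an own property in the constructor. It is only a sketch: whether this avoids the per-object branding described above is an assumption and has not been measured in this bug. The `total` accumulator is added purely so the result can be checked.

```javascript
// Hypothetical variant of the test case: the method lives on the prototype,
// so no instance gets its own function-valued property.
function method() { return 22; }
function F() {}
F.prototype.m = method;

var arr = [new F, new F, new F, new F, new F];
var total = 0;
for (var i = 0; i < arr.length; i++)
    total += arr[i].m();
```

Running both versions under the same build and comparing jitstats would show whether the per-instance property is what knocks the loop off trace.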
I did just try not giving any triangles shaders, and that didn't help much.
And to be precise, we take about the same 20s, but are at that point spending 80+% of our time on trace as far as I can tell.

jseward, want to try your new stuff on this?
Bug 497789 will help the problem in comment 7, but that may not be biting hard here (yet).

/be
So I tried Julian's patch for bug 503424 to get some information out of this thing.

It looks like we have lots of fragments, spend no more than a few % of time in any of them, and take lots of branch exits.  The code _is_ in fact very branchy here; for example Triangle.prototype.intersect has 4 separate branchpoints in 21 lines of code.  And given the output, we take all of these reasonably often.

So I think speedups are most likely to come from one of several things here:

1)  Reducing the amount of unboxing we have to do when getting this.foo double
    props and doing arithmetic with them.
2)  Reducing the cost of branch exits, especially ones that are patched to jump
    into a different fragment.
3)  _Maybe_ reordering fragments in some way such that the most-common ones run
    first.  It might or might not help; for example the last branchpoint in
    intersect() goes one way about half the time and the other way the other
    half of the time...

Do we already have bugs on #1 and #2?
I filed bug 509069 on item 2.
Depends on: 509069
Some numbers (seconds):

- Opera 10.63: 6.9
- Chrome 8: 7.1
- JM/TM/profiling or JM-only: 9.2-9.7

For the shell version I get 8.5 seconds for V8 and 10.1 for JM. I started profiling this a bit; JM becomes (much) slower than V8 when it has to run this kind of function:
---
function scale(v, scale) {
    return [v[0] * scale, v[1] * scale, v[2] * scale];
}
---
This pattern is common in raytracers; I hope bug 606477 will help here. 

I didn't have time to look further; there may be other issues.
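To isolate the pattern quoted above, a minimal timing sketch follows. `scaleInto` and `timeLoop` are hypothetical helpers that do not appear in the attached testcase, and the iteration count is arbitrary. The sketch assumes the cost of interest is the array created on each call, which this bug does not confirm.

```javascript
// Allocating version, matching the function quoted above
// (second parameter renamed to `s` to avoid shadowing the function name).
function scale(v, s) {
    return [v[0] * s, v[1] * s, v[2] * s];
}

// Hypothetical variant: writes into a caller-supplied array instead of
// allocating a new one on every call.
function scaleInto(out, v, s) {
    out[0] = v[0] * s;
    out[1] = v[1] * s;
    out[2] = v[2] * s;
    return out;
}

// Hypothetical timing helper: runs `body` `iterations` times and returns the
// elapsed wall-clock milliseconds plus a checksum of the returned values.
function timeLoop(body, iterations) {
    var start = Date.now();
    var sum = 0;
    for (var i = 0; i < iterations; i++)
        sum += body(i);
    return { ms: Date.now() - start, sum: sum };
}

var v = [1, 2, 3];
var scratch = [0, 0, 0];
var allocating = timeLoop(function (i) { return scale(v, i)[2]; }, 100000);
var inPlace = timeLoop(function (i) { return scaleInto(scratch, v, i)[2]; }, 100000);
```

The `ms` fields of `allocating` and `inPlace` can be reported with whatever output function the host provides. Matching `sum` values confirm that both loops did the same arithmetic.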
Depends on: 606477
Yeah, I think the right way forward here is to just make JM fast on this code and not try to trace it...
Current JS shell numbers using attached shell testcase:
Interp: 82.27s
-j:     82.721s
-m:     10.68s
-m -n:  6.057s

Looks like JM+TI made this fast. I don't have a v8 shell to compare against, though.
Over here I get:
 -m -n: 5.164s
    d8: 3.494s

So we've got some ways to go.
Assignee: general → nobody
Using the URL testcase:
Nightly 37 - 2.5s
Chrome 39 - 1.8s
Severity: normal → S3
See Also: → 1864381