TI: Incorrect results with compiled raytrace benchmark

RESOLVED FIXED

Status

()

Core
JavaScript Engine
RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: azakai, Unassigned)

Tracking

(Blocks: 1 bug)

Other Branch
x86
Linux
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments)

(Reporter)

Description

7 years ago
Created attachment 520802 [details]
raytrace benchmark

The attachment is a simple raytrace benchmark compiled to JavaScript. Running it on jaegermonkey gives incorrect results when using the method JIT:

1. Running it with -m and parameters |3 32| causes it to write (after the 3 lines of header) just 0's. The correct result has lots of non-0 values.

2. Running it with -m and parameters |5 64| causes it to enter an apparently infinite loop.

Running without -m works ok.
(Reporter)

Comment 1

7 years ago
Created attachment 520818 [details]
raytrace benchmark - alternative version

This is an alternative build of the same original C++ code, this time without using llvm-opt. This version is significantly smaller.

Similar but not identical problems happen here (with 5 64, I see invalid 0's and not a hang).
Bug 643653 has a reduced test case, may be the same problem. Mentioning it here in case you go through these bugs in order.
This seems to work now, same issues I think as bug 643653.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
(Reporter)

Comment 4

7 years ago
Created attachment 523114 [details]
raytrace, gcc w. emscripten opts

Attached is a version compiled with different options. With arguments |3 16|, it works with |-m| but gives wrong output with |-m -n| (0's, and also much slower).
(Reporter)

Comment 5

7 years ago
Reopening, see last comment/attachment.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Will need a reduced testcase to look at the correctness problem, but the slowdown looks to be from stubbed SETELEMs (bug 621937).  Will investigate that once the test is working.  I think we should be able to do fairly good with these translated programs; they treat the entire heap as one big array, but it's an array of numbers and monomorphic marshaling functions are used to read and write to it (e.g. String_copy).
(Reporter)

Comment 7

7 years ago
(In reply to comment #6)
> I think we should be able to do fairly good with
> these translated programs; they treat the entire heap as one big array, but
> it's an array of numbers and monomorphic marshaling functions are used to read
> and write to it (e.g. String_copy).

Would be great if TI speeds up this kind of code. It's compiled from something statically typed, so in theory it should be possible to infer the types of everything ahead of time, I hope?

(Well, except that the heap array mixes ints and floats. Does that matter? Emscripten does have an option to compile with separate heaps for ints and floats, but it means much more memory is needed.)
Ah, mixing ints and floats in the same array will currently confuse TI, we will coerce every value read from the array into a float.  That could be part/most of the slowdown, actually, right now we make slow stub calls when floats are used to index into arrays (even if those floats are easily convertible to ints).

The plan for the analysis (not sure the timeframe) is to be able to detect this sort of polymorphism and adapt to it, use instrumentation to figure out which access sites have always been accessing integers and stick type tests there to allow the program to use polymorphic structures and code while retaining accurate types and speed for the monomorphic parts of the program.
Created attachment 523281 [details]
Reduced

$ ./js -m -n -a test.js
test.js:13: Error: Assertion failed: got (void 0), expected 5

Test passes if I remove the "use strict".
The problem is that in Compiler::ensureInteger there was a place where we alloc'ed an FP register inside conditional code, so that if the alloc emitted sync code it could be jumped over by the fast path.

I accidentally fixed this as part of bug 647048, which replaced the allocFPReg with a use of the conversion temporary.  I now get the same output with -m, -m -n, and -m -n -a.  Will check in the test case.
Status: REOPENED → RESOLVED
Last Resolved: 7 years ago7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.