Closed Bug 1303122 Opened 8 years ago Closed 8 years ago

x64: codegen of LCompareI64AndBranch lacks the stack case

Categories

(Core :: JavaScript Engine, defect)

51 Branch
x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox51 + fixed
firefox52 --- fixed

People

(Reporter: konsoletyper, Assigned: bbouvier)

Details

Attachments

(8 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.113 Safari/537.36

Steps to reproduce:

1. opened http://teavm.org/live-examples/wasm/ in Firefox nightly
2. downloaded http://teavm.org/live-examples/wasm/classes.wasm and tried to run it via SpiderMonkey shell:
var buffer = "";
function putchar(c) {
    if (c == 10) {
        print(buffer);
        buffer = "";
    } else {
        buffer += String.fromCharCode(c);
    }
}
function currentTimeMillis() {
    return new Date().getTime();
}

var data = read("classes.wasm", "binary");
var module = Wasm.instantiateModule(data, {
    runtime: {
        putchar: putchar,
        currentTimeMillis: currentTimeMillis
    },
    spectest: {
        print: putchar
    }
});
module.exports.main();
3. Built SpiderMonkey from sources and run it under gdb


Actual results:

1. Firefox reported: "RangeError: invalid or out-of-range index"
2. JS shell reported
Running BigInteger benchmark
run.js:24:1 RangeError: invalid or out-of-range index
Stack:
  @run.js:24:1
3. segfault:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffef63a700 (LWP 13991)]
0x000000000098de18 in js::jit::ToRegister (a=...)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/jit/shared/CodeGenerator-shared-inl.h:81
81          MOZ_ASSERT(a.isGeneralReg());
Missing separate debuginfos, use: dnf debuginfo-install libgcc-5.3.1-6.fc23.x86_64 libstdc++-5.3.1-6.fc23.x86_64 zlib-1.2.8-9.fc23.x86_64
(gdb) bt
#0  0x000000000098de18 in js::jit::ToRegister (a=...)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/jit/shared/CodeGenerator-shared-inl.h:81
#1  0x000000000098dee2 in js::jit::ToRegister64 (a=...)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/jit/shared/CodeGenerator-shared-inl.h:115
#2  0x000000000098f0d8 in js::jit::CodeGeneratorX64::visitCompareI64AndBranch (this=0x7fffef6388d0, 
    lir=0x7fffebc314f0)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/jit/x64/CodeGenerator-x64.cpp:257
#3  0x000000000081f179 in js::jit::LCompareI64AndBranch::accept (this=0x7fffebc314f0, 
    visitor=0x7fffef6388d0)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/jit/shared/LIR-shared.h:2583
#4  0x000000000067b396 in js::jit::CodeGenerator::generateBody (this=0x7fffef6388d0)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/jit/CodeGenerator.cpp:5148
#5  0x000000000068c1f4 in js::jit::CodeGenerator::generateWasm (this=0x7fffef6388d0, sigId=..., 
    offsets=0x7fffeba2f650)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/jit/CodeGenerator.cpp:9144
#6  0x00000000008e04c3 in js::wasm::IonCompileFunction (task=0x7fffeba2e9a0)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/asmjs/WasmIonCompile.cpp:3700
#7  0x00000000008e05eb in js::wasm::CompileFunction (task=0x7fffeba2e9a0)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/asmjs/WasmIonCompile.cpp:3715
#8  0x0000000000c75e1d in js::HelperThread::handleWasmWorkload (this=0x7ffff694e518, locked=...)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/vm/HelperThreads.cpp:1323
#9  0x0000000000c77878 in js::HelperThread::threadLoop (this=0x7ffff694e518)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/vm/HelperThreads.cpp:1740
#10 0x0000000000c75ceb in js::HelperThread::ThreadMain (arg=0x7ffff694e518)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/vm/HelperThreads.cpp:1308
#11 0x0000000000c84ee2 in js::detail::ThreadTrampoline<void (&)(void*), js::HelperThread*>::callMain<0ul> (
    this=0x7ffff6917150)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/threading/Thread.h:242
#12 0x0000000000c81eaf in js::detail::ThreadTrampoline<void (&)(void*), js::HelperThread*>::Start (
    aPack=0x7ffff6917150)
    at /home/konsoletyper/devel/mozilla-central-401ea746b1a9/js/src/threading/Thread.h:235
#13 0x00007ffff7bc561a in start_thread () from /lib64/libpthread.so.0
#14 0x00007ffff6c4a5fd in clone () from /lib64/libc.so.6



Expected results:

Execution completes normally. Or at least SpiderMonkey allows to debug what's happenning. This wasm file completes without errors in V8.
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
Thank you very much for the bug report! It is actually an interesting case that is not exerced enough in our codebase: regalloc is lacking registers here, so one instruction inherits a stack slot. Codegen didn't handle this case, so it is a simple fix, I think. Rephrasing a bit the title to make it sound less dramatic :)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: WebAssembly does not work → x64: codegen of LCompareI64AndBranch lacks the stack case
Attached patch 01.codegen.patchSplinter Review
This handles the stack case for LCompareAndBranchI. Other bugs of this kind may be lurking like this, but intense fuzzing is coming.
Assignee: nobody → bbouvier
Status: NEW → ASSIGNED
Attachment #8791743 - Flags: review?(luke)
Attached patch 02.bce.patchSplinter Review
This test case is great, it's found another issue in BCE where we would add a bounds check twice in the hash map if it's not redundant.

Note that both patches don't have tests yet; I will write these tomorrow.
Attachment #8791744 - Flags: review?(luke)
Thanks again for the report, it's a great one (found two unrelated issues on different architectures).

I am a bit curious: visiting the domain name presented in comment 0, I assume this is automatically compiled from an experimental TeaVM backend? If so, that's great!

My local testing showed the RangeError in both cases, but comment 0 seems to say that Canary doesn't throw a runtime error. Is that right? If so, there might be either a bug in our impl or Canary's, but it's hard to tell without a source code (and our disassembler isn't fully implemented yet). Out of luck, would you have the equivalent text format (wast) source of this test case, and be kind enough to upload it here as well, please?
Flags: needinfo?(konsoletyper)
This file is generated automatically from the Java bytecode.

Yes, I have wast files as well, though I'm a bit unsure if I can setup identical environment to the one I used to generate former wasm file. I'll attach both new wasm and wast. Also I generate compilable .c file from the same wasm model, without it I have no chance to debug my wasm generator. Not sure if the latter may be useful.
Attached file classes.wasm
Attached file classes.wast
Applied your patches and now I get another error:

Starting program: /home/konsoletyper/devel/mozilla-central-29af101880db/js/src/build_DBG.OBJ/dist/bin/js run.js
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffefe3b700 (LWP 26400)]
[Thread 0x7fffefe3b700 (LWP 26400) exited]
[New Thread 0x7fffefe3b700 (LWP 26401)]
[New Thread 0x7fffef63a700 (LWP 26402)]
[New Thread 0x7fffeee39700 (LWP 26403)]
[New Thread 0x7fffee638700 (LWP 26404)]
[New Thread 0x7fffede37700 (LWP 26405)]
[New Thread 0x7fffed636700 (LWP 26406)]
[New Thread 0x7fffece35700 (LWP 26407)]
[New Thread 0x7fffec634700 (LWP 26408)]

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7facf6e in ?? ()
(gdb) bt
#0  0x00007ffff7facf6e in ?? ()
#1  0x00007fffffffa590 in ?? ()
#2  0x00007ffff7fad025 in ?? ()
#3  0x0000000000000000 in ?? ()


Sorry, I'm not aware of channels and naming, etc. I just downloaded latest Firefox nightly and latest source code snapshot from Mercurial.
Attachment #8791743 - Flags: review?(luke) → review+
Comment on attachment 8791744 [details] [diff] [review]
02.bce.patch

Review of attachment 8791744 [details] [diff] [review]:
-----------------------------------------------------------------

Whew, good find!
Attachment #8791744 - Flags: review?(luke) → review+
(In reply to konsoletyper from comment #8)
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007ffff7facf6e in ?? ()
> (gdb) bt
> #0  0x00007ffff7facf6e in ?? ()
> #1  0x00007fffffffa590 in ?? ()
> #2  0x00007ffff7fad025 in ?? ()
> #3  0x0000000000000000 in ?? ()
> 
> 
> Sorry, I'm not aware of channels and naming, etc. I just downloaded latest
> Firefox nightly and latest source code snapshot from Mercurial.

We use a mmap trick in Firefox to catch out-of-bounds memory accesses under x64 (mmap a large region, mprotect everything > the length of the linear memory, handle SEGFAULT appropriately). So this might be the segfault triggered by this mechanism; if you continue, from a debugger, does it crash, or does it step over the out-of-bounds and yield the RangeError instead?
Pushed by bbouvier@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5841b0c33b7e
Handle stack in codegen of LCompareAndBranchI on x64; r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/945544b11e38
Don't re-add an already seen bounds check if it's not redundant; r=luke
Yes, SpiderMonkey yields RangeError, and you may be right. However, how can I find out where my code tries to access address beyound heap limit? There are no debuggers in JS engines. I use the following approach to debug my code before I try to run it in SpiderMonkey and v8. My generator has wasm object model that I can serialize to anything I want. Binary and wast format serializers are supported. Additionally, my generator can produce compilable C file, which I run under gdb. It does not yield any signals during execution. Note that to allocate heap, I use simple malloc. When generated C code tries to access memory beyound heap limits, it usually gets into unmapped pages and produces sigsegv (and I use this fact to catch bugs), the only exception is when address does not exceed heap limit very much, that may be the reason. Also, I run this file under v8 and it completes successfully.

Can you give me any clues of how I can debug my wasm file in SpiderMonkey? The binary offset of an instruction which causes error may be sufficient, but full stack trace is preferred.
Hi, sorry for your trouble!  Bug 1277973 is next on my todo list and it shouldn't take more than a week and will provide full stack traces for wasm frames up to the faulting frame.  If you build your wasm with "-g", I think you'll get a Name section (https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md#name-section) which will give you a symbolicated backtrace.  (We are also working on full devtool integration which would allow you to break on the fault in the debugger, but that will take longer.)

Do you also see the RangeError in Chrome?
I don't have any errors in v8, my example completes without errors and prints all expected text to console.

My compiler does not provide "-g" command line option (nor it prodides CLI, it rather lauched as a Maven plugin), but it produces "name" section anyway.

I see, but can I do something prior corresponding feature is release? For example, when I developed wasm binary serializer, I found place in SpiderMonkey code that parsed instructions and inserted printf there which reported instruction opcode and offset, recompled it and so did with my binary serializer, and then matched both outputs. It there anything like this in codegen, so before emitting some native code I could insert something like

printf("Instruction at %d becomes native code at %d\n", wasmOffset, nativeOffset);

?
Oops, sorry I missed that you already said this above.  Yes, we'll look into this and find our bug here.

Unfortunately there's no such easy pinch point I can think of: for traps, we simply branch on the condition and then jump to a stub shared by the rest of the module so we've lost the context of the error.  In fact, the fix for bug 1277973 is precisely to add an intermediate thunk that is out of the hot path but still knows the bytecode offset (and frame depth) of the faulting insn which it can pass along to the shared throwing stub.
I've tried a few things, and disabling GVN/RA/LICM doesn't change the result.

After some low level debugging, it seems that the failing function is method$20org_gteavm_gruntime_gGC_Z27_getAvailableChunkIfPossibleI.

The load which is precisely failing is in there:

            ;; org/teavm/runtime/GC.java:98
            (i32.store align=4
              ;; org/teavm/runtime/GC.java:98
              (i32.const 5780)
              (i32.add
                (i32.load align=4
                  ;; org/teavm/runtime/GC.java:98
                  (i32.const 5784))
                (i32.load align=4                         ;; this one
                  ;; org/teavm/runtime/GC.java:98
                  (i32.add
                    (i32.load align=4
                      ;; org/teavm/runtime/GC.java:98
                      (i32.const 5784))
                    ;; org/teavm/runtime/GC.java:98
                    (i32.const 4)))))

found from the assembly thanks to the 0x1694 constant.

   0x7ffff7fb4346:	mov    (%r15,%rdx,1),%edx ;; right there
   0x7ffff7fb434a:	add    %edx,%ecx
   0x7ffff7fb434c:	mov    $0x1694,%edx
   0x7ffff7fb4351:	mov    %ecx,(%r15,%rdx,1)

Not sure if that helps until we can get a fix for bug 1277973.
One thing Emscripten has done for us in the past that really expedites finding bugs is generating code that prints on every load/store (the address and value).  Assuming the workload is deterministic, this helps pinpoint the divergence.
Attached file classes.wast.bz2
Attached file classes.c.bz2
Attached file wasm.c
Attached file classes.wasm
I added an option to my wasm -> C translator which enables to insert assert before each memory access. Generated C file does not fail with sigabrt. I looked through method$20org_gteavm_gruntime_gGC_Z27_getAvailableChunkIfPossibleI and all assertions seem to be there. I tried to alter these assertions by hands, and they really work when I decrease upper limit.

Since my compiler is under active development, I can no more generate the same files, therefore I attached recently generated wasm, wast and C files.
https://hg.mozilla.org/mozilla-central/rev/5841b0c33b7e
https://hg.mozilla.org/mozilla-central/rev/945544b11e38
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
(In reply to konsoletyper from comment #22)
> I added an option to my wasm -> C translator which enables to insert assert
> before each memory access. Generated C file does not fail with sigabrt. I
> looked through
> method$20org_gteavm_gruntime_gGC_Z27_getAvailableChunkIfPossibleI and all
> assertions seem to be there. I tried to alter these assertions by hands, and
> they really work when I decrease upper limit.
> 
> Since my compiler is under active development, I can no more generate the
> same files, therefore I attached recently generated wasm, wast and C files.

I'm having some issues with the files you've provided. Some of these could probably be worked on our sides:
- in the wast format:
  - we don't allow the binary data to be on strings that are on several lines (I guess v8 does allow this).
  - we don't allow % or @ in variable names.
  - even with these two are changed, I get a signature mismatch error in one indirect call?
- I can't compile the C file with the current emscripten branch, because emscripten refuses to compile; I assume most work on emscripten's side is done on the 0xc branch, so that should arrive soon.

In the meanwhile, I know I am asking a lot, but if you were willing to, that would help us a lot:
- replace the assertions by calls to printf, printing the byte index in the linear memory, the value just loaded (or about to be stored), for each memory access
- provide us a wasm file that does this? (polyfilling printf with console.log is fine)
- provide us with the output of the printed text when running under chromium/v8?
[Tracking Requested - why for this release]: small bugs in wasm (to be enabled in 51).
> in the wast format:
>  - we don't allow the binary data to be on strings that are on several lines (I guess v8 does allow this).
Does the browser support wast format directly? I could not make neither FireFox, nor Chrome to load wast.

> - I can't compile the C file with the current emscripten branch, because emscripten refuses to compile; I assume most work on emscripten's side is done on the 0xc branch, so that should arrive soon.

This file is not for compilation by emscripten, it rather should be compiled by gcc.

> - replace the assertions by calls to printf, printing the byte index in the linear memory, the value just loaded (or about to be stored), for each memory access
> - provide us a wasm file that does this? (polyfilling printf with console.log is fine)
> - provide us with the output of the printed text when running under chromium/v8?

Already done it on my side and it looks like SpiderMonkey does not run start function. As a workaround I put call to my start function to the beginning of exported main function. Can you check under debugger what's happening to start function, or should I create a new bug report with all necessary files attached?
(In reply to konsoletyper from comment #26)
> > in the wast format:
> >  - we don't allow the binary data to be on strings that are on several lines (I guess v8 does allow this).
> Does the browser support wast format directly? I could not make neither
> FireFox, nor Chrome to load wast.

No, the browser doesn't allow that (as of today). In the Spidermonkey JS shell, you can use the wasmTextToBinary function to test against it; otherwise, there is a project on the WebAssembly repo that does this conversion: https://github.com/WebAssembly/sexpr-wasm-prototype . Not sure how well it does run under Firefox, though.

> > - replace the assertions by calls to printf, printing the byte index in the linear memory, the value just loaded (or about to be stored), for each memory access
> > - provide us a wasm file that does this? (polyfilling printf with console.log is fine)
> > - provide us with the output of the printed text when running under chromium/v8?
> 
> Already done it on my side and it looks like SpiderMonkey does not run start
> function. As a workaround I put call to my start function to the beginning
> of exported main function. Can you check under debugger what's happening to
> start function, or should I create a new bug report with all necessary files
> attached?

Sounds like an issue, yes; it should already run (we have tests ensuring it does, and in these tests the start function runs at *instanciation*). Can you make sure the start section is placed just after the export section? Otherwise, let's open a new bug with a test case and discuss it there.

If you're open to it, we could have quicker discussions and feedback on irc: we're using the irc network irc.mozilla.org; I'm there in the #jsapi or #ionmonkey channels under the name bbouvier.
Track for 51+ as this is wasm related bug.
Hi :bbouvier,
Since this bug also affects 51, do you consider to uplift this for 51 if this patch is not too risky?
Flags: needinfo?(bbouvier)
Yes, will ask for uplifting: both patches are tiny.
Flags: needinfo?(bbouvier)
Comment on attachment 8791743 [details] [diff] [review]
01.codegen.patch

Approval Request Comment
[Feature/regressing bug #]: probably bug 1279248
[User impact if declined]: undefined behavior in wasm programs (could end up being very bad)
[Describe test coverage new/current, TreeHerder]: test added, green for a week, all good
[Risks and why]: pretty much no risk (we'll uplift other fixes if need be) 
[String/UUID change made/needed]: n/a
Attachment #8791743 - Flags: approval-mozilla-aurora?
Comment on attachment 8791744 [details] [diff] [review]
02.bce.patch

Approval Request Comment
[Feature/regressing bug #]: bug 1282618
[User impact if declined]: corrupted hash maps
[Describe test coverage new/current, TreeHerder]: test added, all green on treeherder for a week
[Risks and why]: pretty low, if not no risk at all
[String/UUID change made/needed]: n/a
Attachment #8791744 - Flags: approval-mozilla-aurora?
For what it's worth, we've discussed a bit with konsoletyper on irc:
- the start function seems to now run under firefox.
- konsoletyper explained me how to compile Java files to wasm with the experimental backend, so I can go ahead and investigate what the issue is.
Flags: needinfo?(konsoletyper)
Comment on attachment 8791743 [details] [diff] [review]
01.codegen.patch

Fix a wasm related issue. Take it in 51 aurora.
Attachment #8791743 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8791744 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: