OdinMonkey: Invalid write of size 4 [@ js::Invoke]

RESOLVED INVALID

Status

()

Core
JavaScript Engine
--
critical
RESOLVED INVALID
4 years ago
3 years ago

People

(Reporter: gkw, Unassigned)

Tracking

(Blocks: 2 bugs, {regression, testcase, valgrind})

Trunk
x86_64
Linux
regression, testcase, valgrind
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments)

(Reporter)

Description

4 years ago
Created attachment 801179 [details]
Valgrind stack

x = (function () {});
x.valueOf = (function (stdlib, n, heap) {
    "use asm";
    var Float32ArrayView = new stdlib.Float32Array(heap);
    function f(i0) {
        i0 = i0 | 0;
        var d5 = .0;
        Float32ArrayView[((0 > 0) - 0xe) >> 2] = 5.;
    }
    return f
})
(this, {}, ArrayBuffer(4096));
x + '';

shows an invalid write of size 4 on m-c changeset 3697f962bb7b without any CLI arguments.

s-s because this is an invalid write.

My configure flags are:

--enable-optimize --disable-debug --enable-profiling --enable-gczeal --enable-debug-symbols --enable-methodjit --enable-type-inference --disable-tests --enable-more-deterministic --enable-valgrind --with-ccache --enable-threadsafe <other NSPR options>
Flags: needinfo?(luke)
(Reporter)

Comment 1

4 years ago
autoBisect shows this is probably related to the following changeset:

The first bad revision is:
changeset:   http://hg.mozilla.org/mozilla-central/rev/b3d85b68449d
user:        Luke Wagner
date:        Fri Mar 15 02:29:02 2013 -0700
summary:     Bug 840282 - OdinMonkey (sr=dmandelin)
Blocks: 840284, 840282
(Reporter)

Comment 2

4 years ago
Tested with `valgrind --smc-check=all-non-file ./js testcase.js`.
Guess it occurs only on x64? If it's the case, it's not a bug, it's a feature: on x64, Odin mprotects the memory and attaches a handler to the exception mechanism if the read is out of bounds, which is the case here (pointer is 0xFFFFFFF0). So there is actually no write. Maybe Valgrind doesn't get this mechanism?

Comment 4

4 years ago
Thanks for looking into this Benjamin, I think you're right.  It'd be good to know whether this is supposed to work in Valgrind or whether Valgrind just doesn't support catching SIGSEGV.
Flags: needinfo?(luke)

Comment 5

4 years ago
I guess I should ask: Julian, does Valgrind support using a SIGSEGV handlers that catch and correct faulting memory accesses?
Group: core-security
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Flags: needinfo?(jseward)
Resolution: --- → INVALID
Not invalid. The same test, and also the OOB jit-tests are failing on ASan. This was working before, and must have been recently regressed. We need to fix this :) I'll try to bisect and see if I can find the regressor.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---

Comment 7

4 years ago
Just to be sure, I stepped through this in a debug build: we do indeed fault, once, and the fault occurs in the protected region by an known heap access and so the fault is handled and execution continues.  It would be useful to put a printf() in AsmJSFaultHandler to see if it is getting called.

Oh, I had one idea, though: I recently changed the Unix path to stop installing the handler for SIGBUS (http://hg.mozilla.org/mozilla-central/rev/e89815fe247d) since I thought SIGBUS was only used for Mac (which now takes a totally different path) and x86 segment-based bounds checking (which was also removed).  However, it's possible ASan/Valgrind report the fault as SIGBUS.  If so, we can reinstall SIGBUS.
(In reply to Luke Wagner [:luke] from comment #5)
> I guess I should ask: Julian, does Valgrind support using a SIGSEGV handlers
> that catch and correct faulting memory accesses?

Valgrind emulates the kernel's signal-delivery semantics with varying
levels of accuracy on different architectures.  On x86/x86_64 we do
the best job.  Other platforms could be improved if necessary.

That said .. I think for all platforms, it supports the sequence "take
a fault; change permissions on the mapped page; restart the faulting
insn" OK.  But you might need
--vex-iropt-register-updates=allregs-at-mem-access in order to have
correct register values for the instruction restart.  By default, V's
JIT optimises away (simulated) register writes when it can, improving
performance but losing so-called "precise exceptions" semantics.  This
flag restores that, at some performance cost (probably not much).
Flags: needinfo?(jseward)

Comment 9

4 years ago
(In reply to Julian Seward from comment #8)
Thanks Julian!  Our handlers need to do one additional thing in this case: update the pc (skipping over the faulting load/store which, if reexecuted, would re-fault).  Should that work?
I think that will work for x86-linux (32 bit) and amd64-linux (64 bit)
since V correctly restores the simulated registers from the sigcontext
when the signal handler returns.  Won't work on arm-linux without a bit
of fixing (an hour's worth, kind of thing).  Also won't work for 
x86-darwin and amd64-darwin (same comment).
I'm confused, now I think more about it.  Odin mprotects some huge
area no-access, and then attempts to write there.  So isn't V correct
in complaining about an invalid write?

Comment 12

4 years ago
(In reply to Julian Seward from comment #11)
The write faults, the handler advances the PC past the write, and execution resumes (the JS semantic effect of an out-of-bounds write is a nop).
(Reporter)

Updated

4 years ago
Flags: needinfo?(jseward)
I'm still confused.  AFAICS V is correct to complain about the write.
Then the only remaining question is, does execution continue correctly
when the signal handler returns?
Flags: needinfo?(jseward)
(Reporter)

Comment 14

4 years ago
(Adding another needinfo for Luke about comment 13)

Julian, I'm not sure if I need to add --vex-iropt-register-updates=allregs-at-mem-access, do I?
Flags: needinfo?(luke)

Comment 15

4 years ago
I think V should only complain about the write if the signal has the default disposition (which means it will crash the process).  Execution does continue correctly when the signal handler returns since the PC has been advanced to the instruction immediately after the write.
Flags: needinfo?(luke)
(Reporter)

Comment 16

4 years ago
This is very reproducible (with varying signatures) such that it blocks running jsfunfuzz with Valgrind, unfortunately..
Flags: needinfo?(jseward)

Comment 17

4 years ago
In the meantime, you can run with --no-asmjs.
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #14)
> Julian, I'm not sure if I need to add
> --vex-iropt-register-updates=allregs-at-mem-access, do I?

You do, yes.
Flags: needinfo?(jseward)
(Reporter)

Comment 19

4 years ago
(In reply to Julian Seward from comment #13)
> I'm still confused.  AFAICS V is correct to complain about the write.
> Then the only remaining question is, does execution continue correctly
> when the signal handler returns?

I'm not sure, I think it's an error because it trips Valgrind's error code (we pass in error number to be used is 77, and watch for it). I'll re-run with the new flag and see how it goes with --vex-iropt-register-updates=allregs-at-mem-access.
(Reporter)

Comment 20

4 years ago
Created attachment 807650 [details]
Valgrind stack with --vex-iropt-register-updates=allregs-at-mem-access

valgrind --smc-check=all-non-file --vex-iropt-register-updates=allregs-at-mem-access ./js testcase.js

function x() {}
x.valueOf = (function(stdlib, n, heap) {
    "use asm"
    var Float64ArrayView = new stdlib.Float64Array(heap);
    function f(d0) {
        d0 = +d0
        return (d0 > +Float64ArrayView[0xcea7 >> 3]) | 0
    }
    return f
})(this, {}, ArrayBuffer(4096))
this + x


Tested on rev 53a2011a01b2 with:

--enable-optimize --disable-debug --enable-profiling --enable-gczeal --enable-debug-symbols --enable-methodjit --enable-type-inference --disable-tests --enable-more-deterministic --enable-valgrind --with-ccache --enable-threadsafe <more NSPR options>

I *still* get this issue with --vex-iropt-register-updates=allregs-at-mem-access, though the frequency seems somewhat less. This should still be addressed because this is adding noise to the results.
Flags: needinfo?(jseward)
(Reporter)

Comment 21

4 years ago
fwiw, I'm on Valgrind SVN tip r13572.
(Reporter)

Comment 22

3 years ago
I ran:

valgrind --smc-check=all-non-file --vex-iropt-register-updates=allregs-at-mem-access ./js-opt-64-dm-vg-ts-linux-c93cfe704487 913876-c20.js

and got an invalid read of size 8. Currently, to make Valgrind run properly, we have to disable asm.js through --no-asmjs. Tested using Valgrind SVN tip r13745.
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #22)
> I ran:
> 
> valgrind --smc-check=all-non-file
> --vex-iropt-register-updates=allregs-at-mem-access

Might be worth trying --vex-iopt-register-updates=allregs-at-each-insn.  This
disables caching of simulated registers in real registers even more strongly.
Flags: needinfo?(jseward)
(Reporter)

Comment 24

3 years ago
> Might be worth trying --vex-iopt-register-updates=allregs-at-each-insn.  This
> disables caching of simulated registers in real registers even more strongly.

Will test this tomorrow.
Flags: needinfo?(gary)
(Reporter)

Comment 25

3 years ago
Doesn't work, the "invalid read of size 8" still reproduces with:

valgrind --smc-check=all-non-file --vex-iropt-register-updates=allregs-at-each-insn ./js-opt-64-dm-vg-linux-1426ffa9caf2 913876-c20.js
Flags: needinfo?(gary) → needinfo?(jseward)
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #25)
> Doesn't work, the "invalid read of size 8" still reproduces with:

Can you tar up the entire source tree and binary and get it to me
somehow?  I am now intrigued as to why this is acting up.
Flags: needinfo?(jseward)
(Reporter)

Comment 27

3 years ago
I merely compiled an opt shell using the parameters in comment 20 using unmodified m-c rev tip (1426ffa9caf2 as of a few hours' ago) on Ubuntu Linux 13.10.

Are you sure you'd still need the tar'ed source?
(Reporter)

Comment 28

3 years ago
(moreover a zipped binary takes up 25MB of space)
I looked at this using the binary Gary sent in private email.  It seems
that what I said in comment 11 is true, except now I have more details.

js does a mmap-no-access of a 4MB area:

  SYSCALL[9068,1](  9) sys_mmap ( 0x0, 4294971392, 0, 34, -1, 0 )
   --> [pre-success] Success(0x0:0x39dde000) 
  Warning: set address range perms: large range [0x39dde000, 0x139ddf000) (noaccess)

  then it mprotect(3 == PROT_READ|PROT_WRITE)'s the first two pages:

  SYSCALL[9068,1]( 10) sys_mprotect ( 0x39dde000, 8192, 3 )[sync] --> Success(0x0:0x0) 

  then accesses 39debea0, which is beyond the mprotected area

  so there's a segfault.

In summary:

  mmaped --- [39dde000, 139ddf000)
  mprot  rw- [39dde000,  39de0000)
  access      39debea0  --> SEGFAULT

V reports an invalid access to 39debea0 in JIT generated code, at
code address 4813013.  It enters the segfault handler, which changes the
RIP in the sigcontext and thereby returns to 481301c, 9 bytes past the
faulting instruction.  Presumably it steps over the faulting insn?

The faulting fragment is this:

        0x4813013:  movsd 52896(%r15),%xmm1
        0x481301C:  ucomisd %xmm1,%xmm0
        0x4813020:  seta %al
        0x4813023:  movzbl %al,%eax
        0x4813026:  ret

Peering at js/src/jit/AsmJSSignalHandlers.cpp, function 
bool HandleSignal(int signum, siginfo_t *info, void *ctx)
it looks like the line 
*ppc += heapAccess->opLength();
is what moves the program counter over the movsd instruction.

Who knows about the details of this segfault-and-recovery mechanism?  It would
be useful if a JS person could say whether the above analysis is correct.

So it seems to me that V is correct to complain about the access.  The question
is how to stop this adding noise to the fuzzing results, as per the last sentence
in comment 20.
(Reporter)

Comment 30

3 years ago
> Peering at js/src/jit/AsmJSSignalHandlers.cpp, function 
> bool HandleSignal(int signum, siginfo_t *info, void *ctx)
> it looks like the line 
> *ppc += heapAccess->opLength();
> is what moves the program counter over the movsd instruction.
> 
> Who knows about the details of this segfault-and-recovery mechanism?  It
> would
> be useful if a JS person could say whether the above analysis is correct.

I'm guessing Luke might know.
Flags: needinfo?(luke)

Comment 31

3 years ago
(In reply to Julian Seward from comment #29)
Yes, that is the intended behavior.  (That's what I was trying to explain in comment 15.)  My recommendation, also in comment 15, was to only report an error when a SIGSEGV is raised only if it has the default disposition (and thus is about to core dump).  The asm.js handler isn't the only one to do tricks like this, so this sounds like a valid general improvement.
Flags: needinfo?(luke)
(Reporter)

Updated

3 years ago
Assignee: general → nobody
QA Contact: general

Comment 32

3 years ago
I don't think there is anything to fix here...
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago3 years ago
Resolution: --- → INVALID

Comment 33

3 years ago
Filed bug 970643 to look for a solution for fuzzing.
You need to log in before you can comment on or make changes to this bug.