738117 - Valgrind and Nulgrind show SIGSEGV with testcase on Snow Leopard even though testcase does not crash

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Description

•

13 years ago

function h(sandboxType) {
  switch (sandboxType) {
  default:
    k = newGlobal("new-compartment");
  }
  return function(f, code) {
    try {
      evalcx(code, k);
    } catch (e) {}
  };
}
function p(code) {
  f = new Function(code);
  g(f, code);
}
function g(f, code, wtt) {
  var rv = f();
}
p("g=h()");
p("with(x=<y>></y>){(function(){return function(){1}})}");
p("for(let z=0;z<1;){c=z;t};");
p("c%-x");

(Without Valgrind, the testcase does not abort in any way.)

I ran with valgrind --dsymutil=yes ./js

(this happens on Snow Leopard)

Using 64-bit js opt shell on Mac 10.6 with Valgrind r12455 on m-i changeset ca353538c7f9, the following error is shown:

==17021== Invalid read of size 8
==17021==    at 0x10003FE8F: JSCompartment::wrap(JSContext*, JS::Value*) (jscompartment.cpp:214)
==17021==    by 0x10000488E: EvalInContext(JSContext*, unsigned int, JS::Value*) (js.cpp:2687)
==17021==    by 0x10009BC7C: js::InvokeKernel(JSContext*, js::CallArgs, js::MaybeConstruct) (jscntxtinlines.h:314)
==17021==    by 0x10008C7EF: js::Interpret(JSContext*, js::StackFrame*, js::InterpMode) (jsinterp.cpp:2685)
==17021==    by 0x10009AA54: js::RunScript(JSContext*, JSScript*, js::StackFrame*) (jsinterp.cpp:469)
==17021==    by 0x10009AE6D: js::ExecuteKernel(JSContext*, JSScript*, JSObject&, JS::Value const&, js::ExecuteType, js::StackFrame*, JS::Value*) (jsinterp.cpp:667)
==17021==    by 0x10009B9C7: js::Execute(JSContext*, JSScript*, JSObject&, JS::Value*) (jsinterp.cpp:709)
==17021==    by 0x10001E607: JS_ExecuteScript (jsapi.cpp:5229)
==17021==    by 0x100005770: Process(JSContext*, JSObject*, char const*, bool) (js.cpp:478)
==17021==    by 0x100005DEB: Shell(JSContext*, js::cli::OptionParser*, char**) (js.cpp:4730)
==17021==    by 0x100007045: main (js.cpp:5022)
==17021==  Address 0x7ffffffff000 is not stack'd, malloc'd or (recently) free'd
==17021== 
==17021== 
==17021== Process terminating with default action of signal 11 (SIGSEGV)
==17021==  Access not within mapped region at address 0x7FFFFFFFF000
==17021==    at 0x10003FE8F: JSCompartment::wrap(JSContext*, JS::Value*) (jscompartment.cpp:214)
==17021==    by 0x10000488E: EvalInContext(JSContext*, unsigned int, JS::Value*) (js.cpp:2687)
==17021==    by 0x10009BC7C: js::InvokeKernel(JSContext*, js::CallArgs, js::MaybeConstruct) (jscntxtinlines.h:314)
==17021==    by 0x10008C7EF: js::Interpret(JSContext*, js::StackFrame*, js::InterpMode) (jsinterp.cpp:2685)
==17021==    by 0x10009AA54: js::RunScript(JSContext*, JSScript*, js::StackFrame*) (jsinterp.cpp:469)
==17021==    by 0x10009AE6D: js::ExecuteKernel(JSContext*, JSScript*, JSObject&, JS::Value const&, js::ExecuteType, js::StackFrame*, JS::Value*) (jsinterp.cpp:667)
==17021==    by 0x10009B9C7: js::Execute(JSContext*, JSScript*, JSObject&, JS::Value*) (jsinterp.cpp:709)
==17021==    by 0x10001E607: JS_ExecuteScript (jsapi.cpp:5229)
==17021==    by 0x100005770: Process(JSContext*, JSObject*, char const*, bool) (js.cpp:478)
==17021==    by 0x100005DEB: Shell(JSContext*, js::cli::OptionParser*, char**) (js.cpp:4730)
==17021==    by 0x100007045: main (js.cpp:5022)

Nulgrind shows the latter error.

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Updated

•

13 years ago

Summary: Valgrind and Nulgrind show SIGSEGV even though testcase does not crash → Valgrind and Nulgrind show SIGSEGV with testcase on Snow Leopard even though testcase does not crash

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Updated

•

13 years ago

Hardware: x86 → x86_64

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 1

•

13 years ago

Last I checked, this happens on m-c changeset 4bdae514b9be as well.

Julian Seward [:jseward]

Comment 2

•

13 years ago

I can reproduce this, but I can't figure out why it happens.  It
strikes me as likely that it is a bug in Valgrind, but I am not even
sure of that -- V can run Firefox on MacOSX no problem, for example.

It might also be a bug in JS, in that it makes some assumption about
memory layout or some such, which happens to be true natively but is
not true when running on V.  Unlikely, but such things have happened
in the past.

The test case fails with symptoms of JS heap corruption, despite
running fine natively.  A debug build also fails, somewhat earlier in
an assertion, also with symptoms of heap corruption.

I also tried to reproduce on x86_64-linux, but failed.

Currently I have no leads.  One line of approach would be to take the
failing case and cut it down to something smaller than the entire JS
engine.  That does not sound easy, though.

Is there any part of the JS engine that makes assumptions about memory
layout?  In particular (eg) whether objects can/can't exist in
particular 64-bit address ranges?  Isn't there some such bit twiddling
magic to do with the "new" 64-bit JSVal representation?

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 3

•

13 years ago

Math.atan2((-01.t),0)


(Without Valgrind, the testcase does not abort in any way.)

I ran with valgrind --dsymutil=yes ./js

(this happens on Snow Leopard)

Using 64-bit js opt shell on Mac 10.6 with Valgrind on m-c changeset 95df15895e02, the following error is shown:

==73463== Invalid read of size 8
==73463==    at 0x1000C1B90: js_GetMethod(JSContext*, JSObject*, long, unsigned int, JS::Value*) (jsscope.h:603)
==73463==    by 0x100107A71: js_ValueToSource(JSContext*, JS::Value const&) (jsstr.cpp:3315)
==73463==    by 0x100018514: JS_ValueToSource (jsapi.cpp:534)
==73463==    by 0x100005C36: Process(JSContext*, JSObject*, char const*, bool) (js.cpp:583)
==73463==    by 0x1000064CB: Shell(JSContext*, js::cli::OptionParser*, char**) (js.cpp:4731)
==73463==    by 0x100007165: main (js.cpp:5017)
==73463==  Address 0x7fffffffffff is not stack'd, malloc'd or (recently) free'd
==73463== 
==73463== 
==73463== Process terminating with default action of signal 11 (SIGSEGV)
==73463==  Access not within mapped region at address 0x7FFFFFFFFFFF
==73463==    at 0x1000C1B90: js_GetMethod(JSContext*, JSObject*, long, unsigned int, JS::Value*) (jsscope.h:603)
==73463==    by 0x100107A71: js_ValueToSource(JSContext*, JS::Value const&) (jsstr.cpp:3315)
==73463==    by 0x100018514: JS_ValueToSource (jsapi.cpp:534)
==73463==    by 0x100005C36: Process(JSContext*, JSObject*, char const*, bool) (js.cpp:583)
==73463==    by 0x1000064CB: Shell(JSContext*, js::cli::OptionParser*, char**) (js.cpp:4731)
==73463==    by 0x100007165: main (js.cpp:5017)
==73463==  If you believe this happened as a result of a stack
==73463==  overflow in your program's main thread (unlikely but
==73463==  possible), you can try to increase the size of the
==73463==  main thread stack using the --main-stacksize= flag.
==73463==  The main thread stack size used in this run was 8388608.
==73463== 
==73463== HEAP SUMMARY:
==73463==     in use at exit: 1,159,878 bytes in 1,408 blocks
==73463==   total heap usage: 2,112 allocs, 704 frees, 1,273,167 bytes allocated
==73463== 
==73463== LEAK SUMMARY:
==73463==    definitely lost: 16 bytes in 1 blocks
==73463==    indirectly lost: 0 bytes in 0 blocks
==73463==      possibly lost: 832 bytes in 16 blocks
==73463==    still reachable: 1,158,942 bytes in 1,390 blocks
==73463==         suppressed: 88 bytes in 1 blocks
==73463== Rerun with --leak-check=full to see details of leaked memory
==73463== 
==73463== For counts of detected and suppressed errors, rerun with: -v
==73463== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault

Luke Wagner [:luke]

Comment 4

•

13 years ago

Hi Julian.  I looked at this with Gary.  I put a printf on the first line of JS_ValueToString that looks like:

  printf("The bits: %p\n", (void *)JSVAL_TO_IMPL(v).asBits);

and ran:

  print(Math.atan2((-01.t),0))

in a 64-bit opt shell on OSX10.6.  For native, the result is a "canonical" NaN: 0xfff8000000000000.
Run from inside valgrind, the result is a decidedly non-canonical NaN: 0xffffffffffffffff.

So valgrind is technically returning a NaN, but it is breaking the implicit contract that we (as well as jsc and v8) depend on: that math functions only return canonical NaNs.  In theory, we could put a JS_CANONICALIZE_NAN after all math functions, but this is slow.  (I suppose we could put one in for --enable-valgrind builds, but I think ideally valgrind would return the same NaN value as the system).

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Updated

•

13 years ago

Whiteboard: js-triage-needed → valgrind bug

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Updated

•

13 years ago

Whiteboard: valgrind bug → [valgrind bug]

Julian Seward [:jseward]

Comment 5

•

13 years ago

(In reply to Luke Wagner [:luke] from comment #4)
Hi Luke, Gary,

Thanks for the lead.  I had been contemplating "address space
weirdness" (unlikely) or "integer arithmetic weirdness" (very
unlikely), so I was completely on the wrong track.

I can repro this with a 10-line C program that prints atan2(NaN,0),
which makes it a lot easier to track down.  I think it is due to the
kludging that V does for 80-vs-64 legacy (x87) floating point.  On the
case.

Julian Seward [:jseward]

Comment 6

•

13 years ago

Attached patch fix conversions of quiet NaNs between 64- and 80-bit formats — Details — Splinter Review

I think this will fix the reported problem.

Julian Seward [:jseward]

Comment 7

•

13 years ago

Committed, r2276/r12497.

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 8

•

13 years ago

> Committed, r2276/r12497.

Confirmed fixed by the Valgrind patch. Since the fix is in Valgrind rather than in Mozilla code, this bug can be marked WFM.

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → WORKSFORME

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: gkw, Unassigned)

References

Details

(Keywords: testcase, valgrind, Whiteboard: [valgrind bug])

Attachments

(1 file)
