Last Comment Bug 733979 - Opt-only Crash [@ js::gc::MarkInternal]
: Opt-only Crash [@ js::gc::MarkInternal]
Status: VERIFIED FIXED
[sg:critical]
: crash, testcase
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: x86 Linux
: -- critical (vote)
: mozilla12
Assigned To: Brian Hackett (:bhackett)
:
Mentors:
Depends on:
Blocks: langfuzz
  Show dependency treegraph
 
Reported: 2012-03-07 17:38 PST by Christian Holler (:decoder)
Modified: 2013-03-11 07:29 PDT (History)
9 users (show)
choller: in‑testsuite-
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed
fixed
12+
verified


Attachments
patch (7.36 KB, patch)
2012-03-16 09:57 PDT, Brian Hackett (:bhackett)
wmccloskey: review+
akeybl: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta+
lukasblakk+bugs: approval‑mozilla‑esr10+
Details | Diff | Review

Description Christian Holler (:decoder) 2012-03-07 17:38:56 PST
The following test crashes on mozilla-central revision 7d0d1108a14e (options -m -n -a, 32 bit optimized only):


function inSection(x) {}
function printStatus (msg) {}
function reportCompare (expected, actual, description) {
  var expected_t = typeof expected;
  var actual_t = typeof actual;
    printStatus ("Expected type '" + expected_t + "' matched actual " + "type '" + actual_t + "'");
}
new Function("\
var actual = '';\
var expect = '';\
var date1;\
var date2;\
var validDateStrings = [ '11/69/2004', ];\
var invalidDateStrings = [ '70/70/70', '70/70/1970', '70/70/2004'];\
for (i = 0; i < validDateStrings.length; i++) {\
  date1 = new Date(validDateStrings[i]);\
  date2 = new Date(date1.toDateString());\
  actual = date2 - date1;\
}\
var offset = validDateStrings.length;\
for (i = 0; i < invalidDateStrings.length; i++) {\
  date1 = new Date(invalidDateStrings[i]);\
  i  = isNaN(date1);\
  reportCompare(expect, actual, inSection(i + offset) + ' ' + invalidDateStrings[i] + (3));\
}\
")();


Crash trace:

==53139== Invalid read of size 4
==53139==    at 0x80ABD1A: void js::gc::MarkInternal<JSString>(JSTracer*, JSString*) (jsgcmark.cpp:105)
==53139==    by 0x80AAEDF: js::gc::MarkValueRootRange(JSTracer*, unsigned int, JS::Value*, char const*) (jsgcmark.cpp:321)
==53139==    by 0x8186008: js::StackSpace::markFrameSlots(JSTracer*, js::StackFrame*, JS::Value*, unsigned char*) (jsgcmark.h:114)
==53139==    by 0x8186162: js::StackSpace::mark(JSTracer*) (Stack.cpp:521)
==53139==    by 0x80A20A4: _ZN2jsL11MarkRuntimeEP8JSTracerb.clone.224 (jsgc.cpp:2397)
==53139==    by 0x80A26DE: BeginMarkPhase(JSRuntime*) (jsgc.cpp:2990)
==53139==    by 0x80A3D18: GCCycle(JSContext*, JSCompartment*, long long, js::JSGCInvocationKind) (jsgc.cpp:3289)
==53139==    by 0x80A4F8A: Collect(JSContext*, JSCompartment*, long long, js::JSGCInvocationKind, js::gcreason::Reason) (jsgc.cpp:3707)
==53139==    by 0x807F58B: js_HandleExecutionInterrupt(JSContext*) (jscntxt.cpp:856)
==53139==    by 0x8301FF3: ??? (in /srv/repos/mozilla-central/js/src/opt32/shell/js)
==53139==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
Comment 1 Bill McCloskey (:billm) 2012-03-11 18:28:46 PDT
Do you have time to look at this, Brian? It's some sort of TI bug. So far what I've learned is that the type of actual is being inferred to be string, even though the |actual = date2-date1;| assignment makes it an integer. So when we sync actual, we write out its integer value but with a string type tag. Therefore, when the GC runs, we crash with a null pointer when we try to trace through the string.

I've tried to figure out why the type is being set this way. I can see that mjit::Compiler::updateJoinVarTypes is making it be a string when it's run on the loop entry point for the second loop. But I don't understand the analysis that's being used to generate the types.
Comment 2 Daniel Veditz [:dveditz] 2012-03-14 16:44:06 PDT
Is it always going to be a null value? If so we can unhide this as a DoS bug.
Comment 3 David Mandelin [:dmandelin] 2012-03-15 17:04:16 PDT
(In reply to Bill McCloskey (:billm) from comment #1)
> Do you have time to look at this, Brian? It's some sort of TI bug. So far
> what I've learned is that the type of actual is being inferred to be string,
> even though the |actual = date2-date1;| assignment makes it an integer. So
> when we sync actual, we write out its integer value but with a string type
> tag. Therefore, when the GC runs, we crash with a null pointer when we try
> to trace through the string.

Based on the part about the type tags, it sounds like we could end up with any pointer value there.
Comment 4 Brian Hackett (:bhackett) 2012-03-16 09:57:47 PDT
Created attachment 606615 [details] [diff] [review]
patch

Very old TI bug involving GC, thanks for looking into this Bill.  When analyzing arithmetic operations, TI treats objects like they are integers, in part so that interpreted ops producing integers never need to be reported to TI.  (This is pretty lame and could be fixed so that TI doesn't assume anything about objects in arithmetic ops, but that should be done in a different bug).  If an operation involving objects produces an integer, then later a GC occurs which collects the types of those objects, then all information about the type of the operation is gone and after reanalysis the inferred types will not reflect the integer result of that operation.  If a frame for the script is still on the stack, then the inferred types for that frame are incomplete.

This patch does the minimal fix of ensuring that arithmetic operations whose result type was derived from (collectable) objects have those types frozen in the script's type information (as if they resulted from an integer overflow, say).
Comment 5 Brian Hackett (:bhackett) 2012-03-16 10:00:26 PDT
Comment on attachment 606615 [details] [diff] [review]
patch

[Approval Request Comment]
User impact if declined: Potential crash or security exploit.
Risk to taking this patch (and alternatives if risky): Very low, fixes an obscure and difficult to trigger logical gap.
Comment 6 Bill McCloskey (:billm) 2012-03-16 13:56:12 PDT
Comment on attachment 606615 [details] [diff] [review]
patch

Thanks. It looks like the semantics of what happens when you do [object] + [double|int32] might be changing here. Previously it would get type [int32|double] and now it just gets double. I guess that probably doesn't change what code gets generated, so it doesn't matter?

I also don't totally understand how this works for other ops (non-arithmetic). It seems like a GC can just remove singleton objects from typesets everywhere. What ensures that we end up with the right type later on? You mentioned in your comment that arithmetic operations are special, but I don't understand what other stuff does. However, I guess that's not really relevant to this bug.
Comment 7 Alex Keybl [:akeybl] 2012-03-20 15:12:28 PDT
Comment on attachment 606615 [details] [diff] [review]
patch

[Triage Comment]
Low risk sg:crit fix - approved for Aurora 13 and Beta 12.
Comment 8 Brian Hackett (:bhackett) 2012-03-21 05:17:37 PDT
(In reply to Bill McCloskey (:billm) from comment #6)
> Thanks. It looks like the semantics of what happens when you do [object] +
> [double|int32] might be changing here. Previously it would get type
> [int32|double] and now it just gets double. I guess that probably doesn't
> change what code gets generated, so it doesn't matter?

Any type set containing double will also have int32 added to it, so there's no change in semantics (this edit was just removing cruft).

> I also don't totally understand how this works for other ops
> (non-arithmetic). It seems like a GC can just remove singleton objects from
> typesets everywhere. What ensures that we end up with the right type later
> on? You mentioned in your comment that arithmetic operations are special,
> but I don't understand what other stuff does. However, I guess that's not
> really relevant to this bug.

The GC will remove objects from persistent type sets, but only when they have been collected and thus can no longer appear anywhere.  The hazard here is when a type constraint does something based on an object type other than simply pass it through.  Property reads/writes and calls also do this, but only update persistent type sets like those associated with property reads, with function arguments and type object properties.  The type sets updated by arithmetic constraints are transient and are destroyed on GC.

It would be nice if inference could pull in types from frames on the stack when doing its initial analysis, which would make this code more straightforwardly correct and would have a few other nice properties, e.g. persistent type sets could be destroyed with a script's frames still on the stack.
Comment 9 Brian Hackett (:bhackett) 2012-03-21 05:32:48 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/b25a3f1a4bd7
Comment 10 Daniel Holbert [:dholbert] 2012-03-21 12:06:48 PDT
https://hg.mozilla.org/mozilla-central/rev/b25a3f1a4bd7
Comment 12 Lukas Blakk [:lsblakk] use ?needinfo 2012-04-06 15:01:00 PDT
[Triage Comment]
Please go ahead and nominate for ESR10 approval as per https://wiki.mozilla.org/Release_Management/ESR_Landing_Process
Comment 13 Lukas Blakk [:lsblakk] use ?needinfo 2012-04-16 13:28:52 PDT
Comment on attachment 606615 [details] [diff] [review]
patch

Approved, please land today following https://wiki.mozilla.org/Release_Management/ESR_Landing_Process
Comment 14 Daniel Veditz [:dveditz] 2012-04-19 10:10:01 PDT
ESR-10 landing
http://hg.mozilla.org/releases/mozilla-esr10/rev/26d65ee6bcde
Comment 15 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-06-21 12:41:57 PDT
Verified fixed in Firefox 10.0.6esrpre 2012-06-21 32-bit JS-shell.
Comment 16 Christian Holler (:decoder) 2013-03-11 07:29:59 PDT
Test too unreliable, in-testsuite-.

Note You need to log in before you can comment on or make changes to this bug.