Closed Bug 621202 Opened 11 years ago Closed 11 years ago

Assert: abs(dst - src) >= ptrdiff_t(nelem) or crash in JSString::flatten

Categories

(Core :: JavaScript Engine, defect)

x86
Linux
defect
Not set
critical

Tracking

()

VERIFIED FIXED
Tracking Status
blocking2.0 --- betaN+
status1.9.2 --- unaffected
status1.9.1 --- unaffected

People

(Reporter: decoder, Assigned: luke)

Details

(Keywords: regression, Whiteboard: fixed-in-tracemonkey, [sg:critical?])

Attachments

(4 files)

The attached testcase (tested on TM trunk, soon to land in mc) demonstrates a serious memory corruption. In debug shell, the testcase either triggers the assertion

Assertion failure: abs(dst - src) >= ptrdiff_t(nelem), at jsutil.h:362

or crashes. An optimized build crashes straight away, the backtrace shows that this is due to a memcpy:


Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6f4001b in memcpy () from /lib/libc.so.6
(gdb) bt
#0  0x00007ffff6f4001b in memcpy () from /lib/libc.so.6
#1  0x00000000004fa4ae in JSString::flatten(JSContext*) ()
#2  0x0000000000419c28 in js::RegExp::executeInternal(JSContext*, js::RegExpStatics*, JSString*, unsigned long*, bool, js::Value*) ()
#3  0x00000000004fbcc3 in DoMatch(JSContext*, js::RegExpStatics*, js::Value*, JSString*, RegExpPair const&, bool (*)(JSContext*, js::RegExpStatics*, unsigned long, void*), void*, MatchControlFlags) ()
#4  0x0000000000505e4a in js::str_replace(JSContext*, unsigned int, js::Value*) ()
#5  0x000000000060ec83 in js::Interpret(JSContext*, JSStackFrame*, unsigned int, JSInterpMode) ()
#6  0x0000000000478d54 in js::Execute(JSContext*, JSObject*, JSScript*, JSStackFrame*, unsigned int, js::Value*) ()
#7  0x0000000000412106 in JS_ExecuteScript ()
#8  0x000000000040788b in Process(JSContext*, JSObject*, char*, int) ()
#9  0x0000000000408643 in Shell(JSContext*, int, char**, char**) ()
#10 0x0000000000408948 in main ()


The crash originates from this change in trunk:

The first bad revision is:
changeset:   58610:1d1fe1d1e626
user:        Luke Wagner <lw@mozilla.com>
date:        Mon Dec 06 10:26:58 2010 -0800
summary:     Bug 609440, part 4 - make JSString::chars() fallible (r=waldo,dvander,igor,dwitte,njn)

Flagged as security bug because this bug is highly likely to be exploitable.
Attached file Test case for shell
With this patch, the original testcase assertion is unchanged.  But comment out the |var a = ...| and |a[length - 2] = ...| lines and you get this:

InternalError: too much recursion
Assertion failure: isRope(), at ../jsstr.h:354

Program received signal SIGABRT, Aborted.
0x0000003c20a0f29b in raise () from /lib64/libpthread.so.0
Missing separate debuginfos, use: debuginfo-install glibc-2.12.90-21.x86_64
(gdb) up 2
#2  0x000000000042f188 in JSString::ropeLeft (this=0x7ffff1a28cc0) at ../jsstr.h:354
354	        JS_ASSERT(isRope());
(gdb) up
#3  0x000000000055e74c in JSString::flatten (this=0x7ffff1a28ce0, maybecx=0xa94520) at ../jsstr.cpp:182
182	        JSString *left = str->ropeLeft();           /* Read before clobbered. */
(gdb) lis
177	    wholeChars = AllocChars(maybecx, wholeCapacity);
178	    if (!wholeCapacity)
179	        return NULL;
180	    pos = wholeChars;
181	    first_visit_node: {
182	        JSString *left = str->ropeLeft();           /* Read before clobbered. */
183	        str->u.chars = pos;
184	        if (left->isRope()) {
185	            left->s.parent = str;               /* Return to this when 'left' done, */
186	            left->lengthAndFlags = 0x200;       /* but goto visit_right_child. */
(gdb) p str->isRope()
$1 = false
(gdb) p str->u.chars
$2 = (const jschar *) 0x7ffff1a28ca0
(gdb) p str->lengthAndFlags
$3 = 512
(gdb) p/x str->lengthAndFlags
$4 = 0x200

So the traversal algorithm is buggy.  How...well, this is the first time I've really read the rope code, so I'm not sure yet.  But I wholeheartedly support bug 614459 right now!
Minimized while still hitting the isRope() assert:

var str = "", old;
for (var i = 0; i < 2; i++)
{
  str = old + "abcdefghijklmno";
  old = str;
}
str.replace(/\n/g, "");
Er, this is slightly better:

var str = "abcdefghi", old = str;
for (var i = 0; i < 2; i++)
{
  str = old + "abcdefghijklmno";
  old = str;
}
str.replace(/\n/g, "");
first_visit_node: {
    JSString *left = str->ropeLeft();       /* Read before clobbered. */
    str->u.chars = pos;
    if (left->isRope()) {
        left->s.parent = str;           /* Return to this when 'left' done, */
        left->lengthAndFlags = 0x200;   /* but goto visit_right_child. */
        str = left;
        goto first_visit_node;

The = 0x200 blows away the ROPE bit, and the string length.  Hilarity ensues.
Almost have a patch, racing against company showing up for dinner...
Assignee: general → jwalden+bmo
Status: NEW → ASSIGNED
blocking2.0: --- → ?
Regarding the comment on IRC that tests should have found this earlier: The assertion showed up quite a few times in my fuzzer but it was never reproducible (fragile?). This reproducible crash happened only once in a lot of runs which suggests to me that it's not as easy to hit.
Waldo: there does seem to be a bug somewhere here, but the clobbering of lengthAndFlags during the traversal is intentional; that is why the raw fields are accessed directly instead of through the accessors members.
First of all, a reduced test case:

function test(arg, countdown) {
  if (countdown == 0)
    return;
  arg.indexOf('');
  test(arg + 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', countdown - 1);
  (arg + 'x').indexOf('');
}
test('', 2000);

The bug is with the 'extensible' optimization: when we decide to 'extend' flat string when flattening a rope, that flat rope gets turned into a dependent string.  That means that any dependent string whose base was that extensible flat string now depends on a dependent string.  This breaks the invariant that dependent strings only depend on flat strings which, in this case, fools the GC and causes some strings to be prematurely collected.
This fixes my minimization, and it's a clear and simple change which eliminates the need for unchecked field accesses.  It doesn't fix the original testcase; I now see why the code was doing what it was doing, although as this patch demonstrates it doesn't seem necessary here.  (It also fixes a botched null-check of wholeChars in passing.)

The PodCopy triggering the assert is into |pos| from |left->nonRopeChars()|, where the two are identical, and where |pos| is identical to |wholeChars|.  Of course that's utter nonsense -- |wholeChars| is the fresh result of a malloc!  So we have a string with a dangling chars pointer -- and indeed, if I insert volatile overwrites and corrections to left->nonRopeChars()[0], valgrind fingers GC as having freed it:

==13358==  Address 0xbda3a80 is 0 bytes inside a block of size 131,074 free'd
==13358==    at 0x4A05187: free (vg_replace_malloc.c:325)
==13358==    by 0x4114C9: js_free (jsutil.h:221)
==13358==    by 0x4300ED: JSRuntime::free(void*) (jscntxt.h:1513)
==13358==    by 0x430658: JSContext::free(void*) (jscntxt.h:2144)
==13358==    by 0x4AB1CB: JSString::finalize(JSContext*) (jsstrinlines.h:138)
==13358==    by 0x4A90E0: void FinalizeArenaList<JSString>(JSCompartment*, JSContext*, unsigned int) (jsgc.cpp:1849)
==13358==    by 0x4A4EAA: MarkAndSweep(JSContext*, JSGCInvocationKind) (jsgc.cpp:2232)
==13358==    by 0x4A50BE: GCUntilDone(JSContext*, JSGCInvocationKind) (jsgc.cpp:2487)
==13358==    by 0x4A51FD: js_GC(JSContext*, JSGCInvocationKind) (jsgc.cpp:2552)
==13358==    by 0x45C23C: js_InvokeOperationCallback(JSContext*) (jscntxt.cpp:1837)
==13358==    by 0x45C2E2: js_HandleExecutionInterrupt(JSContext*) (jscntxt.cpp:1883)
==13358==    by 0x6AF4C2: js::Interpret(JSContext*, JSStackFrame*, unsigned int, JSInterpMode) (jsinterp.cpp:2824)

So the plot thickens.  More in the morning, hopefully, after sleep and without having to schedule around guests.  (I think.)
Hmm, there isn't a clearly-best fix.  Option 1 is to remove the EXTENSIBLE optimization (described here http://hg.mozilla.org/tracemonkey/file/6255a0255dc2/js/src/jsstr.cpp#l143).  Option 2 is to break the "dependent string bases are always flat strings" invariant and fix string gc marking (and find any other code that depended on this invariant).
To be clear: commenting out the "if extensible" case in JSString::flatten() fixes the test case.

Waldo: Since JSString::flatten is a member of JSString and the complexity and logic are local, I preferred to be very explicit about what fields were being accessed since, as I assume by now you see, there is subtle aliasing afoot.  Hiding field access behind member functions, IMHO, only makes this less clear since the reader now has to go off and read the accessor definition since, ultimately, you need to know what fields are being touched.  Moreover, the added asserts of the accessors provides little value since, if you read JSString::flatten() the asserted condition is dynamically guarded immediately before the field access.
(In reply to comment #10)
> (It also fixes a botched null-check of wholeChars in passing.)

Gag; vim i_CTRL-P for the lose!  Can you land that independently (rs=me)?
Sure, can land.

Another idea: flip another type bit when a flat string is the base of a dependent string.  1<<27 length ought to be enough for anyone, right?
I'm assuming you're saying: if this bit were set, that we wouldn't consider it for the extensible optimization.  The problem I see is that, in most of the cases where the extensible-optimization could apply, this bit would be set (all the internal nodes of a flattened rope dag become dependent strings on the root).

Also, I considered stealing a bit earlier and so I checked: all the other engines support at least 1<<28.
Severity: normal → critical
Attached patch fixSplinter Review
Thinking about this again, it has always been the case that the extensible-optimization can create chains of dependent strings (even before ropes); I just incorrectly extrapolated from the JS_ASSERT(base->isFlat()) in initDependent() and the logic in js_NewDependentString that there is some "base is always flat" invariant (viz., in NonRopeTypedMarker).  So, simple fix.

I made the test case iterative so that it doesn't depend on the VM max allowed call depth.
Assignee: jwalden+bmo → lw
Attachment #499672 - Flags: review?
Attachment #499672 - Flags: review? → review?(nnethercote)
(In reply to comment #16)
> 
> Thinking about this again, it has always been the case that the
> extensible-optimization can create chains of dependent strings (even before
> ropes);

Yeah, I remember seeing somewhere that the code tries to avoid them, but doesn't guarantee it.

I won't be able to get to this review until Dec 29, but I'll definitely do it then!  Good catch, decoder.
blocking2.0: ? → betaN+
(In reply to comment #12)
> Waldo: Since JSString::flatten is a member of JSString and the complexity and
> logic are local, I preferred to be very explicit about what fields were being
> accessed since, as I assume by now you see, there is subtle aliasing afoot. 
> Hiding field access behind member functions, IMHO, only makes this less clear
> since the reader now has to go off and read the accessor definition since,
> ultimately, you need to know what fields are being touched.  Moreover, the
> added asserts of the accessors provides little value since, if you read
> JSString::flatten() the asserted condition is dynamically guarded immediately
> before the field access.

I get the point, it makes some sense, although I'm not entirely sure I agree.
(In reply to comment #13)
> Gag; vim i_CTRL-P for the lose!  Can you land that independently (rs=me)?

http://hg.mozilla.org/tracemonkey/rev/1c8d8360b90d
Attachment #499672 - Flags: review?(nnethercote) → review+
Re: comment 18: Abstraction from concrete memory addressing/aliasing obscures the necessary order of reads and writes, seems to be what comment 12 is arguing. The assert after guard fallout of using inline accessors is not itself a bad pattern, though.

Luke: the original dependent string code would make chains, yeah. Is this back to the future, or do we flatten more than we used to?

/be
(In reply to comment #20)
> Luke: the original dependent string code would make chains, yeah. Is this back
> to the future, or do we flatten more than we used to?

We flatten less (since we no longer need to enforce the tree-ness of ropes), but not w.r.t the chains of dependent strings created via the extensibility optimization, which I think is what you are asking about, so, the former.
http://hg.mozilla.org/tracemonkey/rev/9fa77ffd1145
Whiteboard: fixed-in-tracemonkey
blocking2.0: betaN+ → beta9+
Whiteboard: fixed-in-tracemonkey → fixed-in-tracemonkey, [sg:critical?]
Whiteboard: fixed-in-tracemonkey, [sg:critical?] → fixed-in-tracemonkey, [sg:critical]
As per today's meeting, beta 9 will be a time-based release. Marking these all betaN+. Please move it back to beta9+ if  you believe it MUST be in the next beta (ie: trunk is in an unshippable state without this)
blocking2.0: beta9+ → betaN+
Fixing fields my automated script accidentally blanked. Apologies for the bugspam
http://hg.mozilla.org/mozilla-central/rev/9fa77ffd1145
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Keywords: regression
Whiteboard: fixed-in-tracemonkey, [sg:critical] → fixed-in-tracemonkey, [sg:critical?]
Any reason for removing sg:critical again without comment why this should be the case?
sg:critical wasn't removed...
(In reply to comment #28)
> sg:critical wasn't removed...

Nevermind my comment, I wasn't aware of the semantics of the ? here :D
This did not affect 1.9.2 and has been fixed long ago. Any reason to keep this locked?
Group: core-security
Status: RESOLVED → VERIFIED
rforbes-bugspam-for-setting-that-bounty-flag-20130719
Flags: sec-bounty+
You need to log in before you can comment on or make changes to this bug.