Closed Bug 617935 Opened 9 years ago Closed 9 years ago

64-bit Memory corruption/Integer overflow in RegExp replace

Categories

(Core :: JavaScript Engine, defect, critical)

x86_64
Linux
defect
Not set
critical

Tracking

()

VERIFIED FIXED
Tracking Status
blocking2.0 --- betaN+
blocking1.9.2 --- needed
status1.9.2 --- wanted

People

(Reporter: decoder, Assigned: cdleary)

References

(Blocks 1 open bug)

Details

(Keywords: crash, Whiteboard: [sg:critical] [fixed-in-tracemonkey][hardblocker])

Attachments

(2 files, 1 obsolete file)

The 1.9.2 and 2.0 branches are vulnerable to an integer overflow when performing replace() with a regular expression on a 64-bit system with more than 4 GB of memory.

The attached script (js shell) demonstrates a segmentation fault caused by a memcpy gone wild. (WARNING: Do not run this if you have less than or exactly 4 GB of memory, this will consume ~ 4 GB because a string of length 2^31 is required which in UTF-16 is twice the length...).

I consider the vulnerability to be exploitable if the memory/architecture conditions are met because the snprintf should allow you arbitrary writes before the actual buffer with the proper amount of memory. Nevertheless, due to the amount of RAM an attack is not realistic. However, if there is some way to achieve some kind of optimized storage for repeating strings with the effect that the string has the given length but does not consume the full memory, then exploitation might get much easier. I don't know if there are such possibilities in the current implementation.

Here is a gdb backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6f4001b in memcpy () from /lib/libc.so.6
(gdb) bt
#0  0x00007ffff6f4001b in memcpy () from /lib/libc.so.6
#1  0x00000000005632a5 in DoReplace (cx=0xaa51c0, res=0xaaa130, rdata=..., chars=0x7ffaf6930010) at jsstr.cpp:2241
#2  0x0000000000563421 in ReplaceCallback (cx=0xaa51c0, res=0xaaa130, count=32768, p=0x7fffffffbad0) at jsstr.cpp:2268
#3  0x0000000000561ede in DoMatch (cx=0xaa51c0, res=0xaaa130, vp=0x7ffff6abf0a0, str=0x7ffff690f1a0, rep=..., callback=0x5632af <ReplaceCallback>, data=0x7fffffffbad0, 
    flags=REPLACE_ARGS) at jsstr.cpp:1847
#4  0x0000000000563e0c in str_replace_regexp (cx=0xaa51c0, argc=2, vp=0x7ffff6abf0a0, rdata=...) at jsstr.cpp:2436
#5  0x0000000000564683 in js::str_replace (cx=0xaa51c0, argc=2, vp=0x7ffff6abf0a0) at jsstr.cpp:2589
#6  0x00000000004c5368 in js::CallJSNative (cx=0xaa51c0, native=0x56421b <js::str_replace(JSContext*, unsigned int, js::Value*)>, argc=2, vp=0x7ffff6abf0a0)
    at jscntxtinlines.h:684
#7  0x00000000006c28da in js::Interpret (cx=0xaa51c0, entryFrame=0x7ffff6abf048, inlineCallCount=0, interpMode=JSINTERP_NORMAL) at jsinterp.cpp:4748
#8  0x00000000004c14f7 in js::RunScript (cx=0xaa51c0, script=0xac6530, fp=0x7ffff6abf048) at jsinterp.cpp:657
#9  0x00000000004c2873 in js::Execute (cx=0xaa51c0, chain=0x7ffff6903048, script=0xac6530, prev=0x0, flags=0, result=0x0) at jsinterp.cpp:1005
#10 0x000000000042cdd1 in JS_ExecuteScript (cx=0xaa51c0, obj=0x7ffff6903048, script=0xac6530, rval=0x0) at jsapi.cpp:4828
#11 0x000000000040524f in Process (cx=0xaa51c0, obj=0x7ffff6903048, filename=0x7fffffffdcda "min.js", forceTTY=0) at js.cpp:453
#12 0x0000000000406048 in ProcessArgs (cx=0xaa51c0, obj=0x7ffff6903048, argv=0x7fffffffd9a0, argc=2) at js.cpp:871
#13 0x000000000040f3a6 in Shell (cx=0xaa51c0, argc=2, argv=0x7fffffffd9a0, envp=0x7fffffffd9b8) at js.cpp:5370
#14 0x000000000040f56c in main (argc=2, argv=0x7fffffffd9a0, envp=0x7fffffffd9b8) at js.cpp:5478

If you need me to run further debug commands on this, let me know (I have the necessary hardware available).
Attached file Test case for shell
Test case to run on shell. DO NOT RUN if you're system has less than or exactly 4 GB of RAM (see description).
Keywords: crash
Whiteboard: [sg:critical]
Assignee: general → cdleary
blocking2.0: --- → betaN+
From a cursory look I think we can just ensure the rdata.cb growth puts us in valid MAX_STRING_LENGTH territory. It'll be good to take a look at other uses of the CharBuffers for similar issues -- can probably abstract this potential badness away a bit.
Status: NEW → ASSIGNED
I left all the identifiers as |cb| to avoid unnecessary code thrash.

I talked with luke and he convinced me it better to avoid a one-use-case |CheckLength| policy in favor of encapsulation.

There's still a Vector<jschar, 32> used in the tokenizer to avoid any unnecessary checking overhead -- the atomization of those vectors still checks string length for overallocation.

Christian, do I have your permission to release the test into the public domain?
Attachment #502885 - Flags: review?(lw)
Sure :) You may include my tag in the test.

"Christian Holler <decoder@own-hero.net>"
Add Christian's test.
Attachment #502885 - Attachment is obsolete: true
Attachment #502897 - Flags: review?(lw)
Attachment #502885 - Flags: review?(lw)
Comment on attachment 502897 [details] [diff] [review]
Replace JSCharBuffer with an invariant-checking StringBuffer type.

I see perf change as being in the noise on both sunspider and V8.
Comment on attachment 502897 [details] [diff] [review]
Replace JSCharBuffer with an invariant-checking StringBuffer type.

First of all, nice patch; it seems like a clean way to handle the problem.

From IRL discussion:
 - StringBuffer cb(cx) is kindof an eyesore; 'sb' would be nice
 - Could you un-member-ify append(js::Value) (back into js_ValueToStringBuffer) since its not really part of a StringBuffer, but a spec impl of toString

Other requests:

The two static atomizing StringBuffer members seem out of place; perhaps these could go back to free functions

If all the uses of js_AppendLiteral have been replaced by StringBuffer::append, could you delete the original from jsvector.h?

StringBuffer::build() is a bit ambiguous; perhaps StringBuffer::createString()?

So the general eager-length-checking strategy makes sense because it catches overflows potentially much earlier (avoiding the need to hit OOM which, on 64-bit, might take a while).  However, perhaps you can take the eager check out of replaceBuffer so that it can retain the same interface (and rename it back to replaceRawBuffer)?  My complaint is that the replaceBuffer interface leaves the question of, on failure, who takes responsibility for freeing the given memory.  E.g., the patch leaks on failure in array_toSource.  (That means you also need to add a final length check to StringBuffer::createString() (or whatever it gets named).

r+ with those addressed.
Attachment #502897 - Flags: review?(lw) → review+
(In reply to comment #7)
> StringBuffer::build() is a bit ambiguous; perhaps StringBuffer::createString()?

IRL we decided |StringBuffer::finishString()| reflects the fact the vector is cleared of all existing characters while still being more specific than |build()|.

Great point/catch with the replaceBuffer ownership.
http://hg.mozilla.org/tracemonkey/rev/7e5853562deb
Whiteboard: [sg:critical] → [sg:critical] [fixed-in-tracemonkey]
Whiteboard: [sg:critical] [fixed-in-tracemonkey] → [sg:critical] [fixed-in-tracemonkey][hardblocker]
blocking1.9.2: --- → ?
blocking1.9.2: ? → needed
Chris: is this worth back-porting to 1.9.2? Seems a big-ish patch and we don't really support 64-bit builds of 1.9.2 anyway (though Linux vendors do).
(In reply to comment #12)

It is quite large. Instead of the patch that eliminates the whole category of errors, we could make a different, tiny patch for this one vector on 1.9.2 -- is that something we tend to do for older branches?
Blocks: 676763
We don't ship our own 64-bit 3.6 builds, but some linux distros do (and support them a pretty long time). Do you know already where a tiny targeted patch would need to go or would that take more investigation. Old-Branch-band-aides are certainly something we commonly do.
Marking VERIFIED as this has a testcase in testsuite.
Status: RESOLVED → VERIFIED
Group: core-security
Flags: sec-bounty+
You need to log in before you can comment on or make changes to this bug.