Closed Bug 1000182 Opened 6 years ago Closed 5 years ago

ASan heap-use-after-free crash [@ js::ThreadSafeContext::onOutOfMemory] with OOM

Categories

(Core :: JavaScript Engine, defect, P1, critical)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla33
Tracking Status
firefox31 --- disabled
firefox32 --- disabled
firefox33 + disabled
firefox34 --- disabled
firefox35 --- disabled
firefox36 --- verified
firefox-esr24 --- disabled
firefox-esr31 --- disabled

People

(Reporter: decoder, Assigned: jorendorff)

References

(Blocks 2 open bugs)

Details

(4 keywords, Whiteboard: [jsbugmon:])

Crash Data

Attachments

(3 files)

The following testcase crashes on mozilla-central revision 1ab07aa4d004 (run with --fuzzing-safe):


function range(n, m) {
  var result = [];
    result.push(i);
  return result;
    for (var i = 0, l = a.length; i < l; i++) {}
}
function b(opFunction, cmpFunction) {
    var result = opFunction({mode:"compile"});
      var result = opFunction({mode:"par"});
}
function a(arr, op, func, cmpFunc) {
  b(function (m) { return arr[op + "Par"].apply(arr, [func, m]); }, function(m) {});
}
a(range(0, 1024), "map", function (i) {});
oomAfterAllocations(10);
Crash trace:


==53329==ERROR: AddressSanitizer: heap-use-after-free on address 0x613000037e80 at pc 0xcb6fb3 bp 0x7fffebd2f400 sp 0x7fffebd2f3f8
READ of size 8 at 0x613000037e80 thread T0
    #0 0xcb6fb2 in js::ThreadSafeContext::onOutOfMemory(void*, unsigned long) js/src/jscntxt.h:266
    #1 0xcb6fb2 in js::TempAllocPolicy::onOutOfMemory(void*, unsigned long) js/src/jsalloc.cpp:16
    #2 0xab74e0 in js::TempAllocPolicy::calloc_(unsigned long) js/src/jsalloc.h:68
    #3 0xab74e0 in js::detail::HashTable<js::EncapsulatedPtr<JSScript, unsigned long> const, js::HashSet<js::EncapsulatedPtr<JSScript, unsigned long>, js::DefaultHasher<js::EncapsulatedPtr<JSScript, unsigned long> >, js::TempAllocPolicy>::SetOps, js::TempAllocPolicy>::createTable(js::TempAllocPolicy&, unsigned int) js/src/opt64asan-oom/js/src/../../dist/include/js/HashTable.h:1057
    #4 0xab74e0 in js::detail::HashTable<js::EncapsulatedPtr<JSScript, unsigned long> const, js::HashSet<js::EncapsulatedPtr<JSScript, unsigned long>, js::DefaultHasher<js::EncapsulatedPtr<JSScript, unsigned long> >, js::TempAllocPolicy>::SetOps, js::TempAllocPolicy>::changeTableSize(int) js/src/opt64asan-oom/js/src/../../dist/include/js/HashTable.h:1299
    #5 0x95e25d in js::jit::JitCompartment::sweep(js::FreeOp*) js/src/jit/Ion.cpp:639
    #6 0xd12741 in JSCompartment::sweep(js::FreeOp*, bool) js/src/jscompartment.cpp:578
    [...]

0x613000037e80 is located 0 bytes inside of 384-byte region [0x613000037e80,0x613000038000)
freed by thread T0 here:
    #0 0x460bff in __interceptor_free /srv/repos/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:64
    #1 0x47f29a in DestroyContext(JSContext*, bool) js/src/shell/js.cpp:5586
    #2 0x47f29a in main js/src/shell/js.cpp:6201
    #3 0x7fcf051ec76c in __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226
Blocks: 912928
Keywords: csectype-uaf
Whiteboard: [jsbugmon:update,bisect]
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:bisect]
JSBugMon: Cannot process bug: Unknown exception (check manually)
Whiteboard: [jsbugmon:bisect] → [jsbugmon:]
decoder: Is this UAF crash reproducible? Do you know why JSBugMon can't bisect it?
Flags: needinfo?(choller)
Priority: -- → P1
Group: javascript-core-security
JSBugMon currently doesn't process ASan failures, but this bug should be reproducible on the given revision. Even manual bisection is unlikely to be effective, since this is an OOM issue.
Flags: needinfo?(choller)
Assignee: general → jorendorff
I still see this regularly in the OOM fuzzer.
Jason, is this something you'll be able to get to soonish?
Flags: needinfo?(jorendorff)
Nudge nudge. (Jason) :)
Marking 32 affected speculatively.
Investigating...
Oh, this data structure resides in a Compartment and points to a JSContext. When the JSContext is destroyed, the data structure remains, with a dangling pointer. However, the dangling pointer is used only in OOM code. The test manages to force an OOM during shutdown, after the JSContext is freed.

The fix is pretty straightforward ("don't do that"). I am not sure I understand what the contract is for functions that return TrafficLight though.
Attachment #8439626 - Flags: review?(shu)
Flags: needinfo?(jorendorff)
Comment on attachment 8439626 [details] [diff] [review]
bug-1000182-oom-v1.patch

Review of attachment 8439626 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for fixing. I don't feel too strongly about taking or not taking a cx, so just marking r+.

::: js/src/jit/Ion.cpp
@@ +512,5 @@
>      return true;
>  }
>  
>  bool
> +JitCompartment::notifyOfActiveParallelEntryScript(HandleScript script)

Why the change it to not take a cx? I'd think it'd be a bit cleaner for it to report its own OOM error.

@@ +523,5 @@
>          return true;
>      }
>  
>      if (!activeParallelEntryScripts_) {
> +        ScriptSet *scripts = js_new<ScriptSet>();

This change is because of malloc byte tracking, right?

@@ +525,5 @@
>  
>      if (!activeParallelEntryScripts_) {
> +        ScriptSet *scripts = js_new<ScriptSet>();
> +        if (!scripts || !scripts->init()) {
> +            js_delete(scripts);

Ouch, thanks. :(
Attachment #8439626 - Flags: review?(shu) → review+
(In reply to Shu-yu Guo [:shu] from comment #11)
> Comment on attachment 8439626 [details] [diff] [review]
> bug-1000182-oom-v1.patch
> 
> Review of attachment 8439626 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Thanks for fixing. I don't feel too strongly about taking or not taking a
> cx, so just marking r+.
> 
> ::: js/src/jit/Ion.cpp
> @@ +512,5 @@
> >      return true;
> >  }
> >  
> >  bool
> > +JitCompartment::notifyOfActiveParallelEntryScript(HandleScript script)
> 
> Why the change it to not take a cx? I'd think it'd be a bit cleaner for it
> to report its own OOM error.
> 
> @@ +523,5 @@
> >          return true;
> >      }
> >  
> >      if (!activeParallelEntryScripts_) {
> > +        ScriptSet *scripts = js_new<ScriptSet>();
> 
> This change is because of malloc byte tracking, right?

Oh, no, this is the bug itself. Because it didn't have a SystemAllocPolicy, ScriptSet kept a dangling pointer to cx. I just read your comment. Maybe this is why you changed the function to not take a cx either, to avoid the temptation? In any case, the r+ stands.
As a sort of prequel to the other patch, we need support for unhandlable oom in the jit-test harness.
Attachment #8441828 - Flags: review?(terrence)
Comment on attachment 8441828 [details] [diff] [review]
bug-1000182-prequel-jittest-unhandlable-oom-v1.patch

Review of attachment 8441828 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, Jason! I should have added this with CrashAtUnhandlableOOM.
Attachment #8441828 - Flags: review?(terrence) → review+
This landed without sec-approval because the code containing the bug is only enabled in Nightly builds.
had to backout this changes again since jit tests seems to fail after this checkin like https://tbpl.mozilla.org/php/getParsedLog.php?id=42252088&tree=Mozilla-Inbound
Hmm. No clue how this could be causing that. I re-read the patch; nothing.

I don't remember seeing anything like that on try server either, and this was right underneath Symbols so I think I must have run it through the try server 10 times.

Trying again, Windows debug only, and I'll retrigger it 8 times, see if I can get it to happen again.

https://tbpl.mozilla.org/?tree=Try&rev=18e94bfa603a
Well that was certainly annoying. Trying again.

https://tbpl.mozilla.org/?tree=Try&rev=8d5bad21866c
Interesting. Reproduces beautifully, symptoms always the same.

There is no output in the log for jit-test\tests\v8-v5\check-splay.js with "--no-baseline --no-ion".

Try-server-ing with only the first patch, to see if that's the issue. (If not, I'll land it.)

https://tbpl.mozilla.org/?tree=Try&rev=b2348fdc3c16
The first patch was not the issue. Landed:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8d67333d2868
Keywords: leave-open
https://hg.mozilla.org/mozilla-central/rev/8d67333d2868
Flags: in-testsuite?
Target Milestone: --- → mozilla33
Jason, you changed from "fixed" to "affected" for 33 on 7/7. Did the fix not work?
Flags: needinfo?(jorendorff)
Current status of this bug:  bug-1000182-oom-v1.patch can't land because it apparently causes ancient-Windows-only failures. Completely impossible.

I'm going to try server this again, hoping that in the meantime something unrelated shook loose and it'll work this time. If that doesn't work, we have no good options. I could '#ifdef windows' the fix, but that doesn't really do anyone any good. Maybe I can change the fix to be a very stupid fix that just crashes if we run out of memory at that particular spot---that is, maybe a different fix will tickle Windows in a different way.
Flags: needinfo?(jorendorff)
What is the plan here to address bustage so we can get this sec-critical rated issue fixed? :-)
Well... the good news is, this mysterious Mac bustage appears to be completely unrelated to the mysterious Windows bustage that originally prevented this patch from landing.

I'll try something different tomorrow.
Hi Jason, did you find anything out, based on your last comment?
Flags: needinfo?(jorendorff)
[Meta: I just email-pinged for the NI]
Naveed, can we do anything to move this along?
Flags: needinfo?(nihsanullah)
Jason is on PTO through the end of the week. Eric, Jason suggested you may be able to pick this one up. Can you please take a look?
Flags: needinfo?(nihsanullah) → needinfo?(efaustbmo)
Eric, I already asked jandem to take a look, so please coordinate with him.
Flags: needinfo?(jorendorff)
Just to explain briefly why there hasn't been more urgency on this issue -- it's mainly because this is in Parallel JS, a Nightly-only feature.
(In reply to Jason Orendorff [:jorendorff] from comment #36)
> Just to explain briefly why there hasn't been more urgency on this issue --
> it's mainly because this is in Parallel JS, a Nightly-only feature.

Ah, thanks. I didn't realize this was Parallel JS only.
I missed that. Should have read the code more closely. A bit more color on this. PJS will never ride the trains out. We will be removing the PJS code early 2015. Unfortunately we are unable to disable the code in Nightly until the end of the year.
Flags: needinfo?(efaustbmo)
Flags: needinfo?(jdemooij)
Rebased the patch and kicked off a Linux64 + Win32 Try run to see how things are now:

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=d6f748a80637
Attachment #8523026 - Flags: review+
(In reply to Jan de Mooij [:jandem] from comment #39)
> Rebased the patch and kicked off a Linux64 + Win32 Try run to see how things
> are now:
> 
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=d6f748a80637

The test is failing in opt builds (no oomAfterAllocations) and on Linux we hit a CrashAtUnhandlableOOM:

Assertion failure: [unhandlable oom] Could not add to pending GC helpers list

The test is probably too brittle. Considering the future of PJS, I think I'll just land this without the testcase.
Hm, the patch that landed is not the same as the attachment; it was updated after review comments. Let's try this again with the newer version:

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=3bae756e1c1d
Flags: needinfo?(jdemooij)
Flags: needinfo?(jdemooij)
https://hg.mozilla.org/mozilla-central/rev/03cd5b31ae49
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
I assume that this went in without sec-approval+ because Parallel JS is trunk only and doesn't affect branches? If so, shouldn't the status flags for 34 and 35 be set to "disabled" to remove ambiguity? 

I'd change this myself but I'm not sure I'm correct here.
Flags: needinfo?(cbook)
(In reply to Al Billings [:abillings] from comment #44)
> I assume that this went in without sec-approval+ because Parallel JS is
> trunk only and doesn't affect branches? If so, shouldn't the status flags
> for 34 and 35 be set to "disabled" to remove ambiguity? 
> 
> I'd change this myself but I'm not sure I'm correct here.

good question Al, yeah it landed without sec-approval. Jan do you know the answer to the flag question, thx!
Flags: needinfo?(cbook) → needinfo?(jdemooij)
(In reply to Al Billings [:abillings] from comment #44)
> I assume that this went in without sec-approval+ because Parallel JS is
> trunk only and doesn't affect branches? If so, shouldn't the status flags
> for 34 and 35 be set to "disabled" to remove ambiguity? 
> 
> I'd change this myself but I'm not sure I'm correct here.

That's right, I changed it. Thanks!
Flags: needinfo?(jdemooij)
So I reproduced the original issue using mozilla-central revision 1ab07aa4d004 using the following configuration command:

CC="/home/kjozwiak/llvm/build/bin/clang -Qunused-arguments -fsanitize=address -Dxmalloc=myxmalloc" CXX="/home/kjozwiak/llvm/build/bin/clang++ -Qunused-arguments -fsanitize=address -Dxmalloc=myxmalloc" AR=ar sh /home/kjozwiak/code/mozilla-central/js/src/configure --enable-address-sanitizer --enable-debug --enable-optimize --enable-nspr-build --enable-more-deterministic --with-ccache --enable-gczeal --enable-debug-symbols --disable-tests

After running "make" & "./js --fuzzing-safe poc.js", I got the following AddressSanitizer crash:

==19250==ERROR: AddressSanitizer: heap-use-after-free on address 0x61500003f000 at pc 0x11b7363 bp 0x7ffff7d5d780 sp 0x7ffff7d5d778
READ of size 8 at 0x61500003f000 thread T0

When I went through the same steps with the latest mozilla-central revision 7d17b594834f, I keep getting the following crash:

kjozwiak@ubuntu:~/code/mozilla-central/js/src/_out/dist/bin$ ./js --fuzzing-safe poc.js
Assertion failure: [unhandlable oom] Could not add to pending GC helpers list, at /home/kjozwiak/code/mozilla-central/js/src/jscntxt.cpp:1276
Hit MOZ_CRASH() at /home/kjozwiak/code/mozilla-central/js/src/jscntxt.cpp:1277
ASAN:SIGSEGV
=================================================================
==103466==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000001559fc2 sp 0x7fff98cafb40 bp 0x7fff98cb0030 T0)
    #0 0x1559fc1 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x1559fc1)
    #1 0x164bc96 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x164bc96)
    #2 0x164dde5 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x164dde5)
    #3 0x1671bd3 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x1671bd3)
    #4 0x1677349 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x1677349)
    #5 0x1679384 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x1679384)
    #6 0x167ac77 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x167ac77)
    #7 0x1648db1 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x1648db1)
    #8 0x14e4484 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x14e4484)
    #9 0x4c8a8a (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x4c8a8a)
    #10 0x7ffce60c2ec4 (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
    #11 0x4ba53a (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x4ba53a)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 ??
==103466==ABORTING

Christian, does this mean it's not fixed or perhaps I'm doing something incorrectly? The second crash is definitely different than the first one but I'm not sure if they are related to each other.
Flags: needinfo?(choller)
All crashes with "Assertion failure: [unhandlable oom]" are expected. This does not necessarily mean that the issue is fixed, because verifying OOM crashes is nearly impossible (they depend on memory usage which makes them unstable across revisions. There is no reliable way to verify them.
Flags: needinfo?(choller)
Thanks Christian! I'm marking this as qe‑verify- as per comment #48.
Flags: qe-verify-
JSBugMon: This bug has been automatically verified fixed on Fx36
Group: javascript-core-security
Group: core-security → core-security-release
Group: core-security-release
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.