Last Comment Bug 1000182 - ASan heap-use-after-free crash [@ js::ThreadSafeContext::onOutOfMemory] with OOM
: ASan heap-use-after-free crash [@ js::ThreadSafeContext::onOutOfMemory] with OOM
Status: RESOLVED FIXED
[jsbugmon:]
: crash, csectype-uaf, leave-open, sec-critical, testcase
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: x86_64 Linux
P1 critical (vote)
: mozilla33
Assigned To: Jason Orendorff [:jorendorff]
: general
: Jason Orendorff [:jorendorff]
Mentors:
Depends on:
Blocks: langfuzz 912928
  Show dependency treegraph
 
Reported: 2014-04-23 07:22 PDT by Christian Holler (:decoder)
Modified: 2016-01-11 14:31 PST (History)
17 users (show)
cbook: in‑testsuite?
kjozwiak: qe‑verify-
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
disabled
disabled
+
disabled
disabled
disabled
verified
disabled
disabled


Attachments
bug-1000182-oom-v1.patch (5.83 KB, patch)
2014-06-12 18:14 PDT, Jason Orendorff [:jorendorff]
shu: review+
Details | Diff | Splinter Review
bug-1000182-prequel-jittest-unhandlable-oom-v1.patch (3.41 KB, patch)
2014-06-17 20:21 PDT, Jason Orendorff [:jorendorff]
terrence.d.cole: review+
Details | Diff | Splinter Review
bug-1000182-oom-v1.patch (rebased) (5.83 KB, patch)
2014-11-14 08:49 PST, Jan de Mooij [:jandem]
jdemooij: review+
Details | Diff | Splinter Review

Description User image Christian Holler (:decoder) 2014-04-23 07:22:38 PDT
The following testcase crashes on mozilla-central revision 1ab07aa4d004 (run with --fuzzing-safe):


function range(n, m) {
  var result = [];
    result.push(i);
  return result;
    for (var i = 0, l = a.length; i < l; i++) {}
}
function b(opFunction, cmpFunction) {
    var result = opFunction({mode:"compile"});
      var result = opFunction({mode:"par"});
}
function a(arr, op, func, cmpFunc) {
  b(function (m) { return arr[op + "Par"].apply(arr, [func, m]); }, function(m) {});
}
a(range(0, 1024), "map", function (i) {});
oomAfterAllocations(10);
Comment 1 User image Christian Holler (:decoder) 2014-04-23 07:25:04 PDT
Crash trace:


==53329==ERROR: AddressSanitizer: heap-use-after-free on address 0x613000037e80 at pc 0xcb6fb3 bp 0x7fffebd2f400 sp 0x7fffebd2f3f8
READ of size 8 at 0x613000037e80 thread T0
    #0 0xcb6fb2 in js::ThreadSafeContext::onOutOfMemory(void*, unsigned long) js/src/jscntxt.h:266
    #1 0xcb6fb2 in js::TempAllocPolicy::onOutOfMemory(void*, unsigned long) js/src/jsalloc.cpp:16
    #2 0xab74e0 in js::TempAllocPolicy::calloc_(unsigned long) js/src/jsalloc.h:68
    #3 0xab74e0 in js::detail::HashTable<js::EncapsulatedPtr<JSScript, unsigned long> const, js::HashSet<js::EncapsulatedPtr<JSScript, unsigned long>, js::DefaultHasher<js::EncapsulatedPtr<JSScript, unsigned long> >, js::TempAllocPolicy>::SetOps, js::TempAllocPolicy>::createTable(js::TempAllocPolicy&, unsigned int) js/src/opt64asan-oom/js/src/../../dist/include/js/HashTable.h:1057
    #4 0xab74e0 in js::detail::HashTable<js::EncapsulatedPtr<JSScript, unsigned long> const, js::HashSet<js::EncapsulatedPtr<JSScript, unsigned long>, js::DefaultHasher<js::EncapsulatedPtr<JSScript, unsigned long> >, js::TempAllocPolicy>::SetOps, js::TempAllocPolicy>::changeTableSize(int) js/src/opt64asan-oom/js/src/../../dist/include/js/HashTable.h:1299
    #5 0x95e25d in js::jit::JitCompartment::sweep(js::FreeOp*) js/src/jit/Ion.cpp:639
    #6 0xd12741 in JSCompartment::sweep(js::FreeOp*, bool) js/src/jscompartment.cpp:578
    [...]

0x613000037e80 is located 0 bytes inside of 384-byte region [0x613000037e80,0x613000038000)
freed by thread T0 here:
    #0 0x460bff in __interceptor_free /srv/repos/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:64
    #1 0x47f29a in DestroyContext(JSContext*, bool) js/src/shell/js.cpp:5586
    #2 0x47f29a in main js/src/shell/js.cpp:6201
    #3 0x7fcf051ec76c in __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226
Comment 2 User image Christian Holler (:decoder) 2014-04-25 05:53:57 PDT
JSBugMon: Cannot process bug: Unknown exception (check manually)
Comment 3 User image Chris Peterson [:cpeterson] 2014-04-28 00:45:48 PDT
decoder: Is this UAF crash reproducible? Do you know why JSBugMon can't bisect it?
Comment 4 User image Christian Holler (:decoder) 2014-05-06 07:26:02 PDT
JSBugMon currently doesn't process ASan failures, but this bug should be reproducible on the given revision. Even manual bisection is unlikely to be effective, since this is an OOM issue.
Comment 5 User image Christian Holler (:decoder) 2014-05-14 06:44:58 PDT
I still see this regularly in the OOM fuzzer.
Comment 6 User image Andrew McCreight [:mccr8] 2014-05-15 13:10:20 PDT
Jason, is this something you'll be able to get to soonish?
Comment 7 User image David Bolter [:davidb] 2014-05-29 13:12:54 PDT
Nudge nudge. (Jason) :)
Comment 8 User image David Bolter [:davidb] 2014-06-05 13:07:27 PDT
Marking 32 affected speculatively.
Comment 9 User image Jason Orendorff [:jorendorff] 2014-06-11 10:12:20 PDT
Investigating...
Comment 10 User image Jason Orendorff [:jorendorff] 2014-06-12 18:14:57 PDT
Created attachment 8439626 [details] [diff] [review]
bug-1000182-oom-v1.patch

Oh, this data structure resides in a Compartment and points to a JSContext. When the JSContext is destroyed, the data structure remains, with a dangling pointer. However, the dangling pointer is used only in OOM code. The test manages to force an OOM during shutdown, after the JSContext is freed.

The fix is pretty straightforward ("don't do that"). I am not sure I understand what the contract is for functions that return TrafficLight though.
Comment 11 User image Shu-yu Guo [:shu] 2014-06-13 13:53:55 PDT
Comment on attachment 8439626 [details] [diff] [review]
bug-1000182-oom-v1.patch

Review of attachment 8439626 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for fixing. I don't feel too strongly about taking or not taking a cx, so just marking r+.

::: js/src/jit/Ion.cpp
@@ +512,5 @@
>      return true;
>  }
>  
>  bool
> +JitCompartment::notifyOfActiveParallelEntryScript(HandleScript script)

Why the change it to not take a cx? I'd think it'd be a bit cleaner for it to report its own OOM error.

@@ +523,5 @@
>          return true;
>      }
>  
>      if (!activeParallelEntryScripts_) {
> +        ScriptSet *scripts = js_new<ScriptSet>();

This change is because of malloc byte tracking, right?

@@ +525,5 @@
>  
>      if (!activeParallelEntryScripts_) {
> +        ScriptSet *scripts = js_new<ScriptSet>();
> +        if (!scripts || !scripts->init()) {
> +            js_delete(scripts);

Ouch, thanks. :(
Comment 12 User image Shu-yu Guo [:shu] 2014-06-13 13:59:57 PDT
(In reply to Shu-yu Guo [:shu] from comment #11)
> Comment on attachment 8439626 [details] [diff] [review]
> bug-1000182-oom-v1.patch
> 
> Review of attachment 8439626 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Thanks for fixing. I don't feel too strongly about taking or not taking a
> cx, so just marking r+.
> 
> ::: js/src/jit/Ion.cpp
> @@ +512,5 @@
> >      return true;
> >  }
> >  
> >  bool
> > +JitCompartment::notifyOfActiveParallelEntryScript(HandleScript script)
> 
> Why the change it to not take a cx? I'd think it'd be a bit cleaner for it
> to report its own OOM error.
> 
> @@ +523,5 @@
> >          return true;
> >      }
> >  
> >      if (!activeParallelEntryScripts_) {
> > +        ScriptSet *scripts = js_new<ScriptSet>();
> 
> This change is because of malloc byte tracking, right?

Oh, no, this is the bug itself. Because it didn't have a SystemAllocPolicy, ScriptSet kept a dangling pointer to cx. I just read your comment. Maybe this is why you changed the function to not take a cx either, to avoid the temptation? In any case, the r+ stands.
Comment 13 User image Jason Orendorff [:jorendorff] 2014-06-17 20:21:04 PDT
Created attachment 8441828 [details] [diff] [review]
bug-1000182-prequel-jittest-unhandlable-oom-v1.patch

As a sort of prequel to the other patch, we need support for unhandlable oom in the jit-test harness.
Comment 14 User image Terrence Cole [:terrence] 2014-06-18 09:51:04 PDT
Comment on attachment 8441828 [details] [diff] [review]
bug-1000182-prequel-jittest-unhandlable-oom-v1.patch

Review of attachment 8441828 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, Jason! I should have added this with CrashAtUnhandlableOOM.
Comment 16 User image Jason Orendorff [:jorendorff] 2014-06-20 10:40:36 PDT
This landed without sec-approval because the code containing the bug is only enabled in Nightly builds.
Comment 19 User image Carsten Book [:Tomcat] 2014-06-23 01:15:44 PDT
had to backout this changes again since jit tests seems to fail after this checkin like https://tbpl.mozilla.org/php/getParsedLog.php?id=42252088&tree=Mozilla-Inbound
Comment 20 User image Jason Orendorff [:jorendorff] 2014-06-27 14:33:53 PDT
Hmm. No clue how this could be causing that. I re-read the patch; nothing.

I don't remember seeing anything like that on try server either, and this was right underneath Symbols so I think I must have run it through the try server 10 times.

Trying again, Windows debug only, and I'll retrigger it 8 times, see if I can get it to happen again.

https://tbpl.mozilla.org/?tree=Try&rev=18e94bfa603a
Comment 21 User image Jason Orendorff [:jorendorff] 2014-06-27 15:28:03 PDT
Well that was certainly annoying. Trying again.

https://tbpl.mozilla.org/?tree=Try&rev=8d5bad21866c
Comment 22 User image Jason Orendorff [:jorendorff] 2014-07-01 13:28:34 PDT
Interesting. Reproduces beautifully, symptoms always the same.

There is no output in the log for jit-test\tests\v8-v5\check-splay.js with "--no-baseline --no-ion".

Try-server-ing with only the first patch, to see if that's the issue. (If not, I'll land it.)

https://tbpl.mozilla.org/?tree=Try&rev=b2348fdc3c16
Comment 23 User image Jason Orendorff [:jorendorff] 2014-07-06 01:55:57 PDT
The first patch was not the issue. Landed:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8d67333d2868
Comment 24 User image Carsten Book [:Tomcat] 2014-07-07 05:41:45 PDT
https://hg.mozilla.org/mozilla-central/rev/8d67333d2868
Comment 25 User image Al Billings [:abillings] 2014-07-23 10:09:05 PDT
Jason, you changed from "fixed" to "affected" for 33 on 7/7. Did the fix not work?
Comment 26 User image Jason Orendorff [:jorendorff] 2014-08-25 08:23:37 PDT
Current status of this bug:  bug-1000182-oom-v1.patch can't land because it apparently causes ancient-Windows-only failures. Completely impossible.

I'm going to try server this again, hoping that in the meantime something unrelated shook loose and it'll work this time. If that doesn't work, we have no good options. I could '#ifdef windows' the fix, but that doesn't really do anyone any good. Maybe I can change the fix to be a very stupid fix that just crashes if we run out of memory at that particular spot---that is, maybe a different fix will tickle Windows in a different way.
Comment 27 User image Jason Orendorff [:jorendorff] 2014-09-04 14:38:09 PDT
https://tbpl.mozilla.org/?tree=Try&rev=6f731e400d4b
Comment 28 User image Al Billings [:abillings] 2014-09-10 10:20:51 PDT
What is the plan here to address bustage so we can get this sec-critical rated issue fixed? :-)
Comment 29 User image Jason Orendorff [:jorendorff] 2014-09-11 03:00:42 PDT
https://tbpl.mozilla.org/?tree=Try&rev=16f8427c7ada
Comment 30 User image Jason Orendorff [:jorendorff] 2014-09-11 16:02:14 PDT
Well... the good news is, this mysterious Mac bustage appears to be completely unrelated to the mysterious Windows bustage that originally prevented this patch from landing.

I'll try something different tomorrow.
Comment 31 User image Matt Wobensmith [:mwobensmith][:matt:] 2014-10-01 09:30:58 PDT
Hi Jason, did you find anything out, based on your last comment?
Comment 32 User image David Bolter [:davidb] 2014-10-22 10:11:41 PDT
[Meta: I just email-pinged for the NI]
Comment 33 User image Al Billings [:abillings] 2014-11-12 10:09:02 PST
Naveed, can we do anything to move this along?
Comment 34 User image Naveed Ihsanullah [:naveed] 2014-11-13 08:41:32 PST
Jason is on PTO through the end of the week. Eric, Jason suggested you may be able to pick this one up. Can you please take a look?
Comment 35 User image Jason Orendorff [:jorendorff] 2014-11-13 14:35:19 PST
Eric, I already asked jandem to take a look, so please coordinate with him.
Comment 36 User image Jason Orendorff [:jorendorff] 2014-11-13 14:38:45 PST
Just to explain briefly why there hasn't been more urgency on this issue -- it's mainly because this is in Parallel JS, a Nightly-only feature.
Comment 37 User image Al Billings [:abillings] 2014-11-13 15:51:43 PST
(In reply to Jason Orendorff [:jorendorff] from comment #36)
> Just to explain briefly why there hasn't been more urgency on this issue --
> it's mainly because this is in Parallel JS, a Nightly-only feature.

Ah, thanks. I didn't realize this was Parallel JS only.
Comment 38 User image Naveed Ihsanullah [:naveed] 2014-11-13 16:19:37 PST
I missed that. Should have read the code more closely. A bit more color on this. PJS will never ride the trains out. We will be removing the PJS code early 2015. Unfortunately we are unable to disable the code in Nightly until the end of the year.
Comment 39 User image Jan de Mooij [:jandem] 2014-11-14 08:49:43 PST
Created attachment 8523026 [details] [diff] [review]
bug-1000182-oom-v1.patch (rebased)

Rebased the patch and kicked off a Linux64 + Win32 Try run to see how things are now:

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=d6f748a80637
Comment 40 User image Jan de Mooij [:jandem] 2014-11-17 05:06:39 PST
(In reply to Jan de Mooij [:jandem] from comment #39)
> Rebased the patch and kicked off a Linux64 + Win32 Try run to see how things
> are now:
> 
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=d6f748a80637

The test is failing in opt builds (no oomAfterAllocations) and on Linux we hit a CrashAtUnhandlableOOM:

Assertion failure: [unhandlable oom] Could not add to pending GC helpers list

The test is probably too brittle. Considering the future of PJS, I think I'll just land this without the testcase.
Comment 41 User image Jan de Mooij [:jandem] 2014-11-17 05:21:37 PST
Hm, the patch that landed is not the same as the attachment; it was updated after review comments. Let's try this again with the newer version:

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=3bae756e1c1d
Comment 43 User image Carsten Book [:Tomcat] 2014-11-18 08:44:18 PST
https://hg.mozilla.org/mozilla-central/rev/03cd5b31ae49
Comment 44 User image Al Billings [:abillings] 2014-11-18 13:23:41 PST
I assume that this went in without sec-approval+ because Parallel JS is trunk only and doesn't affect branches? If so, shouldn't the status flags for 34 and 35 be set to "disabled" to remove ambiguity? 

I'd change this myself but I'm not sure I'm correct here.
Comment 45 User image Carsten Book [:Tomcat] 2014-11-18 23:44:51 PST
(In reply to Al Billings [:abillings] from comment #44)
> I assume that this went in without sec-approval+ because Parallel JS is
> trunk only and doesn't affect branches? If so, shouldn't the status flags
> for 34 and 35 be set to "disabled" to remove ambiguity? 
> 
> I'd change this myself but I'm not sure I'm correct here.

good question Al, yeah it landed without sec-approval. Jan do you know the answer to the flag question, thx!
Comment 46 User image Jan de Mooij [:jandem] 2014-11-19 01:24:01 PST
(In reply to Al Billings [:abillings] from comment #44)
> I assume that this went in without sec-approval+ because Parallel JS is
> trunk only and doesn't affect branches? If so, shouldn't the status flags
> for 34 and 35 be set to "disabled" to remove ambiguity? 
> 
> I'd change this myself but I'm not sure I'm correct here.

That's right, I changed it. Thanks!
Comment 47 User image Kamil Jozwiak [:kjozwiak] 2014-11-20 15:37:45 PST
So I reproduced the original issue using mozilla-central revision 1ab07aa4d004 using the following configuration command:

CC="/home/kjozwiak/llvm/build/bin/clang -Qunused-arguments -fsanitize=address -Dxmalloc=myxmalloc" CXX="/home/kjozwiak/llvm/build/bin/clang++ -Qunused-arguments -fsanitize=address -Dxmalloc=myxmalloc" AR=ar sh /home/kjozwiak/code/mozilla-central/js/src/configure --enable-address-sanitizer --enable-debug --enable-optimize --enable-nspr-build --enable-more-deterministic --with-ccache --enable-gczeal --enable-debug-symbols --disable-tests

After running "make" & "./js --fuzzing-safe poc.js", I got the following AddressSanitizer crash:

==19250==ERROR: AddressSanitizer: heap-use-after-free on address 0x61500003f000 at pc 0x11b7363 bp 0x7ffff7d5d780 sp 0x7ffff7d5d778
READ of size 8 at 0x61500003f000 thread T0

When I went through the same steps with the latest mozilla-central revision 7d17b594834f, I keep getting the following crash:

kjozwiak@ubuntu:~/code/mozilla-central/js/src/_out/dist/bin$ ./js --fuzzing-safe poc.js
Assertion failure: [unhandlable oom] Could not add to pending GC helpers list, at /home/kjozwiak/code/mozilla-central/js/src/jscntxt.cpp:1276
Hit MOZ_CRASH() at /home/kjozwiak/code/mozilla-central/js/src/jscntxt.cpp:1277
ASAN:SIGSEGV
=================================================================
==103466==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000001559fc2 sp 0x7fff98cafb40 bp 0x7fff98cb0030 T0)
    #0 0x1559fc1 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x1559fc1)
    #1 0x164bc96 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x164bc96)
    #2 0x164dde5 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x164dde5)
    #3 0x1671bd3 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x1671bd3)
    #4 0x1677349 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x1677349)
    #5 0x1679384 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x1679384)
    #6 0x167ac77 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x167ac77)
    #7 0x1648db1 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x1648db1)
    #8 0x14e4484 (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x14e4484)
    #9 0x4c8a8a (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x4c8a8a)
    #10 0x7ffce60c2ec4 (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
    #11 0x4ba53a (/home/kjozwiak/code/mozilla-central/js/src/_out/js/src/shell/js+0x4ba53a)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 ??
==103466==ABORTING

Christian, does this mean it's not fixed or perhaps I'm doing something incorrectly? The second crash is definitely different than the first one but I'm not sure if they are related to each other.
Comment 48 User image Christian Holler (:decoder) 2014-11-20 15:41:32 PST
All crashes with "Assertion failure: [unhandlable oom]" are expected. This does not necessarily mean that the issue is fixed, because verifying OOM crashes is nearly impossible (they depend on memory usage which makes them unstable across revisions. There is no reliable way to verify them.
Comment 49 User image Kamil Jozwiak [:kjozwiak] 2014-11-20 15:44:13 PST
Thanks Christian! I'm marking this as qe‑verify- as per comment #48.
Comment 50 User image Christian Holler (:decoder) 2014-11-28 20:51:57 PST
JSBugMon: This bug has been automatically verified fixed on Fx36

Note You need to log in before you can comment on or make changes to this bug.