Last Comment Bug 658016 - TI: Crash [@ JSString::isStaticAtom()]
: TI: Crash [@ JSString::isStaticAtom()]
Status: RESOLVED FIXED
fixed-in-tracemonkey
: crash, testcase
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: x86_64 Linux
: -- critical (vote)
: ---
Assigned To: Igor Bukanov
:
: Jason Orendorff [:jorendorff]
Mentors:
Depends on:
Blocks: langfuzz 601234
  Show dependency treegraph
 
Reported: 2011-05-18 12:25 PDT by Christian Holler (:decoder)
Modified: 2011-08-05 00:54 PDT (History)
6 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
shell testcase, unpack, chdir and run main.js with options "-j -m -a" (3.58 KB, application/x-compressed-tar)
2011-05-18 12:25 PDT, Christian Holler (:decoder)
no flags Details
v1 (9.97 KB, patch)
2011-05-19 17:21 PDT, Igor Bukanov
no flags Details | Diff | Splinter Review
v2 (14.21 KB, patch)
2011-05-20 09:36 PDT, Igor Bukanov
wmccloskey: review+
Details | Diff | Splinter Review

Description Christian Holler (:decoder) 2011-05-18 12:25:19 PDT
Created attachment 533372 [details]
shell testcase, unpack, chdir and run main.js with options "-j -m -a"

The attached testcase crashes on TI revision 97f9e3274bd5 (run main.js with -j -m -a), tested on 64 bit.

Backtrace:

==12185== Invalid read of size 8
==12185==    at 0x46ED10: JSString::isStaticAtom() const (jsstr.h:367)
==12185==    by 0x4CCF38: js::gc::MarkId(JSTracer*, jsid) (jsgcmark.cpp:252)
==12185==    by 0x4CCFDA: js::gc::MarkId(JSTracer*, jsid, char const*) (jsgcmark.cpp:263)
==12185==    by 0x4CE33F: js::gc::MarkChildren(JSTracer*, js::Shape const*) (jsgcmark.cpp:653)
==12185==    by 0x4CE62C: JS_TraceChildren (jsgcmark.cpp:719)
==12185==    by 0x42D8E5: JS_DumpHeap (jsapi.cpp:2554)
==12185==    by 0x40C21D: DumpHeap(JSContext*, unsigned int, jsval_layout*) (js.cpp:2730)
==12185==    by 0x4F342D: js::CallJSNative(JSContext*, int (*)(JSContext*, unsigned int, js::Value*), unsigned int, js::Value*) (jscntxtinlines.h:293)
==12185==    by 0x716966: CallCompiler::generateNativeStub() (MonoIC.cpp:839)
==12185==    by 0x7114FE: js::mjit::ic::NativeCall(js::VMFrame&, js::mjit::ic::CallICInfo*) (MonoIC.cpp:1108)
==12185==    by 0x41B3F82: ???
==12185==    by 0x691BC0: js::mjit::EnterMethodJIT(JSContext*, js::StackFrame*, void*, js::Value*) (MethodJIT.cpp:882)
==12185==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==12185== 
==12185== 
==12185== Process terminating with default action of signal 11 (SIGSEGV)
Comment 1 Brian Hackett (:bhackett) 2011-05-19 11:27:09 PDT
TM bug, JS_DumpHeap calls TraceRuntime which invokes the conservative GC without first purging the compartment's freelists.  This causes the conservative GC to think an object is live when it has not been allocated yet, and we crash.

This is I think a regression from bug 601234.  Igor, do you mind looking at this?
Comment 2 Andrew McCreight [:mccr8] 2011-05-19 11:39:31 PDT
That sounds vaguely like Bug 652985, though that was before bug 601234, so it couldn't have been caused by that.
Comment 3 Igor Bukanov 2011-05-19 17:21:11 PDT
Created attachment 533853 [details] [diff] [review]
v1

The patch adds a helper class that temporary copies the free list to the arena and then clears the arenas again in the destructor. Its instance in TraceRuntime ensures that the conservative scanner can safely access the free list when dumping the arenas.

Another fix is the call to gcHelperThread.waitBackgroundSweepEnd before tracing the heap. Without that the conservative scanner may access just finalized things.
Comment 4 Igor Bukanov 2011-05-20 09:36:46 PDT
Created attachment 534004 [details] [diff] [review]
v2

Here is an update in view of bug 658137 landing. The new patch uses AutoCopyFreeListToArenas in IterateCells to remove the free list checks in IterateArenaCells. 

IterateCells also has a bug with missing RecordNativeStackTopForGC. This is important if AutoSession would wait for another GC to finish. To make sure that this would not happen in future I have added the call to LetOtherGCFinish where it should have been put long time ago.
Comment 5 [PTO to Dec5] Bill McCloskey (:billm) 2011-05-20 09:59:22 PDT
Comment on attachment 534004 [details] [diff] [review]
v2

Review of attachment 534004 [details] [diff] [review]:
-----------------------------------------------------------------

Now I understand the reason for RecordNativeStackTopForGC. Thanks.

::: js/src/jsgc.cpp
@@ -2717,5 @@
> -    /* -16k because it is possible to perform a GC during an overrecursion report. */
> -    JS_ASSERT_IF(cx->stackLimit, JS_CHECK_STACK_SIZE(cx->stackLimit - (1 << 14), &stackDummy));
> -# endif
> -#endif
> -

I'm glad to see this go.

::: js/src/jsgc.h
@@ +750,5 @@
> +        }
> +    }
> +
> +    /*
> +     * Copy temporary the free list heads to the arenas so the code can see

"Temporarily copy"

@@ +764,5 @@
> +        }
> +    }
> +
> +    /*
> +     * Clear the free lists in arenas that were temporary set there using

"temporarily"
Comment 6 [PTO to Dec5] Bill McCloskey (:billm) 2011-05-20 09:59:31 PDT
Comment on attachment 534004 [details] [diff] [review]
v2

Review of attachment 534004 [details] [diff] [review]:
-----------------------------------------------------------------

Now I understand the reason for RecordNativeStackTopForGC. Thanks.

::: js/src/jsgc.cpp
@@ -2717,5 @@
> -    /* -16k because it is possible to perform a GC during an overrecursion report. */
> -    JS_ASSERT_IF(cx->stackLimit, JS_CHECK_STACK_SIZE(cx->stackLimit - (1 << 14), &stackDummy));
> -# endif
> -#endif
> -

I'm glad to see this go.

::: js/src/jsgc.h
@@ +750,5 @@
> +        }
> +    }
> +
> +    /*
> +     * Copy temporary the free list heads to the arenas so the code can see

"Temporarily copy"

@@ +764,5 @@
> +        }
> +    }
> +
> +    /*
> +     * Clear the free lists in arenas that were temporary set there using

"temporarily"
Comment 8 Igor Bukanov 2011-05-21 13:33:46 PDT
Christian: could you test that the landed patch fixes the test case? I cannot reproduce it on my box even if the problem from the comment 1 is real.
Comment 9 Brian Hackett (:bhackett) 2011-05-22 00:05:50 PDT
It looks like non-THREADSAFE builds are burning, TraceRuntime uses 'rt' without defining it first.
Comment 10 Igor Bukanov 2011-05-22 00:30:32 PDT
http://hg.mozilla.org/tracemonkey/rev/57412df720cf - followup to fix !JS_THREADSAFE builds.
Comment 11 Christian Holler (:decoder) 2011-05-22 03:29:24 PDT
Igor: The original test  in comment 1 is for the JM branch. So to verify this fix, I need the changes pushed on the JM branch as well. It's likely that the test will not repro on TM at all, even if the underlying problem is in TM.
Comment 12 Chris Leary [:cdleary] (not checking bugmail) 2011-05-23 14:16:51 PDT
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/0e2d15573f6c
http://hg.mozilla.org/mozilla-central/rev/57412df720cf
Comment 13 Christian Holler (:decoder) 2011-05-24 08:00:27 PDT
I retested this on TI tip and it does no longer reproduce now :)
Comment 14 Igor Bukanov 2011-05-24 09:09:31 PDT
(In reply to comment #13)
> I retested this on TI tip and it does no longer reproduce now :)

Thanks!

Note You need to log in before you can comment on or make changes to this bug.