TM: keep a list of JSGCThings embedded in traces (LIR, TreeInfo, etc.) for marking during GC

RESOLVED FIXED

Status

()

Core
JavaScript Engine
RESOLVED FIXED
9 years ago
9 years ago

People

(Reporter: gal, Assigned: gal)

Tracking

Trunk
x86
Mac OS X
Points:
---
Dependency tree / graph
Bug Flags:
wanted1.9.2 +

Firefox Tracking Flags

(status1.9.2 beta1-fixed)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 attachment, 7 obsolete attachments)

(Assignee)

Description

9 years ago
Prerequisite for GC-ing while on trace.

Comment 1

9 years ago
Clarifying summary a bit...
Summary: TM: scan generated code (or LIR or TreeInfo) for embedded JSGCThings → TM: keep a list of JSGCThings embedded in traces (LIR, TreeInfo, etc.) for marking during GC
(Assignee)

Comment 2

9 years ago
Created attachment 393567 [details] [diff] [review]
patch
Assignee: general → gal
(Assignee)

Updated

9 years ago
Attachment #393567 - Flags: superreview?(brendan)
Attachment #393567 - Flags: review?(graydon)
(Assignee)

Comment 3

9 years ago
The attached patch no longer flushes the JIT cache every time we GC. As the title indicates we mark pointers embedded in the native code during GC.
(Assignee)

Comment 4

9 years ago
Seems to work in the browser.
(Assignee)

Comment 5

9 years ago
Passes try server. This patch helps the box2d.sf.net demo a lot.
(Assignee)

Updated

9 years ago
Flags: wanted1.9.2?
Attachment #393567 - Flags: superreview?(brendan) → review?(brendan)
Comment on attachment 393567 [details] [diff] [review]
patch

>@@ -109,21 +109,20 @@ PurgeThreadData(JSContext *cx, JSThreadD
>     /* FIXME: bug 506341. */
>     js_PurgePropertyCache(cx, &data->propertyCache);
> 
> # ifdef JS_TRACER
>     JSTraceMonitor *tm = &data->traceMonitor;
>     tm->reservedDoublePoolPtr = tm->reservedDoublePool;
> 
>     /*
>-     * FIXME: bug 506117. We should flush only if (cx->runtime->gcRegenShapes),
>-     * but we can't yet, because traces may embed sprop and object references,
>-     * and we don't yet mark such embedded refs.
>+     * If we regenerated shapes, we must also flush the JIT cache.

Nit: "If the GC is regenerating shapes, ..." -- this uses past tense but the order of calls from js_GC runs to here before any shape regeneration code.

>+    js_TraceCodeCacheGCThings(trc, acx);

Better pass &JS_TRACE_MONITOR(acx) here, which suggests making this function a JSTraceMonitor::gcTrace method instead (taking trc as its lone argument). More below on why.

More important: you don't want to call this from js_TraceContext. There are many cx per thread (MT embedding) or runtime (ST embedding), you'll waste time re-marking the same JSTraceMonitor over and over. Move this call to js_GC's mark phase, looping over threads if MT, or marking just the runtime's tm if ST.

Alas there is no js_TraceThreads call into jscntxt.cpp code, and js_PurgeThreads happens a little early. You might add js_TraceThreads to parallel js_PurgeThreads and hide the MT/ST diff in jscntxt.cpp.

>+/*
>+ * INS_CONSTPTR can be used to embed arbitrary pointers into the native code. It should not
>+ * be used directly to embed GC thing pointers. Instead, use the INS_CONSTOBJ/FUN/STR/SPROP
>+ * variants which ensure that the embedded pointer will be kept alive across GCs.
>+ */
>+
>+#define INS_CONST(c)          addName(lir->insImm(c), #c)
>+#define INS_CONSTPTR(p)       addName(lir->insImmPtr(p), #p)
>+#define INS_CONSTWORD(v)      addName(lir->insImmPtr((void *) v), #v)
>+#define INS_CONSTOBJ(obj)     addName(insImmObj(obj), #obj)
>+#define INS_CONSTFUN(fun)     addName(insImmFun(fun), #fun)
>+#define INS_CONSTSTR(str)     addName(insImmStr(str), #str)
>+#define INS_CONSTSPROP(sprop) addName(insImmSProp(sprop), #sprop)

Nit: elsewhere but going away in favor of methods, we have js_{G,S}etSprop -- note lower-case p both times. So insImmSprop here.

>+#define INS_NULL()            INS_CONSTPTR(NULL)
>+#define INS_VOID()            INS_CONST(JSVAL_TO_PSEUDO_BOOLEAN(JSVAL_VOID))

Want INS_ATOM to avoid INS_CONSTSTR(ATOM_TO_STRING(...)) proliferation below. Could even have INS_TYPE_ATOM for the atoms in members named to match cx->runtime->atomState.typeAtoms[i], with INS_COMMON_ATOM taking just the JSType index i.

Nit: INS_MYSUMMERVACATIONWASVERYLONG naming convention here is not aging gracefully. Hence INS_COMMON_ATOM, but perhaps INS_CONST_{PTR,WORD,OBJ,FUN,STR,SPROP} might be easier on the eyes too.

>+void
>+js_TraceCodeCacheGCThings(JSTracer* trc, JSContext* acx)

JSTraceMonitor::gcTrace(JSTracer* trc) -- avoids repetitive and over-narrow words like GCThings (no SProps? ;-P) in the name to boot.

>+{
>+    if (!TRACING_ENABLED(acx))
>+        return;

This is wrong, many cx to one tm and some cx could have the JIT option off but that doesn't mean we shouldn't mark. But see above about js_TraceThreads.

>+
>+    JSTraceMonitor* tm = &JS_TRACE_MONITOR(acx);
>+    if (!tm)
>+        return;

This is useless, JS_TRACE_MONITOR is a struct whose address you just took -- so tm can't be null.

>+    for (size_t i = 0; i < FRAGMENT_TABLE_SIZE; ++i) {
>+        VMFragment* f = tm->vmfragments[i];
>+        while (f) {
>+            TreeInfo* ti;
>+            if ((ti = (TreeInfo*)f->vmprivate) != NULL) {

Spanky new C++ way:

             if (TreeInfo* ti = (TreeInfo*)f->vmprivate) {

>+        type = INS_CONSTSTR(ATOM_TO_STRING(cx->runtime->atomState.typeAtoms[JSTYPE_STRING]));

Here's one of several INS_ATOM or INS_TYPE_ATOM use-case.

/be
(In reply to comment #6)
> More important: you don't want to call this from js_TraceContext. There are
> many cx per thread (MT embedding) or runtime (ST embedding),

ST is not an issue of course -- one cx and rt and thread. MT is the problem. Easy solution: js_TraceThreads.

/be
(Assignee)

Comment 8

9 years ago
> Spanky new C++ way:
> 
>              if (TreeInfo* ti = (TreeInfo*)f->vmprivate) {

I just died a little bit. I had to scan the code 3 times to get it.
(Assignee)

Comment 9

9 years ago
Created attachment 394109 [details] [diff] [review]
patch
Attachment #393567 - Attachment is obsolete: true
Attachment #393567 - Flags: review?(graydon)
Attachment #393567 - Flags: review?(brendan)
(Assignee)

Updated

9 years ago
Attachment #394109 - Attachment is patch: true
Attachment #394109 - Attachment mime type: application/octet-stream → text/plain
(Assignee)

Updated

9 years ago
Attachment #394109 - Flags: review?(brendan)
(In reply to comment #8)
> > Spanky new C++ way:
> > 
> >              if (TreeInfo* ti = (TreeInfo*)f->vmprivate) {
> 
> I just died a little bit. I had to scan the code 3 times to get it.

Ok, the least new/confusing/ugly way to do this is to initialize ti's declaration and avoid nesting an assignment or a declaration in an if condition!

/be
Comment on attachment 394109 [details] [diff] [review]
patch

>+     * If we regenerated shapes, we must also flush the JIT cache.

Verb tense fix per IRC discussion.

>+JSTraceMonitor::mark(JSTracer* trc)

Nice, mark >> gcTrace.

>+{
>+    for (size_t i = 0; i < FRAGMENT_TABLE_SIZE; ++i) {
>+        VMFragment* f = vmfragments[i];
>+        while (f) {
>+            TreeInfo* ti;
>+            if ((ti = (TreeInfo*)f->vmprivate) != NULL) {

Initialize ti's declaration, just as you did for f.

r=me with these picked, thanks.

/be
Attachment #394109 - Flags: review?(brendan) → review+
(Assignee)

Comment 12

9 years ago
The new version is breaking in MT shell. Investigating.
(Assignee)

Comment 13

9 years ago
Created attachment 394134 [details] [diff] [review]
patch

I did some manual code review after the failures in the MT shell and fixed a bunch of GC safety hazards. As a result, the MT shell now crashes even harder.
Attachment #394109 - Attachment is obsolete: true
(Assignee)

Comment 14

9 years ago
I get a sporadic failure in the MT shell:

TEST-PASS | trace-test.js | Math.ceil(Number.NaN)
Assertion failure: fp->slots + fp->script->nfixed + js_ReconstructStackDepth(cx, fp->script, fp->regs->pc) == fp->regs->sp, at ../jstracer.cpp:5727
Trace/BPT trap

ST shell works fine.
(Assignee)

Comment 15

9 years ago
Sometimes the failures look like this:

TEST-PASS | trace-test.js | Math.acos(void 0)
TEST-UNEXPECTED-FAIL | trace-test.js | Math.acos(null) : expected number ( 1.5707963267948966 )  != actual number ( 0 ) 
TEST-PASS | trace-test.js | Math.acos(Number.NaN)
TEST-PASS | trace-test.js | Math.acos("a string")
TEST-PASS | trace-test.js | Math.acos('0')
TEST-UNEXPECTED-FAIL | trace-test.js | Math.acos('1') : expected number ( 0 )  != actual number ( 1 ) 
TEST-PASS | trace-test.js | Math.acos('-1')
TEST-UNEXPECTED-FAIL | trace-test.js | Math.acos(1.00000001) : expected number ( NaN )  != actual number ( 1.00000001 ) 
TEST-PASS | trace-test.js | Math.acos(-1.00000001)
TEST-UNEXPECTED-FAIL | trace-test.js | Math.acos(1) : expected number ( 0 )  != actual number ( 1 ) 
TEST-PASS | trace-test.js | Math.acos(-1)
TEST-PASS | trace-test.js | Math.acos(0)
TEST-PASS | trace-test.js | Math.acos(-0)
TEST-PASS | trace-test.js | Math.acos(Math.SQRT1_2)
TEST-PASS | trace-test.js | Math.acos(-Math.SQRT1_2)
TEST-PASS | trace-test.js | Math.acos(0.9999619230642)
TEST-PASS | trace-test.js | Math.acos(-3.0)
TEST-PASS | trace-test.js | Math.asin(void 0)
TEST-UNEXPECTED-FAIL | trace-test.js | Math.asin(null) : expected number ( 0 )  != actual number ( 1.5707963267948966 ) 
TEST-PASS | trace-test.js | Math.asin(Number.NaN)
TEST-PASS | trace-test.js | Math.asin("string")
TEST-UNEXPECTED-FAIL | trace-test.js | Math.asin("0") : expected number ( 0 )  != actual number ( 1.5707963267948966 ) 
TEST-UNEXPECTED-FAIL | trace-test.js | Math.asin("1") : expected number ( 1.5707963267948966 )  != actual number ( 0 ) 
TEST-UNEXPECTED-FAIL | trace-test.js | Math.asin("-1") : expected number ( -1.5707963267948966 )  != actual number ( 3.141592653589793 ) 
TEST-PASS | trace-test.js | Math.asin(Math.SQRT1_2+'')
TEST-UNEXPECTED-FAIL | trace-test.js | Math.asin(-Math.SQRT1_2+'') : expected number ( -0.7853981633974483 )  != actual number ( 2.356194490192345 )
(Assignee)

Updated

9 years ago
Depends on: 510136
(Assignee)

Comment 16

9 years ago
A bunch of recent changes to nanojit by graydon tickled an old bug in the cache flush logic (510136). #14 and #15 are resolved now.
(Assignee)

Comment 17

9 years ago
Passes now after 510136 started working after I backed out 508051, which was breaking that one. jorendorff!!!!!!!!!! ;)

Sent to the try server too.
(Assignee)

Comment 18

9 years ago
THe patch causes a shutdown leak. The JIT cache isn't cleared so the final GC still keeps a bunch of objects alive.
Don't call traceMonitor.mark if trc->context->runtime->state == JSRTS_LANDING, this should let the GC collect the things that you're currently marking from the GC_LAST_CONTEXT invocation of js_GC from js_DestroyContext.

But why isn't the JIT code cache, including the gcthings and sprops queues, cleared forcibly in this state? I'm asking cuz it seems you'll leave dangling pointers if you don't do more work than just the JSRTS_LANDING mark suppression.

/be
(Assignee)

Comment 20

9 years ago
Yeah, I am skipping marking when LANDING, and flushing the cache instead, but i have to make sure we don't accidentally record again after that. Its not clear to me whether LANDING means no JS can run any more.
(Assignee)

Comment 21

9 years ago
I am still get a sporadic failure in MT shell (MT shell only).

TEST-PASS | trace-test.js | Math.log(null)
TEST-PASS | trace-test.js | Math.log(true)
TEST-PASS | trace-test.js | Math.log(false)
TEST-PASS | trace-test.js | Math.log('0')
TEST-PASS | trace-test.js | Math.log('1')
TEST-PASS | trace-test.js | Math.log("Infinity")
TEST-PASS | trace-test.js | Math.log(Number.NaN)
TEST-PASS | trace-test.js | Math.log(-0.000001)
TEST-PASS | trace-test.js | Math.log(-1)
TEST-PASS | trace-test.js | Math.log(0)
TEST-PASS | trace-test.js | Math.log(-0)
TEST-UNEXPECTED-FAIL | trace-test.js | Math.log(1) : expected number ( 0 )  != actual number ( 1 ) 
TEST-PASS | trace-test.js | Math.log(Number.POSITIVE_INFINITY)
TEST-PASS | trace-test.js | Math.log(Number.NEGATIVE_INFINITY)
TEST-PASS | trace-test.js | Math.max(void 0, 1)
TEST-PASS | trace-test.js | Math.max(void 0, void 0)
TEST-PASS | trace-test.js | Math.max(null, 1)
TEST-PASS | trace-test.js | Math.max(-1, null)
TEST-PASS | trace-test.js | Math.max(true,false)
TEST-PASS | trace-test.js | Math.max("-99","99")
TEST-PASS | trace-test.js | Math.max(Number.NaN,Number.POSITIVE_INFINITY)
TEST-PASS | trace-test.js | Math.max(Number.NaN, 0)
TEST-PASS | trace-test.js | Math.max("a string", 0)
TEST-PASS | trace-test.js | Math.max(Number.NaN,1)
TEST-PASS | trace-test.js | Math.max("a string", Number.POSITIVE_INFINITY)
TEST-PASS | trace-test.js | Math.max(Number.POSITIVE_INFINITY, Number.NaN)
Assertion failure: pc == target, at ../jsopcode.cpp:5483
==14152== 
==14152== Process terminating with default action of signal 5 (SIGTRAP)
==14152==    at 0x12479F: JS_Assert (jsutil.cpp:69)
==14152==    by 0xCF906: ReconstructPCStack(JSContext*, JSScript*, unsigned char*, unsigned char**) (jsopcode.cpp:5483)
==14152==    by 0xCF942: js_ReconstructStackDepth (jsopcode.cpp:5268)
==14152==    by 0x14BC7B: LeaveTree(InterpState&, VMSideExit*) (jstracer.cpp:5790)
==14152==    by 0x15209E: ExecuteTree(JSContext*, nanojit::Fragment*, unsigned int&, VMSideExit**) (jstracer.cpp:5586)
==14152==    by 0x16C3CD: js_MonitorLoopEdge(JSContext*, unsigned int&) (jstracer.cpp:6019)
==14152==    by 0x820C1: js_Interpret (jsops.cpp:905)
==14152==    by 0xA4A70: js_Execute (jsinterp.cpp:1602)
==14152==    by 0x225DE: JS_ExecuteScript (jsapi.cpp:4974)
==14152==    by 0x39F6: Load(JSContext*, JSObject*, unsigned int, long*, long*) (shell/js.cpp:934)
==14152==    by 0xA602A: js_Invoke (jsinterp.cpp:1363)
==14152==    by 0x90B92: js_Interpret (jsops.cpp:2199)
==14152== 
==14152== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==14152== malloc/free: in use at exit: 2,788,298 bytes in 8,314 blocks.
==14152== malloc/free: 292,839 allocs, 284,525 frees, 2,311,080,049 bytes allocated.
==14152== For counts of detected errors, rerun with: -v
==14152== searching for pointers to 8,314 not-freed blocks.
==14152== checked 7,405,516 bytes.
==14152== 
==14152== LEAK SUMMARY:
==14152==    definitely lost: 0 bytes in 0 blocks.
==14152==    indirectly lost: 0 bytes in 0 blocks.
==14152==      possibly lost: 1,575,282 bytes in 2,708 blocks.
==14152==    still reachable: 1,209,676 bytes in 5,523 blocks.
==14152==         suppressed: 3,340 bytes in 83 blocks.
==14152== Rerun with --leak-check=full to see details of leaked memory.
Killed
(Assignee)

Comment 22

9 years ago
This reproduces reliably with the old trace-test harness and valgrind using its default settings. sewardj suggested using --freelist-vol=1000000000. With that it doesn't happen. Asking him for advise and an interpretation.
(Assignee)

Comment 23

9 years ago
Created attachment 394522 [details] [diff] [review]
patch that fixes shutdown leak
Attachment #394134 - Attachment is obsolete: true
(In reply to comment #21)
> I am still get a sporadic failure in MT shell (MT shell only).

Possibly a threading bug.  Sporadic failures on identical runs 
with identical input data says to me a race, or a misuse of
the pthreads api.
(Assignee)

Comment 25

9 years ago
I am pretty confident #24 is incorrect. I think its the background deallocator thread that frees stuff in the background and gets scheduled differently. So its a deallocation bug IMO.
Comment in the code below seems to imply this is harmless.
We're 110% sure about that?

==5374== Possible data race during read of size 4 at 0x665a094 by thread #1
==5374==    at 0x8109FA7: JSBackgroundThread::busy() (jstask.cpp:105)
==5374==    by 0x8099639: RefillDoubleFreeList(JSContext*) (jsgc.cpp:2090)
==5374==    by 0x8099741: js_NewDoubleInRootedValue (jsgc.cpp:2193)
==5374==    by 0x816B149: js_Interpret (jsops.cpp:1016)
==5374==    by 0x809D11A: js_Execute (jsinterp.cpp:1602)
==5374==    by 0x8052D79: JS_ExecuteScript (jsapi.cpp:4974)
==5374==    by 0x804E479: Process(JSContext*, JSObject*, char*, int) (js.cpp:435)
==5374==    by 0x804ED5C: main (js.cpp:843)
==5374==  This conflicts with a previous write of size 4 by thread #2
==5374==    at 0x810A08B: JSBackgroundThread::work() (jstask.cpp:94)
==5374==    by 0x4C49AD0: ??? (in /usr/lib/libnspr4.so)
==5374==    by 0x4BC784B: mythread_wrapper (hg_intercepts.c:201)
==5374==    by 0x4C03FEF: start_thread (in /lib/libpthread-2.8.so)
==5374==    by 0x637CF6D: clone (in /lib/libc-2.8.so)

void
JSBackgroundThread::work()
{
    PR_Lock(lock);
    while (!shutdown) {
        PR_WaitCondVar(wakeup, PR_INTERVAL_NO_TIMEOUT);
        JSBackgroundTask* t;
        while ((t = stack) != NULL) {
            stack = t->next;                 <------------ this write
            PR_Unlock(lock);
            t->run();
            delete t;
            PR_Lock(lock);
        }
    }
    PR_Unlock(lock);
}

bool
JSBackgroundThread::busy()
{
    return !!stack; // we tolerate some racing here <----- races against
                                                           this read
}
(In reply to comment #25)
> I think its the background deallocator thread that frees stuff in the 
> background and gets scheduled differently. So its a deallocation bug IMO.

Not sure I agree.  If P fails or succeeds depending only on
the scheduling, then it has threading problems.

But anyway.  Here's some more info.  I am seeing a bunch of
reads of freed memory.  This is with valgrind --tool=exp-ptrcheck,
which is experimental and thus subject to some level of doubt.

Here's the first two.  There are a whole bunch more.  Curiously
they are all reads, no writes.

TEST-PASS | trace-test.js | Math.abs(null)
TEST-PASS | trace-test.js | Math.abs(true)
TEST-PASS | trace-test.js | Math.abs(false)
TEST-PASS | trace-test.js | Math.abs("a string primitive")
TEST-PASS | trace-test.js | Math.abs(new String( 'a String object' ))
TEST-PASS | trace-test.js | Math.abs(Number.NaN)
==7736== Invalid read of size 1
==7736==    at 0x816FFE7: js_Interpret (jsops.cpp:905)
==7736==    by 0x809D11A: js_Execute (jsinterp.cpp:1602)
==7736==    by 0x8052D79: JS_ExecuteScript (jsapi.cpp:4974)
==7736==    by 0x804D7E4: Load(JSContext*, JSObject*, unsigned int, int*, int*)
                          (js.cpp:934)
==7736==    by 0x809DAEA: js_Invoke (jsinterp.cpp:1363)
==7736==    by 0x8169520: js_Interpret (jsops.cpp:2199)
==7736==    by 0x809D11A: js_Execute (jsinterp.cpp:1602)
==7736==    by 0x8052D79: JS_ExecuteScript (jsapi.cpp:4974)
==7736==    by 0x804E479: Process(JSContext*, JSObject*, char*, int)
                          (js.cpp:435)
==7736==    by 0x804ED5C: main (js.cpp:843)
==7736==  Address 0xab62052 is 106 bytes inside the accessing pointer's
==7736==  once-legitimate range, a block of size 141 free'd
==7736==    at 0x49720C6: free (vg_replace_malloc.c:325)
==7736==    by 0x809AB39: JSFreePointerListTask::run() (jsutil.h:202)
==7736==    by 0x810A09F: JSBackgroundThread::work() (jstask.cpp:96)
==7736==    by 0x4AA4AD0: ??? (in /usr/lib/libnspr4.so)
==7736==    by 0x49FEFEF: start_thread (in /lib/libpthread-2.8.so)
==7736==    by 0x7E7FF6D: clone (in /lib/libc-2.8.so)
==7736==
==7736== Invalid read of size 1
==7736==    at 0x8169346: js_Interpret (jsops.cpp:905)
==7736==    by 0x809D11A: js_Execute (jsinterp.cpp:1602)
==7736==    by 0x8052D79: JS_ExecuteScript (jsapi.cpp:4974)
==7736==    by 0x804D7E4: Load(JSContext*, JSObject*, unsigned int, int*, int*)
           (js.cpp:934)
==7736==    by 0x809DAEA: js_Invoke (jsinterp.cpp:1363)
==7736==    by 0x8169520: js_Interpret (jsops.cpp:2199)
==7736==    by 0x809D11A: js_Execute (jsinterp.cpp:1602)
==7736==    by 0x8052D79: JS_ExecuteScript (jsapi.cpp:4974)
==7736==    by 0x804E479: Process(JSContext*, JSObject*, char*, int)
                          (js.cpp:435)
==7736==    by 0x804ED5C: main (js.cpp:843)
==7736==  Address 0xab62053 is 107 bytes inside the accessing pointer's
==7736==  once-legitimate range, a block of size 141 free'd
==7736==    at 0x49720C6: free (vg_replace_malloc.c:325)
==7736==    by 0x809AB39: JSFreePointerListTask::run() (jsutil.h:202)
==7736==    by 0x810A09F: JSBackgroundThread::work() (jstask.cpp:96)
==7736==    by 0x4AA4AD0: ??? (in /usr/lib/libnspr4.so)
==7736==    by 0x49FEFEF: start_thread (in /lib/libpthread-2.8.so)
==7736==    by 0x7E7FF6D: clone (in /lib/libc-2.8.so)
==7736==
(Assignee)

Comment 28

9 years ago
Created attachment 394680 [details] [diff] [review]
rebased
Attachment #394522 - Attachment is obsolete: true
(Assignee)

Comment 29

9 years ago
If I disable the background thread and do foreground frees only, and enable free poisoning in valgrind, I still get this crash in the MT shell:

TEST-PASS | trace-test.js | Math.floor(-0)== -Math.ceil(0)
TEST-PASS | trace-test.js | Math.floor(Number.POSITIVE_INFINITY)
TEST-PASS | trace-test.js | Math.floor(Number.POSITIVE_INFINITY) == -Math.ceil(Number.NEGATIVE_INFINITY)
TEST-PASS | trace-test.js | Math.floor(Number.NEGATIVE_INFINITY)
TEST-PASS | trace-test.js | Math.floor(Number.NEGATIVE_INFINITY) == -Math.ceil(Number.POSITIVE_INFINITY)
Assertion failure: fp->slots + fp->script->nfixed + js_ReconstructStackDepth(cx, fp->script, fp->regs->pc) == fp->regs->sp, at ../jstracer.cpp:5792
==15014== 
==15014== Process terminating with default action of signal 5 (SIGTRAP)
==15014==    at 0x1247B3: JS_Assert (jsutil.cpp:69)
==15014==    by 0x14BCD9: LeaveTree(InterpState&, VMSideExit*) (jstracer.cpp:5790)
==15014==    by 0x1520B2: ExecuteTree(JSContext*, nanojit::Fragment*, unsigned int&, VMSideExit**) (jstracer.cpp:5586)
==15014==    by 0x16C3E1: js_MonitorLoopEdge(JSContext*, unsigned int&) (jstracer.cpp:6019)
==15014==    by 0x820D3: js_Interpret (jsops.cpp:905)
==15014==    by 0xA4A82: js_Execute (jsinterp.cpp:1602)
==15014==    by 0x2272E: JS_ExecuteScript (jsapi.cpp:4974)
==15014==    by 0x3B46: Load(JSContext*, JSObject*, unsigned int, long*, long*) (shell/js.cpp:934)
==15014==    by 0xA603C: js_Invoke (jsinterp.cpp:1363)
==15014==    by 0x90BA4: js_Interpret (jsops.cpp:2199)
==15014==    by 0xA4A82: js_Execute (jsinterp.cpp:1602)
==15014==    by 0x2272E: JS_ExecuteScript (jsapi.cpp:4974)

I don't get this crash reliably without valgrind.
(Assignee)

Comment 30

9 years ago
Without valgrind I get sporadic failures such as this one:

TEST-PASS | trace-test.js | Math.abs('-Infinity')
TEST-PASS | trace-test.js | Math.acos(void 0)
TEST-UNEXPECTED-FAIL | trace-test.js | Math.acos(null) : expected number ( 1.5707963267948966 )  != actual number ( 0 ) 
TEST-PASS | trace-test.js | Math.acos(Number.NaN)
TEST-PASS | trace-test.js | Math.acos("a string")
I can't seem to reproduce this crash in Linux.
(In reply to comment #31)
I did see it (running natively) on my suse 11.0 rig last night, but like
Andreas says, it was sporadic.
(In reply to comment #29)
> If I disable the background thread and do foreground frees only, and enable
> free poisoning in valgrind, I still get this crash in the MT shell:

How disabled is "disabled" ?  If you do a MT build, link against libpthread,
but don't actually create the second thread, does it still crash?
(Assignee)

Comment 34

9 years ago
All frees happen in the foreground. The background thread is started, but waits in a condvar until shutdown. With that setup I don't see the sporadic failures, but with valgrind it still dies.
Read of size one from the interpreter => loading a bytecode. Could it be that the bug is failing to mark function objects that own scripts whose loop headers have traced to code that is still cached? IOW we need to preserve not just immediates in the LIR from GC, but also the loop-containing function object and its script.

/be
(Assignee)

Comment 36

9 years ago
The failing parts of trace-tests are always the same class of math tests.

TEST-PASS | trace-test.js | Math.acos(void 0)
TEST-UNEXPECTED-FAIL | trace-test.js | Math.acos(null) : expected number ( 1.5707963267948966 )  != actual number ( 0 ) 
TEST-PASS | trace-test.js | Math.acos(Number.NaN)

Those are generated using eval, which uses the funky DestroyScriptsToGC pile.

Brendan is probably on to something in #35.
(Assignee)

Comment 37

9 years ago
If I add a return to DestroyScript and never actually free those, the bug goes away. My working theory is that graydon's purge-fragments-for-script in destroy-script is broken. Since I no longer flush the cache upon GC, this bug gets exposed. The bug seems latent to me.
(Assignee)

Comment 38

9 years ago
Created attachment 394881 [details]
this test case reliably crashes with the patch applied
(Assignee)

Comment 39

9 years ago
We destroy a whole bunch of scripts in the testcase but never end up calling PurgeScriptFragments.
(Assignee)

Comment 40

9 years ago
Created attachment 394889 [details] [diff] [review]
always purge fragments for scripts when we dispose them
Attachment #394680 - Attachment is obsolete: true
(Assignee)

Updated

9 years ago
Attachment #394889 - Flags: review?(brendan)
Comment on attachment 394889 [details] [diff] [review]
always purge fragments for scripts when we dispose them

This could get costly -- O(n*m) for n scripts (functions, eval scripts -- top-level scripts are manually destroyed) cross m fragments to find.

/be
Attachment #394889 - Flags: review?(brendan) → review+
(Assignee)

Comment 42

9 years ago
http://hg.mozilla.org/tracemonkey/rev/aa6bb02120e3
Whiteboard: fixed-in-tracemonkey
(Assignee)

Comment 43

9 years ago
Created attachment 395170 [details] [diff] [review]
explicitly disable code cache marking, landing stage is too late in the shutdown cycle
Attachment #394881 - Attachment is obsolete: true
Attachment #394889 - Attachment is obsolete: true
(Assignee)

Updated

9 years ago
Attachment #394889 - Attachment is obsolete: false
(Assignee)

Updated

9 years ago
Attachment #395170 - Attachment is obsolete: true
(Assignee)

Updated

9 years ago
Depends on: 511252
(Assignee)

Updated

9 years ago
Duplicate of this bug: 508128

Comment 45

9 years ago
http://hg.mozilla.org/mozilla-central/rev/aa6bb02120e3
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
Duplicate of this bug: 500346
What bug should this block, the bug that changes things so we don't dump JITted code on every GC?

/be
Blocks: 504640

Comment 48

9 years ago
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/39d714972b93
status1.9.2: --- → beta1-fixed
Flags: wanted1.9.2? → wanted1.9.2+

Updated

9 years ago
Blocks: 524051
You need to log in before you can comment on or make changes to this bug.