Closed Bug 560798 Opened 14 years ago Closed 14 years ago

TM: recorder doesn't handle errors that also cause an abort

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: gal, Assigned: luke)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(2 files, 3 obsolete files)

Luke's type system seems to not handle the case where an abort happens and within the same deep bail an error occurs. If we report the abort back, the error is swallowed. If we report the error, various recorder pieces are touched and cause badness.

Note: someone please steal this bug. Luke?
Proposed patch:

--- a/js/src/jstracer.cpp
+++ b/js/src/jstracer.cpp
@@ -7120,23 +7120,25 @@ TraceRecorder::monitorRecording(JSOp op)
     }
     if (status == ARECORD_ABORTED) {
         JS_ASSERT(!localtm.recorder);
         return ARECORD_ABORTED;
     }
 
   stop_recording:
     /* Handle lazy abort / OOM. */
-    if (outOfMemory() || OverfullJITCache(&localtm)) {
+    if ((localtm.recorder && outOfMemory()) ||
+        OverfullJITCache(&localtm)) {
         ResetJIT(cx, FR_OOM);
         return ARECORD_ABORTED;
     }
     if (StatusAbortsRecording(status)) {
-        AbortRecording(cx, js_CodeName[op]);
-        return ARECORD_ABORTED;
+        if (localtm.recorder)
+            AbortRecording(cx, js_CodeName[op]);
+        return (status == ARECORD_ERROR) ? ARECORD_ERROR : ARECORD_ABORTED;
     }
 
     return status;
 }
 
 JS_REQUIRES_STACK void
 AbortRecording(JSContext* cx, const char* reason)
 {
Here is a slightly improved version:

diff --git a/js/src/jstracer.cpp b/js/src/jstracer.cpp
--- a/js/src/jstracer.cpp
+++ b/js/src/jstracer.cpp
@@ -7120,23 +7120,25 @@ TraceRecorder::monitorRecording(JSOp op)
     }
     if (status == ARECORD_ABORTED) {
         JS_ASSERT(!localtm.recorder);
         return ARECORD_ABORTED;
     }
 
   stop_recording:
     /* Handle lazy abort / OOM. */
-    if (outOfMemory() || OverfullJITCache(&localtm)) {
+    if ((localtm.recorder && localtm.recorder->outOfMemory()) ||
+        OverfullJITCache(&localtm)) {
         ResetJIT(cx, FR_OOM);
         return ARECORD_ABORTED;
     }
     if (StatusAbortsRecording(status)) {
-        AbortRecording(cx, js_CodeName[op]);
-        return ARECORD_ABORTED;
+        if (localtm.recorder)
+            AbortRecording(localcx, js_CodeName[op]);
+        return (status == ARECORD_ERROR) ? ARECORD_ERROR : ARECORD_ABORTED;
     }
 
     return status;
 }
 
 JS_REQUIRES_STACK void
 AbortRecording(JSContext* cx, const char* reason)
 {
I manually checked all code paths and I think all current paths are safe, but this is pretty severe and should be fixed right away (it crashes my iterator patch that does add a path that does abort + report and error at the same time).
Eech, good find.

IIUC, in addition to changes you suggest above, we'd also need to soup up the ExecuteTree call in attemptTreeCall to possibly return ARECORD_ERROR instead of always returning ARECORD_ABORTED.  Since [A]RECORD_ERROR means "there was an error, but the recorder was not aborted", I'm inclined to add ARECORD_ABORTED_WITH_ERROR.  Does that sound right?
Assignee: general → lw
So in investigating our natives-returning-error-handling, I found another, highly unlikely bug: although we check the return value of natives to see whether to bail or not, once we return to the interpreter, we only check cx->throwing.  Thus, if a native returns false without setting cx->throwing, the false is lost.  But to return 0 without setting cx->throwing is to quit, and so this would have to be a native that quits, but only some of the time.  Do such natives exist?  It seems like no.
The only case I can think of is the old branch callback, which would trigger an uncatchable failure.  Otherwise, false-without-error-report is a bug in the native, but we should terminate the script rather than continuing, I think!
Attached patch patch (obsolete) — Splinter Review
If I've understood all the cases correctly, this patch catches the abort/error case and also cleans up the monitorRecording logic so its easier to understand.  
I didn't end up needing any extra ARECORD_ flag since ARECORD_ERROR, returned from monitorRecording, already means "the recorder was aborted (somehow), goto error".  The patch passes trace and ref tests.  Andreas: does it fix the crash you were hitting?
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000
0x0000000100146d57 in JS_Assert (s=0x1001ec188 "status == ARECORD_COMPLETED || status == ARECORD_ABORTED", file=0x1001e59d5 "../jstracer.cpp", ln=7138) at ../jsutil.cpp:76
76	    *((int *) NULL) = 0;  /* To continue from here in GDB: "return" then "continue". */
(gdb) bt
#0  0x0000000100146d57 in JS_Assert (s=0x1001ec188 "status == ARECORD_COMPLETED || status == ARECORD_ABORTED", file=0x1001e59d5 "../jstracer.cpp", ln=7138) at ../jsutil.cpp:76
#1  0x0000000100199110 in js::TraceRecorder::monitorRecording (this=0x100415070, op=JSOP_NEXTITER) at ../jstracer.cpp:7138
#2  0x000000010007eaa5 in js_Interpret (cx=0x10088b400) at jsops.cpp:78
#3  0x00000001000a7f8e in js_Execute () at jsinterp.cpp:1103
#4  0x0000000100011e01 in JS_ExecuteScript (cx=0x10088b400, obj=0x101002000, script=0x100414ce0, rval=0x0) at ../jsapi.cpp:4804
#5  0x0000000100009f9c in Process (cx=0x10088b400, obj=0x101002000, filename=0x7fff5fbffa68 "x.js", forceTTY=0) at ../../shell/js.cpp:449
#6  0x000000010000abdc in ProcessArgs (cx=0x10088b400, obj=0x101002000, argv=0x7fff5fbff910, argc=2) at ../../shell/js.cpp:863
#7  0x000000010000ad57 in main (argc=2, argv=0x7fff5fbff910, envp=0x7fff5fbff928) at ../../shell/js.cpp:5045
(gdb)
Attached patch bigger (obsolete) — Splinter Review
Brendan explained that, on OOM, functions fail without setting cx->throwing, so there is a common case where testing cx->throwing isn't sufficient.  Second, as Andreas's new patch demonstrates, many of these RETURN_ERROR situations can reenter and thus delete the current TraceRecorder.  This means:
 1) catch errors using cx->builtinStatus in ExecuteTree, and propagate this up to the interpreter
 2) turn all RECORD_ERRORs into ARECORD_ERRORs (and all the functions that return those from RecordingStatus to AbortableRecordingStatus)

(2) is a lot of boring syntax.  (1) requires more careful change since errors can propagate up from ExecuteTree through several paths.  The patch passes trace/ref tests, not quite ready though.
Attachment #440620 - Attachment is obsolete: true
Attached patch better (obsolete) — Splinter Review
The last patch went a bit overboard converting RecordingStatus to AbortableRecording status; this one undoes the unnecessary parts.  I also came across a few cases where we weren't checking for deleted TraceRecorder and few others where we were returning ABORTED instead of ERROR.

Going to build a browser and run some tests first.  Andreas: does this patch work with yours?
Attachment #440672 - Attachment is obsolete: true
This patch utterly conflicts with mine. One of us is going to wade through knee deep rebasing hell.
xpcshell tests found a silly assertion I added.  Surprisingly we don't have any trace-tests that deep abort while recording, so I added one.

(In reply to comment #11)
> This patch utterly conflicts with mine. One of us is going to wade through knee
> deep rebasing hell.

Psh, just a flesh wound.  I'm happy to rebase my patch on top of yours.
Attachment #440826 - Attachment is obsolete: true
Attachment #440912 - Flags: review?(gal)
I don't have review yet, and I need your patch to not crash since my patch tickles this bug, so its going to be my pound o flesh.
Comment on attachment 440912 [details] [diff] [review]
minus silly assertion

Looks good. I suggest some fuzzing.
Attachment #440912 - Flags: review?(gal) → review+
(In reply to comment #14)
> (From update of attachment 440912 [details] [diff] [review])
> Looks good. I suggest some fuzzing.

Please cc me if fuzzing is needed. :)

Could this patch please be rebased to TM tip?
Attached patch refreshedSplinter Review
http://hg.mozilla.org/tracemonkey/rev/04b98e71bc55
Whiteboard: fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/04b98e71bc55
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: