Closed Bug 1165966 Opened 5 years ago Closed 5 years ago

Assertion failure: safepoint->hasBoxedValue(alloc), at js/src/jit/RegisterAllocator.cpp:338 or Assertion failure: CurrentThreadIsGCSweeping(), at js/src/gc/Barrier.h:442 with OOM

Categories

(Core :: JavaScript Engine, defect, critical)

x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
firefox41 --- affected

People

(Reporter: decoder, Assigned: jonco)

References

Details

(Keywords: assertion, regression, testcase, Whiteboard: [jsbugmon:ignore])

Attachments

(3 files, 1 obsolete file)

The following testcase crashes on mozilla-central revision 35918b0441b4 (build with --enable-optimize --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --enable-debug, run with --fuzzing-safe --thread-count=2 --ion-eager):

var lfcode = new Array();
lfcode.push(`
var g = newGlobal();
var N = 4;
for (var i = 0; i < N; i++) {
    var dbg = Debugger(g);
    oomAfterAllocations(10);
}
`);
options = function() {}
var lfRunTypeId = -1;
var file = lfcode.shift();
loadFile(file)
function loadFile(lfVarx) {
    if (lfVarx.substr(-7) != ".js" && lfVarx.length != 2) unescape("x");
    try {
        if (lfVarx.substr(-3) != ".js" && lfVarx.length != 1) {
          evaluate(lfVarx);
        }
    } catch (lfVare) {}
}



Backtrace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff6bb8700 (LWP 16586)]
0x00000000009b4816 in js::jit::AllocationIntegrityState::checkSafepointAllocation (this=this@entry=0x7ffff6bb6920, ins=ins@entry=0x7ffff500dc70, vreg=vreg@entry=2, alloc=..., 
    populateSafepoints=populateSafepoints@entry=false) at js/src/jit/RegisterAllocator.cpp:338
338     js/src/jit/RegisterAllocator.cpp: No such file or directory.
(gdb) bt 16
#0  0x00000000009b4816 in js::jit::AllocationIntegrityState::checkSafepointAllocation (this=this@entry=0x7ffff6bb6920, ins=ins@entry=0x7ffff500dc70, vreg=vreg@entry=2, alloc=..., 
    populateSafepoints=populateSafepoints@entry=false) at js/src/jit/RegisterAllocator.cpp:338
#1  0x00000000009beaed in js::jit::AllocationIntegrityState::checkIntegrity (this=this@entry=0x7ffff6bb6920, block=0x7ffff500d750, ins=<optimized out>, vreg=<optimized out>, alloc=..., 
    populateSafepoints=false) at js/src/jit/RegisterAllocator.cpp:228
#2  0x00000000009bf60f in js::jit::AllocationIntegrityState::check (this=this@entry=0x7ffff6bb6920, populateSafepoints=populateSafepoints@entry=false)
    at js/src/jit/RegisterAllocator.cpp:172
#3  0x00000000008fc1bb in js::jit::GenerateLIR (mir=mir@entry=0x7ffff513f1a8) at js/src/jit/Ion.cpp:1554
#4  0x0000000000920f2f in js::jit::CompileBackEnd (mir=0x7ffff513f1a8) at js/src/jit/Ion.cpp:1619
#5  0x000000000063a2b2 in js::HelperThread::handleIonWorkload (this=this@entry=0x7ffff694c318) at js/src/vm/HelperThreads.cpp:1126
#6  0x000000000063bae7 in js::HelperThread::threadLoop (this=0x7ffff694c318) at js/src/vm/HelperThreads.cpp:1422
#7  0x00000000006ac6d1 in nspr::Thread::ThreadRoutine (arg=0x7ffff69301e0) at js/src/vm/PosixNSPR.cpp:45
#8  0x00007ffff7bc4182 in start_thread (arg=0x7ffff6bb8700) at pthread_create.c:312
#9  0x00007ffff6cb3fbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
=> 0x9b4816 <js::jit::AllocationIntegrityState::checkSafepointAllocation(js::jit::LInstruction*, unsigned int, js::jit::LAllocation, bool)+598>:    movl   $0x152,0x0
rax            0x0 0
rbx            0x7ffff500dcc8  140737303862472
rcx            0x7ffff6cc1ca9  140737333959849
rdx            0x0 0
rsi            0x7ffff6f7a9d0  140737336814032
rdi            0x7ffff6f791c0  140737336807872
rbp            0x7ffff6bb6700  0x7ffff6bb6700
rsp            0x7ffff6bb6600  0x7ffff6bb6600
r8             0x7ffff6bb8700  140737332872960
r9             0x6372732f736a2f6c     7165916604736876396
r10            0x7ffff6bb63c0  140737332863936
r11            0x7ffff6c27960  140737333328224
r12            0x6 6
r13            0x0 0
r14            0x7ffff500dc70  140737303862384
r15            0x0 0
rip            0x9b4816 0x9b4816 <js::jit::AllocationIntegrityState::checkSafepointAllocation(js::jit::LInstruction*, unsigned int, js::jit::LAllocation, bool)+598>



These assertions don't reproduce reliably. I get the safepoint assertion approximately 10% of the time, the other one 10-20% of the time, the test is [unhandlable oom] asserts. This is one of the problems with oomAfterAllocations: It affects all threads and here it seems to me that it depends on which thread goes OOM first.
Needinfo from terrence and jonco based on IRC conversation with jandem.
Flags: needinfo?(terrence)
Flags: needinfo?(jcoppeard)
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:ignore]
Attached patch bug1165966-debugger-oom (obsolete) — Splinter Review
The problem is that for objects whose lifetime is managed by the GC, there are actually two possible places they can be destroyed:
 1) In GC sweeping
 2) During initialization if an error occurs (e.g. OOM)

Out assertions don't currently take account of the second case.

The fixes here concern the creation of Debugger objects, and the HeapPtr and WeakMap objects they contain.  These is currently created using a UniquePtr to make sure we delete the object if the initialization fails.  This is, or should be, a common pattern for fallible creation of complex objects.

What I've done here is to add a flag to the runtime to indicate that we've hit an error during initialization, and added a new delete policy that sets this flag while deleting.  This is then used in a new version of make_unique(), make_unqiue_for_init().  Oh yes and I made the delete policy a MOZ_STACK_CLASS so we can't embed one of these UniquePtrs in a heap object to subvert the lifetime assertions.
Assignee: nobody → jcoppeard
Flags: needinfo?(jcoppeard)
Attachment #8607599 - Flags: review?(terrence)
Duplicate of this bug: 1164532
Having said that, I need to check it fixes the safepoint assertion because I was mainly seeing a HeapPtr assertion for this testcase.
(In reply to Jon Coppeard (:jonco) from comment #4)
No, it doesn't so there will be another patch to follow.
Keywords: leave-open
Comment on attachment 8607599 [details] [diff] [review]
bug1165966-debugger-oom

Review of attachment 8607599 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/MallocProvider.h
@@ +187,5 @@
> +     * managed by the GC, so that we can set a flag to indicate that destruction
> +     * is happening due to initialization failure.  This allows us to assert that
> +     * these objects are only destroyed at the correct times.
> +     */
> +    JS_DECLARE_MAKE_METHODS(make_unique_for_init, new_, JS::DeletePolicyForInit, MOZ_ALWAYS_INLINE)

I don't see any real downside to making make_unique set the flag and avoiding a separate method for this particular case. I guess it would make debug builds yet another epsilon slower, but that seems infinitely preferable to making consumers pick between these two similar methods at the risk of introducing an intermittent debug-only failure.

::: js/src/vm/Runtime.h
@@ +1078,5 @@
>      bool                hadOutOfMemory;
>  
> +#ifdef DEBUG
> +    /* We are curently deleting an object due to an initialization failure. */
> +    bool                handlingInitFailure;

It should be fine to make this a mozilla::DebugOnly as all the users are #ifdef DEBUG.
Attachment #8607599 - Flags: review?(terrence) → review+
(In reply to Terrence Cole [:terrence] from comment #6)
> I don't see any real downside to making make_unique set the flag and
> avoiding a separate method for this particular case. I guess it would make
> debug builds yet another epsilon slower, but that seems infinitely
> preferable to making consumers pick between these two similar methods at the
> risk of introducing an intermittent debug-only failure.

I agree that this is not a great solution.  But the problem is that this is an escape hatch for the HeapPtr lifetime checking and if we make this happen by default we lose those checks everywhere an object is owned though a UniquePtr allocated on the heap.

Maybe it would be better to have an explicit guard object that allows destruction of GC managed resources while it is live and avoid baking this into a different type of UnqiuePtr.  The trouble is making it obvious that you need to use such a guard, although the same problem exists for make_unique_for_init().
Here's a patch implementing the approach described above.
Attachment #8607599 - Attachment is obsolete: true
Attachment #8608092 - Flags: review?(terrence)
Add some error checking in the backtracking register allocator.
Attachment #8608093 - Flags: review?(bhackett1024)
Attachment #8608093 - Flags: review?(bhackett1024) → review+
Comment on attachment 8608093 [details] [diff] [review]
bug1165966-debugger-oom-2

Review of attachment 8608093 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/BacktrackingAllocator.cpp
@@ +2026,5 @@
>                      continue;
>  
>                  switch (reg.type()) {
>                    case LDefinition::OBJECT:
>                      safepoint->addGcPointer(a);

This method looks like it's fallible too.
Comment on attachment 8608092 [details] [diff] [review]
bug1165966-debugger-oom

Review of attachment 8608092 [details] [diff] [review]:
-----------------------------------------------------------------

I like this approach much better, if you're also fine with it.
Attachment #8608092 - Flags: review?(terrence) → review+
Flags: needinfo?(terrence)
Attached patch bug1165966-testsSplinter Review
So it seems that oomAfterAllocations() is toxic on Windows which prevents us using adding the testcase here to our regression tests. 

Fixing all the problems that show up in bug 1155618 will probably solve this, but in the mean time here's a different approach to testing this that relies on the oom-test-support patch in that bug.
Attachment #8608859 - Flags: review?(terrence)
Attachment #8608859 - Flags: review?(terrence) → review+
Status: NEW → RESOLVED
Closed: 5 years ago
Keywords: leave-open
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.