Closed Bug 975626 Opened 7 years ago Closed 6 years ago

Assertion failure: ForkJoinContext::current() == cx, at jit/ParallelFunctions.cpp or Assertion failure: ok == !cx.bailoutRecord->topScript, at vm/ForkJoin.cpp

Categories

(Core :: JavaScript Engine: JIT, defect)

x86_64
Windows 7
defect
Not set
critical

RESOLVED WORKSFORME

People

(Reporter: gkw, Unassigned)

Details

(Keywords: assertion, regression, testcase)

Attachments

(2 files)

Attached file stack
Array.buildPar(9434, function() {
    return function() {}
})

asserts js debug shell on m-c changeset b89a9d7b4ca0 with --ion-parallel-compile=off at Assertion failure: ForkJoinContext::current() == cx, at jit/ParallelFunctions.cpp

My configure flags are:

MAKE=mozmake AR=ar sh ./configure --host=x86_64-pc-mingw32 --target=x86_64-pc-mingw32 --enable-optimize --enable-debug --enable-profiling --enable-gczeal --enable-debug-symbols --enable-methodjit --enable-type-inference --disable-tests --enable-more-deterministic --enable-exact-rooting --enable-threadsafe <other NSPR options>
Assertion failure: ok == !cx.bailoutRecord->topScript, at vm/ForkJoin.cpp

During reduction, I also saw this assertion, which might be related.
Summary: Assertion failure: ForkJoinContext::current() == cx, at jit/ParallelFunctions.cpp → Assertion failure: ForkJoinContext::current() == cx, at jit/ParallelFunctions.cpp or Assertion failure: ok == !cx.bailoutRecord->topScript, at vm/ForkJoin.cpp
autoBisect shows this is probably related to the following changeset:

The first bad revision is:
changeset:   http://hg.mozilla.org/mozilla-central/rev/5735c0b01f19
user:        Bert Belder
date:        Wed Jan 08 12:54:25 2014 -0600
summary:     Bug 956899 - Use mozilla::ThreadLocal instead of NSPR for ForkJoinSlice's thread-local variable, and use it in all cases, not just threadsafe, for simplicity.  Also do some slight style-fix renaming.  r=jwalden

Bert, is bug 956899 a likely regressor?
Blocks: 956899
Flags: needinfo?(bertbelder)
@Gary

Well, the stack trace suggests that it might.

But after looking over that patch 20 times I can't see what could possibly be wrong :/

Semantically nothing changes at all. Maybe TLS got a little faster so timings are now different and this is exposing another issue?

I don't have time to figure it out right now, not within a week or so anyway. So I guess you should revert it, or maybe ask jwalden what he thinks.
Flags: needinfo?(bertbelder)
Flags: needinfo?(jwalden+bmo)
If anywhere, there could be a lack of synchronization around the TLS slot initializer. This code:

```
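ThreadLocal<ForkJoinSlice*> ForkJoinSlice::tlsForkJoinSlice;

bool
ForkJoinSlice::initialize()
{
    // Allocate the TLS slot on first use; nothing here guards against
    // two threads reaching this check at the same time.
    if (!tlsForkJoinSlice.initialized()) {
        if (!tlsForkJoinSlice.init())
            return false;
    }
    return true;
}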
```

lacks synchronization around the initialize() call. ThreadLocal<T>::init() itself is also not thread-safe, btw.

I am not sure whether some external synchronization mechanism ensures that ForkJoinSlice::initialize() only runs in one thread at a time. If there is no such mechanism and multiple VMs are created in different threads around the same time, a race condition may occur here.

But that patch didn't change that; at worst it changed some timings.
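
To make the concern concrete, here is a minimal sketch of how a one-time initializer can be made safe against concurrent callers using std::call_once. It is not the SpiderMonkey code: FakeTlsSlot, gSlot, initializeOnce() and initCallsAfterConcurrentStartup() are hypothetical names, and FakeTlsSlot only counts init() calls instead of allocating a real TLS slot.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the TLS slot object; it only counts init() calls.
struct FakeTlsSlot {
    std::atomic<int> initCalls{0};
    bool init() { ++initCalls; return true; }
};

static FakeTlsSlot gSlot;         // plays the role of tlsForkJoinSlice
static std::once_flag gSlotOnce;  // guards the one-time initialization
static bool gSlotOk = false;      // result of the single init() call

// Safe to call from any number of threads: init() runs exactly once, and
// every caller observes its result after call_once returns.
bool initializeOnce()
{
    std::call_once(gSlotOnce, [] { gSlotOk = gSlot.init(); });
    return gSlotOk;
}

// Calls initializeOnce() from several threads at once and reports how many
// times the underlying init() actually ran.
int initCallsAfterConcurrentStartup(int threadCount)
{
    assert(threadCount > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i)
        threads.emplace_back([] { initializeOnce(); });
    for (auto& t : threads)
        t.join();
    return gSlot.initCalls.load();
}
```

With this guard, initCallsAfterConcurrentStartup(8) reports a single init() call no matter how the threads interleave. Whether the real code needs anything like this depends on whether initialize() can ever be reached from more than one thread.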
Shu-yu, you used to work on ForkJoin stuff; do you have any ideas?
Flags: needinfo?(shu)
ForkJoinContext::initialize is called from JS_Init. How can it race?
Flags: needinfo?(shu)
Flags: needinfo?(bertbelder)
The other possible cause I can think of is that we run out of TLS indices. That isn't entirely hypothetical, because Windows only allows 1088 TLS slots to be allocated per process. I am not sure what the fuzzer does; could it be that somehow many slots are allocated?
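
As a rough illustration of what slot exhaustion looks like, the sketch below uses the POSIX analogue (pthread_key_create) rather than the Windows API, where TlsAlloc reports the same condition by returning TLS_OUT_OF_INDEXES. countAvailableTlsKeys() is a hypothetical helper, not anything from the tree; it allocates keys until the platform refuses and then frees them again.

```cpp
#include <cassert>
#include <pthread.h>
#include <vector>

// Allocates TLS keys until the platform refuses, then frees them all.
// Returns how many keys could be allocated; `cap` bounds the loop.
int countAvailableTlsKeys(int cap)
{
    assert(cap > 0);
    std::vector<pthread_key_t> keys;
    while (static_cast<int>(keys.size()) < cap) {
        pthread_key_t key;
        // A non-zero result means no key was created, typically EAGAIN
        // once the per-process key limit has been reached.
        if (pthread_key_create(&key, nullptr) != 0)
            break;
        keys.push_back(key);
    }
    int allocated = static_cast<int>(keys.size());
    for (pthread_key_t key : keys)
        pthread_key_delete(key);
    return allocated;
}
```

Calling countAvailableTlsKeys(100000) returns a number well below the cap, since the per-process limit is small. If the shell hit such a limit, the slot's init() would presumably fail and initialize() would return false rather than trip the assertion, so this only matters if that failure is ignored somewhere.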
Flags: needinfo?(bertbelder)
(In reply to Bert from comment #7)
> The other possible cause I can think of is if we run out of TLS indices.
> That isn't entirely hypothetical because windows only allows the allocation
> of 1088 TLS slots per process. I am not sure what the fuzzer does, could it
> be that somehow many slots are allocated?

Nothing from the fuzzer is needed to reproduce, just the testcase in comment 0, so I'm not sure what you mean by whatever the fuzzer is doing.
Flags: needinfo?(bertbelder)
Gary, can I catch you on IRC or something?
Flags: needinfo?(bertbelder)
I can't reproduce this.  JS_Init is called non-racily against anything.  ThreadLocal::init and such need not be thread-safe.  So I don't know what's up here.  Perhaps it's worth investigating this in the Mountain View office on a computer that reproduces it; otherwise I've got nothing.
Flags: needinfo?(jwalden+bmo)
Waldo, I think I only hit this on 64-bit Windows, but right now that's busted due to bug 981492.
WFM with m-c rev b85c260821ab.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME