Closed Bug 976889 Opened 6 years ago Closed 6 years ago

GenerationalGC: Assertion failure: !entered, at dist/include/mozilla/ReentrancyGuard.h

Categories

(Core :: JavaScript: GC, defect, critical)

x86
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla30

People

(Reporter: gkw, Assigned: terrence)

References

(Blocks 1 open bug)

Details

(Keywords: assertion, testcase, Whiteboard: [fuzzblocker] [jsbugmon:])

Attachments

(2 files)

Attached file stack
Array.buildPar(5, function() {
    return [].t = encodeURI
})

crashes js debug shell intermittently (tested with a threadsafe deterministic 32-bit debug build) on m-c changeset 22650589a724 with  at Assertion failure: !entered, at dist/include/mozilla/ReentrancyGuard.h

My configure options are:

CC="gcc -m32" AR=ar PKG_CONFIG_LIBDIR=/usr/lib/pkgconfig CXX="g++ -m32" sh ./configure --target=i686-pc-linux --enable-optimize --enable-debug --enable-profiling --enable-gczeal --enable-debug-symbols --enable-methodjit --enable-type-inference --disable-tests --enable-more-deterministic --enable-gcgenerational --with-ccache --enable-threadsafe <other NSPR options>
Flags: needinfo?(terrence)
Whiteboard: [fuzzblocker][jsbugmon:update,bisect] → [fuzzblocker] [jsbugmon:bisect]
JSBugMon: Cannot process bug: Unable to automatically reproduce, please track manually.
Whiteboard: [fuzzblocker] [jsbugmon:bisect] → [fuzzblocker] [jsbugmon:]
GDB is 100% sure this memory and all the padding memory around it is 0. If I replace |DebugOnly<bool> entered;| with just |bool entered;|, the problem still occurs. If I replace ReentrancyGuards's |bool &entered;| with |bool *entered;|, the problem still occurs. If I do /both/ of the above, the problem goes away. I'd think gcc was just getting confused here, except the assembly looks fine in all cases. It doesn't seem to be amemory alignment issue either: the instructions manually clear the top bits, adding manual alignment to the struct does not help, and gdb disagrees with the processor. I'll give this some more thought tomorrow.
Flags: needinfo?(terrence)
Additional notes: If I add a printf, it prints 0 and the crash disappears. If I re-arrange the code sufficiently -- e.g. move the guard object notifier init below the existing code, do not use C++'s :member(init) syntax, and switch to bool* -- the crash becomes random. Moreover, if I add a second ReentrancyGuard immediately after the first to trigger a crash, the crash is still randomly not triggered.

Note: This is all in a --disable-optimize --enable-debug build, so no optimizations are involved here.
Shu took a look today and spotted the problem immediately. Our check in CurrentThreadCanAccessRuntime is just wrong. This patch fixes the crash here and seems to not wreak havoc on the parallel jit tests either.
Assignee: nobody → terrence
Status: NEW → ASSIGNED
Attachment #8384979 - Flags: review?(shu)
Comment on attachment 8384979 [details] [diff] [review]
fix_CurrentThreadCanAccessRuntime-v0.diff

Review of attachment 8384979 [details] [diff] [review]:
-----------------------------------------------------------------

Noting the reasoning here:

CurrentThreadCanAccessRuntime seems to be mainly used for exclusive context stuff (the OMT workloads like parsing). In PJS sections, all the VM paths should be vetted PJS-safe paths and none of them should be modifying runtime state willy nilly, even when we acquire the JSContext lock from the ForkJoinContext. Moreover, we reuse the main thread as a worker in PJS, so |ownerThread_ == PR_GetCurrentThread()| will be true for one of the workers, and that worker definitely should not be able to touch the runtime state.
Attachment #8384979 - Flags: review?(shu) → review+
Oh, and the original reason for the reentrancy guard assert was just that multiple PJS threads got inside a post barrier at the same time, not one post barrier triggering another post barrier.
https://hg.mozilla.org/mozilla-central/rev/94874f1a2451
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Depends on: 981362
You need to log in before you can comment on or make changes to this bug.