Closed Bug 1443607 Opened 7 years ago Closed 6 years ago

Assertion failure: IsCellPointerValid(*edge), at js/src/gc/Marking.cpp:2885 with nursery strings

Categories

(Core :: JavaScript Engine, defect, P1)

x86_64
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 1442481
Tracking Status
firefox-esr52 --- unaffected
firefox58 --- unaffected
firefox59 --- unaffected
firefox60 --- disabled
firefox62 --- disabled

People

(Reporter: decoder, Unassigned)

References

Details

(5 keywords, Whiteboard: [jsbugmon:])

Attachments

(1 file)

The following testcase crashes on mozilla-central revision b2a9a4bb5c94+ (build with --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --disable-profiling --enable-debug --enable-optimize, run with --fuzzing-safe --cpu-count=2 --disable-oom-functions --ion-offthread-compile=off --ion-eager --nursery-strings=on): See attachment. Backtrace: received signal SIGSEGV, Segmentation fault. 0x0000000000e7de98 in js::gc::StoreBuffer::CellPtrEdge::trace (this=0x7ffff5f1d238, mover=...) at js/src/gc/Marking.cpp:2885 #0 0x0000000000e7de98 in js::gc::StoreBuffer::CellPtrEdge::trace (this=0x7ffff5f1d238, mover=...) at js/src/gc/Marking.cpp:2885 #1 0x0000000000ebaa6a in js::gc::StoreBuffer::MonoTypeBuffer<js::gc::StoreBuffer::CellPtrEdge>::trace (this=this@entry=0x7ffff5f1d1e8, owner=owner@entry=0x7ffff5f1d190, mover=...) at js/src/gc/Marking.cpp:2750 #2 0x0000000000ecf701 in js::gc::StoreBuffer::traceCells (mover=..., this=<optimized out>) at js/src/gc/StoreBuffer.h:440 #3 js::Nursery::doCollection (this=this@entry=0x7ffff5f1cdb8, reason=reason@entry=JS::gcreason::DESTROY_RUNTIME, tenureCounts=...) at js/src/gc/Nursery.cpp:856 #4 0x0000000000ecffaf in js::Nursery::collect (this=0x7ffff5f1cdb8, reason=reason@entry=JS::gcreason::DESTROY_RUNTIME) at js/src/gc/Nursery.cpp:722 #5 0x0000000000e6785a in js::gc::GCRuntime::minorGC (this=0x7ffff5f1a780, reason=JS::gcreason::DESTROY_RUNTIME, phase=<optimized out>) at js/src/gc/GC.cpp:7743 #6 0x0000000000e8af02 in js::gc::GCRuntime::gcCycle (this=this@entry=0x7ffff5f1a780, nonincrementalByAPI=nonincrementalByAPI@entry=true, budget=..., reason=reason@entry=JS::gcreason::DESTROY_RUNTIME) at js/src/gc/GC.cpp:7347 #7 0x0000000000e8b73d in js::gc::GCRuntime::collect (this=this@entry=0x7ffff5f1a780, nonincrementalByAPI=nonincrementalByAPI@entry=true, budget=..., reason=reason@entry=JS::gcreason::DESTROY_RUNTIME) at js/src/gc/GC.cpp:7538 #8 0x0000000000e8ba99 in js::gc::GCRuntime::gc (this=this@entry=0x7ffff5f1a780, gckind=gckind@entry=GC_NORMAL, reason=reason@entry=JS::gcreason::DESTROY_RUNTIME) at js/src/gc/GC.cpp:7608 #9 0x0000000000bd76ab in JSRuntime::destroyRuntime (this=0x7ffff5f1a000) at js/src/vm/Runtime.cpp:316 #10 0x0000000000b575f9 in js::DestroyContext (cx=0x7ffff5f16000) at js/src/vm/JSContext.cpp:252 #11 0x00000000009a497a in JS_DestroyContext (cx=<optimized out>) at js/src/jsapi.cpp:505 #12 0x0000000000444a62 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at js/src/shell/js.cpp:9294 rax 0x0 0 rbx 0x7ffff5316000 140737307041792 rcx 0x7ffff6c282ad 140737333330605 rdx 0x0 0 rsi 0x7ffff6ef7770 140737336276848 rdi 0x7ffff6ef6540 140737336272192 rbp 0x7fffffffcfe0 140737488343008 rsp 0x7fffffffcfc0 140737488342976 r8 0x7ffff6ef7770 140737336276848 r9 0x7ffff7fe4780 140737354024832 r10 0x0 0 r11 0x0 0 r12 0x7ffff5f1d238 140737319653944 r13 0x7fffffffd110 140737488343312 r14 0x7ffff5f1d1e8 140737319653864 r15 0x7ffff5f1d190 140737319653776 rip 0xe7de98 <js::gc::StoreBuffer::CellPtrEdge::trace(js::TenuringTracer&) const+184> => 0xe7de98 <js::gc::StoreBuffer::CellPtrEdge::trace(js::TenuringTracer&) const+184>: movl $0x0,0x0 0xe7dea3 <js::gc::StoreBuffer::CellPtrEdge::trace(js::TenuringTracer&) const+195>: ud2 Probably not s-s because nursery strings are not enabled right now.
Attached file Testcase
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:bisect]
JSBugMon: Cannot process bug: Error: Failed to isolate test from comment
Whiteboard: [jsbugmon:bisect] → [jsbugmon:]
JSBugMon: Bisection requested, failed due to error: Error: Failed to isolate test from comment
Flags: needinfo?(sphink)
Priority: -- → P1
(P1 only because blocking a feature we're trying to ship; flags are accurate)
decoder gave me a reproducible test case for this. \o/
This is again a wonderful example for fragile test cases. The test reproduces deterministically on my EC2 machine. When I pull it down on my laptop, including the binary (!), it doesn't reproduce at all. I also tried using --no-threads --no-incremental-gc and with that, it still reproduces on EC2, but does not locally. We should try jandem's trick to replace GenerateRandomSeed and PRMJ_Now to see if that does the trick here.
What I've figured out so far about what's going on here: The problem is that we are "transmuting" strings in-place -- in particular, changing a JSRope into a JSExtensibleString. Strings have 3 words. The first word is flags/length (except temporarily while flattening, which doesn't matter here.) The second two words depend on the type. For a JSRope, they're left and right child pointers, both of them pointing to GC Cells. For a JSExtensibleString, they are a pointer to the string characters and the capacity, neither of which are GC Cells. So if we started with a tenured JSRope whose children are in the nursery, then the post barrier will have recorded the address of those two words in the store buffer. When the JSRope turns into JSExtensibleString, I neglect to remove them from the store buffer, resulting in a store buffer entry pointing to non-GC memory. I'm still working out why this situation is so uncommon; offhand, it seems like this should have crashed all over the place. Perhaps it's similar to the other bug I looked at -- the JSRope will be created by appending a JSExtensibleString to another string, which means the extensible string is older than the rope. This is the reverse of the problematic situation above, and will not involve any store buffer entries. I guess I'll have to figure out how I ended up with the situation I did.
Flags: needinfo?(sphink)
(In reply to Steve Fink [:sfink] [:s:] from comment #7) > I neglect > to remove them from the store buffer, resulting in a store buffer entry > pointing to non-GC memory. We don't usually remove store buffer entries if possible (it's a pain). Can we put the whole cell in the store buffer in these cases where it's possible for the individual fields to change type?
Perhaps incorrectly, I have the fix for this in bug 1442481.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: