Closed Bug 1507322 Opened 1 year ago Closed 1 year ago

Crash [@ js::CurrentThreadCanAccessRuntime] during GC with WeakMap

Categories

(Core :: JavaScript: GC, defect, major)

x86_64
Linux
defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla66
Tracking Status
firefox-esr60 --- wontfix
firefox64 --- wontfix
firefox65 --- fixed
firefox66 --- fixed

People

(Reporter: decoder, Assigned: sfink)

References

(Blocks 2 open bugs)

Details

(5 keywords, Whiteboard: [jsbugmon:update][sg:dos])

Crash Data

Attachments

(3 files, 1 obsolete file)

The following testcase crashes on mozilla-central revision 073045259e75 (build with --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --disable-profiling --enable-debug --enable-optimize, run with --fuzzing-safe --cpu-count=2 --ion-offthread-compile=off):

function TestGC2(m) {
  var head = new Object;
  for (key = head, i = 0; i < 48614; i++, key = m.get(key)) {
    m.set(key, new Object);
  }
  gc();
  for (key = head; key != undefined; key = m.get(key)) {}
}
TestGC2(new WeakMap);


Backtrace:

received signal SIGSEGV, Segmentation fault.
0x0000555555b2eb74 in js::CurrentThreadCanAccessRuntime (rt=0x7ffff5f1c000) at js/src/vm/Runtime.cpp:833
#0  0x0000555555b2eb74 in js::CurrentThreadCanAccessRuntime (rt=0x7ffff5f1c000) at js/src/vm/Runtime.cpp:833
#1  0x0000555555d81393 in js::CheckThreadLocal::check (this=this@entry=0x7ffff5f182f0) at js/src/threading/ProtectedData.cpp:45
#2  0x0000555555d82ad5 in js::ProtectedData<js::CheckThreadLocal, bool>::ref (this=0x7ffff5f182e8) at js/src/threading/ProtectedData.h:116
#3  js::ProtectedData<js::CheckThreadLocal, bool>::operator bool const& (this=0x7ffff5f182e8) at js/src/threading/ProtectedData.h:84
#4  js::OnHelperThread<(js::AllowedHelperThread)1> () at js/src/threading/ProtectedData.cpp:32
#5  js::CheckZone<(js::AllowedHelperThread)1>::check (this=this@entry=0x7ffff5f58080) at js/src/threading/ProtectedData.cpp:70
#6  0x0000555555f00b22 in js::ProtectedData<js::CheckZone<(js::AllowedHelperThread)1>, JS::GCHashMap<js::gc::Cell*, unsigned long, mozilla::PointerHasher<js::gc::Cell*>, js::SystemAllocPolicy, js::gc::UniqueIdGCPolicy> >::ref (this=0x7ffff5f58058) at js/src/threading/ProtectedData.h:107
#7  JS::Zone::uniqueIds (this=0x7ffff5f58000) at js/src/gc/Zone.h:308
#8  JS::Zone::hasUniqueId (this=0x7ffff5f58000, cell=<optimized out>) at js/src/gc/Zone-inl.h:132
#9  0x0000555555f00bcb in js::MovableCellHasher<JSObject*>::hasHash (l=@0x7fffff7ff0d8: 0x7ffff4a31e40) at js/src/gc/Barrier.cpp:146
#10 0x0000555555a40f5d in js::MovableCellHasher<js::HeapPtr<JSObject*> >::hasHash (l=<optimized out>) at js/src/gc/Barrier.h:844
#11 mozilla::FallibleHashMethods<js::MovableCellHasher<js::HeapPtr<JSObject*> > >::hasHash<JSObject* const&> (l=<optimized out>) at dist/include/js/RootingAPI.h:788
#12 mozilla::HasHash<mozilla::HashMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value>, js::MovableCellHasher<js::HeapPtr<JSObject*> >, js::ZoneAllocPolicy>::MapHashPolicy, JSObject* const&> (aLookup=@0x7fffff7ff0d8: 0x7ffff4a31e40) at dist/include/mozilla/HashTable.h:960
#13 mozilla::detail::HashTable<mozilla::HashMapEntry<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value> >, mozilla::HashMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value>, js::MovableCellHasher<js::HeapPtr<JSObject*> >, js::ZoneAllocPolicy>::MapHashPolicy, js::ZoneAllocPolicy>::readonlyThreadsafeLookup (aLookup=@0x7fffff7ff0d8: 0x7ffff4a31e40, this=0x7ffff55053f8) at dist/include/mozilla/HashTable.h:2115
#14 mozilla::detail::HashTable<mozilla::HashMapEntry<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value> >, mozilla::HashMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value>, js::MovableCellHasher<js::HeapPtr<JSObject*> >, js::ZoneAllocPolicy>::MapHashPolicy, js::ZoneAllocPolicy>::lookup (aLookup=@0x7fffff7ff0d8: 0x7ffff4a31e40, this=0x7ffff55053f8) at dist/include/mozilla/HashTable.h:2125
#15 mozilla::HashMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value>, js::MovableCellHasher<js::HeapPtr<JSObject*> >, js::ZoneAllocPolicy>::lookup (aLookup=@0x7fffff7ff0d8: 0x7ffff4a31e40, this=0x7ffff55053f8) at dist/include/mozilla/HashTable.h:254
#16 js::WeakMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value> >::markEntry (this=0x7ffff55053c0, marker=0x7ffff5f1d690, markedCell=0x7ffff4a31e40, origKey=...) at js/src/gc/WeakMap-inl.h:51
#17 0x0000555555ebe8a1 in js::GCMarker::markEphemeronValues (this=0x7ffff5f1d690, markedCell=0x7ffff4a31e40, values=...) at js/src/gc/Marking.cpp:629
#18 0x0000555555f0f2d2 in js::GCMarker::markImplicitEdgesHelper<JSObject*> (this=this@entry=0x7ffff5f1d690, markedThing=markedThing@entry=0x7ffff4a31e40) at js/src/gc/Marking.cpp:656
#19 0x0000555555f10386 in js::GCMarker::markImplicitEdges<JSObject> (thing=0x7ffff4a31e40, this=0x7ffff5f1d690) at js/src/gc/Marking.cpp:670
#20 js::GCMarker::markAndPush<JSObject> (this=this@entry=0x7ffff5f1d690, thing=thing@entry=0x7ffff4a31e40) at js/src/gc/Marking.cpp:856
#21 0x0000555555f10424 in js::GCMarker::traverse<JSObject*> (thing=0x7ffff4a31e40, this=0x7ffff5f1d690) at js/src/gc/Marking.cpp:859
#22 DoMarking<JSObject> (gcmarker=0x7ffff5f1d690, thing=0x7ffff4a31e40) at js/src/gc/Marking.cpp:735
#23 0x0000555555f1e02b in DoMarking<JS::Value> (thing=..., gcmarker=<optimized out>) at js/src/gc/Marking.cpp:750
#24 js::gc::TraceEdgeInternal<JS::Value> (trc=trc@entry=0x7ffff5f1d690, thingp=0x7ffff4b0b5c0, name=name@entry=0x5555568f1bf5 "ephemeron value") at js/src/gc/Marking.cpp:562
#25 0x0000555555a411d3 in js::TraceEdge<JS::Value> (name=0x5555568f1bf5 "ephemeron value", thingp=<optimized out>, trc=0x7ffff5f1d690) at js/src/gc/Tracer.h:109
#26 js::WeakMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value> >::markEntry (this=0x7ffff55053c0, marker=0x7ffff5f1d690, markedCell=0x7ffff4a26f00, origKey=...) at js/src/gc/WeakMap-inl.h:57
#27 0x0000555555ebe8a1 in js::GCMarker::markEphemeronValues (this=0x7ffff5f1d690, markedCell=0x7ffff4a26f00, values=...) at js/src/gc/Marking.cpp:629
#28 0x0000555555f0f2d2 in js::GCMarker::markImplicitEdgesHelper<JSObject*> (this=this@entry=0x7ffff5f1d690, markedThing=markedThing@entry=0x7ffff4a26f00) at js/src/gc/Marking.cpp:656
#29 0x0000555555f10386 in js::GCMarker::markImplicitEdges<JSObject> (thing=0x7ffff4a26f00, this=0x7ffff5f1d690) at js/src/gc/Marking.cpp:670
#30 js::GCMarker::markAndPush<JSObject> (this=this@entry=0x7ffff5f1d690, thing=thing@entry=0x7ffff4a26f00) at js/src/gc/Marking.cpp:856
#31 0x0000555555f10424 in js::GCMarker::traverse<JSObject*> (thing=0x7ffff4a26f00, this=0x7ffff5f1d690) at js/src/gc/Marking.cpp:859
#32 DoMarking<JSObject> (gcmarker=0x7ffff5f1d690, thing=0x7ffff4a26f00) at js/src/gc/Marking.cpp:735
#33 0x0000555555f1e02b in DoMarking<JS::Value> (thing=..., gcmarker=<optimized out>) at js/src/gc/Marking.cpp:750
#34 js::gc::TraceEdgeInternal<JS::Value> (trc=trc@entry=0x7ffff5f1d690, thingp=0x7ffff4b2a508, name=name@entry=0x5555568f1bf5 "ephemeron value") at js/src/gc/Marking.cpp:562
#35 0x0000555555a411d3 in js::TraceEdge<JS::Value> (name=0x5555568f1bf5 "ephemeron value", thingp=<optimized out>, trc=0x7ffff5f1d690) at js/src/gc/Tracer.h:109
#36 js::WeakMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value> >::markEntry (this=0x7ffff55053c0, marker=0x7ffff5f1d690, markedCell=0x7ffff4a512c0, origKey=...) at js/src/gc/WeakMap-inl.h:57
#37 0x0000555555ebe8a1 in js::GCMarker::markEphemeronValues (this=0x7ffff5f1d690, markedCell=0x7ffff4a512c0, values=...) at js/src/gc/Marking.cpp:629
[...]
#127 0x0000555555ebe8a1 in js::GCMarker::markEphemeronValues (this=0x7ffff5f1d690, markedCell=0x7ffff4a1c380, values=...) at js/src/gc/Marking.cpp:629
rax	0x7ffff5f18000	140737319632896
rbx	0x7ffff5f182f0	140737319633648
rcx	0x1	1
rdx	0x0	0
rsi	0x7ffff4a31e40	140737297718848
rdi	0x7ffff5f1c000	140737319649280
rbp	0x7fffff7ff000	140737479962624
rsp	0x7fffff7ff000	140737479962624
r8	0x7fffff7ff150	140737479962960
r9	0x0	0
r10	0xa0eaf452	2699752530
r11	0x0	0
r12	0x7ffff5f58080	140737319895168
r13	0x7fffff7ff0e0	140737479962848
r14	0x7fffff7ff0f0	140737479962864
r15	0x7ffff5f1d690	140737319655056
rip	0x555555b2eb74 <js::CurrentThreadCanAccessRuntime(JSRuntime const*)+4>
=> 0x555555b2eb74 <js::CurrentThreadCanAccessRuntime(JSRuntime const*)+4>:	push   %rbx
   0x555555b2eb75 <js::CurrentThreadCanAccessRuntime(JSRuntime const*)+5>:	mov    %rdi,%rbx


This looks like a stack space exhaustion (over recursion), but I'm marking it s-s as a start because it involves GC with WeapMaps and I want to be sure there is no memory corruption or anything present causing this behavior.
Summary: Crash [@ js::CurrentThreadCanAccessRuntime] during GC with WeapMap → Crash [@ js::CurrentThreadCanAccessRuntime] during GC with WeakMap
Flags: needinfo?(jcoppeard)
Christian: any idea of the regression range here?
(In reply to Daniel Veditz [:dveditz] from comment #1)
> Christian: any idea of the regression range here?

No. JSBugMon will soon run on the bug, but if that doesn't find the regression range, it is up to QA/developers to determine that.
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:update]
I take that back. Comment 4 is also unlikely to be the one.
It seems like GCMarker::markEphemeronValues ends up being called recursively in this case.  Steve, could we push things onto the mark stack rather than marking them directly there?
Flags: needinfo?(jcoppeard) → needinfo?(sphink)
Group: javascript-core-security
Keywords: csectype-dos
Whiteboard: [jsbugmon:update] → [jsbugmon:update][sg:dos]
Testing this out with https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=217512636&revision=1ea013f7413050c0b9617a0af4fe0090989a49e9

Seems good so far. And I *think* it may have removed a redundant call to markImplicitEdges that I'd been meaning to look at. (processMarkStackTop's scan_obj does markImplicitEdges, and so does JSObject's markAndPush, which seems wrong.)
Flags: needinfo?(sphink)
Hopefully I got this right. It fixes the test case, at least.
Attachment #9032504 - Flags: review?(jcoppeard)
Assignee: nobody → sphink
Status: NEW → ASSIGNED
Comment on attachment 9032504 [details] [diff] [review]
mark implicit edges via mark stack instead of eagerly

Review of attachment 9032504 [details] [diff] [review]:
-----------------------------------------------------------------

Well, I want to r+ this but I'm not actually I understand how it works.

This makes JSScript marking not use the mark stack, and moves implicit edge marking to the traceChildren() method... for some set of GC things?  Is that correct?

The stack trace seems to be a problem with JSObject marking taking stack space proportional to the length of the path through all the weak map entries.  Oh, so is it that implicit edge marking no longer happens in markAndPush()?  Ah, that's what your comment says.  r+ if so.

Note that if you're not pushing JSScripts on the mark stack any more then there's some stuff in MarkStack related to them that can be removed, i.e. anything related to ScriptTag.
Attachment #9032504 - Flags: review?(jcoppeard)
(In reply to Jon Coppeard (:jonco) from comment #9)
> This makes JSScript marking not use the mark stack, and moves implicit edge
> marking to the traceChildren() method... for some set of GC things?  Is that
> correct?

Yes. And perhaps it's a mistake to change JSScript marking in this patch; I only did it because the comment implied that the only reason why it needed to use the mark stack was because of the weakmap marking, so if we stop using the mark stack for that, then that reason goes away. Basically, I sort of assumed that it was *only* using the mark stack because of weakmaps, and had been changed to accommodate them. But I didn't do the research, so it's possible that there are other reasons for it. In which case, it may be too risky for this bug, and I should do it as a separate bug (just to reduce unnecessary depth on the stack).

As for the set of GC things, there's a bunch of stuff that unfortunately depends on the set of GC things that can be used as weakmap keys: JSObject, JSScript, and LazyScript. All of them need to do the weakmap marking when they get marked, and I like to think of that as "implicit" edges from a key to its value. Thought of that way, then they should naturally be marking their implicit children along with their other children in places like ::traceChildren(). (JSObject is a little weird; we inline its marking-specific traceChildren into processMarkStackTop.)

...and for no apparent reason, I seem to have done it for Shape as well?? Uh, that makes no sense, I'll remove it. I must've been confused about which function I was in?

> The stack trace seems to be a problem with JSObject marking taking stack
> space proportional to the length of the path through all the weak map
> entries.  Oh, so is it that implicit edge marking no longer happens in
> markAndPush()?  Ah, that's what your comment says.  r+ if so.

Right, that's the real point here. When encountering a key of type JSScript or LazyScript, we no longer immediately trace the implicit edges because that takes C stack and can recurse. Instead, we delay that to when we're marking their children anyway. And none of that comes into play for this test case, because it's only using JSObject keys.

For JSObject keys, we were already tracing implicit child edges along with the explicit children (eg group_), there was just an unnecessary eager additional call to markImplicitEdges when first pushing an object onto the mark stack.

Uh... but as a result of writing this up, I'm also not seeing where LazyScript implicit children were getting handled. They go through the markAndScan, which doesn't call markImplicitEdges. I think this may be fixing a second bug here. I wonder if we have good enough test facilities to write a test case for those.

> Note that if you're not pushing JSScripts on the mark stack any more then
> there's some stuff in MarkStack related to them that can be removed, i.e.
> anything related to ScriptTag.

Good point. I think I'd better split that change out. Especially since there's a chance it could break things for unrelated reasons.

Steve, can you prioritize this bug using the (Importance) field as per your judgement?

Component: JavaScript Engine → JavaScript: GC

Set to major, which might be a bit too high -- crashes are bad, but this seems like it shouldn't happen that much in practice.

Fortunately, I just need to get back to this and split the patch up a bit; the existing patch seems to work and fix the problem.

Severity: critical → major
Finally got back around to splitting out the eager JSScript marking. I'll have to wait for this to land before I can do that.
Attachment #9036075 - Flags: review?(jcoppeard)
Attachment #9032504 - Attachment is obsolete: true
Attachment #9036075 - Flags: review?(jcoppeard) → review+
Pushed by sfink@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5ecfcb615bd6
mark implicit edges via mark stack instead of eagerly, r=jonco
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66

Can we land the testcase for this? Also, please nominate this for Beta approval when you get a chance.

Flags: needinfo?(sphink)

Comment on attachment 9036075 [details] [diff] [review]
mark implicit edges via mark stack instead of eagerly

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1164294

User impact if declined: Denial-of-service type of crash, which is possible to stumble across with real code. Especially if that code is buggy.

Is this code covered by automated tests?: No

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: I will be landing a test here soon, and would probably want to uplift it along with this.

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): It doesn't change what happens, just when. It's the sort of thing that is pretty likely to show up in automated tests if it was busted.

String changes made/needed: none

Flags: needinfo?(sphink)
Attachment #9036075 - Flags: approval-mozilla-beta?
Attachment #9036736 - Flags: review?(jcoppeard)

(In reply to Steve Fink [:sfink] [:s:] from comment #17)

Is this code covered by automated tests?: No

Or rather, "sorta". This code is executed by many different automated tests. The bug in question is not exercised by any of them, but will be once the test I just attached lands.

Attachment #9036736 - Flags: review?(jcoppeard) → review+
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1e7a80d93de5
Test deeply recursive weakmaps. r=jonco

I should run the test once before landing. Was missing the jstests-required reportCompare call.

Flags: needinfo?(sphink)

Comment on attachment 9036075 [details] [diff] [review]
mark implicit edges via mark stack instead of eagerly

[Triage Comment]
Fixes an OOM crash seen in the wild. Thanks for landing a test too. Approved for 65.0b12.

Attachment #9036075 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Depends on: 1520778
Blocks: 1529775
You need to log in before you can comment on or make changes to this bug.