This adds a bunch of MOZ_NO_INLINE in the hope that this will produce more useful stack traces. It also adds null checks every time a pointer is added to or removed from the nsDeque and a few other places, in an attempt to figure out where these nulls come from.
Attachment #598287 - Flags: review?(bugs)
Comment on attachment 598287 [details] [diff] [review] adds some null checks and MOZ_NO_INLINEs I wonder if non-inlining will actually slow down CC
I tried it a little bit locally, and it didn't seem to matter that much. I was still getting 8ms CCs with a couple of tabs open.
Here goes nothing. https://hg.mozilla.org/integration/mozilla-inbound/rev/096697e2beab
Target Milestone: --- → mozilla13
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
MarkRoots and GraphWalker crashes seem to have gone away in crash reports, suggesting that this did something, but I don't see where they are showing up. Maybe the way I'm crashing things is making them end up in one of the corrupt dump buckets.
Ahh, here's a signature. Two crashes, both on this line: nsresult rv = aPtrInfo->mParticipant->Traverse(aPtrInfo->mPointer, *this); One is a null deref, one is a read of some random invalid value. Could be the aPtrInfo or the mParticipant. This traverse call is virtual, so I think it can't be inlined. The aPtrInfo should be null checked, so maybe it is the participant?
Crash Signature: [@ GCGraphBuilder::Traverse(PtrInfo*) ]
I see some crashes with the signature GraphWalker<ScanBlackVisitor>::DoWalk after my patch landed. Here's one: https://crash-stats.mozilla.com/report/index/00fb16b7-c70e-4956-8090-675552120226 The crashes are all on this line (maybe 6 or so of them): CC_AbortIfNull(*child); ...inside the loop that iterates over the edges. So I guess child is bogus. There's also one crash with the regular walker at the same point ( GraphWalker<scanVisitor>::DoWalk(nsDeque&) ).
I see 4 null deref crashes in aPtrInfo->mParticipant->Traverse(aPtrInfo->mPointer, *this); I wonder how mParticipant can end up being null. It is only set in the constructor. It might be worth null checking there to see what is happening. It looks like there are a few places where nulls could be getting in somehow. There's also a default constructor, which I guess could be getting invoked somehow.
You need to log in before you can comment on or make changes to this bug.