Bug 1448635
Opened 6 years ago
Closed 6 years ago
Crash in arena_t::DallocSmall | static void arena_dalloc
Categories
(Core :: Networking, defect)
RESOLVED DUPLICATE of bug 1448612
People
(Reporter: jseward, Unassigned)
Details
(Keywords: crash)
Crash Data
This is the #4 topcrash of the Windows nightly 20180322100349, with 38 crashes from 7 different installations. From the backtraces it looks to me like a heap corruption bug, possibly to do with XPCOM shutdown.

This bug was filed from the Socorro interface and is report bp-048175a8-3ea3-448b-99c7-1afef0180323.

=============================================================
Top 10 frames of crashing thread:

0 mozglue.dll arena_t::DallocSmall memory/build/mozjemalloc.cpp:3418
1 mozglue.dll static void arena_dalloc memory/build/mozjemalloc.cpp:3504
2 nss3.dll static void DefaultFreeEntry nsprpub/lib/ds/plhash.c:55
3 nss3.dll PL_HashTableDestroy nsprpub/lib/ds/plhash.c:115
4 nss3.dll SECOID_Shutdown security/nss/lib/util/secoid.c:2258
5 nss3.dll nss_Shutdown security/nss/lib/nss/nssinit.c:1159
6 nss3.dll NSS_Shutdown security/nss/lib/nss/nssinit.c:1221
7 xul.dll mozilla::ShutdownXPCOM xpcom/build/XPCOMInit.cpp:1011
8 xul.dll ScopedXPCOMStartup::~ScopedXPCOMStartup toolkit/xre/nsAppRunner.cpp:1508
9 xul.dll XREMain::XRE_main toolkit/xre/nsAppRunner.cpp:4939

=============================================================
Reporter
Comment 1•6 years ago
Looking at all 7 unique crash stacks I can't see any obvious pattern other than "crash at shutdown". Nathan, any ideas?
Flags: needinfo?(nfroyd)
Comment 2•6 years ago
Looking at more of the crash reports, e.g.:

https://crash-stats.mozilla.com/report/index/d58de278-0277-4358-9b5c-393b50180326
https://crash-stats.mozilla.com/report/index/7e2708c7-7dc8-450b-b504-bb49b0180326
https://crash-stats.mozilla.com/report/index/9ae120ad-ff06-46fd-b941-bb2310180325
https://crash-stats.mozilla.com/report/index/ab3e135c-c403-460d-aa41-5cb4b0180325
https://crash-stats.mozilla.com/report/index/f74edb69-bf1c-43cd-bee0-f79f00180325
https://crash-stats.mozilla.com/report/index/3d201026-ce98-4bbd-b6a6-e4b0c0180325
https://crash-stats.mozilla.com/report/index/2bf056fa-d635-492f-8556-fd5b60180325
https://crash-stats.mozilla.com/report/index/98f97781-2cbc-42a1-8574-30d2b0180324
https://crash-stats.mozilla.com/report/index/7fd5558c-96d7-4a25-ace4-cfc4a0180324
https://crash-stats.mozilla.com/report/index/de165b7a-8311-41e9-8cfe-961030180324
https://crash-stats.mozilla.com/report/index/f54230f2-20ba-48dd-bf82-f8e7d0180324
https://crash-stats.mozilla.com/report/index/ede50dc9-e987-4a9e-8eae-17f880180324
https://crash-stats.mozilla.com/report/index/10dd5d14-5289-428e-a26f-f65b80180324

which are basically all of the ones that happened for Nightlies on or after 20180324, we seem to be hitting some issues with streams.

Then again, the NSS shutdown crashes seem to be a thing, in addition to the crash in comment 0:

https://crash-stats.mozilla.com/report/index/fa9c0918-1756-43e6-be17-146cd0180319
https://crash-stats.mozilla.com/report/index/1b5713be-71a1-4276-996d-a22160180324

I don't think they are related, but two separate issues? Actually, the NSS shutdown crashes appear to be bug 721710.
Looking at the streams crashes first, we're crashing in nsStreamLoader::OnStopRequest:

  // provide nsIStreamLoader::request during call to OnStreamComplete
  mRequest = request;
  size_t length = mData.length();
  uint8_t* elems = mData.extractOrCopyRawBuffer();
  nsresult rv = mObserver->OnStreamComplete(this, mContext, aStatus,
                                            length, elems);
  if (rv != NS_SUCCESS_ADOPTED_DATA) {
    // The observer didn't take ownership of the extracted data buffer, so
    // put it back into mData.
    mData.replaceRawBuffer(elems, length);  // XXX crash inside here
  }

replaceRawBuffer is pretty straightforward: we want to take ownership of the passed-in buffer, if possible:

  Vector<T, N, AP>::replaceRawBuffer(T* aP, size_t aLength, size_t aCapacity)
  {
    MOZ_REENTRANCY_GUARD_ET_AL;

    /* Destroy what we have. */
    Impl::destroy(beginNoCheck(), endNoCheck());
    if (!usingInlineStorage()) {
      this->free_(beginNoCheck());
    }

    /* Take in the new buffer. */
    if (aCapacity <= kInlineCapacity) {
      /*
       * We convert to inline storage if possible, even though aP might
       * otherwise be acceptable. Maybe this behaviour should be
       * specifiable with an argument to this function.
       */
      mBegin = inlineStorage();
      mLength = aLength;
      mTail.mCapacity = kInlineCapacity;
      Impl::moveConstruct(mBegin, aP, aP + aLength);
      Impl::destroy(aP, aP + aLength);
      this->free_(aP);
    } else {
      mBegin = aP;
      mLength = aLength;
      mTail.mCapacity = aCapacity;
    }
  #ifdef DEBUG
    mTail.mReserved = aCapacity;
  #endif
  }

nsStreamLoader::mData doesn't *have* inline storage, so we could hit the first free_ call, or we could hit the second one if aCapacity is 0. We can't hit frees when destroying elements, because the elements are uint8_t, i.e. trivial. But the state of the vector at this replaceRawBuffer call should be an empty vector; we should have extracted whatever data was resident via the extractOrCopyRawBuffer call in OnStopRequest prior to this. (The copy case shouldn't be hit; we'd only copy the data out if we were using an inline buffer, which we aren't.
Therefore, we should always be passing out a pointer to the Vector's allocated data.) So the first free_ call should always receive nullptr... I think. Unless we're hitting OnStopRequest on multiple threads? But hitting the second free_ is equally uninteresting: the aLength passed in would have to be 0, and the buffer hasn't been obviously freed by anybody else. The passed-in buffer ought to be solely owned by us in the first place! So I don't know what's going on there either.

It would be super-helpful to know what mObserver is, so we could see whether that object's OnStreamComplete was doing peculiar things. But I have taken a cursory look at all the OnStreamComplete implementations in the tree and none of them seem to be doing anything especially unusual.

The peculiar thing is that this is all running from a ThrottledEventQueue, so we're on the main thread, running things through a TabGroup, either for a worker or a timer (cf. http://dxr.mozilla.org/mozilla-central/source/dom/base/TabGroup.cpp#68 ). That feels significant, but I'm out of ideas for now. Nothing on the stack for these crashes, or in related-ish code, looks like it's changed recently.

ni? to baku and valentin (streams and networking, respectively) to see if they have any ideas about what might be going wrong here, or for poking holes in the above analysis. Tentatively moving this to Networking...
Component: XPCOM → Networking
Flags: needinfo?(valentin.gosu)
Flags: needinfo?(amarchesini)
Comment 5•6 years ago
(In reply to Andrea Marchesini [:baku] from comment #4)
> This seems a dup of bug 1448612.

Yes, I think this is correct. Thank you!
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(valentin.gosu)
Resolution: --- → DUPLICATE
Comment 6•6 years ago
I still need to understand why this happens. In theory, when that method is called, we have a ThreadSafeWorkerRef that keeps the worker alive, and the dispatching of a WorkerControlRunnable should not fail. Or maybe it was not failing before the introduction of ThreadSafeWorkerRef.
Updated•3 years ago
Group: core-security