Closed
Bug 973533
Opened 11 years ago
Closed 10 years ago
Crash [@ moz_abort | isalloc_validate | mozilla::scache::StartupCache::SizeOfEntryExcludingThis(nsACString_internal const&, nsAutoPtr<mozilla::scache::CacheEntry> const&, unsigned int (*)(void const*), void*) ]
Categories
(Core :: XPCOM, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: andrei, Unassigned)
References
()
Details
(Keywords: crash, intermittent-failure, sec-moderate, Whiteboard: [mozmill])
Crash Data
Attachments
(1 file)
|
20.45 KB,
text/plain
|
Details |
This is the crash report:
https://crash-stats.mozilla.com/report/index/b3406b26-1056-47ee-82e9-722ff2140217
Crashed while running mozmill tests on a Win81 32b node.
http://mozmill-daily.blargon7.com/#/endurance/report/e35f22a68fec3f6d144713f9abdbca7a
The full console log is attached.
Prior to the crash we have a few messages like this:
> [Child 1676] WARNING: pipe error: 109: file c:/builds/moz2_slave/m-cen-w32-ntly-000000000000000/build/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 310
I found a similar message reported in bug 966094
This crash sounds kinda bad. We are accessing some memory which should not be accessed (I think)
Crash Reason EXCEPTION_BREAKPOINT
Crash Address 0x73612555
Here the first 10 frames from the stack:
0 mozglue.dll moz_abort memory/mozjemalloc/jemalloc.c
1 mozglue.dll isalloc_validate memory/mozjemalloc/jemalloc.c
2 xul.dll mozilla::scache::StartupCache::SizeOfEntryExcludingThis(nsACString_internal const &,nsAutoPtr<mozilla::scache::CacheEntry> const &,unsigned int (*)(void const *),void *) startupcache/StartupCache.cpp
3 xul.dll nsBaseHashtable<nsCStringHashKey,nsAutoPtr<mozilla::scache::CacheEntry>,mozilla::scache::CacheEntry *>::s_SizeOfStub(PLDHashEntryHdr *,unsigned int (*)(void const *),void *) obj-firefox/dist/include/nsBaseHashtable.h
4 xul.dll SizeOfEntryExcludingThisEnumerator xpcom/glue/pldhash.cpp
5 xul.dll PL_DHashTableEnumerate(PLDHashTable *,PLDHashOperator (*)(PLDHashTable *,PLDHashEntryHdr *,unsigned int,void *),void *) xpcom/glue/pldhash.cpp
6 xul.dll PL_DHashTableSizeOfExcludingThis(PLDHashTable const *,unsigned int (*)(PLDHashEntryHdr *,unsigned int (*)(void const *),void *),unsigned int (*)(void const *),void *) xpcom/glue/pldhash.cpp
7 xul.dll nsBaseHashtable<nsCStringHashKey,nsAutoPtr<mozilla::scache::CacheEntry>,mozilla::scache::CacheEntry *>::SizeOfExcludingThis(unsigned int (*)(nsACString_internal const &,nsAutoPtr<mozilla::scache::CacheEntry> const &,unsigned int (*)(void const *),void *),unsigned int (*)(void const *),void *) obj-firefox/dist/include/nsBaseHashtable.h
8 xul.dll mozilla::scache::StartupCache::HeapSizeOfIncludingThis(unsigned int (*)(void const *)) startupcache/StartupCache.cpp
9 xul.dll mozilla::scache::StartupCache::CollectReports(nsIMemoryReporterCallback *,nsISupports *) startupcache/StartupCache.cpp
10 xul.dll nsMemoryReporterManager::GetReportsForThisProcess(nsIMemoryReporterCallback *,nsISupports *) xpcom/base/nsMemoryReporterManager.cpp
Andrei, is that crash reproducible? If yes, we should find the regression range. Are other platforms also affected or only Windows 8.1?
Severity: normal → critical
status-firefox30:
--- → affected
tracking-firefox30:
--- → ?
Flags: needinfo?(andrei.eftimie)
Keywords: crash
Whiteboard: [mozmill]
Marking as security bug for now until we know if it is one or not.
Group: core-security
| Reporter | ||
Comment 3•11 years ago
|
||
I haven't seen any other instance besides this one.
Also note that this crash was on the 13th Feb, and we ran tests every day with no other occurrence.
I'll give it a shot to try reproducing it in the same environment, but I think its not likely that it will reproduce.
Flags: needinfo?(andrei.eftimie)
Comment 4•11 years ago
|
||
(Not a networking cache but startup cache)
Component: Networking: Cache → XPCOM
Comment 5•11 years ago
|
||
Nick: how bad is it when this memory reporter (added in bug 697335) fires?
Updated•11 years ago
|
Flags: needinfo?(n.nethercote)
Comment 6•11 years ago
|
||
> Nick: how bad is it when this memory reporter (added in bug 697335) fires?
I don't understand the question... how bad is what? If you're asking when it happens, it's when a user visits about:memory and hits the "measure" button.
As you found, this is fairly old code that hasn't changed recently, AFAIK.
Judging from the stack, I'd guess we're calling malloc_size_of() on something that's not a heap block.
Here's the full code for CacheEntry:
struct CacheEntry
{
nsAutoArrayPtr<char> data;
uint32_t size;
CacheEntry() : data(nullptr), size(0) { }
// Takes possession of buf
CacheEntry(char* buf, uint32_t len) : data(buf), size(len) { }
~CacheEntry()
{
}
size_t SizeOfExcludingThis(mozilla::MallocSizeOf mallocSizeOf) {
return mallocSizeOf(data);
}
};
Normally, the structure of a SizeOfExcludingThis() function is quite similar to the structure of a destructor, except that it measures heap blocks instead of freeing them. But this class isn't like that... ~CacheEntry() doesn't free |data|. In fact, I can't tell where |data| is freed.
Taras, Nathan, or Aaron might understand this code better.
Flags: needinfo?(n.nethercote)
Updated•11 years ago
|
Keywords: intermittent-failure,
sec-moderate
Comment 7•11 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #6)
> Normally, the structure of a SizeOfExcludingThis() function is quite similar
> to the structure of a destructor, except that it measures heap blocks
> instead of freeing them. But this class isn't like that... ~CacheEntry()
> doesn't free |data|. In fact, I can't tell where |data| is freed.
~nsAutoArrayPtr will take care of deleting the data.
We only ever build a CacheEntry with heap-allocated data:
http://mxr.mozilla.org/mozilla-central/source/startupcache/StartupCache.cpp#372
Hm, are we trying to get a memory report from a child that's already freed its startup cache? Perhaps this assert:
http://mxr.mozilla.org/mozilla-central/source/memory/mozjemalloc/jemalloc.c#4384
is firing because we've already freed the pointer?
Comment 8•11 years ago
|
||
We're pretty clearly operating on a bad pointer, but it really doesn't look like we've already freed the cache.
We could be hitting threading issues with startup cache clients: we have debug assertions about PutBuffer being on the correct thread but not release-mode checks.
It could also be that we're doing unsafe modification of the hashtable itself via callbacks in the CacheCloseHelper enumerator. I must admit I've lost track of what the CacheCloseHelper is actually supposed to be doing, and its calls seem unable to re-enter at first glance, but I'm not 100% sure of that.
I guess this was a release build that doesn't have assertions about threading or nested hashtable modification?
Comment 9•11 years ago
|
||
It seems like this is not a commonly occurring issues and that we don't have any reproducibility so since it's sec-moderate I'm not going to track this unless we see more activity on this signature.
Updated•10 years ago
|
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Updated•10 years ago
|
Group: core-security → core-security-release
Updated•8 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•