Closed Bug 1432615 Opened 7 years ago Closed 5 years ago

Crash in shutdownhang | mozilla::detail::HashKnownLength<T>

Categories

(Core :: Networking: Cache, defect, P3)

56 Branch
x86_64
Windows 7
defect

Tracking


RESOLVED WORKSFORME

People

(Reporter: worcester12345, Unassigned)

References

Details

(Keywords: crash, crashreportid, Whiteboard: [necko-triaged])

Crash Data

This bug was filed from the Socorro interface and is report bp-bbe2bfc9-6744-4686-aeac-b252d1180123.
=============================================================
Top 5 frames of crashing thread:
0 xul.dll mozilla::detail::HashKnownLength<unsigned char> obj-firefox/dist/include/mozilla/HashFunctions.h:226
1 xul.dll js::HashableValue::setValue js/src/builtin/MapObject.cpp:43
2 xul.dll js::MapObject::has_impl js/src/builtin/MapObject.cpp:693
3 xul.dll js::MapObject::has js/src/builtin/MapObject.cpp:704
4 @0x2cac4834eea
=============================================================
Component: Tabbed Browser → JavaScript Engine
Product: Firefox → Core
Keywords: crash, crashreportid
A non-opt stack would include more frames. I think like this:

HashableValue::setValue
-> AtomizeString
-> js::AtomHasher::Lookup::Lookup(const JSAtom*)
-> mozilla::HashString()
-> mozilla::HashString(const unsigned char* aStr, size_t aLength)
-> mozilla::detail::HashKnownLength<unsigned char>(unsigned char const*, unsigned __int64)

This is super rare and probably the Value is already corrupt by the time we get here, so we have nothing to go on. Not actionable for us in Core::JavaScript land.
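For reference, a minimal sketch of what a known-length string hash at the bottom of this stack does. The names below are hypothetical and the mixing step is simplified; the real implementation lives in mfbt/HashFunctions.h and differs in the details.

    #include <cstddef>
    #include <cstdint>

    // Simplified stand-in for the per-character mixing step; only illustrative.
    static uint32_t SketchAddToHash(uint32_t aHash, uint32_t aValue) {
      const uint32_t kGoldenRatio = 0x9E3779B9u;
      return kGoldenRatio * ((aHash << 5) ^ (aHash >> 27) ^ aValue);
    }

    // O(aLength) loop over every character of the key, with no early exit.
    // A huge bogus length would normally run into unmapped memory and crash
    // rather than hang here.
    template <typename T>
    static uint32_t SketchHashKnownLength(const T* aStr, size_t aLength) {
      uint32_t hash = 0;
      for (size_t i = 0; i < aLength; i++) {
        hash = SketchAddToHash(hash, aStr[i]);
      }
      return hash;
    }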
There is a more common stack that crashes with the same signature:

either mozilla::net::CacheStorageService::DropPrivateBrowsingEntries() or mozilla::net::CacheStorageService::Clear()
-> mozilla::net::CacheStorageService::DoomStorageEntries(nsACString_internal const&, nsILoadContextInfo*, bool, bool, nsICacheEntryDoomCallback*)
-> mozilla::net::RemoveExactEntry
-> nsRefPtrHashtable<nsCStringHashKey, mozilla::net::CacheEntry>::Get(nsTSubstring<char> const&, mozilla::net::CacheEntry**)
-> PLDHashTable::Search(void const*)
-> mozilla::detail::HashKnownLength<unsigned char>(unsigned char const*, unsigned __int64)

For example, bp-4ea210fa-c6a0-4799-8745-3da2e0180213.
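For context on why HashKnownLength shows up under the cache path: every lookup keyed by the full URL string has to hash the whole key before the bucket search. A rough, hypothetical sketch, with std::unordered_map standing in for nsRefPtrHashtable/PLDHashTable:

    #include <string>
    #include <unordered_map>

    struct SketchCacheEntry { /* placeholder */ };
    using SketchEntryTable = std::unordered_map<std::string, SketchCacheEntry*>;

    // Rough analogue of a per-key removal: find() hashes the whole URL key
    // (the HashKnownLength frame in the signature) and then compares keys
    // within the bucket; erase(it) just unlinks the found node.
    static SketchCacheEntry* SketchRemoveExactEntry(SketchEntryTable& aTable,
                                                    const std::string& aKey) {
      auto it = aTable.find(aKey);
      if (it == aTable.end()) {
        return nullptr;
      }
      SketchCacheEntry* entry = it->second;
      aTable.erase(it);
      return entry;
    }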
Component: JavaScript Engine → Networking: Cache
Has this appeared recently? This code is really old.
Assignee: nobody → honzab.moz
Priority: -- → P3
Whiteboard: [necko-triaged]
(In reply to Honza Bambas (:mayhemer) from comment #3)
> Has this appeared recently? This code is really old.

23 days ago.
Nicholas, have we changed something in our hashtable hashing algorithms (string keys), or anything else recently, that could lead to a major slowdown of Get()?
Flags: needinfo?(n.nethercote)
(In reply to Honza Bambas (:mayhemer) from comment #5)
> Nicholas, have we changed something in our hashtable hashing algorithms
> (string keys), or anything else recently, that could lead to a major
> slowdown of Get()?

I'm not an MFBT reviewer, but `hg blame` says that mfbt/HashFunctions.h hasn't changed since July 2017, and mfbt/HashFunctions.cpp hasn't changed since July 2014.

It's a strange place to have a hang. HashKnownLength rips through memory pretty quickly. If one of its parameters were bogus -- e.g. a super large length -- you'd think it would crash by hitting inaccessible memory.
Flags: needinfo?(n.nethercote)
Yep, I think as well this is probably a swapping problem...

Question is, would doing just a simple Clear() on the hashtable help? It would also mean rethinking and rewriting some stuff inside the top layers of cache2...

Do you think using hashes (fast) as keys would help?
(In reply to Honza Bambas (:mayhemer) from comment #7)
> Yep, I think as well this is probably a swapping problem...

You say "as well" but I didn't say that I thought it was a swapping problem. I don't think there's enough information in the crash reports to draw any conclusions.

> Question is, would doing just a simple Clear() on the hashtable help? It
> would also mean rethinking and rewriting some stuff inside the top layers
> of cache2...
>
> Do you think using hashes (fast) as keys would help?

I don't know what theoretical problem either of these suggestions would address.
(In reply to Nicholas Nethercote [:njn] from comment #8)
> (In reply to Honza Bambas (:mayhemer) from comment #7)
> > Yep, I think as well this is probably a swapping problem...
>
> You say "as well" but I didn't say that I thought it was a swapping problem.
> I don't think there's enough information in the crash reports to draw any
> conclusions.

Ahh! Yes, I didn't read carefully, sorry.

We have the strings under our control, but they contain full URLs. Extra long URLs are possible, but nowhere near 2^32 characters; I believe we would OOM somewhere else before that.

This is a shutdown hang, so something simply takes too long. The code does two nested iterations: the top level is expected to be no more than three items, but the second level can be large, and for each item of the second level we do a lookup+delete in another, smaller hashtable (see the sketch below). The number of items in those hashtables can be one factor; another could be that memory is swapping on a heavily loaded system.

From the report it's hard to draw a conclusion, and in the past we were never able to determine the cause of similar reports. When there was an actionable problem, it was usually a bug like an indefinite or buggy loop. That's not likely in this case, since the code has not changed for a very long time.

> > Question is, would doing just a simple Clear() on the hashtable help? It
> > would also mean rethinking and rewriting some stuff inside the top layers
> > of cache2...
> >
> > Do you think using hashes (fast) as keys would help?
>
> I don't know what theoretical problem either of these suggestions would
> address.

I was thinking about lowering the memory footprint here and thus lowering the probability of this piece of code taking too long to execute during shutdown, but that's a wild guess only.
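To make the shape of that shutdown work concrete, a hedged, hypothetical sketch of the nested iteration described above (invented names, not the actual cache2 code):

    #include <string>
    #include <unordered_map>

    struct SketchEntry { /* placeholder */ };
    using SketchEntryTable = std::unordered_map<std::string, SketchEntry*>;
    using SketchStorageMap = std::unordered_map<std::string, SketchEntryTable>;

    // Outer loop: expected to be no more than ~3 storages.
    // Inner loop: can hold a large number of entries, and each one pays a
    // full hash of its URL key in the erase() on the other table. Under
    // memory pressure / swapping this inner loop can keep the shutdown
    // watchdog waiting long enough to fire.
    static void SketchDoomAllEntries(SketchStorageMap& aStorages,
                                     SketchEntryTable& aGlobalEntries) {
      for (auto& storage : aStorages) {
        for (auto& entry : storage.second) {
          aGlobalEntries.erase(entry.first);  // lookup+delete per entry
        }
        storage.second.clear();
      }
    }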
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Status: ASSIGNED → NEW
This has a low crash rate (~1.5 crashes a day) and is mostly seen only on ESR. About 30% of the crashes come from mozilla::net::CacheStorageService::DropPrivateBrowsingEntries() and the rest from cache2's Clear() at shutdown time. Both could be optimized to just drop everything instead of walking the in-memory entry tables entry by entry (see the sketch below).

Anyway, at the moment I'm returning this to the unassigned pool; I've been holding this bug for too long and won't have time to work on it soon.
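A hedged sketch of the "drop everything" idea (hypothetical names, not a proposed patch; real entries would still need their doom callbacks and refcount handling, which is why this is not a trivial swap):

    #include <string>
    #include <unordered_map>
    #include <vector>

    struct SketchEntry { /* placeholder */ };
    using SketchEntryTable = std::unordered_map<std::string, SketchEntry*>;

    // Per-entry walk, roughly the pattern implied by the stacks above:
    // one full URL-key hash + bucket search for every entry being dropped.
    static void SketchDoomEntryByEntry(SketchEntryTable& aEntries,
                                       const std::vector<std::string>& aKeys) {
      for (const auto& key : aKeys) {
        aEntries.erase(key);
      }
    }

    // Drop-all alternative: a single bulk teardown, no per-key hashing.
    static void SketchDropAll(SketchEntryTable& aEntries) {
      aEntries.clear();
    }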
Assignee: honzab.moz → nobody

Seems this is gone - no crashes in the past month other than 52.xx ESR.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME