Crash in [@ shutdownhang | NtQueryAttributesFile]. CacheFileIOManager because SyncRemoveAllCacheFiles takes too long to Clear Cache on Shutdown for users with clear cache history on close
Categories
(Core :: Networking: Cache, defect, P2)
People
(Reporter: ehsan.akhgari, Assigned: valentin)
References
(Blocks 1 open bug)
Details
(Keywords: crash, nightly-community, topcrash, Whiteboard: [tbird crash][necko-triaged])
Comment 31•5 years ago
Removed signature for crashes occurring only on very old versions.
Comment 32•4 years ago
Going by the crash reports, I tried to reproduce this crash on the versions with the highest share of reports (Release v85.0 and v85.0.1) on Windows 10. I doubled the disk cache limit (pref browser.cache.disk.capacity), browsed ~10 different websites for at least a minute to fill the cache, and then closed the browser, expecting a crash at shutdown. I wasn't able to reproduce it in a few tries.
Do we have a test case for this issue? At the least, a more reliable set of steps to reproduce would help in investigating it.
Can you give me some tips on how best to attempt a reproduction? Thank you.
Comment 33•4 years ago
(In reply to Bodea Daniel [:danibodea] from comment #32)
> Do we have some kind of test case for this issue? At least a more reliable set of steps to reproduce would be nice in order to investigate it.
> Can you give me some tips on how it'd be best to attempt its reproduction? Thank you.
We do not have test cases for shutdown hangs in general, unfortunately. We might want to understand from the crash reports if there are places we want to add additional diagnostics in order to make this more actionable.
Comment 34•3 years ago
Putting back in triage for priority review.
Also NI'ing myself to revisit later this week.
Comment 36•2 years ago (Assignee)
There is a good chance bug 1705676 will fix this. Let's see how it works on Nightly.
Comment 37•2 years ago
The severity field for this bug is set to S3. However, the bug has the topcrash keyword.
:valentin, could you consider increasing the severity of this top-crash bug? If the crash isn't "top" anymore, could you drop the topcrash keyword?
For more information, please visit auto_nag documentation.
Comment 38•2 years ago
As Valentin indicated by linking to bug 1786256, this should be solved by that bug.
Comment 39•2 years ago
IIUC the instances I see now on crash-stats are a different thing - something is happening with patched_LdrLoadDll that makes us block.
I have no idea what this is and if we can do anything about this, but it looks different from this bug.
Comment 40•2 years ago
Hmm. It looks like our patched function might be a red herring there; we're already well into the "real" LdrLoadDll by the time the crash is triggered. I don't have access to minidumps anymore, but both the hang in NtQueryAttributesFile and also the LoadLibrary call happening from inside of PeekMessage during shutdown seem a little bit odd. I'm wondering if there's some really terrible I/O situation happening, or if something is trying to have hooks in user32, or both.
Comment 41•2 years ago (Assignee)
We should see the crash numbers go down in 109 once bug 1786256 rides the trains.
Comment 42•2 years ago
The signature [@ shutdownhang | NtQueryAttributesFile ] shows a fair amount of hangs where we seem to do stub_LdrLoadDll intercepting a PeekMessageW. I cannot really see what kind of DLL we try to load here, but I'd assume, it is not really cache related.
Comment 43•2 years ago
(In reply to Valentin Gosu [:valentin] (he/him) [PTO until Jan 2nd] from comment #41)
> We should see the crash numbers go down in 109 once bug 1786256 rides the trains.
There must be some statistics by now. I don't know how to view/get these. Can you post something on this?
Comment 44•2 years ago (Assignee)
This was fixed by bug 1786256.
I think back-porting all of the background tasks work to ESR is too difficult.
(In reply to Jens Stutte [:jstutte] from comment #42)
> The signature [@ shutdownhang | NtQueryAttributesFile ] shows a fair amount of hangs where we seem to do stub_LdrLoadDll intercepting a PeekMessageW. I cannot really see what kind of DLL we try to load here, but I'd assume, it is not really cache related.
I have filed bug 1809655 for the remaining crashes with this signature.