Startup crash in [@ std::_Func_impl_no_alloc<T>::_Do_call | nsFolderCompactState::OnStopRunningUrl]
Categories
(MailNews Core :: Networking: IMAP, defect)
Tracking
(thunderbird_esr102+ verified, thunderbird107 wontfix, thunderbird108 verified)
People
(Reporter: wsmwk, Assigned: benc)
References
()
Details
(4 keywords, Whiteboard: [TM:102.6.1])
Crash Data
Attachments
(2 files)
50.48 KB,
image/png
|
Details | |
Bug 1797616 - Pass expungedBytes as parameter to nsFolderCompactState completion function. r=mkmelin
48 bytes,
text/x-phabricator-request
|
wsmwk
:
approval-comm-esr102+
|
Details | Review |
+++ This bug was initially created as a clone of Bug #1766101 +++
Sharp uptick of crashes in 107.0b1. Regression of bug 1793599 ?
Crash report: https://crash-stats.mozilla.org/report/index/84695245-098b-415a-a750-9a6190221026
Reason: EXCEPTION_ACCESS_VIOLATION_READ
Top 10 frames of crashing thread:
0 xul.dll std::_Func_impl_no_alloc<`lambda at /builds/worker/checkouts/gecko/comm/mailnews/base/src/nsMsgFolderCompactor.cpp:1331:25', void, nsresult>::_Do_call mailnews/base/src/nsMsgFolderCompactor.cpp:103
1 xul.dll nsFolderCompactState::FinishCompact mailnews/base/src/nsMsgFolderCompactor.cpp:644
2 xul.dll nsFolderCompactState::OnStopRequest mailnews/base/src/nsMsgFolderCompactor.cpp:684
3 xul.dll nsMsgProtocol::OnStopRequest mailnews/base/src/nsMsgProtocol.cpp:390
4 xul.dll nsMailboxProtocol::OnStopRequest mailnews/local/src/nsMailboxProtocol.cpp:389
5 xul.dll nsInputStreamPump::OnStateStop netwerk/base/nsInputStreamPump.cpp:670
6 xul.dll nsInputStreamPump::OnInputStreamReady netwerk/base/nsInputStreamPump.cpp:395
7 xul.dll CallbackHolder::CallbackHolder::<lambda_1>::operator const xpcom/io/nsPipe3.cpp:73
7 xul.dll NS_NewCancelableRunnableFunction<`lambda at /builds/worker/checkouts/gecko/xpcom/io/nsPipe3.cpp:71:35'>::FuncCancelableRunnable::Run xpcom/threads/nsThreadUtils.h:650
8 xul.dll mozilla::RunnableTask::Run xpcom/threads/TaskController.cpp:538
Mac Crash report: https://crash-stats.mozilla.org/report/index/b18025bb-cb36-43fa-9f6d-9d5a50221026
Reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Top 10 frames of crashing thread:
0 XUL nsFolderCompactState::ExpungedBytes const mailnews/base/src/nsMsgFolderCompactor.cpp:103
0 XUL nsMsgFolderCompactor::NextFolder const mailnews/base/src/nsMsgFolderCompactor.cpp:1340
0 XUL std::__1::__invoke<nsMsgFolderCompactor::NextFolder /builds/worker/fetches/MacOSX11.3.sdk/usr/include/c++/v1/type_traits:3747
0 XUL std::__1::__invoke_void_return_wrapper<void>::__call<nsMsgFolderCompactor::NextFolder /builds/worker/fetches/MacOSX11.3.sdk/usr/include/c++/v1/__functional_base:348
0 XUL std::__1::__function::__alloc_func<nsMsgFolderCompactor::NextFolder /builds/worker/fetches/MacOSX11.3.sdk/usr/include/c++/v1/functional:1553
0 XUL std::__1::__function::__func<nsMsgFolderCompactor::NextFolder /builds/worker/fetches/MacOSX11.3.sdk/usr/include/c++/v1/functional:1727
1 XUL std::__1::__function::__value_func<void const /builds/worker/fetches/MacOSX11.3.sdk/usr/include/c++/v1/functional:1880
1 XUL std::__1::function<void const /builds/worker/fetches/MacOSX11.3.sdk/usr/include/c++/v1/functional:2555
1 XUL nsFolderCompactState::FinishCompact mailnews/base/src/nsMsgFolderCompactor.cpp:644
2 XUL nsFolderCompactState::OnStopRequest mailnews/base/src/nsMsgFolderCompactor.cpp:684
Reporter | ||
Comment 1•2 years ago
|
||
Note, this is a topcrash, whereas Bug 1766101 is not.
Reporter | ||
Updated•2 years ago
|
Comment 2•2 years ago
|
||
Seems it for some reason doesn't like the call to uint64_t ExpungedBytes() const { return m_totalExpungedBytes; }
?
Comment 3•2 years ago
|
||
The crash address is always 0x48 on 32-bit builds and 0x70 on 64-bit ones, it's likely that the nsFolderCompactState
instance ExpungedBytes()
is being called upon is NULL.
Assignee | ||
Comment 4•2 years ago
|
||
Updated•2 years ago
|
Assignee | ||
Comment 5•2 years ago
•
|
||
My best guess is that a Compact()
call is failing, but still (partially) kicking off iteration over the messages (which is handed by the message copy system). So when the Compact()
returns the failure, it releases it's reference to the nsFolderCompactState
:
... but the nsFolderCompactState is still there (probably held in existence by the message copy system, for which it is a listener). When the copy completes (or fails), the completion function is invoked, which tries to get the expunged bytes from the nsFolderCompactState, using the (now null!) pointer in nsMsgFolderCompactor
. The nsFolderCompactState
still exists (it's the thing that is calling the completion fn), but we're trying to access it via a null pointer.
This patch bypasses all that by passing in the expunged bytes as a parameter to completionFn
, so there is no need to access the nsFolderCompactState
in any case. Even if this patch doesn't fix things, I think it's more logical.
SO...
I say we land this patch and see if it helps.
The proper solution is to rewrite the folder compaction code. It's waaaay too convoluted and complex, and there are so many places where the error handling will just fall flat on it's face. I do plan to do this once I've sorted out the rest of the mbox code properly (Bug 1719121).
Assignee | ||
Updated•2 years ago
|
Updated•2 years ago
|
Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/5837935c3864
Pass expungedBytes as parameter to nsFolderCompactState completion function. r=mkmelin
Comment 7•2 years ago
|
||
This is crashing comm/mailnews/imap/test/unit/test_offlineStoreLocking.js
Updated•2 years ago
|
Assignee | ||
Comment 9•2 years ago
|
||
Patch tweaked and uploaded. Passes test_offlineStoreLocking.js fine locally now, but I just kicked off a try run to go with it:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=4fe001322da9e1090032e0242ebf2b735cbf3a3b
Assignee | ||
Updated•2 years ago
|
Comment 10•2 years ago
|
||
Pushed by alessandro@thunderbird.net:
https://hg.mozilla.org/comm-central/rev/4a5e88f6613a
Pass expungedBytes as parameter to nsFolderCompactState completion function. r=mkmelin
Comment 11•2 years ago
|
||
Backed out again for build bustages across the board.
Backout: https://hg.mozilla.org/comm-central/rev/9daf437f80b8cc28195d8127ceb69a3489156f53
Assignee | ||
Comment 12•2 years ago
|
||
This fix is cursed :-)
This bustage totally confused me (it worked fine in a try build a day or two ago)...
Turns out to be a simple case of 24 hour self-sabotage bitrot: I did a fix in Bug 1798181 around the same time which clashed with this patch.
Updated in phab, pending this new try build:
Comment 13•2 years ago
|
||
Third time a charm? :) Try looks ok so will land it now.
Comment 14•2 years ago
|
||
Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/9ef9ebcf17b5
Pass expungedBytes as parameter to nsFolderCompactState completion function. r=mkmelin
Reporter | ||
Comment 15•2 years ago
•
|
||
(In reply to Wayne Mery (:wsmwk) from comment #1)
Note, this is a topcrash, whereas Bug 1766101 is not.
I think that should have been written as "this is a topcrash for BETA (starting at 107.0b1), whereas Bug 1766101 is not."
However,
- we haven't identified the cause of the big increase at 107.0b1, the regressing bug/patch for this bug, unless Bug 1793599 - IMAP to fastmail fails after erasing ImapMail folder - is the primary suspect and which BTW was uplifted to 102.4.2
- Bug 1766101 is a topcrash for 102 (but it is not marked as such - somwhere along the way I goofed) - ranking #7, #34, etc - which means we really should do something for 102 if possible
- #2 above is an entaglement which makes it more difficult to use crash-stats to determine both the cause of #1 and the effectiveness of the patch in this bug
Ben, Gene, thoughts?
(In reply to Ben Campbell from comment #5)
My best guess is that a
Compact()
call is failing, but still (partially) kicking off iteration over the messages (which is handed by the message copy system). So when theCompact()
returns the failure, it releases it's reference to thensFolderCompactState
:
...
The proper solution is to rewrite the folder compaction code. It's waaaay too convoluted and complex, and there are so many places where the error handling will just fall flat on it's face. I do plan to do this once I've sorted out the rest of the mbox code properly (Bug 17191210).
You mean Bug 1719121 ?
Reporter | ||
Comment 16•2 years ago
|
||
But if beta crashes significantly drop starting with 108.0b1 (nightly crashes are virtually zero, so nightly stats are not helpful), then perhaps we have a winner that can be uplifted to 102.
Updated•2 years ago
|
Reporter | ||
Comment 17•2 years ago
|
||
We have a winner. Crashes end starting with 108 beta for at least three signatures:
Average user crashes 5-10 times.
Unfortunately the reporter of https://support.mozilla.org/en-US/questions/1381616 never replied.
A new report https://support.mozilla.org/en-US/questions/1399437
Reporter | ||
Comment 18•2 years ago
|
||
Let's get this for 102.6.1
Reporter | ||
Comment 19•2 years ago
|
||
Comment on attachment 9300670 [details]
Bug 1797616 - Pass expungedBytes as parameter to nsFolderCompactState completion function. r=mkmelin
[Approval Request Comment]
Regression caused by (bug #): was never clearly labeled as a regression
User impact if declined: user crashes
Testing completed (on c-c, etc.): on beta for over 30 days
Risk to taking this patch (and alternatives if risky): ?
Reporter | ||
Comment 20•2 years ago
|
||
Comment on attachment 9300670 [details]
Bug 1797616 - Pass expungedBytes as parameter to nsFolderCompactState completion function. r=mkmelin
[Triage Comment]
Approved for esr102
Risk to taking this patch (and alternatives if risky): ?
FWIW, no major new crashes or other regressions have been reported on beta.
Comment 21•2 years ago
|
||
bugherder uplift |
Thunderbird 102.6.1:
https://hg.mozilla.org/releases/comm-esr102/rev/19933ae13c78
Reporter | ||
Comment 22•2 years ago
|
||
Thanks to Ben...
I suspect this also fixed signature nsFolderCompactState::OnStopRunningUrl whose crashes ended as of 102.6.1
https://crash-stats.mozilla.org/signature/?proto_signature=~nsFolderCompactState&product=Thunderbird&date=%3E%3D2022-11-04T09%3A38%3A00.000Z&date=%3C2023-02-04T09%3A38%3A00.000Z&_sort=-date&signature=nsFolderCompactState%3A%3AOnStopRunningUrl
Reporter | ||
Comment 23•2 years ago
|
||
Also gone in same general time frame is Mac+Linux crash signature nsFolderCompactState::ExpungedBytes, which is why I am noting it this bug report. But it probably doesn't have the same cause because
- its occurrence ends after 102.4.2 and 107.0b3, which doesn't match to the landing of this patch.
- it seems crashing from something js related
https://crash-stats.mozilla.org/signature/?proto_signature=~nsFolderCompactState&product=Thunderbird&signature=nsFolderCompactState%3A%3AExpungedBytes&date=%3E%3D2022-08-04T14%3A44%3A00.000Z&date=%3C2023-02-04T14%3A44%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_columns=startup_crash&_sort=-platform&_sort=-date&page=1#summary
Regardless, nsFolderCompactState::ExpungedBytes is gone.
Description
•