Closed
Bug 789397
Opened 11 years ago
Closed 9 years ago
searching virtual folder. crash in nsACString_internal::Assign due to out of range access from nsMsgSearchSession::GetNextUrl. Often compact related.
Categories
(Thunderbird :: Search, defect)
Tracking
(thunderbird36 fixed)
RESOLVED
FIXED
Thunderbird 36.0
Tracking | Status | |
---|---|---|
thunderbird36 | --- | fixed |
People
(Reporter: m_kato, Assigned: rkent)
References
()
Details
(Keywords: crash, Whiteboard: [gs])
Crash Data
Attachments
(1 file, 2 obsolete files)
5.98 KB,
patch
|
Irving
:
review+
|
Details | Diff | Splinter Review |
m_urlQueueIndex can be out of range by OnStopRunningUrl. This bug was filed from the Socorro interface and is report bp-a95c4a66-30fe-465b-8026-b8f612120904 . ============================================================= 0 xul.dll nsACString_internal::Assign xpcom/string/src/nsTSubstring.cpp:381 1 xul.dll nsACString_internal::Assign xpcom/string/src/nsTSubstring.cpp:347 2 xul.dll nsMsgSearchSession::GetNextUrl mailnews/base/search/src/nsMsgSearchSession.cpp:464 3 xul.dll nsMsgSearchSession::BuildUrlQueue mailnews/base/search/src/nsMsgSearchSession.cpp:446 4 xul.dll nsMsgSearchSession::TimerCallback mailnews/base/search/src/nsMsgSearchSession.cpp:502 5 xul.dll nsTimerImpl::Fire xpcom/threads/nsTimerImpl.cpp:473 6 xul.dll nsTimerEvent::Run xpcom/threads/nsTimerImpl.cpp:556 7 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:624 8 xul.dll NS_ProcessNextEvent_P objdir-tb/mozilla/xpcom/build/nsThreadUtils.cpp:213 9 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:113 10 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:201 11 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:175 12 xul.dll nsBaseAppShell::Run widget/xpwidgets/nsBaseAppShell.cpp:163 13 xul.dll nsAppShell::Run widget/windows/nsAppShell.cpp:232 14 xul.dll nsAppStartup::Run toolkit/components/startup/nsAppStartup.cpp:255 15 xul.dll XREMain::XRE_mainRun toolkit/xre/nsAppRunner.cpp:3793 16 xul.dll XREMain::XRE_main toolkit/xre/nsAppRunner.cpp:3870 17 xul.dll XRE_main toolkit/xre/nsAppRunner.cpp:3946 18 thunderbird.exe NS_internal_main mail/app/nsMailApp.cpp:200 19 thunderbird.exe wmain toolkit/xre/nsWindowsWMain.cpp:100 20 thunderbird.exe __tmainCRTStartup crtexe.c:552 21 kernel32.dll BaseProcessStart
Reporter | ||
Updated•11 years ago
|
Summary: crash in nsACString_internal::Assign due to out of range access in nsMsgSearchSession::BuildUrlQueue → crash in nsACString_internal::Assign due to out of range access
Comment 1•11 years ago
|
||
also bp-aea4e391-b47f-433f-b5d3-c67cc2120904 memcpy | nsACString_internal::Assign(char const*, unsigned int, mozilla::fallible_t const&) note not all such signatures fail from OnStopRunningUrl ... for example bp-ab62f29a-fb56-4f83-84c0-58d9f2120830
Crash Signature: [@ nsACString_internal::Assign(nsACString_internal const&, mozilla::fallible_t const&)] → [@ nsACString_internal::Assign(nsACString_internal const&, mozilla::fallible_t const&)]
[@ memcpy | nsACString_internal::Assign(char const*, unsigned int, mozilla::fallible_t const&) ]
Reporter | ||
Comment 2•11 years ago
|
||
Reporter | ||
Updated•11 years ago
|
Attachment #663903 -
Flags: review?(mconley)
Comment 3•11 years ago
|
||
Comment on attachment 663903 [details] [diff] [review] fix Review of attachment 663903 [details] [diff] [review]: ----------------------------------------------------------------- So this looks OK to me, but we might want some people more familiar with back-end to make sure I'm doing the right thing.
Attachment #663903 -
Flags: review?(mconley) → feedback+
Comment 4•11 years ago
|
||
Comment on attachment 663903 [details] [diff] [review] fix Irving: This looks like an OK solution, but it also might be a kind of end-of-pipe-y solution. I'm not sure though, seeing as this mailnews back-endy stuff isn't really my forte. Care to give this a look? -Mike
Attachment #663903 -
Flags: review?(irving)
Comment 5•11 years ago
|
||
Comment on attachment 663903 [details] [diff] [review] fix Review of attachment 663903 [details] [diff] [review]: ----------------------------------------------------------------- Do you have steps to reproduce this? With a bit of digging and help from :rkent I came up with a possible cause for the inconsistent value; I'll attach a Work In Progress patch for feedback. Whatever we decide to do with my WIP, adding the check from this patch is a good idea. ::: mailnews/base/search/src/nsMsgSearchSession.cpp @@ +462,5 @@ > return NS_OK; > > + // OnStopRunningUrl may set index to out of range > + if (m_urlQueueIndex >= m_urlQueue.Length()) > + return NS_OK; Because this is error checking for an unusual software state, I'd prefer to see NS_ENSURE_TRUE(m_urlQueueIndex < m_urlQueue.Length(), NS_ERROR_UNEXPECTED);
Attachment #663903 -
Flags: review?(irving) → review-
Comment 6•11 years ago
|
||
I'm not completely sure about this patch; it makes a significant change to the logic for when we call GetNextUrl() and it's possible that some of the removed calls are necessary
Attachment #665622 -
Flags: feedback?(mozilla)
Attachment #665622 -
Flags: feedback?(m_kato)
Attachment #665622 -
Flags: feedback?(kent)
Reporter | ||
Updated•11 years ago
|
Attachment #665622 -
Flags: feedback?(m_kato) → feedback+
Assignee | ||
Comment 7•11 years ago
|
||
Comment on attachment 665622 [details] [diff] [review] WIP - clean up logic for possible underlying cause of this crash I spent a bit of time looking at this. I think it is OK, but it is worth documenting why. My concern was what happens when we have needUrlQueue == true but queuedUrl == false. That is the condition that I believe leads to the crash in the first place. What happens is that in DoNextSearch we call SearchWOUrls(), which makes no sense because if needUrlQueue == true then you really can't search without a URL. What happens though is that you end up eventually in nsMsgSearchSession::TimeSliceSerial which tries to do the invalid search using: nsresult rv = scope->TimeSlice(aDone); which does on the relevant adapter: return m_adapter->Search(aDone); Both news and imap search return errors in that case, and we end up cleaning up with: if (*aDone || NS_FAILED(rv)) { EnableFolderNotifications(true); ReleaseFolderDBRef(); m_idxRunningScope++; EnableFolderNotifications(false); // check if the next scope is an online search; if so, // set *aDone to true so that we'll try to run the next // search in TimerCallback. scope = GetRunningScope(); if (scope && (scope->m_attribute == nsMsgSearchScope::onlineMail || (scope->m_attribute == nsMsgSearchScope::news && scope->m_searchServer))) { *aDone = true; return rv; } } That seems to me to be the correct cleanup. Note however that the above code is another place where you should implement needUrlQueue.
Attachment #665622 -
Flags: feedback?(kent) → feedback+
Comment 8•11 years ago
|
||
m_kato, are you interested in implementing Kent's suggestion? I don't have time to work on this right now.
Comment 9•11 years ago
|
||
Comment on attachment 665622 [details] [diff] [review] WIP - clean up logic for possible underlying cause of this crash So is this happening in mixed scope searches, e.g., a saved search that searches over both imap and local folders? I wonder if it's possible to recreate the crash and add a test. nit - extra blank line after needUrlQueue: +} + +
Attachment #665622 -
Flags: feedback?(mozilla) → feedback+
Comment 10•11 years ago
|
||
(In reply to Irving Reid (:irving) from comment #8) > m_kato, are you interested in implementing Kent's suggestion? I don't have > time to work on this right now.
Flags: needinfo?(m_kato)
Summary: crash in nsACString_internal::Assign due to out of range access → searching crash in nsACString_internal::Assign due to out of range access
Comment 11•11 years ago
|
||
(In reply to Irving Reid (:irving) from comment #8) > m_kato, are you interested in implementing Kent's suggestion? I don't have > time to work on this right now.
Comment 12•11 years ago
|
||
10 of 13 crash comments mention compact. Perhaps they are in unified inbox? For example bp-6c8da6f1-7158-4071-9be0-186052121111 bp-cf3fafb1-ca2a-4f89-a0d9-8c7bb2121115
Comment 13•11 years ago
|
||
so far, all the user I have contacted are using unified folder view - so the search is operating on a virtual folder.
Summary: searching crash in nsACString_internal::Assign due to out of range access → searching virtual crash in nsACString_internal::Assign due to out of range access
Comment 15•11 years ago
|
||
(In reply to David :Bienvenu from comment #9) > So is this happening in mixed scope searches, e.g., a saved search that > searches over both imap and local folders? Irving, Is there a clue that indicates both type folders are involved? Also, this hasn't picked been up yet by m_kato - perhaps won't be able to. Are you in a better position now to possibly finish it off? I am trying to get a user testcase.
Flags: needinfo?(irving)
Comment 16•11 years ago
|
||
With combined signatures, it's #36 top crasher in TB 17.0.2 so not a top crasher according to https://wiki.mozilla.org/CrashKill/Topcrash
Keywords: topcrash
Comment 17•11 years ago
|
||
sad to see this languish halfway (In reply to David :Bienvenu from comment #9) > Comment on attachment 665622 [details] [diff] [review] > WIP - clean up logic for possible underlying cause of this crash > > So is this happening in mixed scope searches, e.g., a saved search that > searches over both imap and local folders? I wonder if it's possible to > recreate the crash and add a test. I might have a user of Unified Folders in bp-b23140ab-c30d-4cdc-b84b-001062130211. We had a user n bug 739949, but that ship has passed. bp-4ed28f32-e756-4b07-8cdc-e761f2121115 uses Unified but cannot reproduce I don't think we can't depend on finding a user to move this bug along (In reply to Scoobidiver from comment #16) > With combined signatures, it's #36 top crasher in TB 17.0.2 so not a top > crasher according to https://wiki.mozilla.org/CrashKill/Topcrash I expect the sum of crashes makes this easily top 20-25 crash https://crash-stats.mozilla.com/query/query?product=Thunderbird&version=ALL%3AALL&range_value=1&range_unit=weeks&date=02%2F20%2F2013+14%3A14%3A52&query_search=signature&query_type=contains&query=nsACString_internal%3A%3AAssign&reason=&build_id=&process_type=any&hang_type=any&do_query=1
Comment 18•11 years ago
|
||
Hi, Wayne suggested I join this bug report because this crash has happened to me several times and so I might be able to help. I'm happy to try and help but I'm not a techie, so if you want to ask questions, etc. words of one syllable please.
Comment 19•11 years ago
|
||
(In reply to DaveL from comment #18) > Hi, > Wayne suggested I join this bug report because this crash has happened to me > several times and so I might be able to help. > I'm happy to try and help but I'm not a techie, so if you want to ask > questions, etc. words of one syllable please. Hi, I'm here for the same reason as DaveL. Furthermore, I think I might qualify as techie, so maybe I can cope with two syllable words wrt. tech stuff ;) So, please advise how I could help.
Assignee | ||
Comment 20•11 years ago
|
||
I suppose that I could finish this patch adding my suggestions from comment 7, but irving who would normally review probably should not review since it is his patch to begin with. We are sort of running out reviewers here (my other patches to Irving and Standard8 are at 13 days and counting now, so I am guessing they are probably absorbed in other Mozilla work now.) I could also do the review if someone else finished the patch. Would any qualified reviewer be willing to review if I finished the patch?
Comment 21•11 years ago
|
||
curious that coders no outstrip reviewers. :) Kent, given your deep understanding of the issues involved, I suggest plunging ahead to produce an updated patch and we'll be resourceful in getting the review done - maybe Neil? Or can magnus review this if you review his patch in bug 558452? :
Flags: needinfo?(m_kato) → needinfo?(neil)
Comment 22•11 years ago
|
||
Only DoNextSearch() calls BuildUrlQueue() and it does so only if the current (m_idxRunningScope) search scope term is online. BuildUrlQueue() then tries to add all of the remaining online search scope terms to the URL queue. We then run the URLs, skipping the search scope terms as we go! This looks wrong if it is possible to have a mixture of online and offline search scope terms. The only reason I can see for the crash is if no online terms have a search adapter. I don't know what a search adapter is, or what should happen if we hit such a search term. I agree with the comments regarding needUrlQueue though.
Flags: needinfo?(neil)
Comment 23•11 years ago
|
||
Just had another crash. Id: http://crash-stats.mozilla.com/report/index/bp-f64f74b2-3e64-4991-98af-74f0d2130226. Sequence: 1.delete item from inbox 2. Popup: do you want to compact now? 3. Yes: crash Running with unified view and with TB Conversations add-on. First for a while.
Updated•11 years ago
|
Flags: needinfo?(irving)
Comment 24•11 years ago
|
||
With combined signatures, it's #32 crasher in TB 17.0.5.
Keywords: topcrash
Comment 25•11 years ago
|
||
(In reply to Scoobidiver from comment #24) > With combined signatures, it's #32 crasher in TB 17.0.5. But there are more specific signatures than currently cited in the field. (ref comment 17) For example ... mozalloc_abort | NS_DebugBreak_P | nsACString_internal::Assign -- bp-d0725e1d-1240-4aa5-8a59-247c92130424 cites compact // bp-4409b8c6-a61c-4cde-842f-ad3262130410 emptying trash nsACString_internal::Assign -- bp-b2f19de7-2ce5-4d54-a3db-534bb2130417 compact mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::Assign(char const*, unsigned int) -- bp-c6f38f53-dffb-48f6-a99d-5a2692130426
Crash Signature: [@ nsACString_internal::Assign(nsACString_internal const&, mozilla::fallible_t const&)]
[@ memcpy | nsACString_internal::Assign(char const*, unsigned int, mozilla::fallible_t const&) ] → [@ nsACString_internal::Assign(nsACString_internal const&, mozilla::fallible_t const&)]
[@ memcpy | nsACString_internal::Assign(char const*, unsigned int, mozilla::fallible_t const&) ]
[@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::Assign ]
…
Keywords: topcrash
Comment 26•11 years ago
|
||
Kent, does comment 22 offer enough additional fodder for you to proceed? (hopefully)
Flags: needinfo?(kent)
Summary: searching virtual crash in nsACString_internal::Assign due to out of range access from nsMsgSearchSession::GetNextUrl. Some compact related → searching virtual crash in nsACString_internal::Assign due to out of range access from nsMsgSearchSession::GetNextUrl. Often compact related.
Assignee | ||
Comment 27•11 years ago
|
||
I've spent a couple of days now playing with the search session logic, trying to see if I am comfortable changing this (because if I change it, I'm afraid that I will own any future issues). Perhaps in agreement with comment 22, the logic of managing multiple URLs does not seem to make any sense to me. On each call to DoNextSearch where the current scope requires a URL, we append a list of all of the urls to the queue. Doesn't that mean that if there is a random mix of scopes that need or don't need urls, then we will add the urls multiple times? It also seems like there is the assumption in OnStopRunningUrl that the current running scope & adapter match the url, yet it seems like urls are pulled from the queue independently of the running scope. I've spent two days trying to figure out how this is supposed to work, and can't convince myself that it does. What I am tempted to do (and have a patch for) is to strip the whole idea of a url queue, and just have a single active url that matches the scope and adapter. Does anyone understand this better than I do, to convince that is a bad idea?
Flags: needinfo?(kent)
Comment 28•11 years ago
|
||
Kent thanks for the analysis! (In reply to Kent James (:rkent) from comment #27) > I've spent two days trying to figure out how this is supposed to work, and > can't convince myself that it does. What I am tempted to do (and have a > patch for) is to strip the whole idea of a url queue, and just have a single > active url that matches the scope and adapter. > > Does anyone understand this better than I do, to convince that is a bad idea? There being no objection ... do you want to float the patch further?
Flags: needinfo?(kent)
Updated•11 years ago
|
tracking-thunderbird24:
--- → ?
Comment 29•11 years ago
|
||
Not a significant enough crasher to track specifically for 24, but we'll obviously take a patch if we get it.
tracking-thunderbird24:
? → ---
Comment 30•11 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #28) > Kent thanks for the analysis! > > (In reply to Kent James (:rkent) from comment #27) > > I've spent two days trying to figure out how this is supposed to work, and > > can't convince myself that it does. What I am tempted to do (and have a > > patch for) is to strip the whole idea of a url queue, and just have a single > > active url that matches the scope and adapter. > > > > Does anyone understand this better than I do, to convince that is a bad idea? > > There being no objection ... do you want to float the patch further? m_kato, might you be interested in finishing patch?
Flags: needinfo?(m_kato)
Comment 31•11 years ago
|
||
https://getsatisfaction.com/mozilla_messaging/tags/bug_789397 reports crashing on every automatic and manual compact
Whiteboard: [gs]
Comment 32•11 years ago
|
||
(marginally top 20, so still topcrash. skiplisting nsACString_internal::Assign would really nice)
Crash Signature: [@ nsACString_internal::Assign(nsACString_internal const&, mozilla::fallible_t const&)]
[@ memcpy | nsACString_internal::Assign(char const*, unsigned int, mozilla::fallible_t const&) ]
[@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::Assign ]
… → [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::Assign ]
[@ nsACString_internal::Assign ]
[@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::Assign(char const*, unsigned int) ]
[@ nsACString_internal::Assign(nsACSt…
Whiteboard: [gs] → [patchlove][has draft patch][gs]
Comment 33•10 years ago
|
||
bp-7f138340-4e13-4330-be7f-f21c82130717 David reports using Unified Folders view and "crashes only occur when Thunderbird initiates the process. [i.e. automatic compact triggered by an immediately preceding delete message] When Thunderbird is still - not pulling in any emails - I can compact the folders myself, and 99% of the time, the process completes"
Comment 34•10 years ago
|
||
(In reply to :Irving Reid from comment #6) > Created attachment 665622 [details] [diff] [review] > WIP - clean up logic for possible underlying cause of this crash > > I'm not completely sure about this patch; it makes a significant change to > the logic for when we call GetNextUrl() and it's possible that some of the > removed calls are necessary Irving, have subsequent patch revisions, or comments satisfied your concerns on this point? I'd like to do this as a try build with some users who can reproduce. If the patch is flawed, what ill effects should user watch for? (In reply to David :Bienvenu from comment #9) > Comment on attachment 665622 [details] [diff] [review] > WIP - clean up logic for possible underlying cause of this crash > > So is this happening in mixed scope searches, e.g., a saved search that > searches over both imap and local folders? "Yes" would be my answer - many of the crash users are using unified folder view. (However, I have not determined if 100% of cases are mixed scope) > I wonder if it's possible to recreate the crash and add a test. If you mean users, we have users who can reproduce.
Flags: needinfo?(irving)
Keywords: topcrash → topcrash-thunderbird
Updated•10 years ago
|
Summary: searching virtual crash in nsACString_internal::Assign due to out of range access from nsMsgSearchSession::GetNextUrl. Often compact related. → searching virtual folder. crash in nsACString_internal::Assign due to out of range access from nsMsgSearchSession::GetNextUrl. Often compact related.
Comment 35•10 years ago
|
||
This patch needs someone to integrate Kent's remarks; at this point I don't have time to put further work into this.
Flags: needinfo?(irving)
Assignee | ||
Comment 36•10 years ago
|
||
Wayne asked today in the TB meeting if someone would take a look at this, and I agreed to look at it again.
Flags: needinfo?(kent)
Assignee | ||
Comment 37•10 years ago
|
||
So here's my theory. The urlQueue was included to support running multiple URLs in parallel, but in fact that was never actually implemented. The logic to support those queues has issues. So I just eliminated the queues and only use a single URL. I'm not real happy with the error handling in all of this, but I did not really change this. Evidence seems to be that the error happens in cases that are likely to be generating errors. In error cases, the errors seem to be dropped, which would result in the search never terminating. I suppose that is better than crashing.
Attachment #8460753 -
Flags: review?(irving)
Comment 38•10 years ago
|
||
Comment on attachment 8460753 [details] [diff] [review] 789397_v3.patch Review of attachment 8460753 [details] [diff] [review]: ----------------------------------------------------------------- removing unnecessary complication makes me almost as happy as these fresh local plums i'm eating
Attachment #8460753 -
Flags: review?(irving) → review+
Comment 40•9 years ago
|
||
for unknown reason - no longer topcrash signature for version 24 is nsACString_internal::Assign(nsACString_internal const&, mozilla::fallible_t const&) | nsACString_internal::Assign(nsACString_internal const&) | nsMsgSearchSession::GetNextUrl() and in version 31 there is some OOM | large | NS_ABORT_OOM(unsigned int) | nsACString_internal::Assign(nsACString_internal const&) | nsMsgSearchSession::GetNextUrl()
Crash Signature: , unsigned int) ]
[@ nsACString_internal::Assign(nsACString_internal const&, mozilla::fallible_t const&)]
[@ memcpy | nsACString_internal::Assign(char const*, unsigned int, mozilla::fallible_t const&) ] → , unsigned int) ]
[@ nsACString_internal::Assign(nsACString_internal const&, mozilla::fallible_t const&)]
[@ memcpy | nsACString_internal::Assign(char const*, unsigned int, mozilla::fallible_t const&) ]
[@ nsACString_internal::Assign(nsACString_interna…
Flags: needinfo?(m_kato)
Keywords: topcrash-thunderbird
Whiteboard: [patchlove][has draft patch][gs] → [gs]
Updated•9 years ago
|
Assignee: nobody → kent
Status: NEW → ASSIGNED
Assignee | ||
Comment 41•9 years ago
|
||
Checked in https://hg.mozilla.org/comm-central/rev/33d45705a4d1
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 36.0
Attachment #663903 -
Attachment is obsolete: true
Attachment #665622 -
Attachment is obsolete: true
Updated•9 years ago
|
status-thunderbird36:
--- → fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•