Closed Bug 770262 Opened 13 years ago Closed 12 years ago

crash in mozalloc_abort ... nsMsgSearchTerm::MatchBody

Categories

(MailNews Core :: Search, defect)

x86
Windows NT
defect
Not set
critical

Tracking

(thunderbird15+ fixed, thunderbird16+ fixed, thunderbird17 fixed)

RESOLVED FIXED
Thunderbird 18.0
Tracking Status
thunderbird15 + fixed
thunderbird16 + fixed
thunderbird17 --- fixed

People

(Reporter: wsmwk, Assigned: rkent)

References

()

Details

(Keywords: crash, regression, topcrash+, Whiteboard: [gs][gssolved])

Crash Data

Attachments

(1 file, 1 obsolete file)

This bug was filed from the Socorro interface and is report bp-cf3afdb6-5bb7-4c04-9042-080e12120515 . ============================================================= 15.0a1 is where crash first appears. And seen in 15.0a2 20120630 16.0a1 20120629 0 mozalloc.dll mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:79 1 xul.dll NS_DebugBreak_P xpcom/base/nsDebugImpl.cpp:404 2 xul.dll nsACString_internal::SetCapacity xpcom/string/src/nsTSubstring.cpp:565 3 xul.dll nsACString_internal::SetLength xpcom/string/src/nsTSubstring.cpp:614 4 xul.dll nsMsgSearchTerm::MatchBody mailnews/base/search/src/nsMsgSearchTerm.cpp:993 5 xul.dll nsMsgSearchOfflineMail::ProcessSearchTerm mailnews/base/search/src/nsMsgLocalSearch.cpp:529 6 xul.dll nsMsgSearchBoolExpression::OfflineEvaluate mailnews/base/search/src/nsMsgLocalSearch.cpp:172 7 xul.dll nsMsgSearchBoolExpression::OfflineEvaluate mailnews/base/search/src/nsMsgLocalSearch.cpp:193 8 xul.dll nsMsgSearchBoolExpression::OfflineEvaluate mailnews/base/search/src/nsMsgLocalSearch.cpp:193 9 xul.dll nsMsgSearchOfflineMail::MatchTerms mailnews/base/search/src/nsMsgLocalSearch.cpp:720 10 xul.dll nsMsgSearchOfflineMail::MatchTermsForSearch mailnews/base/search/src/nsMsgLocalSearch.cpp:364 11 xul.dll nsMsgSearchOfflineMail::Search mailnews/base/search/src/nsMsgLocalSearch.cpp:770 12 xul.dll nsMsgSearchScopeTerm::TimeSlice mailnews/base/search/src/nsMsgSearchTerm.cpp:1921 13 xul.dll nsMsgSearchSession::TimeSliceSerial mailnews/base/search/src/nsMsgSearchSession.cpp:676 14 xul.dll nsMsgSearchSession::TimerCallback mailnews/base/search/src/nsMsgSearchSession.cpp:524 15 xul.dll nsTimerImpl::Fire xpcom/threads/nsTimerImpl.cpp:508
#1 crash in dailies
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] → [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ]
Summary: crash in mozalloc_abort → crash in mozalloc_abort ... nsMsgSearchTerm::MatchBody
Mark - suggest this is needs to be fixed for TB15 - #2 crash for tb15 beta (if one excludes #2 crash arena_dalloc | libsystem_c.dylib@0x2d8c7 coming from a single Mac user)
Keywords: topcrash
suspect also Mac crash TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetLength bp-43aa4258-b5f2-41d4-aa51-1d28b2120806
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ] → [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ] [@ TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_internal:…
This is probably a regression from Bug 737164 - Make strings malloc-infallible by default Without trying to understand the root causes, it might be possible to prevent the crash by switching buf from nsCAutoString to nsCString, or we could catch the OOM error with the new fallible forms of the string function.
I was casually thinking about this earlier, and wondering if this line: http://hg.mozilla.org/comm-central/annotate/b79072737e40/mailnews/base/search/src/nsMsgSearchTerm.cpp#l993 could be calling SetLength with -1. What I don't really like is the way the previous function call plays around with the buffer of the nsCAutoString without actually using that object. I don't think that would cause an issue but it doesn't really look nice or obvious.
Oh, on the SetLength issue, I could see a buf such as "=\n" causing the crash (because MsgStripQuotedPrintable would strip this to "" - zero length string and with softLineBreak = 1 would give a -1 to SetLength) Although I'm not convinced we'd actually get a situation where that buf possible.
The easiest solution to this would be to just use the fallible form of the SetCapacity function, like this example from nsXMLHttpRequest.cpp: 854 if (!mResponseText.SetCapacity(mResponseText.Length() + destBufferLen, fallible_t())) { 855 return NS_ERROR_OUT_OF_MEMORY;
Attached patch Check for errors (obsolete) — Splinter Review
I cannot reproduce this crash, so I'm simply adding checks in this patch. Not asking for review yet since I have not build yet on TB 18, only on TB 17.
TB16 signature for Mac is mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetLength bp-cade7bef-b36c-4470-93bd-b0f812120809
Crash Signature: nsACString_internal::SetLength] → nsACString_internal::SetLength] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetLength]
Comment on attachment 656540 [details] [diff] [review] Check for errors Should we use buf.Length() instead of strlen(buf.get())?
Attachment #656540 - Flags: review?(bugmail)
Also, MsgStripQuotedPrintable ((unsigned char*)buf.get()); is invalid usage for Mozilla code-style. If we need writable access, you have to use BeginWriting()/EndWriting() instread of
Comment on attachment 656540 [details] [diff] [review] Check for errors Sorry, I cancel this. MsgStripQuotedPrintable can causes bug easy. We should use nsACString by this function.
Attachment #656540 - Flags: review?(bugmail)
I think you'll want to ask :bienvenu, :rkent, :NeilAway, or :irving (reid) for review on this for your next rev of the patch.
Blocks: 737164
:asuth I am not a valid reviewer for that patch because a) I wrote it, and b) I am not a legal reviewer. To comment 12: My preference for this would be to do the most conservative patch possible, with the hopes of getting this into a 15.0.1 release. So though there are aspects to the underlying code I don't really like, any change beyond the more simple check is probably too risky for 15.0.1 I'm going to ask for review, with the one caveat: I don't really understand how mozilla::fallible_t() works, but that seems to be the way this check is done.
Attachment #656540 - Flags: review?(neil)
Comment on attachment 656540 [details] [diff] [review] Check for errors I'm confused here, because you're checking that you're not passing ~0U to SetLength but you're also calling the fallible version?
I am doing that because I do not know really why it is crashing, so I am adding as many checks as feasible. And what I am trying to avoid with the first check is not doing SetLength((uint32_t)(-1))
(In reply to Makoto Kato from comment #11) > > Should we use buf.Length() instead of strlen(buf.get())? Because MsgStripQuotedPrintable alters the underlying string, I suspect buf.Length() might be invalid. strlen() on the underlying buffer gets the altered length. (In reply to Makoto Kato from comment #13) > > Sorry, I cancel this. MsgStripQuotedPrintable can causes bug easy. We > should use nsACString by this function. I agree; maybe not for a 15.0.1 patch but certainly for a permanent solution. I'd rather not hide our logic behind a ternary operator; how about size_t bufLength = strlen(buf.get()); if ((bufLength > 0) && softLineBreak) bufLength -= 1; buf.setLength(bufLength); I don't have a strong opinion about whether to also use fallible_t.
(In reply to Irving Reid (:irving) from comment #19) > > I'd rather not hide our logic behind a ternary operator; how about > > size_t bufLength = strlen(buf.get()); > if ((bufLength > 0) && softLineBreak) > bufLength -= 1; > buf.setLength(bufLength); > In the past, I've had reviews rejected because I added extra code like that for explanatory purposes rather than collapse the result into a ternary operator. Neil, before I bother to change this could you tell me your preference? > I don't have a strong opinion about whether to also use fallible_t. The reason for including this is that we really don't know what is causing the crash. using -1 for bufLength is one theory, but in case that is wrong don't you think we should include the other option? I did this patch because of Wayne's flags, not because I am the expert or have some special insight here. Irving feel free to propose your own patch if you like - it might move this forward more quickly.
(In reply to Kent James from comment #20) > (In reply to Irving Reid from comment #19) > > I'd rather not hide our logic behind a ternary operator; how about > > > > size_t bufLength = strlen(buf.get()); > > if ((bufLength > 0) && softLineBreak) > > bufLength -= 1; > > buf.setLength(bufLength); > > In the past, I've had reviews rejected because I added extra code like that > for explanatory purposes rather than collapse the result into a ternary > operator. Neil, before I bother to change this could you tell me your > preference? I slightly prefer this version in the case, as long as you use -- ;-) > > I don't have a strong opinion about whether to also use fallible_t. > > The reason for including this is that we really don't know what is causing > the crash. using -1 for bufLength is one theory, but in case that is wrong > don't you think we should include the other option? > Are we seeing crashes on trunk/aurora, in which case we can tell if it works? (i.e. I'd rather avoid fallible_t if I can.)
(In reply to neil@parkwaycc.co.uk from comment #21) > Are we seeing crashes on trunk/aurora, in which case we can tell if it works? There are roughly 4-5 crashes per day per version for TB16 and Tb17, so the crash is prevalent enough that we should know the impact within a day or two of landing a fix. Not so for trunk - there are not enough users there - TB17.0a1 had only 1-2 crashes per week for the past month. If you want to know the impact you are going to have to land it on aurora IMO.
(In reply to Kent James (:rkent) from comment #20) > In the past, I've had reviews rejected because I added extra code like that > for explanatory purposes rather than collapse the result into a ternary > operator. I will happily join you in fighting those people. (In reply to neil@parkwaycc.co.uk from comment #21) > I slightly prefer this version in the case, as long as you use -- ;-) I was wondering if that would push anyone's button ;-> > Are we seeing crashes on trunk/aurora, in which case we can tell if it works? > (i.e. I'd rather avoid fallible_t if I can.) I say we land it without fallible_t, ask standard8 to push it to aurora, and watch crashes for a few days.
Assignee: nobody → kent
Status: NEW → ASSIGNED
Attachment #656540 - Attachment is obsolete: true
Attachment #656540 - Flags: review?(neil)
Attachment #657456 - Flags: review?(neil)
Attachment #657456 - Flags: review?(neil) → review+
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Flags: in-testsuite-
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 18.0
Comment on attachment 657456 [details] [diff] [review] Rev2: Consolidation of comments standard8? [Approval Request Comment] ref comment 23
Attachment #657456 - Flags: approval-comm-aurora?
Comment on attachment 657456 [details] [diff] [review] Rev2: Consolidation of comments [Triage Comment] Lets get this landed on aurora and beta for some testing.
Attachment #657456 - Flags: approval-comm-beta+
Attachment #657456 - Flags: approval-comm-aurora?
Attachment #657456 - Flags: approval-comm-aurora+
Mark, can you please do the beta and aurora checkins?
Comment on attachment 657456 [details] [diff] [review] Rev2: Consolidation of comments [Triage Comment] Taking for release for a 15.0.1.
Attachment #657456 - Flags: approval-comm-release+
seems to hve been fixed in 15.0.1
also being reported as solved in getstatisfaction. crash-stats looks pretty clean.
Status: RESOLVED → VERIFIED
Whiteboard: [gs] → [gs][gssolved]
Crash Signature: nsACString_internal::SetLength] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetLength] → nsACString_internal::SetLength] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetLength] Crash Reports for TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity
Crash Signature: nsACString_internal::SetLength] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetLength] Crash Reports for TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity → nsACString_internal::SetLength] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetLength] [@ TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ]
...coming from bug 790282. Still seeing this in latest 18.0a1 nightlies (x64 on Win7 x64): - manually create a simple message filter rule (move email from Inbox to certain subfolder). - save filter and execute it on inbox. - tb crashes. - after restarting tb, the message is *copied* (instead of moved) in the folder specified in the filter. Original message remain in Inbox. How can I troubleshoot this further? PS: I'm using mail.serverDefaultStoreContractID @mozilla.org/msgstore/maildirstore;1 but I've had this with the default @mozilla.org/msgstore/berkeleystore;1 too. Don't know if that's useful info, but I thought I should mention it since it deviates from the default config.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
(In reply to klonos from comment #37) > ...coming from bug 790282. > > Still seeing this in latest 18.0a1 nightlies (x64 on Win7 x64): > > - manually create a simple message filter rule (move email from Inbox to > certain subfolder). > - save filter and execute it on inbox. > - tb crashes. > - after restarting tb, the message is *copied* (instead of moved) in the > folder specified in the filter. Original message remain in Inbox. > > How can I troubleshoot this further? > > PS: I'm using mail.serverDefaultStoreContractID > @mozilla.org/msgstore/maildirstore;1 but I've had this with the default > @mozilla.org/msgstore/berkeleystore;1 too. Don't know if that's useful info, > but I thought I should mention it since it deviates from the default config. Try to catch it in a debugger and bug some devs on #maildev when you do (try rkent, irving or standard8)
please file a new bug
Status: REOPENED → RESOLVED
Closed: 13 years ago12 years ago
Resolution: --- → FIXED
(In reply to klonos from comment #37) > ...coming from bug 790282. > > Still seeing this in latest 18.0a1 nightlies (x64 on Win7 x64): You're likely seeing something different, especially as I see no crash reports for this crash on 18.0a1. As suggested, please file a new bug and include your steps to repeat and crash ids (https://support.mozillamessaging.com/en-US/kb/Mozilla-Crash-Reporter#w_viewing-crash-reports). > PS: I'm using mail.serverDefaultStoreContractID > @mozilla.org/msgstore/maildirstore;1 but I've had this with the default > @mozilla.org/msgstore/berkeleystore;1 too. Don't know if that's useful info, > but I thought I should mention it since it deviates from the default config. Please also mention this, it is entirely possible we've fixed the default case but not the maildirstore one (or they are just different cases anyway).
Yes, you are right, the title of the other bug marked as duplicate of this one here had a misleading title (using the term "message filter" when it was actually referring to message quicksearch). I went ahead and filed bug 797659
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: