The default bug view has changed. See this FAQ.

crash in mozalloc_abort ... nsMsgSearchTerm::MatchBody

RESOLVED FIXED in Thunderbird 18.0

Status

MailNews Core
Search
--
critical
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: wsmwk, Assigned: rkent)

Tracking

({crash, regression, topcrash+})

Trunk
Thunderbird 18.0
x86
Windows NT
crash, regression, topcrash+
Bug Flags:
in-testsuite -

Thunderbird Tracking Flags

(thunderbird15+ fixed, thunderbird16+ fixed, thunderbird17 fixed)

Details

(Whiteboard: [gs][gssolved], crash signature, URL)

Attachments

(1 attachment, 1 obsolete attachment)

This bug was filed from the Socorro interface and is 
report bp-cf3afdb6-5bb7-4c04-9042-080e12120515 .
============================================================= 

15.0a1 is where crash first appears. And seen in 
15.0a2 	20120630
16.0a1 	20120629

0	mozalloc.dll	mozalloc_abort	memory/mozalloc/mozalloc_abort.cpp:79
1	xul.dll	NS_DebugBreak_P	xpcom/base/nsDebugImpl.cpp:404
2	xul.dll	nsACString_internal::SetCapacity	xpcom/string/src/nsTSubstring.cpp:565
3	xul.dll	nsACString_internal::SetLength	xpcom/string/src/nsTSubstring.cpp:614
4	xul.dll	nsMsgSearchTerm::MatchBody	mailnews/base/search/src/nsMsgSearchTerm.cpp:993
5	xul.dll	nsMsgSearchOfflineMail::ProcessSearchTerm	mailnews/base/search/src/nsMsgLocalSearch.cpp:529
6	xul.dll	nsMsgSearchBoolExpression::OfflineEvaluate	mailnews/base/search/src/nsMsgLocalSearch.cpp:172
7	xul.dll	nsMsgSearchBoolExpression::OfflineEvaluate	mailnews/base/search/src/nsMsgLocalSearch.cpp:193
8	xul.dll	nsMsgSearchBoolExpression::OfflineEvaluate	mailnews/base/search/src/nsMsgLocalSearch.cpp:193
9	xul.dll	nsMsgSearchOfflineMail::MatchTerms	mailnews/base/search/src/nsMsgLocalSearch.cpp:720
10	xul.dll	nsMsgSearchOfflineMail::MatchTermsForSearch	mailnews/base/search/src/nsMsgLocalSearch.cpp:364
11	xul.dll	nsMsgSearchOfflineMail::Search	mailnews/base/search/src/nsMsgLocalSearch.cpp:770
12	xul.dll	nsMsgSearchScopeTerm::TimeSlice	mailnews/base/search/src/nsMsgSearchTerm.cpp:1921
13	xul.dll	nsMsgSearchSession::TimeSliceSerial	mailnews/base/search/src/nsMsgSearchSession.cpp:676
14	xul.dll	nsMsgSearchSession::TimerCallback	mailnews/base/search/src/nsMsgSearchSession.cpp:524
15	xul.dll	nsTimerImpl::Fire	xpcom/threads/nsTimerImpl.cpp:508
#1 crash in dailies
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] → [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ]
tracking-thunderbird15: --- → ?
tracking-thunderbird16: --- → ?
(Reporter)

Updated

5 years ago
Summary: crash in mozalloc_abort → crash in mozalloc_abort ... nsMsgSearchTerm::MatchBody
Mark - suggest this is needs to be fixed for TB15 - #2 crash for tb15 beta 

(if one excludes #2 crash arena_dalloc | libsystem_c.dylib@0x2d8c7 coming from a single Mac user)
Keywords: topcrash
suspect also Mac crash TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetLength bp-43aa4258-b5f2-41d4-aa51-1d28b2120806
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ] → [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ] [@ TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_in…
(Assignee)

Comment 4

5 years ago
This is probably a regression from Bug 737164 - Make strings malloc-infallible by default

Without trying to understand the root causes, it might be possible to prevent the crash by switching buf from nsCAutoString to nsCString, or we could catch the OOM error with the new fallible forms of the string function.
I was casually thinking about this earlier, and wondering if this line:

http://hg.mozilla.org/comm-central/annotate/b79072737e40/mailnews/base/search/src/nsMsgSearchTerm.cpp#l993

could be calling SetLength with -1.

What I don't really like is the way the previous function call plays around with the buffer of the nsCAutoString without actually using that object. I don't think that would cause an issue but it doesn't really look nice or obvious.
Oh, on the SetLength issue, I could see a buf such as "=\n" causing the crash (because MsgStripQuotedPrintable would strip this to "" - zero length string and with softLineBreak = 1 would give a -1 to SetLength)

Although I'm not convinced we'd actually get a situation where that buf possible.
this is looking very bad after 24hr ...

#1 crash for TB15, exceeding TB14 topcrash sigs crash rate by about 10x

https://crash-stats.mozilla.com/query/query?product=Thunderbird&version=Thunderbird%3A15.0&range_value=24&range_unit=hours&date=08%2F29%2F2012+14%3A23%3A16&query_search=signature&query_type=contains&query=&reason=&build_id=&process_type=any&hang_type=any&do_query=1  vs  https://crash-stats.mozilla.com/topcrasher/byversion/Thunderbird/14.0
Keywords: topcrash → topcrash+
(Assignee)

Comment 8

5 years ago
The easiest solution to this would be to just use the fallible form of the SetCapacity function, like this example from nsXMLHttpRequest.cpp:

854   if (!mResponseText.SetCapacity(mResponseText.Length() + destBufferLen, fallible_t())) {
855     return NS_ERROR_OUT_OF_MEMORY;
(Assignee)

Comment 9

5 years ago
Created attachment 656540 [details] [diff] [review]
Check for errors

I cannot reproduce this crash, so I'm simply adding checks in this patch. Not asking for review yet since I have not build yet on TB 18, only on TB 17.
TB16 signature for Mac is mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetLength  bp-cade7bef-b36c-4470-93bd-b0f812120809
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ] [@ TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_in… → [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ] [@ TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_in…
Comment on attachment 656540 [details] [diff] [review]
Check for errors

Should we use buf.Length() instead of strlen(buf.get())?
Attachment #656540 - Flags: review?(bugmail)
Also, MsgStripQuotedPrintable ((unsigned char*)buf.get()); is invalid usage for Mozilla code-style. If we need writable access, you have to use BeginWriting()/EndWriting() instread of
Comment on attachment 656540 [details] [diff] [review]
Check for errors

Sorry, I cancel this.  MsgStripQuotedPrintable can causes bug easy.  We should use nsACString by this function.
Attachment #656540 - Flags: review?(bugmail)
I think you'll want to ask :bienvenu, :rkent, :NeilAway, or :irving (reid) for review on this for your next rev of the patch.
(Reporter)

Updated

5 years ago
Blocks: 737164
Duplicate of this bug: 786955
(Assignee)

Comment 16

5 years ago
:asuth I am not a valid reviewer for that patch because a) I wrote it, and b) I am not a legal reviewer.

To comment 12: My preference for this would be to do the most conservative patch possible, with the hopes of getting this into a 15.0.1 release. So though there are aspects to the underlying code I don't really like, any change beyond the more simple check is probably too risky for 15.0.1

I'm going to ask for review, with the one caveat: I don't really understand how mozilla::fallible_t() works, but that seems to be the way this check is done.
(Assignee)

Updated

5 years ago
Attachment #656540 - Flags: review?(neil)
Comment on attachment 656540 [details] [diff] [review]
Check for errors

I'm confused here, because you're checking that you're not passing ~0U to SetLength but you're also calling the fallible version?
(Assignee)

Comment 18

5 years ago
I am doing that because I do not know really why it is crashing, so I am adding as many checks as feasible.

And what I am trying to avoid with the first check is not doing SetLength((uint32_t)(-1))
(Reporter)

Updated

5 years ago
Whiteboard: [gs]
(In reply to Makoto Kato from comment #11)
> 
> Should we use buf.Length() instead of strlen(buf.get())?

Because MsgStripQuotedPrintable alters the underlying string, I suspect buf.Length() might be invalid. strlen() on the underlying buffer gets the altered length.

(In reply to Makoto Kato from comment #13)
> 
> Sorry, I cancel this.  MsgStripQuotedPrintable can causes bug easy.  We
> should use nsACString by this function.

I agree; maybe not for a 15.0.1 patch but certainly for a permanent solution.


I'd rather not hide our logic behind a ternary operator; how about

size_t bufLength = strlen(buf.get());
if ((bufLength > 0) && softLineBreak)
  bufLength -= 1;
buf.setLength(bufLength);

I don't have a strong opinion about whether to also use fallible_t.
(Assignee)

Comment 20

5 years ago
(In reply to Irving Reid (:irving) from comment #19)
> 
> I'd rather not hide our logic behind a ternary operator; how about
> 
> size_t bufLength = strlen(buf.get());
> if ((bufLength > 0) && softLineBreak)
>   bufLength -= 1;
> buf.setLength(bufLength);
>

In the past, I've had reviews rejected because I added extra code like that for explanatory purposes rather than collapse the result into a ternary operator. Neil, before I bother to change this could you tell me your preference?

> I don't have a strong opinion about whether to also use fallible_t.

The reason for including this is that we really don't know what is causing the crash. using -1 for bufLength is one theory, but in case that is wrong don't you think we should include the other option?

I did this patch because of Wayne's flags, not because I am the expert or have some special insight here. Irving feel free to propose your own patch if you like - it might move this forward more quickly.
(In reply to Kent James from comment #20)
> (In reply to Irving Reid from comment #19)
> > I'd rather not hide our logic behind a ternary operator; how about
> > 
> > size_t bufLength = strlen(buf.get());
> > if ((bufLength > 0) && softLineBreak)
> >   bufLength -= 1;
> > buf.setLength(bufLength);
> 
> In the past, I've had reviews rejected because I added extra code like that
> for explanatory purposes rather than collapse the result into a ternary
> operator. Neil, before I bother to change this could you tell me your
> preference?
I slightly prefer this version in the case, as long as you use -- ;-)

> > I don't have a strong opinion about whether to also use fallible_t.
> 
> The reason for including this is that we really don't know what is causing
> the crash. using -1 for bufLength is one theory, but in case that is wrong
> don't you think we should include the other option?
> 
Are we seeing crashes on trunk/aurora, in which case we can tell if it works?
(i.e. I'd rather avoid fallible_t if I can.)
(In reply to neil@parkwaycc.co.uk from comment #21)
> Are we seeing crashes on trunk/aurora, in which case we can tell if it works?

There are roughly 4-5 crashes per day per version for TB16 and Tb17, so the crash is prevalent enough that we should know the impact within a day or two of landing a fix.  Not so for trunk - there are not enough users there - TB17.0a1 had only 1-2 crashes per week for the past month.  If you want to know the impact you are going to have to land it on aurora IMO.
(In reply to Kent James (:rkent) from comment #20)
> In the past, I've had reviews rejected because I added extra code like that
> for explanatory purposes rather than collapse the result into a ternary
> operator.

I will happily join you in fighting those people.

(In reply to neil@parkwaycc.co.uk from comment #21)
> I slightly prefer this version in the case, as long as you use -- ;-)

I was wondering if that would push anyone's button ;->

> Are we seeing crashes on trunk/aurora, in which case we can tell if it works?
> (i.e. I'd rather avoid fallible_t if I can.)

I say we land it without fallible_t, ask standard8 to push it to aurora, and watch crashes for a few days.
Assignee: nobody → kent
Status: NEW → ASSIGNED
(Assignee)

Comment 24

5 years ago
Created attachment 657456 [details] [diff] [review]
Rev2: Consolidation of comments
Attachment #656540 - Attachment is obsolete: true
Attachment #656540 - Flags: review?(neil)
Attachment #657456 - Flags: review?(neil)

Updated

5 years ago
Attachment #657456 - Flags: review?(neil) → review+
(Assignee)

Comment 25

5 years ago
Comment on attachment 657456 [details] [diff] [review]
Rev2: Consolidation of comments

Checked in https://hg.mozilla.org/comm-central/rev/25cd109a6a33
(Assignee)

Updated

5 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Flags: in-testsuite-
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 18.0
Comment on attachment 657456 [details] [diff] [review]
Rev2: Consolidation of comments

standard8?

[Approval Request Comment]
ref comment 23
Attachment #657456 - Flags: approval-comm-aurora?
Duplicate of this bug: 787626
Comment on attachment 657456 [details] [diff] [review]
Rev2: Consolidation of comments

[Triage Comment]
Lets get this landed on aurora and beta for some testing.
Attachment #657456 - Flags: approval-comm-beta+
Attachment #657456 - Flags: approval-comm-aurora?
Attachment #657456 - Flags: approval-comm-aurora+
(Assignee)

Comment 29

5 years ago
Mark, can you please do the beta and aurora checkins?
Checked in:

https://hg.mozilla.org/releases/comm-aurora/rev/b7c36a1813b8
https://hg.mozilla.org/releases/comm-beta/rev/525ce4c74497
status-thunderbird16: --- → fixed
tracking-thunderbird15: ? → +
tracking-thunderbird16: ? → +
Comment on attachment 657456 [details] [diff] [review]
Rev2: Consolidation of comments

[Triage Comment]
Taking for release for a 15.0.1.
Attachment #657456 - Flags: approval-comm-release+
Checked in:

http://hg.mozilla.org/releases/comm-release/rev/b4366b1e750c
status-thunderbird15: --- → fixed

Comment 33

5 years ago
seems to hve been fixed in 15.0.1
(Reporter)

Updated

5 years ago
Duplicate of this bug: 790172
also being reported as solved in getstatisfaction.
crash-stats looks pretty clean.
Status: RESOLVED → VERIFIED
Whiteboard: [gs] → [gs][gssolved]

Updated

5 years ago
Duplicate of this bug: 790282
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ] [@ TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_in… → [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ] [@ TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_in…
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ] [@ TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_in… → [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsACString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity ] [@ TouchBadMemory | mozalloc_abort | NS_DebugBreak_P | nsACString_in…

Comment 37

5 years ago
...coming from bug 790282.

Still seeing this in latest 18.0a1 nightlies (x64 on Win7 x64):

- manually create a simple message filter rule (move email from Inbox to certain subfolder).
- save filter and execute it on inbox.
- tb crashes.
- after restarting tb, the message is *copied* (instead of moved) in the folder specified in the filter. Original message remain in Inbox.

How can I troubleshoot this further?

PS: I'm using mail.serverDefaultStoreContractID @mozilla.org/msgstore/maildirstore;1 but I've had this with the default @mozilla.org/msgstore/berkeleystore;1 too. Don't know if that's useful info, but I thought I should mention it since it deviates from the default config.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
(In reply to klonos from comment #37)
> ...coming from bug 790282.
> 
> Still seeing this in latest 18.0a1 nightlies (x64 on Win7 x64):
> 
> - manually create a simple message filter rule (move email from Inbox to
> certain subfolder).
> - save filter and execute it on inbox.
> - tb crashes.
> - after restarting tb, the message is *copied* (instead of moved) in the
> folder specified in the filter. Original message remain in Inbox.
> 
> How can I troubleshoot this further?
> 
> PS: I'm using mail.serverDefaultStoreContractID
> @mozilla.org/msgstore/maildirstore;1 but I've had this with the default
> @mozilla.org/msgstore/berkeleystore;1 too. Don't know if that's useful info,
> but I thought I should mention it since it deviates from the default config.

Try to catch it in a debugger and bug some devs on #maildev when you do (try rkent, irving or standard8)
please file a new bug
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago5 years ago
Resolution: --- → FIXED
(In reply to klonos from comment #37)
> ...coming from bug 790282.
> 
> Still seeing this in latest 18.0a1 nightlies (x64 on Win7 x64):

You're likely seeing something different, especially as I see no crash reports for this crash on 18.0a1. As suggested, please file a new bug and include your steps to repeat and crash ids (https://support.mozillamessaging.com/en-US/kb/Mozilla-Crash-Reporter#w_viewing-crash-reports).

> PS: I'm using mail.serverDefaultStoreContractID
> @mozilla.org/msgstore/maildirstore;1 but I've had this with the default
> @mozilla.org/msgstore/berkeleystore;1 too. Don't know if that's useful info,
> but I thought I should mention it since it deviates from the default config.

Please also mention this, it is entirely possible we've fixed the default case but not the maildirstore one (or they are just different cases anyway).

Comment 41

5 years ago
Yes, you are right, the title of the other bug marked as duplicate of this one here had a misleading title (using the term "message filter" when it was actually referring to message quicksearch).

I went ahead and filed bug 797659

Updated

5 years ago
status-thunderbird17: --- → fixed
(Reporter)

Updated

4 years ago
Duplicate of this bug: 813903
(Reporter)

Updated

4 years ago
Duplicate of this bug: 833243
You need to log in before you can comment on or make changes to this bug.