Closed Bug 837438 Opened 11 years ago Closed 8 years ago

Thunderbird OOM crash mozalloc_abort(char const* const) | NS_DebugBreak_P | nsAString_internal::SetCapacity(unsigned int)

Categories

(MailNews Core :: Filters, defect)

x86
Windows 7
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: wsmwk, Unassigned)

References

(Depends on 1 open bug, )

Details

(Keywords: crash, topcrash-thunderbird, Whiteboard: [gs][startupcrash])

Crash Data

Muriel crashes. reported at 
report bp-9f7763fb-ace1-4e8e-99ad-4b6252130131 .
============================================================= 
0	mozalloc.dll	mozalloc_abort	memory/mozalloc/mozalloc_abort.cpp:23
1	xul.dll	NS_DebugBreak_P	xpcom/base/nsDebugImpl.cpp:410
2	xul.dll	nsAString_internal::SetCapacity	xpcom/string/src/nsTSubstring.cpp:533
3	xul.dll	nsAString_internal::SetLength	xpcom/string/src/nsTSubstring.cpp:582
4	xul.dll	EnsureStringLength<nsAString_internal>	objdir-tb/mozilla/dist/include/nsReadableUtils.h:358
5	xul.dll	nsBinaryInputStream::ReadString	xpcom/io/nsBinaryStream.cpp:665
6	xul.dll	nsXBLPrototypeHandler::Read	content/xbl/src/nsXBLPrototypeHandler.cpp:917
7	xul.dll	nsXBLPrototypeBinding::Read	content/xbl/src/nsXBLPrototypeBinding.cpp:1587
8	xul.dll	nsXBLDocumentInfo::ReadPrototypeBindings	content/xbl/src/nsXBLDocumentInfo.cpp:648
9	xul.dll	nsXBLService::LoadBindingDocumentInfo	content/xbl/src/nsXBLService.cpp:1007
10	xul.dll	nsXBLService::GetBinding	content/xbl/src/nsXBLService.cpp:770
11	xul.dll	nsXBLService::GetBinding	content/xbl/src/nsXBLService.cpp:743
12	xul.dll	nsXBLService::LoadBindings	content/xbl/src/nsXBLService.cpp:505
13	xul.dll	nsCSSFrameConstructor::AddFrameConstructionItemsInternal	layout/base/nsCSSFrameConstructor.cpp:5173
14	xul.dll	nsCSSFrameConstructor::AddFrameConstructionItems	layout/base/nsCSSFrameConstructor.cpp:5116
15	xul.dll	nsCSSFrameConstructor::ConstructFrame	layout/base/nsCSSFrameConstructor.cpp:5055
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
Scoobidiver, I'm deduping. sorry I didn't comment earlier ... I was still thinking

For the moment I would rather pursue this separate from bug 802152, as I am skeptical there is a close tie to the Firefox crashes. And, some of those core bugs don't seem to make good progress for Thunderbird.  Plus, the history of this crash signature in Thunderbird is not simple, as we will see from the history below. Perhaps there are multiple stacks and signatures involved

A. This bug is based principally on https://getsatisfaction.com/mozilla_messaging/topics/frequent_crash_mozalloc_abort_char_const_const_ns_debugbreak_p_nsastring_internal_setcapacity_unsigned_int

And so, we might focus here on TB17 crashes, which I think are in much lower numbers than TB16 and prior releases.


B. History of signature

there is a rich variety of reports over the past several months:

- many TB16 users report updating to TB17 solved their crash
- one TB17 user reported reinstalling solved his crash
- at least one reporter solved issue by updating quicktext addon

sample citations...

TB16 (resolution unknown) https://getsatisfaction.com/mozilla_messaging/topics/unusual_crash

TB16 solved by update to TB17 https://getsatisfaction.com/mozilla_messaging/topics/crash_reporter_help

TB16+calendar https://getsatisfaction.com/mozilla_messaging/topics/thunderbird_system_crash
Status: RESOLVED → REOPENED
Depends on: 802152
Keywords: qawanted
Resolution: DUPLICATE → ---
Summary: OOM crash in mozalloc_abort → OOM crash mozalloc_abort(char const* const) | NS_DebugBreak_P | nsAString_internal::SetCapacity(unsigned int)
I've been in touch with reporter of bp-1c8860fb-10cd-4f0c-bab0-dd9812130219 - david.
he has a ton of crash signatures


http://crash-stats.mozilla.com/report/index/bp-160d9909-1c6d-4cb5-bc90-a66222130219
http://crash-stats.mozilla.com/report/index/bp-1c8860fb-10cd-4f0c-bab0-dd9812130219
http://crash-stats.mozilla.com/report/index/bp-6af6a552-4a28-4aca-bdb0-f63da2130219
http://crash-stats.mozilla.com/report/index/bp-29be82b7-dcdd-4418-a831-f0e0c2130218
http://crash-stats.mozilla.com/report/index/bp-c89bef21-b4f5-4a63-b969-dd09b2130218
http://crash-stats.mozilla.com/report/index/bp-5ed1120a-1c55-4971-9fbf-5ebf42130215
http://crash-stats.mozilla.com/report/index/bp-67c23c95-c33e-4e62-ad0d-f24ba2130215
http://crash-stats.mozilla.com/report/index/bp-acbd2b15-20ee-4b07-b9a1-d84422130215
http://crash-stats.mozilla.com/report/index/bp-f4cb4283-8e93-4a3e-b49e-f907a2130214
http://crash-stats.mozilla.com/report/index/bp-6a5e0781-1239-4fda-a025-87fa42130214
http://crash-stats.mozilla.com/report/index/bp-e44fb4c6-76fd-4237-84cb-080c12130214
http://crash-stats.mozilla.com/report/index/bp-41d7d5d0-6eb8-4a36-bee2-06dcf2130214
http://crash-stats.mozilla.com/report/index/bp-98744288-acd0-48fd-bc16-64c5b2130214
http://crash-stats.mozilla.com/report/index/bp-39906a48-d063-4511-9ba5-19ad22130214
http://crash-stats.mozilla.com/report/index/bp-f7f1c11e-a9e9-477c-8a2d-45d882130214
http://crash-stats.mozilla.com/report/index/bp-8ba30fe5-49db-486c-87ec-aac032130214
http://crash-stats.mozilla.com/report/index/bp-6c67405e-4397-4d02-89d7-ee8242130214
http://crash-stats.mozilla.com/report/index/bp-4be6ed22-e221-43e2-a473-016b12130214
which includes signatures
mozalloc_abort | mozalloc_handle_oom | moz_xmalloc | UTF8ToNewUnicode 
mozalloc_abort | mozalloc_handle_oom | moz_xmalloc | mozilla::layers::ThebesLayerBuffer::GetContextForQuadrantUpdate 
gfxTextRunFactory::Release 
mozalloc_abort | mozalloc_handle_oom | moz_xmalloc | ToNewUnicode 
and more

"It never crashes if I don't have an internet connection. All crashes occur while fetching new mails and moving them between folders (using filters)."

so I'm moving this to filters for now
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsAString_internal::SetCapacity(unsigned int)] → [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsAString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity]
Component: General → Filters
Product: Thunderbird → MailNews Core
Need a dev to look at crashes please. 

non-startup examplles
bp-85ca2a6f-089f-45f4-9b17-ad2e32130228
bp-9e0d5053-8c0f-4f21-9551-6ca352130122 
startup examples
bp-6b997731-713a-4554-b7e5-ffdae2130303
bp-3825dee2-0e97-4343-9d4d-d31f22130302

1. startup - it looks like >50% of TB crash are startup. But a non-trivial percentage that are NOT startup.  (Unfortunately I can't really tell what percentage of thunderbird crashes are startup, because socorro throws Firefox into the stats, which I do NOT want!)

2. OS - *odd*, 77% of crashes are win7 (But again, Firefox crashes are included in the stats, which I do not want.)

3. No correlations to addons or dll as far as I can tell
OS: Windows NT → Windows 7
Summary: OOM crash mozalloc_abort(char const* const) | NS_DebugBreak_P | nsAString_internal::SetCapacity(unsigned int) → Thunderbird OOM crash mozalloc_abort(char const* const) | NS_DebugBreak_P | nsAString_internal::SetCapacity(unsigned int)
Whiteboard: [gs] → [gs][startupcrash]
Mark, crash rate might have dropped dramatically between version 16 and 17. But I cannot tell, because version 16 is not selectable for topcrash https://crash-stats.mozilla.com/topcrasher/byversion/Thunderbird/16.0.2 or crashes per user report. Do you have a way to check, or can we enable 16 in socorro for a few days?
Flags: needinfo?(mbanner)
(In reply to Wayne Mery (:wsmwk) from comment #5)
> Do you have a way to check, or can we enable 16 in socorro for a few days?
See https://crash-stats.mozilla.com/query/query?product=Thunderbird&version=Thunderbird%3A16.0.2&do_query=1
Flags: needinfo?(mbanner)
Thanks for suggesting that - don't know why I didn't think of it.
So, it did drop dramatically, from #1 in version 16 to ~#30 in 17.0 and 17.0.2. ANd ~#100 now in version 17.0.3 

So the concern now is why so many people still using version 16, because today there are still many version 16 users crashing.

(It'd allso be interesting to know why the crashes dropped in TB17)
None of the crash stack is in mailnews code.

Some analysis though:

In http://hg.mozilla.org/releases/mozilla-release/annotate/0507d387617c/xpcom/io/nsBinaryStream.cpp#l665 we have

665 if (!EnsureStringLength(aString, length))
666   return NS_ERROR_OUT_OF_MEMORY;

and EnsureStringLength is defined to return an error if the string capacity cannot be set. But the implementation uses infallible methods, so EnsureStringLength in fact aborts the program if it fails.

Naively preventing the crash without understanding the underlying cause could be done by modifying EnsureStringLength in nsReadableUtils.h to use fallible allocation methods. That seems to me to be the correct thing to do, since EnsureStringLength is defined to expect an error return (and not an abort) if the allocation fails. Whether that would be approved by the powers that be that own the string functions, I don't know.
(In reply to Kent James (:rkent) from comment #8)
> None of the crash stack is in mailnews code.
> 
> Some analysis though:
> 
> In
> http://hg.mozilla.org/releases/mozilla-release/annotate/0507d387617c/xpcom/
> io/nsBinaryStream.cpp#l665 we have
> 
> 665 if (!EnsureStringLength(aString, length))
> 666   return NS_ERROR_OUT_OF_MEMORY;
> 
> and EnsureStringLength is defined to return an error if the string capacity
> cannot be set. But the implementation uses infallible methods, so
> EnsureStringLength in fact aborts the program if it fails.
> 
> Naively preventing the crash without understanding the underlying cause
> could be done by modifying EnsureStringLength in nsReadableUtils.h to use
> fallible allocation methods. That seems to me to be the correct thing to do,
> since EnsureStringLength is defined to expect an error return (and not an
> abort) if the allocation fails. Whether that would be approved by the powers
> that be that own the string functions, I don't know.

No, We use infallible allocator, so it means that we crash application process by OOM.
To avoid OOM as possible, we should use ...

- use AString/ACString in IDL if possible.  (AutoString doesn't allocate heap and it can share string data.)
- Learn memory usages on mailnews.

Of course, by bug (ex. forgetting null-terminate), OOM can occur.  But most signatures aren't this situation.
(In reply to Makoto Kato from comment #9)
> (In reply to Kent James (:rkent) from comment #8)
> > ... But the implementation uses infallible methods, so
> > EnsureStringLength in fact aborts the program if it fails.
> > 
> 
> No, We use infallible allocator, so it means that we crash application
> process by OOM.

Aren't you just saying the same thing that I said? Then why the "no"?
Blocks: 843570
(In reply to Kent James (:rkent) from comment #11)
> (In reply to Makoto Kato from comment #9)
> > (In reply to Kent James (:rkent) from comment #8)
> > > ... But the implementation uses infallible methods, so
> > > EnsureStringLength in fact aborts the program if it fails.
> > > 
> > 
> > No, We use infallible allocator, so it means that we crash application
> > process by OOM.
> 
> Aren't you just saying the same thing that I said? Then why the "no"?

sorry.  this is typo.
Depends on: 804742
(In reply to Makoto Kato (:m_kato) from comment #10)
> To avoid OOM as possible, we should use ...
> 
> - use AString/ACString in IDL if possible.  (AutoString doesn't allocate
> heap and it can share string data.)
> - Learn memory usages on mailnews.

do we have bug #s for these?


> Depends on: bug 804742 - EnsureStringLength doesn't work

how, and to what extent should we expect this to help TB24?
Flags: needinfo?(m_kato)
(In reply to Wayne Mery (:wsmwk) from comment #13)
> (In reply to Makoto Kato (:m_kato) from comment #10)
> > To avoid OOM as possible, we should use ...
> > 
> > - use AString/ACString in IDL if possible.  (AutoString doesn't allocate
> > heap and it can share string data.)
> > - Learn memory usages on mailnews.
> 
> do we have bug #s for these?

No.  We should use AString/ACString in IDL for shared string.  Some cases can save memory.


> > Depends on: bug 804742 - EnsureStringLength doesn't work
> 
> how, and to what extent should we expect this to help TB24?

We should use fallible allocator if we can handle OOM as possible.  But we need modify more codes, so tb24 is too short for it.  We need more time.
Flags: needinfo?(m_kato)
#10 crash for TB24.0, counting only ONE crash signature.
Keywords: topcrash
Depends on: 924278
(In reply to Makoto Kato (:m_kato) from comment #14)
> (In reply to Wayne Mery (:wsmwk) from comment #13)
> > (In reply to Makoto Kato (:m_kato) from comment #10)
> > > Depends on: bug 804742 - EnsureStringLength doesn't work
> > 
> > how, and to what extent should we expect this to help TB24?
> 
> We should use fallible allocator if we can handle OOM as possible.  But we
> need modify more codes, so tb24 is too short for it.  It needs more times.

Are you speaking in general within THunderbird?
Or just with respect to what is causing this crash?


(In reply to Makoto Kato (:m_kato) from comment #14)
> (In reply to Wayne Mery (:wsmwk) from comment #13)
> > (In reply to Makoto Kato (:m_kato) from comment #10)
> > > To avoid OOM as possible, we should use ...
> > > 
> > > - use AString/ACString in IDL if possible.  (AutoString doesn't allocate
> > > heap and it can share string data.)
> > > - Learn memory usages on mailnews.
> > 
> > do we have bug #s for these?
> 
> No.  We should use AString/ACString in IDL for shard string.  Some cases can
> save memory.

filed bug  924278
Flags: needinfo?(m_kato)
(In reply to Wayne Mery (:wsmwk) from comment #16)
> (In reply to Makoto Kato (:m_kato) from comment #14)
> > (In reply to Wayne Mery (:wsmwk) from comment #13)
> > > (In reply to Makoto Kato (:m_kato) from comment #10)
> > > > Depends on: bug 804742 - EnsureStringLength doesn't work
> > > 
> > > how, and to what extent should we expect this to help TB24?
> > 
> > We should use fallible allocator if we can handle OOM as possible.  But we
> > need modify more codes, so tb24 is too short for it.  It needs more times.
> 
> Are you speaking in general within THunderbird?
> Or just with respect to what is causing this crash?

General.  Ex. bp-b383bb4f-56d0-459b-a669-242442130930, we should use nsAutoCString instead of nsCString since parameter of GetUsername is ACString.  This uses stack if possible.  So it is less memory allocation.
Flags: needinfo?(m_kato)
A few users have also cited McAfee as causing this crash signature.

I suspect this has morphed in recent versions to one of the signatures like OOM | small
Depends on: 1028720
Removing myslef on all the bugs I'm cced on. Please NI me if you need something on MailNews Core bugs from me.
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsAString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity] → [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | nsAString_internal::SetCapacity(unsigned int)] [@ mozalloc_abort | NS_DebugBreak_P | nsACString_internal::SetCapacity] [@ mozalloc_abort | NS_DebugBreak_P | nsAString_internal::SetCapacity]
(In reply to Kent James (:rkent) from comment #8)
> None of the crash stack is in mailnews code.
> 
> Some analysis though:
> 
> In
> http://hg.mozilla.org/releases/mozilla-release/annotate/0507d387617c/xpcom/
> io/nsBinaryStream.cpp#l665 we have
> 
> 665 if (!EnsureStringLength(aString, length))
> 666   return NS_ERROR_OUT_OF_MEMORY;
> 
> and EnsureStringLength is defined to return an error if the string capacity
> cannot be set. But the implementation uses infallible methods, so
> EnsureStringLength in fact aborts the program if it fails.

bug 804742 fix (mozilla23) 2013-05-07 should have killed the crashing related to using EnsureStringLength - in fact m_kato is present there :)  So crash will have gone away in TB24


(In reply to Makoto Kato [:m_kato] from comment #14)
> (In reply to Wayne Mery (:wsmwk) from comment #13)
> > (In reply to Makoto Kato (:m_kato) from comment #10)
> > > To avoid OOM as possible, we should use ...
> > > 
> > > - use AString/ACString in IDL if possible.  (AutoString doesn't allocate
> > > heap and it can share string data.)
> > > - Learn memory usages on mailnews.
> > 
> > do we have bug #s for these?
> 
> No.  We should use AString/ACString in IDL for shard string.  Some cases can
> save memory.

Should we still file that bug?  Or has the work already been done since
Status: REOPENED → RESOLVED
Closed: 11 years ago8 years ago
Flags: needinfo?(m_kato)
Keywords: qawanted
Resolution: --- → WORKSFORME
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #21)
> > No.  We should use AString/ACString in IDL for shard string.  Some cases can
> > save memory.
> 
> Should we still file that bug?  Or has the work already been done since

You should file a bug if tb team is necessary.
Flags: needinfo?(m_kato)
(In reply to Makoto Kato [:m_kato] from comment #22)
> (In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment
> #21)
> > > No.  We should use AString/ACString in IDL for shard string.  Some cases can
> > > save memory.
> > 
> > Should we still file that bug?  Or has the work already been done since
> 
> You should file a bug if tb team is necessary.

Ah, I see 3yrs ago after comment 14 I filed bug 924278
You need to log in before you can comment on or make changes to this bug.