Closed Bug 718575 Opened 8 years ago Closed 2 years ago

OOM crash in pref_savePref @ ToNewCString

Categories

(Core :: Preferences: Backend, defect, critical)

11 Branch
x86
Windows 7
defect
Not set
critical

RESOLVED INCOMPLETE

People

(Reporter: scoobidiver, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: crash, regression, Whiteboard: startupcrash, probably fixed by bug 766173, need to verify)

Crash Data

It's a startup crash that first appeared in 11.0a1/20111124.
It's currently the #11 top crasher in 11.0a2.

Signature 	mozalloc_abort(char const* const) | mozalloc_handle_oom() | ToNewCString(nsACString_internal const&)
UUID	eb9ea0ab-1241-4052-a266-af0302120110
Date Processed	2012-01-10 19:54:12
Uptime	1
Last Crash	35.0 minutes before submission
Install Age	18.6 hours since version was first installed.
Install Time	2012-01-10 09:16:05
Product	Firefox
Version	12.0a1
Build ID	20120109085652
Release Channel	nightly
OS	Windows NT
OS Version	6.1.7600
Build Architecture	x86
Build Architecture Info	GenuineIntel family 6 model 23 stepping 10
Crash Reason	EXCEPTION_BREAKPOINT
Crash Address	0x6dd0195d
Processor Notes 	WARNING: JSON file missing Add-ons
EMCheckCompatibility	False

Frame 	Module 	Signature 	Source
0 	mozalloc.dll 	mozalloc_abort 	memory/mozalloc/mozalloc_abort.cpp:77
1 	mozalloc.dll 	mozalloc_handle_oom 	memory/mozalloc/mozalloc_oom.cpp:54
2 	xul.dll 	ToNewCString 	xpcom/string/src/nsReadableUtils.cpp:323
3 	xul.dll 	pref_savePref 	modules/libpref/src/prefapi.cpp:374
4 	xul.dll 	PL_DHashTableEnumerate 	obj-firefox/xpcom/build/pldhash.cpp:754
5 	xul.dll 	mozilla::Preferences::WritePrefFile 	modules/libpref/src/Preferences.cpp:753
6 	xul.dll 	mozilla::Preferences::SavePrefFileInternal 	modules/libpref/src/Preferences.cpp:692
7 	xul.dll 	mozilla::Preferences::Observe 	modules/libpref/src/Preferences.cpp:378
8 	xul.dll 	nsObserverList::NotifyObservers 	xpcom/ds/nsObserverList.cpp:130
9 	xul.dll 	nsObserverService::NotifyObservers 	xpcom/ds/nsObserverService.cpp:182
10 	xul.dll 	xul.dll@0xb759ab 	
11 	xul.dll 	ScopedXPCOMStartup::~ScopedXPCOMStartup 	toolkit/xre/nsAppRunner.cpp:1108
12 	xul.dll 	XRE_main 	
13 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:107
14 	firefox.exe 	firefox.exe@0x4033 	
15 	firefox.exe 	__tmainCRTStartup 	crtexe.c:594
16 	firefox.exe 	_SEH_epilog4 	
17 	kernel32.dll 	BaseThreadInitThunk 	
18 	ntdll.dll 	__RtlUserThreadStart 	
19 	ntdll.dll 	KiIntSystemCall 	
20 	kernel32.dll 	CreateToolhelp32Snapshot 	

More reports at:
https://crash-stats.mozilla.com/report/list?signature=mozalloc_abort%28char+const*+const%29+|+mozalloc_handle_oom%28%29+|+ToNewCString%28nsACString_internal+const%26%29
I sampled some of these reports. It appears that they are shutdown crashes, although they may be cases where we start up and shut down soon or immediately after. They also appear to be "real" out-of-memory situations, that is, jemalloc is aborting because it can't allocate memory. But at least for the report listed, there appears to be available memory:

"AvailableVirtualMemory": "2034356224"
"TotalVirtualMemory": "2147352576"
"SystemMemoryUsePercentage": "69"
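
For scale, those two virtual memory figures mean the process had used only about 108 MiB of its roughly 2 GiB address space, so about 95% was still free. A minimal sketch of that arithmetic follows; the helper names are made up for illustration and are not from the tree.

```cpp
#include <cassert>
#include <cstdint>

// Percentage of the process's virtual address space that is still free.
double AvailablePercent(uint64_t availableBytes, uint64_t totalBytes) {
  assert(totalBytes != 0);
  return 100.0 * static_cast<double>(availableBytes) /
         static_cast<double>(totalBytes);
}

// Bytes of virtual address space already in use.
uint64_t UsedBytes(uint64_t availableBytes, uint64_t totalBytes) {
  assert(availableBytes <= totalBytes);
  return totalBytes - availableBytes;
}
```

Calling `UsedBytes(2034356224, 2147352576)` gives 112996352 bytes, and `AvailablePercent` with the same arguments gives a value just under 95.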

Without the abort, the effect of this bug would be that the preferences file would not be completely written, which could leave it empty or just stale.

In this case the infallible malloc is working as expected, so if this is a bug we need to look at the data actually being stored in the pref file to see if somebody is trying to store large amounts of data in the file.
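
To make "working as expected" concrete, here is a rough sketch of what an infallible allocation does: it never returns null and aborts the process instead. This is not the actual mozalloc or ToNewCString code; `InfallibleMalloc` and `CopyToNewCString` are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical infallible allocator: returns a valid pointer or aborts.
void* InfallibleMalloc(size_t size) {
  void* p = std::malloc(size);
  if (!p && size != 0) {  // malloc(0) is allowed to return null
    std::fputs("out of memory\n", stderr);
    std::abort();  // the caller never sees a failed allocation
  }
  return p;
}

// Hypothetical analogue of a "copy to new C string" helper built on top of
// the infallible allocator. The caller owns the result and must free() it.
char* CopyToNewCString(const char* src, size_t len) {
  assert(src != nullptr);
  char* out = static_cast<char*>(InfallibleMalloc(len + 1));
  std::memcpy(out, src, len);
  out[len] = '\0';
  return out;
}
```

With a fallible allocator the caller would get a null pointer back and could skip the pref or stop writing, which is the "incomplete or stale file" outcome described above.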
Component: Preferences: Backend → General
QA Contact: preferences-backend → general
As gcp commented in an e-mail:

 * I see that nsExceptionHandler collects GlobalMemoryStatusEx (which has AvailableVirtualMemory, among others), but I can't find this information on the crash report website.  Where is it?

 * I'm concerned that malloc is failing while there's apparently plenty of free memory.  This could be because we're requesting a ton of memory, or it could be because of a bug in jemalloc.

We've seen similar OOM crashes in bug 702217 where we *think* we're only allocating 10 MB or so.  Do we have any way to figure out how much memory is being requested here?  Maybe we could hook infallible malloc into the exception handler and report the requested size?
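
A rough sketch of what "report the requested size" could look like: the OOM handler stashes the size in a global before aborting, and the crash reporter reads it when writing the report. All names here are hypothetical, and this is not the patch from any bug.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical global that a crash reporter could read at crash time.
std::atomic<size_t> gLastOOMRequestSize{0};

// Remember how many bytes the failed allocation asked for.
void RecordOOMSize(size_t requestedSize) {
  gLastOOMRequestSize.store(requestedSize, std::memory_order_relaxed);
}

// Hypothetical OOM handler: record the size, then abort as before.
[[noreturn]] void HandleOOM(size_t requestedSize) {
  RecordOOMSize(requestedSize);
  std::abort();
}

// Hypothetical infallible malloc that passes the size to the handler.
void* InfallibleMallocReportingSize(size_t size) {
  void* p = std::malloc(size);
  if (!p && size != 0) {
    HandleOOM(size);
  }
  return p;
}
```

The point of the design is that the handler itself must not allocate, so it only writes a plain integer that is already in static storage.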
(In reply to Justin Lebar [:jlebar] from comment #2)
> As gcp commented in an e-mail:
> 
>  * I see that nsExceptionHandler collects GlobalMemoryStatusEx (which has
> AvailableVirtualMemory, among others), but I can't find this information on
> the crash report website.  Where is it?

It never made it into the public crash reports unfortunately. If you have access, you can find it by logging in to crash-stats, clicking on the "Raw Dump" tab, then clicking on the link to the raw JSON file at the bottom of the page. There will be "AvailableVirtualMemory" and "TotalVirtualMemory" keys in there. (Note that this is only available on Windows.)

>  * I'm concerned that malloc is failing while there's apparently plenty of
> free memory.  This could be because we're requesting a ton of memory, or it
> could be because of a bug in jemalloc.
> 
> We've seen similar OOM crashes in bug 702217 where we *think* we're only
> allocating 10 MB or so.  Do we have any way to figure out how much memory is
> being requested here?  Maybe we could hook infallible malloc into the
> exception handler and report the requested size?

I thought about this in another bug and wound up filing bug 716638. bsmedberg has a WIP patch there that's not suitable for landing, but it could probably be massaged into a landable state.
Crash Signature: [@ mozalloc_abort(char const* const) | mozalloc_handle_oom() | ToNewCString(nsACString_internal const&)] → [@ mozalloc_abort(char const* const) | mozalloc_handle_oom() | ToNewCString(nsACString_internal const&)] [@ mozalloc_abort(char const* const) | mozalloc_handle_oom(unsigned int) | moz_xrealloc | ToNewCString(nsACString_internal const&)]
Summary: Startup crash in pref_savePref @ mozalloc_abort(char const* const) | mozalloc_handle_oom() | ToNewCString(nsACString_internal const&) → OOM crash in pref_savePref @ ToNewCString
Crash Signature: [@ mozalloc_abort(char const* const) | mozalloc_handle_oom() | ToNewCString(nsACString_internal const&)] [@ mozalloc_abort(char const* const) | mozalloc_handle_oom(unsigned int) | moz_xrealloc | ToNewCString(nsACString_internal const&)] → [@ mozalloc_abort(char const* const) | mozalloc_handle_oom() | ToNewCString(nsACString_internal const&)] [@ mozalloc_abort(char const* const) | mozalloc_handle_oom(unsigned int) | moz_xrealloc | ToNewCString(nsACString_internal const&)] [@ mozalloc_abor…
Here is a crash report in 15.0a2 after the patch of bug 716638 landed: bp-991e494e-38b6-4c9a-b271-f869c2120620.
> Available Page File 4901203968
> OOMAllocationSize 72

There is plenty of available memory; note the 72-byte size class, as in bug 709860 comment 30.
Yes, this is probably caused by bug 766173 and we should try to verify its disappearance along with the other related bugs.
Depends on: 766173
Whiteboard: startupcrash → startupcrash, probably fixed by bug 766173, need to verify
This crash does not seem to have gone away.  But maybe fixing bug 675260 will solve the problem; that bug also seems to be heap corruption.
Depends on: 675260
There are 171 crashes in 15.0.1.
Keywords: topcrash
It spikes in 18.0.1 along with bug 837497. It's currently the #19 top browser crasher in 18.0.1.

Correlations are similar to those of bug 837497:
  mozalloc_abort(char const* const) | mozalloc_handle_oom(unsigned int) | moz_xmalloc | ToNewCString(nsACString_internal const&)|EXCEPTION_BREAKPOINT (748 crashes)
     48% (357/748) vs.   1% (2518/174704) crossriderapp4479@crossrider.com
          0% (0/748) vs.   0% (1/174704) 0.83.15
          0% (0/748) vs.   0% (1/174704) 0.83.20
          0% (1/748) vs.   0% (2/174704) 0.83.33
          0% (0/748) vs.   0% (4/174704) 0.84.36
          0% (0/748) vs.   0% (1/174704) 0.85.40
          0% (0/748) vs.   0% (1/174704) 0.85.42
          0% (0/748) vs.   0% (6/174704) 0.85.43
          0% (3/748) vs.   0% (54/174704) 0.86.44
          7% (53/748) vs.   0% (432/174704) 0.87.66
         26% (198/748) vs.   1% (1240/174704) 0.88.67
         14% (102/748) vs.   0% (776/174704) 0.88.83
     17% (130/748) vs.   4% (7782/174704) plugin@yontoo.com (1.20.00)
     13% (98/748) vs.   1% (1039/174704) crossriderapp4637@crossrider.com
          0% (1/748) vs.   0% (1/174704) 0.83.13
          0% (0/748) vs.   0% (3/174704) 0.83.24
          0% (0/748) vs.   0% (5/174704) 0.86.35
          0% (0/748) vs.   0% (1/174704) 0.86.40
         13% (97/748) vs.   1% (1026/174704) 0.87.43
          0% (0/748) vs.   0% (3/174704) 0.88.58
     14% (104/748) vs.   4% (6219/174704) ffxtlbr@babylon.com
          0% (0/748) vs.   0% (1/174704) 1.1.0
          0% (0/748) vs.   0% (9/174704) 1.1.3
          0% (0/748) vs.   0% (10/174704) 1.1.8
          5% (40/748) vs.   1% (1572/174704) 1.1.9
          2% (18/748) vs.   1% (1188/174704) 1.2.0
          0% (0/748) vs.   0% (2/174704) 1.4.15.4
          6% (46/748) vs.   2% (3437/174704) 1.5.0
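
For anyone reading these rows: assuming the usual reading, "48% (357/748) vs. 1% (2518/174704)" means 357 of the 748 crashes with this signature had the add-on installed, against 2518 of all 174704 crashes. A small sketch of how over-represented an add-on is under that reading (the struct and function names are made up for illustration):

```cpp
#include <cassert>

// One correlation row, under the assumed reading described above.
struct CorrelationRow {
  unsigned signatureWithAddon;  // crashes with this signature that have the add-on
  unsigned signatureTotal;      // all crashes with this signature
  unsigned overallWithAddon;    // all crashes that have the add-on
  unsigned overallTotal;        // all crashes in the sample
};

// Fraction count/total as a double.
double Share(unsigned count, unsigned total) {
  assert(total != 0);
  return static_cast<double>(count) / static_cast<double>(total);
}

// How many times more common the add-on is among crashes with this
// signature than among crashes overall.
double OverRepresentation(const CorrelationRow& row) {
  assert(row.overallWithAddon != 0);
  return Share(row.signatureWithAddon, row.signatureTotal) /
         Share(row.overallWithAddon, row.overallTotal);
}
```

For the crossriderapp4479 row, `OverRepresentation({357, 748, 2518, 174704})` comes out at roughly 33, i.e. the add-on is about 33 times more common in this signature than in crashes overall.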
Component: General → Preferences: Backend
Keywords: topcrash
(In reply to Scoobidiver from comment #9)
> It spikes in 18.0.1 along with bug 837497. It's currently the #19 top
> browser crasher in 18.0.1.

I didn't report this here because I think this actually is another case of whatever bug 837497 really is (nobody has a complete clue, but crossrider tried pushing an update of their framework to see if it helps).
There are fewer than 25 crashes in 19.0.
Keywords: topcrash
Crash Signature: mozalloc_abort(char const* const) | mozalloc_handle_oom(unsigned int) | moz_xmalloc | ToNewCString(nsACString_internal const&)] → mozalloc_abort(char const* const) | mozalloc_handle_oom(unsigned int) | moz_xmalloc | ToNewCString(nsACString_internal const&)] [@ mozalloc_abort | mozalloc_handle_oom | ToNewCString] [@ mozalloc_abort | mozalloc_handle_oom | moz_xrealloc | ToNewCStrin…
This crash bug report hasn't been touched in 5 years. At this point it's unlikely to be useful.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE