Closed Bug 1460838 Opened 2 years ago Closed 2 years ago

MOZ_ASSERT(base_mapped >= base_committed) @ MozJemalloc::jemalloc_stats

Categories

(Core :: Memory Allocator, defect, blocker)

defect
Not set
blocker

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox-esr60 --- fixed
firefox61 --- fixed
firefox62 --- fixed

People

(Reporter: mayhemer, Assigned: glandium)

References

Details

Attachments

(2 files)

m-c (up to date), debug x64 mingw build, windows 10 x64, running from visual studio, clean profile.

all spawned processes soon after startup crash on this assertion:

MOZ_ASSERT(base_mapped >= base_committed);
base_mapped = 0
base_committed = 16384

>	mozglue.dll!Allocator<MozJemallocBase>::jemalloc_stats(0x000000a399bf7de0) Line 4383	C++
 	mozglue.dll!Allocator<ReplaceMallocBase>::jemalloc_stats(0x000000a399bf7de0) Line 51	C++
 	mozglue.dll!jemalloc_stats(0x000000a399bf7de0) Line 51	C++
 	xul.dll!nsMemoryReporterManager::GetHeapOverheadFraction(0x000000a399bf7fd0) Line 2458	C++
 	xul.dll!XPTC__InvokebyIndex() Line 99	Unknown


0 h(id = "MEMORY_HEAP_OVERHEAD_FRACTION", units = 3, amountName = "heapOverheadFraction") ["resource://gre/modules/TelemetrySession.jsm":1064]
1 p(id = "MEMORY_HEAP_OVERHEAD_FRACTION", n = "heapOverheadFraction") ["resource://gre/modules/TelemetrySession.jsm":1075]
2 gatherMemory() ["resource://gre/modules/TelemetrySession.jsm":1090]
    this = [object Object]
3 delayedInit/this._delayedInitTask<() ["resource://gre/modules/TelemetrySession.jsm":1457]
4 InterpretGeneratorResume(gen = [object Generator], val = undefined, kind = "next") ["self-hosted":1264]
5 next(val = undefined) ["self-hosted":1219]
    this = [object Generator]
Are you able to reliably reproduce? If so, can you put a breakpoing on base_alloc and note the sizes passed in there until the crash?
Attached file log.txt
I'm able to reproduce reliably. Here's the log with dumping base_alloc's aSize.

Anything else you need?
Flags: needinfo?(mh+mozilla)
And with this log, the values for base_mapped and base_committed on crash are like in comment 0?
Flags: needinfo?(mh+mozilla) → needinfo?(bdahl)
With the sequence of allocations from the log in comment 2, I get base_mapped = 1048576 and base_committed = 53248... there's no way this can trigger that assertion without some kind of memory corruption happening. Or somehow malloc_init_hard runs after the first allocation? Can you get a stack trace for the first base_alloc, and one for malloc_init_hard (and see whether it happens before or after) ?
I came across with a similar error on my first build (m-c) on a fresh win10 machine. It crashes just few seconds after run at this assertion:

Assertion Failure: base_mapped >= base_committed, @memory/build/mozjemalloc.cpp:4383

base_mapped = 0
base_committed = 24576

>	mozglue.dll!Allocator<MozJemallocBase>::jemalloc_stats(jemalloc_stats_t * aStats) Line 4383	C++
 	mozglue.dll!Allocator<ReplaceMallocBase>::jemalloc_stats(jemalloc_stats_t * arg1) Line 51	C++
 	mozglue.dll!jemalloc_stats(jemalloc_stats_t * arg1) Line 51	C++
 	xul.dll!nsMemoryReporterManager::GetHeapOverheadFraction(__int64 * aAmount) Line 2457	C++
 	xul.dll!_NS_InvokeByIndex() Line 57	Unknown
 	xul.dll!CallMethodHelper::Invoke() Line 1707	C++
malloc_init_hard is called more than once.  I added a data change breakpoint for *malloc_initialized.mValue* and I'm getting the following hits:


(0x3DA0 is the parent process pid.)

* set to true FIRST TIME:

0x3DA0 	mozglue.dll!malloc_init_hard
	mozglue.dll!malloc_init
	mozglue.dll!BaseAllocator::malloc
	mozglue.dll!Allocator<MozJemallocBase>::malloc
	mozglue.dll!Allocator<ReplaceMallocBase>::malloc
	mozglue.dll!je_malloc
	mozglue.dll!moz_xmalloc
	mozglue.dll!operator new
	mozglue.dll!mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128>::{ctor}::__l2::<lambda>
	mozglue.dll!mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128>::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128><>
	mozglue.dll!mozilla::interceptor::WindowsDllPatcherBase<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> >::WindowsDllPatcherBase<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> ><>
	mozglue.dll!mozilla::interceptor::WindowsDllDetourPatcher<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> >::WindowsDllDetourPatcher<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> ><>
	mozglue.dll!mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> >::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> ><>
	mozglue.dll!`dynamic initializer for 'NtDllIntercept''
	ucrtbase.dll!_initterm
	mozglue.dll!dllmain_crt_process_attach
	mozglue.dll!dllmain_dispatch
	ntdll.dll!LdrpCallInitRoutine
	ntdll.dll!LdrpInitializeNode
	ntdll.dll!LdrpInitializeGraphRecurse
	ntdll.dll!LdrpInitializeGraphRecurse
	ntdll.dll!LdrpInitializeProcess
	ntdll.dll!_LdrpInitialize
	ntdll.dll!LdrpInitialize
	ntdll.dll!LdrInitializeThunk
	


followed by allocations:

0x3DA0 base_alloc(unsigned __int64) aSize=0x00000000000006f0
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000020
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040
0x3DA0 base_alloc(unsigned __int64) aSize=0x0000000000000040



* set to BACK TO FALSE:

Main thread, parent process

>	mozglue.dll!std::atomic<unsigned int>::atomic<unsigned int>(0x00000000) Line 175	C++
 	mozglue.dll!mozilla::detail::AtomicBase<unsigned int,2>::AtomicBase<unsigned int,2>(0x00000000) Line 315	C++
 	mozglue.dll!mozilla::Atomic<bool,2,void>::Atomic<bool,2,void>(false) Line 534	C++
 	mozglue.dll!`dynamic initializer for 'malloc_initialized''() Line 541	C++
 	ucrtbase.dll!_initterm()	Unknown
 	mozglue.dll!dllmain_crt_process_attach(0x00007ffa7eff0000, 0x000000c5187ffad0) Line 65	C++
 	mozglue.dll!dllmain_dispatch(0x00007ffa7eff0000, 0x00000001, 0x000000c5187ffad0) Line 194	C++
 	ntdll.dll!LdrpCallInitRoutine()	Unknown
 	ntdll.dll!LdrpInitializeNode()	Unknown
 	ntdll.dll!LdrpInitializeGraphRecurse()	Unknown
 	ntdll.dll!LdrpInitializeGraphRecurse()	Unknown
 	ntdll.dll!LdrpInitializeProcess()	Unknown
 	ntdll.dll!_LdrpInitialize()	Unknown
 	ntdll.dll!LdrpInitialize()	Unknown
 	ntdll.dll!LdrInitializeThunk()	Unknown




followed by an obvious duplicate call to malloc_init_hard (main thread, same parent process):

0x3DA0 	mozglue.dll!malloc_init_hard
	mozglue.dll!malloc_init
	mozglue.dll!BaseAllocator::malloc
	mozglue.dll!Allocator<MozJemallocBase>::malloc
	mozglue.dll!Allocator<ReplaceMallocBase>::malloc
	mozglue.dll!je_malloc
	mozglue.dll!moz_xmalloc
	firefox.exe!operator new
	firefox.exe!std::_Default_allocate_traits::_Allocate
	firefox.exe!std::_Allocate<16,std::_Default_allocate_traits,0>
	firefox.exe!std::allocator<wchar_t>::allocate
	firefox.exe!std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> >::_Reallocate_for<void <lambda>(wchar_t * const, const unsigned __int64, const wchar_t * const),wchar_t const * __ptr64>
	firefox.exe!std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> >::assign
	firefox.exe!std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> >::assign
	firefox.exe!std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> >::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> >
	firefox.exe!mozilla::sandboxing::`dynamic initializer for 'ZONE_IDENTIFIER_STR''
	ucrtbase.dll!_initterm
	firefox.exe!__scrt_common_main_seh
	kernel32.dll!BaseThreadInitThunk
	ntdll.dll!RtlUserThreadStart
Flags: needinfo?(bdahl) → needinfo?(mh+mozilla)
I have Entrian Attach and Spawn Process Catcher X visual studio extensions installed.  Disabling them doesn't solve this.
Note that all allocations before the second call to malloc_init_hard are coming from the frame:

	mozglue.dll!mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128>::{ctor}::__l2::<lambda>
	mozglue.dll!mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128>::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128><>
	mozglue.dll!mozilla::interceptor::WindowsDllPatcherBase<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> >::WindowsDllPatcherBase<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> ><>
	mozglue.dll!mozilla::interceptor::WindowsDllDetourPatcher<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> >::WindowsDllDetourPatcher<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> ><>
	mozglue.dll!mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> >::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,128> ><>
	mozglue.dll!`dynamic initializer for 'NtDllIntercept''



Adding Aaron, since this seems to be related to interceptor::VMSharingPolicyShared
Blocks: 1432653
Flags: needinfo?(aklotz)
(In reply to Honza Bambas (:mayhemer) from comment #8)
> this seems to be related to
> interceptor::VMSharingPolicyShared

wanted to write interceptor::WindowsDllInterceptor
Looks like this is running afoul of the unspecified ordering between `dynamic initializer for 'NtDllIntercept'' and `dynamic initializer for 'malloc_initialized''.

I wonder if we could just leave off malloc_initialized's constructor param and let it begin life as zeroed static data. But maybe that would just move the problem somewhere else. It's kinda scary that the interceptor is doing malloc and locks and stuff during DLL init at all.
sigh. Why can't gcc figure out that it doesn't /need/ a static initializer to set a value to 0? That's not the first time this is happening, and won't be the last, I'm sure. Even more amazing is that the constructor is even constexpr!
Flags: needinfo?(mh+mozilla)
Even more amazing is that gcc doesn't use a static initializer on Linux.
(In reply to Mike Hommey [:glandium] from comment #11)
> sigh. Why can't gcc figure out that it doesn't /need/ a static initializer to set a value to 0?

FWIW clang-cl did this too. And maybe even MSVC (I'm not sure if this is what you used, Yusuf?)
(In reply to David Major [:dmajor] from comment #13)
> FWIW clang-cl did this too.

Er, I'm wrong, clang-cl did not generate an initializer for this. I don't see one in treeherder m-c debug builds (with MSVC) either.
How are you guys building your failing builds? Because doing a mingw builds, I don't get a static initializer either. That's with the mingw-gcc from automation, which is 6.4.

Or are your talking about --disable-optimize msvc builds?
I built with --enable-debug and --disable-optimize
With what? msvc? mingw?
Ok, I confirmed it happens with MSVC with --disable-optimize.
... and changing to `static Atomic<bool> malloc_initialized;` (without the explicit value) doesn't change anything, which is not entirely surprising.
worse, still, while `malloc_initialized` doesn't have a static initializer in opt builds, gInitLock does. That's pretty bad.
although, practically speaking, that probably doesn't cause any problem, contrary to malloc_initialized.
Hmm, I was hoping the fix would come from the other side (don't malloc in dllmain). Is that not feasible?
Even if that's desirable, there could be other things that could allocate during initialization of mozglue, and that would still be a problem. The immediate problem is that mozjemalloc is meant to *not* have static initialization for those variables. The code was explicitly written for that, except MSVC doesn't want to do it. If you want to remove the memory allocation from the interceptor, feel free to file a separate bug.
Assignee: nobody → mh+mozilla
If you're claiming that MSVC is the problem here, could you at least avoid this workaround under clang-cl? (It defines both _MSC_VER and __clang__.)
(In reply to Mike Hommey [:glandium] from comment #24)
> If you want to remove the memory
> allocation from the interceptor, feel free to file a separate bug.

Fine: bug 1463961.
Comment on attachment 8980158 [details]
Bug 1460838 - Avoid static initializers in mozjemalloc with MSVC.

https://reviewboard.mozilla.org/r/246306/#review252438
Attachment #8980158 - Flags: review?(n.nethercote) → review+
(In reply to David Major [:dmajor] from comment #25)
> If you're claiming that MSVC is the problem here, could you at least avoid
> this workaround under clang-cl? (It defines both _MSC_VER and __clang__.)

Let's add a !defined(_clang__) to the condition, then.
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/autoland/rev/b1f8ccb26696
Avoid static initializers in mozjemalloc with MSVC. r=njn
Comment on attachment 8980158 [details]
Bug 1460838 - Avoid static initializers in mozjemalloc with MSVC.

Approval Request Comment
[Feature/Bug causing the regression]: Partially a regression from bug 1412717, partially from bug 1418389.
[User impact if declined]: There might not be direct visible problem, but there's a risk that some internal allocator data is broken wrt early allocations, depending whether MSVC uses static initializers or not. As of the current releases, it /seems/ everything is fine on pgo builds, but better safe than sorry.
[Is the change risky?]: No
[Why is the change risky/not risky?]: For malloc_initialized, we're back to the code before bug 1412717, and we've shipped with that for years. For gInitLock, there is no practical difference from the patch.
Attachment #8980158 - Flags: approval-mozilla-esr60?
Attachment #8980158 - Flags: approval-mozilla-beta?
https://hg.mozilla.org/mozilla-central/rev/b1f8ccb26696
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Comment on attachment 8980158 [details]
Bug 1460838 - Avoid static initializers in mozjemalloc with MSVC.

Harden our code so compilers don't do bad things without warning. Approved for 61.0b8 and ESR 60.1.
Flags: needinfo?(aklotz)
Attachment #8980158 - Flags: approval-mozilla-esr60?
Attachment #8980158 - Flags: approval-mozilla-esr60+
Attachment #8980158 - Flags: approval-mozilla-beta?
Attachment #8980158 - Flags: approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.