Closed Bug 1284677 Opened 5 years ago Closed 5 years ago

[10.12] Nightly doesn't start on Sierra Beta 2

Categories

(Core :: Memory Allocator, defect, P1)

All
macOS
defect

Tracking

VERIFIED FIXED
mozilla50
Tracking Status
firefox50 --- verified

People

(Reporter: mstange, Unassigned)

References

(Blocks 1 open bug)

Attachments

(1 file)

On 10.12 Beta 2, Nightly doesn't start. It hangs with this callstack, caught in a loop:

> * frame #0: 0x00007fffc41b9842 libsystem_malloc.dylib`malloc_zone_register_while_locked + 342
>   frame #1: 0x00007fffc41bdfc4 libsystem_malloc.dylib`malloc_zone_register + 58
>   frame #2: 0x0000000100094028 libmozglue.dylib`register_zone + 376
>   frame #3: 0x000000010001c69b dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 385
>   frame #4: 0x000000010001c89e dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
>   frame #5: 0x00000001000181da dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 338
>   frame #6: 0x0000000100018171 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 233
>   frame #7: 0x0000000100017254 dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 138
>   frame #8: 0x00000001000172e9 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 75
>   frame #9: 0x000000010000947a dyld`dyld::initializeMainExecutable() + 195
>   frame #10: 0x000000010000d7c0 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 3928
>   frame #11: 0x0000000100008249 dyld`dyldbootstrap::start(macho_header const*, int, char const**, long, macho_header const*, unsigned long*) + 470
>   frame #12: 0x0000000100008036 dyld`_dyld_start + 54

It also prints this error:

> firefox(2051,0x7fffcca583c0) malloc: *** malloc_zone_unregister() failed for 0x7fffcca4f548

Firefox Developer Edition starts up normally.

I tried to get a regression range with mozregression, which uses Nightly, and ended up with the following regression range:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e9b2b72f4e6c&tochange=f2f81e83f4ff

This range includes two patches related to jemalloc, bug 1014300 and bug 1014308.
From IRC discussion:

> firefox(2051,0x7fffcca583c0) malloc: *** malloc_zone_unregister() failed for 0x7fffcca4f548

This comes from this call:
https://dxr.mozilla.org/mozilla-central/source/memory/build/replace_malloc.c#500


And that:
>   frame #2: 0x0000000100094028 libmozglue.dylib`register_zone + 376

is:
https://dxr.mozilla.org/mozilla-central/source/memory/build/replace_malloc.c#514

As mentioned on IRC, this is going to be hard to fix without the source for libmalloc from 10.12, which is not on opensource.apple.com yet.

The same code we have in replace_malloc.c is also in upstream jemalloc, so this affects upstream jemalloc too and, by extension, anything that includes it, such as the Rust compiler.
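
For readers following along, here is a rough, portable model of why the startup loop in register_zone might never finish. It is a sketch under assumptions, not the code from replace_malloc.c: the `zone` and `registry` types and the `promote`, `lookup_first` and `lookup_wrapper` functions are hypothetical stand-ins, and the idea that 10.12 reports a default zone that is not in the registered list is taken from the upstream jemalloc discussion rather than from libmalloc source. The loop is capped so the model terminates where the real one would spin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ZONES 8

/* Hypothetical stand-in for malloc_zone_t; only pointer identity matters. */
typedef struct zone { const char *name; } zone;

/* Toy registry: zones[0] plays the role of the effective default zone. */
typedef struct registry {
    zone *zones[MAX_ZONES];
    size_t count;
} registry;

static void zone_register(registry *r, zone *z) {
    assert(r->count < MAX_ZONES);
    r->zones[r->count++] = z;
}

/* Moves the last zone into the freed slot (assumed to mirror OS X >= 10.6).
 * Returns false when z is not registered, the "unregister failed" case. */
static bool zone_unregister(registry *r, zone *z) {
    for (size_t i = 0; i < r->count; i++) {
        if (r->zones[i] == z) {
            r->zones[i] = r->zones[r->count - 1];
            r->count--;
            return true;
        }
    }
    return false;
}

/* Models a lookup in the style of malloc_default_zone(). */
typedef zone *(*default_lookup)(const registry *);

/* Assumed pre-10.12 behaviour: the reported default is the first zone. */
static zone *lookup_first(const registry *r) {
    return r->count > 0 ? r->zones[0] : NULL;
}

/* Assumed 10.12 behaviour: a special zone that is never in the list. */
static zone wrapper_zone = { "wrapper" };
static zone *lookup_wrapper(const registry *r) {
    (void)r;
    return &wrapper_zone;
}

/* Re-registers the reported default until `mine` is reported as default.
 * Returns the iteration count, or -1 if the cap is hit (the modelled hang). */
static int promote(registry *r, zone *mine, default_lookup lookup,
                   int max_iters) {
    zone_register(r, mine);
    for (int i = 1; i <= max_iters; i++) {
        zone *def = lookup(r);
        zone_unregister(r, def); /* fails when def is not in the list */
        zone_register(r, def);
        if (lookup(r) == mine) {
            return i;
        }
    }
    return -1;
}
```

With `lookup_first`, one pass moves the custom zone to the front and the loop ends. With `lookup_wrapper`, the exit condition can never become true, which would be consistent with the hang and the malloc_zone_unregister() failure reported above.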
Component: Widget: Cocoa → Memory Allocator
Comment on attachment 8769049 [details]
Bug 1284677 - Change how the default OSX malloc zone is found.

Can you double check this fixes your issue? It does fix my testcase with your libsystem_malloc.dylib (which allowed me to reproduce on 10.11)

Review note: This applies a part of upstream 847ff22 (https://github.com/jemalloc/jemalloc/commit/847ff223dedad6b0f5186f904c817c0306ce599f) along with the changes from the corresponding jemalloc PR for this bug (https://github.com/jemalloc/jemalloc/pull/427), in a squashed form.

Our copy of jemalloc4 will need the same change, but I'll see if that can be pushed along with an upstream update.
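
As a minimal sketch of the idea behind "change how the default zone is found", assuming the patch follows the upstream jemalloc change referenced in the review note: instead of trusting the reported default zone, take the first entry of the registered-zone list and fall back to the reported default only when the list is empty. The `zone` type and `find_default_zone` are hypothetical names for illustration. On macOS the list would presumably come from malloc_get_all_zones() and the fallback from malloc_default_zone().

```c
#include <stddef.h>

/* Hypothetical stand-in for malloc_zone_t; only pointer identity matters. */
typedef struct zone { const char *name; } zone;

/*
 * Prefer the first registered zone. Fall back to the zone reported as
 * default only when no list of registered zones could be obtained.
 */
static zone *find_default_zone(zone *const *zones, size_t num_zones,
                               zone *reported_default) {
    if (zones != NULL && num_zones > 0) {
        return zones[0];
    }
    return reported_default;
}
```

The point of the fallback is that a failed or empty zone listing still yields a usable zone, so behaviour on older systems stays the same as before.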
Attachment #8769049 - Flags: feedback?(mstange)
For clarification, the question was for mstange and the review note for njn.
Comment on attachment 8769049 [details]
Bug 1284677 - Change how the default OSX malloc zone is found.

Yes it does, thanks!
Attachment #8769049 - Flags: feedback?(mstange) → feedback+
Duplicate of this bug: 1285459
Severity: normal → major
Duplicate of this bug: 1285366
See Also: → 1285766
Duplicate of this bug: 1285766
Comment on attachment 8769049 [details]
Bug 1284677 - Change how the default OSX malloc zone is found.

https://reviewboard.mozilla.org/r/63030/#review60218

rs=me
Attachment #8769049 - Flags: review?(n.nethercote) → review+
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/autoland/rev/7968a51e61a6
Change how the default OSX malloc zone is found. r=njn
Priority: -- → P1
https://hg.mozilla.org/mozilla-central/rev/7968a51e61a6
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Verified FIXED with today's nightly build: Build identifier: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:50.0) Gecko/20100101 Firefox/50.0
Status: RESOLVED → VERIFIED
I am still seeing this with the latest Nightly on Sierra (16A254g). The callstack looks very similar:

firefox(3428,0x112bd93c0) malloc: *** malloc_zone_unregister() failed for 0x112b13000

It just hangs and never launches.
(In reply to Benjamin Kerensa [:bkerensa] from comment #13)
> I am still seeing this with the latest Nightly on Sierra (16A254g). The
> callstack looks very similar:
> 
> firefox(3428,0x112bd93c0) malloc: *** malloc_zone_unregister() failed for
> 0x112b13000
> 
> It just hangs and never launches.

Can you send me libsystem_malloc.dylib from your system?
Updating flags based on Comment 12.
Depends on: 1286613