Closed Bug 1286613 Opened 8 years ago Closed 8 years ago

[10.12] Daily crashes on startup in nsAbOSXDirectory::Init on Sierra, via malloc_zone_batch_malloc, and _PFAllocateObjects

Categories

(Core :: Memory Allocator, defect)

All
macOS
defect
Not set
blocker

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
thunderbird_esr45 ? affected
thunderbird_esr52 + fixed
firefox51 --- wontfix
firefox52 --- wontfix
firefox-esr52 --- wontfix
firefox53 --- fixed

People

(Reporter: mstange, Assigned: glandium)

References

Details

(4 keywords, Whiteboard: [startupcrash][regression:TB50])

Crash Data

Attachments

(8 files)

Now that the startup hang has been fixed (bug 1285366), Daily brings up a window and then proceeds to crash. Sample crash report: https://crash-stats.mozilla.com/report/index/838f2d85-9eb0-4e39-8937-2db002160713 The crash happens in malloc_zone_batch_malloc, called by _PFAllocateObjects in CoreData, when initializing the OS X address book. I'm attaching a crash report from OS X which has full symbols.
I don't know if this is related. I don't have crashes, but since 10.12 I have had some Thunderbird spin reports in /Library/Logs/DiagnosticReports, and all of them contain:
Heaviest stack for the main thread of the target process:
62 nsAbOSXDirectory::Init(char const*) + 80 (nsAbOSXDirectory.mm:466,32 in XUL + 595200) [0x1010d8500]
50 +[ABAddressBook sharedAddressBook] + 67 (AddressBook + 60121) [0x7fffc3508ad9]
50 +[ABAddressBook nts_CreateSharedAddressBook] + 48 (AddressBook + 60514) [0x7fffc3508c62]
50 +[ABAddressBook nts_SharedAddressBook] + 141 (AddressBook + 60869) [0x7fffc3508dc5]
50 ABRunWithLock + 190 (AddressBook + 71352) [0x7fffc350b6b8]
50 __38+[ABAddressBook nts_SharedAddressBook]_block_invoke + 24 (AddressBook + 71462) [0x7fffc350b726]
50 -[ABAddressBook(ABAddressBook_CoreData_Private) managedObjectContext] + 109 (AddressBook + 71646) [0x7fffc350b7de]
50 -[ABAddressBook(ABAddressBook_CoreData_Private) nts_managedObjectContextWithStoreDescription:databasePath:loadFailure:] + 284 (AddressBook + 72673) [0x7fffc350bbe1]
50 -[ABPersistentStoreCoordinatorCache coordinatorForAllSources] + 48 (AddressBook + 73076) [0x7fffc350bd74]
50 -[ABPersistentStoreCoordinatorMap nts_coordinatorForAllSourcesAndDidUpdateMap:] + 114 (AddressBook + 73250) [0x7fffc350be22]
50 -[ABPersistentStoreCoordinatorFactory makeCoordinatorForAllAvailableSources] + 68 (AddressBook + 73800) [0x7fffc350c048]
50 ABResultWithAutoreleasePool + 61 (AddressBook + 73867) [0x7fffc350c08b]
50 -[ABPersistentStoreCoordinatorFactory pool_makeCoordinatorForAllAvailableSources] + 95 (AddressBook + 74038) [0x7fffc350c136]
50 -[ABAccountRepository persistentAccounts] + 23 (AddressBook + 75052) [0x7fffc350c52c]
50 -[ABAccountRepository allAccounts] + 208 (AddressBook + 20431) [0x7fffc34fefcf]
50 -[ABAccountRepository runWithLockLoadingExistingAccountsIfNecessary:] + 258 (AddressBook + 20843) [0x7fffc34ff16b]
28 -[ABAccountFactory uncachedAccounts] + 186 (AddressBook + 21666) [0x7fffc34ff4a2]
28 -[ABAccountFactory uncachedLdapAccounts] + 31 (AddressBook + 31933) [0x7fffc3501cbd]
28 -[ABAccountFactory userLDAPAccounts] + 33 (AddressBook + 32066) [0x7fffc3501d42]
28 -[ABXPCACAccountStore allContactsAccounts] + 241 (AddressBook + 32335) [0x7fffc3501e4f]
28 -[CNFuture resultBeforeDate:error:] + 55 (ContactsFoundation + 171653) [0x7fffd0111e85]
28 -[NSConditionLock lockWhenCondition:beforeDate:] + 232 (Foundation + 523827) [0x7fffc750be33]
28 -[NSCondition waitUntilDate:] + 335 (Foundation + 524231) [0x7fffc750bfc7]
28 __psynch_cvwait + 10 (libsystem_kernel.dylib + 105622) [0x7fffd9fb3c96]
*28 psynch_cvcontinue + 0 (pthread + 39290) [0xffffff7f80dcd97a]
And later:
1 calloc + 30 (libsystem_malloc.dylib + 21599) [0x7fffda00a45f] 6
1 malloc_zone_calloc + 87 (libsystem_malloc.dylib + 19230) [0x7fffda009b1e] 6
1 szone_malloc_should_clear + 2888 (libsystem_malloc.dylib + 13597) [0x7fffda00851d] (running) 6
(I can attach the full log if this helps)
Severity: normal → critical
Keywords: crash
20160711120908 is the oldest nightly build with the signature [@ @0x0 | CoreData@0x66f4 ]. Platform versions are 10.12.0 16A239j and 10.12.0 16A238m.
Crash Signature: [@ @0x0 | CoreData@0x66f4 ]
tobbi is seeing this
Crash Signature: [@ @0x0 | CoreData@0x66f4 ] → [@ @0x0 | CoreData@0x66f4 ] [@ @0x0 | CoreData@0x6694 ]
A workaround: unchecking the permission for Daily to access your OS X contacts prevents the crash.
Crash Signature: [@ @0x0 | CoreData@0x66f4 ] [@ @0x0 | CoreData@0x6694 ] → [@ @0x0 | CoreData@0x66f4 ] [@ @0x0 | CoreData@0x6694 ] [@ @0x0 | CoreData@0x6174 ]
Whiteboard: [startupcrash]
Requesting tracking for 52. It would be preferable to not have this issue in beta.
Summary: [10.12] Daily crashes on startup in nsAbOSXDirectory::Init on Sierra → [10.12] Daily crashes on startup in nsAbOSXDirectory::Init on Sierra, via malloc_zone_batch_malloc, and _PFAllocateObjects
libsystem_kernel.dylib@0x19dda another signature for this? bp-9f5e3edb-f566-415b-a80a-11a3c2161202
Ø 0 libsystem_kernel.dylib libsystem_kernel.dylib@0x19dda
...
Ø 8 AddressBook AddressBook@0xf489
9 XUL nsAbOSXDirectory::Init(char const*) /builds/slave/tb-rel-c-esr45-m64_bld-0000000/build/mailnews/addrbook/src/nsAbOSXDirectory.mm:466
10 XUL nsAbManager::GetDirectory(nsACString_internal const&, nsIAbDirectory**) /builds/slave/tb-rel-c-esr45-m64_bld-0000000/build/mailnews/addrbook/src/nsAbManager.cpp:295
(In reply to Thomasy from comment #4)
> A workaround is unchecked the permission for Daily to access your OSX
> contact will prevent the crash.
I can't keep TB up long enough to get to a UI that would let me switch this. Any guidance on how to set this? Also, this is dead-on reproducible on my newly-updated-to-Sierra MacBook; is there any way I can help out?
(In reply to Axel Hecht [:Pike] from comment #9)
> (In reply to Thomasy from comment #4)
> > A workaround is unchecked the permission for Daily to access your OSX
> > contact will prevent the crash.
>
> I don't get tb to be up long enough to get to a UI that would let me switch
> this.
>
> Any guidance on how to set this?
>
> Also, this is dead-on reproducable on my newly-updated-to-sierra macbook,
> any way I can help out?
Head to System Preferences > Security & Privacy and select the Privacy tab. There you can deselect Daily in the list of apps that access your contacts.
Crash Signature: [@ @0x0 | CoreData@0x66f4 ] [@ @0x0 | CoreData@0x6694 ] [@ @0x0 | CoreData@0x6174 ] → [@ @0x0 | CoreData@0x66f4 ] [@ @0x0 | CoreData@0x6694 ] [@ @0x0 | CoreData@0x6174 ] [@ @0x0 | CoreData@0x69d4 ] [@ @0x0 | CoreData@0x6c34 ] [@ libsystem_kernel.dylib@0x19dda ]
Crash Signature: [@ @0x0 | CoreData@0x66f4 ] [@ @0x0 | CoreData@0x6694 ] [@ @0x0 | CoreData@0x6174 ] [@ @0x0 | CoreData@0x69d4 ] [@ @0x0 | CoreData@0x6c34 ] [@ libsystem_kernel.dylib@0x19dda ] → [@ @0x0 | CoreData@0x66f4 ] [@ @0x0 | CoreData@0x6694 ] [@ @0x0 | CoreData@0x6174 ] [@ @0x0 | CoreData@0x69d4 ] [@ @0x0 | CoreData@0x6c34 ] [@ libsystem_kernel.dylib@0x19dda ] [@ libsystem_kernel.dylib@0x19dd6 ]
Mac AB is enabled by default according to bug 397811. We must determine how pervasive this is and what action to take. So far we don't have information about why one Mac user crashes and others do not. Note: standard8 and bienvienu were heavily involved in enabling this feature. See also bug 642549 - Re-review Mac OS X address book integration code and check for leaks
See Also: → 1320048
#2 Mac crash for 45.6.0
rkent, I'm not asking for analysis, only asking that you agree this is an important enough issue to set tracking+. https://crash-stats.mozilla.com/topcrashers/?product=Thunderbird&version=51.0b1 should illustrate the importance; e.g., how often do we have Mac as the #1 topcrash? Unless mstange has insight, finding the Mac person to address this is another matter.
Flags: needinfo?(rkent)
Keywords: steps-wanted
Mstange, do you have a theory for this crash?
I have stopped automatic updates for beta channel because of this crash. Earliest crashes are indeed build 20160713[1], which definitely puts bug 1284677 in the cross hairs.
The crash signature exists on both beta and nightly, but not on aurora 52.0a2. HOWEVER, I don't believe the problem is absent from 52.0a2. I may be misinterpreting the data, but I'm wondering if automatic updates are broken, because the data indicates we have very few Mac users on the aurora channel. [2]
[1] https://crash-stats.mozilla.com/search/?signature=~%400x0%20%7C%20CoreData%400x6&product=Thunderbird&date=%3E%3D2016-07-14T19%3A11%3A48.000Z&date=%3C2016-08-07T19%3A11%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
@0x0 | CoreData@0x6be4
@0x0 | CoreData@0x6134
@0x0 | CoreData@0x6264
@0x0 | CoreData@0x66f4
@0x0 | CoreData@0x6d24
example f364d034-9171-42ed-bd06-cb1e12160802 2016-07-18 17:27:53 @0x0 | CoreData@0x6264
[2] Aurora Mac crashes for past month - https://crash-stats.mozilla.com/search/?release_channel=aurora&product=Thunderbird&platform=Mac%20OS%20X&date=%3E%3D2016-12-14T19%3A39%3A39.000Z&date=%3C2017-01-14T19%3A39%3A39.000Z&_sort=-build_id&_sort=-date&_facets=signature&_facets=version&_columns=date&_columns=signature&_columns=version&_columns=build_id&_columns=user_comments#facet-version
1 51.0a2 55
2 52.0a2 2
3 46.0a2 1
Blocks: 1284677
Severity: critical → blocker
Flags: needinfo?(mstange)
Keywords: regression
Whiteboard: [startupcrash] → [startupcrash][regression:TB50]
(In reply to Thomasy from comment #10)
> (In reply to Axel Hecht [:Pike] from comment #9)
> > ...
> > Also, this is dead-on reproducable on my newly-updated-to-sierra macbook,
> > any way I can help out?
>
> Head to System Preferences > Security & Privacy and select the Privacy tab.
> Here you can select Daily that access your contact.
thomasy, to confirm my suspicions that this reproduces with aurora, can you reproduce this with earlybird from https://ftp.mozilla.org/pub/thunderbird/nightly/latest-comm-aurora/ ?
Flags: needinfo?(thomas)
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #19)
> Mstange, do you have a theory for this crash?
The fact that this only happens on the Daily channel and not on the Earlybird channel, and the fact that we crash during memory allocation, definitely point to jemalloc / replace_malloc, similar to bug 1284677.
> I have stopped automatic updates for beta channel because of this crash.
> Earliest crashes are indeed build 20160713[1], which definitely puts bug
> 1284677 in the cross hairs.
Well, yes, earlier builds didn't crash because they didn't even get that far. They just wouldn't start at all. I don't know how to fix this bug, but glandium might have ideas.
Flags: needinfo?(mstange) → needinfo?(mh+mozilla)
See Also: → 1324424
(In reply to Markus Stange [:mstange] from comment #21)
> (In reply to Wayne Mery (:wsmwk, NI for questions) from comment #19)
> > Mstange, do you have a theory for this crash?
>
> The fact that this only happens on the Daily channel and not on the
> Earlybird channel, and the fact that we crash during memory allocation,
> definitely point to jemalloc / replace_malloc, similar to bug 1284677.
Thanks for the info. To clarify: I now know this crash also happens on the earlybird channel, but it is probably not being reported because of bug 1324424. There I have just noted that, as of this morning, 3 Thunderbird 51.0b1 beta users who crashed with bug 1286613 (this bug) have tried 52.0a2, and all report that they also crash with TB 52.0a2 AND that the crash reporter is crashing.
Other bug reports also mention needing bug 1324892, "Install Mac 10.12 SDK (Sierra) on Mac builders (including for cross-compile)".
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #22)
> To clarify - I now know this crash is also happens for earlybird channel.
That's surprising. Are you sure that's the same crash, and not the Touch Bar restore-from-sleep crash?
(In reply to Markus Stange [:mstange] from comment #24)
> (In reply to Wayne Mery (:wsmwk, NI for questions) from comment #22)
> > To clarify - I now know this crash is also happens for earlybird channel.
>
> That's surprising. Are you sure that's the same crash, and not the Touch Bar
> restore-from-sleep crash?
Same crash? Hard to say, because for everyone I have talked to so far, the crash reporter itself crashes, which sucks :) However, I have at least one non-touch bar user report crash on earlybird. Still waiting on more reports.
This can be reproduced on current Daily too (I am sure I am getting exactly this crash). Note it can only be reproduced when executing Thunderbird with the OSX launcher, not when running via /Applications/Thunderbird.app/Contents/MacOS/thunderbird. If you want to debug this, start Thunderbird normally, then attach with lldb. The crash occurs in https://dxr.mozilla.org/comm-central/rev/b1e7c35b6e09699c5a4aa731874c043983dfef71/mailnews/addrbook/src/nsAbOSXDirectory.mm#466 and is probably due to the jemalloc replacer as mstange mentioned in comment 21. A workaround is to use --disable-replace-malloc (untested), but this doesn't sound like a good idea for production builds. We could disable the OSX integration to save our users from crashing, either completely or, if this only affects specific macOS/OSX versions, selectively. We could of course also show a big warning when enabling it, with a hint on how to recover. As it is unlikely that we will get a jemalloc expert to look at this any time soon, I'd suggest this as a course of action.
(In reply to Philipp Kewisch [:Fallen] from comment #26)
> This can be reproduced on current Daily too (I am sure I am getting exactly
> this crash).
Does it crash immediately at startup or do you have to do anything to crash it? I'm asking because I use the OS X address book only in Thunderbird. I don't have any crashes on 10.12 and am wondering why.
Hi, I am using El Capitan with TB 51 and Lightning 5.3b1, and I reported (in a bug now linked as a dupe) that TB crashed as soon as I updated the app. I reverted to 50 and it was stable. Following advice from another bug, I unchecked the Mac address book link in System Preferences under Security, which works. So I rechecked the Mac address book, and as soon as TB starts it crashes. Reporting this here to answer the question above.
Sigh. This is yet another breakage from changes in Apple's libmalloc, the same that caused bug 1284677.
> A workaround is to use --disable-replace-malloc (untested)
replace-malloc is only enabled on nightly.
Assignee: nobody → mh+mozilla
Component: Address Book → Memory Allocator
Flags: needinfo?(mh+mozilla)
Product: Thunderbird → Core
(In reply to Mike Hommey [:glandium] from comment #29)
> > A workaround is to use --disable-replace-malloc (untested)
>
> replace-malloc is only enabled on nightly.
So there's something fishy here. I would actually expect --disable-replace-malloc to be a workaround, because I just figured out we actually don't enable jemalloc on Sierra by default unless replace malloc is enabled. But since replace-malloc is only enabled on nightly, this bug here shouldn't be happening on non-nightly builds.
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #25)
>...
> However, I have at least one non-touch bar user report crash on earlybird.
glandium, this user is zafar. He writes:
MacBook Pro (Retina, 13-inch, Mid 2014) As my sierra 10.12.2 (16C67) update was done on Jan 4 and it was working after. Last change I recall there was a series of latest MS mac office patches automatically applied. I think a recent 51,0B1 upgrade triggered this.
His first crash is on 01/14, well after updating to 10.12.2: bp-7c191b76-6532-4ab1-8847-1bf872170114
He has three other crash signatures, which are also startup crashes, as do a few other users.
mozilla::layers::LayerManager::LayerUserDataDestroy bp-13ddf333-96f6-4850-b5cb-a48912170114 https://crash-stats.mozilla.com/signature/?signature=mozilla%3A%3Alayers%3A%3ATextureClient%3A%3ALock&date=%3E%3D2017-01-03T13%3A16%3A16.000Z&date=%3C2017-01-17T13%3A16%3A16.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1#reports
mozilla::layers::TextureClient::Lock bp-13ddf333-96f6-4850-b5cb-a48912170114 https://crash-stats.mozilla.com/signature/?product=Thunderbird&signature=mozilla%3A%3Alayers%3A%3ATextureClient%3A%3ALock&date=%3E%3D2017-01-10T13%3A20%3A00.000Z&date=%3C2017-01-17T13%3A20%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1#reports
AppKit@0x3a5126 bp-f9d1e711-0997-4f58-b5e3-3901a2170114
I also have a MacBook Pro (Retina, 13-inch, Mid 2014), and the crash doesn't happen upon restore-from-sleep.
Those are different crashes than the one this bug was filed for.
(In reply to Nomis101 from comment #27)
> (In reply to Philipp Kewisch [:Fallen] from comment #26)
> > This can be reproduced on current Daily too (I am sure I am getting exactly
> > this crash).
>
> Does it crash immediately at startup or do you have to do anything to crash
> it? I'm asking, because I use the OS X address book only in Thunderbird. I
> don't have any crashes on 10.12. and wondering why.
OK, now I can reproduce, but only on TB 53 and on old versions of TB 52. Current versions of TB 52 don't crash for me. I was wondering why I see these crashes for old versions of TB 52 but not for current ones, and tested some versions:
52 11-Nov-2016 crash
52 12-Nov-2016 crash
52 14-Nov-2016 crash
52 15-Nov-2016 OK
52 20-Nov-2016 OK
52 23-Nov-2016 OK
14/15 November is the time when TB 52 switched from comm-central to aurora-central.
Attachment #8827773 - Flags: review?(n.nethercote)
The force_lock/force_unlock functions in the unified zone allocator were wrong.
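For readers unfamiliar with these hooks: the thread does not say what exactly was wrong, but the first patch title ("Properly call mozjemalloc pre/post fork hooks") suggests the zone's force_lock/force_unlock entry points are meant to take the allocator's locks before fork() and release them afterwards. The following is a minimal, portable sketch of that pairing only. Every `toy_` name is hypothetical, and the spin lock merely stands in for the allocator's real internal locks; this is not the code from zone.c.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the allocator's internal lock(s). */
static atomic_flag toy_allocator_lock = ATOMIC_FLAG_INIT;

/* Hypothetical pre-fork hook: take the allocator lock so that no other
 * thread can be in the middle of an allocation while the process forks. */
static void toy_allocator_prefork(void) {
  while (atomic_flag_test_and_set(&toy_allocator_lock)) {
    /* spin until the lock becomes available */
  }
}

/* Hypothetical post-fork hook for the parent: release the lock again. */
static void toy_allocator_postfork_parent(void) {
  atomic_flag_clear(&toy_allocator_lock);
}

/* The zone's lock entry points simply forward to the fork hooks, so a
 * lock request is always matched by the corresponding unlock. */
static void toy_zone_force_lock(void) { toy_allocator_prefork(); }
static void toy_zone_force_unlock(void) { toy_allocator_postfork_parent(); }

/* Test helper: reports whether the lock is held. If it was free, the probe
 * briefly takes it and releases it again. */
static bool toy_allocator_lock_is_held(void) {
  if (atomic_flag_test_and_set(&toy_allocator_lock)) {
    return true;
  }
  atomic_flag_clear(&toy_allocator_lock);
  return false;
}
```

In this model, a force_lock that forgot to call the pre-fork hook, or a force_unlock that called the wrong post-fork hook, would leave the lock state inconsistent across a fork; whether that matches the actual defect is an assumption.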
Comment on attachment 8827772 [details] Bug 1286613 - Properly call mozjemalloc pre/post fork hooks on OSX when replace-malloc is enabled.
https://reviewboard.mozilla.org/r/105376/#review106460
You misspelt "shortcomings" in the log message.
::: memory/build/zone.c:31 (Diff revision 1)
> + * owned by the allocator.
> + */
> +
> +#include <stdlib.h>
> +#include <malloc/malloc.h>
> +#include "mozilla/Assertions.h"
Move these to the top of the file?
Attachment #8827772 - Flags: review?(n.nethercote) → review+
Comment on attachment 8827774 [details] Bug 1286613 - Use the same zone allocator implementation as replace-malloc for mozjemalloc.
https://reviewboard.mozilla.org/r/105380/#review106468
rs=me
::: memory/build/zone.c (Diff revision 1)
> jemalloc_postfork_parent();
> }
>
> #else
>
> -#define JEMALLOC_ZONE_VERSION 6
Is there a reason not to keep the named constant?
Attachment #8827774 - Flags: review?(n.nethercote) → review+
Comment on attachment 8827775 [details] Bug 1286613 - Don't rely on OSX SDK malloc/malloc.h for malloc_zone struct definitions. https://reviewboard.mozilla.org/r/105382/#review106470 rs=me
Attachment #8827775 - Flags: review?(n.nethercote) → review+
Comment on attachment 8827776 [details] Bug 1286613 - Add dummy implementations for most remaining OSX zone allocator functions. https://reviewboard.mozilla.org/r/105384/#review106472 rs=me
Attachment #8827776 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #42)
> ::: memory/build/zone.c:31
> (Diff revision 1)
> > + * owned by the allocator.
> > + */
> > +
> > +#include <stdlib.h>
> > +#include <malloc/malloc.h>
> > +#include "mozilla/Assertions.h"
>
> Move these to the top of the file?
FWIW, I left them there for the diff to be more straightforward.
(In reply to Nicholas Nethercote [:njn] from comment #43)
> > -#define JEMALLOC_ZONE_VERSION 6
>
> Is there a reason not to keep the named constant?
I didn't see a reason to keep it. In the other branch of that #if, it comes from jemalloc 4, and it was removed there. That being said, there's only one place using it now, and it feels to me it's better to have to update the version there, alongside additional changes to the zone struct, rather than modify a macro definition at the top of the file.
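To illustrate why the version number and the struct contents belong together, here is a small sketch. It assumes, as the discussion implies, that the zone struct carries a version field telling callers which of its later fields may be used. The struct, the threshold value, and all `toy_` names are hypothetical and do not reproduce Apple's malloc_zone_t or the code in zone.c.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical versioned callback table. */
typedef struct toy_versioned_zone {
  unsigned version;                 /* which trailing fields are valid */
  void *(*malloc_fn)(size_t size);
  void (*free_fn)(void *ptr);
  /* Only meaningful when version >= TOY_MIN_VERSION_FOR_SIZED_FREE. */
  void (*free_sized_fn)(void *ptr, size_t size);
} toy_versioned_zone;

/* Assumed threshold, chosen only for this illustration. */
#define TOY_MIN_VERSION_FOR_SIZED_FREE 6u

static unsigned toy_sized_free_calls = 0;

static void toy_free_sized(void *ptr, size_t size) {
  (void)size;
  toy_sized_free_calls++;
  free(ptr);
}

/* The version literal sits right next to the fields it vouches for, so
 * adding a field and bumping the version happen in the same place. */
static const toy_versioned_zone toy_old_zone = {
  .version = 5, .malloc_fn = malloc, .free_fn = free, .free_sized_fn = NULL,
};
static const toy_versioned_zone toy_new_zone = {
  .version = 6, .malloc_fn = malloc, .free_fn = free,
  .free_sized_fn = toy_free_sized,
};

/* Caller side: only touch the newer field when the version allows it. */
static void toy_zone_release(const toy_versioned_zone *zone, void *ptr,
                             size_t size) {
  if (zone->version >= TOY_MIN_VERSION_FOR_SIZED_FREE && zone->free_sized_fn) {
    zone->free_sized_fn(ptr, size);
  } else {
    zone->free_fn(ptr);
  }
}
```

If the version were bumped in a macro far from the initializer, it would be easier to claim a version without filling in the fields that version promises, which is the kind of mismatch the comment above seems to want to avoid.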
(In reply to Mike Hommey [:glandium] from comment #46)
> (In reply to Nicholas Nethercote [:njn] from comment #42)
> > ::: memory/build/zone.c:31
> > (Diff revision 1)
> > > + * owned by the allocator.
> > > + */
> > > +
> > > +#include <stdlib.h>
> > > +#include <malloc/malloc.h>
> > > +#include "mozilla/Assertions.h"
> >
> > Move these to the top of the file?
>
> FWIW, I left them there for the diff to be more straightforward.
... but it wasn't because I forgot to turn on copy detection.
Blocks: 1332161
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #20)
> (In reply to Thomasy from comment #10)
> > (In reply to Axel Hecht [:Pike] from comment #9)
> > > ...
> > > Also, this is dead-on reproducable on my newly-updated-to-sierra macbook,
> > > any way I can help out?
> >
> > Head to System Preferences > Security & Privacy and select the Privacy tab.
> > Here you can select Daily that access your contact.
>
> thomasy,
>
> to confirm my suspicions that this reproduces with aurora, can you reproduce
> this with earlybird from
> https://ftp.mozilla.org/pub/thunderbird/nightly/latest-comm-aurora/ ?
Earlybird 52.0a2 from 2017/1/19 does not crash, while today's Daily still crashes.
Flags: needinfo?(thomas)
Comment on attachment 8827772 [details] Bug 1286613 - Properly call mozjemalloc pre/post fork hooks on OSX when replace-malloc is enabled. Mozreview is too confused and doesn't want to change r+ flags, so reflecting the real wanted state on the bugzilla end.
Attachment #8827772 - Flags: review+ → review?(n.nethercote)
Attachment #8827773 - Flags: review?(n.nethercote) → review+
Attachment #8827774 - Flags: review+ → review?(n.nethercote)
Attachment #8828584 - Flags: review?(n.nethercote) → review+
Comment on attachment 8827772 [details] Bug 1286613 - Properly call mozjemalloc pre/post fork hooks on OSX when replace-malloc is enabled. https://reviewboard.mozilla.org/r/105376/#review106868
Attachment #8827772 - Flags: review?(n.nethercote) → review+
Comment on attachment 8827774 [details] Bug 1286613 - Use the same zone allocator implementation as replace-malloc for mozjemalloc. https://reviewboard.mozilla.org/r/105380/#review106870
Attachment #8827774 - Flags: review?(n.nethercote) → review+
Attachment #8828584 - Flags: review?(n.nethercote) → review+
Comment on attachment 8827773 [details] Bug 1286613 - Move replace-malloc zone allocator to a separate file. https://reviewboard.mozilla.org/r/105378/#review106876 Once more with feeling!
Attachment #8827773 - Flags: review?(n.nethercote) → review+
Blocks: 1332508
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/autoland/rev/ee77d1e9cfda Properly call mozjemalloc pre/post fork hooks on OSX when replace-malloc is enabled. r=njn
https://hg.mozilla.org/integration/autoland/rev/b5f2e22b2548 Move replace-malloc zone allocator to a separate file. r=njn
https://hg.mozilla.org/integration/autoland/rev/5cc62bcc4a53 Use the same zone allocator implementation as replace-malloc for mozjemalloc. r=njn
https://hg.mozilla.org/integration/autoland/rev/0b74f21e41cd Don't rely on OSX SDK malloc/malloc.h for malloc_zone struct definitions. r=njn
https://hg.mozilla.org/integration/autoland/rev/07cb49de060b Add dummy implementations for most remaining OSX zone allocator functions. r=njn
https://hg.mozilla.org/integration/autoland/rev/464bdc2b1c2c Update jemalloc 4 to c6943ac. r=njn
Sounds like we probably want this on 52 as well?
Flags: needinfo?(mh+mozilla)
Yes, and same for 1332508, but I wanted to wait for some more baking.
Flags: needinfo?(mh+mozilla)
Wayne asked for my comment a week ago. Yes we should track this.
Flags: needinfo?(rkent)
Blocks: 1332246
I originally raised bug 1330940, reporting that TB 51.0 crashed nearly always on start-up, or otherwise when opening a mail. Lightning 5.3b1 was also installed. My bug was marked as a dupe of this one. I was advised to deselect the address book under Security, and this worked. I have now updated TB to 51.0b2 and rechecked the address book, and it still crashes. I am on OS 10.11.6. Mainly for your info.
(In reply to Antony from comment #70)
> I originally raised a bug 1330940 reporting that TB 51.0 crashed nearly
> always on start-up otherwise opening up a mail.Lightning 5.3b1 was also
> installed. My bug was seen as a dupe to this one. Advised to deselect
> address book under security and this worked. Now updated TB to 51.0b2, just
> rechecked the address book and still crashes. I am on OS 10.11.6. For your
> info mainly
Thanks for the update. Without a crash report/crash ID it's hard to diagnose your problem, but it could be bug 1332246, which we recently refined. It occurs when new mail arrives and, as a result, Mac contacts are accessed. To narrow the possibilities, you would need to
a) have MacOS updated to 10.12.3,
b) run a Thunderbird daily build from https://archive.mozilla.org/pub/thunderbird/nightly/latest-comm-central/
If you crash, hopefully a crash report will have been created.
I sent a crash report through the pop-up crash window when I first encountered this. I asked before for the ID, but never got one, not even just now. My crash is not with new mail but almost always as TB opens the main window and is refreshing to stabilize. So maybe it is linked to checking for new mail, and it crashes whether there is new mail or not. I'll get the daily as noted above and try again for you. I usually use FTP to get my updates (and to reverse them when needed, like 51.0 to 50.0). As for Sierra, 10.12, I'm not really interested... I don't like 10.11 and wondered about going back to 10.9 Mavericks!!!! But as with all things OS these days, PC and Mac, it's a blasted nightmare to reverse... I go back before Windows to DOS and CP/M.
crash IDs linked to my original bug report 1330940..appreciate this particular bug is more 10.12 OSX related, I am on 10.11.6
bp-f809a931-629b-45cc-8711-9d3d32170126 26/01/2017
bp-661292c3-6044-4b54-b9a6-83c422170126 26/01/2017
bp-a4f3dc26-d4a4-47bf-a3fd-0618d2170113 13/01/2017
bp-5dbf6beb-0bc1-4e7d-9e6a-d98d32170113 13/01/2017
bp-5517eef5-208d-4abb-b1f9-fc22b2170113 13/01/2017
ciao
Anthony, this is exactly bug 1332246 which I filed lately.
Cool bananas - apols for delay with reports as did not know where they were. D'oh, should have done. In that case I will follow 1332246 from now and remove me from the mail list here, tx and ciao
(In reply to Mike Hommey [:glandium] from comment #68)
> Yes, and same for 1332508, but I wanted to wait for some more baking.
good to uplift to aurora?
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #76)
> (In reply to Mike Hommey [:glandium] from comment #68)
> > Yes, and same for 1332508, but I wanted to wait for some more baking.
>
> good to uplift to aurora?
This is already on aurora, by virtue of having landed during the 53 cycle. It needs an uplift to beta, though.
Comment on attachment 8827776 [details] Bug 1286613 - Add dummy implementations for most remaining OSX zone allocator functions.
Approval Request Comment
[Feature/Bug causing the regression]: system allocator changes in OSX 10.12
[User impact if declined]: system libraries using some specific system allocator APIs can cause crashes on OSX 10.12. In practice, it probably doesn't affect Firefox except maybe with some obscure addons, but it does affect Thunderbird and I guess Seamonkey because they use the CoreData framework for Address Book data (AIUI), and the CoreData framework calls into some of those system allocator APIs, which crash under Gecko's hooking when they are used.
[Is this code covered by automated tests?]: we don't have automated tests on OSX 10.12, and we don't have uses of the problematic system allocator APIs in place.
[Has the fix been verified in Nightly?]: I checked with try builds before landing.
[Needs manual test from QE? If yes, steps to reproduce]: I guess the Thunderbird side should do some checking. I don't think we need to check anything on the Firefox side.
[List of other uplifts needed for the feature/fix]: All the patches in this bug
[Is the change risky?]: Yes and no.
[Why is the change risky/not risky?]: The cumulation of all the patches in this bug is rather intrusive in that it changes how mozjemalloc is hooked with the system allocator to share the hooking system that has been used for replace-malloc for a long time. The hooking system used for replace-malloc is enabled on nightly only, and the one used for mozjemalloc was used on other branches, including aurora. So the patches didn't really do anything risky (except fixing the bug) on nightly, because they didn't change the hooking there: it is the hooking that was used anyway. However, when the changes rode the trains to aurora, that made mozjemalloc use the shared hooking instead, and if it had caused problems, we would have heard of it already. (Plus, I had done several central-as-beta pushes to try). IOW, there is no(t supposed to be any) difference in configuration between aurora and beta that would cause the code to behave differently on beta.
[String changes made/needed]: None
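As a rough illustration of the failure mode described in the approval request, the sketch below models an allocator zone as a table of callbacks. It assumes that a system-library caller may invoke an optional entry point (here a batch allocation) without checking whether the zone provides it, and that the `@0x0` crash signatures are consistent with a call through an unset pointer; neither assumption is stated in this bug. All `toy_` names are hypothetical, and the "dummy" implementations are a guess at the general shape of such a fix, not the landed patch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy model of an allocator zone: a table of callbacks. This is a
 * hypothetical type, not Apple's malloc_zone_t. */
typedef struct toy_zone {
  void *(*malloc_fn)(struct toy_zone *zone, size_t size);
  void (*free_fn)(struct toy_zone *zone, void *ptr);
  /* Optional batch entry points. A zone that leaves them NULL relies on
   * every caller checking before it calls. */
  unsigned (*batch_malloc)(struct toy_zone *zone, size_t size,
                           void **results, unsigned num_requested);
  void (*batch_free)(struct toy_zone *zone, void **to_free, unsigned num);
} toy_zone;

static void *toy_malloc(toy_zone *zone, size_t size) {
  (void)zone;
  return malloc(size);
}

static void toy_free(toy_zone *zone, void *ptr) {
  (void)zone;
  free(ptr);
}

/* Dummy batch allocation: loop over the single-object allocator and
 * report how many objects were actually obtained. */
static unsigned toy_batch_malloc(toy_zone *zone, size_t size, void **results,
                                 unsigned num_requested) {
  unsigned i;
  for (i = 0; i < num_requested; i++) {
    results[i] = zone->malloc_fn(zone, size);
    if (!results[i]) {
      break;
    }
  }
  return i;
}

/* Dummy batch free: release each object and clear its slot. */
static void toy_batch_free(toy_zone *zone, void **to_free, unsigned num) {
  for (unsigned i = 0; i < num; i++) {
    zone->free_fn(zone, to_free[i]);
    to_free[i] = NULL;
  }
}

/* Every slot is filled in, so an unchecked caller always has a target. */
static toy_zone toy_zone_instance = {
  toy_malloc, toy_free, toy_batch_malloc, toy_batch_free,
};

/* A caller that does not check for NULL before calling. With an unset
 * batch_malloc slot this would jump to address 0. */
static unsigned toy_allocate_objects(toy_zone *zone, size_t size, void **out,
                                     unsigned n) {
  return zone->batch_malloc(zone, size, out, n);
}
```

With the slots populated, `toy_allocate_objects(&toy_zone_instance, 32, objs, 4)` fills `objs` through the dummy loop instead of crashing; leaving `batch_malloc` as NULL in the initializer would reproduce the null call in this toy model.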
Attachment #8827776 - Flags: approval-mozilla-beta?
Thanks for reminding me that patches are on aurora.
(In reply to Mike Hommey [:glandium] from comment #78)
> Comment on attachment 8827776 [details]
> Bug 1286613 - Add dummy implementations for most remaining OSX zone
> allocator functions.
>
> Approval Request Comment
> ...
> [Needs manual test from QE? If yes, steps to reproduce]: I guess the
> Thunderbird side should do some checking. I don't think we need to check
> anything on the Firefox side.
I can offer the following about Thunderbird:
* Spot checks of libsystem_kernel.dylib@0x19dd6 in 53.0a2 are all bug 1332246, i.e. I'm not finding the exact stack of this bug in recent earlybird crashes.
* CoreData@0x6c34: the last 53.0a1 crash is build 20170120030212
* CoreData@0x69d4, which occurred for 53.0a1 and 52.0a1, has not been seen since the 20170117030214 build
So it seems likely the patches here had a positive effect. Is that sufficient info for green light? OTOH, if first-person user confirmation is needed, that will probably take at least a couple more days and would include 45.x users crashing with libsystem_kernel.dylib@0x19dd6
Flags: needinfo?(mh+mozilla)
Thunderbird Daily still crashes inside nsAbOSXDirectory::Init for me on 10.12, reproducibly a short time (< 1 second) after I focus the main tree view:
https://crash-stats.mozilla.com/report/index/e4f55fc9-d165-4a4b-a333-aaa9e2170131
https://crash-stats.mozilla.com/report/index/fdf6bb3b-eea5-4c51-8439-9fab62170131
This is from the current build from https://ftp.mozilla.org/pub/thunderbird/nightly/latest-comm-central/thunderbird-54.0a1.en-US.mac.dmg which was built on January 25th:
20170125030213
https://hg.mozilla.org/comm-central/rev/0e51f36ee7eb906e9131ea71e0f67d5439bf6667
https://hg.mozilla.org/mozilla-central/rev/6dccae211ae5fec6a1c1244b878ce0b93860154f
And the mozilla-central revision 6dccae211ae5fec6a1c1244b878ce0b93860154f includes both the patches from this bug and from bug 1332508.
(In reply to Markus Stange [:mstange] from comment #80)
> Thunderbird Daily still crashes inside nsAbOSXDirectory::Init for me on
> 10.12, reproducably a short time (< 1 second) after I focus the main tree
> view:
> https://crash-stats.mozilla.com/report/index/e4f55fc9-d165-4a4b-a333-
> aaa9e2170131
> https://crash-stats.mozilla.com/report/index/fdf6bb3b-eea5-4c51-8439-
> 9fab62170131
Those are bug 1332246.
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #79)
> So it seems likely the patches here had a positive effect. Is that
> sufficient info for green light?
Yes, thanks.
Flags: needinfo?(mh+mozilla)
Comment on attachment 8827776 [details] Bug 1286613 - Add dummy implementations for most remaining OSX zone allocator functions. This seems rather too intrusive for beta, especially if the impact on firefox is expected to only be theoretical, sorry.
Attachment #8827776 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
I tried to uplift this onto our beta release branch THUNDERBIRD520b2_2017020901_RELBRANCH. I got some merge conflicts, which weren't too hard to resolve manually. At the end I noticed that this patch set upgrades jemalloc from -4.4.0-0-gf1f76357313e7dcad7262f17a48ff0a2e005fcdc to +4.4.0-3-gc6943acb3c56d1b3d1e82dd43b3fcfeae7771990. However, the version on mozilla52 beta is 4.3.1-0-g0110fa8451af905affd77c3bea0d545fee2251b2. So I don't think it's safe to apply this patch set to the wrong base version. Either we uplift the upgrade from 4.3.1 to 4.4.0 as well, or we forget about this. Nicholas, any advice for us on this? Is it safe to apply to 4.3.1, or which bug did the upgrade from 4.3.1 to 4.4.0?
Flags: needinfo?(n.nethercote)
Sorry, found it: bug 1322027.
Flags: needinfo?(n.nethercote)
I ended up uplifting 12 changesets from 4 bugs to the THUNDERBIRD520b2_2017020901_RELBRANCH branch. Details here: https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?changeset=5f22c17aeac24ce0854f80521e13fb17f601c744