Closed Bug 514921 Opened 16 years ago Closed 14 years ago

[10.6.X] Crash [@objc_msgSend | TFSInfo::~TFSInfo() ]

Categories

(Core :: Widget: Cocoa, defect)

x86
macOS
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: mossop, Assigned: smichaud)

References

Details

(Keywords: crash, topcrash, Whiteboard: [10.6][10.6.1][Apple bug])

Crash Data

Flags: blocking1.9.2?
I've also seen a very similar stack in TextWrangler, so I suspect there is an underlying OS problem here.
blocking2.0: --- → ?
Keywords: dogfood
> everytime I use a filepicker on OSX Please be more specific. When you open it? When you close it? When you choose a file? Does the problem happen with the same frequency when you run without extensions?
After the filepicker has been closed, though perhaps not immediately. It happens regardless of whether extensions are installed.
Thanks. Here's another question: I notice you're using the DivX Decoder plugin, which seems to be implicated in another weird crash (bug 509130). What happens when you disable it? (What you've reported about TextWrangler and Adobe apps makes it less likely that the DivX Decoder plugin is causing trouble ... but disabling it is still worth a try.)
And by the way, are you using the file picker to access files over an SMB connection? > After the filepicker has been closed, though perhaps not immediately. After you've chosen a file? Do the crashes also happen when you cancel out?
Happens with all plugins disabled, both when cancelling the picker and when choosing something.
After a few quick tests, I can't reproduce this problem. I tried viewing a file (fairly small) over an SMB connection -- no crash. I also tried opening the file picker to change my download directory, then canceling out -- again no crash. > And by the way, are you using the file picker to access files over > an SMB connection? Please let us know. Adobe thinks the problem is more likely over an SMB connection. They also think it's more likely with larger files.
Marcia, could you test this?
(In reply to comment #10) > > And by the way, are you using the file picker to access files over > > an SMB connection? > > Please let us know. Adobe thinks the problem is more likely over an > SMB connection. Sorry, no this is all local files.
> Marcia, could you test this? And Henrik too, of course :-)
Whiteboard: [10.6 OS issue?]
Almost certainly 10.6: we have 724 crash reports with this signature in the past week, all on 10.6, but seemingly on all branches of Firefox: http://bit.ly/2JJEkS
Summary: Consistent crash after using filepicker [ @ objc_msgSend | TFSInfo:: ~TFSInfo () ] → Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()]
Whiteboard: [10.6 OS issue?] → [10.6]
Yes, I will test this and report back. I also monitor Twitter for 10.6 crash reports, but I have not seen this stack reported yet.
Quite a lot of these are the same issue: http://bit.ly/SSxha
Summary: Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()] → Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease]
Summary: Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease] → [10.6] Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease]
Version: Trunk → unspecified
Keywords: topcrash
Mossop: Something I forgot to ask earlier -- Do you see anything interesting/relevant in the console log?
This came up shortly after the crash, not sure how helpful it is though: 10/09/2009 21:06:35 GrowlHelperApp[226] *** attempt to pop an unknown autorelease pool (0x911a00)
So ... do you still see crashes when you disable Growl?
(In reply to comment #19) > So ... do you still see crashes when you disable Growl? yes
Oh, well :-( Needless to say, we still need better information on how to reproduce this. And perhaps also on what interactions might be taking place with other apps.
(By the way, I suspect the GrowlHelperApp error is a consequence of your crash.)
Mossop: A 10.6.1 update has just appeared on Apple's Software Update. Try installing it and see what happens.
I haven't been able to reproduce this yet; I just updated to 10.6.1. I remember looking at some of the comments from the CFRelease crashes and trying to reproduce, but I didn't have any luck back then. Steven - Does the CF release function have something to do with a file lock?
I suspect both crashes (those whose top level is CFRelease and those whose top level is objc_msgSend) are caused by access to a deleted object (deep in Apple's code). In other words I suspect both are (more or less) the same crash (or at least have the same cause). Of course only the CFRelease crashes that have TFSInfo::~TFSInfo() in the stack are relevant to this bug.
http://blogs.adobe.com/jnack/2009/09/a_few_problems_found_with_ps_sl.html seems to indicate that other programs such as Photoshop are having issues with File open and save operations.
After updating to 10.6.1 I spent a couple of minutes just opening and closing a file dialog with no crash. Unlikely that I'd have been able to do that before so I think we might be able to call this fixed. I'll reopen if it does crash again though. Hopefully the crash reports will start to tail off too.
Status: NEW → RESOLVED
blocking2.0: ? → ---
Closed: 16 years ago
Flags: blocking1.9.2?
Resolution: --- → WORKSFORME
> Steven - Does the CF release function have something to do with a > file lock? Oops, Marcia, I didn't answer this question. The answer is "no". CFRetain and CFRelease (respectively) increase or decrease a CoreFoundation object's reference count. Once the reference count goes to zero, the object will get deleted. I suspect what happened here is that the object in question had already been deleted by the time CFRelease was called on it.
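The retain/release rule described above can be modelled with a toy counter. This is only a sketch in Python, not CoreFoundation: the names `RefCounted` and `OverReleaseError` are hypothetical, and the explicit "destroyed" check exists purely to make the over-release visible. In real code nothing guards the extra call, which then touches memory that may already have been freed.

```python
class OverReleaseError(RuntimeError):
    """Raised when an already-destroyed toy object is retained or released."""


class RefCounted:
    """Toy stand-in for a reference-counted object (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1       # the creator starts out owning one reference
        self.destroyed = False

    def retain(self):
        if self.destroyed:
            raise OverReleaseError(f"retain on destroyed object {self.name!r}")
        self.refcount += 1
        return self

    def release(self):
        if self.destroyed:
            # The situation suspected in this bug: release after deletion.
            raise OverReleaseError(f"release on destroyed object {self.name!r}")
        self.refcount -= 1
        if self.refcount == 0:
            self.destroyed = True   # the object is deleted at this point


obj = RefCounted("file-info")
obj.retain()     # refcount 2
obj.release()    # refcount 1
obj.release()    # refcount 0, object destroyed

try:
    obj.release()            # one release too many
    outcome = "no error"
except OverReleaseError:
    outcome = "over-release detected"

print(outcome)
```

The toy raises an exception where a real process would more likely crash inside the release call or in whatever it calls next, which would be consistent with stacks topped by CFRelease or objc_msgSend.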
The [@objc_msgSend | TFSInfo::~TFSInfo()] crash is still a topcrasher (#56 on OS X with 429 crashes in the last week), and also happens on OS X 10.6.1. So I'm going to have to reopen this bug. (Note that most of the CFRelease crashes are unrelated. As I mentioned in comment #25, only those CFRelease crashes that have TFSInfo::~TFSInfo() in the stack are relevant here.) All these crashes happen on 10.6 or 10.6.1. I'm pretty sure they're ultimately caused by one or more Apple bugs. I've been looking further into them. I'll have more to say in my next comment.
Status: RESOLVED → REOPENED
Keywords: dogfood
Resolution: WORKSFORME → ---
Assignee: nobody → smichaud
Status: REOPENED → ASSIGNED
I suspect the STR aren't accurate for the remaining crashes since I haven't crashed with this signature since installing 10.6.1.
TFSInfo::~TFSInfo() and friends are undocumented functions in the DesktopServicesPriv framework (in /System/Library/PrivateFrameworks/). They're not new with 10.6 (they and the DesktopServicesPriv framework also exist in OS X 10.5.X). As best I can tell all the TFSInfo::~TFSInfo() crashes are on the main thread. But there are also a few seemingly related crashes (on trunk) in secondary threads at TFSInfo::AddPtrReference(). So I suspect the TFSInfo::~TFSInfo() crashes may be caused by an issue with threads. At the bottom of each of the TFSInfo::AddPtrReference() stacks is the symbol start_wqthread -- not the thread_start you normally find at the bottom of secondary-thread stacks. Each of these threads is a "worker thread", created and controlled by what appears to be an undocumented, Apple-only extension of the POSIX pthreads API. This API is different on 10.5 and 10.6. But both OSes have a pthread_workqueue_create_np() function (in libSystem.dylib). I'll call this new API the "pthread_workqueue" API. Of course, nothing in the tree uses the pthread_workqueue API directly. But if you run FF in gdb (on OS X 10.6.1), even in a plain-vanilla environment (no extensions, only standard plugins, fresh profile), and break on pthread_workqueue_create_np, you'll find that this target gets hit several times on startup (during a call to the documented CoreFoundation function CFURLGetFSRef()). Then if you allow FF to finish loading, break into gdb, and do 'thread apply all bt', you'll find one thread whose bottom symbol is start_wqthread. All this implies that we may be able to make progress by running FF in gdb (on OS X 10.6.X) and finding what else hits the pthread_workqueue_create_np target. (I'll continue my remarks in my next comment.)
Note that the TFSInfo::~TFSInfo() and TFSInfo::AddPtrReference() crash stacks all have several threads whose bottom symbol is start_wqthread. (More coming up.)
> All this implies that we may be able to make progress by running FF > in gdb (on OS X 10.6.X) and finding what else hits the > pthread_workqueue_create_np target. Of course we should also try to find out what hits the TFSInfo::~TFSInfo() and TFSInfo::AddPtrReference() targets. But I can't persuade gdb to break on those targets -- probably because their original symbols (in DesktopServicesPriv) have been mangled (using C++ name mangling), and Breakpad doesn't demangle them fully. (I wish I knew how to reverse-engineer C++ name mangling. Anyone have pointers to documents on how to do that?) But the TFSInfo::~TFSInfo() stacks also have a non-mangled NodeContextClose target, and the TFSInfo::AddPtrReference() stacks have a non-mangled __PostNodeTaskRequest_block_invoke_2 target. (Both targets are also in DesktopServicesPriv.) So we could use those targets instead. So to sum up my conclusions so far: We may be able to make progress resolving these crashes by running FF in gdb and having it break on the following targets, then seeing what causes them to be called: pthread_workqueue_create_np, NodeContextClose, and __PostNodeTaskRequest_block_invoke_2. (Continued)
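On the name-mangling question: GCC and clang on OS X follow the Itanium C++ ABI mangling scheme, and tools such as c++filt decode it in full. As a rough illustration of the rules involved, here is a minimal Python sketch that handles only the simplest case (a nested name with an empty parameter list). The function name `demangle_simple` is hypothetical, and the mangled strings below are derived from the ABI rules, not read out of DesktopServicesPriv. The destructor takes no arguments by definition; other members would need their real parameter types, which aren't known here.

```python
import re


def demangle_simple(symbol):
    """Demangle a nested, no-argument member name such as _ZN7TFSInfoD1Ev."""
    # Mach-O symbol tables add one extra leading underscore to every symbol.
    if symbol.startswith("__Z"):
        symbol = symbol[1:]
    if not symbol.startswith("_ZN"):
        raise ValueError("only nested names (_ZN...E) are handled")

    pos = 3
    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        length_match = re.match(r"\d+", symbol[pos:])
        code = symbol[pos:pos + 2]
        if length_match:
            # <source-name> ::= <decimal length> <identifier>
            pos += len(length_match.group())
            length = int(length_match.group())
            parts.append(symbol[pos:pos + length])
            pos += length
        elif parts and code in ("D0", "D1", "D2"):
            # Destructor variants: deleting, complete object, base object.
            parts.append("~" + parts[-1])
            pos += 2
        elif parts and code in ("C1", "C2", "C3"):
            # Constructor variants share the enclosing class's name.
            parts.append(parts[-1])
            pos += 2
        else:
            raise ValueError(f"unsupported construct at offset {pos}")

    # "E" closes the nested name; "v" encodes an empty parameter list.
    if symbol[pos:] != "Ev":
        raise ValueError("only empty parameter lists (Ev) are handled")
    return "::".join(parts) + "()"


print(demangle_simple("__ZN7TFSInfoD1Ev"))
```

Read in the other direction, the same rules say that a breakpoint on the raw symbol for the complete-object destructor would use `_ZN7TFSInfoD1Ev`. Whether gdb accepts that on a given system is untested here.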
Finally, we've got to ask ourselves why these crashes have started happening on OS X 10.6.X. TFSInfo::~TFSInfo() and friends, and pthread_workqueue_create_np() and friends, existed on OS X 10.5.X. But as far as we know they don't cause any problems there. My hunch is we're looking for something new on OS X 10.6 that has to do with threading. An obvious candidate is "Grand Central Dispatch" (http://developer.apple.com/mac/articles/cocoa/introblocksgcd.html). Among other things it implements "dispatch queues", for which it presumably uses worker threads (and the pthread_workqueue API). FF doesn't (yet) use Grand Central Dispatch. And it's likely that (for now) only Apple apps and libraries use it. I don't yet know what symbols to look for in a library to show that it uses Grand Central Dispatch. But I'm going to be working on that.
(In reply to comment #30) > I suspect the STR aren't accurate for the remaining crashes since I > haven't crashed with this signature since installing 10.6.1. I suspect you're right, so I'll rename this bug.
Summary: [10.6] Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease] → [10.6.X] Crashes [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease]
Whiteboard: [10.6] → [10.6][10.6.1]
Here's something that may or may not be relevant: http://www.openradar.appspot.com/6332143 It's a bug, seemingly in worker threads and dispatch queues, on OS X 10.5.5. The bug is triggered by using the Cocoa NSOperationQueue and NSInvocationOperation classes -- which don't appear to be involved with this bug (bug 514921). This may indicate that the pthread_workqueue API was already fragile in OS X 10.5.X.
(Following up comment #6) > Looks like this problem is fairly widespread: > > http://www.cocoabuilder.com/archive/message/cocoa/2009/9/7/244557 > http://kb2.adobe.com/cps/506/cpsid_50654.html The first link is definitely the same crash (in another app). For some reason Google doesn't work properly searching on "TFSInfo::~TFSInfo()" or "TFSInfo::~TFSInfo". But if you search on "TNode::IsUnresolved()", you get *lots* of hits for this bug's crash, in *many* different apps. Here are a few links: http://discussions.apple.com/thread.jspa?threadID=2147095&tstart=0 http://code.google.com/p/chromium/issues/detail?id=24326 http://forums.adobe.com/thread/499962 http://forum.videolan.org/viewtopic.php?f=12&t=64812&p=216400 I don't think there can be any further doubt that this is an Apple bug.
Apple's Grand Central Dispatch reference is at http://developer.apple.com/mac/library/documentation/Performance/ReferenceGCD_libdispatch_Ref/Reference/reference.html. I picked a symbol from that reference (dispatch_get_global_queue) and grepped the following directories for matches (on an OS X 10.6.1 partition): /System/Library/Frameworks, /System/Library/PrivateFrameworks, and /Applications. Here are the matches I found: AddressBook.framework, AppKit.framework, System.framework, and QuickTime Player.app. I fiddled around a bit with QuickTime videos from http://apple.com/trailers, but I don't think QuickTime is implicated. This tends to be confirmed by dbaron's (wonderful) correlation data at http://people.mozilla.org/~dbaron/crash-stats/20090929-interesting-modules (search in it for "TFSInfo::~TFSInfo()"). I also found how to trigger calls to TFSInfo::~TFSInfo() -- not surprisingly, they happen whenever you open Apple's modal Open File or Save As dialogs. I suspect the GCD consumer that's triggering these crashes is either the AppKit framework or the System framework -- probably the former.
dispatch_async is a GCD function that "submits a block for asynchronous execution on a dispatch queue and returns immediately". It gets called many times whenever you open or close a Save As or File Open dialog, often in code under a TNode:: method.
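To make "submits a block ... and returns immediately" concrete, here is a toy analogue of a serial dispatch queue built on Python's standard `queue` and `threading` modules. It is a sketch of the general pattern only: the class `SerialQueue` and its methods are hypothetical names, and nothing here is claimed about how libdispatch or DesktopServicesPriv is implemented.

```python
import queue
import threading


class SerialQueue:
    """Toy serial queue: tasks run one at a time on a single worker thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            try:
                task()
            finally:
                self._tasks.task_done()

    def dispatch_async(self, task):
        """Enqueue the task and return at once; it runs later on the worker."""
        self._tasks.put(task)

    def wait(self):
        """Block until every submitted task has finished."""
        self._tasks.join()


results = []
main_thread = threading.current_thread().name
gate = threading.Event()


def block():
    gate.wait()                                    # hold the task until released
    results.append(threading.current_thread().name)


q = SerialQueue()
q.dispatch_async(block)
returned_before_block_ran = (results == [])        # submission did not wait for the task
gate.set()
q.wait()

print(returned_before_block_ran, results[0] != main_thread)
```

The point relevant to this bug is that the submitted task runs later and on another thread, so any object it touches has to stay alive until then. If one thread drops the last reference first, the result is the kind of use-after-delete suspected above.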
Whiteboard: [10.6][10.6.1] → [10.6][10.6.1][Apple bug]
(Following up comment #38) > Here are the matches I found: > > AddressBook.framework > AppKit.framework > System.framework > QuickTime Player.app These are the apps and frameworks that use the dispatch_... API documented in the GCD Reference from comment #39. But GCD consumers can also use extensions to the C and C++ languages, as documented at the following links: http://thirdcog.eu/pwcblocks/ http://clang.llvm.org/docs/BlockLanguageSpec.txt http://clang.llvm.org/docs/BlockImplementation.txt The last link shows that all users of the GCD language extensions should reference the following symbols (the initial underscores are actually part of the symbols' names, and not the extra underscore that gets added to all symbols in OS X binaries): _NSConcreteGlobalBlock and _NSConcreteStackBlock. Grepping on these symbols turns up a much longer (and more realistic) list of GCD consumers. It includes the following from /Applications, plus virtually everything in /System/Library/Frameworks and /System/Library/PrivateFrameworks -- including the DesktopServicesPriv framework: Address Book.app, Font Book.app, iChat.app, Mail.app, Preview.app, QuickTime Player.app, System Preferences.app, TextEdit.app, Utilities/AppleScript Editor.app, Utilities/Console.app, and Utilities/Podcast Capture.app.
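The symbol search described above amounts to looking for a byte string in every file under a directory. A minimal sketch of that idea in Python follows. The helper name `files_containing` is hypothetical, and the demo runs against two throwaway files in a temporary directory, not the real /System/Library or /Applications trees.

```python
import os
import tempfile


def files_containing(root, needle):
    """Yield paths under `root` whose raw bytes contain `needle`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "rb") as handle:
                    # Reads the whole file; fine for a sketch, wasteful for huge binaries.
                    if needle in handle.read():
                        yield path
            except OSError:
                continue  # unreadable file, broken symlink, and so on


# Demo on made-up files: one "binary" that mentions the symbol, one that doesn't.
with tempfile.TemporaryDirectory() as demo_root:
    with open(os.path.join(demo_root, "uses_blocks.bin"), "wb") as f:
        f.write(b"\x00\x01__NSConcreteStackBlock\x00")
    with open(os.path.join(demo_root, "plain.bin"), "wb") as f:
        f.write(b"\x00\x01_printf\x00")
    hits = sorted(
        os.path.basename(p)
        for p in files_containing(demo_root, b"_NSConcreteStackBlock")
    )

print(hits)
```

Searching for `_NSConcreteStackBlock` with its single leading underscore also matches the on-disk form with the extra underscore, since the shorter string is a substring of the longer one. A plain byte search like this can produce false positives (for example, a string constant that merely names the symbol), so any hit is a hint to follow up on, not proof.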
> I suspect the GCD consumer that's triggering these crashes is either > the AppKit framework or the System framework -- probably the former. So this is wrong. The GCD consumer that's triggering these crashes is most likely the DesktopServicesPriv framework.
The TFSInfo::... and TNode::... APIs also exist in DesktopServicesPriv on OS X 10.5.8, and get used whenever you open a File Open or Save As dialog. But (of course) we don't see any of this bug's crashes on OS X 10.5.X.
I should add that I never see any FF threads with start_wqthread as the bottom symbol on OS X 10.5.8 -- so OS-supported "worker threads" (and the pthread_workqueue API) appear to be used less often on OS X 10.5.X.
Too many of the CFRelease crashes are unrelated.
Summary: [10.6.X] Crashes [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease] → [10.6.X] Crashes [@objc_msgSend | TFSInfo::~TFSInfo()]
Summary: [10.6.X] Crashes [@objc_msgSend | TFSInfo::~TFSInfo()] → [10.6.X] Crash [@objc_msgSend | TFSInfo::~TFSInfo() ]
http://blogs.adobe.com/jnack/2009/11/snow_leopard_1062_fixes_problems_with_ps.html reports that Apple fixed an open/save dialog crasher in 10.6.2.
Here are a couple of stacks in OS X 10.6.2 for what I suspect is a related crash: bp-419c530c-2fe4-45f0-bb70-52a822091111 bp-cf505500-a25f-449a-aec5-4977d2091111 It's still too early to be sure. But I suspect 10.6.2 didn't fix this bug.
Severity: normal → critical
Here's a stack on OS X 10.6.2 with this bug's crash: bp-eb58422d-f990-45c3-af4a-b67042091119 But it seems to be happening a lot less often, so maybe Apple's done something right: http://crash-stats.mozilla.com shows 25 instances of this bug's crash in the last day, but only one of them happened on 10.6.2. (The others are on 10.6.1 and 10.6.)
Crash Signature: [@objc_msgSend | TFSInfo::~TFSInfo() ]
We don't see this signature in the last 4 weeks on any version. Maybe Apple fixed something. Resolving as works for me.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago → 14 years ago
Resolution: --- → WORKSFORME