[10.6.X] Crash [@objc_msgSend | TFSInfo::~TFSInfo() ]

RESOLVED WORKSFORME

Status

Core
Widget: Cocoa
--
critical
RESOLVED WORKSFORME
8 years ago
6 years ago

People

(Reporter: mossop, Assigned: smichaud)

Tracking

({crash, topcrash})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [10.6][10.6.1][Apple bug], crash signature)

(Reporter)

Description

8 years ago
Almost every time I use a filepicker on OS X (imageshack or selecting the downloads folder in preferences) trunk crashes. Here are some stacks:

http://crash-stats.mozilla.com/report/index/3ae1b101-2979-4e1d-aff3-8e3a02090906
http://crash-stats.mozilla.com/report/index/c5fcc297-4e01-42a4-b8b2-cebfd2090906
http://crash-stats.mozilla.com/report/index/1bf9babd-ff3d-405c-8f10-6e8392090904
http://crash-stats.mozilla.com/report/index/6437e9d2-752f-489c-b213-48aeb2090904

This might be 10.6 only since I never saw it before the upgrade.
(Reporter)

Updated

8 years ago
Duplicate of this bug: 514922
(Reporter)

Comment 2

8 years ago
Seems to happen on 1.9.2 branch too:

http://crash-stats.mozilla.com/report/index/ac1b2483-1113-4654-9f7d-fd5b82090906?p=1
Flags: blocking1.9.2?
(Reporter)

Comment 3

8 years ago
I've also seen a very similar stack in TextWrangler, so I suspect there is an underlying OS problem here.
(Reporter)

Updated

8 years ago
blocking2.0: --- → ?
Keywords: dogfood
(Assignee)

Comment 4

8 years ago
> everytime I use a filepicker on OSX

Please be more specific.  When you open it?  When you close it?  When you choose a file?

Does the problem happen with the same frequency when you run without extensions?
(Reporter)

Comment 5

8 years ago
After the filepicker has been closed, though perhaps not immediately. It happens regardless of whether extensions are installed.
(Reporter)

Comment 6

8 years ago
Looks like this problem is fairly widespread:

http://www.cocoabuilder.com/archive/message/cocoa/2009/9/7/244557
http://kb2.adobe.com/cps/506/cpsid_50654.html
(Assignee)

Comment 7

8 years ago
Thanks.  Here's another question:

I notice you're using the DivX Decoder plugin, which seems to be implicated in another weird crash (bug 509130).  What happens when you disable it?

(What you've reported about TextWrangler and Adobe apps makes it less likely that the DivX Decoder plugin is causing trouble ... but disabling it is still worth a try.)
(Assignee)

Comment 8

8 years ago
And by the way, are you using the file picker to access files over an SMB connection?

> After the filepicker has been closed, though perhaps not immediately.

After you've chosen a file?  Do the crashes also happen when you cancel out?
(Reporter)

Comment 9

8 years ago
Happens with all plugins disabled, both when cancelling the picker and when choosing something.
After a few quick tests, I can't reproduce this problem.

I tried viewing a file (fairly small) over an SMB connection -- no
crash.  I also tried opening the file picker to change my download
directory, then canceling out -- again no crash.

> And by the way, are you using the file picker to access files over
> an SMB connection?

Please let us know.  Adobe thinks the problem is more likely over an
SMB connection.

They also think it's more likely with larger files.
Marcia, could you test this?
(Reporter)

Comment 12

8 years ago
(In reply to comment #10)
> > And by the way, are you using the file picker to access files over
> > an SMB connection?
> 
> Please let us know.  Adobe thinks the problem is more likely over an
> SMB connection.

Sorry, no this is all local files.
> Marcia, could you test this?

And Henrik too, of course :-)
Whiteboard: [10.6 OS issue?]
(Reporter)

Comment 14

8 years ago
Almost certainly 10.6: we have 724 crash reports with this sig in the past week, all on 10.6, but seemingly on all branches of Firefox: http://bit.ly/2JJEkS
Summary: Consistent crash after using filepicker [ @ objc_msgSend | TFSInfo:: ~TFSInfo () ] → Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()]
Whiteboard: [10.6 OS issue?] → [10.6]
Yes, I will test this and report back. I also monitor Twitter for 10.6 crash reports, but I have not seen this stack reported yet.
(Reporter)

Comment 16

8 years ago
Quite a lot of these are the same issue: http://bit.ly/SSxha
Summary: Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()] → Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease]
(Assignee)

Updated

8 years ago
Summary: Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease] → [10.6] Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease]
(Assignee)

Updated

8 years ago
Version: Trunk → unspecified
(Assignee)

Updated

8 years ago
Keywords: topcrash
Mossop:  Something I forgot to ask earlier -- Do you see anything interesting/relevant in the console log?
(Reporter)

Comment 18

8 years ago
This came up shortly after the crash, not sure how helpful it is though:

10/09/2009 21:06:35	GrowlHelperApp[226]	*** attempt to pop an unknown autorelease pool (0x911a00)
So ... do you still see crashes when you disable Growl?
(Reporter)

Comment 20

8 years ago
(In reply to comment #19)
> So ... do you still see crashes when you disable Growl?

yes
Oh, well :-(

Needless to say, we still need better information on how to reproduce this.  And perhaps also on what interactions might be taking place with other apps.
(By the way, I suspect the GrowlHelperApp error is a consequence of your crash.)
Mossop:  A 10.6.1 update has just appeared on Apple's Software Update.  Try installing it and see what happens.
I haven't been able to reproduce this yet; I just updated to 10.6.1. I remember looking at some of the comments from the CFRelease crashes and trying to reproduce, but I didn't have any luck back then.  Steven - Does the CF release function have something to do with a file lock?
I suspect both crashes (those whose top level is CFRelease and those whose top level is objc_msgSend) are caused by access to a deleted object (deep in Apple's code).  In other words I suspect both are (more or less) the same crash (or at least have the same cause).

Of course only the CFRelease crashes that have TFSInfo::~TFSInfo() in the stack are relevant to this bug.
http://blogs.adobe.com/jnack/2009/09/a_few_problems_found_with_ps_sl.html seems to indicate that other programs such as Photoshop are having issues with File open and save operations.
(Reporter)

Comment 27

8 years ago
After updating to 10.6.1 I spent a couple of minutes just opening and closing a file dialog with no crash. Unlikely that I'd have been able to do that before so I think we might be able to call this fixed. I'll reopen if it does crash again though. Hopefully the crash reports will start to tail off too.
Status: NEW → RESOLVED
blocking2.0: ? → ---
Last Resolved: 8 years ago
Flags: blocking1.9.2?
Resolution: --- → WORKSFORME
> Steven - Does the CF release function have something to do with a
> file lock?

Oops, Marcia, I didn't answer this question.

The answer is "no".  CFRetain and CFRelease (respectively) increase or
decrease a CoreFoundation object's reference count.  Once the
reference count goes to zero, the object will get deleted.

I suspect what happened here is that the object in question had
already been deleted by the time CFRelease was called on it.
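
To illustrate the retain/release behavior described above, here is a minimal C++ sketch of a reference-counted object. It is a toy model, not CoreFoundation: ToyObject, toyRetain and toyRelease are hypothetical names, and the object is deliberately kept in memory after its count reaches zero so that the extra release can be observed safely. With a real CoreFoundation object, the extra CFRelease would touch freed memory, which is the kind of failure suspected here.

```cpp
#include <cstdio>

// Toy stand-in for a reference-counted object. This is NOT CoreFoundation;
// ToyObject, toyRetain and toyRelease are hypothetical names.
struct ToyObject {
    int refCount = 1;   // a newly created object starts with one owner
    bool alive = true;  // kept in memory here so the misuse can be observed
};

// Adds one owner, like a retain call.
void toyRetain(ToyObject& obj) { ++obj.refCount; }

// Drops one owner. Returns false when called on an object that has
// already been destroyed (the over-release case).
bool toyRelease(ToyObject& obj) {
    if (!obj.alive) {
        std::puts("over-release: object was already destroyed");
        return false;
    }
    if (--obj.refCount == 0) {
        obj.alive = false;  // real code would free the memory at this point
    }
    return true;
}
```

Creating a ToyObject, calling toyRetain once and toyRelease twice leaves it destroyed. A third toyRelease returns false, which corresponds to the suspected call on an already deleted object.
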
The [@objc_msgSend | TFSInfo::~TFSInfo()] crash is still a topcrasher
(#56 on OS X with 429 crashes in the last week), and also happens on
OS X 10.6.1.  So I'm going to have to reopen this bug.

(Note that most of the CFRelease crashes are unrelated.  As I
mentioned in comment #25, only those CFRelease crashes that have
TFSInfo::~TFSInfo() in the stack are relevant here.)

All these crashes happen on 10.6 or 10.6.1.  I'm pretty sure they're
ultimately caused by one or more Apple bugs.

I've been looking further into them.  I'll have more to say in my next
comment.
Status: RESOLVED → REOPENED
Keywords: dogfood
Resolution: WORKSFORME → ---
(Assignee)

Updated

8 years ago
Assignee: nobody → smichaud
Status: REOPENED → ASSIGNED
(Reporter)

Comment 30

8 years ago
I suspect the STR aren't accurate for the remaining crashes since I haven't crashed with this signature since installing 10.6.1.
TFSInfo::~TFSInfo() and friends are undocumented functions in the
DesktopServicesPriv framework (in /System/Library/PrivateFrameworks/).
They're not new with 10.6 (they and the DesktopServicesPriv framework
also exist in OS X 10.5.X).

As best I can tell all the TFSInfo::~TFSInfo() crashes are on the main
thread.  But there are also a few seemingly related crashes (on trunk)
in secondary threads at TFSInfo::AddPtrReference().  So I suspect the
TFSInfo::~TFSInfo() crashes may be caused by an issue with threads.

At the bottom of each of the TFSInfo::AddPtrReference() stacks is the
symbol start_wqthread -- not the thread_start you normally find at the
bottom of secondary-thread stacks.  Each of these threads is a "worker
thread", created and controlled by what appears to be an undocumented,
Apple-only extension of the POSIX pthreads API.  This API is different
on 10.5 and 10.6.  But both OSes have a pthread_workqueue_create_np()
method (in libSystem.dylib).  I'll call this new API the
"pthread_workqueue" API.

Of course, nothing in the tree uses the pthread_workqueue API
directly.  But if you run FF in gdb (on OS X 10.6.1), even in a
plain-vanilla environment (no extensions, only standard plugins, fresh
profile), and break on pthread_workqueue_create_np, you'll find that
this target gets hit several times on startup (during a call to the
documented CoreFoundation function CFURLGetFSRef()).  Then if you
allow FF to finish loading, break into gdb, and do 'thread apply all
bt', you'll find one thread whose bottom symbol is start_wqthread.

All this implies that we may be able to make progress by running FF in
gdb (on OS X 10.6.X) and finding what else hits the
pthread_workqueue_create_np target.

(I'll continue my remarks in my next comment.)
Note that the TFSInfo::~TFSInfo() and TFSInfo::AddPtrReference() crash
stacks all have several threads whose bottom symbol is start_wqthread.

(More coming up.)
> All this implies that we may be able to make progress by running FF
> in gdb (on OS X 10.6.X) and finding what else hits the
> pthread_workqueue_create_np target.

Of course we should also try to find out what hits the
TFSInfo::~TFSInfo() and TFSInfo::AddPtrReference() targets.  But I
can't persuade gdb to break on those targets -- probably because their
original symbols (in DesktopServicesPriv) have been mangled (using C++
name mangling), and Breakpad doesn't demangle them fully.

(I wish I knew how to reverse-engineer C++ name mangling.  Anyone have
pointers to documents on how to do that?)
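
On the name-mangling question: GCC-style compilers, including Apple's, use the Itanium C++ ABI mangling scheme, and the C++ runtime exposes abi::__cxa_demangle() (declared in <cxxabi.h>) to reverse it. The sketch below wraps that call. The helper name demangle is hypothetical, and the mangled names in the note that follows are assumptions: they are what the scheme produces for non-const member functions that take no arguments, and the real symbols in DesktopServicesPriv may differ.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Demangle an Itanium C++ ABI symbol name.
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so the caller must free it.
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

Under those assumptions, demangle("_ZN7TFSInfoD1Ev") yields "TFSInfo::~TFSInfo()" and demangle("_ZN7TFSInfo15AddPtrReferenceEv") yields "TFSInfo::AddPtrReference()". In OS X binaries the stored names carry one more leading underscore.
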

But the TFSInfo::~TFSInfo() stacks also have a non-mangled
NodeContextClose target, and the TFSInfo::AddPtrReference() stacks
have a non-mangled __PostNodeTaskRequest_block_invoke_2 target.  (Both
targets are also in DesktopServicesPriv.)  So we could use those
targets instead.

So to sum up my conclusions so far:

We may be able to make progress resolving these crashes by running FF
in gdb and having it break on the following targets, then seeing what
causes them to be called:

pthread_workqueue_create_np
NodeContextClose
__PostNodeTaskRequest_block_invoke_2

(Continued)
Finally, we've got to ask ourselves why these crashes have started
happening on OS X 10.6.X.

TFSInfo::~TFSInfo() and friends, and pthread_workqueue_create_np() and
friends, existed on OS X 10.5.X.  But as far as we know they don't
cause any problems there.

My hunch is we're looking for something new on OS X 10.6 that has to
do with threading.  An obvious candidate is "Grand Central Dispatch"
(http://developer.apple.com/mac/articles/cocoa/introblocksgcd.html).
Among other things it implements "dispatch queues", for which it
presumably uses worker threads (and the pthread_workqueue API).

FF doesn't (yet) use Grand Central Dispatch.  And it's likely that
(for now) only Apple apps and libraries use it.  I don't yet know what
symbols to look for in a library to show that it uses Grand Central
Dispatch.  But I'm going to be working on that.
(In reply to comment #30)

> I suspect the STR aren't accurate for the remaining crashes since I
> haven't crashed with this signature since installing 10.6.1.

I suspect you're right, so I'll rename this bug.
Summary: [10.6] Consistent crash after using filepicker [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease] → [10.6.X] Crashes [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease]
Whiteboard: [10.6] → [10.6][10.6.1]
Here's something that may or may not be relevant:

http://www.openradar.appspot.com/6332143

It's a bug, seemingly in worker threads and dispatch queues, on OS X
10.5.5.  The bug is triggered by using the Cocoa NSOperationQueue and
NSInvocationOperation classes -- which don't appear to be involved
with this bug (bug 514291).

This may indicate that the pthread_workqueue API was already fragile
in OS X 10.5.X.
(Following up comment #6)

> Looks like this problem is fairly widespread:
>
> http://www.cocoabuilder.com/archive/message/cocoa/2009/9/7/244557
> http://kb2.adobe.com/cps/506/cpsid_50654.html

The first link is definitely the same crash (in another app).

For some reason Google doesn't work properly searching on
"TFSInfo::~TFSInfo()" or "TFSInfo::~TFSInfo".  But if you search on
"TNode::IsUnresolved()", you get *lots* of hits for this bug's crash,
in *many* different apps.  Here are a few links:

http://discussions.apple.com/thread.jspa?threadID=2147095&tstart=0
http://code.google.com/p/chromium/issues/detail?id=24326
http://forums.adobe.com/thread/499962
http://forum.videolan.org/viewtopic.php?f=12&t=64812&p=216400

I don't think there can be any further doubt that this is an Apple
bug.
Apple's Grand Central Dispatch reference is at
http://developer.apple.com/mac/library/documentation/Performance/ReferenceGCD_libdispatch_Ref/Reference/refere\
nce.html.

I picked a symbol from that reference (dispatch_get_global_queue) and
grepped the following directories for matches (on an OS X 10.6.1
partition):

/System/Library/Frameworks
/System/Library/PrivateFrameworks
/Applications

Here are the matches I found:

AddressBook.framework
AppKit.framework
System.framework
QuickTime Player.app

I fiddled around a bit with QuickTime videos from
http://apple.com/trailers, but I don't think QuickTime is implicated.
This tends to be confirmed by dbaron's (wonderful) correlation data at
http://people.mozilla.org/~dbaron/crash-stats/20090929-interesting-modules
(search in it for "TFSInfo::~TFSInfo()").

I also found how to trigger calls to TFSInfo::~TFSInfo() -- not
surprisingly, they happen whenever you open Apple's modal Open File or
Save As dialogs.

I suspect the GCD consumer that's triggering these crashes is either
the AppKit framework or the System framework -- probably the former.
Here's the link to Apple's Grand Central Dispatch Reference over again:
http://developer.apple.com/mac/library/documentation/Performance/Reference/GCD_libdispatch_Ref/Reference/reference.html
dispatch_async is a GCD function that "submits a block for
asynchronous execution on a dispatch queue and returns immediately".

It gets called many times whenever you open or close a Save As or File
Open dialog, often in code under a TNode:: method.
(Assignee)

Updated

8 years ago
Whiteboard: [10.6][10.6.1] → [10.6][10.6.1][Apple bug]
(Following up comment #38)

> Here are the matches I found:
>
> AddressBook.framework
> AppKit.framework
> System.framework
> QuickTime Player.app

These are the apps and frameworks that use the dispatch_... API documented in the GCD
Reference from comment #39.  But GCD consumers can also use extensions
to the C and C++ languages, as documented at the following links:

http://thirdcog.eu/pwcblocks/
http://clang.llvm.org/docs/BlockLanguageSpec.txt
http://clang.llvm.org/docs/BlockImplementation.txt

The last link shows that all users of the GCD language extensions
should reference the following symbols (the initial underscores are
actually part of the symbols' names, and not the extra underscore that
gets added to all symbols in OS X binaries):

_NSConcreteGlobalBlock
_NSConcreteStackBlock

Grepping on these symbols turns up a much longer (and more realistic)
list of GCD consumers.  It includes the following from /Applications,
plus virtually everything in /System/Library/Frameworks and
/System/Library/PrivateFrameworks -- including the DesktopServicesPriv
framework.

Address Book.app
Font Book.app
iChat.app
Mail.app
Preview.app
QuickTime Player.app
System Preferences.app
TextEdit.app
Utilities/AppleScript Editor.app
Utilities/Console.app
Utilities/Podcast Capture.app
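
The symbol search described above can be sketched in C++ as a plain byte scan, roughly what grep does on a binary. fileMentionsSymbol is a hypothetical helper, not part of any Apple or Mozilla tool. It does not parse the Mach-O symbol table, so a match only shows that the string occurs somewhere in the file.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Returns true if the raw bytes of the file at `path` contain `symbol`.
// Returns false if the file cannot be opened or `symbol` is empty.
bool fileMentionsSymbol(const std::string& path, const std::string& symbol) {
    std::ifstream in(path, std::ios::binary);
    if (!in || symbol.empty()) {
        return false;
    }
    // Read the whole file into memory; fine for a sketch, wasteful for
    // very large binaries.
    std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    return std::search(bytes.begin(), bytes.end(),
                       symbol.begin(), symbol.end()) != bytes.end();
}
```

Calling fileMentionsSymbol(path, "_NSConcreteGlobalBlock") on the main binary inside a framework or app bundle would then flag a candidate GCD consumer. Which file inside the bundle to scan is left to the caller.
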
> I suspect the GCD consumer that's triggering these crashes is either
> the AppKit framework or the System framework -- probably the former.

So this is wrong.  The GCD consumer that's triggering these crashes is
most likely the DesktopServicesPriv framework.
The TFSInfo::... and TNode::... APIs also exist in DesktopServicesPriv
on OS X 10.5.8, and get used whenever you open a File Open or Save As
dialog.  But (of course) we don't see any of this bug's crashes on OS
X 10.5.X.
I should add that I never see any FF threads with start_wqthread as the bottom symbol on OS X 10.5.8 -- so OS-supported "worker threads" (and the pthread_workqueue API) appear to be used less often on OS X 10.5.X.
Too many of the CFRelease crashes are unrelated, so I'm dropping [@CFRelease] from the summary.
Summary: [10.6.X] Crashes [@objc_msgSend | TFSInfo::~TFSInfo()] [@CFRelease] → [10.6.X] Crashes [@objc_msgSend | TFSInfo::~TFSInfo()]
(Assignee)

Updated

8 years ago
Summary: [10.6.X] Crashes [@objc_msgSend | TFSInfo::~TFSInfo()] → [10.6.X] Crash [@objc_msgSend | TFSInfo::~TFSInfo() ]
http://blogs.adobe.com/jnack/2009/11/snow_leopard_1062_fixes_problems_with_ps.html reports that Apple fixed an open/save dialog crasher in 10.6.2.
Here are a couple of stacks in OS X 10.6.2 for what I suspect is a
related crash:

bp-419c530c-2fe4-45f0-bb70-52a822091111
bp-cf505500-a25f-449a-aec5-4977d2091111

It's still too early to be sure.  But I suspect 10.6.2 didn't fix this
bug.

Updated

8 years ago
Severity: normal → critical
Here's a stack on OS X 10.6.2 with this bug's crash:

bp-eb58422d-f990-45c3-af4a-b67042091119

But it seems to be happening a lot less often, so maybe Apple's done
something right:  http://crash-stats.mozilla.com shows 25 instances of
this bug's crash in the last day, but only one of them happened on
10.6.2.  (The others are on 10.6.1 and 10.6.)
Crash Signature: [@objc_msgSend | TFSInfo::~TFSInfo() ]

Comment 49

6 years ago
We don't see this signature in the last 4 weeks on any version. Maybe Apple fixed something. Resolving as works for me.
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago → 6 years ago
Resolution: --- → WORKSFORME