Closed Bug 565412 Opened 15 years ago Closed 13 years ago

Request for changes to Socorro signature filter for modules without symbols

Categories

(Socorro :: Infra, task)

All
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: smichaud, Unassigned)

The following "signatures" in stack traces tell very little about the crashes with which they're associated. (These correspond to commonly used system libraries (on OS X), so crashes in them can have many different causes.)

libclient.dylib@0xnnnnnnnn
libawt.jnilib@0xnnnnnnnn
libobjc.A.dylib@0xnnnnnnnn

I'd like to change the Socorro signature filter (specifically the prefixSignatureRegEx variable) so that any stack trace beginning with 'libclient.dylib@', 'libawt.jnilib@' or 'libobjc.A.dylib@' is given a compound signature.

https://bugzilla.mozilla.org/show_bug.cgi?id=540870#c10 tells me that 'libobjc.A.dylib@0x1568.' is already in prefixSignatureRegEx. But this is too specific -- I'd like it changed to 'libobjc.A.dylib@.'.

So, in summary, I'd like the following changes to prefixSignatureRegEx:

1) Change 'libobjc.A.dylib@0x1568.' to 'libobjc.A.dylib@.'
2) Add 'libclient.dylib@.'
3) Add 'libawt.jnilib@.'

Thanks very much in advance!
Blocks: 564625
My take on this: from prefixSignatureRegEx in the processor configuration:

remove: "|libobjc.A.dylib@0x1568."

and then add: "|libobjc\.A\.dylib@.*|libclient\.dylib@.*|libawt\.jnilib@.*"
Sounds reasonable to me. The current libobjc.A.dylib entry (in prefixSignatureRegEx) seems to "work", even though it doesn't escape its '.'s. But that's probably just an accident. http://crash-stats.mozilla.com/query/query?product=Firefox&version=ALL%3AALL&platform=mac&range_value=1&range_unit=weeks&date=05%2F12%2F2010+12%3A47%3A29&query_search=signature&query_type=contains&query=libobjc&build_id=&process_type=all&do_query=1
Don't know if the trailing '*' is really necessary, though.
> Don't know if the trailing '*' is really necessary, though.

Oops, I take that back. Now I see that it *is* necessary. The current libobjc.A.dylib entry ('libobjc.A.dylib@0x1568.') only matches "signatures" of the form 'libobjc.A.dylib@0x1568N'.
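The difference between the current entry and the proposed ones can be checked with a short Python sketch. It is only an illustration: the helper name is_prefix_frame is made up, and it assumes the pattern has to match the whole frame signature (modeled here with re.fullmatch). Whether the Socorro processor anchors its match this way is an assumption, not something confirmed in this bug.

```python
import re

# Patterns quoted in the comments above.
OLD_ENTRY = r"libobjc.A.dylib@0x1568."
NEW_ENTRIES = r"libobjc\.A\.dylib@.*|libclient\.dylib@.*|libawt\.jnilib@.*"


def is_prefix_frame(pattern, frame_signature):
    """Return True if the whole frame signature matches the pattern.

    Assumption: whole-string matching (re.fullmatch). This is a
    hypothetical helper, not Socorro's actual implementation.
    """
    return re.fullmatch(pattern, frame_signature) is not None


samples = [
    "libobjc.A.dylib@0x15688",     # one character after 0x1568
    "libobjc.A.dylib@0xeca0",      # a different address in the same library
    "libclient.dylib@0x8b560",
    "libobjcXAXdylib@0x15688",     # only matches when '.' is left unescaped
    "libxpcom_core.dylib@0x34c1",  # not covered by either pattern
]

for sig in samples:
    print(sig, is_prefix_frame(OLD_ENTRY, sig), is_prefix_frame(NEW_ENTRIES, sig))
```

Under that assumption, the old entry accepts only addresses of the form 0x1568N (and, because its dots are unescaped, some strings that are not libobjc.A.dylib at all), while the proposed entries accept any address in the three libraries.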
My concern about doing this wholesale for Mac OS X libraries is that we'll end up with signatures that have 5 frames or more of libobjc+address before we finally hit something else that will hopefully be useful. I don't know if there are limits on the length of signatures or not, but if there are, we open ourselves up to a situation where we exhaust the signature length prepending unresymbolized OS library frames and never end up with a useful frame in the signature.

I think with libobjc we'll be OK, for the most part (though I do know first-hand that libobjc.A.dylib@0xeca0 [aka _objc_error] crashes typically have 2-3 frames before they switch to a framework or hit other code). libclient crashes look like they could also be in this range. But I think we should move carefully before we start expanding this to other (larger) unresymbolized Mac OS X core Cocoa and Carbon libraries and frameworks. If we find certain high-frequency addresses in these other frameworks that have varied stacks below them, we should start by just setting those addresses to prepend (like we did for the handful of libobjc.A.dylib@0x1568* [objc_msgSend] frames two years ago), rather than jumping straight to "prepend any unresymbolized frame in this framework."
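The length concern can be made concrete with a small Python sketch. It is not Socorro's actual code: the function name compound_signature, the " | " delimiter, the max_len cap of 255 and the frame names nsFoo::Bar and nsBaz::Qux are all assumptions chosen for illustration. The idea is simply that frames matching the prefix pattern are prepended until the first non-matching frame, and the result is then cut to a maximum length.

```python
import re

# Proposed prefix entries from this bug; everything else below is hypothetical.
PREFIX_RE = re.compile(
    r"libobjc\.A\.dylib@.*|libclient\.dylib@.*|libawt\.jnilib@.*"
)


def compound_signature(frames, prefix_re=PREFIX_RE, delimiter=" | ", max_len=255):
    """Join leading prefix frames with the first non-prefix frame.

    Assumptions: whole-string matching, a " | " delimiter and a simple
    truncation at max_len characters. None of these is confirmed here.
    """
    parts = []
    for frame in frames:
        parts.append(frame)
        if not prefix_re.fullmatch(frame):
            break  # first frame that is not a prefix frame ends the signature
    return delimiter.join(parts)[:max_len]


# Hypothetical stack: two unsymbolized libobjc frames, then symbolized code.
frames = [
    "libobjc.A.dylib@0xeca0",
    "libobjc.A.dylib@0x15688",
    "nsFoo::Bar",
    "nsBaz::Qux",
]

print(compound_signature(frames))
print(compound_signature(frames, max_len=30))
```

With a generous limit the signature ends at the first symbolized frame. With a tight limit (30 characters in the second call) the signature is used up by library@address frames before any useful frame appears, which is the situation described above.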
> My concern about doing this wholesale for Mac OS X libraries is that
> we'll end up with signatures that have 5 frames or more of
> libobjc+address before we finally hit something else that will
> hopefully be useful.

That'll make things no worse off than they are now. The "signatures" of the form 'library@0xnnnnnnnn' are almost completely useless. Occasionally we'll find that some of the crashes at a particular address really do have something in common, but that's very much the exception. Even then, the "same" crash will have different addresses on different platforms (PPC versus Intel) and different OS versions. (For example, bug 564625's crash at libclient.dylib@0x8b560 happens only on OS X 10.5.8 PPC.)

I'm more concerned with the possible impact on Socorro performance of my requested changes.

Please, let's take my requested changes as soon as possible. Then, if they don't cause performance problems, let's keep adding new compound signatures until we have reasonably complete coverage of the more commonly used system libraries.
Alternately someone could come up with a fix for bug 422527.
> Alternately someone could come up with a fix for bug 422527.

But that bug's been open for two years now, and is unlikely to be fixed anytime soon. Let's take my requested changes now. They're easy to make ... and easy to back out if/when they're no longer needed.
Sure, I'm just noting that the underlying problem you're working around is "we don't have symbols for OS X system libraries".
And there's another problem: Even a fix for bug 422527 wouldn't help with Java libraries (like libclient.dylib and libawt.jnilib), which have most of their symbols stripped.
Just to make clear -- by "my requested changes" I mean what Lars suggests in comment #1 (that is, "my requested changes as corrected by Lars").
This seems to align somewhat with my recent exchange with tomcat about Mac modules and skiplists. He suggested [1] we discuss my question at the June 1 12:30pm PDT crashkill meeting. https://wiki.mozilla.org/CrashKill/2010-06-01 (don't see notes there yet)

[1] posted 2010-05-12 http://groups.google.com/group/mozilla.dev.quality/browse_thread/thread/6c057556c8df8c82 which sadly garnered no responses (perhaps I posted in the wrong newsgroup?) ...

Do we need skiplists to get stack sigs for some Mac crashes which correlate to Windows crash sigs, like ...

https://crash-stats.mozilla.com/report/index/383f6b03-28ff-4a8e-b616-0ed7c2100512
0 libxpcom_core.dylib libxpcom_core.dylib@0x34c1
1 thunderbird-bin thunderbird-bin@0x9a9740
2 thunderbird-bin nsMsgCopyService::ClearRequest mailnews/base/src/nsMsgCopyService.cpp:216
3 thunderbird-bin mailnews/base/src/nsMsgCopyService.cpp:664
4 thunderbird-bin mailnews/imap/src/nsImapMailFolder.cpp:7960
5 thunderbird-bin mailnews/imap/src/nsImapMailFolder.cpp:5256

https://crash-stats.mozilla.com/report/index/03c933cb-8fca-43c2-b6d9-958932100511
0 @0x1
1 thunderbird-bin thunderbird-bin@0x9a9740
2 thunderbird-bin nsMsgCopyService::ClearRequest mailnews/base/src/nsMsgCopyService.cpp:216
3 thunderbird-bin thunderbird-bin@0x9a9921
4 thunderbird-bin nsMsgCopyService::Release mailnews/base/src/nsMsgCopyService.cpp:424
5 libxpcom_core.dylib nsCOMPtr_base::assign_with_AddRef xpcom/glue/nsCOMPtr.h:456

https://crash-stats.mozilla.com/report/index/f7f7b024-6978-4a05-b937-7bd5f2100512
0 @0x0
1 thunderbird-bin thunderbird-bin@0x9a9740
2 thunderbird-bin nsMsgCopyService::ClearRequest mailnews/base/src/nsMsgCopyService.cpp:216
3 thunderbird-bin nsMsgCopyService::NotifyCompletion mailnews/base/src/nsMsgCopyService.cpp:664
4 thunderbird-bin nsImapMailFolder::OnCopyCompleted mailnews/imap/src/nsImapMailFolder.cpp:7960
Summary: Request for changes to Socorro signature filter → Request for changes to Socorro signature filter for modules without symbols
Wayne, I don't think it's appropriate to skip frames that are in binaries we are shipping, even if they don't have symbols. If they're legitimately missing symbols for some reason then it's either a bug (and we should fix it) or a crash that goes through a non-function address, which is perfectly valid.
(In reply to comment #13)
> crash that goes through a non-function address, which is perfectly valid.

I think in most cases it's this. We regularly have crashes in Camino where we're in some CSS code and a couple of frames are way-out-of-whack hex. (bp-e19b7563-f00c-404b-8d4d-7defa2100531 is an example, though it's much worse than usual; you can see from the rest of the frames, as you can from the stacks in comment 12, that we do have symbols for those binaries.) Early on, smorgan looked at a couple of these and went through the effort to figure out what the values for the frames might be (I'm fuzzy on the details now), so unless we have one of these sorts of stacks that's not involving the CSS code, I don't worry too much. ;)
Component: Socorro → General
Product: Webtools → Socorro
Component: General → Infra
This has been lingering for a long time and bug 422527 has been fixed in the meantime, so I'm resolving this. Please file new bugs for any skiplist additions that need to be made, and be precise about what is needed there.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → INCOMPLETE