Closed Bug 1269471 Opened 8 years ago Closed 8 years ago

Add a bunch more lock-related things to the irrelevant signature list

Categories

(Socorro :: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mccr8, Assigned: mccr8)

References

Details

User Story

I'd like to have WaitForSingleObject{,Ex}, WaitForMultipleObjects{,Ex}, and PR_WaitCondVar added to the irrelevant signature list.
e10s beta has a bunch of crashes that look kind of like this:

WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait | mozilla::ipc::MessageChannel::WaitForSyncNotify | mozilla::ipc::MessageChannel::Send | mozilla::net::PCookieServiceChild::SendGetCookieString

All that lock gunk (WaitForSingleObjectEx, WaitForSingleObject, PR_WaitCondVar, mozilla::CondVar::Wait) does not help things. It should be added to the list of signature things that are omitted entirely. I'll spend some time in the next day or two to get a specific list.
I guess the thing I'm looking for is "irrelevant_signature_re". It looks like WaitForSingleObjectEx, WaitForSingleObject, PR_WaitCondVar, mozilla::CondVar::Wait, WaitForMultipleObjectsEx would cover what I see in beta. I guess there must be a WaitForMultipleObjects, too.

Nathan, how does that sound to you? I suppose there's an argument for not removing the two CondVar things from crash signatures to make it more obvious that we're waiting, but maybe not.

This should turn the signature in comment 0 into:
  mozilla::ipc::MessageChannel::WaitForSyncNotify | mozilla::ipc::MessageChannel::Send | mozilla::net::PCookieServiceChild::SendGetCookieString
Flags: needinfo?(nfroyd)
Just confirming that `irrelevant_signature_re` is what you're looking for. When parsing the crashing thread to generate a signature, our algorithm will skip everything that matches an item of the that list.
(In reply to Andrew McCreight [:mccr8] from comment #1)
> I guess the thing I'm looking for is "irrelevant_signature_re". It looks
> like WaitForSingleObjectEx, WaitForSingleObject, PR_WaitCondVar,
> mozilla::CondVar::Wait, WaitForMultipleObjectsEx would cover what I see in
> beta. I guess there must be a WaitForMultipleObjects, too.
> 
> Nathan, how does that sound to you? I suppose there's an argument for not
> removing the two CondVar things from crash signatures to make it more
> obvious that we're waiting, but maybe not.

I think I'd remove WaitForSingleObject{,Ex}, WaitForMultipleObjects, PR_WaitCondVar and keep the mozilla::CondVar::Wait to make the waiting obvious.
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #3)
> I think I'd remove WaitForSingleObject{,Ex}, WaitForMultipleObjects,
> PR_WaitCondVar and keep the mozilla::CondVar::Wait to make the waiting
> obvious.

That sounds fine to me, though that won't help for places that don't have CondVar in the signature. For instance, I see "shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown" on beta right now. (As a side note, the giant pile of crashes on beta with all of this lock stuff have fixes on the way.)
Ah, hm, maybe it doesn't matter all that much, then.  I wonder why CondVar::Wait is getting inlined in some places, but not in others?
I don't know. I'm not even sure where ProcessNextEvent calls PR_WaitCondVar.
Two more things we should add to the list are ntdll.dll@0x4bb7a and kernelbase.dll@0x10ab, which I'm guessing are Windows functions related to implementing locks:
https://crash-stats.mozilla.com/report/index/a550cd66-1329-4706-b918-de0b12160418

These show up in a ton of different crashes and produce totally useless signatures.
Summary: Add a bunch more lock-related things to the elision list → Add a bunch more lock-related things to the irrelevant signature list
Blocks: 1265812
Or does it just cut a part from the sig?
(In reply to Tobias B. Besemer [:BesTo] (QA) from comment #9)
> Or does it just cut a part from the sig?

It just removes the frame of the signature. So, if ntdll.dll@0x4bb7a and kernelbase.dll@0x10ab and all of the other lock stuff in this bug are added to the irrelevant signature list, then the signature for the crash I put in comment 7 would change from [@ ntdll.dll@0x4bb7a ] to [@ mozilla::ipc::MessageChannel::WaitForSyncNotify], which is more useful.
K, great!

I send out my last ~20 crashes and think I will wait some days/weeks before I do anything to it again... ;-)
IMHO, it's a bit problematic to add (In reply to Andrew McCreight [:mccr8] from comment #7)
> Two more things we should add to the list are ntdll.dll@0x4bb7a and
> kernelbase.dll@0x10ab, which I'm guessing are Windows functions related to
> implementing locks:
> https://crash-stats.mozilla.com/report/index/a550cd66-1329-4706-b918-de0b12160418

Actually, those libraries in this crash do not have  symbols, which usually just means we need to ensure that those symbols are pulled from the MS symbol server into ours. Ted is usually caring about this but he is off, not sure who can take a look here.

We should not add things like bare DLL addresses to the ignore lists, we should get symbols fixed instead if possible. :)
(In reply to Robert Kaiser (not working on stability any more) from comment #13)
> We should not add things like bare DLL addresses to the ignore lists, we
> should get symbols fixed instead if possible. :)

I was wondering about that. I guess ignore those two .dll ones and I'll file some other bug about it.
No longer blocks: 1265812
I filed bug 1270190 for that.
Andrew, can you consolidate the specific change we've settled on into the "User Story" field of this bug so that the Socorro team has something "final" to implement?
Flags: needinfo?(continuation)
Sorry, I should have done that.
User Story: (updated)
Flags: needinfo?(continuation)
Assignee: nobody → adrian
Sorry for taking this long getting to this. I have recently been working on making it a lot easier for users to edit those signatures lists themselves. We now have them in separate text files in our repo, and I wrote some documentation to help people make changes there. All of that is here: https://github.com/mozilla/socorro/tree/master/socorro/siglists

Andrew, would you like to give it a try and make those changes yourself? We can meet in London to do that if you want. Feedback on that process would be appreciated of course, we are doing this to make the whole process of updating the skip lists easier and faster for everyone. :)
Flags: needinfo?(continuation)
Sure, I can do that. I might not be able to get to it for a few weeks, but I think that's okay.
Assignee: adrian → continuation
Flags: needinfo?(continuation)
Commit pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/ee69035fd8bc16ff4e9ccca6c0a27aef5f0c2d5d
Fixes bug 1269471 - Add more lock related signatures to the irrelevant signature list. (#3429)

r=adngdb
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.