Closed Bug 386343 Opened 13 years ago Closed 12 years ago

Breakpad doesn't always catch exceptions

Categories

(Toolkit :: Crash Reporting, defect, critical)

x86
Windows XP

RESOLVED DUPLICATE of bug 422308

People

(Reporter: Peter6, Unassigned)

Attachments

(1 file)

249.35 KB, application/octet-stream
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a6pre) Gecko/20070629 Minefield/3.0a6pre ID:2007062904

I've seen this for 2 weeks (last successful Breakpad report: 20070619) and thought the reason was that I mostly use hourly builds.

Obviously more people are seeing the same thing.

repro:
Open an FF nightly build
disable Talkback (although it makes no difference here)
crash on https://bugzilla.mozilla.org/attachment.cgi?id=270229

result:
FF dies, that's it: no Talkback (if enabled) or Breakpad feedback
Nothing is created in the profile directories used by breakpad
Flags: blocking-firefox3?
That's like my Yahoo Mail beta crash: no Breakpad until I switch to regular Yahoo Mail and crash. I don't think it's the same crash, though. Actually, that was the first time I've seen Breakpad work since it was turned on, and I've had at least a hundred crashes.
Component: General → Breakpad Integration
Flags: blocking-firefox3?
Product: Firefox → Toolkit
QA Contact: general → breakpad.integration
Flags: blocking1.9?
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a6pre) Gecko/20070628 Minefield/3.0a6pre
I have tried quite a few times but never saw Breakpad working. I don't even have a Crash Reports folder in Application Data :(.
On the last series of nightly crashes, Breakpad did not activate on the first four; then suddenly something changed and all of the following crashes did activate it.

Definitely random for me.
(In reply to comment #2)
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a6pre) Gecko/20070628
> Minefield/3.0a6pre
> I did my best quite some times but I never saw breakpad working. I even have no
> folder Crash Reports in Application Data :(.
>

I take it back. I saw breakpad for the first time with this test case: https://bugzilla.mozilla.org/attachment.cgi?id=270462
Note that also talkback was enabled. 

This seems to be profile related somehow.  I was trying to see if I could find a regression window, but using a different profile, and I consistently got the crash reporter.  However, testing with my daily use profile I don't get the crash reporter now.
A few days ago I tried the build with Cairo 1.4.10 enabled, which crashes at startup, and Breakpad showed up every time I crashed (5 out of 5 attempts).
For all the other testcases (all crashes while browsing) I tried, nothing happened.
Ok, weirder even.
I used "crash on printpreview ACID2"

I started up my current build, 20070630 -> crash, no breakpad
I started up a build from 20070610 -> crash, breakpad
I started up a build from 20070620 -> crash, breakpad
I started up my current build again, 20070630 -> crash, breakpad

So starting up an older build somehow set something right.
And yes, I was so stupid not to make a copy of my prefs before doing this, so I can't tell if anything changed.
I started up today's nightly and crashed on ACID2 to get Breakpad to pop up; it wouldn't.
Then I repeated the above steps and Breakpad never showed up.
I went back to a 20070601 build: still no Breakpad.
Is this bug still present? (Being on Win2k, I wouldn't know, because Breakpad doesn't work on Win2k!) It seems to me quite important to fix this for the forthcoming GP Alpha 7 release, so that we get as much crash feedback as possible.
Yeah, it's still buggy as hell.
Sometimes it pops up right away, sometimes well after the crash and most often not at all.
All depending on the type of crash I guess.
Well, since June 14 the crash reporter (Breakpad/Airbag) has caught and reported 7 of my crashes (out of around 200). Three were the same crash due to one bug, another three were the same crash due to a second bug, and one report was for a third crasher bug. I'm guessing 10 more crashes were caught but failed to submit. The crash reporter is definitely not working, and I consider it useless since it didn't even submit any of those crashes. Sucks that there will be a release without a fully functional crash reporter.
Same here: it crashed 10 times or so, but Breakpad was nowhere to be seen. Also, it doesn't show up or register in add-ons or anywhere; is it in hiding?
It's not supposed to show up in addons or anywhere.  What would help fix this bug would be if someone can narrow down the circumstances where it appears/doesn't appear.  Obviously it is working in some cases, but not in others.  We need to figure out if it's specific crashes that make it not work, or a configuration issue, or a certain way of starting Firefox (like via software update).  Something is causing it to not catch crashes sometimes, but until we narrow it down, it's very hard to do anything about it.
(In reply to comment #13)
> It's not supposed to show up in addons or anywhere.  What would help fix this
> bug would be if someone can narrow down the circumstances where it
> appears/doesn't appear.  Obviously it is working in some cases, but not in
> others.  We need to figure out if it's specific crashes that make it not work,
> or a configuration issue, or a certain way of starting Firefox (like via
> software update).  Something is causing it to not catch crashes sometimes, but
> until we narrow it down, it's very hard to do anything about it.
> 

And how would you propose we do that when the reporter doesn't catch the crash to report what caused Firefox to crash?!

There are many bugs where people have reported crashes and said Breakpad did not catch them on some systems or even some OSes (excluding Win2k). And before Talkback was removed, there were reports of Breakpad not catching a crash but Talkback catching it.
You can use a local debugger and the Mozilla symbol server to get backtraces from local crashes: http://developer.mozilla.org/en/docs/Using_the_Mozilla_symbol_server

Also, you can try installing my memory corrupter extension which will cause crashes pretty quickly: http://benjamin.smedbergs.us/tests/memory-corrupter-0.2.xpi and report whether breakpad catches these crashes. If not, I can walk you through breakpointing some functions to see whether the exception handler is being installed properly and whether it's being called.
(In reply to comment #15)
> You can use a local debugger and the Mozilla symbol server to get backtraces
> from local crashes:
> http://developer.mozilla.org/en/docs/Using_the_Mozilla_symbol_server
> 

Will try later.


> Also, you can try installing my memory corrupter extension which will cause
> crashes pretty quickly:
> http://benjamin.smedbergs.us/tests/memory-corrupter-0.2.xpi and report whether
> breakpad catches these crashes. If not, I can walk you through breakpointing
> some functions to see whether the exception handler is being installed properly
> and whether it's being called.
> 

Of course for once it catches a crash...but fails to submit it. (I know that is another bug).
I also have that problem, and attached WinDbg to the Firefox process to be able to create a minidump with ".dump /m" when the crash occurs.

Here is a minidump where I attached windbg to the process before crashing it :
http://jmdesp.free.fr/i18n/fxcrash/windbg_dump/20070817_1/
And here is a minidump where I attached windbg to the process after crashing it :
http://jmdesp.free.fr/i18n/fxcrash/windbg_dump/20070817_2/

If those specific crashes don't have the required info, contact me and I'll create others. The various file names correspond to ".dump /m", ".dump /mi", ".dump /mip" and one ".dump /ma".
Is my minidump of any use? bs told me on IRC that someone could work on it while he's on holiday; IIRC he mentioned Ted.
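One quick way to sanity-check dumps like these before handing them over is to read the fixed 32-byte header that every minidump starts with. The Python sketch below is only an illustration and is not part of Breakpad or WinDbg: the function name read_minidump_header is made up, and the assumption that these three MINIDUMP_TYPE bits correspond to the "i", "p" and full-memory options of ".dump" should be checked against the debugger documentation.

```python
import struct

# Layout of MINIDUMP_HEADER: signature, version, stream count,
# stream directory RVA, checksum, timestamp, 64-bit flags (32 bytes total).
HEADER = struct.Struct("<4sIIIIIQ")

# A few MINIDUMP_TYPE bits; their mapping to .dump options is an assumption.
FLAG_NAMES = {
    0x0002: "MiniDumpWithFullMemory",
    0x0040: "MiniDumpWithIndirectlyReferencedMemory",
    0x0100: "MiniDumpWithProcessThreadData",
}


def read_minidump_header(data):
    """Parse the first 32 bytes of a minidump; raise ValueError if it is not one."""
    if len(data) < HEADER.size:
        raise ValueError("too short to be a minidump")
    signature, version, streams, _rva, _checksum, timestamp, flags = (
        HEADER.unpack_from(data)
    )
    if signature != b"MDMP":
        raise ValueError("missing MDMP signature")
    return {
        "version": version & 0xFFFF,  # low word holds the format version
        "streams": streams,
        "timestamp": timestamp,
        "flags": sorted(n for bit, n in FLAG_NAMES.items() if flags & bit),
    }
```

It could be used as read_minidump_header(open("crash.dmp", "rb").read(32)), where crash.dmp is a placeholder file name. A dump whose flag list is empty is a plain ".dump /m" style dump, which may or may not be enough for this bug.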
Not going to block on this unless somebody comes up with reproducible steps.
Flags: blocking1.9? → blocking1.9-
So you're not even going to look into it? There are dozens of reports in the nightly build threads and here, and the issue even has its own thread. People keep offering info but no one has even responded to it.
If this bug is due to a race condition or something (i.e., not reproducible on demand), then it won't receive any attention? That seems harsh to me. Crash reports play an important enough part in Firefox's release cycles for a bug such as this to receive some attention on purely anecdotal evidence, imo.
Kurt in comment #14
> (In reply to comment #13)
> > ... We need to figure out if it's specific crashes that make it not work,
> > or a configuration issue, or a certain way of starting Firefox (like via
> > software update).  Something is causing it to not catch crashes sometimes, but
> > until we narrow it down, it's very hard to do anything about it.
> 
> And how would you propose we do that when the reporter doesn't catch the crash
> to report what caused Firefox to crash?!
> 
> There are many bugs where people have reported crashes and said break pad did
> not catch it on some systems or even some OSes (excluding win2k).  And before
> talk back was removed, there were reports on then break pad not catching a
> crash but talkback catching it.

Should be easy with citations that have testcases.  Are citations not possible?
 

Peter in comment #6
> a few days ago I tried the build with Cairo 1.4.10 enabled which crashes at
> startup and breakpad showed up every time I crashed (5 out of 5 attempts).
> For all the other testcases (all crashes while browsing) I tried, nothing
> happened.

Peter, what testcases?
All of these bugs crash for me without break pad working.  All of the bugs were reported on XP and have break pad crash reports in them.

Bug 394173
Bug 394403
Bug 394404
Bug 394239
Bug 394208

Need more examples?
Well with a new profile break pad caught every single one of those :/
Can't tell people to start a new profile as a fix though.
Let's be clear here:

* the blocking- flag does not mean that we're not going to fix the bug, just that if we don't fix the bug, we're not going to hold the release.

* It's pretty clear that something about the contents of the profile is causing the problem, and the particular crash is probably irrelevant

* We're going to try and add some NSPR logging enabled in release builds so that we can narrow the problem down
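If the profile contents really are the trigger, one way testers could help narrow it down is to bisect a copy of the affected profile: split its top-level entries into two halves, start Firefox on each half, and keep splitting whichever half still loses crash reports. The Python sketch below only illustrates that idea. The function name split_profile and the directory arguments are hypothetical, and it assumes Firefox will start with a partial profile, which may not hold for every combination of files.

```python
import shutil
from pathlib import Path


def split_profile(profile_dir, out_a, out_b):
    """Copy the top-level entries of a profile into two halves for bisection.

    Returns the entry names that went into each half. The output
    directories are expected to be new or empty.
    """
    entries = sorted(Path(profile_dir).iterdir(), key=lambda p: p.name)
    mid = len(entries) // 2
    halves = ((Path(out_a), entries[:mid]), (Path(out_b), entries[mid:]))
    for target, half in halves:
        target.mkdir(parents=True, exist_ok=True)
        for entry in half:
            dest = target / entry.name
            if entry.is_dir():
                shutil.copytree(entry, dest)  # e.g. an extensions folder
            else:
                shutil.copy2(entry, dest)     # e.g. prefs.js
    return [e.name for e in entries[:mid]], [e.name for e in entries[mid:]]
```

Each half can then be opened with the -profile command-line option and crashed with a known test case. Always work on a copy, never on the daily-use profile itself.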
The reason I posted particular crashes is that Breakpad does sometimes work (as I stated in comment 1 and comment 6). I have other crashes that were caught, but Breakpad doesn't even submit the report when it does catch them.

Do you have a bug number so I can monitor the progress for the NSPR logging?  And why only on release builds?
Attached file profile
With the attached profile I get a crash just by opening and closing Firefox (no Breakpad at all in about 20 tests). I know it's not an "absolutely" clean profile, but I needed NoScript to reproduce the crash.
I know the crash may be plugin related, so let me know if there's anything else you need me to send.
Tested with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a8pre) Gecko/2007091405 Minefield/3.0a8pre ID:2007091405 on XP SP2, Java 1.6u2, Flash 9.0 r45 and others.

Hope it helps
From checking the current code :
http://mxr.mozilla.org/mozilla/source/toolkit/crashreporter/google-breakpad/src/client/windows/handler/exception_handler.cc#113
and some existing discussion about the subject :
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=101337

It seems to be fairly well known that the method currently used by Mozilla will not catch all exceptions: "But there're a lot of checks in CRT code that pass around and invoke Dr.Watson directly"

According to the answers Microsoft gives there, the only definitive solution would be to replace the CRT functions that call Dr. Watson directly.

Another option might be the one described here :
http://groups.google.fr/group/microsoft.public.win32.programmer.kernel/browse_thread/thread/baee0603932a2e7f/4ef2f6b151b6127d#4ef2f6b151b6127d
"To have a more dependable implementation, [...], implement a "monitor" process that runs a basic debugger's loop against the child process."
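To make the "monitor" idea a bit more concrete, here is a much weaker stand-in written in Python: a parent process that launches the child and classifies how it ended from its exit status alone. It is not a debugger loop and cannot write a minidump. The function name run_and_classify is made up, and treating Windows exit codes of 0xC0000000 and above as crashes is an assumption based on how NTSTATUS error codes are laid out.

```python
import subprocess
import sys


def run_and_classify(argv):
    """Run argv as a child process and describe how it exited."""
    code = subprocess.run(argv).returncode
    if code == 0:
        return "clean exit"
    if sys.platform != "win32" and code < 0:
        # On POSIX a negative return code means the child died from a signal.
        return f"killed by signal {-code}"
    if sys.platform == "win32" and (code & 0xFFFFFFFF) >= 0xC0000000:
        # Assumption: error-severity NTSTATUS values indicate a crash.
        return f"crashed with NTSTATUS 0x{code & 0xFFFFFFFF:08X}"
    return f"exited with status {code}"
```

The point of the design is that the watcher lives outside the crashing process, so it still runs when the in-process exception filter is bypassed. A real monitor along the lines of the quoted post would attach as a debugger and write the dump itself.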

Duplicate of this bug: 406497
>bds comment #25
>>
>> * We're going to try and add some NSPR logging enabled in release builds so
>> that we can narrow the problem down
>> 
Kurt comment #26
> Do you have a bug number so I can monitor the progress for the NSPR logging? 
> And why only on release builds?

Not filed yet? I'm not finding one.
Flags: blocking-thunderbird3?
Duplicate of this bug: 416427
Breakpad worked for me for the first time tonight. It hasn't been working since its introduction, but tonight the report was sent successfully for the first time!

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9b4pre) Gecko/2008022906 Minefield/3.0b4pre ID:2008022906
Summary: Breakpad doesn't work → Breakpad doesn't always catch exceptions
Running XP SP2, fully patched (fresh installation in December), I have seen Breakpad pop up a single time in a few hundred crashes.
For some reason it doesn't work.
I am guessing it's bug 422308. In fact, we should probably dupe one of these to the other. That bug has more info, fwiw.
Peter: can you try disabling Flash and see if this makes Breakpad work for you?
Sure, Ted. Do you have a guaranteed crash (bug?) for me to try it on?
http://ted.mielczarek.org/code/mozilla/crashme.xpi is a good test case. :) Tools -> Crash me now. In my tests on bug 422308, with flash running on a page, I don't get Breakpad.
Right, I tried Core:Layout Bug 416907, and Breakpad indeed only pops up if I disable Flash (which is not part of that page), so my problem is Bug 422308.
Thanks for the help/tip.
Ok, I'm going to dupe this over as the other bug has more information. I wish we had figured this out sooner! The Adobe folks do say that they have an updated plugin that should fix this issue; I guess we'll see when it gets released.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 422308
DOH! We could have figured this out about 8 months ago, back at comment 1, if we had thought about disabling Flash.
Flags: blocking-thunderbird3?