Why don't we get an open-source crash reporting tool to replace the proprietary
one? Could we use OpenOffice's or GNOME's bug-buddy?
At least GNOME's bug-buddy requires shipping binaries with symbols, which would
increase the download size a lot.
Yes, you really need to do what Talkback does, and catch the raw crash data,
send it back and pair it up with the symbols stored on a server. IMO this should
be an Open Source project all its own. I'd love to work on it. Alas, my full
time contributing days are coming to a close :-(
In Windows OpenOffice, I see a "Sun Microsystems, Inc. Crashreporter v.1.1" with
LGPL license mentioned.
(File THIRDPARTYLICENSEREADME.html in OOo_1.1rc3_030813_Win32Intel_install_de.zip)
Which crash reporting tool do they use on Linux?
I mailed firstname.lastname@example.org (Hennes Rohling), who owns the fixed OOo bug
"Crashreporter doesn't work under Linux"
I also asked him whom among the OOo people we could talk to about using their
tool in Mozilla.
Is this the code we're looking for?
One issue to consider is the captured information and access to the database
containing it. Netscape's talkback database was restricted to employees, thus
this data wasn't accessible to the world. The chance of sensitive data being
captured in the dump is small, but it could happen.
I'm not sure there's a good answer for this. People could easily opt out if they
so choose, but ideally you want people to use it and feel reasonably safe in
doing so. Having a Talkback-like database open to the world might give cause for
more users to opt to turn it off.
Also, for Windows, you could create a minidump via MiniDumpWriteDump. It's
not cross-platform, though. I've been able to get this working under Windows.
The catch is, dumps can be created on older OSes by supplying the dbghelp.dll,
but I think you'd need VC7 to view it. It might be possible to create a simple
app to generate a report similar to what Talkback provides.
Mozilla Foundation is setting up a separate infrastructure for Talkback. Stay
tuned for more details.
I wish I was cc'd on this bug a long time ago, because I have been thinking about this for a long time. I just haven't had the time or resources to look into possible alternatives.
The Sun OOo crash reporting tool is the first thing that comes to mind... so if anyone has contacts at Sun, please find out if it would be feasible to integrate their crash tool into other products like Firefox and Thunderbird.
One critical requirement for any such alternative is that it is cross-platform. We cannot afford to integrate different tools for each platform, so if we do ever decide to go with a different system, it must work on at least the three major platforms (MacOSX, Windows, and Linux).
GNOME would like such a thing too. Preferably something that could deal with missing debug info (it should add the info back). This must work across distributions.
Perhaps distributions provide the debug packages to a GNOME server while the client provides MD5 sums of all libraries. Then it is the server's job to figure out what debug info to load.
The reporting tool is not the only thing needed. We also need the code for building the debug server, which holds the debug info for the shipped binaries as well as a copy of every popular OS (some kind of jailed installations). A good interface for accessing this info would also be great.
I've already worked on such a solution. I never found information on how GNOME's crash reporter works, so this is all built on my own. Take a look at: http://seamonkey.itkombinat.de/crashrep/
Very nice, Alexander. Only six incident reports? Is the service in production?
Is the tool really integrated in SeaMonkey? Are you in contact with Mozilla people to integrate such a tool in Firefox/Thunderbird?
The current situation is awful. I have tons of crashes and can't report/investigate them :-/
Jay, please take a look at comment 12. Is that what you are waiting for? ;)
It isn't in production; it is only enabled in my own build. But I have too much work and I'm too often ill at the moment ... I hope I can move it forward.
Alexander, let me know if I can help. I'm a busy person, but this issue is a priority for me.
If you have Linux, I can give you my latest build so you can try out the crashes there.
Created attachment 215450 [details] [diff] [review]
Patch to include crashrep
Created attachment 215452 [details]
The crashrep client as seen in the screenshots
Created attachment 215454 [details]
The components part that calls the client on a crash
This is the source that's needed by Mozilla for the Linux crash reporter.
Please excuse that the component isn't very clean in its memory handling; my C knowledge is limited. Also, I don't know what state an application is in after crashing. That's why it looks so ugly.
I've been working on this too. I'm starting with the win32-specific pieces, and am working on a portable win32 digester that can walk dump stacks on any platform. My goal is to handle crash reporting for all three major platforms, with portable digester code for each.
Mark: Sounds good - Alexander has major parts for the Linux client and the server side, you're working on a Windows solution, both have a cross-platform tool as a target...
Would it be possible to merge your approach into the work already attached here?
We should probably get the current stuff into the CVS tree (not built by default for now) and work on improving it from there.
Especially the parts of the patch that are not crashrep-specific files would need review though, I guess. Of course, review would also be nice for the crashrep stuff itself.
Hi Alexander, nice work. But using backtrace gives us interesting data only when debug info is present on the system. That will work only for people compiling the software themselves and people with *-debuginfo packages installed. So to get interesting backtraces for the most common use case we have two options:
a) Detect if no debuginfo is available (some little ELF magic) and ask the user to install the debuginfo package before getting the backtrace.
b) Include in the crash report the exact version of every mapped binary (md5sum of binaries and libs) and re-create the backtrace on a master server with every known binary installed.
The problem I see with a) is that there is no standard way across multiple distributions to ask for a system package to be installed. Also, I don't know if glibc's backtrace is smart enough to get debuginfo from /var/debuginfo/* (I know that gdb is).
With b) we would need some kind of recreate-backtrace software and some way to install every binary from a distro and create a md5sum-->binary mapping. Of course, a dedicated server is also needed for each architecture.
What do you think about this?
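As a rough illustration of option b), the client side could fingerprint every mapped file before sending the report. This is only a sketch: the function names `md5_of_file` and `fingerprint_modules` are made up for this example, and nothing here is taken from the attached crashrep code.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fingerprint_modules(paths):
    """Map each module path to its MD5, or to None if it cannot be read.

    A crashed process may no longer be able to read every library
    (for example, one on a network file system that went away), so
    unreadable files are recorded rather than treated as fatal.
    """
    report = {}
    for path in paths:
        try:
            report[path] = md5_of_file(path)
        except OSError:
            report[path] = None
    return report
```

The server would then use each digest as the key into its md5sum-->binary mapping.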
If it is planned to replace the existing Talkback setup, the client portion absolutely must be able to get the info it needs *without* any debug symbols, and send the data to a server which can piece the data together into a real stack.
It should be simple enough to use the stackwalking code in nsStackFrameUnix to get the actual stack frame addresses, and then convert those to symbolic information on the server side.
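To make the server-side step concrete, here is a minimal sketch of turning a raw frame address into a symbol name. It assumes the server already has, per module, a table of (start address, name) pairs sorted by address; the `symbolize` helper and that table layout are assumptions for illustration, not part of nsStackFrameUnix or Talkback.

```python
import bisect


def symbolize(address, symbols):
    """Resolve an address against a sorted list of (start_address, name).

    Returns "name+0xOFFSET" for the closest symbol starting at or below
    the address, or None if the address lies before the first symbol.
    Symbol sizes are not tracked here, so an address past the end of the
    last function is still attributed to that function.
    """
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None
    start, name = symbols[index]
    return f"{name}+0x{address - start:x}"
```

With a table such as `[(0x1000, "main"), (0x1200, "helper")]`, the address `0x1234` falls inside `helper` at offset `0x34`.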
Where feasible, I'm actually hoping to send stacks for all threads up to the server. The server will hold the symbols and will have a way of mapping builds to a set of image files. Talkback uses build IDs for this purpose, we can do the same.
I'm also toying with the idea of doing some symbol-matching on the client side, to do a better job of presenting stack frames in system code without requiring the server to know too much about system libraries.
Everything will be done portably: there's no reason the servers will need to be tied to a specific architecture. Pulling useful data out of the pdb files is the last question mark on my win32 punchlist.
Funny how after you say something aloud, you get to thinking about it and come up with a solution. .dbg files are more than enough for symbol-mapping, contain information in COFF format, and aren't likely to incompatibly change between MSVC releases like .pdb files.
yeah, but we're hoping to let the symbol servers do more than just get stacks eventually. if only to simply make the pdbs available to people for debugging, although that could be handled differently as long as you can generate both dbg and pdb which i believe you can....
one plea. please please please don't use build ids. use the approach symstore uses, namely a very specific hash of each dll. that way if someone mixes dlls, or perhaps installs an extension (xforms, domi) from a different build you can still get correct stacks. this also enables us to deal w/ things for which we don't have symbols today, but perhaps someone else does (e.g. a plugin vendor).
Also, we may be interested in crashes happening in system library code, such as a wrong call to a GTK function and so on.
get all the linux vendors to supply a standard symbolserver system like microsoft does and we'll talk. until then, that's a problem that's probably going to be too hard for us to solve.
linux users like building their own libraries, and often w/ slightly different options and w/o symbols, which means that you'll never find another box anywhere in the world that actually can answer symbol questions about the system library.
linux distros also like sticking things in very random places, i'd be very surprised if there was a standard location for symbol files that could be found by a distressed application (one that has already crashed, possibly crashed because the libraries it needed were on a network file system which has gone away). certainly trying to get function names for "system" libraries before sending the information to the server would be good. but i don't see any useful proposals for how one can do that.
We need to take care, though, that the server side can meet the storage requirements. Storing all .pdb files permanently can be quite costly on the storage side over time, from what I heard (at least if we plan to use the same server for all nightlies of a bunch of different applications like Firefox, Thunderbird, XULRunner, SeaMonkey, Camino, etc.).
symbol storage can be distributed. symstore/symsrv rely on the concept of multiple stores, each of which has a database that maps a requested dll-hash to a symbol file.
note that there's nothing wrong w/ the symbol server knowing which pdbs belong to a given build, that's fairly standard for symstore. i just don't want the client to remember and try to suggest to the server that it use a build id as a lookup key. if two dlls have the same hash, then you can actually *save* space by only keeping a single copy of its pdb instead of one for each and every build where that library was the same.
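The space-saving argument can be sketched with a small content-addressed store. This is a toy model, not symstore: the `SymbolStore` class is hypothetical, and a plain SHA-1 of the module bytes stands in for whatever per-dll identifier a real symbol server would use.

```python
import hashlib


class SymbolStore:
    """Toy symbol store keyed by module hash rather than by build."""

    def __init__(self):
        self._symbols_by_hash = {}  # module hash -> symbol data
        self._builds = {}           # build id -> {module name: module hash}

    def add(self, build_id, module_name, module_bytes, symbol_data):
        """Register a module for a build; identical modules share one entry."""
        module_hash = hashlib.sha1(module_bytes).hexdigest()
        self._symbols_by_hash.setdefault(module_hash, symbol_data)
        self._builds.setdefault(build_id, {})[module_name] = module_hash
        return module_hash

    def lookup(self, module_hash):
        """Return the symbol data for a module hash, or None if unknown."""
        return self._symbols_by_hash.get(module_hash)

    def stored_copies(self):
        """Number of distinct symbol files actually kept."""
        return len(self._symbols_by_hash)
```

If two nightlies ship a byte-identical library, the second `add` records the build-to-hash mapping but keeps only the one existing symbol file. The server still knows which modules belong to which build, while clients look things up by hash alone.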
@Fernando Herrera from comment #24
I'm using way b)
After a crash the client sends the trace and the memory map (taken from /proc/self/maps) with a build id to the server ( http://seamonkey.itkombinat.de/crashrep ), where it is examined. It is sent via a POST request, not uploaded as a file.
Yeah, I want to replace the build ID with an md5 hash so we can see if someone changed .so files. But that's for later, I think.
I know nothing about .pdb files and the like on MS Windows systems, but I don't want to depend on MSVC or anything else. Maybe we'll want to switch to gcc or Watcom (who knows).
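For readers unfamiliar with the memory map, here is a hedged sketch of how a server might use it to turn an absolute frame address into a module path plus file offset. It assumes the standard Linux /proc/<pid>/maps line layout (address range, permissions, offset, device, inode, path); `parse_maps` and `locate` are names invented for this example and are not the crashrep server code.

```python
def parse_maps(text):
    """Parse /proc/<pid>/maps-style text into executable file-backed regions.

    Each returned tuple is (start, end, file_offset, path).
    """
    regions = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping: no path column
        address_range, perms, offset = fields[0], fields[1], fields[2]
        if "x" not in perms:
            continue  # only executable regions can hold stack frame addresses
        start, end = (int(part, 16) for part in address_range.split("-"))
        regions.append((start, end, int(offset, 16), fields[5].strip()))
    return regions


def locate(address, regions):
    """Return (path, offset_in_file) for an address, or None if unmapped."""
    for start, end, file_offset, path in regions:
        if start <= address < end:
            return path, address - start + file_offset
    return None
```

The resulting (path, offset) pair is what the server would then match against the symbols of the corresponding binary.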
Official win32 builds are produced with MSVC. As such, that's all I really care about for the purposes of crash reporting on that platform for the time being.
I don't see any problems with using a hash to identify libraries, but I'd want the client to at least specify a build ID as a backup hint. There are cases where the hash of a client-side library might change: consider prebinding on OS X.
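One way to read this proposal is as a two-step lookup: try the library hash first, and fall back to the build ID only when the hash is unknown (as could happen after prebinding rewrites the file). The sketch below is purely illustrative; `find_symbols` and the two dictionaries are assumed data structures, not an existing server API.

```python
def find_symbols(module_name, module_hash, build_id, by_hash, by_build):
    """Look up symbols by hash first, then by (build_id, module_name).

    Returns (symbols, how), where how is "hash", "build-id" or "missing",
    so the server can tell which key actually matched.
    """
    symbols = by_hash.get(module_hash)
    if symbols is not None:
        return symbols, "hash"
    symbols = by_build.get((build_id, module_name))
    if symbols is not None:
        return symbols, "build-id"
    return None, "missing"
```

Reporting which key matched lets the server flag stacks resolved through the weaker build-ID hint.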
> Yeah, I want to replace the build ID with an md5 hash so we can see if someone
> changed .so files. But that's for later, I think.
It would be nice to avoid changes to the protocol.
I'm going to be changing the buildid stuff on trunk to separate out various uses of buildids. For our purposes the buildID used by crashrep and update should be a long identifier such as "gaius-trunk-2006032212-en-US" that identifies the precise bits being staged. The buildID will be specified by the tinderbox configuration (thus debug homebrew builds won't have this kind of unique ID).
Don't let my plans get in the way of using NS_BUILD_ID or dllhashes for the time being until I can get my act in gear.
We don't have a protocol yet, and excuse me, I didn't mean replace ... I meant extend. So we only add one more parameter.
But until I have something for win32, I won't call it a protocol.
Is it possible to get the source for the walker?
On Linux I didn't implement a walker; I used the gtk(?) function for that. So maybe the Linux version needs its own walker. I don't like depending on /proc for the memory mapping.
I also won't use minidump, as it seems to be available only from WinXP upwards.
Ah, I forgot to upload the tinderbox patch, and since I'm speaking of buildID ...
the build isn't identified by buildID; it is identified by a productID.
At the moment the following is done by the tinderbox script:
1) Build a debug-enabled build.
2) Connect via ssh to the crashrep server and log in to your account.
3) Start a script 'symbols/addnewbuild.php $vendor $name $type $extra $buildID $machine' that returns a new productID.
4) Copy all binaries into a directory and strip out everything that isn't needed for debug processing (like Talkback seems to do).
5) Package the files into crashrep-$productID.tar.bz2.
6) Write the productID into crashrep.ini.
7) Upload the files to the server into ~/symbols.
8) Strip the debug symbols from the original binaries (as the tinderbox script normally does).
9) Package as the tinderbox script normally does.
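Steps 5 and 6 above could look roughly like the following. This is a sketch and not the actual tinderbox patch: the helper names are invented, and the layout of crashrep.ini (a `[Crashrep]` section with a `productID` key) is an assumption, since the real file format isn't shown here.

```python
import configparser
import tarfile
from pathlib import Path


def package_symbols(symbol_dir, product_id, out_dir):
    """Pack every file in symbol_dir into crashrep-<productID>.tar.bz2."""
    archive = Path(out_dir) / f"crashrep-{product_id}.tar.bz2"
    with tarfile.open(archive, "w:bz2") as tar:
        for path in sorted(Path(symbol_dir).iterdir()):
            # Store files by bare name so the archive has no local paths.
            tar.add(path, arcname=path.name)
    return archive


def write_product_id(ini_path, product_id):
    """Write the productID into an ini file (section and key are assumed)."""
    config = configparser.ConfigParser()
    config.optionxform = str  # keep the key's capitalisation as written
    config["Crashrep"] = {"productID": str(product_id)}
    with open(ini_path, "w") as handle:
        config.write(handle)
```

The archive would then be uploaded to ~/symbols on the server (step 7), while crashrep.ini ships inside the build so the client can report its productID.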
Avoid topic creep. File new bugs early, and file often.
I opened bug 331357 to track the win32 stuff I'm working on.
The current talkback client is not accessible on Linux. We need a new one based on an accessible widget set like GTK.
I really doubt Jay can work on a new Talkback client, "thanks" to the NDA tied to the old Talkback code.
This is now being worked on by the airbag project: http://code.google.com/p/airbag/
(In reply to comment #43)
> I really doubt Jay can work on a new Talkback client, "thanks" to the NDA tied
> to the old Talkback code.
Yes, I don't plan on working on any new version of Talkback, since I haven't seen the code and don't plan on looking anytime soon. All Talkback related bugs in Bugzilla are default assigned to me, since I'm the only person maintaining the servers at MoCo... and many will probably never be fixed... that's what airbag is for! :-)
(In reply to comment #44)
> This is now being worked on by the airbag project:
Should we mutate this bug into getting airbag integrated into Mozilla (once it's usable), or close it and open a new one?
*** This bug has been marked as a duplicate of 354980 ***
*** Bug 118994 has been marked as a duplicate of this bug. ***
changed URL from http://code.google.com/p/airbag/ to http://code.google.com/p/google-breakpad/ due to http://groups.google.com/group/google-breakpad-discuss/browse_thread/thread/4f40867980fe7452
Should this stay open and assigned to nobody, or actually go FIXED due to breakpad being available or at least get some assignee that marks it when breakpad is fully deployed?
Let's call this FIXED! woot
Is this now shipping on nightly trunk Seamonkey builds? If so, I am not seeing it.
This bug is only about _implementing_ such a tool at all. It is currently shipped on Mac and Windows for Firefox (Linux will come soon), and once it's been tested well enough there, it will be shipped with other applications.
Track bug 383125 for that tool being shipped with SeaMonkey trunk.