Closed Bug 358302 Opened 19 years ago Closed 18 years ago

set up staging/testing server for breakpad collection and reporting

Categories

(mozilla.org Graveyard :: Server Operations: Projects, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ted, Unassigned)

Attachments

(4 obsolete files)

Now that bug 354980 is in, we need to get the server side of airbag sorted out. I am currently running a very simple collector/processor at http://mavra.perilith.com/~luser/airbag-collector/
The collector source: http://mavra.perilith.com/~luser/airbag-collector/index.txt
The lister and source: http://mavra.perilith.com/~luser/airbag-collector/list.pl http://mavra.perilith.com/~luser/airbag-collector/list.txt
This just uses a SQLite backend and a slightly modified version of the minidump_stackwalk code from airbag. We need to figure out how we want to host this, among other things. We'll have lots of flat files for the debug symbols, and the processed crash reports will be in a database (MySQL probably).
Blocks: 360327
Blocks: 362970
I rewrote some of the code for this, so I might as well update the progress here. The collector and lister are written in Python now, and the reports are stored in a MySQL database. Attached is the schema for the DB I'm using right now. Current crash reports can be seen at: http://mavra.perilith.com/~luser/airbag-collector/list.py I'll attach the rest of the source shortly.
Stackwalk program for processing minidumps. Based heavily on minidump_stackwalk.cc from the airbag source, but with machine-readable output. The collect script runs this on a minidump when it receives it and parses the result.
Attached file crash report collector CGI (obsolete)
This python CGI collects crash reports via POST from the crashreporter client. It runs moz_stackwalk on the minidump, parses the output, and puts it all into the database.
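The collect-and-store flow described above can be sketched roughly as follows. This is a minimal illustration, not the attached CGI: the function names, the `reports` table, and the stackwalker command line are assumptions, and `sqlite3` stands in for the MySQL database so the example stays self-contained.

```python
import sqlite3
import subprocess


def run_stackwalk(dump_path, stackwalk_cmd=("moz_stackwalk",)):
    """Run a stackwalker on a minidump and return its text output.

    The command line is an assumption; the real tool may need extra
    arguments such as a symbol directory.
    """
    result = subprocess.run(
        [*stackwalk_cmd, dump_path],
        capture_output=True,
        text=True,
        check=True,  # raise if the stackwalker exits with an error
    )
    return result.stdout


def store_report(conn, dump_path, stackwalk_output):
    """Insert one processed report and return its row id.

    The table layout is hypothetical, not the schema attached to this bug.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reports ("
        "id INTEGER PRIMARY KEY, "
        "dump_path TEXT NOT NULL, "
        "stackwalk_output TEXT NOT NULL)"
    )
    cursor = conn.execute(
        "INSERT INTO reports (dump_path, stackwalk_output) VALUES (?, ?)",
        (dump_path, stackwalk_output),
    )
    conn.commit()
    return cursor.lastrowid
```

A collector would call `run_stackwalk` on each uploaded minidump and pass the result to `store_report`. Keeping the two steps separate also makes it easier to move processing to a different machine later.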
Attached file crash report list CGI (obsolete)
This Python CGI lists crash reports in the database, and lets you see the contents of each one. It's pretty simplistic, but it's a start. I don't know if we'd want to write our talkback-public equivalent in Python or PHP or what. Last attachment, I swear. :)
No longer blocks: 362970
Ok, now that bug 362970 has landed, I think it's time to address this. We can easily configure tinderboxes to upload their symbols somewhere for airbag to use. We just need a place to put them, and then we need to set up the database and CGIs attached to this bug.
The moz_stackwalk program here won't compile against the latest breakpad source, but the minidump_stackwalk program that comes with breakpad can now produce machine-readable output. The Python code here will need to be modified, as the output is slightly different from what it's expecting, but it's pretty close. You run minidump_stackwalk with the -m option to get machine-readable output.
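For anyone adapting the Python code, here is a rough sketch of parsing pipe-delimited output of the kind that minidump_stackwalk -m emits. The record tags and field order below are an assumption based on how that format is commonly laid out and may differ between breakpad revisions. The sample text is invented for illustration and is not output from a real crash.

```python
# Assumed field order for stack frame lines; verify against your breakpad build.
FRAME_FIELDS = ("thread", "frame", "module", "function", "file", "line", "offset")

# Invented sample, not real crash data.
SAMPLE = """\
OS|Windows NT|5.1.2600 Service Pack 2
CPU|x86|GenuineIntel family 6 model 14 stepping 8
Crash|EXCEPTION_ACCESS_VIOLATION|0x0|0
Module|example.dll|1.0.0.0|example.pdb|ABCDEF|0x10000000|0x100fffff|1

0|0|example.dll|crash_here|example.cpp|42|0x5
0|1|example.dll|main|example.cpp|10|0x1c
"""


def parse_machine_readable(text):
    """Split pipe-delimited stackwalk output into header records and frames."""
    report = {"os": None, "cpu": None, "crash": None, "modules": [], "frames": []}
    for line in text.splitlines():
        if not line.strip():
            continue  # a blank line separates the header from the frames
        fields = line.split("|")
        tag = fields[0]
        if tag == "OS":
            report["os"] = fields[1:]
        elif tag == "CPU":
            report["cpu"] = fields[1:]
        elif tag == "Crash":
            report["crash"] = fields[1:]
        elif tag == "Module":
            report["modules"].append(fields[1:])
        elif tag.isdigit() and len(fields) >= 2:
            # Frame lines start with the thread number.
            frame = dict(zip(FRAME_FIELDS, fields))
            frame["thread"] = int(fields[0])
            frame["frame"] = int(fields[1])
            report["frames"].append(frame)
    return report


report = parse_machine_readable(SAMPLE)
```

Unknown record tags are ignored rather than treated as errors, so the parser tolerates small format differences of the sort mentioned above.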
Steps to produce the symbols that the server uses: make sure your mozconfig has --enable-debugger-info-modules. Then, after you build, run |make buildsymbols| in the root of your objdir. You will wind up with a $BUILD_ID directory alongside your mozilla dir with a tar.bz2 of your symbols.
also, "export MOZ_DEBUG_SYMBOLS"
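On the server side, the resulting tar.bz2 has to be unpacked into whatever directory the stackwalker reads symbols from. A minimal sketch of that step is below, assuming the archive holds plain files and directories. The function name and the target directory are hypothetical, and the internal layout of the archive is not specified in this bug.

```python
import os
import tarfile


def unpack_symbols(archive_path, store_dir):
    """Extract a bzip2-compressed symbol tarball into store_dir.

    Returns the names of the regular files that were extracted.
    """
    os.makedirs(store_dir, exist_ok=True)
    root = os.path.realpath(store_dir)
    with tarfile.open(archive_path, "r:bz2") as archive:
        members = archive.getmembers()
        for member in members:
            # Refuse links and device nodes; only files and directories are expected.
            if not (member.isfile() or member.isdir()):
                raise ValueError("unexpected member type: " + member.name)
            # Refuse paths that would land outside the symbol store.
            target = os.path.realpath(os.path.join(root, member.name))
            if os.path.commonpath([root, target]) != root:
                raise ValueError("unsafe path in archive: " + member.name)
        archive.extractall(root, members=members)
    return [member.name for member in members if member.isfile()]
```

The path check matters here because the archives arrive from build machines over the network, so the unpacking step should not trust member names blindly.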
Summary: set up staging/testing server for airbag collection and reporting → set up staging/testing server for breakpad collection and reporting
I adapted these for the breakpad trunk and ported them to Pylons. http://code.google.com/p/socorro/ Seems to work!
Attachment #250492 - Attachment is obsolete: true
Attachment #250493 - Attachment is obsolete: true
Attachment #250494 - Attachment is obsolete: true
Attachment #250496 - Attachment is obsolete: true
Most of the comments in this bug are off-topic ;-) For the initial rollout, we would like two servers: one to do collection and reporting, and one to do processing. They need to have shared access to a file store of at least 10G, and a shared MySQL database. The tinderboxes will need SCP access to copy files to this filestore, and morgamic/ted will also need access to that filestore for maintenance purposes (uploading Windows system symbols). The collection/reporting machine should be behind SSL at a suitable URL such as crash-reporting.mozilla.com. The processing machine does not need direct internet access and should probably stay completely behind the firewall.
Aravind, we are going to go with PostgreSQL for our database backend because it offers some features we need to take advantage of. See: http://wiki.mozilla.org/Breakpad/Design/Database Would it be possible to use 8.2/8.1? What is RHEL support for Postgres like? Also, for replication, schrep mentioned Slony (http://slony.info/), though I'm not entirely sure we will need real-time replication; the reporting db could possibly just be updated by a nightly batch job. Does anybody have thoughts on whether or not we need real-time replication?
mrdb-stage01 (the stage db box for stage) now has the postgres 8.1.3 server. mrapp-stage03 (the digestor) has postgres 8.1.8 client libraries and the python-postgres bindings. I looked at slony and it seems like it would work for what we need. Let me know if you need me to get anything else installed. If we need support for large objects in the db, slony may not work (current versions may have fixed this), but pgcluster works with large db objects.
Staging servers are up, and this bug is overly general, so we'll file new bugs as needed.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
It seems complicated that we are tracking mozilla.org/com server changes over in http://code.google.com/p/socorro/issues/list I guess there is no way to avoid duplication if we tried to track stuff in bugzilla, but there are Firefox beta blockers that should surface on the radar, which means being able to track the status of things like http://code.google.com/p/socorro/issues/detail?id=117 and http://code.google.com/p/socorro/issues/detail?id=119 From a crash analysis standpoint, it doesn't make much sense to send out the beta (or, more importantly, the final release) until we can jump on analysis of the incoming data. How should we track issues like that? For blocker-type issues, should we duplicate them in bugzilla?
I have duplicated Breakpad bugs into bugzilla for tracking purposes, since that's truly a separate project. We talked about just moving socorro bugs into a Webtools : Socorro component, but at the time we felt it wasn't really worth the effort. Then again, even in bugzilla not all components have blocking-1.9 or blocking-firefox3 flags, so you still might not be able to indicate what you really want. :-/
(In reply to comment #15)
> Then again, even in bugzilla not all components have blocking-1.9
> or blocking-firefox3 flags, so you still might not be able to indicate what you
> really want. :-/
You file a bug under mozilla.org :: Bugzilla: Keywords & Components to get blocking flags added... :)
I don't think that it really makes a difference whether it's in googlecode or BMO: the problem is finding somebody with time to actually fix the bugs.
Product: mozilla.org → mozilla.org Graveyard