Closed Bug 332645 Opened 20 years ago Closed 19 years ago

Nagios needs to monitor tinderbox build results

Categories

(mozilla.org Graveyard :: Server Operations, task, P2)

All
Other

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mtschrep, Assigned: justdave)

Details

Attachments

(1 file, 4 obsolete files)

We need to know when a critical tinderbox build stops producing. One quick way to do this would be to simply monitor:
http://stage.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla1.8/
http://stage.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/
for all platforms, and the equivalent for Thunderbird. If any of the files have not been posted within 24 hours, alerts would be kicked off.
Assignee: server-ops → justdave
Priority: -- → P2
Here is a plugin to do this. First run creates a status file, subsequent runs compare this status file to the target file/dir using the 'find' command's '-newer' option. If target file/dir is not newer than status file, then the plugin reports an error. This plugin is intended to be run on the server (e.g. using NRPE). The -f option specifies the file/dir name to check. The -a option specifies the allowed age of the file/dir (in seconds) before this condition is reported as an error. Perl and find must be on the $PATH for this plugin to work (PATH is configurable at the top of the plugin). Perl is used to get the current system time in "unix time" format. This dependency can be removed if required by using GNU date's "+%s" formatting option.
Here is a version which specifies "-maxdepth 1" for the find command, so it will run quickly even if used on a directory with a deeply nested subdirectory structure.
Attachment #217806 - Attachment is obsolete: true
Thanks! I'll need suggestions for what files to monitor with it...
I'd suggest monitoring at least the following on the ftp site:
http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/argo-trunk/firefox-1.6a1.en-US.linux-i686.tar.bz2
http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/gaius-trunk/firefox-3.0a1.en-US.win32.zip
http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/atlantia-trunk/firefox-1.6a1.en-US.mac.dmg
These 3 should cover the machines that do nightly builds on the 3 main platforms. I'd check every 2-3 hours instead of every 24 hours so we have faster notification.
I've compiled a list of the nightly directories on the public wiki: http://wiki.mozilla.org/Build:Nightly_deliverables These all should exist on stage, and should be updated once every 24 hours, otherwise build@mozilla.org should get an email. Note that this list is going to change over time. Are the nagios configs checked in somewhere? If so I can send you diffs as things change, or maybe we can figure out a way to import a list easily into the config, that I can give you periodic diffs for.
What you could do is have the check script pull a list off the wiki of directories to check. There's a parameter you can pass to mediawiki to get the raw source of the page (which is used to get the css for the pages, btw). The script could use LWP to grab that and parse it. As long as it knows to syntax/sanity check the list and error out/page if it's busted. :)
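A minimal sketch of the "syntax/sanity check the list" step, written in Python rather than the Perl/LWP the comment suggests. It only covers parsing; the raw text would come from the wiki (MediaWiki serves raw page source via the `action=raw` URL parameter). The page layout assumed here (one absolute path per line, optionally as a `*` bullet) and the function name `parse_directory_list` are assumptions, not the actual format of the Build:Nightly_deliverables page.

```python
import re

# Assumed format: every monitored path lives under /pub/mozilla.org/ and uses
# only plain filename characters. The real wiki page may look different.
PATH_RE = re.compile(r"^/pub/mozilla\.org/[A-Za-z0-9._/-]+$")


def parse_directory_list(raw_text, min_entries=1):
    """Return the paths found in raw wiki text, or raise ValueError if the
    list looks broken, so the caller can error out and page someone."""
    paths = []
    for line in raw_text.splitlines():
        candidate = line.lstrip("*").strip()
        if not candidate.startswith("/"):
            continue  # headings, prose, and blank lines are ignored
        if not PATH_RE.match(candidate) or ".." in candidate.split("/"):
            raise ValueError(f"suspicious entry in list: {candidate!r}")
        paths.append(candidate)
    if len(paths) < min_entries:
        raise ValueError("list is empty or too short; refusing to use it")
    return paths
```

Raising on any suspicious entry, instead of silently skipping it, is deliberate: a vandalized or half-edited wiki page should cause a page, not quietly reduce the set of monitored files.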
That's a good idea, but the docs would need to be a lot more solid (which I am working on). For right now, please get the following monitored as soon as possible. These files should update at least once every 24 hours; if not, build@mozilla.org should get an email:
/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-3.0a1.en-US.linux-i686.complete.mar
/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-3.0a1.en-US.linux-i686.installer.tar.bz2
/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-3.0a1.en-US.linux-i686.tar.bz2
/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-3.0a1.en-US.mac.complete.mar
/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-3.0a1.en-US.mac.dmg
/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-3.0a1.en-US.win32.complete.mar
/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-3.0a1.en-US.win32.installer.exe
/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-3.0a1.en-US.win32.zip
/pub/mozilla.org/firefox/nightly/latest-mozilla1.8/firefox-2.0a1.en-US.linux-i686.installer.tar.gz
/pub/mozilla.org/firefox/nightly/latest-mozilla1.8/firefox-2.0a1.en-US.linux-i686.mar
/pub/mozilla.org/firefox/nightly/latest-mozilla1.8/firefox-2.0a1.en-US.linux-i686.tar.gz
/pub/mozilla.org/firefox/nightly/latest-mozilla1.8/firefox-2.0a1.en-US.mac.dmg
/pub/mozilla.org/firefox/nightly/latest-mozilla1.8/firefox-2.0a1.en-US.mac.mar
/pub/mozilla.org/firefox/nightly/latest-mozilla1.8/firefox-2.0a1.en-US.win32.installer.exe
/pub/mozilla.org/firefox/nightly/latest-mozilla1.8/firefox-2.0a1.en-US.win32.mar
/pub/mozilla.org/firefox/nightly/latest-mozilla1.8/firefox-2.0a1.en-US.win32.zip
justdave pointed out on IRC how this script could be made much simpler and more intuitive by dropping the state file. This version also pulls in utils.sh from the Nagios plugin dir (the nagios_plugin_path is now configurable in the script).
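The attachment itself is not reproduced in this bug, so the following is only a sketch of what a check without a state file can look like: compare the file's modification time directly against the current time. It is written in Python for illustration (the attached plugin is a shell script using utils.sh), and the function name `check_file_age` and the message format are assumptions. The exit codes are the standard Nagios plugin return codes.

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_file_age(path, max_age_seconds, now=None):
    """Return (exit_code, message) based on the file's modification time.

    `now` can be injected for testing; it defaults to the current time.
    """
    if now is None:
        now = time.time()
    age = int(now - os.stat(path).st_mtime)
    if age > max_age_seconds:
        return CRITICAL, f"FILE_AGE CRITICAL: {path} is {age} seconds old"
    return OK, f"FILE_AGE OK: {path} is {age} seconds old"
```

For the 24-hour requirement discussed above, the allowed age would be 86400 seconds. Because nothing is written between runs, the check gives the same answer no matter how often or from where it is invoked.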
Attachment #217811 - Attachment is obsolete: true
Attachment #218861 - Attachment is obsolete: true
Attached file better error message (obsolete)
Attachment #218862 - Attachment is obsolete: true
Attached file better error checking
Checks that the file exists, and that the -f and -a args are supplied.
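As an illustration of this kind of error checking, here is a sketch in Python (not the attached shell plugin) that requires both `-f` and `-a` and verifies the target exists. The names `PluginArgumentParser`, `UsageError`, and `parse_plugin_args` are hypothetical. One detail worth handling: `argparse` exits with status 2 on bad arguments by default, which Nagios would interpret as CRITICAL, so the sketch reports usage problems as UNKNOWN (3) instead.

```python
import argparse
import os

UNKNOWN = 3  # Nagios exit code for "the check itself could not run"


class UsageError(Exception):
    """Raised when the plugin is invoked incorrectly."""


class PluginArgumentParser(argparse.ArgumentParser):
    def error(self, message):
        # Raise instead of exiting with argparse's default status 2.
        raise UsageError(message)


def parse_plugin_args(argv):
    """Return (path, max_age_seconds) or raise UsageError."""
    parser = PluginArgumentParser(prog="check_file_age_sketch", add_help=False)
    parser.add_argument("-f", dest="path", required=True)
    parser.add_argument("-a", dest="max_age", type=int, required=True)
    args = parser.parse_args(argv)
    if not os.path.exists(args.path):
        raise UsageError(f"{args.path} does not exist")
    return args.path, args.max_age


def main(argv):
    try:
        path, max_age = parse_plugin_args(argv)
    except UsageError as exc:
        print(f"FILE_AGE UNKNOWN: {exc}")
        return UNKNOWN
    print(f"arguments ok: {path}, max age {max_age}s")
    return 0
```

Treating a missing file as UNKNOWN is a design choice of this sketch; a deployment that considers a missing nightly build a failure in its own right might prefer CRITICAL.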
hey, shouldn't this also monitor the source tarball at http://ftp.mozilla.org/pub/mozilla.org/mozilla/nightly/latest/mozilla-source.tar.bz2? (which seems to be very out of date at the moment...)
Re comment #11: only if it's automatically generated (I believe that it is not right now). The purpose of this bug is to make sure the tinderboxes are getting nightly builds out to testers on time.
Done, it's live. Right now I have this set up in a batch file that consolidates the output of running this check script on the entire list of files, and sends one page when any of them are too old. If you want, maybe we can separate it out later. This was the quickest way to get it working, since you have to set up a separate service check for each one on the nagios server to separate them out. But I can still do it that way if you want... :) (btw, the last page that went out to the build list is real, that wasn't a test -- the linux files are a few days old)
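The batch file itself is not shown in this bug. As a rough sketch of the consolidation idea, assuming each per-file check yields a Nagios exit code, the results can be folded into a single status and message so that only one page goes out. The function name `consolidate`, the message wording, and the choice to rank CRITICAL above UNKNOWN are all assumptions of this example.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Ranking used to pick the overall status; CRITICAL outranks UNKNOWN here.
SEVERITY = {OK: 0, WARNING: 1, UNKNOWN: 2, CRITICAL: 3}


def consolidate(results):
    """Fold a list of (path, exit_code) pairs into one (exit_code, message)."""
    if not results:
        return UNKNOWN, "UNKNOWN: no files were checked"
    problems = [(path, code) for path, code in results if code != OK]
    if not problems:
        return OK, f"OK: all {len(results)} files are fresh"
    worst = max((code for _, code in problems), key=SEVERITY.__getitem__)
    names = ", ".join(path for path, _ in problems)
    return worst, f"{len(problems)} of {len(results)} files have problems: {names}"
```

The trade-off is the one mentioned in the comment: a single consolidated service is quick to set up, while separate service checks per file give finer-grained alerts at the cost of more configuration on the Nagios server.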
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
mozilla-source used to be automatically generated each night. that periodically stops working though... see, for example, bug 305297, bug 275805, bug 179042
Product: mozilla.org → mozilla.org Graveyard