Closed Bug 485015 Opened 16 years ago Closed 16 years ago

MozillaStaging is horked

Categories

(mozilla.org Graveyard :: Server Operations, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: reed)

Details

Like in bug 467922, we are having random drops. Can we please remake the tree?
I'll save the entire tree so cls can help debug.
Assignee: server-ops → reed
Component: Server Operations: Tinderbox Maintenance → Server Operations
[cls@vortex MozillaStaging]$ head -1 build.dat
1196846220|1196846220|Linux staging-prometheus-vm Depend Fx-Nightly|unix|building|1196846220.1196846243.24628.gz|
[cls@vortex MozillaStaging]$ perl -e '$t=localtime(1196846220); print "$t\n";'
Wed Dec 5 01:17:00 2007
[cls@vortex MozillaStaging]$
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → DUPLICATE
Never mind. I was checking the wrong end of the file. Too many balls in the air.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Wrong test; right conclusion. If you look at the admin page, none of the builds are marked as "Current". That's the first clue. Adding 'print STDERR' to load_buildlog in tbglobals.pl confirms it. Plus below:
[cls@vortex MozillaStaging]$ tail -20 build.dat | awk -F\| '{ print $2 "--" $3 }' | perl -e 'while(<STDIN>) { chomp; ($t,$n) = split(/--/,$_); $p=localtime($t); print "$p -- $n\n";}'
Thu Dec 4 10:55:28 2008 -- Linux mozilla-central nightly
Thu Dec 4 11:33:00 2008 -- WINNT 5.2 mozilla-central nightly
Thu Dec 4 10:55:28 2008 -- Linux mozilla-central nightly
Thu Dec 4 11:33:00 2008 -- WINNT 5.2 mozilla-central nightly
Thu Dec 4 10:55:28 2008 -- Linux mozilla-central nightly
Thu Dec 4 11:33:00 2008 -- WINNT 5.2 mozilla-central nightly
Thu Dec 4 10:55:28 2008 -- Linux mozilla-central nightly
Thu Dec 4 11:33:00 2008 -- WINNT 5.2 mozilla-central nightly
Thu Dec 4 10:55:28 2008 -- Linux mozilla-central nightly
Thu Dec 4 11:33:00 2008 -- WINNT 5.2 mozilla-central nightly
Thu Dec 4 10:55:28 2008 -- Linux mozilla-central nightly
Thu Dec 4 11:33:00 2008 -- WINNT 5.2 mozilla-central nightly
Thu Dec 4 10:55:28 2008 -- Linux mozilla-central nightly
Thu Dec 4 11:33:00 2008 -- WINNT 5.2 mozilla-central nightly
Thu Dec 4 10:55:28 2008 -- Linux mozilla-central nightly
Thu Dec 4 11:33:00 2008 -- WINNT 5.2 mozilla-central nightly
Thu Dec 4 10:55:28 2008 -- Linux mozilla-central nightly
Thu Dec 4 11:33:00 2008 -- WINNT 5.2 mozilla-central nightly
Thu Dec 4 10:55:28 2008 -- Linux mozilla-central nightly
Thu Dec 4 11:33:00 2008 -- WINNT 5.2 mozilla-central nightly
[cls@vortex MozillaStaging]$
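For anyone repeating the check above without the awk/perl pipeline, here is a minimal Python sketch of the same idea. It assumes only what the session shows: build.dat records are pipe-delimited, field 2 is an epoch timestamp, and field 3 is the build name. The function name summarize_tail is hypothetical, and times are printed in UTC rather than the server's local time.

```python
from datetime import datetime, timezone


def summarize_tail(lines, count=20):
    """Return (UTC time string, build name) pairs for the last `count` records.

    Assumes pipe-delimited records where field 2 is an epoch timestamp and
    field 3 is the build name, as in the awk command in this bug.
    """
    summary = []
    for line in lines[-count:]:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 3 or not fields[1].isdigit():
            continue  # skip records that do not match the assumed layout
        stamp = datetime.fromtimestamp(int(fields[1]), tz=timezone.utc)
        summary.append((stamp.strftime("%Y-%m-%d %H:%M:%S"), fields[2]))
    return summary


# The record quoted earlier in this bug (first line of build.dat).
sample = [
    "1196846220|1196846220|Linux staging-prometheus-vm Depend Fx-Nightly"
    "|unix|building|1196846220.1196846243.24628.gz|",
]

for when, name in summarize_tail(sample):
    print(when, "--", name)
```

To run it against a real file, pass open("build.dat").readlines() instead of sample. If the newest entries at the tail are months old, that is consistent with nothing being marked "Current".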
cls ran tinderbox's clean.pl script against a copy of MozillaStaging, and that resolved the problem with the broken waterfall. reed then ran it against tinderbox.m.o's tree, and it loaded quickly with recent builds shown. Armen could confirm whether the builds reflect reality, but it looks plausible to me.
clean.pl removes old builds from build.dat as well as old build logs, while the existing tidy mechanism was a cron job deleting build logs modified more than 60 days ago. Switching cron to clean.pl will cap the waterfall display at 60 days rather than <a long time ago>, but I think it's worth it for the responsiveness of the page load. Some 570k builds got removed from MozillaStaging, and I bet the Firefox and Firefox3.0 trees would load much faster after tidying. Will post to the newsgroups to make sure no one cares about older history.
That still leaves how the tree got horked in the first place, which comes back to the issue that tinderbox has with old mail (bug 225735). Possibly there are (or were) incorrect clocks on some build slaves, or the timestamp is being calculated incorrectly when sending tindermail. We'd have to diff build.dat against an older copy of itself to trace that. Can I ask you to do that, reed? If we use clean.pl then at least broken trees are reset to a working state once a day.
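To illustrate the 60-day cap discussed above, here is a rough Python sketch of age-based pruning of build.dat records. It is not clean.pl's actual implementation: the function name prune_old_builds is hypothetical, the choice of field 2 as the build time is an assumption taken from the awk command earlier in this bug, and the sketch leaves build log files alone.

```python
SECONDS_PER_DAY = 86400


def prune_old_builds(lines, now, max_age_days=60):
    """Keep only records whose timestamp is within `max_age_days` of `now`.

    Assumes pipe-delimited records with an epoch timestamp in field 2.
    Records without a parseable timestamp are dropped, on the guess that
    they cannot be placed on the waterfall anyway.
    """
    cutoff = now - max_age_days * SECONDS_PER_DAY
    kept = []
    for line in lines:
        fields = line.split("|")
        try:
            build_time = int(fields[1])
        except (IndexError, ValueError):
            continue  # malformed record: no usable timestamp
        if build_time >= cutoff:
            kept.append(line)
    return kept


# Illustrative records; the second and third are made up for this example.
records = [
    "1196846220|1196846220|Linux staging-prometheus-vm Depend Fx-Nightly|unix|building|x.gz|",
    "1238000000|1238000000|example recent build|unix|success|y.gz|",
    "not a valid record",
]
now = 1238400000  # fixed "current time" so the example is reproducible

for line in prune_old_builds(records, now):
    print(line)
```

Passing now explicitly, rather than calling time.time() inside the function, keeps the cutoff testable. It also avoids trusting the clock of whichever machine runs the job, which matters given the suspicion about incorrect clocks on build slaves.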
Just in case I don't read that newsgroup: I fairly often do care about older history (along the lines of "oh, crap, how long have we been having that error in uploading symbols and what might have happened the day it started?"), though I also quite often fail to find it because the tree was renamed, or the machine was renamed, or both, or something else that's not obvious.
Post is http://groups.google.com/group/mozilla.dev.tree-management/browse_thread/thread/61bf4585c4fafc40#
(In reply to comment #6)
So you'd care about the 60 day limit for logs, then? I'm suggesting limiting the waterfall to the same value.
Apparently my memory of what I've done is totally untrustworthy. Someone might care about the loss of starred builds, I guess, but what I thought I'd done seems never to have been possible.
From looking at:
http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaStaging&maxdate=1237930839&legend=0&norules=1
I can see these columns:
Linux mozilla-1.9.1 l10n %
WINNT 5.2 mozilla-central l10n %
linux_l10n_nightly %
macosx_l10n_nightly %
win32_l10n_nightly %
Now I can see more columns than I could when I filed the bug. Thanks for putting this into better shape, but do we know why we are missing the following builders?
        1.9.1   moz-central
Linux   YES     NO
Win32   NO      YES
Mac     NO      NO
BTW, I have noticed that the tinderbox pages do not "show the last 12 hours" but rather a certain number of rows, and since there are as many as 70 L10n builds it feels quite fast.
There's been no feedback on the newsgroups. I think we should go ahead with modifying the cron job, with backups of */build.dat if you feel so inclined.
There are no builds on MozillaStaging right now; it should at least show a 1.9.1 l10n build which started at Mon Mar 30 11:31:39 2009 (from our staging system) and finished at 11:35:45. There are no errors about sending mail in the log.
The staging environment has been reporting to MozillaTest instead of MozillaStaging. I do not know if this has been changed since the last time I used it.
Looks like it's MozillaStaging in the TinderboxMailNotifier setups for both
staging-master:/builds/buildbot/moz2-master/master-main.cfg
staging-1.9-master:/builds/buildbot/staging-trunk-master/master.cfg
(In reply to comment #10)
> There's been no feedback on the newsgroups. I think we should go ahead with
> modifying the cron job, with backups of */build.dat if you feel so inclined.
Done.
Status: REOPENED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard