Closed Bug 225735 (opened 16 years ago, closed last year)

tinderbox server chokes when client sends old, queued-up mail

Categories

(Webtools Graveyard :: Tinderbox, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: dbaron, Unassigned)

I've observed this problem a number of times in the past few weeks (mostly on
SeaMonkey-Ports) after rebooting a tinderbox that was having network problems.
The reboot causes all of the queued-up failure mail to be flushed through to the
server at once, sometimes multiple days' worth of mail.  The showbuilds.cgi page
then goes blank, and afterwards only shows information newer than the time the
mail was flushed.  (This is the case right now on SeaMonkey-Ports.)

I suspect the problem is that the tinderbox server goes backwards through the
mail, in the reverse of the order in which it was received, and stops once it
hits a message older than what it's supposed to be displaying.  (This fits
with the fact that adding &hours=240 or something like that to the URL makes the
problem go away, and the &hours=NNN that is required seems to match how much
queued-up mail was flushed by the client.)
This shows where the problem is:

Index: tbglobals.pl
===================================================================
RCS file: /cvsroot/mozilla/webtools/tinderbox/tbglobals.pl,v
retrieving revision 1.28
diff -p -u -4 -r1.28 tbglobals.pl
--- tbglobals.pl        24 Mar 2004 03:37:38 -0000      1.28
+++ tbglobals.pl        10 Dec 2004 21:50:41 -0000
@@ -337,8 +337,12 @@ sub load_buildlog {
     if ($buildtime < $mindate - 2*60*60) {
       # Occasionally, a build might show up with a bogus time.  So,
       # we won't judge ourselves as having hit the end until we
       # hit a full 20 lines in a row that are too early.
+      # XXXldb This is the wrong way of doing things.  When a flood
+      # of backed up mail from one machine comes in, it can easily be
+      # more than 20.  We should be looking at the mail receipt time
+      # to decide when we're done rather than the build time.
       last if $tooearly++ > 20;

       next;
     }
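
To make the failure mode concrete, here is a small Python model of the loop in the diff above. It is only a sketch, not the Tinderbox code (which is Perl): the `Entry` record and the functions `scan_by_buildtime` and `scan_by_mailtime` are hypothetical names, and it assumes the log is appended in order of mail receipt and read newest-first, as the report suspects. Whether `$tooearly` is reset anywhere outside the hunk is not visible in the diff, so the model never resets it.

```python
from typing import List, NamedTuple

class Entry(NamedTuple):
    mailtime: int   # when the server received the mail (assumed to be recorded)
    buildtime: int  # build time reported by the client

SLACK = 2 * 60 * 60  # the 2*60*60 margin from the diff

def scan_by_buildtime(log: List[Entry], mindate: int, limit: int = 20) -> List[Entry]:
    """Model of the existing check: stop after too many early build times."""
    kept, tooearly = [], 0
    for e in reversed(log):                 # newest-received entry first
        if e.buildtime < mindate - SLACK:
            if tooearly > limit:            # Perl: last if $tooearly++ > 20;
                break
            tooearly += 1
            continue
        kept.append(e)
    return kept

def scan_by_mailtime(log: List[Entry], mindate: int) -> List[Entry]:
    """Model of the XXXldb suggestion: stop on receipt time instead."""
    kept = []
    for e in reversed(log):
        if e.mailtime < mindate - SLACK:
            break                           # everything earlier was received even earlier
        if e.buildtime >= mindate - SLACK:
            kept.append(e)
    return kept

NOW = 1_100_000_000
# Five normal builds received over the last five hours ...
fresh = [Entry(NOW - k * 3600, NOW - k * 3600) for k in range(5, 0, -1)]
# ... followed by 30 queued mails, all received now, for builds three days old.
flood = [Entry(NOW, NOW - 3 * 86400 - i * 60) for i in range(30)]
log = fresh + flood

shown_default = scan_by_buildtime(log, NOW - 12 * 3600)
shown_240h = scan_by_buildtime(log, NOW - 240 * 3600)
shown_by_mail = scan_by_mailtime(log, NOW - 12 * 3600)
```

In this model the 12-hour view built from build times comes back empty, because the scan gives up inside the flood before reaching the fresh entries. Widening the window (the analogue of &hours=240) or stopping on receipt time both bring the fresh entries back.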
QA Contact: timeless → tinderbox
Once the changes from bug 363102 land, we will no longer be able to check the mail receipt/processing time.

The $tooearly check is just an optimization.  How badly are we affected if we read in the entire build.dat on each run?  Probably too badly to seriously contemplate given the size of build.dat on my relatively small setup (102k lines, 8M). 

I'm wondering if it would be worth it to replace the simple structure of build.dat with something like sqlite.
build.dat was a great idea back when there weren't such common solutions as sqlite, so I agree we should look into it.
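
As a rough illustration of what the sqlite idea could look like, here is a sketch using Python's standard `sqlite3` module. The table name, the column names and the helper functions are all hypothetical, not an existing Tinderbox schema. The point is only that an index on the build time lets the server ask for "everything since mindate" directly, without a backwards scan or an early-exit heuristic.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a hypothetical build store with an index on build time."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS builds (
               mailtime  INTEGER NOT NULL,
               buildtime INTEGER NOT NULL,
               buildname TEXT    NOT NULL,
               status    TEXT    NOT NULL)"""
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS builds_by_buildtime ON builds (buildtime)"
    )
    return conn

def add_build(conn, mailtime, buildtime, buildname, status):
    """Record one build report; the with-block commits the transaction."""
    with conn:
        conn.execute(
            "INSERT INTO builds VALUES (?, ?, ?, ?)",
            (mailtime, buildtime, buildname, status),
        )

def builds_since(conn, mindate):
    """Return builds whose build time is at or after mindate, newest first."""
    cur = conn.execute(
        "SELECT buildtime, buildname, status FROM builds "
        "WHERE buildtime >= ? ORDER BY buildtime DESC",
        (mindate,),
    )
    return cur.fetchall()
```

With a store like this, late-arriving mail for old builds is simply inserted out of order, and the query for the display window is unaffected by it.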

Does tinderbox have similar code to purge build.dat of older entries during processing?
When we trim the logs via the admin interface, we also purge the old entries from build.dat.
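
For illustration, a purge of this kind could be sketched as below. This is not the admin interface's actual code: the function name `trim_build_data` is made up, and the file layout (one pipe-delimited record per line, with the build time as the second field) is an assumption made for the example.

```python
import os
import tempfile

def trim_build_data(path, cutoff, time_field=1):
    """Rewrite the file at path, dropping records older than cutoff.

    Assumes pipe-delimited lines with an integer timestamp in time_field.
    Lines that cannot be parsed are kept rather than silently discarded.
    Returns (kept, dropped).
    """
    kept = dropped = 0
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "w") as out, open(path) as src:
            for line in src:
                fields = line.rstrip("\n").split("|")
                try:
                    when = int(fields[time_field])
                except (IndexError, ValueError):
                    out.write(line)
                    kept += 1
                    continue
                if when >= cutoff:
                    out.write(line)
                    kept += 1
                else:
                    dropped += 1
        os.replace(tmp, path)   # swap in the trimmed file in one step
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)      # do not leave a half-written temp file behind
        raise
    return kept, dropped
```

Writing to a temporary file in the same directory and renaming it over the original means a reader never sees a partially trimmed file.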
Depends on: 415502
Assignee: mcafee → nobody
Duplicate of this bug: 485015
Product: Webtools → Webtools Graveyard
Tinderbox isn't maintained anymore. Closing.
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → WONTFIX