Closed Bug 354462 Opened 18 years ago Closed 18 years ago

process build mail in batches via cron

Categories

(Webtools Graveyard :: Tinderbox, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: justdave, Assigned: cls)

References

Details

Attachments

(1 file, 2 obsolete files)

Since we got tinderbox-stage calling buildwho.pl tonight, we've been getting periodic bounce mails with the following error in them:

buildwho.pl: Lock unavailable: Resource temporarily unavailable at ./buildwho.pl line 107.

===
sub lock_datafile {
    my ($tree) = @_;

    my $lock_fh = new FileHandle ">>$tree/buildwho.sem"
      or die "Couldn't open semaphore file!";

    # Get an exclusive lock with a non-blocking request
    unless (flock($lock_fh, LOCK_EX|LOCK_NB)) {
        die "buildwho.pl: Lock unavailable: $!";           <--- line 107
    }
    return $lock_fh;
}
===

Now, to me, this looks like "ask for a lock, and if it's not available, die instead of waiting for it".

Since the locks appear to be tree-specific, it stands to reason that the only reason a lock would fail is if another process was already running buildwho.pl on the same tree.  And if that's the case, there's no reason we really need to be running, since it's already being done.

With that in mind, it seems to me like buildwho.pl should just silently exit if it can't get a lock, and let the rest of processbuild continue on its merry way as if it had succeeded, instead of dieing and causing the mail to bounce.

Am I reading this right?
only concern i have is that the current process will probably build a who w/ outdated info, you almost want an asynchronous reaper scheduled to deal w/ this.
The current process probably hasn't been running for any longer than it took the tinderbox in question to run the entire build it's reporting on, and the who list only needs to contain changes to that tree prior to the tinderbox starting to build, I thought.
I think the lock in buildwho.pl is (possibly inadvertantly) being used to serialize the calls to the other per-build scripts (scrape.pl, warnings.pl) as well as the rebuild of the static pages.  We don't want multiple processes trying to write those files.  We really need to move mail processing into a cron job.
Attached patch v0.9 (obsolete) — Splinter Review
Here's a work in progress (wfm) to process mail via cron.  I still need to figure out how to efficiently handle warnings & log scraping.  Run handlemail.pl with --cron to spool mail into the data/ dir and call processbuild.pl --cron from a cron job (preferably as the tinderbox user) to process the mail.
Assignee: morgamic → cls
Status: NEW → ASSIGNED
Summary: buildwho.pl should exit instead of dieing if it can't get a lock → process build mail in batches via cron
Attached patch v1.0 (obsolete) — Splinter Review
Ok, I worked out the namespace tainting issues with calling warnings.pl & scrape.pl.  I've updated the INSTALL instructions.  The v0.9 patch made using cron optional and now it's mandatory.

I've also reduced the number of times buildwho.pl needs to be called.  Instead of being called for each build mail that gets processed, it only gets called once per changed tree per cron interval.  That should reduce the load on the bonsai/viewvc db quite a bit.
Attachment #241285 - Attachment is obsolete: true
Attachment #241354 - Flags: review?(bear)
Comment on attachment 241354 [details] [diff] [review]
v1.0

I've reviewed the code and run thru the install step but I don't have a live tbox that receives mail to do the final test
Attachment #241354 - Flags: review?(bear) → review+
I'll play with it on tinderbox-stage at some point this evening.
Attached patch v1.1Splinter Review
When using perl's require feature, the file is only read once regardless of if it's in a loop.  I had to add a hash to save the contents of $scrape_builds so that subsequent logs from the same tinderbox tree would be scraped.

Also, added locking around the updating of the warnings & scrape datafiles and added a few more debug messages.
Attachment #241354 - Attachment is obsolete: true
Attachment #241408 - Flags: review?(justdave)
Comment on attachment 241408 [details] [diff] [review]
v1.1

Dude, this outright rocks. :)

It's an order of magnitude performance improvement.  Stuff is showing up on tinderbox-stage before it shows up on tinderbox, despite the fact that tinderbox gets the mail first (tinderbox is re-sending the mail to tinderbox-stage), and depite tinderbox-stage having to wait for the cron job to run.

I have the cron job running once a minute.  Watching the mail queue up, I've seen it get 40 or 50 of them in the queue before the cron job fires, and it gets them all within about 5 to 10 seconds.  I betcha the setup/teardown of the perl interpreter has a lot to do with it.  Processing 40 mails from the same perl process is lots faster than 40 separate perl processes. :)
Attachment #241408 - Flags: review?(justdave) → review+
Checking in webtools/tinderbox/INSTALL;
/cvsroot/mozilla/webtools/tinderbox/INSTALL,v  <--  INSTALL
new revision: 1.5; previous revision: 1.4
done
Checking in webtools/tinderbox/buildwho.pl;
/cvsroot/mozilla/webtools/tinderbox/buildwho.pl,v  <--  buildwho.pl
new revision: 1.19; previous revision: 1.18
done
Checking in webtools/tinderbox/handlemail.pl;
/cvsroot/mozilla/webtools/tinderbox/handlemail.pl,v  <--  handlemail.pl
new revision: 1.6; previous revision: 1.5
done
Checking in webtools/tinderbox/processbuild.pl;
/cvsroot/mozilla/webtools/tinderbox/processbuild.pl,v  <--  processbuild.pl
new revision: 1.58; previous revision: 1.57
done
Checking in webtools/tinderbox/scrape.pl;
/cvsroot/mozilla/webtools/tinderbox/scrape.pl,v  <--  scrape.pl
new revision: 1.11; previous revision: 1.10
done
Checking in webtools/tinderbox/tbglobals.pl;
/cvsroot/mozilla/webtools/tinderbox/tbglobals.pl,v  <--  tbglobals.pl
new revision: 1.45; previous revision: 1.44
done
Checking in webtools/tinderbox/warnings.pl;
/cvsroot/mozilla/webtools/tinderbox/warnings.pl,v  <--  warnings.pl
new revision: 1.98; previous revision: 1.97
done
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Blocks: 359464
Blocks: 359323, 359470
Blocks: 360727
Product: Webtools → Webtools Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: