Closed Bug 539105 Opened 12 years ago Closed 12 years ago

Reduce talos load from 3.0 builds

Categories

(Release Engineering :: General, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: lsblakk, Assigned: catlee)

Details

Attachments

(2 files)

Currently the en-US builds are running continuously under the tinderbox client, which is creating a large, unnecessary workload for Talos. Windows isn't a concern here, so just having Linux and Mac build less frequently would be a big win.
How long do we want to wait after each build? 30 mins? 60 mins?

The Linux builder is just doing opt; a no-op build takes about 6 minutes.

Mac is doing opt then debug; it would then wait and repeat from the top. No-op build times are ~13 and ~8 minutes respectively.
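For a rough sense of what a wait would buy, here is a minimal sketch of the arithmetic, assuming every cycle is a no-op build (so these are upper bounds on cycle counts) and using only the build times quoted above. The function name `cycles_per_day` is made up for illustration.

```python
def cycles_per_day(build_minutes, wait_minutes):
    """Upper bound on build cycles per day for one builder.

    build_minutes: no-op build times run back to back in one cycle.
    wait_minutes: proposed idle time after each cycle (assumption).
    """
    cycle_length = sum(build_minutes) + wait_minutes
    return 24 * 60 / cycle_length


# No-op build times quoted in this bug: Linux opt ~6 min,
# Mac opt ~13 min followed by debug ~8 min.
builders = {"linux": [6], "mac": [13, 8]}

for wait in (0, 30, 60):
    for name, times in builders.items():
        print(f"{name} wait={wait}min: {cycles_per_day(times, wait):.1f} cycles/day")
```

Under these assumptions a 30 minute wait turns the Linux cycle from 6 minutes into 36, i.e. from at most 240 cycles a day down to 40. Real builds that pick up changes take longer, so actual counts would be lower.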
Assignee: nobody → nrthomas
Status: NEW → ASSIGNED
(cc-ing beltzner, dveditz for comment)

Win32 opt builds take approx 3hrs; it's so slow that, aiui, we have 2 machines for win32... one doing debug, one doing opt.

How about delaying Linux and Mac so they run as (in)frequently as win32, i.e. every 3 hours? That brings the number of builds for each OS closer to parity, and at the same time significantly reduces the load on Talos.
A three-hour turnaround is terrible and I'm never going to agree to that. Where do you see 3 hours anyway? The slow build (fx-win32-tbox Depend Nightly) takes at most 1:20, and the faster "fxdbug-win32-tb Depend Debug + Leak Test" takes about 20 minutes.

The real problem is that Talos is getting too many new builds, not that tinderbox clients are making them. Instead of slowing down the builds you could alternatively throw away the bits most of the time rather than feed Talos after every one.

If a build an hour is still too much for Talos then I'd really rather you take the second approach than delay the builds even more.
Not sure whether this will work; I haven't had time to test it. Basically, we add a minimum time between builds in the ftp poller, so any new builds detected within an hour of the last one submitted are ignored.

I need access to talos staging to test this.
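The idea of a minimum time between submitted builds can be sketched roughly as below. This is not the code from attachment 421453; the class `ThrottledPoller`, its `should_submit` method and the `min_interval_secs` parameter are hypothetical names chosen for illustration, and the one-hour default is taken from the description above.

```python
import time


class ThrottledPoller:
    """Hypothetical sketch: drop builds seen too soon after the last submitted one."""

    def __init__(self, min_interval_secs=3600, clock=time.time):
        self.min_interval_secs = min_interval_secs
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.last_submitted = None  # time the last build was passed on to Talos

    def should_submit(self):
        """Return True if a newly detected build should be submitted now."""
        now = self.clock()
        too_soon = (
            self.last_submitted is not None
            and now - self.last_submitted < self.min_interval_secs
        )
        if too_soon:
            # Ignored builds do not reset the timer.
            return False
        self.last_submitted = now
        return True
```

One design point worth noting in this sketch: an ignored build does not update `last_submitted`, so a steady stream of builds still lets one through per interval instead of starving Talos entirely.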
(In reply to comment #4)
> Created an attachment (id=421453) [details]
> I need access to talos staging to test this.
ping?
Comment on attachment 421453 [details] [diff] [review]
add delay to ftp poller

Seems to be working in staging.
Attachment #421453 - Flags: review?(nrthomas)
Attachment #421453 - Flags: review?(anodelman)
Comment on attachment 421453 [details] [diff] [review]
add delay to ftp poller

Looks fine to me. When I looked at moving these builds to be buildbot-driven, it would have been quite a bit more effort than this.
Attachment #421453 - Flags: review?(nrthomas) → review+
Attachment #421453 - Flags: review?(anodelman) → review+
Assignee: nrthomas → catlee
Summary: Configure a wait-between-builds for Linux and Mac 3.0 en-US builds → Reduce talos load from 3.0 builds
Comment on attachment 421453 [details] [diff] [review]
add delay to ftp poller

changeset:   2033:8219c9dd5d55
Attachment #421453 - Flags: checked-in+
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
This is the same patch as attachment 421453 [details] [diff] [review] but applied to the talos-r3 directory; no fuzz was required. Diffing talos-pool and talos-r3 shows only sensible-looking differences: the slave pools in config.py and the ports in master.cfg. We'll need to add Lorentz to talos-r3 at some point though.
Attachment #428081 - Flags: review?(catlee)
Attachment #428081 - Flags: review?(catlee) → review+
Product: mozilla.org → Release Engineering