Closed Bug 539588 Opened 15 years ago Closed 14 years ago

Tracking bug for getting schedulerdb working

Categories

(Release Engineering :: General, defect, P5)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: catlee)

References

Details

(Whiteboard: [q2goal][automation])

Attachments

(9 files, 1 obsolete file)

49.52 KB, patch (bhearsum: review+, catlee: checked-in+)
3.20 KB, patch (bhearsum: review+, catlee: checked-in+)
3.80 KB, patch (bhearsum: review+, catlee: checked-in+)
3.43 KB, patch (nthomas: review+, catlee: checked-in+)
1.71 KB, patch (nthomas: review+, catlee: checked-in+)
553 bytes, patch (nthomas: review+, catlee: checked-in+)
1.17 KB, patch (nthomas: review+, catlee: checked-in+)
9.31 KB, patch (nthomas: review+, catlee: checked-in+)
4.19 KB, patch (nthomas: review+, catlee: checked-in+)
Objectives:
* Get buildbot with schedulerdb working on all our masters
* Ability to have a master crash/go down, and not have the list of pending jobs lost
* Ability to have multiple masters handle changes for the same branch without duplicating effort
Depends on: 508672
Depends on: 539589
Depends on: 553300
Depends on: 556391, 556390
Blocks: 557613
Whiteboard: [q2goal][automation]
Blocks: 559880
Blocks: 559882
Blocks: 559885
Blocks: 559886
This is the beginning of the new single-directory layout.

Several symlinks need to be set up, depending on the master being created.

builder/scheduler_master.cfg -> master.cfg
staging/production_config.py -> localconfig.py
staging_builder_master_localconfig.py / staging_scheduler_master_localconfig.py -> master_localconfig.py
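
A minimal sketch of how those symlinks might be created for one master. It is not part of the attached patch. The helper name `link_master` is hypothetical, and it assumes that the name on the right of each arrow above is the link and the name on the left is the file it points to. The `production_*_master_localconfig.py` names are also an assumption, since only the staging variants are listed above.

```python
import os


def link_master(directory, role, environment):
    """Create the three symlinks for one master (hypothetical helper).

    role is "builder" or "scheduler"; environment is "staging" or
    "production". Assumption: master.cfg, localconfig.py and
    master_localconfig.py are the link names, and each points at the
    role/environment-specific file in the same directory.
    """
    links = {
        "master.cfg": "%s_master.cfg" % role,
        "localconfig.py": "%s_config.py" % environment,
        "master_localconfig.py": "%s_%s_master_localconfig.py" % (environment, role),
    }
    for link_name, target in links.items():
        path = os.path.join(directory, link_name)
        # Replace a stale link so the helper can be re-run safely.
        if os.path.lexists(path):
            os.remove(path)
        # The target is relative, so the directory can be moved as a unit.
        os.symlink(target, path)
    return links
```

For example, `link_master("/path/to/master", "scheduler", "staging")` would point `master.cfg` at `scheduler_master.cfg` in that directory; the path is a placeholder.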
Attachment #439599 - Flags: review?(bhearsum)
Attachment #439599 - Flags: review?(bhearsum) → review+
Comment on attachment 439599 [details] [diff] [review]
buildbot-configs for staging scheduler/builder masters

changeset:   2327:77d10fffd0ae
branch:      buildbot-0.8.0
Attachment #439599 - Flags: checked-in+
Attachment #442684 - Flags: review?(bhearsum)
Attachment #442684 - Flags: review?(bhearsum) → review+
Comment on attachment 442684 [details] [diff] [review]
Production configs for scheduler/builder master

changeset:   2341:c4649fd8a33d
Attachment #442684 - Flags: checked-in+
Depends on: 565427
Attachment #444981 - Flags: review?(bhearsum)
Comment on attachment 444981 [details] [diff] [review]
Sync over config changes to mozilla/

You should probably remove talos-master.m.o:9012 from unittest_masters, as Nick suggested I do in that bug.
Attachment #444981 - Flags: review?(bhearsum) → review+
Depends on: 567192
Comment on attachment 444981 [details] [diff] [review]
Sync over config changes to mozilla/

changeset:   2444:842e9f53c3e8
Attachment #444981 - Flags: checked-in+
Comment on attachment 444981 [details] [diff] [review]
Sync over config changes to mozilla/

Whoops, this was checked in, but not as that changeset.
Depends on: 568568
Depends on: 568570
Depends on: 568617
Depends on: 568848
Depends on: 569551
Depends on: 569696
No longer depends on: 568617
Attachment #449777 - Flags: review?(nrthomas)
Attachment #449778 - Flags: review?(nrthomas)
Attachment #449779 - Flags: review?(nrthomas)
Attachment #449780 - Flags: review?(nrthomas)
Comment on attachment 449777 [details] [diff] [review]
Enable mozilla-1.9.1, 1.9.2 on schedulerdb

OK, but let's watch out for 75 l10n builds grabbing all the slaves and overloading hg.m.o each time a nightly finishes.
Attachment #449777 - Flags: review?(nrthomas) → review+
Attachment #449778 - Flags: review?(nrthomas) → review+
Comment on attachment 449780 [details] [diff] [review]
Disable mozilla-1.9.2, 1.9.1 on pm,pm02

r+ if you remove SchedulerL10n, which will have no builders to drive after this.
Attachment #449780 - Flags: review?(nrthomas) → review+
Comment on attachment 449779 [details] [diff] [review]
Disable mozilla-central on production-master

r+ with the l10n scheduler removed.
Attachment #449779 - Flags: review?(nrthomas) → review+
Attachment #449932 - Flags: review?(nrthomas)
Comment on attachment 449932 [details] [diff] [review]
Limit the number of l10n jobs we do to 20 per branch per master

>diff --git a/misc.py b/misc.py
>+    # Limit us to doing 20 l10n jobs per branch per master
>+    l10nLock = locks.MasterLock("%s-l10n" % name, maxCount=20)

The code changes make sense, but couldn't we get 40 total jobs per branch once we migrate m-1.9.1/m-1.9.2/m-c to buildbot 0.8.0 (assuming two masters is enough)? There seems to be overlap between branches in the current up-to-8-slaves setup, given nightlies fire off at 30-minute intervals, so that could go even higher. And that seems fairly hard on hg.m.o.

Are you expressing a preference for lowering the maxCount if we hit issues?
(In reply to comment #18)
> (From update of attachment 449932 [details] [diff] [review])
> >diff --git a/misc.py b/misc.py
> >+    # Limit us to doing 20 l10n jobs per branch per master
> >+    l10nLock = locks.MasterLock("%s-l10n" % name, maxCount=20)
> 
> The code changes make sense, but we could get 40 total jobs per branch once we
> migrate m-1.9.1/m-1.9.2/m-c to buildbot 0.8.0 ? (assuming two masters is
> enough). There seems to be overlap between branches in the current
> up-to-8-slaves setup, given nightlies fire off at 30 minute intervals, so that
> could go even higher. And that seems fairly hard on hg.m.o.
> 
> Are you expressing a preference for lowering the maxCount if we hit issues ?

Will lowering this to 10 help?

Another thing we could do is to have a global variable so we limit to N per master rather than N per branch.

Taking this even further, we could create an hg lock that everything that touches hg needs to acquire before running, and then wouldn't need to limit l10n in particular.
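
To illustrate the difference between the per-branch limit in the patch and the per-master limit suggested above, here is a self-contained sketch. It does not use buildbot: `CountingLock` is a toy stand-in for a counting lock with a `maxCount`, and the branch list and the figure of 75 locales per branch are assumptions taken from this discussion.

```python
import threading


class CountingLock:
    """Toy stand-in for a counting lock: at most max_count holders at once."""

    def __init__(self, name, max_count):
        self.name = name
        self.max_count = max_count
        self._holders = 0
        self._mutex = threading.Lock()

    def try_acquire(self):
        # Non-blocking: returns False when the lock is already full.
        with self._mutex:
            if self._holders >= self.max_count:
                return False
            self._holders += 1
            return True


def started_jobs(requests, lock_for_branch):
    """Count how many of the requested l10n jobs could start at once."""
    return sum(1 for branch in requests if lock_for_branch(branch).try_acquire())


branches = ["mozilla-central", "mozilla-1.9.1", "mozilla-1.9.2"]
requests = [b for b in branches for _ in range(75)]  # assumed 75 locales each

# One lock per branch, as in the patch: the limit applies to each branch.
per_branch = {b: CountingLock("%s-l10n" % b, 20) for b in branches}
print(started_jobs(requests, per_branch.__getitem__))  # 60

# One shared lock: the limit applies to the whole master.
per_master = CountingLock("l10n", 20)
print(started_jobs(requests, lambda branch: per_master))  # 20
```

With a per-branch lock the number of simultaneous jobs on one master grows with the number of branches, which is the concern raised in the review. A single shared lock caps the master regardless of how many branches fire at once.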
Attachment #449935 - Flags: review?(nrthomas) → review+
Attachment #449932 - Flags: review?(nrthomas)
Comment on attachment 449932 [details] [diff] [review]
Limit the number of l10n jobs we do to 20 per branch per master

(In reply to comment #19)
> Will lowering this to 10 help?
> Another thing we could do is to have a global variable so we limit to N per
> master rather than N per branch.

Some lower value combined with a per-master limit would be better if the other approaches we discussed on IRC don't work out (e.g. only the first N slaves connected to the master may do l10n).

> Taking this even further, we could create an hg lock that everything that
> touches hg needs to acquire before running, and then wouldn't need to limit
> l10n in particular.

Are all hg operations of equal weight? I suspect cloning m-c > updating m-c > cloning tools, but I can't back that up </drevil>.
Attachment #449932 - Attachment is obsolete: true
Attachment #450233 - Flags: review?(nrthomas)
Comment on attachment 450233 [details] [diff] [review]
Limit ourselves to using the first 8 connected slaves for l10n jobs

Looks good, thanks for fixing this.
Attachment #450233 - Flags: review?(nrthomas) → review+
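
The idea behind that patch, restricting l10n jobs to the first 8 connected slaves, can be sketched as follows. This is not the checked-in code. The function name `next_l10n_slave` is hypothetical, slaves are represented as plain name strings, and "first" is assumed to mean first in sorted name order, which the bug does not specify.

```python
import random

L10N_SLAVE_LIMIT = 8  # value from the patch description


def next_l10n_slave(connected_slaves, idle_slaves, limit=L10N_SLAVE_LIMIT):
    """Pick an idle slave for an l10n job, or None if none is eligible.

    Only the first `limit` connected slaves (assumption: in sorted name
    order) may run l10n jobs, so at most `limit` jobs run at once.
    """
    allowed = set(sorted(connected_slaves)[:limit])
    candidates = [s for s in idle_slaves if s in allowed]
    return random.choice(candidates) if candidates else None
```

Because the eligible set is fixed by position rather than by a lock, the rest of the pool stays free for other builds even when many l10n jobs are pending.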
Can we have no lock for L10n release repackages and use the whole pool? (once we move that to schedulerdb)
Depends on: 572188
Comment on attachment 449935 [details] [diff] [review]
Add properties for our branch and platform to builders

768:e355a4b0422b
Attachment #449935 - Flags: checked-in+
Comment on attachment 450233 [details] [diff] [review]
Limit ourselves to using the first 8 connected slaves for l10n jobs

767:8fea408513ee
Attachment #450233 - Flags: checked-in+
Comment on attachment 449780 [details] [diff] [review]
Disable mozilla-1.9.2, 1.9.1 on pm,pm02

2512:ab150c39bacd
Attachment #449780 - Flags: checked-in+
Comment on attachment 449777 [details] [diff] [review]
Enable mozilla-1.9.1, 1.9.2 on schedulerdb

2511:512ea8b96cf4
Attachment #449777 - Flags: checked-in+
Blocks: 571571
Comment on attachment 449778 [details] [diff] [review]
Enable trunk on schedulerdb

2520:af400edd659e
Attachment #449778 - Flags: checked-in+
Comment on attachment 449779 [details] [diff] [review]
Disable mozilla-central on production-master

2521:ed1073a3ce34
Attachment #449779 - Flags: checked-in+
No longer blocks: 571571
OS: Linux → All
Hardware: x86 → All
Tracking bug -> P5
Priority: -- → P5
No longer blocks: 559880
No longer blocks: 559885
Only things left are mobile and releases, which have their own bugs.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering